Model Audit
Key performance metrics by model
The deep GBM model outperforms the other models on every metric on the test set: it has the lowest MSE, RMSE, and MAD, and the highest R2.
Model | MSE | RMSE | R2 | MAD |
---|---|---|---|---|
GBM shallow | 1.28e+13 | 3,576,335 | 0.845 | 943,741 |
GBM deep | 6.60e+12 | 2,568,530 | 0.920 | 672,319 |
RF | 1.59e+13 | 3,986,263 | 0.807 | 820,040 |
RM | 1.21e+13 | 3,474,644 | 0.853 | 753,942 |
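As a rough illustration of how the four metrics in the table relate to each other, here is a minimal base R sketch. It assumes that MAD denotes the median absolute residual and that R2 is computed as one minus the ratio of MSE to the variance of the observed values, which matches how DALEX defines these measures as far as we know. The helper `regression_metrics()` and the toy vectors are hypothetical and are not taken from the FIFA data.

```r
# Hypothetical helper: regression metrics from observed and predicted vectors.
# Assumes MAD = median absolute residual and R2 = 1 - MSE / Var(observed).
regression_metrics <- function(observed, predicted) {
  res <- observed - predicted
  mse <- mean(res^2)
  c(
    mse  = mse,
    rmse = sqrt(mse),
    r2   = 1 - mse / mean((observed - mean(observed))^2),
    mad  = median(abs(res))
  )
}

# Toy data for illustration only (not from the FIFA dataset)
observed  <- c(1, 2, 3, 4, 5)
predicted <- c(1.5, 2, 2.5, 4, 6)

metrics <- regression_metrics(observed, predicted)
print(metrics)
```

Because RMSE is the square root of MSE, the first two columns of the table always rank the models identically; MAD can rank them differently because it is less sensitive to a few very large errors.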
Residual Distribution
The deep GBM model shows lower variance in its residuals, as well as lower median residual values, than the other models.
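One way to put numbers on such a comparison is to summarise the spread of each model's residuals directly. The sketch below uses two made-up residual vectors (`res_a` and `res_b` are hypothetical, not output from the models above) and base R only.

```r
# Toy residuals for two hypothetical models (illustrative only)
res_a <- c(-4, -2, -1, 0, 1, 2, 4)
res_b <- c(-2, -1, -0.5, 0, 0.5, 1, 2)

# Variance, median absolute residual, and interquartile range per model
spread <- sapply(list(model_a = res_a, model_b = res_b), function(r) {
  c(variance = var(r), median_abs = median(abs(r)), iqr = IQR(r))
})
print(spread)

# Deciles of the residuals for one model
print(quantile(res_b, probs = seq(0, 1, by = 0.1)))
```

A model with tighter residuals has smaller values in every row of `spread`.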
Predicted vs. Actual Values
The models tend to overestimate the value of less expensive players and underestimate the value of expensive players.
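This pattern can be checked numerically by averaging residuals within value segments. The sketch below is a minimal example on made-up numbers; it assumes residuals are defined as observed minus predicted, so a negative mean indicates overestimation. The split at the median and all variable names are illustrative choices, not part of the original analysis.

```r
# Toy observed and predicted values (illustrative only, not FIFA data)
observed  <- c(1e5, 2e5, 5e5, 1e6, 5e6, 1e7, 5e7, 1e8)
predicted <- c(1.5e5, 2.6e5, 5.5e5, 1.0e6, 4.8e6, 9.0e6, 4.0e7, 7.5e7)

# Assumed convention: residual = observed - predicted
# (negative = overestimate, positive = underestimate)
residual <- observed - predicted

# Split players into a cheaper and a more expensive half at the median value
segment <- ifelse(observed <= median(observed), "lower half", "upper half")

# Mean residual per segment
mean_residual <- tapply(residual, segment, mean)
print(mean_residual)
```

In this toy data the lower half has a negative mean residual and the upper half a positive one, which is the signature of the behaviour described above.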
Example R code for producing performance metrics and plots
Below is example R code for producing model performance metrics and plots for the deep GBM model:
# Performance measures and residual deciles (presumably printed by DALEX's model_performance())
model_performance(fifa_gbm_exp_deep)
Measures for: regression
mse : 6.597346e+12
rmse : 2568530
r2 : 0.9198023
mad : 672318.6
Residuals:
0% 10% 20% 30% 40% 50%
-27531034.41 -1470372.04 -804372.16 -425491.53 -193239.87 32455.74
60% 70% 80% 90% 100%
274858.59 578489.36 996114.61 1896853.48 28142927.91
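library(DALEX)    # model_diagnostics() and its plot() method
library(ggplot2)  # scale_*_continuous(), geom_abline(), ggtitle()
library(scales)   # dollar_format()

# Residual diagnostics for the deep GBM explainer
fifa_md_gbm_deep <- model_diagnostics(fifa_gbm_exp_deep)

# Predicted vs. observed values on log10 axes, with the y = x reference line
plot(fifa_md_gbm_deep,
     variable = "y", yvariable = "y_hat") +
  scale_x_continuous("Value in Euro", trans = "log10",
                     labels = dollar_format(suffix = "€", prefix = "")) +
  scale_y_continuous("Predicted value in Euro", trans = "log10",
                     labels = dollar_format(suffix = "€", prefix = "")) +
  geom_abline(slope = 1) +
  ggtitle("Predicted and observed players' values", "")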