4.6 Pythagorean Formula
Bill James, widely regarded as the godfather of sabermetrics, empirically derived the following non-linear formula, called the Pythagorean expectation, to estimate a team's winning percentage from its runs scored (R) and runs allowed (RA).
\[\widehat{W_{\text{pct}}} = \frac{R^{2}}{R^{2} + {RA^{2}}}\]
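As a quick worked example of the formula, consider a hypothetical team (the run totals are made up for illustration and are not taken from the data) that scores 800 runs and allows 700:

\[\widehat{W_{\text{pct}}} = \frac{800^{2}}{800^{2} + 700^{2}} = \frac{640000}{1130000} \approx 0.566\]

Outscoring opponents by about 14% corresponds to an expected winning percentage of roughly .566.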
# Pythagorean estimate with exponent 2, and its residual against the actual Wpct
ch4_data <- ch4_data |>
  mutate(Wpct_pyt = R^2 / (R^2 + RA^2),
         resid_pyt = Wpct - Wpct_pyt)
# RMSE with exponent of 2
sqrt(mean(ch4_data$resid_pyt^2))
## [1] 0.02570405
4.6.1 What should the exponent be?
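The exponent does not have to be exactly 2. Replacing it with an unknown exponent k gives a more general model:
\[\frac{W}{W+L} = W_{\text{pct}} \approx \widehat{W_{\text{pct}}} = \frac{R^{k}}{R^{k} + {RA^{k}}}\]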
The algebra below rearranges this approximation into a form that is linear in k.
\[\begin{array}{rcl}
\frac{W}{W+L} = W_{\text{pct}} & \approx & \widehat{W_{\text{pct}}} = \frac{R^{k}}{R^{k} + RA^{k}} \\
\frac{W}{W+L} & \approx & \frac{R^{k}}{R^{k} + RA^{k}} \\
W \cdot R^{k} + W \cdot RA^{k} & \approx & W \cdot R^{k} + L \cdot R^{k} \\
W \cdot RA^{k} & \approx & L \cdot R^{k} \\
\frac{W}{L} \cdot RA^{k} & \approx & R^{k} \\
\frac{W}{L} & \approx & \frac{R^{k}}{RA^{k}} \\
\frac{W}{L} & \approx & \left(\frac{R}{RA}\right)^{k} \\
\ln\frac{W}{L} & \approx & \ln\left(\frac{R}{RA}\right)^{k}
\end{array}\]
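\[\ln\frac{W}{L} \approx k\ln\left(\frac{R}{RA}\right)\]
This is a straight line through the origin with slope k, so k can be estimated by regressing \(\ln(W/L)\) on \(\ln(R/RA)\) with no intercept term.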
ch4_data <- ch4_data |>
  mutate(logWratio = log(W/L),
         logRratio = log(R/RA))
# "0 +" drops the intercept, so the single fitted coefficient is the exponent k
pyt_fit <- lm(logWratio ~ 0 + logRratio, data = ch4_data)
pyt_fit$coefficients
## logRratio
## 1.834988
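One way to check that a no-intercept regression on the log ratios really recovers the exponent is to run it on synthetic data where the true exponent is known. The sketch below uses made-up run totals (not ch4_data) and an assumed exponent of 1.8. The names sim, k_true, k_hat, and k_closed are introduced only for this illustration. It also computes the slope from the standard closed form for a regression through the origin, sum(x * y) / sum(x^2).

```r
# Synthetic check (made-up data, not ch4_data): build win/loss ratios that
# follow the model exactly with a known exponent, then confirm lm() recovers it.
set.seed(1)
k_true <- 1.8  # assumed exponent for the simulation only

sim <- data.frame(R  = runif(30, 600, 900),   # hypothetical runs scored
                  RA = runif(30, 600, 900))   # hypothetical runs allowed
sim$logRratio <- log(sim$R / sim$RA)
sim$logWratio <- k_true * sim$logRratio       # exact model, no noise added

# Same model form as pyt_fit: regression through the origin
sim_fit <- lm(logWratio ~ 0 + logRratio, data = sim)
k_hat <- unname(coef(sim_fit))

# Closed-form slope for a no-intercept least-squares fit
k_closed <- sum(sim$logRratio * sim$logWratio) / sum(sim$logRratio^2)

c(lm = k_hat, closed_form = k_closed)
```

Both values should match k_true up to floating-point error. With real data the fit is noisy, which is why the estimate from ch4_data is only approximately 1.835.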
# Recompute the estimate and residual with the fitted exponent, rounded to 1.835
ch4_data <- ch4_data |>
  mutate(Wpct_pyt = R^1.835 / (R^1.835 + RA^1.835),
         resid_pyt = Wpct - Wpct_pyt)
# RMSE with exponent of 1.835
sqrt(mean(ch4_data$resid_pyt^2))
## [1] 0.02494779
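To put the two RMSE values on a more familiar scale, they can be converted from winning percentage to wins, assuming a 162-game season:

\[0.02570405 \times 162 \approx 4.16 \qquad 0.02494779 \times 162 \approx 4.04\]

With either exponent the root-mean-square miss is about four wins per team-season. Refitting the exponent improves it by roughly a tenth of a win.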
4.6.2 Luck
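We can find the expected number of wins for a full season by multiplying the estimated winning percentage (from the Pythagorean formula with an exponent of 1.835) by 162 games. In the tables that follow, diff is actual wins minus expected wins (W minus W_pyt); teams with a positive diff are labeled "lucky" and the rest "unlucky".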
2011 Season: Performance vs Pythag Expectation

teamID | W | W_pyt | playoff_bool | diff | desc |
---|---|---|---|---|---|
DET | 95 | 88.5 | made playoffs | 6.5 | lucky |
SFN | 86 | 80.0 | missed playoffs | 6.0 | lucky |
MIL | 96 | 90.1 | made playoffs | 5.9 | lucky |
ARI | 94 | 88.3 | made playoffs | 5.7 | lucky |
CLE | 80 | 75.3 | missed playoffs | 4.7 | lucky |
ATL | 89 | 85.3 | missed playoffs | 3.7 | lucky |
CHA | 79 | 75.3 | missed playoffs | 3.7 | lucky |
PIT | 72 | 69.6 | missed playoffs | 2.4 | lucky |
BAL | 69 | 66.7 | missed playoffs | 2.3 | lucky |
SLN | 90 | 88.1 | made playoffs | 1.9 | lucky |
TOR | 81 | 79.2 | missed playoffs | 1.8 | lucky |
WAS | 80 | 78.8 | missed playoffs | 1.2 | lucky |
LAA | 86 | 84.9 | missed playoffs | 1.1 | lucky |
MIN | 63 | 61.9 | missed playoffs | 1.1 | lucky |
CHN | 71 | 70.3 | missed playoffs | 0.7 | lucky |
SEA | 67 | 66.7 | missed playoffs | 0.3 | lucky |
FLO | 72 | 72.4 | missed playoffs | −0.4 | unlucky |
TBA | 91 | 91.4 | made playoffs | −0.4 | unlucky |
PHI | 102 | 102.6 | made playoffs | −0.6 | unlucky |
NYN | 77 | 78.6 | missed playoffs | −1.6 | unlucky |
TEX | 96 | 98.1 | made playoffs | −2.1 | unlucky |
LAN | 82 | 84.8 | missed playoffs | −2.8 | unlucky |
OAK | 74 | 77.2 | missed playoffs | −3.2 | unlucky |
CIN | 79 | 82.5 | missed playoffs | −3.5 | unlucky |
BOS | 90 | 93.7 | missed playoffs | −3.7 | unlucky |
COL | 73 | 77.2 | missed playoffs | −4.2 | unlucky |
NYA | 97 | 101.2 | made playoffs | −4.2 | unlucky |
HOU | 56 | 62.2 | missed playoffs | −6.2 | unlucky |
KCA | 71 | 77.8 | missed playoffs | −6.8 | unlucky |
SDN | 71 | 78.8 | missed playoffs | −7.8 | unlucky |
The 2011 table was produced with the following code:
library(dplyr)  # filter(), mutate(), select(), arrange()
library(gt)     # table construction and styling
ch4_data |>
  filter(yearID == 2011) |>
  mutate(W_pyt = Wpct_pyt * 162) |>   # expected wins over a 162-game season
  select(teamID, W, W_pyt, playoff_bool) |>
  mutate(diff = W - W_pyt) |>         # actual minus expected wins
  mutate(desc = ifelse(diff > 0,
                       "lucky", "unlucky")) |>
  arrange(desc(diff)) |>              # most over-performing teams first
  gt() |>
  cols_align(align = "center") |>
  data_color(columns = diff,
             palette = "viridis") |>
  data_color(columns = playoff_bool,
             palette = "viridis",
             reverse = TRUE) |>
  fmt_number(columns = c(W_pyt, diff),
             decimals = 1) |>
  tab_header(title = "2011 Season",
             subtitle = "Performance vs Pythag Expectation")
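To make the calculation behind W_pyt, diff, and desc concrete for a single team, here is a minimal sketch. The helper name pyt_wins, the variables win_diff and label, and the 800/700 run totals with 95 actual wins are all hypothetical and introduced only for this example. The exponent 1.835 and the 162-game season come from the text above.

```r
# Hypothetical helper (not part of the original analysis): expected wins from
# runs scored (R) and runs allowed (RA) under the Pythagorean formula.
pyt_wins <- function(R, RA, k = 1.835, games = 162) {
  wpct_hat <- R^k / (R^k + RA^k)  # estimated winning percentage
  wpct_hat * games                # scale to a full season
}

# Made-up team: 800 runs scored, 700 allowed, 95 actual wins.
W        <- 95
W_pyt    <- pyt_wins(R = 800, RA = 700)
win_diff <- W - W_pyt                                # actual minus expected
label    <- ifelse(win_diff > 0, "lucky", "unlucky")

round(W_pyt, 1)  # about 90.9
label            # "lucky"
```

A team that scores exactly as many runs as it allows gets an expected 81 wins from this helper, whatever the exponent.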
2023 Season: Performance vs Pythag Expectation

teamID | W | W_pyt | playoff_bool | diff | desc |
---|---|---|---|---|---|
MIA | 84 | 74.9 | made playoffs | 9.1 | lucky |
BAL | 101 | 93.8 | made playoffs | 7.2 | lucky |
DET | 78 | 72.6 | missed playoffs | 5.4 | lucky |
PIT | 76 | 71.2 | missed playoffs | 4.8 | lucky |
CIN | 82 | 77.5 | missed playoffs | 4.5 | lucky |
ARI | 84 | 79.5 | made playoffs | 4.5 | lucky |
WAS | 71 | 67.1 | missed playoffs | 3.9 | lucky |
NYA | 82 | 78.3 | missed playoffs | 3.7 | lucky |
SFN | 79 | 76.2 | missed playoffs | 2.8 | lucky |
ATL | 104 | 101.3 | made playoffs | 2.7 | lucky |
MIL | 92 | 89.7 | made playoffs | 2.3 | lucky |
OAK | 50 | 48.9 | missed playoffs | 1.1 | lucky |
PHI | 90 | 88.9 | made playoffs | 1.1 | lucky |
SLN | 71 | 70.5 | missed playoffs | 0.5 | lucky |
LAA | 73 | 72.5 | missed playoffs | 0.5 | lucky |
TOR | 89 | 88.8 | made playoffs | 0.2 | lucky |
LAN | 100 | 99.9 | made playoffs | 0.1 | lucky |
CHA | 61 | 61.2 | missed playoffs | −0.2 | unlucky |
TBA | 99 | 99.8 | made playoffs | −0.8 | unlucky |
CLE | 76 | 77.2 | missed playoffs | −1.2 | unlucky |
COL | 59 | 60.4 | missed playoffs | −1.4 | unlucky |
BOS | 78 | 80.6 | missed playoffs | −2.6 | unlucky |
SEA | 88 | 91.3 | missed playoffs | −3.3 | unlucky |
HOU | 90 | 93.5 | made playoffs | −3.5 | unlucky |
NYN | 75 | 79.8 | missed playoffs | −4.8 | unlucky |
TEX | 90 | 96.2 | made playoffs | −6.2 | unlucky |
MIN | 87 | 93.2 | made playoffs | −6.2 | unlucky |
CHN | 83 | 90.2 | missed playoffs | −7.2 | unlucky |
KCA | 56 | 63.5 | missed playoffs | −7.5 | unlucky |
SDN | 82 | 92.0 | missed playoffs | −10.0 | unlucky |