7.3 Realistic Example Using Titanic Dataset

This section shows manual calculations of variable importance using a random forest model on the Titanic dataset.

The explanations relate to the fictitious passenger Johnny D, described in an earlier chapter.

Table 7.2: Variable and Interaction Contributions

Column descriptions:

  • Column 1 - Variable or paired variable
  • Column 2 - Paired-variable contributions
  • Column 3 - Net interaction effect - the paired-variable contribution (column 2) minus the two corresponding single-variable contributions (column 4)
  • Column 4 - Single variable contributions

Variable          \(\Delta^{\{i,j\}|\emptyset}(\underline{x}_*)\)  \(\Delta_{I}^{\{i,j\}}(\underline{x}_*)\)  \(\Delta^{i|\emptyset}(\underline{x}_*)\)
age                                                                                                            0.270
fare:class         0.098                                           -0.231
class                                                                                                          0.185
fare:age           0.249                                           -0.164
fare                                                                                                           0.143
gender                                                                                                        -0.125
age:class          0.355                                           -0.100
age:gender         0.215                                            0.070
embarked                                                                                                      -0.011
embarked:age       0.269                                            0.010
parch:gender      -0.136                                           -0.008
sibsp                                                                                                          0.008
sibsp:age          0.284                                            0.007
sibsp:class        0.187                                           -0.006
embarked:fare      0.138                                            0.006
sibsp:gender      -0.123                                           -0.005
fare:parch         0.145                                            0.005
parch:sibsp        0.001                                           -0.004
parch                                                                                                         -0.003
parch:age          0.264                                           -0.002
embarked:gender   -0.134                                            0.002
embarked:parch    -0.012                                            0.001
fare:sibsp         0.152                                            0.001
embarked:class     0.173                                           -0.001
gender:class       0.061                                            0.001
embarked:sibsp    -0.002                                            0.001
parch:class        0.183                                            0.000

Example Calculation for Net Interaction Effect of fare:age

  • fare contribution (column 4) = 0.143
  • age contribution (column 4) = 0.270
  • fare:age contribution (column 2) = 0.249
  • Net interaction = 0.249 - 0.143 - 0.270 = -0.164
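
The same arithmetic can be sketched in Python. This is only an illustration using a few rounded values copied from Table 7.2; the dictionary names and the `net_interaction` helper are hypothetical and not part of any library.

```python
# Single-variable contributions (column 4 of Table 7.2), rounded as printed.
single = {"age": 0.270, "fare": 0.143, "class": 0.185, "gender": -0.125}

# Paired-variable contributions (column 2 of Table 7.2), rounded as printed.
paired = {
    ("fare", "age"): 0.249,
    ("fare", "class"): 0.098,
    ("age", "class"): 0.355,
    ("age", "gender"): 0.215,
}

def net_interaction(pair):
    """Paired contribution minus the two single-variable contributions."""
    first, second = pair
    return paired[pair] - single[first] - single[second]

for pair in paired:
    print(f"{pair[0]}:{pair[1]}", round(net_interaction(pair), 3))
```

Because the inputs are already rounded to three decimals, a recomputed value can differ from column 3 in the last digit (for example, fare:class gives -0.230 here versus -0.231 in the table).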

Steps for Calculating Variable Importance Tables and Breakdown Plots

  1. Rank the single-variable contributions and the net interaction effects by absolute value, largest first (see Table 7.2).
  2. Each variable should appear only once, either on its own or as part of an interaction. Going down the ranking, keep only the highest-ranked term that contains each variable.
  3. Calculate the variable-importance measures for the selected terms as described in Chapter 6.
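
Steps 1 and 2 above can be sketched as a greedy selection in Python. This is a minimal illustration, assuming the ranking uses column 4 for single variables and column 3 for pairs; the `terms` dictionary and `select_terms` function are hypothetical names, and step 3 is not covered.

```python
# Ranking values from Table 7.2: single-variable contributions (column 4)
# and net interaction effects (column 3), rounded as printed.
terms = {
    "age": 0.270, "fare:class": -0.231, "class": 0.185, "fare:age": -0.164,
    "fare": 0.143, "gender": -0.125, "age:class": -0.100, "age:gender": 0.070,
    "embarked": -0.011, "embarked:age": 0.010, "parch:gender": -0.008,
    "sibsp": 0.008, "sibsp:age": 0.007, "sibsp:class": -0.006,
    "embarked:fare": 0.006, "sibsp:gender": -0.005, "fare:parch": 0.005,
    "parch:sibsp": -0.004, "parch": -0.003, "parch:age": -0.002,
    "embarked:gender": 0.002, "embarked:parch": 0.001, "fare:sibsp": 0.001,
    "embarked:class": -0.001, "gender:class": 0.001, "embarked:sibsp": 0.001,
    "parch:class": 0.000,
}

def select_terms(terms):
    """Rank by absolute value, then keep each variable at most once."""
    ranked = sorted(terms, key=lambda name: abs(terms[name]), reverse=True)
    used = set()      # variables already covered by a selected term
    selected = []
    for name in ranked:
        variables = set(name.split(":"))
        if variables & used:
            continue  # some variable in this term was already selected
        selected.append(name)
        used |= variables
    return selected

print(select_terms(terms))
# ['age', 'fare:class', 'gender', 'embarked', 'sibsp', 'parch']
```

The resulting order matches the rows of Table 7.3.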

Table 7.3: Variable-Importance Measures

Variable                  \(j\)  \(v(j,\underline{x}_*)\)  \(v_0+\sum_{k=1}^j v(k,\underline{x}_*)\)
intercept (\(v_0\))                                         0.235
age = 8                   1       0.269                     0.505
fare:class = 72:1st       2       0.039                     0.544
gender = male             3      -0.083                     0.461
embarked = Southampton    4      -0.002                     0.458
sibsp = 0                 5      -0.006                     0.452
parch = 0                 6      -0.030                     0.422
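
The last column of Table 7.3 is a running sum that starts at the intercept. A minimal Python sketch, using the rounded values from the table (the variable names are illustrative only):

```python
from itertools import accumulate

v0 = 0.235  # intercept from Table 7.3

# (term, v(j, x*)) in the order given in Table 7.3, rounded as printed.
contributions = [
    ("age = 8", 0.269),
    ("fare:class = 72:1st", 0.039),
    ("gender = male", -0.083),
    ("embarked = Southampton", -0.002),
    ("sibsp = 0", -0.006),
    ("parch = 0", -0.030),
]

# running[0] is the intercept; running[j] is v0 plus the first j contributions.
running = list(accumulate((v for _, v in contributions), initial=v0))

for (name, v), total in zip(contributions, running[1:]):
    print(f"{name:<24} {v:+.3f} {total:.3f}")
```

Since the printed contributions are rounded, some intermediate sums may differ from the table by 0.001; the final value agrees with the tabulated 0.422.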

The corresponding break-down plot is shown in Figure 7.1.