7.1 Intuition and Simplified Example
Overview
- Interaction here is defined as the deviation from additivity.
- Under interaction, the effect of a given feature depends on the values of other features.
- This chapter focuses on pairwise interactions, but the same idea extends to higher-order interactions.
Simplified Data Example
For illustrative purposes, create a 2-way table using the Titanic dataset:
- Limit to the male subpopulation
- Split the Age variable into “Boys (0-16)” and “Adults (>16)” categories
- Collapse Class into two levels: “2nd” and “other”
Table 7.1: Proportion of Male Survivors on Titanic
Class | Boys (0-16) | Adults (>16) | Total |
---|---|---|---|
2nd | 11/12 = 91.7% | 13/166 = 7.8% | 24/178 = 13.5% |
other | 22/69 = 31.9% | 306/1469 = 20.8% | 328/1538 = 21.3% |
Total | 33/81 = 40.7% | 319/1635 = 19.5% | 352/1716 = 20.5% |
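
The percentages in Table 7.1 follow directly from the survivor and passenger counts in each cell. A minimal sketch that recomputes them, assuming only the four cell counts shown in the table (the names `counts` and `rate` are illustrative, not from the source):

```python
# (survivors, passengers) per (class, age group) cell, copied from Table 7.1.
counts = {
    ("2nd", "Boys"): (11, 12),
    ("2nd", "Adults"): (13, 166),
    ("other", "Boys"): (22, 69),
    ("other", "Adults"): (306, 1469),
}


def rate(cells):
    """Survival proportion pooled over a list of (survivors, passengers) pairs."""
    survived = sum(s for s, _ in cells)
    total = sum(n for _, n in cells)
    return survived / total


# Cell-level rates, row (class) margins, column (age) margins, and the grand mean.
cell_rates = {key: rate([value]) for key, value in counts.items()}
class_rates = {
    c: rate([v for (cls, _), v in counts.items() if cls == c])
    for c in ("2nd", "other")
}
age_rates = {
    a: rate([v for (_, age), v in counts.items() if age == a])
    for a in ("Boys", "Adults")
}
overall = rate(list(counts.values()))

for key, value in cell_rates.items():
    print(key, f"{value:.1%}")
print("2nd class margin:", f"{class_rates['2nd']:.1%}")
print("Boys margin:", f"{age_rates['Boys']:.1%}")
print("Overall:", f"{overall:.1%}")
```

The marginal rates are pooled over counts, not averaged over cell percentages, which is why the "Total" entries sit close to the much larger adult cells.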
Explain Survival Probability for Boys in 2nd Class
Additive Explanation 1
- The marginal survival probability for 2nd class is 13.5%, so the additive effect of class for this instance is -7% (13.5% - 20.5%) relative to the overall mean.
- The survival probability for boys in 2nd class is 91.7%, so the additive effect of being a boy, given 2nd class, is 78.2% (91.7% - 13.5%).
Additive Explanation 2
- The marginal survival probability for boys is 40.7%, so the additive effect of being a boy is 20.2% (40.7% - 20.5%) relative to the overall mean.
- The survival probability for boys in 2nd class is 91.7%, so the additive effect of 2nd class, given boys, is 51% (91.7% - 40.7%).
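
The two additive explanations can be checked side by side. The sketch below uses the rounded percentages from Table 7.1; the variable names are illustrative. Both orderings add up to the same total (91.7% - 20.5%), but they split it between class and age very differently.

```python
# Rounded survival percentages from Table 7.1.
mean = 20.5      # all males
p_class = 13.5   # 2nd class (row margin)
p_boys = 40.7    # boys (column margin)
p_cell = 91.7    # boys in 2nd class

# Explanation 1: condition on class first, then on age.
class_first = round(p_class - mean, 1)
boys_second = round(p_cell - p_class, 1)

# Explanation 2: condition on age first, then on class.
boys_first = round(p_boys - mean, 1)
class_second = round(p_cell - p_boys, 1)

print("Order 1: class", class_first, "| boys", boys_second)
print("Order 2: boys", boys_first, "| class", class_second)
print("Total, order 1:", round(class_first + boys_second, 1))
print("Total, order 2:", round(boys_first + class_second, 1))
```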
Interaction Explanation
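- The two additive explanations attribute different effects to class and age depending on the order in which the features are considered. This is a consequence of interaction.
- In other words, the effect of class depends on age, and vice versa.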
- To calculate the interaction effect:
  - Calculate the joint contribution of 2nd class and boys: 91.7% - 20.5% = 71.2%
  - Subtract the individual (marginal) contributions to get the net interaction effect: 71.2% - (-7%) - 20.2% = 58%
- If there were no interaction, the survival probability could be calculated as the mean survival rate plus the individual variable effects (i.e., 20.5% - 7% + 20.2% = 33.7%). This is far from the observed 91.7%, and the gap is exactly the 58% class/age interaction effect.
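
The interaction calculation can be written as a small function. This is a sketch under the definition used in this section (interaction as deviation from additivity); the function name `interaction_effect` and its argument names are hypothetical, and the inputs are the rounded percentages from Table 7.1.

```python
def interaction_effect(p_cell, p_a, p_b, mean):
    """Joint contribution minus the two marginal contributions.

    All arguments are survival percentages: the cell of interest, the two
    margins that cell belongs to, and the overall mean.
    """
    joint = p_cell - mean   # contribution of both features together
    main_a = p_a - mean     # marginal effect of feature A (here: 2nd class)
    main_b = p_b - mean     # marginal effect of feature B (here: boys)
    return joint - main_a - main_b


mean, p_class, p_boys, p_cell = 20.5, 13.5, 40.7, 91.7

interaction = round(interaction_effect(p_cell, p_class, p_boys, mean), 1)

# Prediction from a purely additive model: mean plus both marginal effects.
additive_prediction = round(mean + (p_class - mean) + (p_boys - mean), 1)

print("Interaction effect:", interaction)
print("Additive prediction:", additive_prediction)
print("Additive prediction + interaction:", round(additive_prediction + interaction, 1))
```

Adding the interaction term back to the additive prediction recovers the observed cell value, which is what "deviation from additivity" means here.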