Method

Basic Equations

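Mathematical representation of the PD profile value for model f() and explanatory variable j at value z:

$$g_{PD}^{j}(z) = E_{X^{-j}}\{f(X^{j|=z})\}$$

where $X^{-j}$ refers to the joint distribution of all explanatory variables other than $X^{j}$, and $X^{j|=z}$ denotes the vector of explanatory variables with the j-th element set to z.

We rarely know the true distribution of $X^{-j}$, so we typically estimate the PD profile using the empirical distribution in our training data:

$$\hat{g}_{PD}^{j}(z) = \frac{1}{n}\sum_{i=1}^{n} f(x_i^{j|=z}).$$

The above equation is the mean of the CP profiles for $X^{j}$.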
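To make the "mean of CP profiles" reading concrete, here is a minimal sketch in plain Python. The model `toy_model`, the three data rows, and the grid are made up for illustration; they are not the model or data from the source.

```python
def toy_model(x):
    # Hypothetical stand-in for a fitted model f(): additive in both variables.
    return 2.0 * x[0] + x[1]

def cp_profile(f, row, j, grid):
    # CP profile: predictions for one observation as variable j sweeps
    # over the grid while all other variables keep their observed values.
    return [f(row[:j] + [z] + row[j + 1:]) for z in grid]

def pd_profile(f, data, j, grid):
    # PD estimate: pointwise mean of the CP profiles of all observations.
    profiles = [cp_profile(f, row, j, grid) for row in data]
    return [sum(col) / len(col) for col in zip(*profiles)]

# Made-up observations with two explanatory variables each.
data = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]]
grid = [0.0, 1.0, 2.0]

cp_first = cp_profile(toy_model, data[0], 0, grid)
pd_x0 = pd_profile(toy_model, data, 0, grid)
print(cp_first)
print(pd_x0)
```

Because this toy model is additive, the CP profiles are parallel and their mean summarises them well.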

Clustered partial-dependence profiles

  • Mean of CP profiles might not be a good representation if profiles are not parallel.
  • Alternative approach would be to create multiple clusters of CP profiles:
    • Use K-means or hierarchical clustering to identify clusters
    • Can use Euclidean distance between CP profiles for identifying similar instances
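The clustering idea above can be sketched as follows. This is an illustration under stated assumptions, not the source's implementation: `toy_model` and the data are made up, and `kmeans` is a small hand-written Lloyd's algorithm with a deterministic start (first profile, then the profile farthest from the centres chosen so far) that uses Euclidean distance between CP profiles.

```python
import math

def toy_model(x):
    # Hypothetical model with an interaction: the slope in x[0]
    # flips with the sign of x[1], so CP profiles are not parallel.
    return x[0] * (1.0 if x[1] > 0 else -1.0) + 0.1 * x[2]

def cp_profile(f, row, j, grid):
    return [f(row[:j] + [z] + row[j + 1:]) for z in grid]

def mean_profile(profiles):
    return [sum(col) / len(col) for col in zip(*profiles)]

def kmeans(profiles, k, n_iter=20):
    # Deterministic initialisation: first profile, then farthest-first.
    centres = [profiles[0]]
    while len(centres) < k:
        centres.append(max(
            profiles,
            key=lambda p: min(math.dist(p, c) for c in centres)))
    labels = []
    for _ in range(n_iter):
        # Assign each profile to its nearest centre (Euclidean distance).
        labels = [min(range(k), key=lambda c: math.dist(p, centres[c]))
                  for p in profiles]
        # Move each centre to the mean of the profiles assigned to it.
        for c in range(k):
            members = [p for p, lab in zip(profiles, labels) if lab == c]
            if members:
                centres[c] = mean_profile(members)
    return labels, centres

data = [[0.0, 1.0, 0.0], [0.0, -1.0, 1.0],
        [0.0, 1.0, 2.0], [0.0, -1.0, 3.0]]
grid = [-1.0, 0.0, 1.0]

profiles = [cp_profile(toy_model, row, 0, grid) for row in data]
overall_pd = mean_profile(profiles)        # a single, nearly flat curve
labels, cluster_pd = kmeans(profiles, k=2)  # one mean profile per cluster
print(overall_pd)
print(labels, cluster_pd)
```

In this toy case the overall mean is flat even though every individual profile has a clear slope. The two cluster means recover the rising and falling groups.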

Example: clustered PDP using a random forest model on the Titanic dataset. Source: Figure 17.2

Grouped partial-dependence profiles

  • We can use grouped PDPs if we can explicitly identify features that influence the shape of the CP profile for the explanatory variable of interest
  • Obvious use case is when model includes interaction between variable of interest and another one.
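A grouped PD profile is the same averaging done separately within each level of a grouping variable. The sketch below assumes a made-up model `toy_model` in which the effect of `"x"` depends on a hypothetical categorical variable `"g"`. All names and data are illustrative.

```python
from collections import defaultdict

def toy_model(row):
    # Hypothetical model with an interaction between "x" and "g".
    slope = 2.0 if row["g"] == "a" else -1.0
    return slope * row["x"]

def pd_profile(f, rows, var, grid):
    # Mean prediction over rows, with `var` set to each grid value in turn.
    return [sum(f({**r, var: z}) for r in rows) / len(rows) for z in grid]

def grouped_pd(f, rows, var, group_var, grid):
    # Split observations by the grouping variable, then compute
    # a separate PD profile inside each group.
    groups = defaultdict(list)
    for r in rows:
        groups[r[group_var]].append(r)
    return {g: pd_profile(f, members, var, grid)
            for g, members in groups.items()}

rows = [{"x": 1.0, "g": "a"}, {"x": 2.0, "g": "a"}, {"x": 3.0, "g": "b"}]
grid = [0.0, 1.0, 2.0]

overall = pd_profile(toy_model, rows, "x", grid)
by_group = grouped_pd(toy_model, rows, "x", "g", grid)
print(overall)
print(by_group)
```

Unlike clustering, the groups here are chosen in advance from a known variable, so the resulting profiles are directly interpretable in terms of that variable.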

Example: grouped PDP using a random forest model on the Titanic dataset. Source: Figure 17.3

Contrastive partial-dependence profiles

We can plot PD profiles for multiple models together on same chart.
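As a rough sketch of the comparison, the PD profile can be computed for each model on a shared grid and the resulting curves overlaid on one chart. The two models below (`linear_model`, `step_model`) and the data are hypothetical stand-ins, not the models from the source.

```python
def pd_profile(f, data, j, grid):
    # Mean prediction over observations with variable j set to each grid value.
    return [sum(f(row[:j] + [z] + row[j + 1:]) for row in data) / len(data)
            for z in grid]

def linear_model(x):
    # Hypothetical model 1: smooth linear effect of x[0].
    return 0.5 * x[0] + x[1]

def step_model(x):
    # Hypothetical model 2: step-like effect of x[0], as a tree might learn.
    return (1.0 if x[0] > 1.0 else 0.0) + x[1]

data = [[0.0, 1.0], [2.0, 3.0]]
grid = [0.0, 1.0, 2.0]

models = {"linear": linear_model, "step": step_model}
# One PD curve per model on the same grid; plotting these together
# gives the contrastive view.
contrast = {name: pd_profile(f, data, 0, grid) for name, f in models.items()}
for name, profile in contrast.items():
    print(name, profile)
```

Where the curves diverge, the models disagree about how the variable affects the average prediction.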

Example: contrastive PDP comparing multiple models on the Titanic dataset. Source: Figure 17.4