Pros and Cons
Pros | Cons |
---|---|
Popular and well understood in the data science community | At most two features can realistically be shown in a single plot |
Simple, intuitive way to summarize the effect of a feature on the target variable | Assumes the plotted feature is independent of the other features; problematic with correlated explanatory variables |
Multiple software packages in R and Python; also easy to implement from scratch | Heterogeneous effects may be hidden in a basic PDP; grouped or clustered profiles may be needed for better insight |
Can be used to assess variable importance | Can be time-consuming to run on medium and large datasets; a sample of the data is likely needed |
Calculation has a causal interpretation for the model of interest | Can be misleading in regions where the training data are sparse |
Note:
- The pros and cons above draw on both the EMA text and Christoph Molnar’s Interpretable Machine Learning book: https://christophm.github.io/interpretable-ml-book/pdp.html
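
To make the "easy to implement from scratch" and "heterogeneous effects may be hidden" points concrete, here is a minimal NumPy sketch. It is an illustration under stated assumptions, not the method of any particular package: the function names `ice_curves` and `partial_dependence`, the toy model `toy_predict`, and the synthetic data are all hypothetical. The toy model is built so that the effect of the first feature flips sign between two equally sized groups, which makes the averaged profile flat even though every individual profile has a clear slope.

```python
import numpy as np

def ice_curves(predict, X, feature, grid):
    """One prediction curve per row of X (individual profiles).

    For each grid value, the chosen feature is overwritten for every row
    and the model is re-evaluated. Returns an array of shape (n_rows, n_grid).
    """
    curves = np.empty((X.shape[0], len(grid)))
    for j, value in enumerate(grid):
        X_mod = X.copy()
        X_mod[:, feature] = value  # force the feature to the grid value
        curves[:, j] = predict(X_mod)
    return curves

def partial_dependence(predict, X, feature, grid):
    """Partial dependence profile: the average of the individual curves."""
    return ice_curves(predict, X, feature, grid).mean(axis=0)

# Hypothetical toy model: the effect of feature 0 flips sign with feature 1.
def toy_predict(X):
    return X[:, 0] * np.where(X[:, 1] > 0, 1.0, -1.0)

# Synthetic data (an assumption for illustration): two balanced groups.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
X[:, 1] = np.where(np.arange(200) % 2 == 0, 1.0, -1.0)

grid = np.linspace(-2, 2, 5)
ice = ice_curves(toy_predict, X, feature=0, grid=grid)
pdp = partial_dependence(toy_predict, X, feature=0, grid=grid)

# The averaged profile is flat, while individual profiles slope up or down.
print("PDP:", pdp)
print("Profile, group +1:", ice[0])
print("Profile, group -1:", ice[1])
```

The cost is one model evaluation over all rows per grid point, which is why passing a random subsample of `X` is a common way to keep the run time manageable on larger datasets.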