16.11 Localized step-wise procedure

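Examining the effect of one feature at a time ignores dependencies between features. When the model's response to one feature depends on the value of another, single-feature attributions can give an inaccurate and misleading picture of the model’s internal logic. The implementation below therefore requests break-down attributions with interactions (type = "break_down_interactions").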
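
To make the idea concrete, here is a minimal sketch in base R with a toy model that has a pure interaction. It is not the iBreakDown implementation: the function f, the reference grid ref, and the observation x_star are hypothetical, and the interaction contribution is taken, as an assumption following my reading of the source cited below, to be the joint contribution of a pair of features minus their two single-feature contributions.

```r
# Hypothetical toy model: the response depends on x1 and x2 only jointly.
f <- function(x1, x2) x1 * x2

# Hypothetical reference data: all four combinations of two binary features.
ref <- expand.grid(x1 = c(0, 1), x2 = c(0, 1))

# Observation to explain.
x_star <- list(x1 = 1, x2 = 1)

# Mean prediction over the reference data (the intercept).
v0 <- mean(f(ref$x1, ref$x2))

# Single-feature contributions: fix one feature at its observed value,
# average over the other, and subtract the intercept.
delta_1 <- mean(f(x_star$x1, ref$x2)) - v0
delta_2 <- mean(f(ref$x1, x_star$x2)) - v0

# Joint contribution: fix both features at their observed values.
delta_12 <- f(x_star$x1, x_star$x2) - v0

# Interaction contribution (assumed definition): what the pair explains
# beyond the sum of the two single-feature contributions.
delta_int <- delta_12 - delta_1 - delta_2

c(intercept = v0, x1 = delta_1, x2 = delta_2,
  joint = delta_12, interaction = delta_int)
```

In this toy case the two single-feature contributions add up to 0.5, while fixing both features together moves the prediction by 0.75. The remaining 0.25 is the part that looking at one feature at a time cannot account for.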

16.11.1 Implementation

Source: https://ema.drwhy.ai/iBreakDown.html

# Load the prepared Titanic data, the fitted random forest, and the observation to explain
titanic_imputed <- archivist::aread("pbiecek/models/27e5c")
titanic_rf <- archivist::aread("pbiecek/models/4e0fc")
(henry <- archivist::aread("pbiecek/models/a6538"))
##   class gender age sibsp parch fare  embarked
## 1   1st   male  47     0     0   25 Cherbourg
library("DALEX")
## Welcome to DALEX (version: 2.4.3).
## Find examples and detailed introduction at: http://ema.drwhy.ai/
## Additional features will be available after installation of: ggpubr.
## Use 'install_dependencies()' to get all suggested dependencies
## 
## Attaching package: 'DALEX'
## The following object is masked _by_ '.GlobalEnv':
## 
##     titanic_imputed
## The following object is masked from 'package:vip':
## 
##     titanic
## The following object is masked from 'package:dplyr':
## 
##     explain
library("randomForest")
## randomForest 4.7-1.1
## Type rfNews() to see new features/changes/bug fixes.
## 
## Attaching package: 'randomForest'
## The following object is masked from 'package:rattle':
## 
##     importance
## The following object is masked from 'package:dplyr':
## 
##     combine
## The following object is masked from 'package:ggplot2':
## 
##     margin
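# Wrap the model in a DALEX explainer; column 9 is dropped from the data,
# leaving the 8 columns reported below, and the target is a logical vector
explain_rf <- DALEX::explain(
  model = titanic_rf,
  data = titanic_imputed[, -9],
  y = titanic_imputed$survived == "yes",
  label = "Random Forest"
)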
## Preparation of a new explainer is initiated
##   -> model label       :  Random Forest 
##   -> data              :  2207  rows  8  cols 
##   -> target variable   :  2207  values 
##   -> predict function  :  yhat.randomForest  will be used (  default  )
##   -> predicted values  :  No value for predict function target column. (  default  )
##   -> model_info        :  package randomForest , ver. 4.7.1.1 , task classification (  default  ) 
##   -> model_info        :  Model info detected classification task but 'y' is a logical . Converted to numeric.  (  NOTE  )
##   -> predicted values  :  numerical, min =  0 , mean =  0.2353095 , max =  1  
##   -> residual function :  difference between y and yhat (  default  )
##   -> residuals         :  numerical, min =  -0.892 , mean =  0.0868473 , max =  1  
##   A new explainer has been created!
# Break-down attributions for henry, with interaction terms allowed
bd_rf <- predict_parts(
  explainer = explain_rf,
  new_observation = henry,
  type = "break_down_interactions"
)


bd_rf
##                                             contribution
## Random Forest: intercept                           0.235
## Random Forest: class = 1st                         0.185
## Random Forest: gender = male                      -0.124
## Random Forest: embarked:fare = Cherbourg:25        0.107
## Random Forest: age = 47                           -0.125
## Random Forest: sibsp = 0                          -0.032
## Random Forest: parch = 0                          -0.001
## Random Forest: prediction                          0.246
plot(bd_rf)
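
As a quick sanity check on the printed table, the intercept and the contributions should add up to the prediction. The sketch below copies the rounded values from the output above by hand (it does not read them from bd_rf), so agreement is only expected up to rounding; the tolerance used is an assumption based on seven terms each rounded to three decimals.

```r
# Contributions copied by hand from the printed bd_rf output (three decimals).
contributions <- c(
  intercept       = 0.235,
  class           = 0.185,
  gender          = -0.124,
  `embarked:fare` = 0.107,
  age             = -0.125,
  sibsp           = -0.032,
  parch           = -0.001
)
reported_prediction <- 0.246

# Sum of the intercept and all contributions.
total <- sum(contributions)

# Each printed value can be off by up to 0.0005, and so can the prediction.
tolerance <- (length(contributions) + 1) * 0.0005
stopifnot(abs(total - reported_prediction) <= tolerance)

total
```

The rounded values sum to about 0.245, against a reported prediction of 0.246, a gap small enough to be explained by rounding in the printed table.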

# Clean-up: shut down the H2O cluster if one is running (this section itself does not use H2O)
h2o::h2o.shutdown(prompt = FALSE)