8.17 Adding color to mosaic plot
In mosaic plots, the color of the tiles can also be used to indicate the degree of relationship among the variables.
If we assume that these three variables are independent, we can examine the residuals from that model (the differences between the observed counts and the counts the model predicts) and shade the tiles to match.
In the graph below, dark blue represents more cases than expected under independence, and dark red represents fewer cases than expected.
mosaic(tbl,
       shade = TRUE,
       legend = TRUE,
       labeling_args = list(set_varnames = c(Sex = "Gender",
                                             Survived = "Survived",
                                             Class = "Passenger Class")),
       set_labels = list(Survived = c("No", "Yes"),
                         Class = c("1st", "2nd", "3rd", "Crew"),
                         Sex = c("F", "M")),
       main = "Titanic data")
We can see that many more male crew members perished, and many more 1st, 2nd, and 3rd class females survived, than would be expected if class, gender, and survival were independent.
Conversely, far fewer 1st class passengers (both male and female) died than would be expected under independence. The assumption of independence is therefore rejected.
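The shading can also be checked numerically. The following is a minimal sketch rather than the code used to produce the plot: it assumes the built-in Titanic table (collapsed over Age) as a stand-in for tbl, and the names tbl_demo, expected, and pearson are illustrative. It computes the counts expected under mutual independence, the Pearson residuals (which mosaic() is understood to use for shading by default), and the corresponding chi-square test.

```r
# Assumption: the built-in Titanic table stands in for `tbl`.
# Its dimensions are Class (1), Sex (2), Age (3), Survived (4);
# collapsing over Age gives a Survived x Class x Sex table.
tbl_demo <- margin.table(Titanic, c(4, 1, 2))

n <- sum(tbl_demo)

# Marginal proportions for each variable
p_survived <- as.numeric(margin.table(tbl_demo, 1)) / n
p_class    <- as.numeric(margin.table(tbl_demo, 2)) / n
p_sex      <- as.numeric(margin.table(tbl_demo, 3)) / n

# Expected counts under mutual independence: n * p_i * p_j * p_k
expected <- n * outer(outer(p_survived, p_class), p_sex)
dimnames(expected) <- dimnames(tbl_demo)

# Pearson residuals: positive = more cases than expected (blue),
# negative = fewer cases than expected (red)
pearson <- (tbl_demo - expected) / sqrt(expected)
print(round(pearson, 1))

# Chi-square test of mutual independence.
# df = (number of cells - 1) - (free parameters in the margins)
chi_sq  <- sum(pearson^2)
df      <- (prod(dim(tbl_demo)) - 1) - sum(dim(tbl_demo) - 1)
p_value <- pchisq(chi_sq, df, lower.tail = FALSE)
```

Large positive residuals correspond to the dark blue tiles and large negative residuals to the dark red tiles. A very small p_value is consistent with rejecting independence.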
NOTE: For complicated tables, labels can easily overlap. See labeling_border for plotting options.
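As a hedged illustration of the kind of adjustment that can reduce overlap, the sketch below passes two labeling_border arguments, rot_labels and gp_labels, through labeling_args. It assumes the vcd package is installed and again uses the built-in Titanic table as a stand-in for tbl. The particular angles and font size are arbitrary choices, not recommendations from the text.

```r
library(vcd)
library(grid)  # for gpar()

# Assumption: the built-in Titanic table stands in for `tbl`.
tbl_demo <- margin.table(Titanic, c(4, 1, 2))  # Survived x Class x Sex

mosaic(tbl_demo,
       shade = TRUE,
       legend = TRUE,
       labeling_args = list(
         # rotation of category labels on the top, right, bottom, left sides
         rot_labels = c(0, 0, 0, 0),
         # smaller font for the category labels
         gp_labels = gpar(fontsize = 9)),
       main = "Titanic data")
```

Other arguments documented for labeling_border control label offsets, justification, and abbreviation. Which combination works best depends on the table being plotted.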