8.17 Adding color to a mosaic plot

  • In mosaic plots, the color of the tiles can also be used to indicate the degree of relationship among the variables.

  • If we assume that these three variables are independent, we can examine the residuals from that independence model (the differences between the observed counts and the counts the model predicts) and shade the tiles to match.

  • In the graph below, dark blue marks tiles with more cases than expected under independence, and dark red marks tiles with fewer cases than expected.

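# Assumes the vcd package is loaded and tbl is a contingency table
# of Survived, Class, and Sex.
mosaic(tbl,
       shade = TRUE,    # color tiles by residuals from the independence model
       legend = TRUE,   # draw the color key for the residuals
       # display names for the variables
       labeling_args = list(set_varnames = c(Sex = "Gender",
                                             Survived = "Survived",
                                             Class = "Passenger Class")),
       # display labels for the levels of each variable
       set_labels = list(Survived = c("No", "Yes"),
                         Class = c("1st", "2nd", "3rd", "Crew"),
                         Sex = c("F", "M")),
       main = "Titanic data")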
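
The residuals behind the shading can be reproduced by hand, which may help when reading the legend. The following is a minimal sketch, not the code used for the plot above. It assumes that the shading is based on Pearson residuals (vcd's default, as far as we know), and it uses R's built-in Titanic table collapsed over Age as a stand-in for tbl. The names tab3, expected, and pearson are illustrative only.

```r
# Stand-in data (assumption): R's built-in Titanic table, collapsed over Age
# so that only Class, Sex, and Survived remain.
tab3 <- margin.table(Titanic, c(1, 2, 4))

# Fit the mutual-independence log-linear model [Class][Sex][Survived].
fit <- loglin(tab3, margin = list(1, 2, 3), fit = TRUE, print = FALSE)
expected <- fit$fit

# Pearson residuals: (observed - expected) / sqrt(expected).
pearson <- (tab3 - expected) / sqrt(expected)

# Positive values correspond to tiles with more cases than expected,
# negative values to tiles with fewer cases than expected.
round(ftable(pearson), 1)
```

Large positive residuals correspond to the dark blue tiles and large negative residuals to the dark red tiles.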

  • Compared with what we would expect if class, gender, and survival were independent, many more male crew members perished, and many more 1st, 2nd, and 3rd class females survived.

  • Conversely, far fewer 1st class passengers (both male and female) died than would be expected under independence, so the assumption of independence is rejected.

  • NOTE: For complicated tables, labels can easily overlap. See the help page for labeling_border (?labeling_border) for the available labeling options.
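
As one way to reduce overlapping labels, the rotation of the tile labels can be changed through labeling_args. This is a hedged sketch: it again uses the built-in Titanic table collapsed over Age as a stand-in for tbl, and the rotation and margin values are arbitrary choices for illustration, not recommended settings.

```r
library(vcd)  # assumed to be installed

# Stand-in data (assumption): built-in Titanic table collapsed over Age.
tab3 <- margin.table(Titanic, c(1, 2, 4))

mosaic(tab3,
       shade = TRUE,
       legend = TRUE,
       # Set the rotation of the labels on the left and right sides
       # to 0 degrees, so they are drawn horizontally.
       labeling_args = list(rot_labels = c(left = 0, right = 0)),
       # Leave extra room (in lines) on the left for the horizontal labels.
       margins = c(left = 4),
       main = "Titanic data")
```

Other labeling_border arguments can be passed in the same list.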