11.8 Data cleaning
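Coerce the date to the correct format and replace attendance values of 0 with NA (missing), so that games with no recorded attendance are dropped from the plot instead of being drawn as zeros.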
chi_attendance <- chi_attendance |>
  mutate(
    the_date = ymd(Date),
    Attendance = na_if(Attendance, 0)
  )
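The effect of this cleaning step can be checked on a small example. The sketch below uses a hypothetical three-row tibble named `toy`, not the real `chi_attendance` data, and assumes that `Date` is stored as a `yyyymmdd` string. It shows `ymd()` parsing the date and `na_if()` turning a zero attendance into `NA`.

```r
library(dplyr)
library(lubridate)

# Hypothetical toy data for illustration only; the values are made up
toy <- tibble(
  Date = c("20160404", "20160405", "20160406"),
  Attendance = c(38000, 0, 41000)
)

toy_clean <- toy |>
  mutate(
    the_date = ymd(Date),               # character -> Date
    Attendance = na_if(Attendance, 0)   # 0 -> NA, other values unchanged
  )

toy_clean
```

Only the second row, where the attendance was 0, ends up with a missing `Attendance`.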
Chicago ballpark attendance comparison
chi_attendance |>
  ggplot(aes(
    x = wday(the_date), y = Attendance,
    color = HomeTeam
  )) +
  geom_jitter(height = 0, width = 0.2, alpha = 0.2) +
  geom_smooth() +
  scale_y_continuous("Attendance") +
  scale_x_continuous(
    "Day of the Week", breaks = 1:7,
    labels = wday(1:7, label = TRUE)
  ) +
  scale_color_manual(values = crc_fc)
## `geom_smooth()` using method = 'loess' and formula = 'y ~ x'
## Warning: Removed 10 rows containing non-finite outside the scale range
## (`stat_smooth()`).
## Warning: Removed 10 rows containing missing values or values outside the scale range
## (`geom_point()`).
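The x-axis of this plot depends on how `wday()` numbers the days of the week. As a minimal sketch, assuming lubridate's default week start (Sunday) and using an arbitrary example date that is not taken from the data, the mapping between the numeric values and the axis labels can be inspected directly:

```r
library(lubridate)

d <- ymd("2016-04-04")   # an arbitrary example date (a Monday)

wday(d)                  # numeric day of week; Sunday is 1 by default
wday(d, label = TRUE)    # the same day as an ordered factor label
wday(1:7, label = TRUE)  # the labels placed at breaks 1 through 7
```

With the default week start, the break at 1 is labeled as Sunday and the break at 7 as Saturday.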