3.4 Why Tidy Principles and {tidymodels}?

The {tidyverse} has four guiding principles, which {tidymodels} shares:

  • It is human centered, i.e. the {tidyverse} is designed specifically to support the activities of a human data analyst.

    • Functions use sensible defaults, or use no defaults in cases where the user must make a choice (e.g. a file path).
    • {recipes} and {parsnip} enable data frames to be used everywhere in the modeling process. Data frames are often more convenient than working with matrices or vectors.
  • It is consistent, so that what you learn about one function or package can be applied to another, and the number of special cases that you need to remember is as small as possible.

    • Object-oriented programming (mainly S3) gives functions such as predict() a consistent interface across model types.
    • broom::tidy() output is in a consistent format (a data frame), whereas the list outputs returned by package-specific functions vary in structure.
  • It is composable, allowing you to solve complex problems by breaking them down into small pieces, supporting a rapid cycle of exploratory iteration to find the best solution.

    • {recipes}, {parsnip}, {tune}, {dials}, etc. are separate packages used together in a tidy machine learning workflow. It may seem inconvenient to have so many packages, each performing a specific task, but this separation helps decompose the whole model design process and often makes problems feel more manageable.
  • It is inclusive, because the tidyverse is not just the collection of packages, but it is also the community of people who use them.

    • Although the {tidyverse} and {tidymodels} are opinionated in their design, the developers are receptive to public feedback.
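
The consistency point can be sketched with base R alone. The example below is only an illustration: it uses the built-in mtcars data, and tidy_coefs() is a hypothetical helper written for this sketch that imitates the idea behind broom::tidy(); it is not the {broom} implementation.

```r
# Two different model classes fitted on the built-in mtcars data.
fit_lm  <- lm(mpg ~ wt + hp, data = mtcars)
fit_glm <- glm(am ~ wt, data = mtcars, family = binomial())

new_cars <- head(mtcars, 3)

# predict() is an S3 generic: the same function name dispatches to
# predict.lm() or predict.glm() depending on the class of the fitted object.
pred_lm  <- predict(fit_lm,  newdata = new_cars)
pred_glm <- predict(fit_glm, newdata = new_cars, type = "response")

# Package-specific summaries are lists whose elements differ by model class.
names(summary(fit_lm))
names(summary(fit_glm))

# Hypothetical helper (an assumption for this sketch, not broom::tidy()):
# return coefficients as a data frame with the same columns for any model
# whose summary has "Estimate" and "Std. Error" columns.
tidy_coefs <- function(fit) {
  coefs <- coef(summary(fit))
  data.frame(
    term      = rownames(coefs),
    estimate  = unname(coefs[, "Estimate"]),
    std.error = unname(coefs[, "Std. Error"]),
    row.names = NULL
  )
}

tidy_coefs(fit_lm)
tidy_coefs(fit_glm)
```

Both calls to tidy_coefs() return a data frame with identical column names, so downstream code does not need a special case per model type.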
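
Composability can be illustrated with a minimal sketch that assumes the {recipes}, {parsnip}, and {workflows} packages are installed. The choice of mtcars, the predictors, and the "lm" engine are assumptions made for illustration only; {workflows} is not named in the list above but is the {tidymodels} package that bundles a recipe and a model specification.

```r
library(recipes)
library(parsnip)
library(workflows)

# Piece 1: preprocessing, defined on a data frame with a formula.
rec <- recipe(mpg ~ wt + hp, data = mtcars) |>
  step_normalize(all_numeric_predictors())

# Piece 2: the model specification, independent of the data.
spec <- linear_reg() |>
  set_engine("lm")

# Piece 3: combine the pieces into one object that can be fitted.
wf <- workflow() |>
  add_recipe(rec) |>
  add_model(spec)

wf_fit <- fit(wf, data = mtcars)

# Predictions come back as a data frame (a tibble with a .pred column).
preds <- predict(wf_fit, new_data = head(mtcars, 3))
preds
```

Because each piece is a separate object, the recipe or the model specification can be swapped out without touching the rest of the workflow.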