Tidy Modeling with R - Book Club

Chapter 14 Iterative search

Learning objectives:

  • Use tune::tune_bayes() to tune model hyperparameters with Bayesian optimization.
    • Describe how a Gaussian process model can be applied to parameter optimization.
    • Explain how acquisition functions can be expressed as a trade-off between exploration and exploitation.
    • Describe expected improvement, the default acquisition function used by {tidymodels}.
  • Use finetune::tune_sim_anneal() to tune model hyperparameters with simulated annealing.
    • Describe simulated annealing search.
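
To make the exploration versus exploitation trade-off in the objectives above concrete, here is a minimal base R sketch of expected improvement for a metric that should be maximized. It is an illustration under stated assumptions, not the code {tune} runs internally: the function name `expected_improvement`, the candidate values, and `best_so_far` are all made up for this example, and the predicted mean and standard deviation are assumed to come from a surrogate model such as a Gaussian process.

```r
# Expected improvement (EI) for a metric that should be maximized.
# `mu` and `sigma` are the surrogate model's predicted mean and standard
# deviation at each candidate; `best` is the best value observed so far.
# All names here are hypothetical and not part of {tune}.
expected_improvement <- function(mu, sigma, best) {
  gain <- mu - best
  z <- ifelse(sigma > 0, gain / sigma, 0)
  ei <- gain * pnorm(z) + sigma * dnorm(z)
  # With no predictive uncertainty, EI reduces to the plain gain (if any).
  ifelse(sigma > 0, ei, pmax(gain, 0))
}

# Two made-up candidates: one slightly better than the current best with
# little uncertainty, one slightly worse but with much more uncertainty.
candidates <- data.frame(
  label = c("exploit", "explore"),
  mu    = c(0.86, 0.84),
  sigma = c(0.005, 0.05)
)
best_so_far <- 0.85

candidates$ei <- expected_improvement(
  candidates$mu, candidates$sigma, best_so_far
)

# Rank candidates from highest to lowest expected improvement.
candidates[order(-candidates$ei), ]
```

With these made-up numbers the "explore" candidate ranks first even though its predicted mean is below the current best, because its larger predictive uncertainty leaves more room for a large gain. That is the sense in which an acquisition function balances the two goals.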