16.8 Uniform Manifold Approximation and Projection (UMAP)

  • Non-linear, unlike ICA; similar to t-SNE
  • Powerful: can separate groups in the data very distinctly
  • Uses a distance-based nearest-neighbor method to find local areas where data points are more likely to be related
  • Creates a smaller feature set
  • Unsupervised and supervised versions
  • Can be sensitive to tuning parameters
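
The nearest-neighbor idea in the list above can be sketched in base R. This is only an illustration of the neighbor-finding step, not UMAP itself: the toy data, the variable names, and the choice of k = 3 are all assumptions made for the example.

```r
# Toy data (made up): two well-separated clusters of 10 points each in 2-D.
set.seed(1)
x <- rbind(
  matrix(rnorm(20, mean = 0), ncol = 2),
  matrix(rnorm(20, mean = 10), ncol = 2)
)

# Pairwise Euclidean distances between all points.
d <- as.matrix(dist(x))

# For each point, the row indices of its k nearest neighbors.
# order(row)[1] is the point itself (distance 0), so it is skipped.
k <- 3
nn <- t(apply(d, 1, function(row) order(row)[2:(k + 1)]))

# Neighbors of the first point; they should come from the first cluster.
nn[1, ]
```

Each row of nn describes one "local area": points in the first cluster only have neighbors in the first cluster, and likewise for the second.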
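library(embed)  # provides step_umap()

# Unsupervised UMAP: only the predictors are used to build the embedding.
# bean_rec_trained and plot_validation_results() are assumed to have been
# created earlier in the chapter.
bean_rec_trained %>%
  step_umap(all_numeric_predictors(), num_comp = 4) %>%
  plot_validation_results() +
  ggtitle("UMAP (unsupervised)")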

# Supervised UMAP: the outcome column also informs the embedding.
bean_rec_trained %>%
  step_umap(all_numeric_predictors(), outcome = "class", num_comp = 4) %>%
  plot_validation_results() +
  ggtitle("UMAP (supervised)")

Judging from the validation-set plots, the supervised method appears to perform better.
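
Because UMAP can be sensitive to its tuning parameters, it may help to see where they are set. The following is a minimal sketch, assuming the recipes and embed packages are installed; it uses the built-in iris data rather than the beans data, and the values of neighbors and min_dist are arbitrary illustrations, not recommendations.

```r
library(recipes)
library(embed)

# Illustrative recipe on iris (not the beans data from this chapter).
umap_rec <- recipe(Species ~ ., data = iris) %>%
  step_normalize(all_numeric_predictors()) %>%
  step_umap(
    all_numeric_predictors(),
    num_comp  = 2,    # number of UMAP components to keep
    neighbors = 10,   # size of the local neighborhood (arbitrary value)
    min_dist  = 0.1   # how tightly points may be packed (arbitrary value)
  )

# Estimate the embedding and extract the new features.
umap_fit <- prep(umap_rec)
umap_scores <- bake(umap_fit, new_data = NULL)
head(umap_scores)
```

Changing neighbors or min_dist and re-running the recipe is a quick way to see how much the embedding shifts with these settings.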