5.3 Approaches for Novel Categories
Suppose that a model is built to predict the probability that an individual works in a STEM profession and that this model depends on city names. The model will be able to predict the probability of STEM profession if a new individual lives in one of the cities in the training set.
But what happens to the model prediction when a new individual lives in a city that is not represented?
One strategy would be to use the previously mentioned “other” category to capture new values. This approach can also be used with feature hashing.