5.3 Approaches for Novel Categories

Suppose that a model is built to predict the probability that an individual works in a STEM profession and that this model depends on city names. The model can produce this prediction for a new individual who lives in one of the cities in the training set.

But what happens to the model prediction when a new individual lives in a city that is not represented in the training set?

One strategy would be to use the previously mentioned “other” category to capture new values. This approach can also be used with feature hashing.
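
As a minimal sketch of this strategy, the Python below routes any city not seen during training to a pooled "other" level before encoding it. The function names, the example cities, and the bucket count are hypothetical and chosen only for illustration. The hashing variant shown here reflects one possible reading of how the two ideas combine: the novel value is first mapped to "other" and then hashed like any known level.

```python
import hashlib

OTHER = "other"  # hypothetical name for the pooled level


def fit_levels(train_values):
    """Record the distinct training categories and append the pooled level."""
    return sorted(set(train_values)) + [OTHER]


def to_known(value, levels):
    """Return the value itself if it was seen in training, else the pooled level."""
    return value if value in levels else OTHER


def one_hot(value, levels):
    """Indicator encoding in which novel values activate the 'other' column."""
    key = to_known(value, levels)
    return [1 if level == key else 0 for level in levels]


def hash_bucket(value, levels, n_buckets=8):
    """Hash the (possibly pooled) level into one of n_buckets with a stable hash."""
    key = to_known(value, levels)
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    return int(digest, 16) % n_buckets


# Illustrative data only: three training cities and one unseen city.
train_cities = ["Austin", "Boston", "Austin", "Denver"]
levels = fit_levels(train_cities)

known_row = one_hot("Boston", levels)  # activates the "Boston" column
novel_row = one_hot("Reno", levels)    # activates the "other" column
```

Because every unseen city collapses to the same level, the fitted model can score new data without being refit, although it cannot distinguish one novel city from another. This sketch also assumes that no real category in the data is itself named "other".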