Method: Gower similarity measure

  • Pro: It can be used for vectors with both categorical and continuous variables.
  • Con: It takes into account neither correlation between variables nor variable importance.

d_{gower}(x_i, x_j)=\frac{1}{p}\sum_{k=1}^{p} d^k(x_i^k, x_j^k)

For a continuous variable:

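d^k(x_i^k, x_j^k)=\frac{|x_i^k - x_j^k|}{\max_l x_l^k - \min_l x_l^k}

Here the denominator is the range of variable k over all observations, so each continuous term lies in [0, 1].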

For a categorical variable:

d^k(x_i^k, x_j^k)=1_{x_i^k \neq x_j^k}
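
The formulas above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function names (column_ranges, gower_distance), the is_categorical flags, and the toy records are all assumptions made for the example. It also assumes continuous variables are normalized by their observed range, and that a constant continuous variable (range 0) contributes 0.

```python
def column_ranges(records, is_categorical):
    """Return max - min for each continuous column, None for categorical ones."""
    ranges = []
    for k, categorical in enumerate(is_categorical):
        if categorical:
            ranges.append(None)
        else:
            values = [row[k] for row in records]
            ranges.append(max(values) - min(values))
    return ranges


def gower_distance(xi, xj, is_categorical, ranges):
    """Average of the per-variable distances d^k over the p variables."""
    p = len(xi)
    total = 0.0
    for k in range(p):
        if is_categorical[k]:
            # Indicator term: 1 if the categories differ, 0 otherwise.
            total += 0.0 if xi[k] == xj[k] else 1.0
        elif ranges[k] > 0:
            # Absolute difference scaled by the range of variable k.
            total += abs(xi[k] - xj[k]) / ranges[k]
        # A constant continuous variable (range 0) adds nothing (assumption).
    return total / p


# Hypothetical toy data: (age, income, colour).
records = [
    (25, 50000.0, "red"),
    (35, 60000.0, "blue"),
    (45, 90000.0, "red"),
]
is_categorical = [False, False, True]
ranges = column_ranges(records, is_categorical)

d01 = gower_distance(records[0], records[1], is_categorical, ranges)
d02 = gower_distance(records[0], records[2], is_categorical, ranges)
print(d01, d02)
```

For the first two records, the per-variable terms are 10/20, 10000/40000 and 1 (the colours differ), so the distance is their average, 1.75/3. Every variable carries the same weight 1/p, which is the limitation noted in the "Con" bullet.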