NBC Math
MLEs
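- binary features (\(N_{c}\): number of training examples in class \(c\); \(N_{dc}\): number of those with feature \(d\) equal to 1)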
\[\hat{\theta}_{dc} = \frac{N_{dc}}{N_{c}}\]
- discrete features (\(N_{dck}\): number of class-\(c\) examples with \(x_{d} = k\))
\[\hat{\theta}_{dck} = \displaystyle\frac{N_{dck}}{N_{c}}\]
- numerical features
\[\begin{array}{rcl} \hat{\mu}_{dc} & = & \displaystyle\frac{1}{N_{c}} \displaystyle\sum_{n:y_{n} = c} x_{nd} \\ \hat{\sigma}_{dc}^{2} & = & \displaystyle\frac{1}{N_{c}} \displaystyle\sum_{n:y_{n} = c} (x_{nd} - \hat{\mu}_{dc})^{2} \\ \end{array}\]
- MAP: add-one smoothing
\[\begin{array}{rcl} \bar{\theta}_{dc} & = & \displaystyle\frac{1 + N_{dc}}{2 + N_{c}} \\ p(y = c|\vec{x}, D) & \propto & \bar{\pi}_{c}\displaystyle\prod_{d}\prod_{k} \bar{\theta}_{dck}^{I(x_{d} = k)} \\ \end{array}\]
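The estimators above can be sketched in plain Python for the binary-feature case. This is a minimal illustration, not a reference implementation: the function names (`fit_bernoulli_nb`, `gaussian_mle`, `posterior`) are hypothetical, and the smoothed class prior \(\bar{\pi}_{c} = (1 + N_{c})/(C + N)\) is an assumption, since the notes do not define \(\bar{\pi}_{c}\).

```python
from collections import Counter


def fit_bernoulli_nb(X, y, smooth=True):
    """Estimate class priors and Bernoulli parameters for binary features.

    X: list of rows of 0/1 values; y: list of class labels.
    smooth=True  -> add-one smoothing, (1 + N_dc) / (2 + N_c)
    smooth=False -> plain MLE, N_dc / N_c
    """
    n_features = len(X[0])
    class_counts = Counter(y)  # N_c
    on_counts = {c: [0] * n_features for c in class_counts}  # N_dc
    for row, label in zip(X, y):
        for d, value in enumerate(row):
            on_counts[label][d] += value

    a = 1 if smooth else 0
    n_classes = len(class_counts)
    # Assumption: the class prior is smoothed the same way as the features.
    priors = {c: (a + n_c) / (a * n_classes + len(y))
              for c, n_c in class_counts.items()}
    theta = {c: [(a + on_counts[c][d]) / (2 * a + n_c)
                 for d in range(n_features)]
             for c, n_c in class_counts.items()}
    return priors, theta


def gaussian_mle(values):
    """MLE mean and variance of one numerical feature within one class."""
    n = len(values)
    mu = sum(values) / n
    var = sum((v - mu) ** 2 for v in values) / n  # divides by n, not n - 1
    return mu, var


def posterior(x, priors, theta):
    """Normalized p(y = c | x) for a fully observed binary vector x."""
    scores = {}
    for c, prior in priors.items():
        score = prior
        for d, value in enumerate(x):
            # theta^I(x_d = 1) * (1 - theta)^I(x_d = 0)
            score *= theta[c][d] if value == 1 else 1.0 - theta[c][d]
        scores[c] = score
    total = sum(scores.values())
    return {c: s / total for c, s in scores.items()}


# Toy data, made up for illustration only.
X = [[1, 0], [1, 1], [0, 0], [0, 1]]
y = ["a", "a", "b", "b"]
priors, theta = fit_bernoulli_nb(X, y)
print(posterior([1, 0], priors, theta))
```

With `smooth=False`, class "b" gets \(\hat{\theta} = 0\) for the first feature in this toy data, so any test vector with that feature on receives zero posterior mass for "b"; the smoothed estimate avoids that. For many features, summing log-probabilities instead of multiplying is the usual way to avoid underflow.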
Imputation
Suppose that the value of \(x_{j}\) is missing. We marginalize it out:
- Gaussian discriminant analysis
\[p(y=c|\vec{x}_{i \neq j}, \vec{\theta}) \propto p(y = c|\vec{\theta})\displaystyle\sum_{x_{j}} p(x_{j}, \vec{x}_{i \neq j}|y = c, \vec{\theta})\]
- Naive Bayes classifier
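\[\displaystyle\sum_{x_{j}} p(x_{j}, \vec{x}_{i \neq j} | y = c, \vec{\theta}) = \left[\displaystyle\sum_{x_{j}} p(x_{j}|\vec{\theta}_{jc})\right] \displaystyle\prod_{i \neq j} p(x_{i}|\vec{\theta}_{ic}) = \displaystyle\prod_{i \neq j} p(x_{i}|\vec{\theta}_{ic})\]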
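For the naive Bayes case, the marginalization amounts to dropping the missing feature's factor from the product. The sketch below checks this against an explicit sum over the missing value for binary features. The function names and the parameter values are hypothetical, and `None` as the marker for a missing entry is an assumption of this example.

```python
def likelihood(x, theta_c):
    """p(x | y = c) for a fully observed binary vector under naive Bayes."""
    p = 1.0
    for value, t in zip(x, theta_c):
        p *= t if value == 1 else 1.0 - t
    return p


def likelihood_skip_missing(x, theta_c):
    """p(observed part of x | y = c): factors of missing entries are dropped."""
    p = 1.0
    for value, t in zip(x, theta_c):
        if value is None:
            continue  # sum over x_j of p(x_j | theta_jc) equals 1
        p *= t if value == 1 else 1.0 - t
    return p


def likelihood_by_sum(x, theta_c, j):
    """Same quantity, computed by summing the joint over x_j in {0, 1}."""
    total = 0.0
    for v in (0, 1):
        filled = list(x)
        filled[j] = v
        total += likelihood(filled, theta_c)
    return total


def posterior_with_missing(x, priors, theta):
    """Normalized p(y = c | observed features)."""
    scores = {c: priors[c] * likelihood_skip_missing(x, theta[c])
              for c in priors}
    total = sum(scores.values())
    return {c: s / total for c, s in scores.items()}


# Hypothetical parameters for two classes and two binary features.
priors = {"a": 0.5, "b": 0.5}
theta = {"a": [0.75, 0.5], "b": [0.25, 0.5]}
x = [1, None]  # second feature is missing
print(posterior_with_missing(x, priors, theta))
```

The same shortcut does not carry over to Gaussian discriminant analysis with a full covariance matrix, where the missing feature is correlated with the observed ones and has to be marginalized out of the joint density.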