Bayes and Logistic Regression

The class posterior distribution for a Naive Bayes classification model has the same form as multinomial logistic regression:

\[p(y = c|\vec{x}, \vec{\theta}) = \displaystyle\frac{e^{\beta_{c}^{T}\vec{x} + \gamma_{c}}}{\displaystyle\sum_{c'=1}^{C} e^{\beta_{c'}^{T}\vec{x} + \gamma_{c'}}}\]
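To see why, note that Bayes' rule gives \(p(y=c|\vec{x}) \propto \exp\big(\log p(y=c) + \log p(\vec{x}|y=c)\big)\). One sketch of the argument: whenever the class-conditional log-likelihood is linear in \(\vec{x}\) (e.g. Bernoulli features, or Gaussian features with class-independent variances), the exponent collapses to the linear form above:

\[\log p(y=c) + \log p(\vec{x}\,|\,y=c) = \beta_{c}^{T}\vec{x} + \gamma_{c},\]

and normalizing over the \(C\) classes yields exactly the softmax expression. The two models share a functional form but, as the bullets below note, they are fit by optimizing different likelihoods.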

Naive Bayes

\[f(y \,|\, x_{1}, x_{2}, \ldots, x_{p}) = \frac{f(y) \cdot L(y \,|\, x_{1}, x_{2}, \ldots, x_{p})}{\sum_{y'} f(y') \cdot L(y' \,|\, x_{1}, x_{2}, \ldots, x_{p})}\]

  • features assumed conditionally independent given the class \(\rightarrow\) computationally efficient
  • generalizes to more than two categories
  • the independence assumption is commonly violated in practice
  • optimizes joint likelihood \(\displaystyle\prod_{n} p(y_{n},\vec{x}_{n}|\vec{\theta})\)
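As a minimal sketch (not a reference implementation), a Gaussian Naive Bayes classifier can be fit by maximizing the joint likelihood above: estimate class priors \(f(y)\) and per-class, per-feature Gaussian parameters, then apply Bayes' rule at prediction time. Function names and the variance floor are illustrative choices.

```python
import math

def fit_gaussian_nb(X, y):
    """Maximize the joint likelihood: class priors plus per-class,
    per-feature Gaussian means and variances (features independent given y)."""
    classes = sorted(set(y))
    n = len(y)
    model = {}
    for c in classes:
        Xc = [x for x, yi in zip(X, y) if yi == c]
        prior = len(Xc) / n
        means = [sum(col) / len(Xc) for col in zip(*Xc)]
        # small floor keeps variances positive for constant features
        vars_ = [sum((v - m) ** 2 for v in col) / len(Xc) + 1e-9
                 for col, m in zip(zip(*Xc), means)]
        model[c] = (prior, means, vars_)
    return model

def predict_proba(model, x):
    """Class posterior f(y | x) via Bayes' rule, computed in log space."""
    log_joint = {}
    for c, (prior, means, vars_) in model.items():
        ll = math.log(prior)
        for xi, m, v in zip(x, means, vars_):
            ll += -0.5 * math.log(2 * math.pi * v) - (xi - m) ** 2 / (2 * v)
        log_joint[c] = ll
    mx = max(log_joint.values())  # subtract max for numerical stability
    exp = {c: math.exp(v - mx) for c, v in log_joint.items()}
    z = sum(exp.values())
    return {c: e / z for c, e in exp.items()}
```

For example, `predict_proba(fit_gaussian_nb([[1.0, 2.0], [1.2, 1.9], [3.0, 4.0], [3.1, 4.2]], [0, 0, 1, 1]), [1.1, 2.0])` assigns almost all posterior mass to class 0, and the construction extends to any number of classes.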

Logistic Regression

\[\log\left(\frac{\pi}{1-\pi}\right) = \beta_{0} + \beta_{1}X_{1} + \cdots + \beta_{p}X_{p}, \qquad \pi = P(Y = 1 \mid X_{1}, \ldots, X_{p})\]

  • binary classification
  • coefficients \(\rightarrow\) interpretable: each \(\beta_{j}\) is the change in log-odds per unit change in \(X_{j}\)
  • optimizes conditional likelihood \(\displaystyle\prod_{n} p(y_{n}|\vec{x}_{n},\vec{\theta})\)
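The conditional likelihood can be maximized by gradient ascent on \(\sum_{n} \log p(y_{n}|\vec{x}_{n},\vec{\theta})\). A minimal sketch, assuming binary labels and a fixed learning rate (both the stochastic update and the hyperparameters are illustrative):

```python
import math

def fit_logistic(X, y, lr=0.1, epochs=2000):
    """Gradient ascent on the conditional log-likelihood
    sum_n log p(y_n | x_n, beta) for binary logistic regression."""
    p = len(X[0])
    beta0, beta = 0.0, [0.0] * p
    for _ in range(epochs):
        for x, t in zip(X, y):
            z = beta0 + sum(b * xi for b, xi in zip(beta, x))
            pi = 1.0 / (1.0 + math.exp(-z))  # sigmoid inverts the log-odds
            err = t - pi                      # per-example log-likelihood gradient
            beta0 += lr * err
            beta = [b + lr * err * xi for b, xi in zip(beta, x)]
    return beta0, beta

def predict_pi(beta0, beta, x):
    """Estimated pi = P(Y = 1 | x) from the fitted log-odds."""
    z = beta0 + sum(b * xi for b, xi in zip(beta, x))
    return 1.0 / (1.0 + math.exp(-z))
```

Unlike the Naive Bayes fit, nothing here models \(\vec{x}\) itself; only \(p(y|\vec{x})\) is parameterized, which is exactly the joint-vs-conditional distinction in the bullets above.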