Bayes and Logistic Regression
The class posterior distribution for a Naive Bayes classification model has the same form as multinomial logistic regression:
\[p(y = c|\vec{x}, \vec{\theta}) = \displaystyle\frac{e^{\beta_{c}^{T}\vec{x} + \gamma_{c}}}{\displaystyle\sum_{c'=1}^{C} e^{\beta_{c'}^{T}\vec{x} + \gamma_{c'}}}\]
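One way to see the connection (a sketch, assuming Gaussian class-conditional densities with class-independent per-feature variances \(\sigma_j^2\) — an assumption not fixed by the notes above): the log of the joint \(p(y=c)\,p(\vec{x}\,|\,y=c)\) is linear in \(\vec{x}\),

\[\log \pi_{c} + \sum_{j=1}^{p} \log \mathcal{N}(x_{j};\, \mu_{jc}, \sigma_{j}^{2}) = \sum_{j=1}^{p} \frac{\mu_{jc}}{\sigma_{j}^{2}} x_{j} + \left(\log \pi_{c} - \sum_{j=1}^{p} \frac{\mu_{jc}^{2}}{2\sigma_{j}^{2}}\right) + \mathrm{const}(\vec{x}),\]

so taking \(\beta_{c} = (\mu_{1c}/\sigma_{1}^{2}, \ldots, \mu_{pc}/\sigma_{p}^{2})\) and \(\gamma_{c}\) equal to the parenthesized term, normalizing over classes cancels the \(\vec{x}\)-only constant and yields exactly the softmax form above.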
Naive Bayes
\[f(y | x_{1}, x_{2}, ..., x_{p}) = \frac{f(y) \cdot L(y | x_{1}, x_{2}, ..., x_{p})}{\sum_{y'} f(y') \cdot L(y' | x_{1}, x_{2}, ..., x_{p})}\]
- features assumed conditionally independent given the class \(\rightarrow\) computationally efficient
- generalizes to more than two categories
- the independence assumption is commonly violated in practice
- optimizes joint likelihood \(\displaystyle\prod_{n} p(y_{n},\vec{x}_{n}|\vec{\theta})\)
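The posterior formula above can be sketched numerically. A minimal Gaussian Naive Bayes in NumPy (an illustration under the assumption of continuous features with normal class-conditionals; `fit_nb` and `predict_nb` are made-up names, not a library API):

```python
import numpy as np

def fit_nb(X, y):
    # Estimate prior f(y) and per-feature Gaussian parameters per class.
    params = {}
    for c in np.unique(y):
        Xc = X[y == c]
        params[c] = (len(Xc) / len(X),        # prior f(y)
                     Xc.mean(axis=0),         # per-feature means
                     Xc.var(axis=0) + 1e-9)   # per-feature variances
    return params

def predict_nb(X, params):
    # Pick the class maximizing log f(y) + sum_j log N(x_j | mu_j, var_j);
    # the sum over features is the conditional-independence assumption.
    def argmax_post(x):
        scores = {}
        for c, (prior, mu, var) in params.items():
            ll = -0.5 * np.sum(np.log(2 * np.pi * var) + (x - mu) ** 2 / var)
            scores[c] = np.log(prior) + ll
        return max(scores, key=scores.get)
    return np.array([argmax_post(x) for x in X])

# toy data: two well-separated clusters
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 1.0, size=(50, 2)),
               rng.normal(4.0, 1.0, size=(50, 2))])
y = np.array([0] * 50 + [1] * 50)

params = fit_nb(X, y)
accuracy = (predict_nb(X, params) == y).mean()
```

Because the clusters are far apart relative to their spread, the fitted model separates them almost perfectly.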
Logistic Regression
\[\log\left(\frac{\pi}{1-\pi}\right) = \beta_{0} + \beta_{1}X_{1} + \cdots + \beta_{p}X_{p}\]
- binary classification
- coefficients \(\rightarrow\) interpretable estimates of how each predictor shifts the log-odds of the outcome
- optimizes conditional likelihood \(\displaystyle\prod_{n} p(y_{n}|\vec{x}_{n},\vec{\theta})\)
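The conditional-likelihood objective can be maximized directly. A minimal sketch using gradient ascent on \(\sum_n y_n \log \pi_n + (1-y_n)\log(1-\pi_n)\) (illustrative only; `fit_logreg` is a made-up name, and the learning rate and step count are arbitrary choices):

```python
import numpy as np

def sigmoid(z):
    # clip to avoid overflow in exp for strongly separated data
    return 1.0 / (1.0 + np.exp(-np.clip(z, -30, 30)))

def fit_logreg(X, y, lr=0.1, steps=2000):
    # Gradient ascent on the conditional log-likelihood; the gradient
    # of the Bernoulli log-likelihood w.r.t. beta is X^T (y - pi).
    Xb = np.hstack([np.ones((len(X), 1)), X])  # prepend intercept column
    beta = np.zeros(Xb.shape[1])
    for _ in range(steps):
        pi = sigmoid(Xb @ beta)
        beta += lr * Xb.T @ (y - pi) / len(y)
    return beta

# toy data: two overlapping but separable-in-expectation clusters
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0.0, 1.0, size=(50, 2)),
               rng.normal(3.0, 1.0, size=(50, 2))])
y = np.array([0] * 50 + [1] * 50)

beta = fit_logreg(X, y)
Xb = np.hstack([np.ones((len(X), 1)), X])
accuracy = ((sigmoid(Xb @ beta) > 0.5) == y).mean()
```

Note that only \(p(y_n|\vec{x}_n)\) enters the objective: unlike Naive Bayes, nothing is assumed about the distribution of \(\vec{x}\) itself.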