6.5 Exercises
6.5.1 Exercise 7
We take our model as:
$$y_i = \beta_0 + \sum_{j=1}^{p} x_{ij}\beta_j + \epsilon_i$$
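Here the $\epsilon_i$ are IID draws from a normal distribution $N(0, \sigma^2)$. The likelihood is simply a product of normal densities with mean $\mu_i = \beta_0 + \sum_{j=1}^{p} x_{ij}\beta_j$ and standard deviation $\sigma$:
$$L \propto \exp\left(-\frac{1}{2\sigma^2}\sum_i \Big(y_i - \beta_0 - \sum_{j=1}^{p} x_{ij}\beta_j\Big)^2\right)$$
We only care about the parts that depend on the $\beta_j$, so we don't need to worry about the normalization.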
The posterior is simply proportional to the product of $L$ and the prior:
$$P(\beta \mid \text{Data}) \propto P(\text{Data} \mid \beta)\,P(\beta)$$
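With an independent double-exponential (Laplace) prior with mean zero and scale $b$ on each $\beta_j$, this becomes
$$P(\beta \mid \text{Data}) \propto \exp\left(-\frac{1}{2\sigma^2}\sum_i \Big(y_i - \beta_0 - \sum_{j=1}^{p} x_{ij}\beta_j\Big)^2\right)\prod_{j=1}^{p} e^{-|\beta_j|/b}$$
again dropping any constants of proportionality that do not depend on the parameters.
Now combine the exponentials:
$$P(\beta \mid \text{Data}) \propto \exp\left(-\frac{1}{2\sigma^2}\sum_i \Big(y_i - \beta_0 - \sum_{j=1}^{p} x_{ij}\beta_j\Big)^2 - \sum_{j=1}^{p} \frac{|\beta_j|}{b}\right)$$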
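The mode of this distribution is the value of the $\beta_j$ for which the exponent is maximized. Since the exponent is the negative of a sum of non-negative terms, finding the mode means minimizing:
$$\frac{1}{2\sigma^2}\sum_i \Big(y_i - \beta_0 - \sum_{j=1}^{p} x_{ij}\beta_j\Big)^2 + \sum_{j=1}^{p} \frac{|\beta_j|}{b}$$
or, after multiplying through by $2\sigma^2$:
$$\sum_i \Big(y_i - \beta_0 - \sum_{j=1}^{p} x_{ij}\beta_j\Big)^2 + \frac{2\sigma^2}{b}\sum_{j=1}^{p} |\beta_j|$$
This has the same form as the lasso criterion (6.7), with $\lambda = 2\sigma^2/b$.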
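As a quick numerical sanity check of this equivalence, here is a minimal sketch for the special case of a single predictor plus an intercept, where the lasso minimizer has a closed form (soft thresholding). The data values, the helper names `neg_log_posterior_laplace` and `lasso_fit_1d`, and the choices of `sigma2` and `b` are all illustrative assumptions, not part of the exercise. The check is that the lasso solution with $\lambda = 2\sigma^2/b$ gives a smaller negative log posterior than any nearby point.

```python
import math

def neg_log_posterior_laplace(beta0, beta1, x, y, sigma2, b):
    """Negative log posterior, up to an additive constant, for one predictor
    with a Laplace(0, b) prior on beta1 and a flat prior on beta0."""
    rss = sum((yi - beta0 - xi * beta1) ** 2 for xi, yi in zip(x, y))
    return rss / (2 * sigma2) + abs(beta1) / b

def lasso_fit_1d(x, y, lam):
    """Closed-form minimizer of sum (y - b0 - x*b1)^2 + lam * |b1|
    for a single predictor (soft thresholding of the centered cross product)."""
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    shrunk = max(abs(sxy) - lam / 2, 0.0)
    beta1 = math.copysign(shrunk, sxy) / sxx
    return ybar - beta1 * xbar, beta1

# Illustrative data and hyperparameters (assumptions for this sketch).
x = [0.0, 1.0, 2.0, 3.0, 4.0]
y = [0.1, 0.9, 2.2, 2.8, 4.1]
sigma2, b = 1.0, 0.5
lam = 2 * sigma2 / b  # the lambda implied by the derivation

beta0_hat, beta1_hat = lasso_fit_1d(x, y, lam)
mode_value = neg_log_posterior_laplace(beta0_hat, beta1_hat, x, y, sigma2, b)

# Evaluate the negative log posterior on a small grid around the lasso solution.
offsets = [-0.1, 0.0, 0.1]
nearby = [
    neg_log_posterior_laplace(beta0_hat + d0, beta1_hat + d1, x, y, sigma2, b)
    for d0 in offsets for d1 in offsets if (d0, d1) != (0.0, 0.0)
]
lasso_is_mode = all(v > mode_value for v in nearby)
print(beta0_hat, beta1_hat, lasso_is_mode)
```

A grid check like this only confirms a local minimum, but the objective is strictly convex here (the predictor is not constant), so a local minimum is also the global one.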
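Working through exactly the same steps with a normal prior with mean zero and variance $c$ on each $\beta_j$, that is a prior proportional to $e^{-\beta_j^2/(2c)}$, gives the posterior:
$$P(\beta \mid \text{Data}) \propto \exp\left(-\frac{1}{2\sigma^2}\sum_i \Big(y_i - \beta_0 - \sum_{j=1}^{p} x_{ij}\beta_j\Big)^2 - \sum_{j=1}^{p} \frac{\beta_j^2}{2c}\right)$$
Finding the mode of this posterior is equivalent to finding the minimum of:
$$\sum_i \Big(y_i - \beta_0 - \sum_{j=1}^{p} x_{ij}\beta_j\Big)^2 + \frac{\sigma^2}{c}\sum_{j=1}^{p} \beta_j^2$$
which has the same form as the ridge criterion (6.5), with $\lambda = \sigma^2/c$. That this mode is also the mean follows because the posterior in this case is a multivariate normal distribution in the $\beta_j$: the exponent is quadratic in $\beta$, and the mean and mode of a normal distribution coincide.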
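The ridge case can be checked numerically in the same spirit. The sketch below again assumes a single predictor plus an intercept, for which the ridge minimizer is available in closed form; the data, the helper names `neg_log_posterior_gaussian` and `ridge_fit_1d`, and the values of `sigma2` and `c` are illustrative assumptions. It verifies that the ridge solution with $\lambda = \sigma^2/c$ has a smaller negative log posterior than nearby points.

```python
def neg_log_posterior_gaussian(beta0, beta1, x, y, sigma2, c):
    """Negative log posterior, up to an additive constant, for one predictor
    with a N(0, c) prior on beta1 and a flat prior on beta0."""
    rss = sum((yi - beta0 - xi * beta1) ** 2 for xi, yi in zip(x, y))
    return rss / (2 * sigma2) + beta1 ** 2 / (2 * c)

def ridge_fit_1d(x, y, lam):
    """Closed-form minimizer of sum (y - b0 - x*b1)^2 + lam * b1^2
    for a single predictor with an unpenalized intercept."""
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    beta1 = sxy / (sxx + lam)
    return ybar - beta1 * xbar, beta1

# Illustrative data and hyperparameters (assumptions for this sketch).
x = [0.0, 1.0, 2.0, 3.0, 4.0]
y = [0.1, 0.9, 2.2, 2.8, 4.1]
sigma2, c = 1.0, 0.5
lam = sigma2 / c  # the lambda implied by the derivation

beta0_hat, beta1_hat = ridge_fit_1d(x, y, lam)
mode_value = neg_log_posterior_gaussian(beta0_hat, beta1_hat, x, y, sigma2, c)

# Evaluate the negative log posterior on a small grid around the ridge solution.
offsets = [-0.1, 0.0, 0.1]
nearby = [
    neg_log_posterior_gaussian(beta0_hat + d0, beta1_hat + d1, x, y, sigma2, c)
    for d0 in offsets for d1 in offsets if (d0, d1) != (0.0, 0.0)
]
ridge_is_mode = all(v > mode_value for v in nearby)
print(beta0_hat, beta1_hat, ridge_is_mode)
```

Compared with the least-squares slope, the ridge slope is shrunk toward zero by the factor $S_{xx}/(S_{xx} + \lambda)$, so a tighter prior (smaller $c$) means more shrinkage.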