9.3 Building the regression model
9.3.1 Data model
We have n data pairs of bike ridership (Y) and temperature (X):
$\{(Y_1, X_1), (Y_2, X_2), \ldots, (Y_n, X_n)\}$
Here prior knowledge suggests a positive linear relationship between ridership and temperature: the warmer it is, the more likely people are to use the bike share service.
We are now moving away from a global mean (μ) to a local mean (μi, where i indexes one day). If the relationship is linear:
$\mu_i = \beta_0 + \beta_1 X_i$
β0 is the intercept coefficient: the typical ridership on a day when the temperature is 0. It is hard to interpret (would you rent a bike when it is 0 degrees F?).
β1 is the temperature coefficient: it indicates the typical change in ridership for every one-unit increase in temperature. When there is just one quantitative predictor, it is called the slope.
We can plug this assumption into our model:
$Y_i \mid \beta_0, \beta_1, \sigma \overset{\text{ind}}{\sim} N(\mu_i, \sigma^2) \quad \text{with} \quad \mu_i = \beta_0 + \beta_1 X_i$
As you can see, σ now measures the variability of Y about the local mean μi rather than about the global mean.
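To make the data model concrete, here is a minimal sketch that simulates ridership from it using only the Python standard library. The function names (`local_mean`, `simulate_ridership`) and every numeric value (temperatures, β0, β1, σ, seed) are made up for illustration; they are not estimates from the bike data.

```python
import random

def local_mean(beta0, beta1, x):
    """Local mean mu_i = beta0 + beta1 * x_i for one day."""
    return beta0 + beta1 * x

def simulate_ridership(temps, beta0, beta1, sigma, seed=None):
    """Draw one Y_i ~ N(mu_i, sigma^2) for each temperature in temps."""
    rng = random.Random(seed)
    # random.gauss takes the standard deviation (sigma), not the variance.
    return [rng.gauss(local_mean(beta0, beta1, x), sigma) for x in temps]

# Hypothetical values, chosen only to show the mechanics.
temps = [45.0, 60.0, 75.0, 90.0]
rides = simulate_ridership(temps, beta0=-2000.0, beta1=100.0, sigma=1000.0, seed=1)
for x, y in zip(temps, rides):
    print(f"{x:5.1f} F -> local mean {local_mean(-2000.0, 100.0, x):7.1f}, simulated Y {y:8.1f}")
```

Each simulated Y scatters around its own day's mean μi, and σ controls how wide that scatter is.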
9.3.2 Normal regression assumptions
Structure of the data: accounting for X, Y for one day is independent of Y for any other day
Structure of the relationship: the typical Y can be written as a linear function of predictor X: μ=β0+β1X
Structure of the variability: at any value of X, Y will vary normally around μ with a consistent standard deviation σ
9.3.3 Specifying the priors
Quiz: What are our parameters?
Answer: β0, β1, σ
First assumption: our parameters are independent, so each one gets its own prior.
$\beta_0 \sim N(m_0, s_0^2)$
$\beta_1 \sim N(m_1, s_1^2)$
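m0, m1, s0, s1 are parameters of the priors on our parameters, so they are called hyperparameters.
σ must be positive, so it gets an Exponential prior with rate hyperparameter l (prior mean 1/l):
$\sigma \sim \text{Exp}(l)$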
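The three independent priors can be sketched as one draw each from the standard library's random generators. The helper name `draw_from_priors` and the hyperparameter values below are hypothetical placeholders, not tuned values for the bike data.

```python
import random

def draw_from_priors(m0, s0, m1, s1, l, seed=None):
    """Draw (beta0, beta1, sigma) from independent priors.

    beta0 ~ N(m0, s0^2), beta1 ~ N(m1, s1^2), sigma ~ Exp(l).
    """
    rng = random.Random(seed)
    beta0 = rng.gauss(m0, s0)   # second argument is the standard deviation s0
    beta1 = rng.gauss(m1, s1)   # second argument is the standard deviation s1
    sigma = rng.expovariate(l)  # l is the rate, so the prior mean is 1 / l
    return beta0, beta1, sigma

# Hypothetical hyperparameters, for illustration only.
for seed in range(3):
    b0, b1, sigma = draw_from_priors(m0=0.0, s0=1000.0, m1=100.0, s1=40.0, l=0.001, seed=seed)
    print(f"beta0={b0:9.1f}  beta1={b1:7.1f}  sigma={sigma:8.1f}")
```

Because the priors are assumed independent, each parameter is drawn on its own, without looking at the other two.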
9.3.4 Putting it all together
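$Y_i \mid \beta_0, \beta_1, \sigma \overset{\text{ind}}{\sim} N(\mu_i, \sigma^2) \quad \text{with} \quad \mu_i = \beta_0 + \beta_1 X_i$
$\beta_0 \sim N(m_0, s_0^2)$
$\beta_1 \sim N(m_1, s_1^2)$
$\sigma \sim \text{Exp}(l)$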
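One way to see how the pieces fit together is to write the model as an unnormalized log posterior: the sum of the three log priors plus the Normal log likelihood of the data. This is a rough sketch under the assumptions stated above (independent priors, rate parameterization of the Exponential); the function names and the toy data are hypothetical, and nothing here is an actual fitting routine.

```python
import math

def normal_logpdf(y, mean, sd):
    """Log density of N(mean, sd^2) evaluated at y."""
    return -0.5 * math.log(2 * math.pi * sd ** 2) - (y - mean) ** 2 / (2 * sd ** 2)

def exp_logpdf(x, rate):
    """Log density of Exp(rate) evaluated at x > 0."""
    return math.log(rate) - rate * x

def log_posterior_unnorm(beta0, beta1, sigma, xs, ys, m0, s0, m1, s1, l):
    """Log prior + log likelihood, up to an additive constant."""
    if sigma <= 0:
        return -math.inf  # sigma has no prior mass at or below zero
    log_prior = (
        normal_logpdf(beta0, m0, s0)
        + normal_logpdf(beta1, m1, s1)
        + exp_logpdf(sigma, l)
    )
    # Each Y_i is Normal around its own local mean beta0 + beta1 * x_i.
    log_lik = sum(normal_logpdf(y, beta0 + beta1 * x, sigma) for x, y in zip(xs, ys))
    return log_prior + log_lik

# Toy data lying exactly on the line y = 1 + 2x (made up for illustration).
xs, ys = [0.0, 1.0, 2.0], [1.0, 3.0, 5.0]
hyper = dict(m0=0.0, s0=10.0, m1=0.0, s1=10.0, l=1.0)
print(log_posterior_unnorm(1.0, 2.0, 1.0, xs, ys, **hyper))  # parameters matching the line
print(log_posterior_unnorm(0.0, 0.0, 1.0, xs, ys, **hyper))  # parameters ignoring the trend
```

Parameter values that track the data should score higher than values that ignore the trend, which is the comparison a posterior sampler relies on.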
Model building one step at a time!
Decide whether Y is discrete or continuous → choose an appropriate model for the data
Rewrite the mean of Y as a function of predictors X (e.g. μ=β0+β1X)
Identify the unknown parameters in your model
Note the values these parameters might take → identify appropriate priors