9.3 Building the regression model

9.3.1 Data model

We have n data pairs of bike ridership (Y) and temperature (X):

$$\{(Y_1, X_1), (Y_2, X_2), \ldots, (Y_n, X_n)\}$$

Here, prior knowledge suggests a positive linear relationship between ridership and temperature: the warmer it is, the more likely people are to use the bike share service.

We are now moving away from the global mean ($\mu$) to a local mean ($\mu_i$, where i indexes one day). If the relationship is linear:

$$\mu_i = \beta_0 + \beta_1 X_i$$

$\beta_0$ is the intercept coefficient. It is the typical ridership when the temperature is 0, which makes it hard to interpret (would you rent a bike when it is 0 degrees F?).

$\beta_1$ is the temperature coefficient: it indicates the typical change in ridership for every one-unit increase in temperature. When there is just one quantitative predictor, it is called the slope.
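To see why $\beta_1$ reads as a per-unit change, compare the local mean at some temperature with the local mean one unit higher. Here $x$ is just a generic temperature value, introduced for illustration:

$$\big(\beta_0 + \beta_1 (x + 1)\big) - \big(\beta_0 + \beta_1 x\big) = \beta_1$$

The intercept cancels, so the difference does not depend on the starting temperature. That is what "linear" buys us.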

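We can plug this assumption into our model:

$$Y_i \mid \beta_0, \beta_1, \sigma \stackrel{ind}{\sim} N(\mu_i, \sigma^2) \quad \text{with} \quad \mu_i = \beta_0 + \beta_1 X_i$$

As you can see, $\sigma$ now measures the variability of ridership around the local mean $\mu_i$, not around the global mean.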
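One way to get a feel for this data model is to simulate from it. The sketch below is illustrative only: the function name `simulate_ridership`, the temperatures, and the parameter values are all hypothetical and not taken from the bike share data.

```python
import random

def simulate_ridership(temps, beta0, beta1, sigma, seed=0):
    """Draw one Y_i ~ N(beta0 + beta1 * X_i, sigma^2) for each temperature X_i."""
    rng = random.Random(seed)  # seeded so the draws are reproducible
    rides = []
    for x in temps:
        mu_i = beta0 + beta1 * x              # local mean for this day
        rides.append(rng.gauss(mu_i, sigma))  # sigma: spread around the local mean
    return rides

# Hypothetical temperatures (degrees F) and parameter values, for illustration only.
temps = [45.0, 60.0, 75.0, 90.0]
rides = simulate_ridership(temps, beta0=-2000.0, beta1=100.0, sigma=1000.0)
for x, y in zip(temps, rides):
    print(f"temp={x:5.1f}  simulated rides={y:9.1f}")
```

Setting `sigma` to 0 returns the local means themselves, which makes the role of $\sigma$ as "noise around the line" easy to see.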

9.3.2 Normal regression assumptions

  • Structure of the data: accounting for X, Y on one day is independent of Y on any other day

  • Structure of the relationship: the typical Y can be written as a linear function of the predictor X: $\mu = \beta_0 + \beta_1 X$

  • Structure of the variability: at any value of X, Y is normally distributed around $\mu$ with a consistent standard deviation $\sigma$
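Taken together, these assumptions determine the joint density of the data: independence lets us multiply the per-day densities, and each per-day density is Normal around its local mean. As a sketch, writing $\vec{y} = (y_1, \ldots, y_n)$ for the observed ridership values and $f$ for a density (notation assumed here, not defined earlier in these notes):

$$f(\vec{y} \mid \beta_0, \beta_1, \sigma) = \prod_{i=1}^{n} \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left[-\frac{(y_i - \mu_i)^2}{2\sigma^2}\right], \quad \mu_i = \beta_0 + \beta_1 X_i$$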

9.3.3 Specifying the priors

Quiz: What are our parameters?

Answer: $\beta_0$, $\beta_1$, $\sigma$

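First assumption: our parameters are independent, so each one gets its own prior.

$$\beta_0 \sim N(m_0, s_0^2)$$

$$\beta_1 \sim N(m_1, s_1^2)$$

$m_0$, $m_1$, $s_0$, $s_1$ are parameters of the parameters' priors, so they are called hyperparameters.

$$\sigma \sim \text{Exp}(l)$$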
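Drawing from these priors is a quick way to check whether a choice of hyperparameters is plausible. The sketch below assumes that Exp(l) is parameterized by its rate l, so the prior mean of $\sigma$ is 1/l. The function name `draw_from_priors` and the hyperparameter values are hypothetical, chosen only for illustration.

```python
import random

def draw_from_priors(m0, s0, m1, s1, l, rng):
    """Draw (beta0, beta1, sigma) independently from their priors."""
    beta0 = rng.gauss(m0, s0)    # beta0 ~ N(m0, s0^2); gauss takes the sd, not the variance
    beta1 = rng.gauss(m1, s1)    # beta1 ~ N(m1, s1^2)
    sigma = rng.expovariate(l)   # sigma ~ Exp(l), assuming l is a rate (mean 1 / l)
    return beta0, beta1, sigma

# Hypothetical hyperparameters, for illustration only.
rng = random.Random(42)
for _ in range(3):
    beta0, beta1, sigma = draw_from_priors(
        m0=5000.0, s0=1000.0, m1=100.0, s1=40.0, l=0.001, rng=rng
    )
    print(f"beta0={beta0:8.1f}  beta1={beta1:6.1f}  sigma={sigma:8.1f}")
```

Because the Exponential distribution only produces non-negative values, every draw of `sigma` is a valid standard deviation.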

9.3.4 Putting it all together

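$$Y_i \mid \beta_0, \beta_1, \sigma \stackrel{ind}{\sim} N(\mu_i, \sigma^2) \quad \text{with} \quad \mu_i = \beta_0 + \beta_1 X_i$$

$$\beta_0 \sim N(m_0, s_0^2)$$

$$\beta_1 \sim N(m_1, s_1^2)$$

$$\sigma \sim \text{Exp}(l)$$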
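The full model can also be read as a recipe for scoring any candidate ($\beta_0$, $\beta_1$, $\sigma$): add the log-likelihood of the data to the log-densities of the three priors. This gives the posterior up to a normalizing constant. The sketch below is a hand-rolled illustration, not how a fitting library does it. The helper names, the toy data, and the hyperparameter values are hypothetical, and Exp(l) is again assumed to use a rate l.

```python
import math

def normal_logpdf(y, mean, sd):
    """Log-density of N(mean, sd^2) evaluated at y."""
    return -0.5 * math.log(2 * math.pi * sd**2) - (y - mean) ** 2 / (2 * sd**2)

def exp_logpdf(x, rate):
    """Log-density of Exp(rate) evaluated at x (rate parameterization assumed)."""
    return math.log(rate) - rate * x if x >= 0 else -math.inf

def log_posterior(beta0, beta1, sigma, xs, ys, m0, s0, m1, s1, l):
    """Unnormalized log-posterior: log-likelihood plus the three log-priors."""
    if sigma <= 0:
        return -math.inf  # a standard deviation must be positive
    log_lik = sum(
        normal_logpdf(y, beta0 + beta1 * x, sigma)  # Y_i ~ N(mu_i, sigma^2)
        for x, y in zip(xs, ys)
    )
    log_prior = (
        normal_logpdf(beta0, m0, s0)
        + normal_logpdf(beta1, m1, s1)
        + exp_logpdf(sigma, l)
    )
    return log_lik + log_prior

# Toy data and hyperparameters, for illustration only.
xs = [1.0, 2.0, 3.0]
ys = [2.1, 3.9, 6.2]
print(log_posterior(0.0, 2.0, 1.0, xs, ys, m0=0.0, s0=10.0, m1=0.0, s1=10.0, l=1.0))
```

Comparing the score of two candidate parameter sets shows which one the model and data jointly favor. Samplers explore this same surface far more efficiently.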

Model building one step at a time!

  1. Decide whether Y is discrete or continuous, and choose an appropriate model for the data

  2. Rewrite the mean of Y as a function of predictors X (e.g. $\mu = \beta_0 + \beta_1 X$)

  3. Identify unknown parameters in your model

  4. Note the values these parameters might take, and identify appropriate priors accordingly