13.4 Latent-data formulation

  • Alternate formulation using a continuous ‘latent’ variable zi. It is completely equivalent to the logistic regression.

  • ‘latent’ means unobserved

$$ yi={1if zi>00if zi<0zi=Xiβ+ϵi $$

Here the ϵi are independent and have the logistic distribution:

  • The distribution of the error terms is similar to a Gaussian with σ=1.6. What we relaxed that and use a Gaussian fit σ ?

  • Answer: It wont work because the ‘latent’ scale parameter σ is non-identifiable. You can pick any $you want and you can get the same predictions by scaling the slope and intercept!

  • So why bother?

    • In some settings direct information is available for the zi’s

    • We will see in later chapters this latent formulation can be useful.