The Bootstrap
The bootstrap works like this:
- Assume that the sample distribution is representative of the population distribution.[6]
- Construct a new simulated population by making a gazillion copies of your sample (in practice, this just means you draw from the sample with replacement).
- Now it is easy to do lots of simulated experiments on your simulated population!
- Draw lots of bootstrap samples from your simulated population, with each sample having the same n as your original sample.
- Calculate your sample statistic for each of the bootstrap samples.
- Calculate the standard deviation of the resulting distribution (of bootstrapped sample statistics); this is the estimated standard error for your measured sample statistic.
- Subtract the statistic calculated on your original sample from the mean of the bootstrap distribution; this difference is the estimated bias for your measurement.
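The steps above can be sketched in a few lines of Python. This is a minimal illustration rather than a definitive implementation: the function name `bootstrap`, its parameters (`n_boot`, `seed`), and the normally distributed example data are all assumptions made for the sake of the example, and NumPy is assumed to be available.

```python
import numpy as np

def bootstrap(sample, statistic, n_boot=10_000, seed=0):
    """Estimate the standard error and bias of statistic(sample).

    Hypothetical helper for illustration; the names are not from any library.
    """
    rng = np.random.default_rng(seed)
    sample = np.asarray(sample)
    n = len(sample)

    # The statistic measured on the original sample.
    observed = statistic(sample)

    # Drawing indices with replacement stands in for the
    # "gazillion copies" simulated population. Each row is one
    # bootstrap sample with the same n as the original.
    idx = rng.integers(0, n, size=(n_boot, n))

    # The sample statistic for each bootstrap sample.
    boot_stats = np.array([statistic(sample[row]) for row in idx])

    # Standard deviation of the bootstrap distribution = estimated standard error.
    se = boot_stats.std(ddof=1)

    # Mean of the bootstrap distribution minus the original statistic = estimated bias.
    bias = boot_stats.mean() - observed
    return se, bias

# Made-up example data: 50 draws from a normal distribution.
data = np.random.default_rng(42).normal(loc=10.0, scale=2.0, size=50)
se, bias = bootstrap(data, np.mean)
print(f"estimated standard error: {se:.3f}")
print(f"estimated bias: {bias:.4f}")
```

For the sample mean, the estimated standard error should land close to the textbook value of s/√n, and the estimated bias should be near zero, which makes the mean a convenient sanity check before trying statistics that have no simple formula (the median, say).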
We can use the same sampling demo to get a feel for how bootstrapping works.
[6] It doesn’t have to be perfect, but if it’s too far off then nothing you calculate from the sample will be meaningful anyway.