The Bootstrap

The bootstrap works like this:

Assume that the sample distribution is representative of the population distribution.⁶
Construct a new simulated population by making a gazillion copies of your sample (in practice, this just means you draw from the sample with replacement).
Now it is easy to do lots of simulated experiments on your simulated population!
- Draw lots of bootstrap samples from your simulated population, with each sample having the same n as your original sample.
- Calculate your sample statistic for each of the bootstrap samples.
- Calculate the standard deviation of the resulting distribution (of bootstrapped sample statistics); this is the estimated standard error for your measured sample statistic.
- Calculate the difference between the mean of the bootstrap distribution and the statistic calculated on your original sample; this is the estimated bias for your measurement.
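
The steps above can be sketched in a few lines of Python. This is a minimal illustration using only the standard library, not a definitive implementation: the function name `bootstrap`, the parameters `n_boot` and `seed`, and the example data are all assumptions made up for this sketch.

```python
import random
import statistics


def bootstrap(sample, statistic, n_boot=5000, seed=0):
    """Estimate the standard error and bias of `statistic` by bootstrapping.

    `bootstrap`, `n_boot`, and `seed` are hypothetical names for this sketch.
    """
    rng = random.Random(seed)  # seeded so the run is reproducible
    n = len(sample)
    observed = statistic(sample)  # statistic on the original sample

    boot_stats = []
    for _ in range(n_boot):
        # Drawing n values with replacement stands in for sampling from
        # a "gazillion copies" of the original sample.
        resample = rng.choices(sample, k=n)
        boot_stats.append(statistic(resample))

    # Spread of the bootstrap distribution: estimated standard error.
    se = statistics.stdev(boot_stats)
    # Bootstrap mean minus the original statistic: estimated bias.
    bias = statistics.fmean(boot_stats) - observed
    return observed, se, bias


# Made-up example data, for illustration only.
data = [2.1, 3.4, 1.9, 5.6, 4.2, 3.8, 2.7, 4.9, 3.1, 4.4]
observed, se, bias = bootstrap(data, statistics.fmean)
print(f"mean = {observed:.3f}, estimated SE = {se:.3f}, estimated bias = {bias:.4f}")
```

Any function that maps a list of numbers to a single number can be passed as `statistic`, for example `statistics.median`. For the sample mean, the estimated bias should come out close to zero.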

We can use the same sampling demo to get a feel for how bootstrapping works.

⁶ It doesn’t have to be perfect, but if it’s too far off then nothing you calculate from the sample will be meaningful anyway.