Bootstrapping Regression Models
Appendix to An R and S-PLUS Companion to Applied Regression

John Fox
January 2002

1 Basic Ideas

Bootstrapping is a general approach to statistical inference based on building a sampling distribution for a statistic by resampling from the data at hand. The term ‘bootstrapping,’ due to Efron (1979), is an allusion to the expression ‘pulling oneself up by one’s bootstraps’ – in this case, using the sample data as a population from which repeated samples are drawn. At first blush, the approach seems circular, but has been shown to be sound.
Two S libraries for bootstrapping are associated with extensive treatments of the subject: Efron and Tibshirani’s (1993) bootstrap library, and Davison and …


Next, we compute the statistic $T$ for each of the bootstrap samples; that is, $T_b^* = t(S_b^*)$. Then the distribution of $T_b^*$ around the original estimate $T$ is analogous to the sampling distribution of the estimator $T$ around the population parameter $\theta$. For example, the average of the bootstrapped statistics,

$$\bar{T}^* = \hat{E}^*(T^*) = \frac{\sum_{b=1}^{R} T_b^*}{R}$$

estimates the expectation of the bootstrapped statistics; then $\hat{B}^* = \bar{T}^* - T$ is an estimate of the bias of $T$, that is, $T - \theta$. Similarly, the estimated bootstrap variance of $T^*$,

$$\hat{V}^*(T^*) = \frac{\sum_{b=1}^{R} \left(T_b^* - \bar{T}^*\right)^2}{R - 1}$$

estimates the sampling variance of $T$.
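The three estimates above (the bootstrap average, the bias estimate, and the variance estimate with divisor R − 1) can be sketched in a few lines. This is only an illustration in Python using the standard library, not code from the S libraries mentioned earlier; the function name `bootstrap_summary`, the default R = 2000, the seed, and the data values are all assumptions made up for the example.

```python
import random
import statistics


def bootstrap_summary(sample, stat, R=2000, seed=0):
    """Return (T, T_bar, bias, var) for the statistic `stat`.

    Hypothetical helper for illustration: R and seed are arbitrary choices.
    """
    rng = random.Random(seed)          # fixed seed so the sketch is reproducible
    n = len(sample)
    T = stat(sample)                   # original estimate T = t(S)

    # Draw R bootstrap samples of size n with replacement and
    # evaluate the statistic on each one: T*_b = t(S*_b).
    T_star = [stat(rng.choices(sample, k=n)) for _ in range(R)]

    T_bar = sum(T_star) / R                                   # average of the T*_b
    bias = T_bar - T                                          # bias estimate
    var = sum((t - T_bar) ** 2 for t in T_star) / (R - 1)     # variance estimate
    return T, T_bar, bias, var


# Made-up data, purely for demonstration.
data = [2.1, 3.4, 1.9, 5.6, 4.4, 3.8, 2.7, 4.9]
T, T_bar, bias, var = bootstrap_summary(data, statistics.median)
print("T =", T, "bias estimate =", bias, "variance estimate =", var)
```

Any function of a sample can be passed as `stat`; the median is used here only because its sampling variance has no simple closed form, which is the kind of case where the bootstrap is convenient.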
The random selection of bootstrap samples is not an essential aspect of the nonparametric bootstrap: At least in principle, we could enumerate all bootstrap samples of size $n$. Then we could calculate $E^*(T^*)$ and $V^*(T^*)$ exactly, rather than having to estimate them. The number of bootstrap samples, however, is astronomically large unless $n$ is tiny. There are, therefore, two sources of error in bootstrap inference: (1) the error induced by using a particular sample $S$ to represent the population; and (2) the sampling error produced by failing to enumerate all bootstrap samples. The latter source of error can be controlled by making the number of bootstrap replications $R$ sufficiently large.
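To make the enumeration argument concrete, the following sketch lists every bootstrap sample for a tiny data set and computes the exact bootstrap expectation and variance of the sample mean. It assumes that the order of elements is distinguished and that all elements of the original sample are treated as distinct, so that there are n^n equally likely bootstrap samples; the helper names and the four data values are hypothetical, and Python is used only for illustration.

```python
from itertools import product


def exact_bootstrap_moments(sample, stat):
    """Enumerate all n**n ordered bootstrap samples (feasible only for tiny n).

    Returns (number of samples, exact E*(T*), exact V*(T*)).
    """
    n = len(sample)
    # product(sample, repeat=n) yields every ordered sample of size n
    # drawn with replacement; each has probability 1 / n**n.
    values = [stat(s) for s in product(sample, repeat=n)]
    total = len(values)
    e_star = sum(values) / total
    # Exact variance over the full enumeration: divisor is the number of
    # samples, not R - 1, because nothing is being estimated here.
    v_star = sum((v - e_star) ** 2 for v in values) / total
    return total, e_star, v_star


def mean(xs):
    return sum(xs) / len(xs)


small = [1.0, 2.0, 4.0, 7.0]           # made-up sample with n = 4
count, e_star, v_star = exact_bootstrap_moments(small, mean)
print(count, e_star, v_star)

# Under the same counting assumption, the number of samples grows as n**n.
for n in (5, 10, 20):
    print(n, n ** n)
```

For the mean, the exact bootstrap expectation equals the sample mean, so the enumeration shows no bias; for n = 20 the count already has 27 digits, which is why random selection of R samples is used in practice.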

2 Bootstrap Confidence Intervals

There are several approaches to
