Table 2

Cross-Section Individual-Level Data: Monte Carlo Rejection Rates of True Null Hypothesis (Slope = 0) with Different Numbers of Clusters and Different Rejection Methods

Nominal 5 percent rejection rates
                                                                                           Number of clusters
Wald test method                                                                       6      10      20      30      50
Different standard errors and critical values
 1  White robust, T(N − k) for critical value                                      0.439   0.457   0.471   0.462   0.498
 2  Cluster on state, T(N − k) for critical value                                  0.215   0.147   0.104   0.083   0.078
 3  Cluster on state, T(G − 1) for critical value                                  0.125   0.103   0.082   0.069   0.075
 4  Cluster on state, T(G − 2) for critical value                                  0.105   0.099   0.076   0.069   0.075
 5  Cluster on state, CR2 bias correction, T(G − 1) for critical value             0.082   0.070   0.062   0.060   0.065
 6  Cluster on state, CR3 bias correction, T(G − 1) for critical value             0.048   0.050   0.050   0.052   0.061
 7  Cluster on state, CR2 bias correction, T(IK degrees of freedom)                0.052   0.050   0.047   0.047   0.054
 8  Cluster on state, T(CSS effective number of clusters)                          0.114   0.079   0.057   0.056   0.061
 9  Pairs cluster bootstrap for standard error, T(G − 1) for critical value        0.082   0.072   0.069   0.067   0.074
Bootstrap percentile-T methods
10  Pairs cluster bootstrap                                                        0.009   0.031   0.046   0.051   0.061
11  Wild cluster bootstrap, Rademacher two-point distribution, low p-value         0.097   0.065   0.062   0.051   0.060
12  Wild cluster bootstrap, Rademacher two-point distribution, mid p-value         0.068   0.065   0.062   0.051   0.060
13  Wild cluster bootstrap, Rademacher two-point distribution, high p-value        0.041   0.064   0.062   0.051   0.060
14  Wild cluster bootstrap, Webb 6-point distribution                              0.079   0.067   0.061   0.051   0.061
15  Wild cluster bootstrap, Rademacher two-point, do not impose null hypothesis    0.086   0.063   0.050   0.053   0.056
    IK effective DOF (mean)
    IK effective DOF (5th percentile)
    IK effective DOF (95th percentile)
    CSS effective number of clusters (mean)
    Average number of observations                                                 1,554   2,618   5,210   7,803  13,055
  • Notes: March 2012 CPS data, 20 percent sample from IPUMS download. For six and ten clusters, 4,000 Monte Carlo replications; for 20–50 clusters, 1,000 Monte Carlo replications. The bootstraps use 399 replications. “IK effective DOF” is from Imbens and Kolesar (2013), and “CSS effective number of clusters” is from Carter, Schnepel, and Steigerwald (2013); see Section VI.D. When the wild percentile-T bootstrapped p-values are not point identified because of few clusters, Row 11 uses the lowest p-value of the interval, Row 12 uses the midrange of the interval, and Row 13 uses the largest p-value of the interval.
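
As a reading aid for the rejection rates above, the following minimal sketch computes two quantities implied by the notes: the simulation standard error of an estimated rejection rate at the nominal 5 percent level, and the number of Rademacher weight vectors available for a given number of clusters. It assumes independent Monte Carlo replications, so that the binomial formula applies. The function names `mc_standard_error` and `rademacher_patterns` are hypothetical and do not come from the source.

```python
import math


def mc_standard_error(rate: float, replications: int) -> float:
    """Binomial standard error of a simulated rejection rate.

    Assumes the Monte Carlo replications are independent (an assumption
    made here for illustration, not a statement from the source).
    """
    return math.sqrt(rate * (1.0 - rate) / replications)


def rademacher_patterns(n_clusters: int) -> int:
    """Number of possible +1/-1 weight vectors, one weight per cluster."""
    return 2 ** n_clusters


# Replication counts are taken from the table notes.
se_4000 = mc_standard_error(0.05, 4000)  # six and ten clusters
se_1000 = mc_standard_error(0.05, 1000)  # 20 to 50 clusters

# A two-point weight distribution with six clusters.
patterns_6 = rademacher_patterns(6)

print(f"SE with 4,000 replications: {se_4000:.4f}")
print(f"SE with 1,000 replications: {se_1000:.4f}")
print(f"Rademacher sign patterns with 6 clusters: {patterns_6}")
```

Under these assumptions the simulation standard error is roughly 0.003 with 4,000 replications and 0.007 with 1,000, so small differences between neighboring entries may reflect simulation noise. With six clusters there are only 64 possible sign patterns, so the bootstrap distribution is coarse, which is consistent with the interval-valued p-values described in the notes.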