Documentation for Hoxby Comment
This page contains material to accompany the paper,
Rothstein, Jesse (2007). "Does Competition Among Public Schools Benefit Students and Taxpayers? Comment," American Economic Review 97(5), December 2007, pp. 2026-2037.
I. Background Information
The paper is a comment on
Hoxby, Caroline M. "Does Competition among Public Schools Benefit Students and Taxpayers?" American Economic Review, December 2000, 90(5), pp. 1209-38.
II. The Exchange
The current exchange has three parts:
1. My Comment (click here for a public-access, pre-publication version)
2. Hoxby's Reply
3. A Rejoinder to Hoxby's Reply
Items 1 and 2 are forthcoming in the American Economic Review. Item 3 is available only here. Readers may also be interested in a detailed Appendix to the Comment, containing additional specifications and detailed definitions of the variables and samples.
The Comment relies on restricted-use data from the National Education Longitudinal Study (NELS). Because these data are confidential, neither the original data nor the analysis sample can be distributed freely. Researchers hoping to replicate my analysis will need to obtain licenses for the NELS restricted-use data. They will need three data sets, all available to NELS license-holders:
1. The restricted-access version of the NELS 88/94, base year through third follow-up data, NCES 96-130.
2. A CD that Hoxby has asked NCES to distribute on her behalf, containing her analysis programs, the raw data that she uses (beyond what comes from the NELS), and the resulting data set. The CD that I use is dated September 2, 2004.
3. A CD that I have created containing my programs and all of the raw and intermediate data sets, along with log files and documentation. This CD is dated November 2007.
Again, these data sets can be obtained only by researchers who have executed licenses for the confidential version of the NELS data. Interested researchers should contact me, as I may be able to help navigate the process.
A note on versions: Hoxby's CD (item #2, above) was created prior to the dissemination of her Reply, and does not produce the results in that Reply. I have been unable to obtain an updated version of the CD that does produce those results, though one may eventually become available through NCES. Also, I prepared an earlier version of my CD (item #3), dated December 2004, to accompany the working paper version of my Comment. It produces results identical to those of the current version and differs only in that it is less completely documented.
A public-use version of the replication archive
I have also created a public-use version of the replication archive, for those who do not have access to the confidential data sets described above. This contains my programs and the public-use raw and intermediate data sets, but neither the NELS data nor any intermediate data sets that derive from the NELS data.
There is one portion of my programs, consisting of fewer than ten lines of Stata code, that is not distributed here. This code manually assigns MSA codes to a few NELS schools for which the zip code information (used in Table 5 of the Comment) was insufficient to permit a machine match. I cannot distribute this publicly, as it would reveal the location of the relevant NELS schools, and would therefore violate the confidentiality restrictions that govern access to the NELS data.
This omission affects only the zip-code match, which is used only in Table 5 of the Comment. Users of the public-use archive who also have access to the NELS restricted-use data and to Hoxby's CD can reproduce all of the results in the Comment except those in Table 5. That table relies on the excluded code, so the version created by the public-use programs differs slightly from the one in the published paper. The public-use archive contains an alternative version of Table 5 that does not rely on the confidential code. It is nearly identical to the published table and can be used to check that the programs are running correctly.
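To make the role of the omitted code concrete, here is a minimal Python sketch of a machine match followed by a manual fallback. It is not the Stata code in the archive. The function name, school identifiers, zip codes, and MSA codes are all hypothetical placeholders, and the actual matching logic may differ.

```python
def assign_msa(school_zips, zip_to_msa, manual_assignments=None):
    """Assign an MSA code to each school (illustrative sketch only).

    school_zips:        {school_id: zip_code}
    zip_to_msa:         {zip_code: msa_code}, the machine-match lookup
    manual_assignments: {school_id: msa_code}, consulted only when the
                        zip-code lookup fails (hypothetical stand-in for
                        the confidential manual step)
    Returns (matched, unmatched): a dict of assigned codes and a list of
    school ids that received no code.
    """
    manual_assignments = manual_assignments or {}
    matched = {}
    unmatched = []
    for school_id, zip_code in school_zips.items():
        if zip_code in zip_to_msa:
            # Machine match succeeded.
            matched[school_id] = zip_to_msa[zip_code]
        elif school_id in manual_assignments:
            # Zip code was insufficient; fall back to the manual code.
            matched[school_id] = manual_assignments[school_id]
        else:
            # No manual code supplied, so the school stays unmatched.
            unmatched.append(school_id)
    return matched, unmatched
```

Called without `manual_assignments`, the sketch leaves the hard-to-match schools unassigned, which is the kind of gap that would make a table built without the confidential step differ slightly from the published one.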
There are two versions of the public-use archive. The AER data archive contains the complete version, with all of the raw data used by the programs. That archive is extremely large, as some of the raw data sets are quite sizable. This web site contains a smaller version that excludes the raw data but includes all programs used in the Comment; it is less than 1% as large as the full archive. It also includes the MSA- and district-level intermediate data sets that the programs create from the raw data and that form the basis for all further analyses. Researchers who do not need to reproduce the processing of the raw data should therefore be able to accomplish their goals with the smaller version.
- To obtain the full version of the public-use archive, click here. (Warning: 417 MB zipped.)
- To obtain the smaller version of the public-use archive, click here. (3 MB zipped.)
Both versions contain extensive documentation. When unzipping, be sure to preserve the directory structure. Please contact me with any questions, comments, or suggestions.
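For readers who prefer to script the extraction, the following Python sketch unpacks a zip file while keeping its internal directory layout. The function name is illustrative and the file and folder names in the usage note are hypothetical; substitute the name of the archive actually downloaded.

```python
import zipfile
from pathlib import Path


def extract_archive(zip_path, dest_dir):
    """Extract all members of zip_path under dest_dir, keeping subdirectories.

    Returns the sorted list of member names found in the archive.
    """
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        # extractall recreates each member's relative path under dest,
        # so the archive's directory structure is preserved.
        archive.extractall(dest)
        return sorted(archive.namelist())
```

For example, `extract_archive("public_use_archive.zip", "hoxby_comment")` would place the contents under a `hoxby_comment` folder; both names are placeholders.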
IV. Additional material
Readers may be interested in several additional items:
- An earlier version of the Comment, written before Hoxby had made any data available.
- An earlier version of Hoxby's Reply.
- A "policy brief" created by the Woodrow Wilson School, describing the debate in non-technical language.
- Coverage of the debate in the Wall Street Journal and the Harvard Crimson.