The Collection Collection Status The Contributors
The Genesis of the Database

The U.S. Supreme Court Database traces its history back about two decades, to when Harold J. Spaeth asked the National Science Foundation to fund a database so rich in content that multiple users, even those with vastly distinct projects and purposes in mind, could draw on it. Professor Spaeth's goal was at once refreshingly simple and extremely ambitious: to produce a dataset that would include and classify every single vote by a Supreme Court justice in all argued cases over a five-decade period. After securing the funding, Spaeth collected and coded the data, performed reliability checks, and eventually assembled the Database. In the late 1980s, he made it (and the documentation necessary to use it) publicly available.

Since then, Professor Spaeth has not only updated it each term; he has also continued to perform reliability analyses, thereby ensuring its integrity with each release, and added new variables. Today's version of the Database houses 247 pieces of information for each case, roughly broken down into six categories: (1) identification variables (e.g., citations and docket numbers); (2) background variables (e.g., how the Court took jurisdiction, origin and source of the case, the reason the Court agreed to decide it); (3) chronological variables (e.g., the date of decision, term of Court, natural court); (4) substantive variables (e.g., legal provisions, issues, direction of decision); (5) outcome variables (e.g., disposition of the case, winning party, formal alteration of precedent, declaration of unconstitutionality); and (6) voting and opinion variables (e.g., how the individual justices voted, their opinions and interagreements).

The success of Spaeth's Database is without question. Virtually all systematic analyses of the contemporary Supreme Court and its members have relied on it. This holds for research conducted by social scientists and their graduate students and, increasingly, by legal academics; and it holds for quantitative and qualitative studies, as well as those more descriptive in nature. In fact, several inventories of peer-reviewed journals show that it is the rare article on the Court that derives its data from an alternative source. Monographs published by top presses also regularly rely on the Database, and the many numerical studies of the Court receiving public attention in recent years have made liberal use of the data it houses. By the same token, journalists seeking to illuminate dimensions of the Court's work regularly deploy Spaeth's product; indeed, Linda Greenhouse, the Pulitzer Prize-winning reporter, once referred to it as "a computerized treasure trove...created under a grant from the National Science Foundation," and has cited it (or research relying on it) in her writings.

In short, the U.S. Supreme Court Database has not just helped fill gaps in our knowledge. It is one of those rare creatures in the law and social science world: an invention that has substantially advanced a large area of study, inspiring research by scholars hailing from no fewer than three and as many as seven disciplines.

This Website

However invaluable the Database, some of its features rendered it difficult to use. First, only those with knowledge of statistical software packages could operate it, which excluded the vast majority of law professors, humanists, policy makers, journalists, even some social scientists, and of course many undergraduate and law students. Second, even knowledge of Stata, SPSS and the like was insufficient. To select the subset of cases of interest, users had to understand the various identification variables as well. So, for instance, if one hoped to analyze only orally argued cases resulting in a signed opinion of the Court, a particular combination of the ANALU and DEC_TYPE variables was in order.
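
To make the difficulty concrete, the following is a minimal sketch of the kind of filter a user once had to write by hand. The variable names ANALU and DEC_TYPE come from the text above; the coded values, the sample rows, and the function name are hypothetical and chosen only for illustration, so the Database's documentation remains the authority on the real codes.

```python
# Illustrative only: the numeric codes below are assumptions, not values
# taken from the Database's documentation.
ANALU_CASE_CITATION = 0        # assumed code: one row per case citation
DEC_TYPE_SIGNED_OPINION = 1    # assumed code: orally argued, signed opinion

# Hypothetical rows standing in for records in the Database.
rows = [
    {"case": "A v. B", "ANALU": 0, "DEC_TYPE": 1},
    {"case": "C v. D", "ANALU": 0, "DEC_TYPE": 2},
    {"case": "E v. F", "ANALU": 1, "DEC_TYPE": 1},
    {"case": "G v. H", "ANALU": 0, "DEC_TYPE": 1},
]


def select_signed_argued(records):
    """Keep only rows matching both assumed codes at once."""
    return [
        r
        for r in records
        if r["ANALU"] == ANALU_CASE_CITATION
        and r["DEC_TYPE"] == DEC_TYPE_SIGNED_OPINION
    ]


selected = select_signed_argued(rows)
print([r["case"] for r in selected])
```

The point of the sketch is that both conditions must hold together: filtering on either variable alone would let through rows that do not belong in the intended subset.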

Happily, modern technology, unavailable at the time Spaeth created the Database, makes learning arcane variable names and values unnecessary; it also enables movement away from statistical software packages for those with little need to learn them.

It is this modern technology that we have now attached to the Spaeth Database. Via this site users can easily access all variables (with descriptive names) and can quickly select the set of cases they desire to explore—whether a single decision or, say, all First Amendment disputes that were orally argued and resulted in a signed opinion—using an intuitive search interface. Once a selection is made, users can generate a list of selected cases (with links to full opinions) or conduct various analyses—from the very basic to the more complex. Should they desire, downloading the entire Database or subsets in one of seven forms (e.g., Stata, SPSS, Excel) now requires no special skills. Finally, because the site houses legacy versions of the Database, users are now able to replicate any analysis conducted with earlier versions of data they downloaded (but discarded) or even used on the site.

Modernizing the interface is not all we have done. Previous users will see that we have made some improvements to the existing contents. First, the original version of the Database did not include the names of cases, and for recent terms it provided only the L.Ed. citation. Not only have we added the case names and the full complement of citations (U.S., S.Ct., L.Ed., and U.S. Lexis); we also have linked each case to the full text of the Court's decision. This innovation enables a broader set of users to engage in a hybrid method of research that, we believe, lends even greater power to the Database. Social scientists, law professors, legal historians and their undergraduate and graduate students can now identify the cases of interest with systematic rigor, but then for particular projects perform more nuanced textual and linguistic analysis of the opinions themselves.

Second, and less transparently, in the process of creating the new interface we also devised a sophisticated data management system that makes data entry far more efficient and accurate. Now, as the Court hands down its decisions, information can be coded, integrated into the Database, and made immediately available for analysis or downloading at the new website. Occasional adjustments to the data are also easier to make, with each revision reported on the website.

The Future

This website facilitates use of Spaeth's data; it does not house any new variables or any new data (aside from updates each term). This may change in the future. We now have a proposal pending at the National Science Foundation that requests funds to backdate the data to the Court's first reported decision, Georgia v. Brailsford (1792). For all cases decided with an opinion since Brailsford, we intend to code and attach the existing Spaeth variables. These "new" cases will be seamlessly integrated into the existing Database and, of course, searchable via the interface on this website.

In short, the Database, which now starts with the 1946 term, will begin at the beginning—with Volume 2 of the U.S. Reports. (Volume 1 does not contain any U.S. Supreme Court decisions.) Our hope is that systematic, historical data on the Court will offer an even more valuable "treasure trove"—stimulating scholars and their students to explore new avenues of inquiry, as well as to revisit enduring questions that have yet to be addressed with reliable and valid data.

The Supreme Court Database has been generously supported by the National Science Foundation.
Creative Commons License