The Genesis of the Database

The U.S. Supreme Court Database traces its history back about two decades, to when Harold J. Spaeth asked the National Science Foundation to fund a database so rich in content that multiple users, even those with vastly different projects and purposes in mind, could draw on it. Professor Spaeth's goal was at once refreshingly simple and extremely ambitious: to produce a database that would include and classify every single vote by a Supreme Court justice in all argued cases over a five-decade period. After securing the funding, Spaeth collected and coded the data, performed reliability checks, and eventually assembled the Database. In the late 1980s, he made it (and the documentation necessary to use it) publicly available.

Since then, Professor Spaeth has not only updated it each term; he has also added new variables and continued to perform reliability analyses, thereby ensuring its integrity with each release. Today's version of the Database houses 247 pieces of information for each case, roughly broken down into six categories: (1) identification variables (e.g., citations and docket numbers); (2) background variables (e.g., how the Court took jurisdiction, origin and source of the case, the reason the Court agreed to decide it); (3) chronological variables (e.g., the date of decision, term of Court, natural court); (4) substantive variables (e.g., legal provisions, issues, direction of decision); (5) outcome variables (e.g., disposition of the case, winning party, formal alteration of precedent, declaration of unconstitutionality); and (6) voting and opinion variables (e.g., how the individual justices voted, their opinions and interagreements).

The success of Spaeth's Database is beyond question. Virtually all systematic analyses of the contemporary Supreme Court and its members have relied on it. This holds for research conducted by social scientists and their graduate students and, increasingly, by legal academics; and it holds for quantitative and qualitative studies, as well as those more descriptive in nature. In fact, several inventories of peer-reviewed journals show that it is the rare article on the Court that derives its data from an alternative source. Monographs published by top presses also regularly rely on the Database, and the many numerical studies of the Court receiving public attention in recent years have made liberal use of the data it houses. By the same token, journalists seeking to illuminate dimensions of the Court's work regularly deploy Spaeth's product; indeed, Linda Greenhouse, the Pulitzer Prize-winning reporter, once referred to it as "a computerized treasure trove...created under a grant from the National Science Foundation," and has cited it (or research relying on it) in her writings.

In short, the U.S. Supreme Court Database has not just helped fill gaps in our knowledge. It is one of those rare creatures in the law and social science world: an invention that has substantially advanced a large area of study, inspiring research by scholars hailing from no fewer than three and as many as seven disciplines.

This Website

However invaluable the Database, some of its features made it difficult to use. First, only those with knowledge of statistical software packages could operate it, which excluded the vast majority of law professors, humanists, policy makers, journalists, even some social scientists, and of course many undergraduate and law students. Second, even knowledge of Stata, SPSS, and the like was insufficient. To select the subset of cases of interest, users also had to understand the various identification variables. So, for instance, if one hoped to analyze only orally argued cases resulting in a signed opinion of the Court, a particular combination of values of the ANALU and DEC_TYPE variables was required.
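To give a sense of the kind of selection this involved, here is a minimal sketch in Python. It is an illustration only: the numeric codes, the toy records, and the helper name select_cases are hypothetical, and the actual ANALU and DEC_TYPE values would have to be taken from the Database documentation.

```python
# Hypothetical codes, chosen for illustration; the real values must be
# looked up in the Database documentation.
ASSUMED_ANALU = "0"        # assumed code for the desired unit of analysis
ASSUMED_DEC_TYPES = {"1"}  # assumed code(s) for orally argued, signed opinions


def select_cases(rows, analu=ASSUMED_ANALU, dec_types=ASSUMED_DEC_TYPES):
    """Return the rows whose ANALU and DEC_TYPE match the requested codes."""
    return [
        row for row in rows
        if row["ANALU"] == analu and row["DEC_TYPE"] in dec_types
    ]


# Toy records invented for this example; they are not real Database entries.
sample = [
    {"case": "A", "ANALU": "0", "DEC_TYPE": "1"},
    {"case": "B", "ANALU": "0", "DEC_TYPE": "2"},
    {"case": "C", "ANALU": "1", "DEC_TYPE": "1"},
]

selected = select_cases(sample)
print([row["case"] for row in selected])
```

The codes are kept as strings because rows read from a delimited text file (for example with Python's csv.DictReader) arrive as strings; a user working from a different export would adapt the comparison accordingly.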

Happily, modern technology, unavailable at the time Spaeth created the Database, makes learning arcane variable names and values unnecessary; it also enables movement away from statistical software packages for those with little need to learn them.

It is this modern technology that we have now attached to the Spaeth Database. Via this site, users can easily access all variables (with descriptive names) and, using an intuitive search interface, can quickly select the set of cases they wish to explore, whether a single decision or, say, all First Amendment disputes that were orally argued and resulted in a signed opinion. Once a selection is made, users can generate a list of the selected cases (with links to full opinions) or conduct various analyses, from the very basic to the more complex. Should they wish, they can download the entire Database or subsets of it in one of seven formats (e.g., Stata, SPSS, Excel) without any special skills. Finally, because the site houses legacy versions of the Database, users are now able to replicate any analysis conducted with earlier versions of the data, whether they downloaded (but discarded) those versions or used them on the site.

Modernizing the interface is not all we have done. First, previous users will see that we have made some improvements to the existing contents. For example, the original version of the Database did not include the names of cases, and for recent terms it provided only the L.Ed. citation. Not only have we added the case names and the full complement of citations (U.S., S.Ct., L.Ed., and U.S. Lexis); we have also linked each case to the full text of the Court's decision. This innovation enables a broader set of users to engage in a hybrid method of research that, we believe, lends even greater power to the Database. Social scientists, law professors, legal historians, and their undergraduate and graduate students can now identify the cases of interest with systematic rigor and then, for particular projects, perform more nuanced textual and linguistic analysis of the opinions themselves.

Second, and less transparently, in the process of creating the new interface we also devised a sophisticated data management system that makes data entry far more efficient and accurate. Now, as the Court hands down its decisions, information can be coded, integrated into the Database, and made immediately available for analysis or downloading on the new website. The occasional adjustment to the data is also easier to make, and each revision is reported on the website.

