Once thought to be infallible, unvalidated or improperly applied forensic science was a contributing factor in 47 percent of the first 325 DNA exonerations across the country, see The Innocence Project, “Causes of Wrongful Conviction” (last visited Aug. 11, 2018). For nearly a decade the Pennsylvania Innocence Project has successfully challenged the use of unvalidated or outdated forensic sciences through advocacy for those wrongly convicted, and by working to prevent these injustices from happening in the first place. The Pennsylvania Innocence Project has secured the release of clients convicted based on an outdated arson investigation, an invalid “shaken baby syndrome” diagnosis, and a flawed bite-mark comparison. In addition to challenging flawed forensic sciences in post-conviction litigation, the Project consults pretrial in cases involving questionable forensic evidence.

Although the improper use of forensic techniques played out across the country in individual trials and exonerations, the full scope of the problem only became clear in 2009 with the publication of the National Academy of Sciences (NAS) report, “Strengthening Forensic Science in the United States: A Path Forward.” The NAS committee evaluated the most common forensic comparison disciplines, including fingerprints, firearms and toolmark examination, bite marks, bloodstain pattern analysis, and hair comparison. The committee found that, “with the exception of nuclear DNA analysis, however, no forensic method has been rigorously shown to have the capacity to consistently, and with a high degree of certainty, demonstrate a connection between evidence and a specific individual or source,” see Committee on Identifying the Needs of the Forensic Science Community, National Research Council of the National Academies, “Strengthening Forensic Science in the United States: A Path Forward,” 7 (2009).

Though touted in courtrooms as hard science, most comparison disciplines, the NAS committee found, lacked the statistical underpinnings and validation of true science. Most were developed in crime laboratories or police departments and had never been subjected to “strict scientific scrutiny.” The committee's findings highlighted two assumptions about comparison disciplines that had gone largely unchallenged in court: that a properly trained analyst can make an association between a known item (typically taken from the suspect) and an unknown item (typically found at the crime scene), and that a properly trained analyst can assess the rareness, or probative value, of that association (the likelihood that the unknown item could be associated with anyone other than the defendant or victim). Though most comparison disciplines have never been scientifically validated, courts admitted pattern evidence based on precedent alone, rarely exercising their gatekeeping function through rigorous admissibility hearings. Furthermore, even when a comparison discipline had some limited logical or scientific value, forensic analysts routinely testified in ways that exceeded the validity of the underlying science.

These problems persist today. In 2016, seven years after the NAS report, the President's Council of Advisors on Science and Technology (PCAST) issued a report highlighting the continued need for clarity about the scientific standards for the validity and reliability of forensic methods, see “Forensic Science in Criminal Courts: Ensuring Scientific Validity of Feature-Comparison Methods” (2016) (last visited Aug. 11, 2018). The council expressed grave concerns about both the foundational validity of comparison disciplines and their validity as applied by individual analysts.

An example is microscopic hair comparison analysis, a visual comparison of a questioned hair and a known hair under a microscope. An analyst would declare a “match” between the hairs based on an undefined number of similar “features.” The FBI developed this discipline in its crime lab in the 1960s, and it was never validated outside of that lab. From there, FBI analysts testified in thousands of federal and state criminal trials across the country. Attempts to validate microscopic hair comparison by calculating the probability of a random “match” were futile because the rarity of hair characteristics in the population is unknown. There is no way to say with any reliability that a hair came from one individual to the exclusion of all others.

Yet FBI analysts, and the state actors they trained, held themselves out as experts in this field and were routinely qualified to testify. FBI hair analysts were involved in approximately 3,000 cases in which an analyst made a positive association between two hairs. In 2012, following the exoneration of three men from Washington, D.C., who were convicted based on flawed testimony from FBI hair analysts, the FBI and the Department of Justice partnered with the Innocence Project and the National Association of Criminal Defense Lawyers to review the testimony of the FBI analysts. The review found that FBI hair analysts testified erroneously over 95 percent of the time, including in 32 death penalty cases, see Spencer Hsu, “FBI admits flaws in hair analysis over decades,” Washington Post (2015).

Although FBI analysts had no way of knowing how unique the microscopic characteristics of hair are, the chances that two hairs could “match” at random, or how accurate they were at associating hairs, the review of their testimony found that they routinely overstated the strength of their conclusions by: stating or implying that an evidentiary hair could be associated with an individual to the exclusion of all others; assigning a statistical weight to the association based on their own experience in the lab; or citing the number of hairs they themselves had associated in place of a valid statistical weight, see Norman L. Reimer, “The Hair Microscopy Review Project: An Historic Breakthrough for Law Enforcement and a Daunting Challenge for the Defense Bar,” The Champion (2013) (last visited Aug. 11, 2018).

The effect of this erroneous testimony is not limited to cases in which FBI analysts actually testified. The FBI also trained thousands of state and local analysts to conduct hair comparison and testify about it. Analysts from at least 48 states, including Pennsylvania, were trained over a period of 30 years. The same pattern of erroneous testimony has led to an exoneration in North Carolina and to reviews of state lab work in Montana, Washington, D.C., Texas, and Iowa, among others, see Spencer Hsu, “Review of FBI forensics does not extend to federally trained state, local examiners,” Washington Post (2012).

In November 2017, the Pennsylvania Supreme Court held that defendants convicted based on microscopic hair comparison testimony who filed petitions under the Post-Conviction Relief Act (PCRA) within 60 days of the FBI's April 20, 2015, admission of error satisfy the newly discovered facts exception to the PCRA's time bar, see Commonwealth v. Chmiel, 173 A.3d 617 (Pa. 2017). A handful of PCRA cases have since been remanded for consideration of the new evidence on the merits.

The problems that plagued hair comparison plague all pattern and comparison disciplines, even the vaunted fingerprint analysis. As these cases wind their way through the courts, it is important to keep in mind the lessons of the NAS and PCAST reports and to encourage judges to analyze the foundational validity of each discipline, as well as its validity as applied by an individual analyst, rather than simply relying on precedent for admission.

Amelia Maxfield joined the staff of the Pennsylvania Innocence Project in February. She is completing a two-year fellowship focused on litigating flawed forensic science cases. Prior to joining the staff, Maxfield was a public defender and post-conviction counsel for the National Association of Criminal Defense Lawyers, where she helped manage the FBI microscopic hair comparison review.