Clarkson University Criminal Justice System Software Testing

Background

Forensic DNA evidence has been an important part of criminal investigations and prosecutions for decades, but in recent years software has been introduced to interpret evidence that was previously considered too complex for manual analysis. For example, this software, known as probabilistic genotyping software, is used when the evidence is a mixture of contributions from many people, when only trace amounts of evidence are collected, or when the evidence has been substantially degraded by environmental factors. Such software compares the probability of the evidence under two competing hypotheses, most commonly that the defendant contributed to the DNA sample versus that an unknown person contributed to it. However, what lab technicians cannot do manually, they also cannot verify manually, and groups such as the Legal Aid Society, along with journalists who investigate machine bias and algorithmic injustice, have raised substantial concerns about the accuracy and reliability of the software's results.
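One common way to express a comparison between two competing hypotheses is a likelihood ratio: the probability of the evidence under the first hypothesis divided by its probability under the second. The sketch below only illustrates that arithmetic. The function name and the input values are hypothetical, and it does not represent how any particular probabilistic genotyping package computes the underlying probabilities, which is the hard and contested part.

```python
def likelihood_ratio(p_evidence_given_h1: float, p_evidence_given_h2: float) -> float:
    """Return P(evidence | H1) / P(evidence | H2).

    H1: the defendant contributed to the DNA sample (hypothetical label).
    H2: an unknown person contributed instead (hypothetical label).
    Both inputs are assumed to be supplied by some upstream model.
    """
    if p_evidence_given_h1 < 0 or p_evidence_given_h2 < 0:
        raise ValueError("probabilities must be non-negative")
    if p_evidence_given_h2 == 0:
        raise ValueError("P(evidence | H2) must be greater than zero")
    return p_evidence_given_h1 / p_evidence_given_h2


if __name__ == "__main__":
    # Made-up values for illustration only; not drawn from any real case.
    ratio = likelihood_ratio(0.5, 0.125)
    print(f"The evidence is {ratio:g} times more probable under H1 than under H2")
```

A ratio above 1 favors the first hypothesis and a ratio below 1 favors the second. The ratio is only as trustworthy as the two probabilities fed into it, which is why the correctness of the software that produces them matters.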

The landscape of probabilistic genotyping software is complicated. There are over a dozen software packages: some developed by individuals and later commercialized, others developed by corporations, and still others developed in-house by forensic labs. This software is often expensive, and developers are keen to protect their intellectual property and reputation. They often strenuously resist any attempt to question methods they consider proprietary, even when defense experts would review the system under a protective order. As a result, defendants are being sent to jail based on the results of proprietary software, sometimes even in the absence of other physical evidence, even though there are substantial open questions about its accuracy and reliability. For example, has the software been thoroughly tested on a wide range of racial groups? Has it proven effective with mixtures that contain multiple contributors from the same family? And even more simply, was the software developed using sound software engineering practices that give confidence that major bugs have been identified and fixed? We all know what it is like to run beta-level software and observe bugs that are eventually reported and fixed. What would happen if, every time a bug was reported, the response was simply, "You are just complaining because you are guilty"?

The startling discovery of a hidden function in the code of one such probabilistic genotyping program, the very first time it was reviewed by a defense expert, has raised questions about evidence used in over a thousand cases. The Forensic Statistical Tool (FST) was developed by New York City's Office of Chief Medical Examiner (OCME) and has been used since 2010, both in New York and in other jurisdictions. When defense expert Nathan Adams conducted the world's first FST source code review in a federal criminal case, he found that the FST software as deployed included a function that does not reflect, and is even counter to, the methodology described by the laboratory in testimony, in peer-reviewed publications, and at scientific conferences. This hidden function discards data under certain circumstances, yet its existence was completely unknown until the source code was reviewed. In late 2017, in response to a motion filed by the defense and ProPublica, a federal judge unsealed the FST source code as well as affidavits submitted by expert witnesses who had reviewed it under protective order. The affidavits document problems revealed by the source code, including the exclusion of data that could affect the software's results, and raise broader concerns about sloppy software practices of the kind that often indicate the presence of bugs. These findings provided some of the first public evidence to support the concerns defense teams have been voicing for years, but much more work is needed.