ADDITIONAL CASES The People of the State of New York, Plaintiff v. David Seepersad, Defendant; 2939/16 The People of the State of New York, Plaintiff v. Johnnie Jackson, Defendant; 727/17 DECISION AND ORDER The three captioned defendants were separately indicted in 2015, 2016, and 2017 on completely unrelated gun possession charges. In each case the defense challenged the introduction of DNA evidence created by the Forensic Statistical Tool (“FST”). The FST was an analytic tool with which the city’s Office of the Chief Medical Examiner (“OCME”) assigned “likelihood ratios” to forensic samples made up of DNA from not one, but two or three, individuals.1 A scientist could use FST results to opine that a two-person DNA mixture was X times more (or less) likely to be made up of DNA from a particular known individual and one unknown, unrelated individual than DNA from two unknown, unrelated individuals. Similarly, the analyst could testify that a three-person mixture was X times more (or less) likely to be from a particular known individual and two unknown, unrelated individuals than from three unknown, unrelated individuals. The defendants’ challenges asserted that the FST results were not the product of procedures generally accepted in the “community” of DNA forensic scientists. This court once before faced such a claim, and ruled after a Frye2 hearing that FST results should indeed be excluded on that ground. People v. Collins, 49 Misc3d 595 (Sup Ct Kings Co 2015). For this case, this court assessed whether developments in the community of forensic scientists since the Collins decision of July 2, 2015, should change that conclusion. For the reasons noted below, this court decided on October 16, 2017 that it should again exclude the challenged FST evidence. The court has continued reviewing developments as best it can in the period since, seeing no basis to re-think the conclusion that nothing has changed. The decision was announced well over a year and a half ago. 
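The structure of such a likelihood ratio can be illustrated with a toy calculation. The Python sketch below works a single locus of a hypothetical two-person mixture; the allele frequencies are invented, the model ignores stochastic effects entirely, and nothing in it reproduces the FST's actual formulae.

```python
from itertools import permutations

# Toy single-locus likelihood ratio for a two-person mixture.
# Allele frequencies are invented, and stochastic effects (drop-out,
# drop-in) are ignored; this is NOT the FST's actual computation.
mixture = (11, 12, 14, 15)                       # alleles observed at one locus
freq = {11: 0.10, 12: 0.05, 14: 0.20, 15: 0.08}  # invented population frequencies
suspect = (11, 12)                               # suspect's genotype at this locus

def het(pair):
    """Hardy-Weinberg probability of a heterozygous genotype."""
    a, b = pair
    return 2 * freq[a] * freq[b]

# Prosecution hypothesis: suspect plus one unknown contributor.
# The unknown must supply the alleles the suspect does not explain.
remaining = tuple(a for a in mixture if a not in suspect)
p_h1 = het(remaining)

# Defense hypothesis: two unknown contributors. Sum over every
# distinct way two heterozygous genotypes can account for the four
# observed alleles.
p_h2 = 0.0
seen = set()
for perm in permutations(mixture):
    g1, g2 = tuple(sorted(perm[:2])), tuple(sorted(perm[2:]))
    if (g1, g2) not in seen:
        seen.add((g1, g2))
        p_h2 += het(g1) * het(g2)

lr = p_h1 / p_h2
print(f"single-locus likelihood ratio: {lr:.2f}")
```

A real report multiplies such per-locus ratios across all of the loci examined, which is how figures in the millions arise.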
Even before that date, OCME discontinued use of the FST. Likelihood ratios for DNA mixtures are now calculated with a commercial product known as STRmix. After this court’s Collins decision, all three defendants pleaded guilty. This judge’s other obligations, and the lack of urgency that has resulted from the retirement of the FST, have long delayed this opinion. In the meantime, the overwhelming majority of New York’s trial judges considering the matter have disagreed with Collins’ views on the FST, and the opinion is essentially a dead letter. One might ask: what is the point of bringing up the FST issue in a 2019 opinion? Perhaps there is none. But several factors persuade this judge otherwise. First, many defendants convicted after trials in which the People introduced FST evidence may still wish to challenge the FST on appeal. To date, the several appellate decisions deferring to the trial courts’ exercises of discretion will not encourage them. See, e.g., People v. Degracia, 173 AD3d 1199 (2d Dept 2019); People v. Easley, 171 AD3d 785 (2d Dept 2019); People v. Gonzalez, 155 AD3d 507 (1st Dept 2017). But perhaps this opinion, whether or not likely to change a result, should be available to future appellants. Second, this judge has been distressed by unfair attacks on witnesses who testified for the defense at the Frye hearing in Collins. The witnesses are very accomplished scientists, and it is unlikely that they spend their weekends worrying about what New York trial judges think of them. Still, a rebuttal for the record is in order. Third, in 2016 the President’s Council of Advisors on Science and Technology (“PCAST”) published an extremely valuable report on the use of forensic science in court cases. Among the techniques discussed was DNA mixture analysis. The report has since been unfairly and sometimes nonsensically attacked by litigants and judges. 
Contrary views on those attacks may be of substantial interest to New York courts hearing similar attacks on the report as to DNA analysis, and also other types of forensic evidence, in the future. Fourth, there is also the prospect, likely or not, that this judge’s views may in the future influence other trial judges considering how to decide Frye issues about forensic science that are unrelated to the FST. A final preliminary matter: the procedural circumstances of the cases are narrated as matters stood when the court’s decision was announced in October 2017. I Defendant Marcus Thompson was indicted for Criminal Possession of a Weapon in the Third Degree after a police officer allegedly recovered a loaded handgun from a clothes hamper in his apartment. At Thompson’s trial the People hope to introduce evidence that a mixture of DNA, thought to be from three individuals, was found on the weapon. The mixture was analyzed at New York City’s Office of the Chief Medical Examiner with OCME’s Forensic Statistical Tool. An analyst would testify to the FST results: that the mixture of DNA is approximately 21,800 times more likely if it is from Thompson and two unknown, unrelated individuals than if it is from three unknown, unrelated individuals. The analyst would add that this is “very strong support” for the conclusion that Thompson touched the handgun. Defendant David Seepersad was indicted for Criminal Possession of a Weapon in the Third Degree after a police officer allegedly recovered a defaced handgun and ammunition inside a bag in his apartment. At Seepersad’s trial the People hope to introduce evidence that a mixture of DNA, thought to be from two individuals, was found on the weapon. The mixture was analyzed with the FST, and the analyst would testify to the FST results: that the mixture is about 172 million times more likely if it is from defendant and an unknown, unrelated individual than if it is from two unknown, unrelated individuals. 
This is “very strong support” for the conclusion that Seepersad touched the handgun. Defendant Johnnie Jackson was indicted for Criminal Possession of a Weapon in the Second Degree after a police officer allegedly recovered a loaded firearm from a dresser drawer in his apartment. At Jackson’s trial the People hope to introduce evidence that a mixture of DNA, thought to be from three individuals, was found on the weapon. The mixture was analyzed with the FST, and the analyst would testify to the FST results: that the mixture is about 52.4 million times more likely if it is from defendant and two unknown, unrelated individuals than if it is from three unknown, unrelated individuals. This is “very strong support” for the conclusion that Jackson touched the gun. All three defendants have moved to preclude the FST testimony. They argue that analysis from the FST is not generally accepted in the relevant scientific community, and thus that the proffered testimony fails the Frye test of admissibility. See Frye v. United States, 293 F 1013 (D.C. Cir 1923). All parties agree that the Frye standard applies here. II A As stated, this judge addressed the general issue in People v. Collins, 49 Misc3d 595 (Sup Ct Kings Co 2015). In Collins this judge concluded in mid-2015, as urged by the defendants in two Brooklyn cases, that the FST processes were not generally accepted in the relevant scientific community — and thus that they flunked the Frye test.3 The Collins opinion ended with this thought: This court conclude[s] that evidence derived…from the FST is not yet proved to be admissible under the Frye test. * * * [T]his court understands the sincere effort that [OCME has] put into the development of the FST. They must continue, if they are to persuade. People v. Collins, 49 Misc3d at 629, supra. 
The court advised the parties in the three New York County cases now under consideration that its focus would be on whether developments since the Collins decision show that a consensus in favor of the FST has emerged in the relevant scientific community. After review of the parties’ submissions and other relevant materials, the court concludes that no such consensus has emerged. But before explaining that, this court will interrupt itself to provide a short refresher on the FST. B Apart from identical twins, every human has a unique genetic pattern made up of DNA. Forensic scientists have developed procedures for identifying the sources of DNA samples that may be relevant to a crime. Often only one person’s DNA is recovered at, for example, a crime scene. In New York City, at times pertinent here, when DNA was recovered an analyst would examine the material at 15 standard positions of the genome — 15 “loci” — to create a DNA profile of the source of the DNA. While so limited an analysis will not provide the contributor’s unique profile, the profile produced still will be vanishingly rare. It thus will serve as well as would a profile developed from all loci in the genome.4 Individuals’ profiles vary because the DNA patterns at the 15 chosen loci differ from one person to another. At each locus the pattern of the chemical components of one person’s DNA will repeat a different number of times than will the patterns of most other people. The number of repeats — 11, 15, 17 or whatever — at each locus in a sample is calculated by the DNA analyst. These numerical “alleles,” when embodied in individuals’ profiles, will distinguish one person from another. To emphasize, an “allele” is not a tangible thing; it is simply the number of times that a pattern repeats at a locus. If one allele at a locus for person A is 11, A’s DNA cannot be mistaken for that of a person who does not have 11 repeats of the pattern at that locus. 
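The “vanishingly rare” point is simple arithmetic. In the hypothetical Python sketch below, every per-locus genotype is assumed, purely for illustration, to occur in 5 percent of the population; real frequencies come from population databases and vary by locus and genotype.

```python
# Hypothetical illustration of why a 15-locus profile is vanishingly
# rare. The flat 5% per-locus genotype frequency is an invented,
# round number, not a real database value.
NUM_LOCI = 15
per_locus_freq = [0.05] * NUM_LOCI

profile_freq = 1.0
for f in per_locus_freq:
    profile_freq *= f  # loci treated as independent, so multiply

print(f"approximate profile frequency: {profile_freq:.2e}")
```

Even under this crude independence assumption the product, about 3 x 10^-20, is far smaller than one over the human population, which is why so limited a profile serves nearly as well as one drawn from the whole genome.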
In a “pristine” sample from one person there can be up to 30 distinguishing alleles at the 15 chosen loci. That number is 30, and not 15, as everyone inherits an allele at each locus from his mother and another from his father. There may be somewhat fewer than 30, if the maternal and paternal alleles at one or more of the loci by chance happen to be identical — i.e., if the person is at that locus a “homozygote.” But real-world DNA samples choose not to oversimplify the work of DNA analysts. Many contain mixtures of the DNA of two or more individuals. A pristine mixture from two people could contain 60 alleles. A pristine mixture from three people could contain 90 alleles. One from four people could contain 120.5 If one person’s DNA is analyzed, it is easy to know that two alleles at the same locus are from the same person and go together into that person’s DNA profile. The daunting task of the analyst of a mixture is to determine which of more than two alleles at each locus are one contributor’s pair — and also match up with particular alleles from the other 14 loci to create the profile of that one contributor. The lucky analyst will find that the mixture is a simple one. First, in a two-person mixture one contributor — the victim of a sex crime, perhaps — may be known. The alleles in that person’s profile can in effect be subtracted from the results, to expose the other contributor’s profile. Second, some mixtures can be “deconvoluted” because the contributors provided greatly different amounts of DNA. If the analyst determines that half the alleles in a two-person mixture are present in four times the quantity of the other half, he may be able to assign alleles to “contributor A” and “contributor B” profiles based on that distinction. Similarly, if one-third of the alleles present in a three person mixture are present in four times the quantity of the rest, the analyst may be able to create a profile for at least one of the three contributors. 
However, the unlucky analyst will not be tasked with examining so “simple” a mixture. He will find himself unable to create individual profiles, because there will be no way to determine which alleles (if any) in a mixture were left by a particular person of interest. The analyst will be left to assess the probability that DNA of the person of interest is in the soup of alleles provided by a number of contributors. In the past, analysts provided gross estimates of such probabilities. An analyst might say, for example, that if many alleles of a person of interest were not present in a mixture, this person was “excluded” as a contributor. If all of that person’s alleles were present, the person of interest “could be” a contributor. If most alleles were present, and the absence of the others could be explained, the person of interest “could not be excluded” as a contributor. Such gross reports were, for obvious reasons, not satisfactory. And indeed, in 2009 the National Academy of Sciences (“NAS”), addressing forensic evidence more generally, concluded that expert opinions connecting suspects to crimes should be based on hard probability statistics.6 Other authoritative bodies have since agreed. For example, SWGDAM7 stated its view in 2010. SWGDAM Interpretation Guidelines for Autosomal STR Typing by Forensic DNA Testing Laboratories, 13-14, https://www.forensicdna.com/assets/swgdam_2010.pdf. And PCAST8 said the same thing in 2016. Forensic Science in Criminal Courts: Ensuring Scientific Validity of Feature Comparison Methods (hereafter “PCAST Report”) at 53-54, 56, https://obamawhitehouse.archives.gov/sites/default/files/microsites/ostp/PCAST/pcast_forensic_science_report_final.pdf. These opinions are not those only of such “official” bodies.9 For DNA analysis of individual samples and of the obvious contributors to simple mixtures, satisfying that concern was not a problem. “Hard numbers” could easily be provided, and indeed were already being provided. 
But it was a problem as to complex mixtures, where vague conclusions like “could not be excluded” did not remotely meet the demand for hard statistics. Attaching hard numbers to the probability of one person’s inclusion in a complex DNA mixture is one of the most difficult tasks now facing forensic science. But brave attempts to find a reliable method have been made, and one of them is OCME’s creation of the FST. As explained in more detail in Collins, the FST uses Bayesian assessment of electrophoresis results and a computer in an effort to calculate the probability that the DNA of a person of interest is present in a mixture. C But the Bayesian results do not depend only on Bayes’ universally accepted mathematics and a computer. Bayes said nothing about what should be computed with his math, and in particular of course about how probabilities could be identified from electrophoresis analysis of a DNA sample. Critical to Bayesian analysis are the formulae by which the computer is told to assign probabilities to different outcomes. If a computer is wrongly told that half of struck baseballs will go over the fence in fair territory, then a Bayesian analysis and a computer would conclude that 50 percent of struck balls will yield home runs — and Babe Ruth would look like a very poor slugger. In the real world, DNA samples relevant to a crime are not “pristine.” Samples are complicated by “stochastic effects” that may eliminate the DNA that is represented by an allele, and thus eliminate that allele from the resultant profile. Likewise, stochastic effects may add an allele to the profile, one that should not be there. If the computer is wrongly told that there will never be allelic “drop-out”10 at a particular locus, or is given incorrect information on the likely extent of the drop-out, the probability findings for a subject’s inclusion in a mixture will be skewed by inaccurate information as much as an evaluation of Babe Ruth. 
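The sensitivity to an assumed drop-out rate can be made concrete. The Python sketch below uses a deliberately simplified drop-out model loosely in the spirit of the published literature; the model, the 10 percent allele frequency, and both candidate drop-out rates are invented, and none of it reproduces OCME's formulae.

```python
# Toy model: the suspect is heterozygous (a, b) at a locus, but only
# allele a appears in the evidence profile. How strongly does the
# reported likelihood ratio depend on the assumed drop-out rate?
# All numbers are invented for illustration.

def locus_lr(dropout, f_a=0.10):
    d = dropout
    # Prosecution: suspect contributed; a survived and b dropped out.
    p_hp = (1 - d) * d
    # Defense: an unknown contributed. Either an (a, a) homozygote
    # that did not fully drop out, or an (a, x) heterozygote whose
    # other allele x dropped. Homozygote drop-out approximated as d**2.
    p_hd = f_a ** 2 * (1 - d ** 2) + 2 * f_a * (1 - f_a) * (1 - d) * d
    return p_hp / p_hd

for d in (0.05, 0.30):
    print(f"assumed drop-out rate {d:.2f}: likelihood ratio {locus_lr(d):.2f}")
```

With these invented inputs the ratio changes substantially as the assumed drop-out rate moves from 5 to 30 percent; feeding the computer a wrong rate changes the answer, which is the Babe Ruth point.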
And if the computer is wrongly told that a crime scene mixture will never be contaminated by extraneous “drop-in” alleles, or is given incorrect information on the likely extent of that drop-in, the Bayesian analysis will wrongly assess the probability that a possible contributor’s DNA is in the mixture. Through no fault of his own, Bayes’ math does not work with flawed inputs about baseball or about DNA. What is at issue in Frye analysis of the FST and similar programs is whether the relevant scientific community generally approves of the ways in which the OCME formulae assess the probability that an individual’s alleles are in a mixture. The bottom line: critical to a proper formula to measure the impact of stochastic effects are the real-world causes of and frequency of drop-out and drop-in. As noted, law enforcement authorities do not recover “pristine” DNA samples. Crime scene samples like those recovered in these three cases can be contaminated or degraded. Time or sunshine, for example, can degrade a sample. Alleles drop out, and cannot be detected in the lab, when samples are so affected. Alleles of higher and lower numbers, and alleles at different loci, will drop out at different rates. One person’s alleles will degrade more quickly than those of another. Drop-out and drop-in alleles will as a general matter alter small samples much more often than large samples. The formulae for mixture analysis should be based on statistics for these factors. The FST does assign probability statistics to whether the disappearance of alleles, or the untoward appearance of alleles, may have occurred in a large or small sample. Other computer programs, many commercially available, do this as well. But, as noted in Collins, some methods used in the FST to assess probabilities are novel. 
For example, the FST is almost alone in the world in calculating the likelihood of stochastic effects from a simple early assessment of the overall quantity of DNA in recovered samples, rather than from the heights of the individual allele indicators (“peaks”) in post-electrophoresis print-outs of DNA profiles. Even OCME acknowledges that its assessments of “quant” are quite imprecise. Moreover, the likelihood of a particular allele being present at a specified locus varies by race; the FST uses its own and lonely calculations of distinctions among racial groupings to estimate those various likelihoods. The FST stands out from some mixture programs in another, extremely critical way. The results of mixture analyses can be impacted hugely by the preliminary assessment of the number of contributors to a mixture. The statistics that will be produced differ, often dramatically, based on the number of contributors thought to have DNA in a sample. The OCME analyst will make a judgment call on that often thorny question of contributor numbers, and will do so under the protocols OCME has created. Based alone on this hypothesis about contributor numbers, the FST then reports the probability that the mixture contains the alleles of the person of interest — if that OCME call was correct. There is no information for the jury about how probable it is that the analyst was correct in calling the number of contributors. The FST will not report the probability if the actual defense theory of the case — its hypothesis — is that a different number of people were contributors. A related feature underscored in Collins is that the FST is a “black box.” An independent expert has no ability to calculate probabilities under FST protocols should there actually be a different number of contributors. 
Moreover, there are no tests of the extent that subjective analysis — perhaps affected by the FST’s inability to report statistics if there are more than three contributors — plays a role in OCME analysts’ conclusions about the number of contributors. No one knows whether consistent assessments of the contributor numbers in a particular case would be made by other experts — or how often they would be made. Notably, it is extremely problematic to make a call as to whether three, or instead four or more, people contributed to a complex mixture. Estimates as to the likelihood of an incorrect estimate where there actually are four or more contributors run to over 50 percent. See, e.g., Perez et al., Estimating the Number of Contributors to Two-, Three-, and Four-person Mixtures Containing DNA in High Template and Low Template Amounts, Croat. Med. J. 52 (3) 314 (2011).11 Jurors are told none of these things. D The People believe that the members of the relevant scientific community are not troubled by questions about the FST’s assessment of allelic probabilities. This judge noted more than once in Collins that he is not charged with deciding the validity of novel scientific procedures…. Judges should be “counting scientists’ votes,” and not “verifying the soundness of a scientific conclusion.” People v. Collins, 49 Misc3d at 603, supra (citations omitted). The court’s job is to resolve whether, as the People assert, there is a general consensus in the relevant community in favor of the FST procedures. In Collins this judge evaluated lengthy testimony and numerous exhibits, and in 2015 concluded that there was no such consensus. Among the points seriously contested among forensic scientists: 1) should independent analysts have been offered a chance to assess the FST as it was being validated? 2) is the FST “black box” approach, which guarantees that only the FST’s creators can fully evaluate results in particular cases, consistent with good science? 
3) is it appropriate that only OCME’s call as to the number of contributors, and thus its estimates of the likelihood of stochastic effects, be considered, particularly in “LCN” cases? 4) is the preliminary “quant” assessment a valid way to determine the likelihood of stochastic effects? 5) did OCME’s validation tests on pristine laboratory samples provide a reliable basis for assessing probabilities of “real world” degradation effects? 6) was it appropriate that OCME amended the results of its tests to create more “straight line” conclusions about allele probabilities than OCME’s testing had provided? 7) were the probabilities of stochastic effects for different races, and “Asians” in particular, reasonably calculated in the FST formulae? See People v. Collins, 49 Misc3d at 617-20, supra. III A Again, the question presented by the three defendants’ applications is whether developments since 2015 should change this judge’s conclusion that the FST failed its Frye exam. The court believes that there have been no significant developments since Collins that favor the FST, and thus that there still is no consensus in the scientific community in favor of the FST. Relevant are scientists’ views on DNA mixture analysis in general, and others on the FST’s methods in particular. In this section, the court will consider those views. The court begins by recognizing that the People suffer here under a handicap. This court is assessing whether a change in scientists’ views on the FST is evident — but scientists have for good reasons not been examining the FST. In the first place, OCME remains the only institution that has used the FST. That obviously reduces scientists’ interest in studying its reliability. This factor is reinforced by the circumstance that even OCME no longer uses the FST. As of January 1, 2017, OCME began using a commercial program, STRmix, to analyze mixtures. 
And beyond even that, the FST remains generally immune to peer review because OCME resisted efforts by others to examine it. As noted in Collins, there may well have been understandable proprietary interests behind that resistance. But many scientists cannot comfortably form opinions about a program which they cannot fully study. The continuing inability of scientists (and others) to assess and test the FST is a huge obstacle to a conclusion that it passes the Frye test. In short: since mid-2015 scientists have had little ability or reason to assess, and approve or disapprove, the procedures employed in the FST. But what may or may not be interesting to scientists does not control whether the efficacy of FST analysis remains of interest to defendants against whom FST results are reported at trial. And so the troubling factors of 2015 must be re-assessed to the limited extent that subsequent developments in mixture analysis permit, even if those developments are not focused on the FST. B The court’s first concern is with the most fundamental factor in mixture analysis, and among the most difficult calls in forensic science. That factor concerns the number of people who contributed to complex mixtures like those at issue in these cases. See Gill et al., Genotyping and Interpretation of STR-DNA: Low-template Mixtures and Database Matches — Twenty Years of Research and Development, FSI Genetics 18: 100-17 (2015) (“Likelihood ratios are crucially dependent upon the assumptions or the propositions used in the models [e.g. number of contributors.]“). Again, probability conclusions based on the call often vary to a high degree. In particular, a decision that there were three contributors to a mixture will usually yield a probability report much different from a decision that there were two contributors. 
A conclusion that there were four contributors will likely yield a report still farther from one that assumes a two-person mixture — if a four-contributor analysis can even be done.12 And the odds generally become yet more divergent as the number of contributors goes farther up. The difficulty in deciding the number of mixture contributors seems universally to be accepted. Critically, those who agree include even the FST’s creators, Dr. Theresa Caragine and Dr. Adele Mitchell. See Perez et al., supra, Croat. Med. J. 52 (3) 314-26 passim (2011). One might say, therein lies the rub. The FST, unlike programs such as STRmix, relies on a single estimate of the contributor number, and analyzes a sample based solely on that estimate. That number defines not only the People’s Bayesian hypothesis but also the supposed “defense hypothesis.” With the FST, the jury learns nothing about the odds that the contributor number chosen by the FST analyst is correct. No information is imparted to the jury as to the protocols used in making the decision. The jury is not told how often the true number is found under the protocols. The jury does not know how subjective an analyst’s choice is. To the extent that subjective choices are made, the jury does not know (from proficiency tests) how often an individual analyst makes correct choices. That is true even though FST protocols create a presumption that the lowest possible number of contributors be assumed, without defining what might lead an analyst to override that presumption in a particular case. See New York City Office of the Chief Medical Examiner Forensic Biology Protocols for Forensic STR Analysis, Forensic Statistical Tool (FST) at 362 (effective December 24, 2015). Some authorities agree that “there needs to be significant court involvement with the process of formulating alternative propositions….” Gill et al., FSI Genetics 18: 100, supra. 
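How much the assumed contributor count can matter is visible even in a stripped-down model. The Python sketch below brute-forces likelihood ratios for one four-allele locus under an assumption of two contributors and then three, with invented allele frequencies and no drop-in or drop-out; it illustrates only the sensitivity, not any program's actual method.

```python
from itertools import combinations_with_replacement, product

# Invented allele frequencies for the four alleles observed at one locus.
freq = {11: 0.10, 12: 0.05, 14: 0.20, 15: 0.08}
mixture = frozenset(freq)
suspect = (11, 12)

def g_prob(g):
    """Hardy-Weinberg probability of genotype g."""
    a, b = g
    return freq[a] ** 2 if a == b else 2 * freq[a] * freq[b]

# Every genotype that can be built from the observed alleles.
genotypes = list(combinations_with_replacement(sorted(mixture), 2))

def p_evidence(n_unknowns, known=()):
    """Probability of the observed allele set given the known genotypes
    plus n unknown contributors, with no drop-in or drop-out: the
    contributors together must account for exactly the observed alleles."""
    base = set()
    for g in known:
        base.update(g)
    total = 0.0
    for combo in product(genotypes, repeat=n_unknowns):
        alleles = set(base)
        for g in combo:
            alleles.update(g)
        if alleles == mixture:
            p = 1.0
            for g in combo:
                p *= g_prob(g)
            total += p
    return total

for n in (2, 3):
    lr = p_evidence(n - 1, known=(suspect,)) / p_evidence(n)
    print(f"assuming {n} contributors: likelihood ratio {lr:.1f}")
```

With these invented numbers the reported ratio changes by roughly a third depending solely on whether two or three contributors are assumed, with no change at all in the underlying evidence.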
Second, even if the jury could assess the odds that an FST analyst correctly identified the number of contributors, the FST provides no information about what the probability results would be if there actually were a different number of contributors. To repeat, OCME will not analyze a mixture on a theory, an actual “defense hypothesis,” that involves a number of contributors different from that chosen by the OCME analyst. But an analyst’s report after an assumption about contributor numbers is at best a beginning, not an end, to a full report on the results. There seems to be very strong disagreement with the FST approach on this front — particularly on whether a different number of contributors could be employed in the prosecution and defense hypotheses when the correct number is open to doubt. See, e.g., Buckleton et al., The Effect of Varying the Number of Contributors in the Prosecution and Alternate Propositions, FSI Genetics 38: 225-31 (2019)13; Gill et al., FSI Genetics 18: 100, id. (“There is no reason for numbers of contributors to be the same under alternate hypotheses”). And the official forensic oversight body in the United Kingdom concurs. Forensic Science Regulator, Guidance — DNA Mixture Interpretation, FSR-G-222, Issue 2 (2018) at 28-29. It is not requiring the impossible to suggest that the jury hear at least more than the FST provides. Efforts have been underway for some time to state the probability that a call of the number of contributors is correct.14 And very certainly the probabilities, should there be a different number of contributors, can be reported. Unlike the FST, some programs (such as STRmix) apparently take into account those probabilities. But with the FST, jurors are not told even that contributor numbers matter and have not been identified with certainty. Since Collins this court has reviewed OCME protocols for FST mixture analysis that were in effect in late 2015 if not before. 
See NYC Office of the Chief Medical Examiner, Biology Protocols for Forensic STR Analysis (effective December 24, 2015). Those protocols note that choosing the lowest possible number of contributors will “usually” result in the lowest likelihood ratio, but they do not articulate when it is that this lowest contributor number should or should not “usually” be employed. The jury is not told even whether this case was treated as a “regular” or “irregular” one and why. In particular, there is no mandate, or even suggestion, that the analyst should report the probability that a different number of individuals were contributors. The calculation is simply left to the analyst’s unexplained and unadmitted discretion.15 By coincidence, two of the three cases before this court convincingly underscore these concerns. In People v. Jackson, the FST reported probabilities based only on a conclusion that the mixture at issue was produced by three DNA contributors. The resulting statistic was huge: 52.4 million. Had the case gone forward with that FST result, the jury would not have known the results if, for example, there were more or fewer contributors. Nor would the jury have known the probabilities that there were more or fewer contributors. Since then, however, the mixture was re-analyzed with OCME’s new software, STRmix. And STRmix concluded that the Jackson mixture was in fact made by four contributors, not the FST’s three. And STRmix, like the FST, is not able to assess probabilities in a four-person mixture. Perhaps even more stark are results in defendant Seepersad’s case. Working with the FST, the first OCME analyst concluded that there were two contributors to a DNA mixture on the gun’s trigger swab. Further, the mixture was found to be 172 million times more probable if from the defendant and one unknown, unrelated contributor than instead from two unknown, unrelated contributors. Working with STRmix, a later analyst concluded that there were three contributors. 
The resulting likelihood ratio was more inculpatory by many orders of magnitude. That change would favor the People, but such likelihood ratio changes can go in either direction. They can even turn the ratio from inculpatory to exculpatory.16 Fairness, as it is understood by bodies like the NAS, SWGDAM, and PCAST, requires that juries understand at least the possibility that the analyst used an incorrect contributor number, and what that could mean to the results. Based simply on the FST’s shortcomings as to contributor numbers, many experts seem to question whether the FST provides the jury with a meaningful assessment of the odds that a subject’s DNA is in a mixture. To be sure, some scientists and groups of scientists like the New York State Commission on Forensic Science are unconcerned. But given the authority behind the view that the jury must be given a fair statistical assessment, and indeed that this is why there was a need to develop mixture programs like the FST in the first place, it cannot be thought that there is a consensus view in favor of the FST’s methods. C But there is much more. In the 2016 PCAST report, which was of course not available to this court in 2015, the President’s Council of Advisors on Science and Technology addressed not only DNA mixture analysis, but forensic procedures more generally. The report opines the obvious — that forensic results should not be admissible unless the process for producing them was developed through proper scientific methods. PCAST Report at 4-5, 122, 143-44. One of the necessary methods is proper validation of a new procedure before it is introduced by a laboratory. Among other things, “proper” validation requires participation and input from scientists other than the developers of the program. PCAST Report at 79, 80. “Home court” developers may be subject to conscious or subconscious bias that skews their objectivity. 
But (with perhaps one exception) OCME did not engage outside scientists to participate in the validation of the FST. And, at risk of annoying repetition, OCME does not even allow access to the FST to permit independent efforts to confirm the OCME validation after the fact. There is a very strong view in the community of DNA scientists, and forensic experts in general, that forensic programs should be “open source” to permit outside appraisal. See, e.g., PCAST Report at 14, 80-81; Gill et al., 18 FSI Genetics 100, supra; Gill et al., Recommendations on the Evaluation of STR Typing Results that may Include Drop-out and/or Drop-in Using Probabilistic Methods, FSI Genetics 6: 679, 684 (2012) (“[T]he black box approach is strongly discouraged.”). This court will address three other aspects of the FST among those seriously debated by scientists — all having to do with the critical question whether the FST statistics for stochastic effects were properly determined. First, the FST program considers the likelihood of stochastic effects like the “drop-in” of contaminating alleles and the “drop-out” of relevant alleles. But it does so based on analysis of “clean” laboratory samples which were not subjected to degradation. Crime scene samples are not pristine. As noted, factors like time and heat will degrade samples in complex ways, with significant impact on stochastic effects. There is naturally no consensus in favor of likelihood calculations that do not take account of the critical effects of degradation. This court can discern no retreat among scientists from this obvious view. Scientists who have not “seen” inside the FST cannot determine whether the FST reasonably assesses degradation. Second, this court noted above that a fundamental factor impacting the likelihood of stochastic effects is the quantity of DNA in a sample.
Almost every mixture program grounds its probability assessments on peak heights — the heights of the peaks along the baseline of a DNA profile generated by electrophoresis, which indicate which alleles are present. The FST is nearly alone in basing statistical probabilities instead on an earlier, pre-electrophoresis assessment of “quant” — the quantity of DNA in the sample. In 2015 this court heard no voice in favor of the quant approach from anyone outside OCME except that of Dr. Hinda Haned. Dr. Haned, whose expertise in DNA mixture analysis is far beyond question, testified before this court that she was considering future use of the quant method in her own laboratory in the Netherlands. Notably, however, she apparently gave similar indications in testimony in 2013 in the Rodriguez case. People v. William Rodriguez, Ind No 5471-2009 (Sup Ct NY Co October 24, 2013) at 39-40.17 This court has not been informed of any reliance on quant analysis in her laboratory’s work to date. Moreover, in 2015 Dr. Haned was a co-author of an academic article with, among others, Dr. Peter Gill, perhaps the most respected forensic DNA expert in the United Kingdom. The article notes that “Stochastic effects that lead to profile imbalance and dropout are not solely a function of the DNA quantity” and, with focus on LCN analysis, gives examples. Gill et al., 18 FSI Genetics 100, id. We now know that at least one other crime laboratory has employed the quant method in its mixture analysis. The Forensic Science Division of the Austin, Texas Police Department has done so. The laboratory has since lost its accreditation in part for that very reason. In a blistering report the Texas Forensic Science Commission stated:

Using a quant-based [stochastic threshold] to determine potential stochastic effects in DNA mixtures is neither scientifically valid nor supported by the forensic DNA community.
The review team is not aware of any peer-reviewed journal article citing the acceptance of a quant-based [stochastic threshold] for mixture interpretation….[The] quantity of DNA is not an appropriate metric to assess potential stochastic effects that occur during amplification for DNA mixture evidence.

Texas Forensic Science Commission, Final Audit Report for Austin Police Department Forensic Services Division DNA Section (2016) at pp. 12-13.18 Third, probabilities that particular alleles will appear at the chosen loci differ based on race. Statistics about a crime scene sample are computed by the FST for each of four possibilities: that the relevant contributor was white, black, Asian or Hispanic. The lowest of the four probability numbers is the one reported. For better or for worse, there seems to be little debate in the scientific community about the propriety of the “four races” procedure — unduly simplistic as it might appear to anyone who has worked in the New York City criminal justice system. But plainly, all four numbers are important. The Asian number will affect which among the four is the lowest, and thus often which stochastic probabilities are employed in the computation. If that number is unreasonable, perhaps because the FST does not factor in the genetics of people from Laos or Ceylon, then it will subvert the whole purpose of the “four races” approach. There are statistical ways to “fix” such problems after a reasonable overall assessment of the population of a continent. But there certainly is no consensus, and there seems still not even to be support, for the method through which OCME calculated its probabilities for Asians. See People v. Collins, 49 Misc3d at 619, supra. OCME chose exactly three people to represent the two billion individuals on the world’s most populous, and most diverse, continent.
Most of the relatively few computer simulations run by OCME to determine allelic probabilities in Asia were done with the alleles of a single one of these three people. That approach can most charitably be assessed as “casual,” and far from scientific. As far as this court can tell, no one has ever tried to justify calculating the “race” statistics for a whole continent based on so meager a sample.

* * *

This court will concede: one could “intuit” general acceptance in the relevant scientific community for a program with a single questionable aspect, or even more than one. In this developing field, doubtless every program has at least one controversial component. The mixture analysis field is young; new and better ideas are constantly being advanced. There is no “you must do it this way” view at all evident in the literature. “[T]here is no agreement within the forensic community on the best approach, and it is unrealistic to suppose that any single method will be universally adopted.” Gill et al., 18 FSI Genetics 100, supra. Indeed, there is no settled consensus even that “likelihood ratio” statistics are the best way to report on mixtures. But with the FST there are many challenged features. And the People cite absolutely nothing in the recent (or prior) literature to indicate a scientific “change of heart” as to the overall assessment of the FST’s component procedures. The relevant scientific community has not endorsed it.

IV

The diligent reader may be disappointed to learn it, but more remains to be said — about the PCAST report, about the opinions of other judges, and about the People’s arguments here. First, the PCAST report examined seven “feature comparison” areas of forensic analysis.19 It concluded that certain scientific standards must be met before forensic testing results are admitted at a criminal trial.
The PCAST report agreed with virtually everyone on one thing: the forensic “gold standard” award goes to simple DNA analysis on a sizable amount of DNA. For complex mixture analysis, however, the report stated — as Collins did as to the FST, see People v. Collins, 49 Misc3d at 629 — that the creators of mixture programs had to do more before they could be thought to have demonstrated the admissibility of mixture conclusions in criminal cases. The PCAST report included this statement:

A number of papers have been published that analyze known mixtures in order to address some of these issues. Two points should be noted about these studies. First, most of the studies evaluating software packages have been undertaken by the software developers themselves. While it is completely appropriate for method developers to evaluate their own methods, establishing scientific validity also requires scientific evaluation by other scientific groups that did not develop the method. Second, there have been few comparative studies across the methods to evaluate the differences among them — and, to our knowledge, no comparative studies conducted by independent groups. Most importantly, current studies have adequately explored only a limited range of mixture types (with respect to number of contributors, ratio of minor contributors, and total amount of DNA). The two most widely used methods (STRmix and TrueAllele) appear to be reliable within a certain range, based on the available evidence and the inherent difficulty of the problem. Specifically, these methods appear to be reliable for three-person mixtures in which the minor contributor constitutes at least 20 percent of the intact DNA in the mixture and in which the DNA amount exceeds the minimum level required for the method.

PCAST Report, supra at 80 (emphasis added) (footnotes omitted). To many other types of forensic analysis not involving DNA, and thus not relevant here, the report was far less generous.
The PCAST report drew a storm of criticism. A very large amount of it came from the law enforcement community and from others whose oxen had been gored.20 A fair amount of the criticism was directed at the report’s conclusion that DNA mixture evidence generally is not yet appropriate for admission at criminal trials. In particular, critics focused on the fact that the president’s council did not include any forensic scientists. And that was true. The council had 19 members. The council’s charge was by no means limited to its work on the forensic report. It was designed to be a permanent group that advised the president about all scientific areas, and not just about the admissibility of forensic evidence in courtrooms. It included experts in genetics, computer science, astrophysical sciences, the environment, physics, medicine, electrical engineering, chemistry, biology, biochemistry, and nanotechnology. Five of the members were current or retired business executives, including the President and CEO of The Aerospace Corporation and the Executive Chairman of Google and now Alphabet, Inc. Many of the scientists could call themselves merely “prominent” only by displaying extreme modesty. For example, the co-chair of the council, who was also the chair of the working group that produced the 2016 PCAST report on forensic evidence, was Dr. Eric Lander. Dr. Lander is the president and founding director of the Broad Institute of MIT and Harvard. He is a geneticist, a molecular biologist, and a mathematician. Beyond that, Dr. Lander was a professor of biology at MIT and a professor of systems biology at the Harvard Medical School. Dr. Lander is perhaps best known for his role in understanding the human genome and explaining biomedical applications of it. In fact, he was a principal leader of the international Human Genome Project from 1992 to 2013, and his group was the largest contributor to the ultimate mapping and sequencing of the human genome. Dr.
Lander’s professional achievements have been acknowledged more than once. He has been awarded a MacArthur Fellowship, the Woodrow Wilson Prize for Public Service, the Gairdner Foundation International Award of Canada, the City of Medicine Award, the Abelson Prize Award for Public Understanding of Sciences and Technology, the Albany Prize in Medicine and Biological Research, the Dan David Prize of Israel, the Mendel Medal of the Genetics Society of the United Kingdom, the Breakthrough Prize in Life Sciences, and the James R. Killian Jr. Faculty Achievement Award. The court will not detail the accomplishments of all the members of what was, after all, the scientific advisory council to the President of the United States. It does bear mention, however, that in 1995 a member of the council, Dr. Mario Molina, received the Nobel Prize in Chemistry. Beyond that, suffice it to say that on the council were science and/or technology professors at Princeton, Berkeley, Northwestern, the University of St. Louis, Harvard, the University of Texas at Austin, the University of Michigan, and the University of Maryland. The working group for the PCAST forensics report included five of these individuals along with Professor William Press, who teaches Computer Science and Integrative Biology at the University of Texas at Austin. The senior legal advisors of the working group included eight federal appellate or district court judges; four members of the faculties at the University of California at Hastings, UCLA, and Harvard; and statistics professors at Carnegie Mellon and the University of Virginia. The forensics working group was also advised about particular aspects of the report by 77 additional individuals. 
Among them were over a dozen professors, at least four prosecutors, forensic experts in the individual disciplines discussed in the report, and forensic scientists from the FBI laboratory, the Minnesota Bureau of Criminal Apprehension, the Maine State Police Crime Lab, the Orange County Sheriff’s Department, the Baltimore Police Department, the San Francisco Police Department, and the Kentucky State Police. Perhaps most importantly, the experts included Doctors John Buckleton, Bruce Budowle, and John Butler. Anyone who has studied the scientific literature on forensic DNA practices will recognize that trio as among the world’s most respected and influential scientists in their field. In short, the attack on the expertise of those who produced the PCAST report is beyond frivolous. Certainly, the 19 members of the council themselves were not forensic scientists. As noted, the council’s charge was far broader than advising the president on forensic evidence in criminal cases. But the council members must be considered well able to teach about proper scientific method. Broadly speaking, that is just what the PCAST report did. And advice from many legal experts and forensic scientists aided the council’s analysis in particular forensic disciplines — such as the analysis of complex DNA mixtures. This court is very content to acknowledge once more that it should not decide whether the PCAST report’s conclusions are correct. But to this court it is obvious that the often bitter debate about the PCAST verdict conclusively confirms what this judge believes: there is nothing close to a consensus on the FST procedures in the relevant scientific community.

C

Many judges have disagreed, sometimes strongly, with this court’s Frye assessment. That is nothing extraordinary, and indeed criticism comes with the job. But a repeated aspect of this disagreement involves one complaint that is patently unfair to professionals who testified at the Collins hearing.
At the hearing this judge was informed by extremely well qualified witnesses presented by both sides. That provided a basis for a first-hand assessment of the witnesses’ credibility, after extensive and thoroughly expert cross-examination. This judge found that both sides’ witnesses were honest, skilled, and impressive professionals. Other judges who did not hear or see the testimony have been less kind to three of the defense witnesses. One of them, Dr. Bruce Budowle, was the long-time architect of DNA analysis at the FBI laboratory, earning universal respect as a pioneer in his field. His “CV” is as long as, in fact much longer than, your arm; he now runs a laboratory at the University of North Texas that does DNA testing. He has also been a director of the Texas Forensic Science Commission. Beyond that, he remains a forensic DNA pioneer, with credentials not exceeded by any expert known to this court, and as noted he was an advisor to PCAST’s forensic evidence working group. Two of the other defense witnesses work with him at the University of North Texas. In some cases the People have argued, and some judges have accepted, that the employment relationship among these three experts makes their testimony questionable. In this judge’s view, that is nonsense. Again, Dr. Budowle is one of the most respected DNA experts in the world. His colleagues are highly qualified and respected as well. And Dr. Budowle and his colleagues have no financial interest in the FST issue. Their lab at the University of North Texas does not do DNA work in criminal cases, as to mixtures or anything else. The three witnesses are disinterested scientists. Among the experts at the Collins hearing were two prosecution witnesses who worked for OCME. The court does not at all retract its Collins statement that it found these witnesses, like the others, honest.
But if one sought to find reasons to challenge witness credibility based on employment and career concerns, one would have to start with these two. The doctors testified while they were employed by OCME. They were the co-directors of the project that created the FST. They received a New York City award for their work. Much of their professional credibility was plainly pinned to the success or failure of the FST. If anyone had a career or financial motive to falsify here, these two scientists were the ones. And that should put the senseless credibility attacks on the defense witnesses in proper context. What we are faced with here is simply an honest disagreement among highly qualified experts. And that we have an honest disagreement among experts like these is precisely what is determinative of the Frye question. The effort to blink away that reality by attacking the bona fides of the defense witnesses is misguided. Beyond that, the court notes that a few judges’ opinions lend support to Collins — even if, admittedly, those opinions are dramatically outnumbered. For example, Justice Riviezzo has twice ordered Frye hearings as to the FST. People v. Rysheek Dixon, Ind No 1208/2016 (Sup Ct Kings Co May 2, 2017); People v. Quentin Abney, Ind No 1234/2013 (Sup Ct Kings Co July 7, 2015). Justice Sciarrino has ordered one as well, with the observation that “[c]learly, there is a diminished consensus in the legal community as to the efficacy of the FST.” People v. Jillian Slacks, Ind No 7212/2016 (Sup Ct Kings Co September 7, 2017).21 This judge does not mean to discount the fact that a large majority of decisions, before and after Collins, concluded that the FST enjoyed general approval in the relevant scientific community. As to some of those decisions, observations were made in Collins. But an exhaustive review now of the many pro-FST decisions before and since Collins would serve little purpose. This court will make only a few comments.
The first and only other opinion on the FST issued after a Frye hearing is unfortunately unreported: People v. William Rodriguez, NY Co Ind No 5471-2009 (Sup Ct NY Co October 24, 2013) (Carruthers, J.)22 Justice Carruthers concluded that the FST was accepted because it relied on PCR-STR analysis, the standard technique for identifying alleles, and also on the likelihood ratio approach. Id. at 8-9, 30. The opinion did go further, at the defendant’s urging. It accepted the OCME quant approach on the testimony of Doctors Mitchell, Caragine, and Haned. In doing so, it overruled the objections of two defense experts by quite clearly doubting their credibility and findings. The court credited the People’s evidence that the quant approach was acceptable. Id. at 31-32, 39-40, 47-48. The court added that the OCME experts had each written at least one peer-reviewed article and had made presentations at DNA conferences. The opinion concluded that all the FST did was add the “likelihood ratio” approach and computer computation to routine PCR-STR analysis. This court does not have any knowledge of what was argued by the defense in Rodriguez. But, with all respect to the judge and the defense counsel in that case, the presentation resulted in incorrect Frye conclusions. The most significant aspect of the FST approach addressed by the Rodriguez opinion was the quant issue — and it was resolved without any apparent assessment (if any was available) by any scientist who did not testify at the Frye hearing. Nor were any other problematic aspects of the FST approach addressed, with an overall “community” opinion in mind. As to procedures as novel as those of the FST, in so complex an area, conclusions on those procedures were not adequately grounded. Some decisions, especially some of the more recent ones, recognize that appropriate Frye analysis of a DNA mixture tool like the FST requires more than an observation that it employs Bayesian mathematics and a computer. 
It now seems apparent to most judges that a Frye court must consider whether scientists generally agree that the tool employs adequate methods to assess stochastic effects. Recent opinions baldly state that there is such a consensus as to the FST’s consideration of such effects. See, e.g., People v. Carter, 50 Misc3d 1210(A) (Sup Ct Queens Co 2016), aff’d, 156 AD3d 898 (2d Dept 2017);23 People v. Debraux, 50 Misc3d 247, 256-57 (Sup Ct NY Co 2015). But the opinions do not explain how this could be, and thus do not sway this judge. For example, they do not address how unaccepted it is to use “quant” as the basis for assessing the likelihood of stochastic effects. They do not discuss scientists’ views about the shortcomings in what the jury hears about contributor numbers. The opinions do not note that the FST validation utilized only “pristine” samples and almost no outside expertise, nor do they recognize that it remains a “black box.” They do not discuss the FST’s method of determining statistical differences among races. Indeed, in most opinions none of the questioned FST procedures are even mentioned.24 It must be recalled that a Frye decision turns on whether there is expert controversy, not on whether the People’s experts’ views might seem reasonable. And as to the FST, controversies plainly exist. A trend in the pro-FST opinions that disturbs at least this judge should be more specifically addressed. Comments in several of those opinions, including Debraux and Carter, suggest that the reliability of FST procedures and the weight to be given to them should be left to the jury if at least some assessment of stochastic effects has been made by a mixture analysis program.
This position depends on jurors’ ability to evaluate, from the cross-examinations of experts, whether (for example) the lab used proper peak height thresholds to determine the extent of drop-in, or reasonable standards to resolve whether it is drop-out or a homozygote that is responsible for a result at a particular locus. Or (for example) jurors might be asked to resolve whether “quant” should be used to assess stochastic effects. This judge continues to think, as expressed in Collins, that this view is hugely inconsistent with the Frye test. What is at issue is not whether scientists agree, as they certainly do, that stochastic effects must be assessed in mixture analysis. What is at issue is whether the FST assesses them through proper techniques. It is unrealistic to think that a jury can resolve a “battle of the experts” on the FST procedures. If forensic scientists are divided on a question, how does it make sense to let jurors decide that question? Frye says that this would be wrong. And so does the NAS. In 2018 the NAS issued a call for “more science in forensic science.” Bell et al., A Call for More Science in Forensic Science, 115 Proceedings of the NAS 4541, 4541 (2018), http://www.pnas.org/content/early/2018/04/11/1712161116. That body noted:

As science — and forensic science more specifically — continues to advance, it becomes increasingly absurd to ask or expect lawyers, judges, and juries to take sole responsibility for critically evaluating the quality and validity of scientific evidence and testimony.

See also PCAST Report at 45; Giannelli, Forensic Science: Daubert’s Failure, 68 Case W. Res. L. Rev. 869 at 933-34 (2018). For what it is worth, this judge wholeheartedly agrees. Scientists’ views should be determinative. Collins noted that judges cannot reliably decide whether scientific procedures, like DNA mixture procedures, are reliable.
But this judge will acknowledge that the five defendants whose FST issues were resolved in Collins and in this case were guilty — they all eventually pleaded guilty, and the proof apart from the contested FST evidence was, for each, at least strong. But that is not at all the point. The question is whether certain evidence should be considered at a trial, as to defendants who have not pleaded guilty.

* * *

And that leads to a final comment. This opinion began with a recognition that its purpose might be questioned. This thought should be added to that discussion. Now as in 2015, there is no scientific consensus that the FST procedures provided acceptable assessments of probability numbers as to the presence in DNA mixtures of defendants’ DNA. This judge well understands the difficulty facing trial courts required to follow Frye, when complex scientific techniques arguably have or have not yielded reliable evidence. As noted numerous times before, this judge is grateful that his job is to assess only whether scientists generally agree on the reliability of challenged procedures employed by forensic experts. And this judge continues to conclude that we should not toss unresolved scientific debates into judges’ chambers, and especially not into the jury room. That conclusion applies to FST evidence — still.

Dated: September 25, 2019