Dictionaries have long been the go-to source for the meaning of words in statutes when there are questions of ordinary meaning or ambiguity. But dictionaries do not always lead to a clear result. The majority opinion in Muscarello v. United States, 524 U.S. 125 (1998), provides an interesting discussion of the shortcomings of dictionaries and other literature. The Oxford English Dictionary included alternate meanings that would encompass the interpretations of both the majority and the dissent. Other dictionaries were similarly imperfect, as were various sources in the Bible and English literature. That two textualists stood on opposite sides of the battle is particularly ironic.

Since Muscarello, a number of articles in the legal literature have reviewed the many instances in which statutory meaning can be particularly elusive and have considered applying current-day computer analyses to the task, an approach that has come to be known as "corpus linguistics." A concurrence in Wilson v. Safelite Grp., Inc., 2019 U.S. App. LEXIS 20472 (6th Cir. July 10, 2019) (Thapar, J., concurring), suggested "adding this tool to [courts' interpretive] belts."

The Third Circuit took up the challenge in Caesars Entertainment Corp. v. International Union of Operating Engineers, Local 68 Pension Fund, 2019 U.S. App. LEXIS 22991 (Aug. 1, 2019). The issue was whether 1980 amendments to ERISA required a former employer to continue to contribute to a multi-employer union pension plan for "work of the type for which contributions were previously required." In deciding that "previously" referred to work for which contributions were no longer required, the Court analyzed a computer database of word usage, i.e., a linguistic corpus, to determine the most common synonyms used for "previously" and how often words such as "had" and "been" co-occurred with "previously." The frequency of those co-occurrences strengthened the conclusion that "previously" indicated a completed action.
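For readers unfamiliar with the technique, the kind of co-occurrence count described above can be illustrated with a minimal sketch. It is not the Court's method or any corpus tool's actual interface: the function name `collocates`, the three-word window, and the three sample sentences are all hypothetical, chosen only to show what counting the neighbors of "previously" involves. A real corpus holds millions of words and calls for trained statistical judgment.

```python
from collections import Counter
import re


def collocates(texts, target, window=3):
    """Count the words found within `window` tokens of `target`.

    Hypothetical helper for illustration only; not the API of any
    real corpus-linguistics tool.
    """
    counts = Counter()
    for text in texts:
        # Lowercase and split into simple word tokens.
        tokens = re.findall(r"[a-z']+", text.lower())
        for i, token in enumerate(tokens):
            if token != target:
                continue
            start = max(0, i - window)
            end = min(len(tokens), i + window + 1)
            for j in range(start, end):
                if j != i:
                    counts[tokens[j]] += 1
    return counts


# Invented sample sentences standing in for a corpus (assumption).
sample = [
    "The fee had been previously paid by the tenant.",
    "She had previously worked at the plant.",
    "Contributions were previously required under the plan.",
]

counts = collocates(sample, "previously")
print(counts["had"], counts["been"])
```

Even in this toy form, the choices that shape the result are visible: which texts are included, how wide the window is, and which neighboring words are treated as significant.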

Dueling concurrences in Wilson illustrate that not all judges agree that sua sponte searches of a current or historical linguistic corpus by a judge or a law clerk provide legally useful information. A similar debate occurred in concurring opinions in State v. Rasabout, 2015 UT 72, 356 P.3d 1258, 1262 (Utah 2015), one of the earliest mentions of the methodology in a reported case, where the majority concluded, "We should … refuse our inclination to contrive of interesting research projects that require expertise in fields in which we have no training." Some may view corpus linguistics as nothing more than textualism gone wild. Others may see it as merely an improvement on Judge Richard Posner's use of Google in United States v. Costello, 666 F.3d 1040 (7th Cir. 2012) (searching for uses of "harbored"), or Justice Alito's word search in LEXIS or Westlaw in Texas Dep't of Housing & Community Affairs v. Inclusive Communities Project, Inc., 135 S. Ct. 2507, 2534 (2015) (Alito, J., dissenting) (analyzing the term "because of"). Still others may agree that such analysis may have its uses in the hands of linguistic experts; after all, dictionaries are now compiled and updated by linguists using computer analysis.

We urge a more measured and circumspect approach. We should be very cautious about claims made for artificial intelligence, especially in unqualified hands. One day we may have programmed computers to analyze speech and to specify what particular texts mean, but we are far from having AI systems that can analyze speech in that way. Judges and lawyers are not trained linguists, and their forays into corpus linguistics may be influenced by any number of human biases or predispositions, especially in the absence of scientific protocols. As noted by the Rasabout majority, corpus linguistics in the untrained hands of a judge may be nothing more than "scientific research that is not subject to scientific review." It would be imprudent for judges or lawyers to wade into the morass of other technical, scientific areas of inquiry and render definitive decisions unaided by expert opinion; so too here. At the very least, courts should discuss with the parties whether its use may be helpful. The best protocol should permit the parties to advance the issue or have the opportunity to comment on amicus or expert submissions. Our generally positive view of corpus linguistics in a prior editorial, "On Language, Lawyers and Judges Don't Have All the Answers," 225 NJLJ No. 12, at 22 (Mar. 25, 2019), was premised on the very assumption that this is an area of expert scientific inquiry, not unassisted judicial opinion, much less bald attorney argument.