One of the more hotly disputed issues regarding the use of predictive coding is whether parties may use keyword searches to remove nonresponsive documents from a collection of potentially relevant information. Courts have so far provided mixed guidance on this issue, leaving litigants guessing whether their choice of combining keyword and predictive coding search methodologies—if challenged by an adversary—would receive judicial approval. Nevertheless, a new ruling from the Rio Tinto v. Vale (S.D.N.Y. 2015) litigation confirms that parties may blend these search methods to achieve reasonable and proportional productions of highly relevant information.

Mixed Signals on the Use of Keywords with Predictive Coding

Until recently, courts had sent litigants mixed signals on the use of keywords in connection with predictive coding. On the one hand, multiple courts, most notably the court in In re Biomet (N.D. Ind. 2013), have approved this combined approach to document productions, finding that it satisfied a party's discovery obligations under the Federal Rules of Civil Procedure.

In Biomet, the plaintiffs argued that the defendant's production of documents was incomplete because the defendant had culled the collection with search terms before applying predictive coding. The company first applied keyword searches and deduplication methods to reduce the universe of potentially responsive information from 19.5 million to 2.5 million documents. It then searched the remaining subset using a predictive coding process. Relying on scholarly research and statistical reports, the plaintiffs challenged the ability of keywords to achieve acceptable recall, meaning the share of all responsive documents that a search actually retrieves. Because the recall of keyword searches was arguably too low and could leave out too much responsive data, the plaintiffs urged the company to redo its production by running the predictive coding process against the original universe of 19.5 million documents.
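
The recall argument comes down to simple arithmetic: a responsive document reaches the production only if it survives both the keyword cull and the predictive coding review, so the recall rates of the two stages multiply. The short Python sketch below illustrates this. The document counts come from the Biomet facts described above, but the two recall rates are hypothetical values chosen purely for illustration and are not drawn from the case record.

```python
def end_to_end_recall(keyword_recall: float, predictive_coding_recall: float) -> float:
    """Recall of a two-stage workflow in which a responsive document
    must survive the keyword cull and then the predictive coding review."""
    for rate in (keyword_recall, predictive_coding_recall):
        if not 0.0 <= rate <= 1.0:
            raise ValueError("recall values must be between 0 and 1")
    return keyword_recall * predictive_coding_recall


# Document counts as described in Biomet.
ORIGINAL_DOCS = 19_500_000
DOCS_AFTER_CULLING = 2_500_000
retained_share = DOCS_AFTER_CULLING / ORIGINAL_DOCS

# Hypothetical recall rates (assumptions for illustration only).
assumed_keyword_recall = 0.60
assumed_predictive_coding_recall = 0.80

overall = end_to_end_recall(assumed_keyword_recall, assumed_predictive_coding_recall)

print(f"Share of collection left after culling: {retained_share:.1%}")
print(f"End-to-end recall under the assumed rates: {overall:.0%}")
```

Under these assumed rates, fewer than half of the responsive documents would reach the production even though each stage looks reasonable when viewed alone. Whether closing that gap justifies the cost of reprocessing the full collection is the proportionality question the court went on to address.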

The court nevertheless declined to order a redo, holding instead that the company's production of documents satisfied its discovery obligations under Rule 26(b) and Rule 34(b)(2). Nothing in the rules, observed the court, required the company to forgo keyword searches. Moreover, even if the keyword searches had bypassed some marginally relevant information, redoing the production at an anticipated seven-figure cost would violate the proportionality standards set forth in Rule 26(b)(2)(C).