New automated technologies offer the alluring hope of easing the expensive and complicated duty to preserve and review electronically stored information (ESI). But how effective is technology-assisted review (TAR), and does it satisfy the Federal Rules of Civil Procedure governing discovery?

When litigation is reasonably anticipated, relevant ESI must be identified and preserved, and when litigation is pending, it must be reviewed for relevance and privilege. This process (especially during litigation) can consume costly attorney review time.

Given the volume of ESI, full manual review of all documents is usually impractical. Instead, most law firms use key word searches for collecting relevant documents, often with “Boolean operators” (e.g., “and”, “or”, “but not”). Although key word searches are widely accepted by courts as sufficient to satisfy parties' duty to collect responsive documents, this type of search can result in document collections that are both over-inclusive in some areas and under-inclusive in others. For example, emails often contain informal abbreviations, acronyms, and spelling errors that key words will invariably miss. The results of key word searches must then be reviewed by attorneys for relevance and privilege.
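The over- and under-inclusiveness described above can be illustrated with a minimal sketch of a Boolean key word search. The function names (`tokens`, `matches`), the search terms, and the three sample emails are hypothetical and chosen only for illustration; commercial review platforms are considerably more elaborate.

```python
import re

def tokens(text):
    """Lower-case the text and split it into a set of distinct words."""
    return set(re.findall(r"[a-z0-9']+", text.lower()))

def matches(text, all_of=(), any_of=(), none_of=()):
    """Boolean key word test: AND terms, OR terms, and BUT NOT terms."""
    words = tokens(text)
    return (all(t in words for t in all_of)
            and (not any_of or any(t in words for t in any_of))
            and not any(t in words for t in none_of))

# Hypothetical emails: one plainly responsive, one responsive but
# abbreviated, and one irrelevant that happens to use the same words.
docs = {
    "email_1": "Please review the merger agreement before Friday.",
    "email_2": "pls look at the mrgr agmt asap",
    "email_3": "Lunch menu: the merger of two great sandwich shops, "
               "in agreement with all tastes.",
}

# Search: merger AND agreement
hits = [name for name, text in docs.items()
        if matches(text, all_of=("merger", "agreement"))]

# Search: merger AND agreement BUT NOT lunch
narrowed = [name for name, text in docs.items()
            if matches(text, all_of=("merger", "agreement"),
                       none_of=("lunch",))]
```

In this sketch the first search returns email_1 and email_3: the abbreviated but responsive email_2 is missed (under-inclusive), while the irrelevant email_3 is swept in (over-inclusive). Adding a "but not" term removes email_3 but does nothing to recover email_2.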

Recently, sophisticated technologies have emerged that are more efficient than key word searches. Studies show that these technologies are also more accurate than manual review in identifying relevant documents and excluding irrelevant ones.

Computer-assisted coding, often referred to generally as “predictive coding,” uses complex algorithms that allow for the efficient review of vast numbers of documents and pinpoint those of relevance. The coding process is relatively simple. Review teams code a “seed set” of documents, and the computer software identifies properties of these documents that it then uses to code others. Reviewers then spot-check the computer's efforts and give the computer feedback on how it is coding the documents. Usually only a few thousand documents need to be reviewed manually before the documents coded by people and those coded by the computer sufficiently overlap. One drawback of predictive coding is the significant initial expense involved in creating the algorithm; consequently, unless there is a sufficiently large number of documents to review, there is a risk that predictive coding will not save clients money. Other TAR search methodologies include “Bayesian classifiers,” “Fuzzy Search Models,” “Clustering,” and “Concept and Categorization Tools,” all of which focus on word patterns contained in documents. In all cases, any TAR process must include sufficient sampling, attorney review, quality control measures, and effective project management.
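The seed-set workflow described above can be sketched in a few lines. This is an illustration only, assuming a simple naive Bayes word-frequency classifier; it is not the algorithm used by any particular vendor, and the function names (`train`, `predict`, `agreement_rate`) and sample documents are hypothetical. The sketch also assumes the seed set contains at least one relevant and one irrelevant document.

```python
import math
import re
from collections import Counter

def tokenize(text):
    return re.findall(r"[a-z]+", text.lower())

def train(seed_set):
    """Learn word frequencies from (text, is_relevant) pairs coded by reviewers."""
    word_counts = {True: Counter(), False: Counter()}
    doc_counts = Counter()
    for text, relevant in seed_set:
        doc_counts[relevant] += 1
        word_counts[relevant].update(tokenize(text))
    vocab = set(word_counts[True]) | set(word_counts[False])
    return word_counts, doc_counts, vocab

def predict(model, text):
    """Code a document as relevant if the 'relevant' score is higher."""
    word_counts, doc_counts, vocab = model
    total_docs = sum(doc_counts.values())
    scores = {}
    for label in (True, False):
        score = math.log(doc_counts[label] / total_docs)
        # Add-one smoothing so unseen word/label pairs never yield log(0).
        denom = sum(word_counts[label].values()) + len(vocab)
        for word in tokenize(text):
            if word in vocab:  # words never seen in the seed set are ignored
                score += math.log((word_counts[label][word] + 1) / denom)
        scores[label] = score
    return scores[True] > scores[False]

def agreement_rate(model, spot_check):
    """Share of reviewer-coded documents on which the computer agrees."""
    agree = sum(predict(model, text) == label for text, label in spot_check)
    return agree / len(spot_check)

# Hypothetical seed set coded by the review team.
seed_set = [
    ("merger agreement draft attached for review", True),
    ("revised merger terms and purchase price", True),
    ("lunch order for the team on friday", False),
    ("holiday party schedule and parking", False),
]
model = train(seed_set)

# Hypothetical spot-check sample, coded independently by reviewers.
spot_check = [
    ("updated merger agreement and price schedule", True),
    ("parking for the holiday lunch", False),
]
rate = agreement_rate(model, spot_check)
```

In practice, documents on which the reviewers and the computer disagree would be added to the seed set and the model retrained, with the cycle repeating until the agreement rate on fresh samples reaches a level the review team considers sufficient.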

Some courts have indicated their support for the use of TAR, provided certain safeguards are included in the document review process. For example, a recent judicial decision in the Southern District of New York validated the use of predictive coding, but noted that the most important issue was the use of an appropriately tailored review process. See Da Silva Moore v. Publicis Groupe, 2012 U.S. Dist. LEXIS 23350, at *3, 40 (S.D.N.Y. Feb. 24, 2012) (“computer-assisted review is an acceptable way to search for relevant ESI in appropriate cases,” but “counsel must design an appropriate process”).

New technologies may also assist with data preservation. For example, some companies routinely “index” their entire universe of ESI as a prophylactic measure even before litigation is anticipated. This is supposed to allow responsive documents to be identified easily, much as when a “Windows Search” is performed on a computer running Windows. Some indexing software allows for searches of backed-up files, and some is intended to enable a party to “crawl” through active data. ESI that is identified from indexes can then be locked down “in place” or transferred to secure locations. This process of indexing the entire universe of ESI is expensive, however, and commentators have noted that the capabilities of indexing software are largely unproven. As a result, the sufficiency of the preservation process may subsequently be challenged, potentially leaving the party at risk of a claim of spoliation.
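The indexing idea described above can be pictured as an "inverted index": a table built in advance that maps each word to the documents containing it, so that later searches do not require rereading every file. The following is a minimal sketch under that assumption; the function names (`build_index`, `lookup`) and sample documents are hypothetical, and actual indexing products handle file formats, backups, and scale that this sketch does not attempt.

```python
import re
from collections import defaultdict

def build_index(documents):
    """Map each distinct word to the set of document IDs that contain it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for term in set(re.findall(r"[a-z0-9]+", text.lower())):
            index[term].add(doc_id)
    return index

def lookup(index, *terms):
    """Return the IDs of documents containing every one of the given terms."""
    postings = [index.get(term.lower(), set()) for term in terms]
    return set.intersection(*postings) if postings else set()

# Hypothetical corpus, indexed before any litigation is anticipated.
documents = {
    "doc_1": "Board minutes discussing the proposed merger",
    "doc_2": "Merger agreement, execution copy",
    "doc_3": "Cafeteria menu for next week",
}
index = build_index(documents)

# Once litigation is anticipated, candidate documents are found from the
# index and can be flagged for an in-place hold or copied elsewhere.
on_hold = lookup(index, "merger")
```

The sketch also shows why the sufficiency of such a process can be questioned: only documents that actually contain the chosen terms are flagged, so anything missing from the index, or phrased differently, is not preserved by this step alone.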

The 2006 Amendments to Fed. R. Civ. P. 26(f) focused on electronic discovery, but the Amendments did not identify particular approaches or technologies that must be used in preservation or review of ESI. Instead, the Amendments provided that parties should agree on the reasonable steps needed to comply with discovery of ESI. Given the rapid and unpredictable development of technology, this seems like the right approach; most issues can be resolved by agreement between the parties. In the absence of agreement, however, we recommend that parties engage the court early in the process so that preservation and review methodologies will not be challenged after the parties have made substantial investments in their ESI processes.