Thomas Gricks, Catalyst

In a word, yes. But it's not what you might think.

The weak link preventing technology-assisted review (TAR) from achieving its true potential is a lack of clarity surrounding the technology—the components, the development and the distinctions. No doubt, TAR is seeing greater acceptance and refinement in the legal space. But with a deeper understanding of the technology, TAR can be even more useful and effective.

Understanding the Technology

To start, TAR is a process by which reviewers code documents for some target criteria (e.g., responsiveness), and an algorithm uses those coding decisions to efficiently manage the review of the unseen documents. This is known as “supervised machine learning.” Some TAR processes manage review by categorizing the remaining documents; others manage it by ranking the collection. Either way, the goal is to train the algorithm effectively and minimize the number of documents that need to be reviewed to achieve recall objectives for the target criteria.

If coding decisions are not being used to train the algorithm (known as “unsupervised machine learning”), the process simply is not a TAR process. Therefore, while clustering, near-duplicate analysis and email threading all use technology to aid in the review process, they are not TAR for purposes of this discussion.

A true TAR application has three layers. The base layer consists of feature extraction, where the documents are decomposed into the elements, or “features,” that will be used by the algorithm to evaluate coding decisions, and compare and make decisions about unreviewed documents. On top of feature extraction sits the supervised machine learning algorithm layer. And the entire TAR operation is directed by the “process” layer, which controls all aspects of the training protocol.

Contemporary feature extraction techniques typically focus on the text in the body of individual documents. Features most often consist of individual words or word fragments. However, expanding the feature set to include two- and three-word segments has been found to improve performance. Conversely, feature reduction techniques such as latent semantic indexing, which consolidate multiple words into a single proxy feature, have been shown to degrade performance with most TAR algorithms.
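To make the idea of one-, two- and three-word features concrete, here is a minimal sketch in Python. It is an illustration only, not how any particular TAR product extracts features: the function name extract_features, the max_n parameter and the simple lowercase tokenizer are all assumptions made for this example.

```python
import re
from collections import Counter

def extract_features(text, max_n=3):
    """Count every 1- to max_n-word segment found in the text (hypothetical helper)."""
    # Assumption: lowercase the text and treat runs of letters/digits as words.
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    features = Counter()
    for n in range(1, max_n + 1):
        # Slide a window of n words across the token list.
        for i in range(len(tokens) - n + 1):
            features[" ".join(tokens[i:i + n])] += 1
    return features

features = extract_features("Please review the merger agreement")
print(features["merger"])             # a single-word feature
print(features["merger agreement"])   # a two-word feature
print(len(features))                  # 5 words + 4 pairs + 3 triples = 12
```

Setting max_n=1 reduces this to individual words only, which is the baseline the multi-word segments are meant to improve upon.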

As research and development continue, the feature extraction layer is likely to see expansion beyond the body text, and continued refinement to improve TAR efficiency. See, e.g., Jones, Amanda, et al., “The Role of Metadata in Machine Learning for Technology Assisted Review,” DESI VI Workshop, June 8, 2015.

At the next level, the consistent emphasis on identifying the specific TAR algorithm is a prime example of the educational weak link that inhibits progress. With a few exceptions, the supervised machine learning algorithms used in TAR applications will, all other things being equal, produce roughly equivalent results. Whether it's SVM (support vector machine), logistic regression, Naïve Bayes or even a proprietary algorithm, operational differences typically do not depend on the specific TAR algorithm being used.

Certainly, however, there are a few exceptions. The 1-nearest neighbor algorithm has been shown to be somewhat ineffective in e-discovery review applications. And there is simply not enough training data to take advantage of deep learning algorithms in e-discovery. Conversely, incorporating reinforcement learning may well improve the effectiveness of a TAR algorithm.

As an aside to clarify messaging, the fact that TAR applications rely on supervised machine learning algorithms means that TAR is, by definition, using artificial intelligence or AI, since supervised machine learning is indeed one form of AI.

Differentiating TAR 1.0 from TAR 2.0

Perhaps the most significant distinction between TAR applications is found at the process layer, which can be broken down into two principal categories that are most often referred to as TAR 1.0 and TAR 2.0. The primary distinction stems from the protocol for training the algorithm.

In a TAR 1.0 application, documents are reviewed and coded to train the algorithm only until either the algorithm shows no further improvement (referred to as stabilization); or the production metrics of recall and precision appear to be sufficient, typically by reference to a random, representative control set designed to monitor progress. Training usually consists of a few thousand documents. The algorithm will then automatically classify the remaining documents or, alternatively, rank them to facilitate a manual classification. Once classified, the presumptively positive documents may or may not be reviewed and coded, but will not further train the algorithm.
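The control-set check described above comes down to two ratios. The sketch below shows one way to compute them, assuming the control set has been coded by reviewers and classified by the algorithm; the function name recall_precision and the ten sample documents are hypothetical and chosen only to keep the arithmetic easy to follow.

```python
def recall_precision(reviewer_codes, algorithm_calls):
    """Return (recall, precision) for a coded control set (hypothetical helper).

    reviewer_codes: True where the reviewer coded the document positive.
    algorithm_calls: True where the algorithm classified it positive.
    """
    pairs = list(zip(reviewer_codes, algorithm_calls))
    true_pos = sum(1 for truth, call in pairs if truth and call)
    false_neg = sum(1 for truth, call in pairs if truth and not call)
    false_pos = sum(1 for truth, call in pairs if not truth and call)
    # Recall: share of truly positive documents the algorithm found.
    recall = true_pos / (true_pos + false_neg) if true_pos + false_neg else 0.0
    # Precision: share of the algorithm's positive calls that were correct.
    precision = true_pos / (true_pos + false_pos) if true_pos + false_pos else 0.0
    return recall, precision

# Ten made-up control documents: four are truly positive.
reviewer_codes = [True, True, True, True, False, False, False, False, False, False]
algorithm_calls = [True, True, True, False, True, False, False, False, False, False]

recall, precision = recall_precision(reviewer_codes, algorithm_calls)
print(recall, precision)  # 0.75 0.75
```

Because the control set is a random, representative sample, these figures serve as estimates for the collection as a whole; a real control set would be far larger than ten documents.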

TAR 1.0 applications can be further divided into simple passive learning (SPL) and simple active learning (SAL) protocols, depending upon the manner in which training documents are selected. With an SPL protocol, training documents are selected at random. The protocol is “simple” because there is a discrete training phase, after which training ceases regardless of further coding. It is passive because the algorithm does not select the random training documents. With a SAL protocol, the algorithm typically selects training documents from those about which the algorithm is the least certain. This is known as “uncertainty sampling,” and it is considered an active protocol because the algorithm actively selects the training documents.
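The difference between the two selection methods can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions: the algorithm is assumed to produce a score between 0 and 1 for each unreviewed document, "least certain" is taken to mean closest to 0.5, and the names select_passive and select_uncertain and the sample scores are invented for the example.

```python
import random

def select_passive(doc_ids, batch_size, seed=0):
    """SPL-style selection: a random draw that ignores the algorithm entirely."""
    rng = random.Random(seed)  # fixed seed only so the sketch is repeatable
    return rng.sample(sorted(doc_ids), batch_size)

def select_uncertain(scores, batch_size):
    """SAL-style selection: documents whose scores sit closest to 0.5.

    scores: dict mapping document id to the algorithm's estimated
    probability that the document is positive (an assumed scale).
    """
    by_uncertainty = sorted(scores, key=lambda doc_id: abs(scores[doc_id] - 0.5))
    return by_uncertainty[:batch_size]

# Made-up scores for five unreviewed documents.
scores = {"a": 0.95, "b": 0.52, "c": 0.10, "d": 0.45, "e": 0.70}

print(select_passive(scores, 2))    # two documents chosen at random
print(select_uncertain(scores, 2))  # ['b', 'd']
```

Note that the uncertain picks, "b" and "d", are neither the most likely positive nor the most likely negative documents; they are the ones the algorithm can least confidently place.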

With TAR 2.0, documents are continuously reviewed and coded to train the algorithm until enough positive documents have been located, reviewed, and coded to achieve production objectives. Training documents are primarily selected through relevance feedback, which focuses on documents the algorithm sees as most likely to be positive. This protocol is called continuous active learning (CAL). The protocol is “continuous” because every coding decision is used to train the algorithm. And again, it is active because the algorithm actively selects the training documents. This is typically accomplished by ranking the entire collection so the most likely positive documents at the top can be reviewed first.
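A toy version of the CAL loop is sketched below. Everything in it is an assumption made for illustration: a simple word-weight scorer stands in for whatever algorithm a real TAR tool uses, the truth table simulates reviewer coding, and the names train, score and cal_review are hypothetical. What it is meant to show is the shape of the protocol: retrain on every coding decision, re-rank, and review the top of the list next.

```python
import math
from collections import Counter

def train(coded_docs):
    """Learn a weight per word from (words, is_positive) pairs using add-one smoothing."""
    pos, neg = Counter(), Counter()
    n_pos = n_neg = 0
    for words, is_positive in coded_docs:
        if is_positive:
            pos.update(words)
            n_pos += 1
        else:
            neg.update(words)
            n_neg += 1
    # Positive weight: word appears relatively more often in positive documents.
    return {w: math.log((pos[w] + 1) / (n_pos + 2)) - math.log((neg[w] + 1) / (n_neg + 2))
            for w in set(pos) | set(neg)}

def score(weights, words):
    """Sum the learned weights of a document's words; unseen words count as 0."""
    return sum(weights.get(w, 0.0) for w in words)

def cal_review(docs, reviewer, seed_ids, target_positives, batch_size=1):
    """Review until target_positives are found, retraining after every batch."""
    coded = {doc_id: reviewer(doc_id) for doc_id in seed_ids}
    order = list(seed_ids)
    while sum(coded.values()) < target_positives and len(coded) < len(docs):
        weights = train([(docs[d], coded[d]) for d in coded])
        remaining = [d for d in docs if d not in coded]
        # Relevance feedback: the highest-ranked unreviewed documents go first.
        remaining.sort(key=lambda d: score(weights, docs[d]), reverse=True)
        for doc_id in remaining[:batch_size]:
            coded[doc_id] = reviewer(doc_id)
            order.append(doc_id)
    return order, coded

# Six made-up documents, each reduced to a set of words.
docs = {
    "d1": {"merger", "agreement", "draft"},
    "d2": {"lunch", "friday"},
    "d3": {"merger", "timeline"},
    "d4": {"agreement", "signed", "merger"},
    "d5": {"lunch", "menu"},
    "d6": {"parking", "notice"},
}
truth = {"d1": True, "d2": False, "d3": True, "d4": True, "d5": False, "d6": False}

order, coded = cal_review(docs, truth.get, seed_ids=["d1", "d2"], target_positives=3)
print(order)  # ['d1', 'd2', 'd4', 'd3']
```

In this toy run all three positive documents are found after four of the six documents are reviewed, and "d5" and "d6" are never looked at. The batch_size of 1 means the ranking is refreshed after every single coding decision.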

Studies show that CAL (TAR 2.0) is typically more efficient than either TAR 1.0 protocol when the presumptively positive (e.g., responsive) documents will be reviewed. That is simply because, although TAR 1.0 requires very little training review, the resulting presumptively positive set contains more negative documents than would be reviewed with CAL.

CAL also overcomes many of the practical obstacles to adoption that are inherent in the operation of TAR 1.0. A control set is not required, making it easier to handle rolling collections. There is no need for a subject matter expert (SME) to train the algorithm to avoid propagating erroneous decisions—CAL is noise tolerant, and our studies have shown that contract review attorneys train the algorithm as well as, and in some cases better than, an SME. Eliminating the SME also means that document review can start immediately, rather than waiting for an SME to code the control set and the training set. And the review can focus on the documents most likely to be positive (i.e., the best or most relevant documents), rather than the random or uncertain documents used to train TAR 1.0 applications.

Advances at the process level are most likely to come from operational refinements and workflow improvements to the CAL protocol. For example, studies show that more frequent ranking tends to improve CAL efficiency. And, since TAR operates at the document level, eliminating family batching will reduce the number of negative documents reviewed. J. Pickens, et al. “Break up the Family: Protocols for Efficient Recall-Oriented Retrieval Under Legally-Necessitated Dual Constraints.” Proceedings of the Second Annual Workshop on Big Data Analytics in the Legal Industry, IEEE Big Data 2018 (Seattle).

Advancing Legal Application

TAR is certainly moving in the direction of greater acceptance by the judiciary. Indeed, the court in Winfield v. City of New York, 2017 WL 5664852 (S.D.N.Y. 2017) essentially directed the use of TAR to improve the pace of discovery. And the New York Commercial Division adopted as Rule 11-e(f) the goal of using the most efficient review techniques, expressly including TAR. This trend will only continue as ESI collections grow, technical familiarity with TAR improves, and proportionality considerations prescribe efficiency.

Courts are necessarily refining the boundaries of cooperation and transparency surrounding TAR protocols, with particular emphasis on demonstrable production deficiencies. See Entrata v. Yardi Systems, No. 2:15-cv-00102 (D. Utah 2018) (rejecting a post hoc demand for sweeping disclosures); Winfield (directing production of a sample of nonresponsive documents to “increase transparency”).

As parties become more sophisticated, there is a greater emphasis on the negotiation and use of TAR protocols in litigation. These protocols can be very comprehensive, addressing a wide range of issues such as keyword culling procedures, transparency obligations, and validation parameters. See In re Broiler Chicken Antitrust Litigation, No. 1:16-cv-08637 (N.D. Ill.) (No. 586).

Sophisticated parties are also taking maximum advantage of TAR techniques both inside and outside the courthouse. When comprehensive review may be unnecessary, such as second requests and subpoena responses, respondents may resort to TAR 1.0 protocols. Conversely, given that review begins immediately, CAL protocols are expanding into early case assessment, investigations and compliance monitoring.

Ultimately, with a clear understanding of the technology, TAR promises to see increasing utility, and significantly enhance document review on any number of fronts. Technological advances and workflow optimization will incrementally improve TAR efficiencies. And knowledgeable innovation will lead to ever-expanding application opportunities.

Thomas Gricks is managing director, Professional Services, Catalyst. Gricks advises corporations and law firms on best practices for applying TAR technology.