Exploring two effective approaches to technology-assisted review
Both language- and artificial intelligence-based methods can cut the time and cost of the document review process
October 03, 2012 at 04:30 AM
The original version of this story was published on Law.com
For more than 10 years, corporate-generated electronically stored information (ESI) has been growing exponentially. The expense of maintaining these growing data stores often stresses corporate bottom lines, but litigation can bring them to the breaking point. The greatest concern comes from the document review phase, which, in most cases, can consume up to 75 percent of the discovery budget. Worse still, many corporate budgets remain unadjusted to account for this reality, and the time frame for production remains the same. Hence, once litigation begins, it often tests the limits of these budgets.
These realities have brought our system of litigating civil disputes to a tipping point, but a solution is in sight. Two respected judges have endorsed a new spectrum of approaches for reviewing documents, known as technology-assisted review (TAR). In this article, we discuss these approaches and when corporations can best apply them.
Before we delve into the analysis of approaches, let's first level-set on what TAR actually means and why it is of interest in dealing with these issues. A recent law review article generally described these approaches as follows:
…a process that involves the interplay of humans and computers (meaning various search technologies) to identify documents in a collection that are responsive to a production request, or to identify those documents that should be withheld on the basis of privilege.
TAR improves on linear review by more quickly sifting through large volumes of data and identifying the documents that are potentially responsive, sometimes accelerating the review process by 75 percent or more. This acceleration translates into proportional cost savings.
TAR can involve a number of methods to yield a defensible production set, but a standard methodology has not yet been established. The encouraging news is that recent court decisions have addressed two effective approaches:
- AI-based methodology leverages artificial intelligence (AI) to identify potentially relevant data in a document collection
- Language-based methodology relies on a human's understanding of language to identify potentially relevant data
Both deliver significant savings in time and cost, but each approach has specific instances in which its use is ideal.
AI-based TAR is generally the best fit when two elements are in play:
- The need to arrive at quick decisions early in the litigation (assess the case early to decide whether to settle or litigate)
- Enough time remains to read, on average, 10,000 documents to train the system with a “seed set”
The workflow leverages AI to look for potentially relevant data, meaning that a computer, rather than a reviewer, is performing the lion's share of the decision making. The reviewer identifies a handful of potentially relevant documents, and the computer takes this input and looks for “more like this” across the current corpus of data to find what it concludes are also responsive documents.
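The “more like this” step described above can be illustrated with a minimal sketch. This is not any vendor's actual algorithm; real TAR platforms use far more sophisticated models. The sketch simply scores unreviewed documents by their textual similarity to a reviewer-tagged seed set, using a hypothetical bag-of-words cosine similarity:

```python
# Illustrative sketch of the "more like this" step in AI-based TAR:
# score unreviewed documents by similarity to a reviewer-tagged seed set.
# The similarity measure and example documents are hypothetical.
import math
from collections import Counter

def bag_of_words(text):
    """Tokenize a document into a lowercase word-count vector."""
    return Counter(text.lower().split())

def cosine_similarity(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[w] * b[w] for w in set(a) & set(b))
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def rank_by_seed_set(seed_docs, corpus):
    """Rank each unreviewed document by its best similarity to any seed document."""
    seed_vectors = [bag_of_words(d) for d in seed_docs]
    scored = []
    for doc in corpus:
        vec = bag_of_words(doc)
        score = max(cosine_similarity(vec, s) for s in seed_vectors)
        scored.append((score, doc))
    return sorted(scored, reverse=True)

# Hypothetical seed set and corpus for illustration
seed = ["merger agreement draft terms", "acquisition negotiation pricing"]
corpus = [
    "final merger agreement terms attached",
    "lunch menu for friday",
]
ranked = rank_by_seed_set(seed, corpus)
```

Documents most similar to the seed set rise to the top of the ranking, which is why the quality of the seed set so directly drives the quality of the results.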
While this process can be quick and relatively painless to counsel at the outset, as we have seen in recent court opinions, this approach can also present challenges, particularly in the crafting and updating of the seed set. Downstream from the review process, we have also seen that, when challenged, explaining how an analytics-based TAR seed set performs can be difficult.
Conversely, language-based TAR relies on a human's understanding of the actual language in a data set to identify and prioritize documents that are potentially responsive. This results in sets of documents that are prioritized according to relevance (i.e., can't possibly be relevant, possibly relevant, definitely relevant), and reduces the set of documents advanced for review by 50 percent, on average. The “possibly relevant” documents then move on to a first-pass review phase in which the specific relevant language in each document is highlighted, and other documents with similar language are tagged accordingly, taking advantage of the redundancy in language from one document to the next. At the same time, reviewers can defensibly set aside documents that “can't possibly be relevant,” with clear transparency as to why they have been excluded, and move those that are “definitely relevant” straight to second-pass, privilege or QC review. This approach turns search upside down, looking first for what's not relevant, which dramatically reduces the time and cost of review compared to linear review.
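The three-tier triage described above can be sketched in a few lines. The phrase lists and tier labels here are purely hypothetical placeholders, not any vendor's actual method; the point is only to show how reviewer-identified language can bucket documents defensibly:

```python
# Illustrative sketch of language-based triage: bucket documents into
# three relevance tiers using phrases a reviewer has marked as clearly
# relevant or clearly irrelevant. All phrases below are hypothetical.
RELEVANT_PHRASES = ["breach of contract", "termination clause"]
IRRELEVANT_PHRASES = ["fantasy football", "office birthday"]

def triage(document):
    """Assign a document to one of three relevance tiers."""
    text = document.lower()
    if any(p in text for p in RELEVANT_PHRASES):
        return "definitely relevant"        # straight to second-pass/privilege review
    if any(p in text for p in IRRELEVANT_PHRASES):
        return "can't possibly be relevant"  # defensibly set aside, with the reason recorded
    return "possibly relevant"               # advances to first-pass review

docs = [
    "Please review the termination clause in section 4.",
    "Join the fantasy football league this year!",
    "Notes from the quarterly planning call.",
]
tiers = [triage(d) for d in docs]
```

Because the decision rests on human-identified language, each exclusion carries an explicit reason (the matched phrase), which is what makes the set-aside defensible and auditable.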
Because of this emphasis on human decision-making, language-based TAR is most appealing when transparency and insight into coding decisions are of paramount concern, when the ability to audit reviewers in real time is important (highlighting language enables real-time oversight), and when an organization wants to make this a regular business practice (a corporate dictionary is an output that companies can leverage in future cases).
In conclusion, TAR can deliver significant time and cost savings regardless of which approach an organization takes. If a corporation needs a quick read on the exposure a case presents, and if there is enough time to read a significant number of documents prior to review, then an AI-based TAR approach may be best. If transparency and insight into coding decisions are important, or if making TAR a regular business practice is appealing, then language-based TAR is the way to go. In either case, both approaches are court-tested, making them reasonable tools for offsetting the risk and cost of document review.