The 2019 EDRM TAR Guidelines: Recognizing the Evolving Role of the Subject Matter Expert
While the Guidelines maintain the role of the SME in ensuring reviewer accuracy and assisting in training the model, they also acknowledge the emergence of new technologies which can reduce the burden on the SME.
April 09, 2019 at 07:00 AM
The new Technology Assisted Review (TAR) Guidelines from EDRM make clear that the evolution of the underlying technology in TAR solutions is reshaping the role of the subject matter expert (SME). While the Guidelines maintain the role of the SME (typically an experienced, and expensive, attorney most familiar with the project's subject matter) in ensuring reviewer accuracy and assisting in training the model, they also acknowledge the emergence of new technologies that can reduce the burden on the SME.
Newer active learning solutions allow for continuous training of the model through a prioritized review. This spares the SME the multiple training and QC rounds associated with TAR 1.0 solutions and frees up time for more targeted training. TAR 1.0 solutions can take longer to train the model, whereas training of an active learning model begins once a smaller threshold number of documents has been reviewed.
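To make the idea of a prioritized review concrete, the loop can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `ToyRelevanceModel`, `prioritized_review`, the term-counting score and the batch size are all hypothetical and do not represent any vendor's model or anything prescribed by the Guidelines.

```python
from collections import Counter


def tokenize(text):
    """Lowercase whitespace tokenizer (a deliberate simplification)."""
    return text.lower().split()


class ToyRelevanceModel:
    """Hypothetical stand-in for a TAR model: scores by terms seen in coded documents."""

    def __init__(self):
        self.relevant_terms = Counter()
        self.not_relevant_terms = Counter()

    def update(self, text, is_relevant):
        # Every coding decision immediately becomes training data.
        target = self.relevant_terms if is_relevant else self.not_relevant_terms
        target.update(set(tokenize(text)))

    def score(self, text):
        # Smoothed share of "relevant" evidence; 0.5 when nothing is known yet.
        terms = set(tokenize(text))
        pos = sum(self.relevant_terms[t] for t in terms)
        neg = sum(self.not_relevant_terms[t] for t in terms)
        return (pos + 1) / (pos + neg + 2)


def prioritized_review(documents, reviewer, batch_size=2):
    """Serve the highest-scoring uncoded documents first, re-ranking after each batch."""
    model = ToyRelevanceModel()
    uncoded = dict(documents)
    review_order = []
    while uncoded:
        # Re-rank with whatever the model has learned so far (ties broken by id).
        ranked = sorted(uncoded, key=lambda d: (-model.score(uncoded[d]), d))
        for doc_id in ranked[:batch_size]:
            decision = reviewer(uncoded[doc_id])
            model.update(uncoded.pop(doc_id), decision)
            review_order.append((doc_id, decision))
    return review_order
```

In this sketch there is no separate training phase: the first batch is served in arbitrary order, and every later batch is ranked by a model that has already learned from the decisions made so far.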
Training sets, traditionally a TAR 1.0 feature, are offered with some active learning solutions. These allow the SME to elevate training through an isolated review of conceptually rich key documents, while the review team focuses on the prioritized review queue. The training set can be either a completely randomized sample across the corpus or a seed set supplemented with a randomized sample. The latter approach aims to minimize bias while still injecting richness into the sample.
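The two ways of composing a training set described above can be sketched as follows. This is a minimal illustration, not a defensible sampling protocol: the function name `build_training_set`, its parameters and the fixed random seed are assumptions made for the example.

```python
import random


def build_training_set(corpus_ids, sample_size, seed_ids=(), rng_seed=0):
    """Return hand-picked seed documents followed by a random sample of the rest.

    With no seed_ids this is a purely randomized sample across the corpus.
    All names and defaults here are hypothetical.
    """
    corpus = list(corpus_ids)
    corpus_set = set(corpus)
    # Keep only seed documents that exist in the corpus, without duplicates.
    seed = [d for d in dict.fromkeys(seed_ids) if d in corpus_set]
    seed_set = set(seed)
    remaining = [d for d in corpus if d not in seed_set]
    # A fixed seed keeps the draw reproducible, so the sample can be documented.
    rng = random.Random(rng_seed)
    sample = rng.sample(remaining, min(sample_size, len(remaining)))
    return seed + sample
```

Calling `build_training_set(corpus, 10)` yields a random sample of ten documents, while passing `seed_ids` places the SME's key documents first and draws the random portion only from documents that are not already in the seed set.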
The Guidelines acknowledge that there are different views on the best method of selecting training sets. The different approaches reflect varying levels of concern that relying on “human judgment” or “differing preferences by human reviewers” to select the documents will bias the training set. The Guidelines instruct that any approach to selecting training data will produce an effective predictive model if it is used to produce a sufficiently broad training set. “Thus, differing views over selection of training data are less about whether an effective predictive model can be produced, than about how much work it will take to do so.”
Newer TAR solutions alleviate the burden of training in other ways. In some platforms, multiple models can run concurrently. This allows a reviewer training for relevance to simultaneously train for privilege or specific issues, thereby cutting back on costly re-review efforts.
Active learning solutions can also more easily address the challenge of supplemental collections. With earlier (TAR 1.0) solutions, when new datasets introduced new document features or concepts to the corpus, the model would need additional training in order to properly understand and categorize these new document types. Due to the static nature of the predictive coding index, each addition of this type would require the training process to start anew, including rebuilding the index and repeating the human review. This redoubled review effort could include coding a seed set and conducting numerous rounds of training and QC review to reach stability.
With an active learning solution, the model is continuously learning and improving its predictions, so it can leverage its existing training to incorporate the new collection. This avoids the need to “start from scratch.”
With the time saved in model training through active learning, the SME can lend more of their expertise to QC review. In active learning solutions, differences between human coding decisions and model predictions are typically served up in two separate conflicts queues. These queues can be batched out or sampled for SME review.
Where the documents in the project consist of user-created content and represent multiple concepts, the data set is considered to have high conceptual richness. This may lead to a higher percentage of documents with features that the predictive coding model does not understand, which in turn can lead to disparate confidence levels and document populations with low coverage, posing a challenge to training.
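As a rough illustration of the conflicts queues mentioned above, the sketch below assumes one queue for each direction of disagreement: documents coded relevant that the model scores low, and documents coded not relevant that the model scores high. The `CodedDocument` type, the function name and the 0.2 and 0.8 cutoffs are hypothetical; actual platforms define their own queues and thresholds.

```python
from dataclasses import dataclass


@dataclass
class CodedDocument:
    doc_id: str
    human_relevant: bool    # the reviewer's coding decision
    predicted_score: float  # model's relevance prediction, 0.0 to 1.0


def build_conflict_queues(docs, low=0.2, high=0.8):
    """Split human/model disagreements into two queues (cutoffs are assumptions)."""
    # Coded relevant by a reviewer, but the model predicts not relevant.
    coded_relevant = [d for d in docs if d.human_relevant and d.predicted_score <= low]
    # Coded not relevant by a reviewer, but the model predicts relevant.
    coded_not_relevant = [d for d in docs if not d.human_relevant and d.predicted_score >= high]
    # Put the sharpest disagreements at the front of each queue.
    coded_relevant.sort(key=lambda d: d.predicted_score)
    coded_not_relevant.sort(key=lambda d: -d.predicted_score)
    return coded_relevant, coded_not_relevant
```

Either queue could then be batched out in full or sampled from the top, so that the SME's QC time goes first to the documents where the reviewer and the model disagree most strongly.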
The model's understanding of these documents, and the resulting prediction scores, can be improved by training the system on more documents from lower coverage sets. To make this easier, some of today's active learning solutions offer coverage queues and visualizations that eliminate the need for complex saved searches to review these sets. The SME can, therefore, easily sample documents from these sets to improve predictions for the greater review team.
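Platforms compute coverage in their own ways, so the following is only a sketch of the general idea. It assumes, purely for illustration, that a document's coverage is the share of its terms that already appear somewhere in the coded training documents; `coverage`, `coverage_queue` and the 0.5 threshold are hypothetical names and values.

```python
def coverage(doc_text, trained_vocabulary):
    """Share of a document's distinct terms already seen in training (an assumed metric)."""
    terms = set(doc_text.lower().split())
    if not terms:
        return 0.0
    return len(terms & trained_vocabulary) / len(terms)


def coverage_queue(uncoded, training_texts, threshold=0.5):
    """Return ids of uncoded documents below the coverage threshold, lowest first."""
    vocabulary = set()
    for text in training_texts:
        vocabulary.update(text.lower().split())
    scored = [(coverage(text, vocabulary), doc_id) for doc_id, text in uncoded.items()]
    return [doc_id for cov, doc_id in sorted(scored) if cov < threshold]
```

Sampling from the front of such a queue directs SME review toward the documents the model has seen the least of, which is where additional training is most likely to improve predictions.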
With earlier TAR technologies, the SME might have been heavily involved in training the model throughout the life of the project. The newer features of today's active learning solutions can help alleviate that burden and free up time for other priorities. By lowering the barrier to implementation, in both time and cost, active learning has become a more attractive option for meeting the proportionality and reasonableness requirements of review, for both the end client and the SME.
Erin Baksa is a Senior Business Development Manager at Everlaw. Prior to Everlaw, she worked in ediscovery consulting as a Senior Manager for the Forensic Technology Services team at A&M Asia in Hong Kong. Previous consulting firms include Stroz Friedberg and DTI. Erin is a licensed attorney and has worked in the litigation industry for over 10 years.
© 2024 ALM Global, LLC. All Rights Reserved.