Discovery Cluster
Data analytic software organizes electronic files for more efficient document review.
March 31, 2007 at 08:00 PM
When Pfizer Inc. began putting together an e-discovery strategy in 2003, it didn't have many models to follow.
“At the time, there was very limited e-discovery going on,” says Laura Kibbe, senior counsel at Pfizer. “To the extent it was happening, parties would just use a simple exchange of keywords to search through relatively low volumes of data.”
But with talk of new federal guidelines dictating the rules of e-discovery, Kibbe foresaw an explosion in the burden and costs associated with searching and retrieving electronic data. And as e-mail continued to generate an ever-larger body of data to cull through, Kibbe realized existing search processes would quickly become obsolete.
So in 2004 she teamed up with SPi, an e-discovery consultancy and solution provider. “We partnered with SPi because of this data analytics capability they were developing,” she says.
Data analytic software is a powerful tool used to make sense of large amounts of information. It can organize vast volumes of data into topical piles so that reviewers need to search and read only the piles likely to matter, rather than spending hours digging through hundreds of disorganized and potentially irrelevant documents. This shortens the time it takes to review a set of documents and, with it, the amount spent on attorney review.
Since relying on data analytics, Pfizer has reduced the number of documents it sends to review by nearly 70 percent on some of its largest matters.
“If you look at e-discovery today, 70 percent of your costs are dedicated to review,” Kibbe says. “To get that cost down, you have to make sure that the relevant, responsive stuff gets to the review room while the junk doesn't. Data analytics helps you ensure that you're sending the right stuff to be reviewed and leaving the wrong stuff out of the funnel.”
Guess Work
To do this, data analytics employs complex statistical analysis to automatically group related documents.
First, an entire body of documents is fed through the software. The software scans each document's content and compares it to every other document in the system. Documents with statistically similar content are then lumped into a pile together. Based on the shared content, the software automatically names the pile for easy identification.
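The grouping step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not any vendor's actual method: it assumes a simple word-count comparison (cosine similarity), an arbitrary similarity threshold of 0.3, and a tiny hand-picked stopword list. The function names and sample documents are hypothetical.

```python
import math
import re
from collections import Counter

# Assumed, deliberately tiny stopword list; real tools use far richer language models.
STOPWORDS = {"a", "an", "and", "by", "for", "in", "of", "on", "the", "to"}

def term_counts(text):
    """Count the alphabetic words in a document, ignoring stopwords."""
    words = re.findall(r"[a-z]+", text.lower())
    return Counter(w for w in words if w not in STOPWORDS)

def cosine(a, b):
    """Cosine similarity between two word-count vectors (0.0 to 1.0)."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def group_documents(docs, threshold=0.3):
    """Compare every document with every other and merge similar ones into piles."""
    vectors = [term_counts(d) for d in docs]
    parent = list(range(len(docs)))  # union-find forest, one node per document

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(docs)):
        for j in range(i + 1, len(docs)):
            if cosine(vectors[i], vectors[j]) >= threshold:
                parent[find(j)] = find(i)

    piles = {}
    for i in range(len(docs)):
        piles.setdefault(find(i), []).append(i)
    return list(piles.values()), vectors

def name_pile(pile, vectors, top_n=3):
    """Name a pile after its most frequent terms (ties broken alphabetically)."""
    total = Counter()
    for i in pile:
        total.update(vectors[i])
    ranked = sorted(total.items(), key=lambda item: (-item[1], item[0]))
    return ", ".join(term for term, _ in ranked[:top_n])

# Hypothetical sample documents, invented for illustration.
docs = [
    "Invoice 1043 for consulting services, payment due in 30 days",
    "Invoice 1044 attached, payment due on receipt",
    "Complaint about harassment by a supervisor",
    "HR report: harassment complaint against supervisor escalated",
]
piles, vectors = group_documents(docs)
for pile in piles:
    print(name_pile(pile, vectors), "->", pile)
```

On these four documents the sketch produces one pile for the two invoices and one for the two harassment messages. Comparing every pair of documents is the simplest way to mirror the description above, but it scales poorly; commercial products presumably use more sophisticated statistics.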
This is a drastic change from traditional culling methods. Prior to the advent of data analytics in 2003, lawyers had to rely on keyword search tools. Those tools required users to think up and type in search terms by hand, a technique that's inherently flawed.
“With a Boolean search you'll wind up with things you don't care about and miss things that you do,” says Stephen Whetstone, vice president, client development and strategy for Stratify Inc., a developer of data analytic solutions. “It's impossible to come up with a comprehensive list of terms and phrases on the front end of the process that will return only what you want.”
Take a sexual harassment case, for example, in which lawyers have to cull through thousands of documents, the majority of which are completely unrelated to the matter.
Using keyword searches, lawyers make educated guesses as to what terms are likely to retrieve the relevant data. In order not to miss any relevant material, the search has to cast a pretty wide net. So they may choose words such as “harassment” and “sex,” which are likely to retrieve loads of irrelevant documents and may still miss the more subtle responsive documents.
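A small Python sketch makes the two failure modes concrete. The five messages and the "responsive" labels below are invented for illustration; the code measures what share of keyword hits are actually responsive (precision) and what share of responsive documents the keywords found (recall).

```python
import re

def keyword_hits(docs, keywords):
    """Return the IDs of documents containing at least one keyword as a whole word."""
    hits = set()
    for doc_id, text in docs.items():
        words = set(re.findall(r"[a-z]+", text.lower()))
        if words & keywords:
            hits.add(doc_id)
    return hits

def precision_recall(hits, responsive):
    """Precision: share of hits that are responsive. Recall: share of responsive docs hit."""
    true_hits = hits & responsive
    precision = len(true_hits) / len(hits) if hits else 0.0
    recall = len(true_hits) / len(responsive) if responsive else 0.0
    return precision, recall

# Hypothetical documents and labels, invented for illustration.
docs = {
    1: "Reminder: annual harassment prevention training is due Friday",
    2: "He keeps making comments about my appearance and it makes me uncomfortable",
    3: "I want to report harassment by my manager",
    4: "Updated form: please confirm name, sex and date of birth",
    5: "Lunch order for the team meeting",
}
responsive = {2, 3}  # assumed reviewer judgment

hits = keyword_hits(docs, {"harassment", "sex"})
precision, recall = precision_recall(hits, responsive)
print(sorted(hits), precision, recall)
```

Here the keywords pull in the training reminder and the personnel form, which have nothing to do with the matter, while missing message 2, which never uses either term.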
Computerized Culling
Data analytics alters this process, incorporating traditional search methods only after the software has automatically sorted the vast body of information into manageable piles. Applied to the same sexual harassment matter, for example, data analytics would remove much of the attorney guesswork.
“Data analytics will automatically compare documents on a contextual basis, understanding that words such as 'explicit' and 'abuse' are related to the word 'harassment,'” says Michele Lange, staff attorney for legal technologies at Kroll Ontrack, an e-discovery consultancy and service provider. “On top of that, it will put all documents with related topics into folders marked accordingly. So it pre-sorts the data for you.”
What counsel are left with is a number of folders with names based on their contents. So one folder might say “invoices” while another might say “harassment.” Attorneys can then quickly rule out any documents in the “invoices” folder and concentrate their efforts on the contents of “harassment.”
“Rather than attempting to make search lists before the body of data is understood, you can become more informed about what kind of data is there and then start forming your search lists,” Whetstone says.
Once data analytics has sifted through the data, lawyers can search the documents using keywords. But instead of scanning through thousands of documents, they only have to search folders containing a few hundred documents that are likely to be responsive. “Data analytics is really good at separating the wheat from the chaff,” says Mike Kinnaman, vice president of marketing for Attenex Corp., a data analytics vendor. “You can remove all the information that has no bearing, allowing you to concentrate your downstream review efforts on only what is really potentially responsive.”
Vendor Assistance
Although data analytic software is available in an off-the-shelf form, it's fairly burdensome to install and implement. As a result, a number of vendors have sprouted up over the past couple of years to help legal departments use data analytics for e-discovery searches. These providers charge roughly $300 an hour for their services.
This cost includes assistance with culling and reviewing the information. One of the most useful services they provide, though, is “non-hit” sampling: The vendor reviews a sample of the documents that the keyword searches did not return to check that no responsive information was missed.
“It's good to up that level of defensibility, especially when there's a high degree of risk,” says Peter McLaughlin, director of review management services at FIOS, an e-discovery consultancy. “So we'll sample the material the keywords didn't hit. The attorneys can keep the results in their back pocket if they ever need to present it in court.”
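As a rough sketch of what non-hit sampling involves, the Python below draws a random sample from the documents the keywords missed and computes the share a reviewer flags as responsive. The sample size, the fixed random seed and the `is_responsive` callback (standing in for a human reviewer's judgment) are assumptions made for illustration; the article does not describe how any vendor actually draws or evaluates its samples.

```python
import random

def sample_non_hits(all_ids, hit_ids, sample_size, seed=0):
    """Draw a random sample from the documents the keyword search did NOT return."""
    non_hits = sorted(set(all_ids) - set(hit_ids))
    rng = random.Random(seed)  # fixed seed so the sample can be reproduced later
    return rng.sample(non_hits, min(sample_size, len(non_hits)))

def estimated_miss_rate(sample, is_responsive):
    """Share of sampled non-hit documents that a reviewer marks responsive."""
    if not sample:
        return 0.0
    missed = sum(1 for doc_id in sample if is_responsive(doc_id))
    return missed / len(sample)

# Hypothetical collection: 1,000 documents, of which every tenth was a keyword hit.
all_ids = range(1000)
hit_ids = range(0, 1000, 10)
sample = sample_non_hits(all_ids, hit_ids, sample_size=50)

# Stand-in for attorney review of the sample; purely illustrative.
rate = estimated_miss_rate(sample, is_responsive=lambda doc_id: doc_id % 100 == 1)
print(len(sample), rate)
```

Recording the seed and the sample alongside the result is one way to keep the exercise reproducible if it ever has to be explained in court.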
Although provider fees aren't cheap, Kibbe believes their results make financial sense.
“Where I save money is not by sending 10 documents for review to find out only three are relevant,” she says. “It's when I'm sending 10 documents into the review room with seven or eight coming out responsive. That's clearly more cost effective.”
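Kibbe's arithmetic can be worked through in a couple of lines of Python. The per-document review cost below is a placeholder of one unit, not a figure from Pfizer or any vendor; only the 10-in, 3-out and 10-in, 7-or-8-out ratios come from her example.

```python
def cost_per_responsive(docs_reviewed, responsive_found, cost_per_doc=1.0):
    """Review cost incurred for each responsive document that comes out of review."""
    if responsive_found <= 0:
        raise ValueError("responsive_found must be positive")
    return docs_reviewed * cost_per_doc / responsive_found

# cost_per_doc=1.0 is an assumed unit cost, so results read as
# "documents paid for per responsive document found".
unfiltered = cost_per_responsive(10, 3)
filtered_low = cost_per_responsive(10, 7)
filtered_high = cost_per_responsive(10, 8)
print(round(unfiltered, 2), round(filtered_low, 2), round(filtered_high, 2))
```

Under that assumption, each responsive document costs about 3.3 reviews when three in 10 are relevant, versus roughly 1.25 to 1.4 reviews when seven or eight in 10 are.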
© 2024 ALM Global, LLC, All Rights Reserved.