Sedona Conference Develops E-Discovery Standards
Consultants Search For Ways To Evaluate Vendor Products
October 31, 2005 at 07:00 PM
If you still think e-discovery search issues are best left to legal department underlings, consider the nightmare Jason Baron and the National Archives and Records Administration (NARA) just went through.
As part of the federal government's lawsuit against the tobacco industry, Maryland-based NARA had to review some 18 million Clinton White House e-mail records stored away in its repository. Because the Clinton administration originally brought the lawsuit, the defense's discovery request asked for relevant White House e-mails.
The review began in late 2004 with an automated keyword search using dozens of terms, such as tobacco, tar, cigarette and nicotine. Then 25 NARA archivists, four in-house attorneys and two law clerks set about a mind-numbing manual review of 200,000 e-mails.
The six-month ordeal stretched NARA's resources to the limit. The archivists had to sort e-mails with multiple attachments manually–not only to segregate responsive and non-responsive records, but also to protect privilege. Even more maddening was the process of weeding out the large number of “false positives” the automated search turned up.
“When we did the keyword search, there were a tremendous number of hits,” says Baron, NARA's litigation director. “But there also were a lot of false positives. For example, 'Marlboro' sometimes brought up responses for 'Upper Marlboro, Maryland.' Or, in another search, 'TI' for 'Tobacco Institute' brought up references to the song 'Do Re Mi' from The Sound of Music ('Ti (tea), a drink with jam and bread').”
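Baron's two examples can be reproduced with a minimal sketch of whole-word keyword matching. The sample messages and the `keyword_hits` helper are invented for illustration; nothing here reflects the software NARA actually used.

```python
import re

# Hypothetical sample messages, invented for illustration only.
emails = [
    "Meeting on tobacco advertising rules next week.",
    "Directions to the courthouse in Upper Marlboro, Maryland.",
    "The Marlboro brand came up in the settlement talks.",
    "Ti, a drink with jam and bread.",
    "Letter received from the Tobacco Institute (TI).",
]

keywords = ["tobacco", "marlboro", "ti"]


def keyword_hits(text, terms):
    """Return the terms that appear in text as whole words, ignoring case."""
    found = []
    for term in terms:
        pattern = r"\b" + re.escape(term) + r"\b"
        if re.search(pattern, text, flags=re.IGNORECASE):
            found.append(term)
    return found


# Map each message index to the keywords it triggered.
hits = {i: keyword_hits(msg, keywords) for i, msg in enumerate(emails)}

for i, terms in hits.items():
    print(i, terms)
```

All five messages are flagged, yet messages 1 and 3 (the Maryland town and the song lyric) have nothing to do with the lawsuit. The matcher sees only the characters of a word, not its meaning, which is why a human reviewer still has to weed out such hits.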
If Baron's experience proves anything, it's that when it comes to searching massive electronic databases, common keyword and Boolean search techniques leave a lot to be desired. And while e-discovery vendors continue to tout alternative search methods, there's really no way of determining whether they're any better.
Attorneys, judges and even e-discovery vendors argue that the only solution is to develop standards or benchmarks that will help attorneys and others in the judicial system better understand what search and retrieval methods are available, how they work and which are best suited to their specific needs.
“A keyword search is a useful tool, but it's not especially reliable,” says George Socha, a Minneapolis-based attorney and consultant who advises clients on e-discovery matters. “While studies suggest it's better than a manual search, it's too broad in some respects and too narrow in others. On the one hand, you get a lot of garbage you don't need. At the same time, you can miss things too. So there's an enormous need for more sophisticated alternatives, as well as ways to evaluate them.”
And that's exactly what's happening.
Standard Practice
The Sedona Conference–a Sedona, Ariz.-based think tank of lawyers, academics and e-discovery experts–formed a search and retrieval group in 2004 to examine common search processes, work with vendors, academics and specialists to establish benchmarks and then use the benchmarks to evaluate various technologies, methodologies and best practices.
“We've got digital data volumes that never existed before, and as a result the vendor marketplace is literally exploding,” says Richard Braman, the founder of the Sedona Conference and its executive director. “We've got a relatively new problem, vendors offering new solutions, and both need to be evaluated. We've got to find a way to stop the tail of e-discovery from wagging the dog.”
Meanwhile another Sedona group–this one focusing on retention and production issues–just released two white papers, “Navigating the Vendor Proposal Process: Best Practices for the Selection of Electronic Discovery Vendors” and “The Sedona Conference Glossary For E-Discovery and Digital Information Management (May 2005 Version).” The papers include sample RFP and vendor contracts, as well as a 50-page glossary of e-discovery terms and definitions. One of the purposes of these papers is to shed some light on an industry that hasn't been that forthcoming in explaining how its products or methods work.
“This is a very young industry,” Socha says. “There are vendors who have been around since the 1980s, but the bulk of them have only been at it for two or three years. You can go to their Web sites, but too many play fast and loose with e-search terms, which makes it difficult to do an evaluation. And there are some vendors out there from whom you should just turn and run away screaming.”
As a result, the search and retrieval group will spend most of 2006 examining typical e-discovery processes and procedures. Furthermore, it will conduct simple dry runs in which volunteers will manually search a set of digital documents containing a known amount of information and evaluate the results. Another set of volunteers will conduct a keyword search on the same set of documents, followed by a Boolean search, a natural language search and so forth.
The idea is to determine the differences between a human conducting a manual search–the old-fashioned way without any automated assistance–and humans conducting various automated searches. The hope is that the results will not only lead to standard approaches, but also offer lawyers some confidence that the search tools on the market are reliable.
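One common way to score such a dry run, when the responsive documents in the test set are known in advance, is to compute precision and recall for each method. These are standard information-retrieval measures; the article does not say which measures the Sedona group will use, and the document IDs below are made up for illustration.

```python
def precision_recall(retrieved, relevant):
    """Score one search method against a known answer key.

    precision: share of retrieved documents that are actually relevant
    recall:    share of relevant documents that were actually retrieved
    """
    retrieved, relevant = set(retrieved), set(relevant)
    true_positives = retrieved & relevant
    precision = len(true_positives) / len(retrieved) if retrieved else 0.0
    recall = len(true_positives) / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical answer key: the documents known to be responsive.
relevant_docs = {1, 2, 3, 4}

# Hypothetical results from two search methods on the same document set.
results = {
    "manual review": {1, 2, 3},
    "keyword search": {1, 2, 5, 6, 7, 8},
}

scores = {name: precision_recall(found, relevant_docs)
          for name, found in results.items()}

for name, (precision, recall) in scores.items():
    print(f"{name}: precision={precision:.2f} recall={recall:.2f}")
```

In this invented example the keyword search returns more documents but scores lower on both measures: four of its six hits are "garbage you don't need," and it still misses half of the responsive documents.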
“We've got senior partners who still cling to the nonsensical notion that scores of contract lawyers and first-year associates can pore over boxes of documents, or sit at computer screens for days and nights on end, living on pizza and Diet Coke, and somehow make accurate relevance and privilege decisions,” says Ken Withers, senior education attorney at the Federal Judicial Center in Washington, D.C. “The sheer volumes involved in electronic discovery, in particular, threaten to bring down the whole civil justice system. We need to fight technology with technology.”
A Model Project
The Sedona Conference isn't the only group developing standards. Socha and Tom Gelbmann, an information technology consultant based in Minnesota, are working together on the Electronic Discovery Reference Model Project. Fifty-nine member organizations support this project, primarily e-discovery vendors and a handful of law firms and in-house counsel–including attorneys from Pfizer Inc. and Halliburton.
Socha and Gelbmann plan to examine and define the many concepts and relationships common to the e-discovery process. That means charting how data is stored, collected, processed and reviewed; and how attorneys determine what constitutes privileged information before submitting the documents to the requesting party.
“To establish guidelines and standards, you first have to agree on the basic definitions and processes,” Socha says. “Right now, there is no right way to conduct an e-discovery search. It's really hard for someone to tackle that right now. But by next May, we should have a document that even the inexperienced can turn to for guidance on a practical level.”
Having standards in place will help vendors too, because a third party will be able to validate the reliability of their products.
“Most vendors either aren't prepared or don't want to divulge exactly how their technology works,” says Judge James Francis, U.S. Magistrate Judge for the Southern District of New York. “And a judge isn't going to be able to accept the use of advanced technologies absent some kind of validation. And that's what the standards would provide.”
A Growing Problem
Perhaps even more important, adds Jim Daley, a partner at Shook Hardy & Bacon and a member of the Sedona Conference search and retrieval group, is that standards will ultimately facilitate the use of more advanced search methods (see sidebar).
Daley and others like him believe that advanced search technologies can do more than simply cull through and review massive heaps of digital data. He argues they also can be used to better manage the entire document lifecycle, from retention to harvesting. For example, a search technology could theoretically be used to scan an employee's personal computer and tell the individual which electronic documents he or she should keep to comply with the company's retention policy and which he or she should discard.
“There's a broad interest in this subject,” Daley says. “It's not just lawyers and judges. It's IT people and records management people too. They know this is a big problem and that it's not going away.”
Braman agrees.
“We've reached the point where everything that's digital can be saved,” he says. “You've got digital footage gathered from 100,000 video cameras in the United Kingdom. You've got voicemail, e-mail and instant messaging. There's an incredible amount of information being gathered and mined for different reasons. It represents a fundamental change in how society operates.”
[SIDEBAR]
Smart Searching
As alternatives to keyword searching go, none holds more promise than an artificial intelligence (AI)-based “concept” search.
In a keyword or Boolean search, lawyers specify that they are looking for any documents that contain term A along with B and C, but not Y or Z. The problem is that Boolean searches are limited to the exact terms supplied. For instance, if you are searching for the name “Robert,” the search will only find documents containing that specific name. The search will not identify documents containing the names “Rob,” “Bob” or “Bobby.”
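The limitation described above shows up in even a tiny Boolean matcher. The sketch below is an illustration under stated assumptions: the `boolean_match` helper and the three sample documents are hypothetical, and real review platforms support far richer query syntax.

```python
import re


def tokens(text):
    """Split text into a set of lowercase words."""
    return set(re.findall(r"[a-z]+", text.lower()))


def boolean_match(text, all_of=(), none_of=()):
    """True if text contains every term in all_of and no term in none_of."""
    words = tokens(text)
    has_required = all(term in words for term in all_of)
    has_excluded = any(term in words for term in none_of)
    return has_required and not has_excluded


# Hypothetical documents, invented for illustration only.
docs = {
    "d1": "Robert approved the nicotine memo.",
    "d2": "Bob approved the nicotine memo.",
    "d3": "Robert forwarded the nicotine draft memo.",
}

# Query: robert AND nicotine AND NOT draft
query = {"all_of": ["robert", "nicotine"], "none_of": ["draft"]}

matches = [doc_id for doc_id, text in docs.items()
           if boolean_match(text, **query)]
print(matches)  # ['d1']
```

Document d3 is excluded on purpose by the NOT clause, but d2 is lost by accident: it describes the same event as d1 and is skipped only because it says “Bob” instead of “Robert.”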
On the other hand, a concept search is based on the relationships between words. Using complex mathematical algorithms, a concept search knows that the terms “Ford Mustang,” “convertible” and “four-door” are all used to describe an automobile.
One AI system, called “Cyc” (pronounced “Psych”), is scheduled for commercial release in 2006 by its developer, Austin, Texas-based Cycorp Inc. Like other AI-based technologies, Cyc consists of a “knowledge base,” a database filled with basic common sense knowledge, complete with terms, rules and relationships.
When connected to a search engine via an interface, the Cyc database helps the search engine execute a more precise query.
“Cyc is not a search engine. It supports a search engine, thereby making it more productive,” says Larry Lefkowitz, Cycorp's executive director of business solutions. A relatively simple application would involve typing a query into a search engine (Google, for example), which interfaces with Cyc over the Internet. Cyc then takes the query, modifies it and sends it back to the search engine to facilitate a more precise search.
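The general idea of rewriting a query before it reaches the search engine can be sketched as simple query expansion. This is not Cyc's interface or method, which the article does not describe in detail; the hand-written `RELATED_TERMS` table merely stands in for a knowledge base, and all names and documents are hypothetical.

```python
import re

# Toy stand-in for a knowledge base: each term maps to related terms.
# A real system would derive these from rules and relationships.
RELATED_TERMS = {
    "robert": {"rob", "bob", "bobby"},
    "automobile": {"car", "convertible", "mustang"},
}


def expand_query(terms, related=RELATED_TERMS):
    """Return the original terms plus any related terms from the table."""
    expanded = set()
    for term in terms:
        key = term.lower()
        expanded.add(key)
        expanded |= related.get(key, set())
    return expanded


def search(docs, terms):
    """Return IDs of documents containing at least one of the terms."""
    return [doc_id for doc_id, text in docs.items()
            if terms & set(re.findall(r"[a-z]+", text.lower()))]


# Hypothetical documents, invented for illustration only.
docs = {
    "m1": "Bobby test-drove the convertible.",
    "m2": "Robert signed the lease.",
    "m3": "Quarterly budget attached.",
}

plain = search(docs, {"robert"})
expanded = search(docs, expand_query(["Robert"]))
print(plain)     # ['m2']
print(expanded)  # ['m1', 'm2']
```

The search function itself is unchanged between the two calls; only the query is modified, which mirrors Lefkowitz's point that the knowledge base supports the search engine and does not replace it.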
Cyc is the brainchild of Doug Lenat, a computer scientist from Stanford University who dreamed of developing and commercializing the world's first true artificial intelligence. For two decades Lenat and a team made up mostly of research assistants patiently and carefully entered millions of facts, assertions and concepts into the Cyc knowledge base. The idea was to create a system that could support a variety of knowledge-intensive products and services.
“We're at the point now where Cyc's information and language capability is ready to bear fruit,” Lefkowitz says. “Have we conducted a legal application? No, not yet. Is it applicable? Absolutely.”