Directions to an e-discovery solution: Keep collection and processing on-premises

The idea of conducting targeted collection and preservation, rather than collecting everything up-front in a scorched-earth manner, has taken root with most people and organizations. However, the devil is in the details as to how a technology works to accomplish this task.

December 14, 2012 at 03:30 AM


By Daniel Lim

The original version of this story was published on Law.com

Part one of this series introduced the following best-practice strategy for deploying e-discovery software solutions: on-premises software for the left side of the Electronic Discovery Reference Model (EDRM), and cloud-based technology for the right-side processes. In part two, we'll talk about how this model affects collection and preservation as well as processing.

An on-premises left side means using behind-the-firewall technology to build an organization's customized process for the following: pre-collection analytics, issuing legal hold notices, collecting and preserving electronically stored information (ESI), processing ESI and first-pass review of ESI. Part one discussed the reasons for using on-premises software for pre-collection analytics and legal hold. Now, we're taking a look at key issues for collection and preservation, and for processing.

Collection and preservation

One methodology that continues to be touted is dubbed the “put everything into a magic box” approach. The general premise is that unstructured data sources, such as laptops, desktops and file shares, are “bad” and that the best way to manage data is by locking it all into a structured repository where theoretically it can be managed, accessed and presumably deleted as needed. Email archiving is a subset of this category that gained early momentum as a solution organizations used to try to manage e-discovery needs.

While organizations may need some degree of archiving for records management purposes, many have found that rather than storing just “business data” or even litigation-related data, these data stores begin to be a redundant source of all data because of their inability to separate relevant from non-relevant information. Such an approach requires a constant migration and assessment of data just to be ready “in case” an organization needs to use the repository to retrieve necessary information. Similar issues exist for solutions that propose to index an organization's entire IT infrastructure.

The ideal on-premises collection and preservation solution avoids this duplication of data by searching original data stores for relevant ESI without the interim step of either migrating the data to a repository or indexing the data source up-front. Also, collection of ESI should be more of an automated process in which every relevant data source is searched for responsive data at the time the preservation obligation begins—nothing more and nothing less. The organization dives into the original data source, captures and preserves relevant ESI in a defensible manner and then moves on, allowing the end-user to continue business activities without disruption or delay.

The need to conduct targeted collections from original data sources militates in favor of having on-premises software for this capability.

Processing

Processing ESI, i.e., de-duplicating and further culling data, has traditionally been the province of review and vendor-hosted solutions. We often think of processing as a specific step in the linear process of e-discovery that sits between the collection of ESI and review.

But with an in-house processing capability, this can be an iterative step that can take place at multiple points during discovery. For example, at the onset of litigation, an organization can take a sample set of data, such as data from a key witness's computer, run some initial processing on the files to isolate the relevant file types and then begin crafting and testing search terms to cull data for the remainder of the collections process from other data sources.
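To make the idea of crafting and testing search terms on a sample set concrete, here is a minimal sketch in Python. It assumes the sample documents have already been reduced to plain text; the function name term_hit_report and the sample documents are hypothetical, and a real platform would also handle stemming, proximity operators and attachments.

```python
import re


def term_hit_report(documents, terms):
    """Count how many sample documents each candidate search term hits.

    `documents` maps a document name to its extracted text (an assumption:
    text extraction has already happened). Matching is case-insensitive
    and on whole words only.
    """
    report = {}
    for term in terms:
        pattern = re.compile(r"\b" + re.escape(term) + r"\b", re.IGNORECASE)
        report[term] = sum(
            1 for text in documents.values() if pattern.search(text)
        )
    return report


if __name__ == "__main__":
    # Hypothetical sample set taken from a key witness's computer.
    sample = {
        "memo1.txt": "The merger agreement was signed.",
        "note2.txt": "Lunch at noon?",
        "mail3.txt": "Re: Merger timeline",
    }
    for term, hits in term_hit_report(sample, ["merger", "agreement", "escrow"]).items():
        print(f"{term}: {hits} of {len(sample)} documents")
```

Terms that hit nearly every document or none at all are candidates for refinement before they are applied to the remaining data sources.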

Another important capability of an in-house solution is the ability to conduct different types of processing at different steps in the e-discovery process to make effective cuts at data when and where it makes the most sense. At the initial stages of a collection, the easiest cuts at data are the ones that do not require substantive review, such as file types and date ranges. Ideally, organizations can make these cuts without having to index the data sets or move them into an archiving or enterprise content management repository.
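As an illustration of cuts that need no substantive review, the sketch below filters files by type and date range using only file system metadata, with no index built and no data moved. It is a simplified example, not a description of any particular product: the function name first_pass_cut is hypothetical, and it assumes the file's last-modified time is an acceptable proxy for the relevant date, which will not hold for every data source.

```python
from datetime import datetime
from pathlib import Path


def first_pass_cut(root, extensions, start, end):
    """Yield files under `root` that match the file types and date range.

    Only file system metadata is read (name and last-modified time);
    file contents are never opened and nothing is indexed or copied.
    """
    wanted = {ext.lower() for ext in extensions}
    for path in Path(root).rglob("*"):
        if not path.is_file():
            continue
        if path.suffix.lower() not in wanted:
            continue  # out-of-scope file type
        modified = datetime.fromtimestamp(path.stat().st_mtime)
        if start <= modified <= end:
            yield path


if __name__ == "__main__":
    # Hypothetical scope: office documents modified during 2012.
    in_scope = first_pass_cut(
        ".", [".docx", ".xlsx", ".pdf"],
        datetime(2012, 1, 1), datetime(2012, 12, 31, 23, 59, 59),
    )
    for path in in_scope:
        print(path)
```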

A more advanced processing technique is to perform “rolling de-duplication.” This type of enhanced processing moves processing up to the collection stage and allows organizations to compare files that they have already collected with the files that are being scanned for collection. Rather than de-duplicating all files after collection, files are de-duplicated during the collection stage so that organizations only collect a single instance of relevant files while maintaining a record of where the duplicates exist. This technique greatly reduces the number of files collected. In a large case, one organization was able to review ESI for more than 800 custodians and reduce an initial data set of 146 terabytes to 17 terabytes using this process.
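The mechanics of rolling de-duplication can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not any vendor's implementation: it treats two files as duplicates when their SHA-256 content hashes match, and the names file_digest and rolling_dedupe are hypothetical. Commercial tools may use different hashing schemes, particularly for email.

```python
import hashlib
from pathlib import Path


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with Path(path).open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def rolling_dedupe(candidates, collected=None):
    """Decide which scanned files to collect, one instance per unique content.

    `collected` maps a content hash to every path where that content has
    been seen; the first path in each list is the instance collected.
    Passing the same dict into later scans is what makes this "rolling":
    each new data source is compared with everything collected so far.
    """
    collected = {} if collected is None else collected
    to_collect = []
    for path in candidates:
        digest = file_digest(path)
        if digest in collected:
            # Already collected: record where the duplicate lives, nothing more.
            collected[digest].append(str(path))
        else:
            collected[digest] = [str(path)]
            to_collect.append(str(path))
    return to_collect, collected
```

The returned index doubles as the record of where duplicates exist. For scale, the reduction cited above, from 146 terabytes to 17 terabytes, removes 129 terabytes, or roughly 88 percent of the initial data set.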

Although processing is often necessary at the full-review stage, this capability also needs to be on-premises to maximize its effectiveness in the earlier stages of discovery.

Conclusion

Organizations understand that they need a mix of technology solutions to address the rising costs and risks of producing ESI. What has hindered progress is the tendency to focus on a particular step in the e-discovery process without a broader view of the best way to accomplish all of the other steps in a seamless and effective manner. Fortunately, enough time has passed for a uniform approach to emerge. It tailors the solution to an organization's needs and infrastructure at the points where on-premises technology matters most, i.e., the early stages of the e-discovery process, while still letting the organization take advantage of best-of-breed review and production capabilities with the performance, features, security and control it needs. There is a role for cloud-based e-discovery solutions, though, and we'll discuss that in our concluding column.



