The Real Impact of Redundant Data and What to Do About It
Data is duplicative by nature, but the way your operation stores and manages data is likely exposing it to unnecessary and costly redundancy. Most organizations handling e-discovery today could very well have a cumulative data set that is anywhere from five to 10 times bigger than necessary.
February 05, 2018 at 04:57 PM
This article covers where data sprawl occurs, why it happens and what you can do about it.
Where ESI Data Gets Duplicated
Redundant data typically shows up in two forms: duplicative original data and export/import duplication.
Duplicative Original Data
It is estimated that more data has been created in the past two years than previously existed in all of time. The exponential increase in data volumes will continue to impact e-discovery. Duplicative original data is almost inevitable during collection, as discoverable data is harvested for potential relevance. While culling and de-duplication are not novel concepts, much of the duplication is a result of the data management and workflows associated with e-discovery.
As data is ingested for normalization, most standalone systems repeatedly create additional copies of the same data and store them on the designated file systems. If an email is sent to 10 people, 10 copies of that email and any associated attachments will be collected and processed. If any of those attachments were saved locally by any of the recipients, those files will also be duplicative. As a result, a significant amount of duplicative original data is being stored in various places within an organization.
If data volumes are not properly taken into consideration while establishing e-discovery protocols, traditional workflows will always result in an increase in the duplication of original data.
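To make the 10-recipient example concrete, here is a minimal Python sketch of how exact duplicates might be grouped by content hash. The function names (`group_by_content`, `redundancy_ratio`) and the sample payload are hypothetical, and hashing raw bytes is a simplifying assumption: copies of the same message can differ in headers or metadata, so a real tool may hash normalized fields instead.

```python
import hashlib
from collections import defaultdict


def group_by_content(items):
    """Group (custodian, payload) pairs by the SHA-256 digest of the payload.

    `items` is an iterable of (custodian_name, payload_bytes) tuples.
    Returns a dict mapping each digest to the custodians holding that content.
    """
    groups = defaultdict(list)
    for custodian, payload in items:
        digest = hashlib.sha256(payload).hexdigest()
        groups[digest].append(custodian)
    return dict(groups)


def redundancy_ratio(items):
    """Return collected bytes divided by unique bytes (1.0 means no duplication)."""
    items = list(items)
    total_bytes = sum(len(payload) for _, payload in items)
    unique_sizes = {}
    for _, payload in items:
        unique_sizes[hashlib.sha256(payload).hexdigest()] = len(payload)
    unique_bytes = sum(unique_sizes.values())
    if unique_bytes == 0:
        return 1.0  # nothing collected, so nothing is redundant
    return total_bytes / unique_bytes


# Hypothetical collection: the same email harvested from 10 custodians.
email = b"Subject: Q3 forecast\n\nSee attached."
collected = [(f"custodian_{i}", email) for i in range(10)]

groups = group_by_content(collected)
print(len(groups))                  # number of distinct payloads
print(redundancy_ratio(collected))  # collected bytes per unique byte
```

In this toy collection all 10 copies hash to one digest, so the collected volume is 10 times the unique volume.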
Export/Import Duplication
Most organizations rely on multiple applications for handling ESI during various aspects of their discovery processes. Each time an application is used to perform data processing, analysis, review or production, data is exported from the preceding application and imported into the next.
How much redundant data is created depends on the particular workflow. It is not uncommon for a data set to expand five or six times in a simple two-product system. Now imagine stepping through a workflow that uses three to four products.
The result: Significant data sprawl.
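The arithmetic behind that expansion can be sketched with a deliberately simple model. Everything here is an assumption for illustration: the function names are hypothetical, and the figure of three stored copies per product (for example an imported copy, a processed copy and an export package) is not taken from any particular tool.

```python
def expansion_factor(copies_per_product):
    """Total stored copies of the original data set across a workflow.

    `copies_per_product` lists how many full copies each product keeps.
    """
    return sum(copies_per_product)


def workflow_footprint(original_gb, copies_per_product):
    """Estimated storage in GB for a workflow, given the original size in GB."""
    return original_gb * expansion_factor(copies_per_product)


# Assumed: each product keeps 3 full copies of whatever it handles.
two_products = [3, 3]
four_products = [3, 3, 3, 3]

print(expansion_factor(two_products))           # copies in a two-product workflow
print(workflow_footprint(100, four_products))   # GB stored for a 100 GB collection
```

Under these assumptions a two-product workflow stores six copies of the data, and each additional product adds its own copies on top.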
Now What?
If you know or simply suspect that your organization is exposed to data sprawl, taking a closer look at your current e-discovery ecosystem is a good first step. Then double down on targeting the two areas causing the majority of redundant data.
Move to Single-Instance Solution
Even within a single system, data is often automatically duplicated when stored, creating a considerable amount of redundancy.
Regardless of de-duplication protocols, legacy solutions store each duplicative file for retrieval at any given time. A single-instance solution is able to identify these duplicative records and store only a single instance of each rendition associated with the record. In the case of the 10 emails and corresponding attachments from above, a single-instance solution will store only a single copy of the duplicative email and point all duplicative records to that single file.
Platforms that offer single-instance storage do not merely report duplicates; they apply that analysis and maintain only unique files on the server.
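One common way to get this behavior is content-addressed storage, where each file is stored once under a hash of its bytes and every record holds a pointer to it. The sketch below is a minimal in-memory illustration under that assumption; the class `SingleInstanceStore` and its methods are hypothetical and do not describe any vendor's implementation.

```python
import hashlib


class SingleInstanceStore:
    """Toy in-memory store that keeps one copy of each distinct payload."""

    def __init__(self):
        self._blobs = {}    # digest -> payload bytes, stored once
        self._records = {}  # record id -> digest of its payload

    def put(self, record_id, payload):
        """Register a record; the payload is stored only if it is new."""
        digest = hashlib.sha256(payload).hexdigest()
        self._blobs.setdefault(digest, payload)
        self._records[record_id] = digest
        return digest

    def get(self, record_id):
        """Return the payload for a record by following its pointer."""
        return self._blobs[self._records[record_id]]

    def record_count(self):
        return len(self._records)

    def stored_bytes(self):
        return sum(len(blob) for blob in self._blobs.values())


# Hypothetical usage: the same email collected from 10 custodians.
store = SingleInstanceStore()
email = b"Subject: Q3 forecast\n\nSee attached."
for i in range(10):
    store.put(f"custodian_{i}/msg-001", email)

print(store.record_count())                # records tracked
print(store.stored_bytes() == len(email))  # only one physical copy is kept
```

All 10 records stay individually retrievable, which matters for review and production, while the server holds a single copy of the bytes.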
Consider an End-to-End Software Solution
And finally, consolidating functionality into a single platform simplifies how you interact with your data from management to workflow and can eliminate or significantly reduce duplication due to export/import between solutions.
Elie Francis, founder and CEO of ONE Discovery, Inc., is a strategic, data-driven technology professional with over 15 years of experience in the legal technology industry. After co-founding Driven, Inc. in 2001, Francis went on to spin off ONE Discovery to build a platform with e-discovery professionals' needs and wants at the forefront of the development.