Data types evolve faster than law. New data types are expanding the scope of discoverable data. The variety, velocity and complexity of electronic evidence challenge legal processes and the technology-enabled legal applications that are designed to support them. While email, documents and spreadsheets continue to comprise the majority of electronically stored information (ESI), social media, streaming media, big data and data products will all play roles in the pursuit of justice. They cannot be avoided.

Rules vary across venues, but the U.S. Federal Rules of Civil Procedure (FRCP) provide good context. In fact, Rule 1 is a touchstone for many e-discovery practitioners. Ideally, whatever the venue, the discovery process supports behaviors consistent with the FRCP Rule 1 guidance that the rules should be used to "secure the just, speedy, and inexpensive determination" of civil actions and proceedings. Expanding the scope of data to be identified, preserved, collected, processed, reviewed and produced has the potential to slow down discovery and increase cost, but if new data types contain uniquely dispositive content (either exculpatory facts or "smoking guns"), it will be necessary to include them in order to achieve just determinations.

We are already seeing the inclusion of social media, mobile device content, cloud content and artifacts of artificial intelligence in the definition of relevant ESI. I recently explored this with Judge Andrew Peck (Ret.), of counsel at DLA Piper. I asked whether, if an artificial intelligence contains evidence of behavior or patterns of behavior, it could contain a "smoking gun" in the context of litigation. Judge Peck responded, "There are already news reports of Fitbits and pacemakers being looked at in criminal cases to show that the digital data contradicts the defendant's story of what happened."

And asked how big data, machine learning or artificial intelligence might be used as evidence, Judge Peck commented, "We are seeing mapping of GPS data used in some employment cases—particularly wage and hour (overtime) cases, to prove the employee detoured off his route for long periods of time, or had a three hour lunch, and thus his claim he had to work overtime to get the job done was not true. When that is analyzing such data over the course of months or a year or more, that is big data."
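To make the GPS example concrete, here is a minimal sketch of how long stationary periods might be flagged in a series of location pings. It is an illustration only, not a description of any tool used in the cases Judge Peck mentions. The function names, the 100-meter radius, the one-hour threshold and the sample coordinates are all hypothetical assumptions.

```python
import math
from datetime import datetime, timedelta

def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two latitude/longitude points."""
    r = 6371000.0  # mean Earth radius in meters
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

def long_stops(pings, radius_m=100.0, min_duration=timedelta(hours=1)):
    """Return (start, end) pairs where consecutive pings stay within radius_m
    of the first ping in the run for at least min_duration.

    pings: list of (timestamp, lat, lon) tuples sorted by timestamp.
    The radius and duration thresholds are illustrative assumptions.
    """
    stops = []
    i = 0
    while i < len(pings):
        t0, lat0, lon0 = pings[i]
        j = i
        # Extend the run while the next ping is still near the run's first ping.
        while j + 1 < len(pings) and haversine_m(
            lat0, lon0, pings[j + 1][1], pings[j + 1][2]
        ) <= radius_m:
            j += 1
        if pings[j][0] - t0 >= min_duration:
            stops.append((t0, pings[j][0]))
        i = j + 1
    return stops

# Hypothetical sample data: a vehicle parked from 12:00 to 15:00, then moving.
sample_pings = [
    (datetime(2019, 11, 4, 12, 0), 40.0, -75.0),
    (datetime(2019, 11, 4, 13, 0), 40.0, -75.0),
    (datetime(2019, 11, 4, 15, 0), 40.0, -75.0),
    (datetime(2019, 11, 4, 15, 10), 40.1, -75.0),
]
sample_stops = long_stops(sample_pings)
```

Run over months of pings rather than four, the same loop is the kind of analysis that turns raw location records into a timeline a court can follow. A real analysis would also have to account for GPS drift, gaps in coverage and the provenance of the data.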

Clearly, sources of potentially relevant ESI have expanded. The electronic discovery community needs to consider how that expansion affects core tenets of our processes, including what it means to have "possession, custody or control" of these new data types, and how such data must be identified and preserved. For example, when criminals use photographs or streaming video in the commission of crimes, the tools required are substantially different from the tools traditionally used to search for and process typical office documents. In a front-page article in The New York Times on Sunday, November 10, 2019, reporters Michael Keller and Gabriel Dance wrote about some of the ways child pornographers use photography and the kind of streaming video software often used in business meetings to commit their crimes. PhotoDNA, a tool used to identify similar still photos (even if they have been slightly altered), is used by some cloud data hosts to screen for known child pornography, but it is not perfect and is not universally used. Filtering streamed content for similar offenses is substantially more challenging.

The next wave of relevant ESI is astoundingly different. Analysis of large volumes of social media or GPS data over time is one thing: it is hard to do, but fairly easy to explain. Even more exotic forms of data are emerging that will stretch the legal and technical capabilities of the electronic discovery community.

Consider, for example, the data definition of genomic material. Andrew Hessel is president of Humane Genomics, a seed stage company that makes cancer-fighting viruses. I spoke with him to better understand the state of data-defined organisms. I asked, "Is it true that viruses can now be defined sufficiently well by data to be 'printed'?"

Andrew replied, "My belief is this is true now for all engineered microorganisms. Virus genomes are small enough that the entire genome can be synthesized and assembled from scratch. But complete bacterial genomes can be synthesized, too. For example, recently the complete E. coli genome was synthesized. Several companies have modified organisms to make everything from foods to fuels. The genomic changes made to these organisms are electronically defined. Construction of new bacterial genomes is anecdotal today except at high-throughput 'fabs' like those operated by Boston-based Ginkgo Bioworks, but over the next decade will become more commonplace in R&D, perhaps even miniaturized for the lab bench."

The implications of data-defined organisms for e-discovery are extraordinary. An immediately apparent use of such data would be in prosecuting or defending intellectual property claims. To understand whether intellectual property rights were abridged, it will be necessary for lawyers and legal technologists to understand the underlying data structures of engineered microorganisms, much as, with email, documents or spreadsheets, they must understand the composition of the file type to adequately preserve and analyze metadata.

A more frightening scenario emerges from the potential criminal use of engineered microorganisms. I asked Hessel if a printed virus or other data-defined genomic product caused a problem, would it be possible to confirm who "printed" it, or could printed genomic products' data files be "smoking guns" in future litigation? He responded, "If criminal use did happen, it may not always be possible to conclusively determine the creators. But like any product made today, there would be a variety of digital files associated with the creation of an engineered organism, including digital DNA files, process automation log files, orders submitted for DNA synthesis, etc. that would be part of any investigation."

As a final example of emerging data types that may be included in future discovery, consider the patterns of data that are generated by event-driven architectures. AWS defines event-driven architectures as ones that use:

"[E]vents to trigger and communicate between decoupled services… Events can either carry the state (the item purchased, its price, and a delivery address) or events can be identifiers (a notification that an order was shipped). Event-driven architectures have three key components: event producers, event routers, and event consumers. A producer publishes an event to the router, which filters and pushes the events to consumers. Producer services and consumer services are decoupled, which allows them to be scaled, updated, and deployed independently."

A key aspect of applications based on such an architecture is that the events themselves trigger subsequent events, and that producers and consumers are separate entities. When events at issue in civil litigation are triggered in such a distributed commercial ecosystem, lawyers, legal technologists and courts will have to grapple with complex issues. Questions such as who is responsible, and who has "possession, custody or control" of data subject to preservation when litigation is reasonably anticipated, will demand answers. The written preservation notice is a pillar of the practice of e-discovery law. To whom should it be addressed when a potentially illegal act results from events triggered by "decoupled" producers and consumers?
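A minimal sketch may help show why that question is hard. The code below models the three components in the AWS description (producer, router, consumer) in a single process. It is not AWS code and does not use any AWS API. The class names, service names, event types and the audit log are hypothetical assumptions chosen for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass, field

@dataclass
class Event:
    event_type: str  # e.g. "order.placed" (hypothetical event name)
    producer: str    # name of the service that published the event
    payload: dict    # state carried by the event

@dataclass
class Router:
    """Toy in-memory event router; a real one would be a separate hosted service."""
    # Maps event type -> list of (consumer name, handler function).
    subscriptions: dict = field(default_factory=lambda: defaultdict(list))
    # Hypothetical audit trail of (producer, event type, consumer) deliveries.
    audit_log: list = field(default_factory=list)

    def subscribe(self, event_type, consumer, handler):
        self.subscriptions[event_type].append((consumer, handler))

    def publish(self, event):
        # Push the event to every consumer subscribed to its type,
        # recording each delivery before the handler runs.
        for consumer, handler in self.subscriptions[event.event_type]:
            self.audit_log.append((event.producer, event.event_type, consumer))
            handler(event)

router = Router()
notifications = []

def shipping_handler(event):
    # The shipping service consumes one event and produces another.
    router.publish(Event("order.shipped", "shipping", {"order_id": event.payload["order_id"]}))

def notifier_handler(event):
    notifications.append(f"Order {event.payload['order_id']} shipped")

router.subscribe("order.placed", "shipping", shipping_handler)
router.subscribe("order.shipped", "notifier", notifier_handler)

# A storefront service (hypothetical) publishes the first event in the chain.
router.publish(Event("order.placed", "storefront", {"order_id": 1001, "price": 25.0}))
```

Even in this toy version, the storefront never calls the notifier directly, yet its event ultimately causes the notification. In a commercial deployment, the three services and the router may each belong to a different company, and a record like the audit log here may exist only at the router, if it exists at all.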

These examples illuminate a few of the myriad current and emerging data types the e-discovery community must address. The complexity of a new data type does not diminish its potential relevance. The rules and processes of discovery must keep pace.

Cliff Dutton advises corporations and law firms on legal operations and technology-enabled legal services. A pioneer in e-discovery and legal risk analytics, he previously served as Chief Innovation Officer at Epiq and Senior Vice President of Legal Operations at AIG.