Animoji being announced at the Sept. 12 Apple event.

With the growth of text messages and social media content, modern e-discovery practitioners have a difficult time staying on top of all the mobile data they need to preserve, collect and review. While many are rising to the challenge, there are areas of mobile data that remain out of reach for even the best e-discovery platform or practitioner.

Emojis, for example, elude many modern search tools. And while it is possible to reconfigure e-discovery search engines to account for such pictorial information, it is nearly impossible to account for all emoji types and iterations.

For now, most e-discovery tools just aren't made to handle this new type of content.

“In many cases, emoji search is just dead on arrival, because the search indexes haven't been configured to take emojis into account at all,” explained Jeff Kerr, CEO and founder of CaseFleet, a legal practice management software company. “So even if you enter into the search box the actual emoji character you are looking for with an emoji keyboard, there is no chance of it matching with anything.”

The problem stems from how search technologies deconstruct and interpret documents. “Search engines take various steps to sanitize [the data] they put in their indices to limit the amount of junk content and make sure that your search” yields relevant results, Kerr said. When dealing with a document, for example, search engines will discard spaces and punctuation, along with visuals such as emojis, to focus on words and phrases instead.
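The effect Kerr describes can be illustrated with a minimal sketch. It assumes a hypothetical word-only tokenizer and a toy in-memory index (the names `tokenize` and `build_index` are invented for illustration and do not come from any e-discovery product), and shows how an emoji can vanish before it ever reaches the index:

```python
import re

# Hypothetical word-only tokenizer: keeps runs of letters, digits and
# underscores, and silently drops everything else (spaces, punctuation, emoji).
TOKEN_RE = re.compile(r"\w+")


def tokenize(text: str) -> list[str]:
    """Lowercase the text and return only its word tokens."""
    return [token.lower() for token in TOKEN_RE.findall(text)]


def build_index(docs: dict[str, str]) -> dict[str, set[str]]:
    """Build a toy inverted index mapping each token to the documents containing it."""
    index: dict[str, set[str]] = {}
    for doc_id, text in docs.items():
        for token in tokenize(text):
            index.setdefault(token, set()).add(doc_id)
    return index


docs = {"msg1": "Deal is done 👍", "msg2": "Call me, please."}
index = build_index(docs)

print(index.get("deal"))  # the word is indexed and points at msg1
print(index.get("👍"))    # None: the emoji never made it into the index
```

Because the emoji is discarded at indexing time, typing it into a search box later cannot match anything, no matter how the query is entered.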

But this is far from the only problem. Even if one were to reconfigure search engines to account for emojis, “you would still have to enter that actual emoji into the search bar with an emoji keyboard, which I think for obvious reasons is just an impractical way of finding relevant documents,” Kerr said.

He added, “The better approach, which to my knowledge no e-discovery company has implemented to date, is to index documents containing emojis in a way that not only includes the actual emoji itself, but also its descriptive name.”
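One way to picture that approach is the sketch below. It is an illustration under stated assumptions, not a description of any vendor's implementation: it treats any character in Unicode category "So" (other symbol) as an emoji, which is only an approximation, and it uses the formal Unicode character name from Python's standard `unicodedata` module as the descriptive name. The helper `expand_emoji` is hypothetical.

```python
import unicodedata


def expand_emoji(text: str) -> list[str]:
    """Tokenize text, keeping each symbol character and adding the words of its Unicode name.

    Assumption: characters in Unicode category "So" stand in for emoji. This
    misses multi-character sequences such as skin-tone or joined emoji.
    """
    tokens: list[str] = []
    word: list[str] = []
    for ch in text:
        if ch.isalnum():
            word.append(ch.lower())
            continue
        if word:  # a non-word character ends the current word
            tokens.append("".join(word))
            word = []
        if unicodedata.category(ch) == "So":
            tokens.append(ch)  # index the emoji itself
            name = unicodedata.name(ch, "")  # e.g. "THUMBS UP SIGN"
            tokens.extend(name.lower().split())  # index its descriptive name too
    if word:
        tokens.append("".join(word))
    return tokens


def build_index(docs: dict[str, str]) -> dict[str, set[str]]:
    """Build a toy inverted index from the expanded tokens."""
    index: dict[str, set[str]] = {}
    for doc_id, text in docs.items():
        for token in expand_emoji(text):
            index.setdefault(token, set()).add(doc_id)
    return index


index = build_index({"msg1": "Deal is done 👍", "msg2": "Call me, please."})
print(index.get("👍"))      # found by the emoji character itself
print(index.get("thumbs"))  # also found by a plain-text word from its name
```

With both forms in the index, a reviewer could type an ordinary word such as "thumbs" and still retrieve the message, without needing an emoji keyboard.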

To be sure, reconfiguring search engines to account for emojis is “definitely not impossible,” Kerr said. He pointed to the Elasticsearch tool, which is built on open source software called Apache Lucene. “There is actually a plug-in for Elasticsearch that can be installed very easily to basically turn on the ability to index emojis.”

Still, Yaniv Schiff, director of digital forensics at QDiscovery, noted that while emojis can be identified and processed in documents by some technologies, when it comes to e-discovery review, “the level of support for emojis is dependent on the e-discovery platform” and can vary widely.

What's more, the sheer variety of emoji content means that no single tool can account for every type of emoji that may appear in a document.

While most emojis are categorized in Unicode, a computing standard that lists “all of the atomic pieces of which language is composed,” there are new emojis being created all the time, Kerr said.

Private companies and mobile users, for example, can create proprietary emojis for their own personal use outside of Unicode standards.

As an example, Kerr pointed to chat platform Slack, which allows companies to make their own emojis. Because of the platform, “people have their own private emoji collections” which cannot be easily indexed or, in some cases, even deciphered, he said.

But lagging behind an evolving data type is nothing new to the e-discovery field. “E-discovery platforms have a constant challenge with any emergent technology and any emerging data,” Schiff said. “Emoji is one example of that. There are all sorts of other types of data that e-discovery has issues with, and that's just the nature of the business. It's impossible to keep up with every new type of data or every new type of file that is coming out.”