Robot Artificial Intelligence

Late last year, lawyers at Gowling WLG were approached by a client with a seemingly herculean task: break down a large data set of contracts and find out exactly what was in each one.

"They had a contract management system they acquired, and they wanted to put all this contract information into the system," recalls Rick Kathuria, the firm's national director of project management and legal logistics. "They looked at doing it themselves, but realised they just didn't have the time."

The problem was that the client's contracts were unknown entities. Many had started out as standardised templates, but through sales and business interactions, each became a unique legal agreement. Gowling could have deployed a vast team of lawyers to review and dissect each contract. Instead, it decided to shun the manual review process for something far more efficient. "[We] asked if we should use artificial intelligence (AI) for this, to see if AI can identify a lot of the information they were looking for," Kathuria says.

The client approved, and Gowling soon found its contract review proceeding with record speed and precision. But it was not without its fair share of burdens. "We spent a fair bit of time training the AI" before deploying it, Kathuria explains.

And herein lies the paradox of AI: for all the time it saves automating manual tasks, it can be a laborious, finicky tool to set up in the first place. The technology often requires a significant amount of specific data and a long period of hands-on care before it can function independently.

At its core, teaching AI is still a delicate, complex endeavour, rife with specific demands that legal professionals hoping to use the technology must learn to recognise, respect and work around. Legal's future may be defined by the automation AI enables, but its progress or postponement, at least initially, will depend on the tutelage legal professionals, data scientists and others provide.

Training behind the scenes

For Gowling, training an AI contract review tool was demanding, but ultimately manageable. Many AI developers strive to make the technology easy to use, often doing much of the necessary heavy lifting behind the scenes. Gowling's AI tool, for example, came pre-trained for contract clauses. "It knew how to identify the parties, it knew how to identify other types of things, and it could pull those things out," Kathuria explains.


Most off-the-shelf AI legal technology begins its life with at least a grade school-level education: It can read and comprehend language. And depending on the use cases for which the AI was designed, it will be able to understand technical, industry-specific language, such as legalese. "You can preload it with any lexicon, just like a vocabulary system, and we have a bunch of specialised lexicons" that can go into any platform, says Daniel Katz, associate professor of law at the Illinois Institute of Technology's Chicago-Kent College of Law and co-founder of legal analytics company LexPredict.

But embedding language skills into AI - work known as natural language processing (NLP) and natural language understanding (NLU) - is an immense undertaking. It requires manually labelling individual parts of a sentence, breaking down punctuation and grammar, and continuously identifying increasingly complex hierarchical relations between such labels.
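To give a sense of what that labelling produces, here is a minimal sketch using the open-source spaCy library; the article does not name the tools any of these firms use, so spaCy stands in purely for illustration. Each token in a parsed sentence carries the kind of labels annotators once had to supply by hand:

```python
# Minimal sketch of token-level annotation, using spaCy as an
# illustrative stand-in (no specific toolkit is named in the article).
import spacy

# Requires: python -m spacy download en_core_web_sm
nlp = spacy.load("en_core_web_sm")

doc = nlp("This Agreement may not be assigned upon a change of control.")

# Each token carries a part of speech, a grammatical role, and its head
# in the sentence's dependency hierarchy.
for token in doc:
    print(f"{token.text:10} {token.pos_:6} {token.dep_:10} head={token.head.text}")
```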

"In my experience, there is a lot of art and work and labour that goes into annotating the text in terms of the more [complex] concepts," says Kevin Ashley, professor at University of Pittsburgh School of Law. Ashley is also a faculty member in the University of Pittsburgh's multidisciplinary graduate program in Intelligent Systems. "And that amount of work, labour and expense is far greater than whatever might be saved by convenient tools that already have that packaged into the program."

Annotating, however, is just one step of the process. To make sure the AI can apply its knowledge in as broad and malleable a way as possible, developers turn to supervised learning, the primary training method for AI, which works as a type of quality control process. Give the AI examples of what you want it to learn to identify, such as grammatical relationships or specific types of contract clauses, then have it run tests on unannotated texts, correcting it when it is wrong. Run these tests over and over and, after some time, the system will start to understand the lexicon on a deep, conceptual level.
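As a concrete illustration of that loop, here is a brief sketch using scikit-learn; the clause snippets, labels and model choice are invented for the example and are not drawn from any system described in the article:

```python
# Illustrative supervised-learning setup for clause identification,
# sketched with scikit-learn; all text and labels are invented.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Annotated examples: clause text paired with the label a lawyer assigned.
train_texts = [
    "Either party may terminate this Agreement upon thirty days' notice.",
    "Neither party may assign this Agreement without prior written consent.",
    "This Agreement shall be governed by the laws of England and Wales.",
]
train_labels = ["termination", "assignment", "governing_law"]

model = make_pipeline(TfidfVectorizer(), LogisticRegression())
model.fit(train_texts, train_labels)

# Run the model on unannotated text; a trainer reviews the prediction,
# corrects it if wrong, and folds the corrected example back into the
# training set for the next iteration.
print(model.predict(["The laws of New York govern this contract."]))
```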

"The thing about machine learning and the way that AI works today is that it doesn't require the words 'change of control' in an actual clause," Kathuria says. "The machine can actually understand that this is what a change control clause looks like regardless of how it is worded."

Getting AI up to this level of understanding, however, takes a lot of trial and error, says Pratik Patel, co-founder and vice president, global innovation and products, at Elevate Services, a legal service provider that builds AI solutions for law firms and legal departments. He explains that after creating an AI system's algorithms, and running several iterations of supervised learning, Elevate's team of data scientists will review the AI's work and constantly tweak its algorithms to make the system more accurate. "We repeat this process of data validation, quality control and data science refinement constantly."
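A hedged sketch of what one round of that validation might look like, continuing the scikit-learn example above; the hand-labelled validation clauses here are invented:

```python
# Quality-control step of the kind Patel describes: score the model on
# held-out, hand-labelled clauses before another round of refinement.
# Continues the sketch above; the validation data is invented.
from sklearn.metrics import accuracy_score

val_texts = [
    "This Agreement terminates automatically on the closing date.",
    "Colorado law governs this Agreement and any dispute arising from it.",
]
val_labels = ["termination", "governing_law"]

accuracy = accuracy_score(val_labels, model.predict(val_texts))
print(f"validation accuracy: {accuracy:.0%}")
# If accuracy falls short, the team revisits annotations, adds examples
# and tweaks the model, then validates again.
```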

For some, like Eric Kaufman, assistant director of research and knowledge management services at US firm Stroock & Stroock & Lavan, which is considering using AI for due diligence and contract review, these behind-the-scenes efforts can have immediate payoffs.

"Depending on the product, we're finding that without [additional] training, accuracy is pretty high; it's in the 80%-90% range," he says.

Surpassing human accuracy

Of course, AI developers aren't able to account for every specific type of legalese a law firm might be searching for in a data set. So while Gowling benefited from developers' efforts, there was still "a lot of the information the client was looking for that didn't come pre-trained in the platform," Kathuria says.

Gowling's team worked through the AI platform to annotate examples of contract clauses it needed the machine to learn. When the AI ran tests on clean contracts, the training team let the system know when it was wrong, thereby refining its understanding of the new information with increasing accuracy. "At a certain point in the training," the AI system will "basically identify clauses for you that you might have missed" during your own manual review, Kathuria says.

But training AI takes more than providing iterative examples of a given data point. "Another thing that the AI also needs are examples of where it can't find the information," Kathuria says. "So for example, here are 10 contracts where it doesn't have any information about" a specific contract clause.
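Continuing the earlier scikit-learn sketch, such negative examples can be folded in under an explicit "none" label, so the model also learns when not to extract anything; again, the text and labels are invented:

```python
# Negative examples of the kind Kathuria describes: passages where the
# target clause is absent get an explicit "none" label. Continues the
# sketch above; all text is invented.
negative_texts = [
    "The schedules attached hereto form an integral part of this Agreement.",
    "All notices shall be delivered to the addresses listed in Annex B.",
]
train_texts += negative_texts
train_labels += ["none"] * len(negative_texts)

model.fit(train_texts, train_labels)  # retrain with the negatives included
```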

Conversely, with too few examples of where a data point is located, the system will struggle to form a high-level understanding of how to identify it. "It's funny, you would think a title or a date within a contract or a document would be pretty easy [to learn]. After all, that's very straightforward," Kaufman says. But with "a title or a date, there is very little text, there is very little information on it, so that actually makes it more difficult."

Such nuances of machine learning can come as a surprise to those new to AI. But chances are, most legal professionals involved in training will come face-to-face with them at some point. "For the most part, regardless of the platform or the application, training seems to be the same," Kaufman adds.

Simplicity and subjectivity

In the end, it took Gowling "about two weeks to get [the AI platform] trained up to understand what to look for and how to look," Kathuria says. The firm started with a team of two staff training the AI, but then opened it up to "between five and 10 lawyers, not working on it full time, but over a period of a week."

The time it took to train the AI was particular to Gowling's project, and ultimately depended on the type of information the firm was looking for and the number of examples it could provide the system.

There is no standard, therefore, for how long it takes to train AI. "When you think of training AI on one clause, it could be done in a few days, but again it depends on how many contracts you have and how long it's going to take you to find that information in those contracts," Kathuria says.

While time spent training will vary, there are consistent ways to shorten it. For the most part, the key to holding training time down is to keep the user experience as simple and seamless as possible. "The systems that are really good for training are the ones used for a very specific scope and need, such that the lawyers are being asked very simple questions," Elevate's Patel says.

But while helpful, such streamlined training may inhibit the ability of AI systems to evolve more freely. When AI systems are pre-trained before release, their algorithms are repeatedly tweaked and updated to further increase accuracy.

Once deployed by a client, these systems can continue to receive such updates. But doing so may involve more work than legal AI users are comfortable with. A system can say, "just give me a thumbs up or thumbs down if you would like to improve the rule to include these types of patterns," Patel notes. "But to be quite honest, I haven't seen a bit of that yet in legal."

While AI platforms can stick to asking simple questions, they cannot always ensure there will be simple answers. Identifying a particular contract clause is a fairly straightforward endeavour. But other tasks, like figuring out what documents are relevant to a case, can pose some difficulties.

"Human annotation is generally regarded as an upper bound on the accuracy of machine learning. That is to say that if you can't get smart humans to agree as to the positive or negative instances of a concept, then you're never going to get a machine learning program to do that,"  Ashley says.

It's vital, therefore, for a legal team to have consensus before diving into training. "One of the most important things is [to] sit down before you begin [training] to clearly identify what it is you are looking for the program to do" and agree on the data set, Kaufman advises.

Equally important is having the people who train the AI be among the most knowledgeable and experienced subject matter or legal experts at the firm. "Don't take shortcuts with who is training the system," Patel says. "If you get an expert to train the system, you'll save a ton of time because the knowledge that expert is contributing into the system is going to be much more accurate."

Today's AI is therefore tethered to human knowledge, dependent on trainers to plant the seeds of its expertise, nurture its development, and step in when it reaches the limits of its abilities.

For a technology that often brings to mind automatons and robots, the reality is vastly different. As any legal AI trainer will divulge, AI is just an extension of its trainers, the harnessing of their expertise into an external automation. It can achieve new heights of efficiency and accuracy, but at its core, the scope of its cognition will, for now, trail that of its teachers.