With big data poised to shake up the legal industry, Georgia State University College of Law is getting into the data analytics business.

GSU Law has launched a Legal Analytics Lab to apply big data to law—unearthing patterns in civil litigation, patent filings and corporate compliance disclosures to shed light on legal questions and predict future outcomes in ways that earlier generations of lawyers could not have imagined.

A few other law schools have also started legal analytics labs, but the GSU Law lab's director, Charlotte Alexander, said its focus on real-world applications makes it unique.

The Legal Analytics Lab is housed at GSU's Institute for Insight, which is the J. Mack Robinson College of Business's own data analytics lab, started in 2015. Companies and law firms can engage GSU law professors and data scientists to test out proof-of-concept data analysis projects, called “sprints,” said Alexander, a professor at GSU's business and law schools who studies employment law.

“The sprints are a learning opportunity for our students and a way for a company to see what it's possible to do with the data they do have,” she explained.

By working on the sprints, law students can learn data science skills for the legal jobs of the future—an important part of the lab's mission. “We are at the beginning of a real disruption in the way firms practice law and the way legal research is done,” Alexander said.

The lab germinated from a legal research project in which Alexander is analyzing federal district court judges' decisions since 2008 in worker misclassification cases—a timely issue in light of numerous plaintiffs employment suits brought by “gig” workers challenging their classification as contractors by Uber, Amazon, FedEx and other big players in the flexible new world of work.

She enlisted a data scientist, Javad Feizollahi, at the Institute for Insight to help. Alexander realized there were plenty of other legal analysis projects that GSU law professors and data scientists could undertake, and so she recruited colleagues to “create an incubator at GSU for this type of research and see what's possible.”

Other GSU Law professors working with the lab are Timothy Lytton, Jonathan Todres, Anne Tucker and Doug Yarn.

Right now the lab is working on a sprint for an insurance company that offers director and officer liability coverage, Alexander said. The insurer is mining the text of securities complaints and corporate disclosures to predict the likelihood of securities litigation that could affect its clientele.

Another project is identifying financial technology patent applications and tracking their impact on the financial services industry.

Big Law Takes Note

The lab is already attracting interest from law firms such as Seyfarth Shaw. Brett Bartlett, who heads its Atlanta labor and employment group, sees mining the lab's database of federal employment cases for predictive patterns as a high-tech way to help clients mitigate risk.

He and Kevin Young, another Seyfarth partner in the group, are talking to Alexander about analytics projects that could help their clients spot compliance problems, head off employment claims and reduce the uncertainty and expense of litigation. “Any time you have employment claims made, it creates operational drag,” Bartlett said.

Seyfarth is considering looking at all federal employment cases nationally and “determining on a granular level, down to the judge, where problems arise under Title VII [prohibiting employment discrimination]—or any employment law,” Bartlett said—useful to a client with 50 single-plaintiff Title VII cases in different jurisdictions.

“The list of what we might dig into is endless,” Young said, adding that the data could help them advise clients on litigation tactics. “We could create modeling for when to settle, for how much—or if the case should go to trial.”

There's an Algorithm for That

Alexander's interest in automating data analytics sprang from her inquiry into the factors that lead judges to classify a worker as an employee or a contractor.

U.S. labor and employment laws leave a lot of grey area in distinguishing between the two, she explained, so analyzing judges' opinions sheds light on what criteria federal district courts use to draw the line. The distinction matters, since only employees receive protection against job discrimination, overtime pay rights and other benefits.

“I basically wanted to find out more about how judges apply the law in employment cases, and I felt I could do it more quickly and efficiently if I could automate the process,” Alexander said.

Instead of using law students to code the text of opinions, as she had done for past projects, Alexander said she wondered if she could “get an algorithm to do it.”

She and Feizollahi, the GSU data scientist, won a grant of almost $250,000 from the Department of Labor to develop a massive database of federal judges' opinions from PACER, the government repository of federal cases.

Existing legal research services like Westlaw and LexisNexis have databases of court documents, but Alexander said their terms of service do not allow for machine-searching or sharing downloaded document sets.

Instead, she and Feizollahi partnered with the Free Law Project in Emeryville, California, which makes PACER documents publicly available. The GSU project added 3.4 million orders and opinions from 1.5 million federal district and bankruptcy court cases to the nonprofit's RECAP collection.

After mining misclassification opinions from PACER, the team created algorithms to tally judges' use of classification criteria—for instance how many times an opinion mentions “tools,” since a worker using their own tools is one criterion for a contractor.
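As a rough illustration of that kind of tally, and not the lab's actual code, a minimal Python sketch might count how often an opinion uses words tied to each classification criterion. The criterion names and keyword lists here are hypothetical; the only term the article itself mentions is “tools.”

```python
import re
from collections import Counter

# Hypothetical keyword lists for illustration only. The article names
# just one term ("tools"); the rest are assumptions.
CRITERIA = {
    "tools": ["tool", "tools", "equipment"],
    "control": ["control", "supervise", "supervision"],
}


def tally_criteria(opinion_text, criteria=CRITERIA):
    """Count how many times an opinion uses words tied to each criterion."""
    # Lowercase the text and split it into alphabetic words.
    words = Counter(re.findall(r"[a-z]+", opinion_text.lower()))
    return {
        name: sum(words[term] for term in terms)
        for name, terms in criteria.items()
    }


sample = (
    "The driver supplied his own tools. "
    "The company did not control his hours or his tools."
)
print(tally_criteria(sample))  # {'tools': 2, 'control': 1}
```

A real pipeline would need far more than exact word matching, such as handling negation and context, but the counting step itself is this simple.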

An algorithm can also identify characteristics that a group of opinions, such as those classing workers as employees, have in common, she said.

“That is what I think is particularly exciting here,” Alexander said. “There might be factors that we're missing—so the algorithm can give us new insights into legal documents.”
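One simple way to surface what a group of opinions has in common is to compare how often each word shows up in that group versus the rest. The sketch below is an assumption about how such a comparison could work, not a description of the lab's method, and the four sample “opinions” are invented for illustration.

```python
import re
from collections import Counter


def doc_freq(opinions):
    """Share of opinions in which each word appears at least once."""
    counts = Counter()
    for text in opinions:
        counts.update(set(re.findall(r"[a-z]+", text.lower())))
    return {word: n / len(opinions) for word, n in counts.items()}


def distinctive_terms(group, others, top_n=3):
    """Words far more common in `group` than in `others`."""
    in_group, in_others = doc_freq(group), doc_freq(others)
    scores = {w: f - in_others.get(w, 0.0) for w, f in in_group.items()}
    # Highest score first; ties broken alphabetically for stable output.
    return sorted(scores, key=lambda w: (-scores[w], w))[:top_n]


# Invented sample text, standing in for real opinions.
employee_opinions = [
    "The company set the schedule and supervised the work.",
    "Managers supervised each shift and set the schedule.",
]
contractor_opinions = [
    "The worker set his own hours and owned the tools.",
    "She owned her tools and chose the work.",
]
print(distinctive_terms(employee_opinions, contractor_opinions))
# ['schedule', 'supervised', 'company']
```

Words that score high but match no known legal criterion are the kind of candidate “missing factor” a researcher might then read the opinions to check.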

No two cases are the same, she cautioned. “We are not at the point where we can just feed in a set of variables and come up with a prediction. We're talking about judges. They're not robots.”

“But it's a good tool for the toolkit,” she added.

Mapping Plaintiffs Networks

Alexander is working on another project to map the networks among plaintiffs lawyers who've filed wage and hour suits under the Fair Labor Standards Act.

“I want to see if we can map links among cases and lawyers to understand why FLSA litigation has boomed—and in certain jurisdictions more than others,” she said.
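A network of this kind can be sketched by linking any two lawyers who appear on the same case. The Python below is a minimal illustration under that assumption; the case and lawyer names are placeholders, and the article does not say how the lab actually defines a link.

```python
from collections import defaultdict
from itertools import combinations

# Placeholder docket data: case ID -> plaintiffs lawyers on the filing.
cases = {
    "case-1": ["Lawyer A", "Lawyer B"],
    "case-2": ["Lawyer B", "Lawyer C"],
    "case-3": ["Lawyer A", "Lawyer B"],
}


def build_network(cases):
    """Link every pair of lawyers who share a case; weight = shared cases."""
    edges = defaultdict(int)
    for lawyers in cases.values():
        for a, b in combinations(sorted(set(lawyers)), 2):
            edges[(a, b)] += 1
    return dict(edges)


def degree(edges):
    """Number of distinct co-counsel each lawyer is linked to."""
    links = defaultdict(set)
    for a, b in edges:
        links[a].add(b)
        links[b].add(a)
    return {lawyer: len(partners) for lawyer, partners in links.items()}


network = build_network(cases)
print(network)          # {('Lawyer A', 'Lawyer B'): 2, ('Lawyer B', 'Lawyer C'): 1}
print(degree(network))  # {'Lawyer A': 1, 'Lawyer B': 2, 'Lawyer C': 1}
```

Adding the filing court to each case would let the same structure show whether tightly linked lawyers cluster in particular jurisdictions.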

“Employers are worried about this. They're buying insurance against it,” Alexander added.

Tracking plaintiffs suits is useful, she said, because, unlike the heavily regulated European model, “so many of our rights and protections are not enforced by the government but by plaintiffs lawyers.”

The U.S. system “relies on people who are harmed to identify that harm and go to court,” Alexander said, so she's interested in figuring out what factors make it more likely for people to sue and how judges apply the law in court.

Mapping plaintiffs lawyers' networks is a way to examine their role as “the intermediary between the worker and the court system,” she said.

“If we want to get to know better how our litigation-based system of enforcement is working, we need to do more than just read individual cases,” Alexander said.