The impact that pretrial risk or risk and needs assessment (RNA) tools have on criminal justice decisions varies from jurisdiction to jurisdiction. After all, states use different tools at different times for different purposes. How much risk assessment tools exacerbate or mitigate bias within criminal justice processes, therefore, often comes down to how they're designed, validated and weighed by a judge or correctional officer.

But outside of individual implementations, the use of risk assessments as a whole, and the value—or lack thereof—they provide is a matter of intense debate.

For some, no matter which tool is used or how, risk assessment will always propagate the structural racism plaguing courts and corrections departments. Others, however, say well-validated tools help shed light on bias in the criminal justice system, and by extension, enable action.

To be sure, few disagree that these tools grapple with the same fundamental bias issues as the criminal justice systems that they seek to aid. But there's much less consensus over whether this is a fundamental flaw that renders these tools a lost cause, or just a pitfall risk assessment instruments can work around.


The Bias Question

Few believe that risk assessment tools are a panacea for one of the biggest problems facing the U.S. criminal justice system. "Let me be clear, they don't eliminate bias—there is a very large systemic, cultural problem, and no one should be suggesting it eliminates all bias," says David D'Amora, senior policy adviser at the Council of State Governments Justice Center.

Still, research has shown that with a "well-constructed, appropriately validated tool … decisions made are more accurate and have less bias to them" than without one, he adds.

But Dr. Jennifer Skeem, professor of Public Policy at the University of California, Berkeley, says the benefits aren't as explicit. "There isn't, in my view, a great deal of research that backs up the claim that the use of risk assessment instruments will reduce bias in human decision-making."

She explains, "I think we have a great deal of indirect research that suggests that when we reduce the amount of discretion [officials] have when they're making decisions about people, we also tend to reduce the amount of racial disparity in those decisions. But that research has mostly been done in other contexts like policing and in employment decision making."

However, the tools have made a clear difference in at least one area. "In contrast, there is a great deal of research that tells us very directly that risk assessment instruments can increase our accuracy and consistency in assessing risk of recidivism compared to unaided judgment," Skeem says.

But more accurate, tech-aided judgments often run into the same bias problems as unaided ones.

Both judges and assessment tools are "relying [on] and using court-provided and criminal information, [and] both sources of [that] information are biased," says Colin Doyle, staff attorney at Harvard Law School's Criminal Justice Policy Program, who works on pretrial reform and bail issues. "Whether you're a judge or an algorithm, that bias is baked into your thinking about predicting future crimes."

Are risk assessment tools, then, inherently biased? The answer is both simple and complex. The simple answer: ones that are designed and validated poorly certainly can be.

"There are many different types of risk assessment instruments and they are validated to greater or lesser extent, so I imagine it is quite possible that you can find biased risk assessment," Skeem says.

Whether bias is inherent in assessment tools that do not suffer from such flaws, however, is more complex. In this case, looking at the broader context in which an instrument was created and used is just as important as examining the instrument itself.


Two Sides of the Same Coin

In 2016, research done by ProPublica found that the Correctional Offender Management Profiling for Alternative Sanctions (COMPAS) tool had higher false positive rates for African Americans than whites. In other words, "Blacks were twice as likely as whites to have been misclassified as medium or high risk by the COMPAS … so the courts' use of COMPAS to inform decisions would inappropriately subject Black defendants to harsh treatment more often than white defendants," Skeem says.

But COMPAS's owner, Equivant (formerly Northpointe), as well as a host of academic researchers, pushed back on claims that the tool was biased. They noted that the same percentage of Blacks and whites COMPAS classified as high risk went on to reoffend. In other words, the tool is equally predictive of recidivism regardless of race.

The thing is, the different findings are two sides of the same coin. Because Black and white defendants have different rates of rearrest in the underlying data, COMPAS cannot be equally well calibrated for both groups without producing a higher false positive rate for one of them. Skeem calls it a "mathematical problem," explaining that "when Black defendants have a higher rate of reoffending than white defendants … we're always going to have a higher rate of false positives for Black than white defendants."

And this problem is not exclusive to COMPAS. "There is often a trade-off inherent in the task of prediction. Whether we predict recidivism based on our human judgment alone or based on an instrument, we face the same mathematical problem," Skeem says.
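The arithmetic behind this trade-off can be made concrete. The short sketch below uses the algebraic identity relating a group's base rate of reoffending, a tool's precision (PPV) and its detection rate (TPR) to the false positive rate the tool must produce. All of the numbers are hypothetical, chosen only to illustrate why equal calibration plus unequal base rates forces unequal error rates; they are not COMPAS figures.

```python
def false_positive_rate(base_rate: float, ppv: float, tpr: float) -> float:
    """False positive rate implied by a group's base rate of reoffending,
    the tool's precision (PPV) and its detection rate (TPR).

    Derivation: with N people in a group, true positives = TPR * base_rate * N,
    total flagged = true positives / PPV, so false positives = flagged - true
    positives, divided by the (1 - base_rate) * N people who do not reoffend.
    """
    return tpr * base_rate * (1 - ppv) / (ppv * (1 - base_rate))

# Hypothetical example: both groups get identical calibration (PPV = 0.6)
# and identical detection (TPR = 0.7); only the base rate differs.
ppv, tpr = 0.6, 0.7
fpr_high = false_positive_rate(0.50, ppv, tpr)  # group with a 50% rearrest rate
fpr_low = false_positive_rate(0.25, ppv, tpr)   # group with a 25% rearrest rate
print(round(fpr_high, 3), round(fpr_low, 3))    # prints: 0.467 0.156
```

With precision and detection held exactly equal, the group with the higher base rate ends up with roughly three times the false positive rate in this example, the same qualitative pattern ProPublica reported.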

Put another way: the incarceration and rearrest rate among African Americans is disproportionately high when compared to other demographics in the U.S., in significant part due to systemic racism, i.e., the disparate treatment they receive at the hands of law enforcement and in the criminal justice system. And because predictions, whether done by machine or human, extrapolate from historical data, it's likely that more African Americans will be predicted to reoffend than actually do.

Essentially, the data used in these risk assessment tools "is not neutral," Doyle says. "It's a record of police activity and not people's activity. When you use that to evaluate people's risk, whatever police do will distort your system."


Balancing Fairness and Accuracy

To be sure, the way Equivant and researchers define fairness—i.e., as predicting an outcome with the same accuracy across racial groups—is standard in many assessment practices.

"It's the definition of fairness that's consistent with standards for educational and psychological testing, which really focuses on predictive fairness," Skeem says.

But some believe it's time for a change. "Just because something is the standard does not mean it is just," says Meredith Broussard, a data journalism professor at the Arthur L. Carter Journalism Institute at New York University.

"We know in America, white and Black people across populations aren't similarly situated," Harvard Law's Doyle adds. "That has to inform our awareness of fairness."

Changing the standard, however, also means changing tools' predictive accuracy. "If you try to equalize false positive rates, you may find that your calibration suffers, you're going to misclassify people in terms of their likelihood of reoffending. But if you have really good calibration, you're going to have unbalanced error rates, and that's really the conundrum," Skeem says.

Changing which risk factors assessment tools consider also runs into the same problem. But some argue it's necessary, given that certain factors, such as criminal history, education and employment, can act as "race proxies" due to the disparate experiences and opportunities people of color have in the U.S.

However, Skeem says that when she studied whether criminal history acts as a proxy, she found "a weak association" between criminal history and race. What's more, criminal history "was generally predictive of reoffending, including violent reoffending" for both white and Black defendants. Still, she adds that "any association is of some concern."

There are, however, opportunities to change how criminal history is assessed to make predictions more race neutral. Skeem notes that, where possible, tools should avoid criteria that are affected by the differential treatment African Americans receive in the criminal justice system.

"A really good example is arrest for a drug offense. We know that policing patterns make it such that Blacks are much more likely to be arrested for drug offenses than whites, even though there isn't much difference at the behavioral level and in terms of rates of drug use, etc. If we use a biased outcome like that and train our algorithm to predict it, then we would be concerned about a biased instrument, because that's bias begetting bias."

What's more, "historically we always looked at age at first arrest," says James Bonta, a consultant for corrections and criminal behavior who has worked with assessment tool developer Multi-Health Systems (MHS). "What we know now is that if we do that, it can potentially create bias. And so lots of the newer tools have removed that and look at other factors" such as age at first conviction.

Skeem believes that ultimately, those in the jurisdiction in which a risk assessment tool is used should decide which risk factors it considers. But she cautions that while risk factors can be taken out or added, which ones actually predict criminal justice outcomes, and how well, is not something communities can set themselves.

"A group of stakeholders will have to make decisions based on values, about what trade-offs they are willing to operationalize between social justice and racial justice [on one hand] and crime prevention on the other. That's where the community really needs to weigh in. … But that doesn't amount to dictating what actually predicts the type of reoffending that you want to prevent."


Are They Worth It?

Balancing how risk assessment tools handle bias with how accurately they predict outcomes can be a tricky endeavor. But is it worth all the effort in the first place?

For proponents of risk assessment, the value of using these instruments is clear: Well-validated tools not only allow for more accurate criminal justice decisions, but also provide transparency into—and by extension, an opportunity to change—what has been thus far a hidden decision-making process.

"The reality is that risk-based decisions are made every day by police, probation officers, judges, and parole officers/commissioners who may or may not be biased—we cannot know, because their decision-making is opaque," Skeem says. "Largely without the aid of risk assessment, we are in a place where a young Black man is about six times more likely to be incarcerated than a young white man."

She adds, "Risk assessment instruments are not a single entity—there are better and worse versions; [but] data can be used to identify and even correct biases when they exist."

D'Amora, at the Council of State Governments Justice Center, also believes that "risk assessment tools can act as a canary in a coal mine."

He explains, "When I look at results, for example, say from two different counties and they're both using validated tools, but I'm seeing much higher rates of arrest or risk levels among one group versus another, it makes me now want to look [at] what's happening in that county. Do I have a problem in terms of what's happening with law enforcement? Do I have a problem [with] what's happening with the courts?"

However, others see risk assessment tools not as a solution, but as part of the problem. "It still contributes to institutional racism to generate these scores in the first place. It's a waste of public dollars, it's unnecessarily punitive and it is one of the many things we need to reform in the criminal justice system," says Broussard at New York University.

The focus on risk assessment tools, she adds, is part of what she calls "tech chauvinism," which essentially boils down to "building tools to help the world. [But] the thing is, we have plenty of tools already, we just need to apply the ones we have. We don't necessarily need new ones."

As examples, she points to restorative justice efforts and to putting resources into cutting down crime by addressing its root causes.

Others have said that if local criminal justice systems continue to use risk assessment tools, then at the very least, there needs to be more open discussions about how they work.

"We probably have to do a better job of communicating about these tools," Doyle says. He notes, however, that some jurisdictions, such as Pennsylvania and Massachusetts, have allowed for community involvement and transparency in their risk assessment roll-outs.

Mark Bergstrom, executive director of the Pennsylvania Commission on Sentencing, says that for the state's Sentence Risk Assessment Instrument, "we did a lot of research over a long period of time, and we tried to make sure we were publishing what we did. We would write papers and post [them] on our website and make it available to others. We are engaged with different groups."

The state also involved local communities in the tool's design, which led to the instrument using prior convictions as a risk factor instead of prior arrests, though the latter was "stronger statistically," Bergstrom says.

He adds, "The bottom line is, you have to be transparent even if you're going to get negative feedback … at the end we came up with a better product. Even though at times it was difficult, [it was] a necessary process."

Still, not all were pleased. "One of the main things I want to emphasize is [what] the Commission on Sentencing got from the community was the overarching feeling that the community absolutely hates risk assessment," says Troy Wilson, a Philadelphia-based attorney. "Risk assessment is a direct pipeline to mass incarceration. The community gave their input rather reluctantly, but we had to as a community or what was eventually pushed through would have been more racist than what they originally tried to push through."