Same Score, Different Impact: States Can Decide Who Assessment Tech Deems 'High Risk'
Implemented in different jurisdictions, the same risk assessment tool can yield vastly different results. While a part of that is by design, some of it also comes down to a jurisdiction's preferences and tolerance for error.
July 15, 2020 at 07:00 AM
The promise of risk assessment tools is the promise of data-driven decision-making: less reliance on personal whims and prejudices, and more uniform results. But while data-driven decision-making is possible, uniformity is another story.
Assessment tools in use across the U.S. assign scores to defendants or convicted offenders based on the likelihood they'll recidivate, fail to appear at a pretrial hearing, or commit a crime before that hearing.
Certain scores correspond to low-, medium- or high-risk designations, and in the case of Risk and Need Assessment (RNA) tools, also highlight what factors contribute to that risk. Scores are calculated from multiple predictive factors—often concerning a person's characteristic traits, socioeconomic situation, and criminal history—which researchers test to confirm they accurately predict recidivism or pretrial outcomes.
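In practice, many of these instruments work like additive checklists: each item contributes points, and the total score falls into a risk band. The sketch below is purely illustrative—the item names, point values, and cutoffs are hypothetical and do not come from ORAS, VPRAI, LSI-R, or any other real instrument.

```python
# Illustrative only: hypothetical items, point values, and cutoffs,
# not those of any actual risk assessment instrument.

# Each assessment item contributes points toward a total score.
HYPOTHETICAL_ITEMS = {
    "prior_convictions_3_plus": 2,
    "age_under_25_at_first_arrest": 1,
    "currently_unemployed": 1,
    "history_of_drug_use": 2,
}

# Score ranges mapped to risk levels -- set per jurisdiction during validation.
HYPOTHETICAL_CUTOFFS = [(0, 2, "low"), (3, 4, "medium"), (5, 6, "high")]


def score_defendant(answers: dict) -> tuple[int, str]:
    """Sum the points for every item marked True, then map the total to a risk level."""
    total = sum(points for item, points in HYPOTHETICAL_ITEMS.items() if answers.get(item))
    for low, high, level in HYPOTHETICAL_CUTOFFS:
        if low <= total <= high:
            return total, level
    return total, "high"  # anything above the last band defaults to high


print(score_defendant({"prior_convictions_3_plus": True, "currently_unemployed": True}))
# -> (3, 'medium')
```

The interesting policy questions in the rest of this story are about the second half of that sketch: which items are counted, how they're weighted, and where the band boundaries sit.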
What constitutes "accurately," however, varies across the country. How well or poorly a particular assessment tool predicts outcomes isn't universal, but instead depends on the group of people for whom it is predicting, and the level of error individual jurisdictions will allow.
"There is no industrywide standard level of minimum accuracy—I've never seen it," says Dr. James Bonata, a consultant for corrections and criminal behavior who has worked with assessment tool developer Multi-Health Systems (MHS). He adds, "In terms of dealing with false positives and errors, there's no standard practice."
The process of testing how accurately a tool predicts outcomes is known as validating an instrument. Validations are performed by a variety of entities, including independent researchers, universities, private and nonprofit organizations, and state agencies, among others. Oftentimes, assessment tool developers themselves will validate their own product for states, though some jurisdictions will instead use independent parties.
While most, if not all, validations test how well certain factors predict criminal justice outcomes, they can also determine whether an instrument erroneously assigns higher risk to a specific racial group or gender.
What's more, validations can allow jurisdictions to set their own thresholds for what scores constitute low-, medium-, and high-risk levels, and decide what factors an assessment tool considers and how it weighs them in its scoring. This is usually done to ensure an instrument predicts well for the population on whom it's being used, particularly if it's a third-party assessment tool initially developed and validated on another population.
Still, validations don't always happen as expected. Some jurisdictions that lack criminal justice outcome data, for instance, will implement a third-party tool without first testing it on their own population, with the aim of doing so after a few years of tracking local criminal justice outcomes. States also differ in how often they revalidate tools to confirm the instruments still work as intended, a necessity given demographic changes and new research findings. Some states require revalidation every few years by law; in others, its timing can depend as much on available resources as need.
Why Risk, and Accuracy, Are Relative
One common misconception about assessment tools is that they predict an individual's recidivism or pretrial risk.
In reality, "they're not predicting [an] individual, they're predicting a group," says Dr. Edward Latessa, a professor and director of the School of Criminal Justice at the University of Cincinnati, who helped develop the Ohio Risk Assessment System (ORAS).
He adds, "A judge thinks if you're standing in front of them and you scored low risk, you're not going to [recidivate]. No—that just means you're in a group with a low percentage of failure."
To be sure, no assessment tool predicts outcomes for groups with total accuracy. There's always a chance of misclassification at every risk level—for example, someone who scores as low risk but goes on to recidivate or commit a crime before a pretrial hearing, or vice versa.
How much error is associated with each risk level depends in large part on the data collected during a validation study, which will show how scores from a specific selection of people correlate to criminal justice outcomes.
"When you look at the data, you have this distribution of scores and you have a distribution of outcomes and it's never perfectly linear," Latessa explains. "And so you start to look at the data to see where are the best cutoffs because not every low-risk person makes it and not every high-risk person [recidivates]."
In some cases, the data itself highlights cutoffs for different risk levels. "If the data says that anybody scores less than a '10' has a 15% or lower recidivism rate and anyone that scores over a '10' has a 40% recidivism rate, that's a real easy one," he adds.
In that scenario, a jurisdiction could then deem anyone who scores less than a "10" as low risk, in which case the low-risk group would have a failure rate of at most 15% (i.e., at most 15% of low-risk-designated people would be expected to reoffend).
But the numbers aren't always so clear-cut, and a jurisdiction can adjust cutoffs to its liking, but only if the data supports it. "If they say to me, 'We want a 10% failure rate,' I might say to them, 'Great, but you don't have anyone in that group—you don't have a low-risk group,'" Latessa says.
He adds, "If [they're] looking to go higher, in other words, increase the score, but increase your failure rate, sometimes you can do that with your instrument, sometimes you can't."
Why jurisdictions would want to adjust risk level thresholds comes down to how they use assessment tools to allocate resources.
If a jurisdiction's high-risk cutoff for parolees has a 10% error rate, then one in 10 offenders designated high risk will likely not recidivate, but will still receive the same level of parole supervision and programming as the other nine. If a high-risk threshold has a 20% error rate, more resources will be spent on those who do not need them. But because more people may be classified as high risk overall, the total number of offenders receiving supervision and programming, including the number who do need it, will increase.
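A back-of-the-envelope sketch of that trade-off, using made-up numbers rather than data from any jurisdiction:

```python
# Hypothetical numbers only, to illustrate the supervision trade-off described above.

def supervision_tradeoff(n_high_risk: int, error_rate: float) -> tuple[int, int]:
    """Return (people supervised who need it, people supervised unnecessarily)."""
    unnecessary = round(n_high_risk * error_rate)
    return n_high_risk - unnecessary, unnecessary

# A stricter cutoff: 100 people labeled high risk, 10% of whom won't recidivate.
print(supervision_tradeoff(100, 0.10))   # -> (90, 10)

# A looser cutoff: 150 people labeled high risk, 20% of whom won't recidivate.
print(supervision_tradeoff(150, 0.20))   # -> (120, 30)
```

Under the looser cutoff, more people who genuinely need supervision receive it (120 versus 90), but three times as many resources go to people who do not (30 versus 10)—exactly the balancing act jurisdictions weigh when they move a threshold.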
Decisions about risk thresholds and related resource allocation have to be made for each criminal justice population. After all, risk scores and their related failure rates are specific to particular populations, not just within a jurisdiction, but within different parts of the criminal justice system as well.
"There is nothing agreed upon as to what constitutes high risk," Latessa says. "So, for example, if you're looking at a pretrial sample, a high-risk group may have a 20% failure rate. That would never be high risk for a parole group just because you're looking at different follow-up periods, different type of offenders."
The Local Touch
Validations not only allow jurisdictions to adjust risk level cutoffs, but also to determine what factors assessment tools weigh when evaluating risk. Risk factors are changed to ensure a tool accounts for a locality's specific characteristics. Oakland County, Michigan, for instance, calibrated its Virginia Pretrial Risk Assessment Instrument (VPRAI)—which it renamed Praxis—in 2009 and 2012, after validations by Dr. Marie VanNostrand, the tool's creator and founder of pretrial justice consulting firm Luminosity.
Originally the tool looked at the length of time a defendant had spent at their current residence. Staying at a residence less than a year was one of the factors that could drive up a person's likelihood of failing to appear at a pretrial hearing.
Due to the economic situation in the state at the time, however, Oakland County, Michigan, took that risk factor out. Barbara Hankey, manager of Oakland County Community Corrections, explains that after the Great Recession, "what we were seeing was people who had worked their whole careers, 25-plus years, at their automotive companies were losing their pensions and homes. We didn't think people should be penalized for that."
Similar to Oakland County, the state of Texas also calibrated its ORAS tool after a validation. According to a 2017 validation study by officials from the Harris County Community Supervision & Corrections Department and Latessa, the state adjusted seven items from the original ORAS tool, including giving more weight to illegal drug use and its impact in the scoring.
Like many other states, Texas collected criminal justice outcome data and validated its tool before putting it into use. But not all states will wait. In fact, some opt to implement a third-party tool right out of the gate. "It's not uncommon for a state to [first] adopt a tool and collect data and then go through the validation process," Latessa says.
Legaltech News found that Illinois, Missouri and Montana did not validate their ORAS deployments on their specific population before implementation.
For Illinois, the choice was simple. "If we had to wait from implementation for it [to be] validated in Illinois, that would have delayed the roll out by several years," says Chris Bonjean, director of communications for the Illinois Supreme Court.
For other states, however, validation before implementation is essential. "One of the most important things about a risk assessment of any kind is that it's validated on the population it's going to be used for," says Dr. Teresa May, department director of the Community Supervision & Corrections Department in Harris County, Texas. She adds that Texas revalidated the ORAS tool for its own population because it is demographically different from Ohio's population, on which the tool was originally validated.
Still, May notes that implementing the ORAS locally before a validation may be acceptable if a state's population is similar to Ohio's. "The best practice generally is to revalidate on your population. But I could see if the demographics are very similar, and if they did the training well, and continue to administer the tool as it was designed to [be administered], and they're doing it right, with integrity, it may be OK."
The Road to Revalidation
To be sure, it can take a considerable amount of time and resources to collect, let alone test, the criminal justice outcome data needed for a validation.
Kim Bushey, program services director at the Vermont Department of Corrections, notes that validating a tool can require "at least three years of data." And it may take longer depending on how much data states can obtain over time. "We're a small state so sometimes it takes more than three years."
Validation is also rarely, if ever, a one-time process. Tara Blair, executive officer for the Kentucky Administrative Office of the Courts, says, "It's industry practice that you review your tool every three to five years … [though] there's nothing set in stone."
Some states, however, do codify revalidation timelines into law. Georgia, for instance, mandates that its state-built assessment tool for prison programming, called The Next Generation Assessment (NGA), be revalidated every five years. Colorado does the same for its Colorado Actuarial Risk Assessment Scale (CARAS) instrument, which it uses for parolees. Similarly, Idaho requires that its third-party tool, Level of Service Inventory Revised (LSI-R), be revalidated every five years by an outside researcher not affiliated with tool developer MHS.
But in other jurisdictions, when a revalidation occurs depends on when resources are made available. Oakland County, for instance, knows its Praxis tool is overdue for a validation. But while many other counties in the state use the tool, Michigan's state government is focused on a program piloting another pretrial risk assessment instrument, the Public Safety Assessment (PSA). "So we're working on [getting it validated], the big question is funding … the county is going to foot the bill for the validation because the state went down the road of the PSA," says Oakland County Community Corrections' Hankey.
She notes Oakland County is currently exploring a few options, including grants, a partnership with Macomb County, Michigan—whereby the two counties split the cost—or having Wayne State University in Detroit procure funding for and perform the validation.
Whatever the path, it's clear that it's been too long since the 2012 validation. "It needs to be done, particularly in light of some of the controversy or the assertions that are being made about risk assessments, in that they cause racial disparity," Hankey says. "We sort of want to put that to the test. Most of the research that is out there indicates that that's not the case, that they don't cause racial disparity, but I want to be able to know that."
Tomorrow, we'll examine the various ways states train their staff on risk assessment tools and ensure the integrity of the data entered into these instruments. We'll also look at the wide discretion judges have to consider risk assessment scores in their rulings, and how that is both limiting and expanding the impact these tools have in courtrooms around the nation.