Data-Based Decisionmaking Works Great, Til Someone Cheats

There’s a long list of government agencies that have fudged numbers in misleading ways.

New York City changed how it measures critical incidents among the homeless in a way that is potentially misleading. (Shutterstock)
Over the years, we’ve written often about performance-informed government and the reluctance of managers or elected officials to use that data to make decisions -- mainly, they say, because they don’t trust it. We’ve tended to cast this claim as an excuse to ignore data-based reality in favor of politically popular decisions.

More recently, even though we’re still strong supporters of the move to make decisions based on measures, we’ve begun to get a little skeptical about the validity of some measures and concerned about the way government officials sometimes misuse the data -- inadvertently or, on occasion, intentionally. As Kip Memmott, the audit director for the Oregon Secretary of State, tells us, “Reporting that information is critical and important, but the information can be a lot of smoke and mirrors.”

Sometimes, that’s done purposefully. You could call it cheating. For example, when Cynthia Eisenhauer, a now-retired expert in government performance measurement, was giving customer service training sessions in New York City, she discovered that the staff person assigned to work with her threw away the results from the people who had given the lowest scores.

There was the well-known case in Atlanta a few years ago, when 11 teachers were convicted by a jury of altering student scores on standardized tests. According to media reports at the time, teachers and administrators were under pressure to hit certain score targets or risk termination.

A few years back, the Ohio auditor told us this story: “If there’s an oil spill and fish die, you send a biologist out to count dead critters. They get a [certain] amount for each dead critter [they find], and they send you the bill. I’m not saying anyone is cheating, but how can you possibly consider those numbers reliable?”

Then there’s the incident at the New York City Department of Homeless Services, which in 2017 experienced a sudden drop in “critical incidents,” such as fights or weapons possession, at its Bedford-Atlantic Armory shelter. When the Daily News looked into the “surprising” results, it found the real reason for this ostensibly good news: The mayor’s office had changed the definition of critical incidents to count firearms possession but not other weapons commonly found in shelters, such as shivs, steak knives and locks in socks. In fact, even when an arrest is made, the incident doesn’t necessarily show up in the city’s performance measures unless the arrest was for a particularly serious offense.

Officials at the Department of Homeless Services believed that the numbers the department was using before the definitional change exaggerated concerns about safety. But others see the situation differently. “Though I can only state surely that the numbers are underreported,” says Gregory Floyd, president of Teamsters Local 237, which represents employees in the department, “my speculation is that the city is underreporting numbers so it can show that the shelters are safer than they are.”

One of the areas in which measurements frequently leave misleading impressions is emergency response time. David Ammons, professor of public administration at the University of North Carolina at Chapel Hill, tells us that when local governments report their average emergency response time, citizens and members of the city council tend to assume it measures the time from when a citizen calls 911 until the emergency unit arrives on the scene.

But in fact, the stopwatch may start when emergency units are dispatched, not when the calls come in. That means citizens and council members may believe the emergency response time in their city is around six minutes when it may actually be eight minutes once the time between the call and dispatch is figured in. Fire chiefs say they use this measure because their departments don’t directly control dispatch, so they don’t want those minutes included in the measure. “But if they’re doing that,” Ammons says, “then there may be a problem with dispatch and the city council and the citizenry are oblivious to that.”
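
To make the gap concrete, here is a minimal sketch -- in Python, with invented timestamps rather than data from any actual fire department -- of how the same two incidents yield a six-minute average if the clock starts at dispatch and an eight-minute average if it starts at the 911 call.

```python
from datetime import datetime, timedelta

# Hypothetical incidents: (911 call received, units dispatched, units on scene).
# The timestamps are made up purely to illustrate the measurement gap.
incidents = [
    (datetime(2019, 5, 1, 9, 0), datetime(2019, 5, 1, 9, 2), datetime(2019, 5, 1, 9, 8)),
    (datetime(2019, 5, 1, 13, 30), datetime(2019, 5, 1, 13, 32), datetime(2019, 5, 1, 13, 38)),
]

def avg_minutes(pairs):
    """Average elapsed time across (start, end) pairs, in minutes."""
    pairs = list(pairs)
    total = sum((end - start for start, end in pairs), timedelta())
    return total.total_seconds() / 60 / len(pairs)

# Clock starts at dispatch: the figure the department reports.
print(avg_minutes((dispatch, scene) for call, dispatch, scene in incidents))  # 6.0
# Clock starts at the 911 call: the figure citizens assume they are hearing.
print(avg_minutes((call, scene) for call, dispatch, scene in incidents))      # 8.0
```

The arithmetic itself is trivial; the point is that “response time” is only as meaningful as the definition of when the clock starts, and that definition is rarely reported alongside the number.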

There are instances in which the data are inadvertently misleading and create an unfortunate impression. Phoenix, for instance, is known for its high rate of reported bias-related crimes, which makes the city look bad in comparisons with other communities and in benchmarking exercises. But in reality, the Phoenix police department is one of the few agencies in the country with a dedicated Bias Crimes Unit, and officers are required to file a report whenever they suspect bias. As a result, more cases are identified, investigated and reported.

Unfortunately, unlike financial data, which are certified through the audit process, performance measures are rarely validated in any scientific way. As a result, fallacious or misleading data can pass from an agency to the legislature or the public without the kind of confirmation that inspires confidence.

All of this is troubling because, as Sharon Erickson, the city auditor of San Jose, Calif., says, “The basic premise of American democracy is that people will trust their government, if their government provides accurate information.”

We still believe that measuring and disclosing data about government operations is useful and meaningful. But we also know that much more attention needs to be paid to the data behind the measures. 
