Addicted To Data

Policy makers are demanding unified databases, but mixing and matching data is more difficult than they think.
by Mark Stencel | November 2006

Statewide voter databases ordered up by Congress after the 2000 election debacle were supposed to bring 21st-century computerized efficiency to the democratic process. Sadly, familiar obstacles got in the way in many states--bad vendors, poor management, insufficient funds, unrealistic deadlines, entrenched bureaucracies.

But state and local election officials also fell victim to "database-itis"--a malady that is reaching the pandemic point in public policy circles. The leading cause is exposure to a well-intentioned but unrealistic belief that simply creating or linking certain government databases will make it easy to share information and solve problems. Unfortunately, a lot of government data are so messy that mixing and matching are more difficult than policy makers appreciate.

Congress is an especially productive incubator for database-itis. In the past two years, lawmakers in Washington have written laws urging or requiring states to maintain databases of sex offenders, prescription drug use and gang activity. And the 2005 REAL ID Act requires states to create interconnected databases to collect, exchange and validate information about driver's license holders and applicants.

Washington also mandated the new statewide voter registration systems, and those systems offer a cautionary case study on database integration. Under a 2002 election law--the Help America Vote Act, or HAVA--each state was ordered to centralize voter lists and cross-check and verify them using information from other government databases, including Social Security numbers, driver's license records and lists of ineligible felons. While that may sound simple enough, it has turned out to be anything but--despite millions of dollars and four years of effort. Most states needed two-year extensions to come close to completing the work, and many had trouble meeting the new January 2006 deadline.

The problems may not end there. In the run-up to this month's general election, some advocates have warned that thousands of eligible voters may have difficulty casting ballots because of minor discrepancies between database records. For example, matching driver's license numbers or the last four digits of a Social Security number against the records of state motor vehicle departments and the Social Security Administration turns out to be tricky. Dates, letters and numbers are easily transposed or recorded incorrectly, addresses and names change, some foreign names are easily misspelled or transliterated in different ways, and some names come in confounding variety--Rob, Bob, Robert, Robby, Bobby. Hyphenated names, names with dashes and double names (Billy Bob or Anne Marie) also can be problematic.
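To make the matching problem concrete, here is a minimal Python sketch of the kind of cleanup a matching program might attempt before comparing two records. It is an illustration only, not how any state's system actually works: the nickname table, the function names and the 0.85 similarity threshold are all assumptions invented for this example.

```python
import re
from difflib import SequenceMatcher

# Hypothetical nickname table for illustration; a real one would be far larger.
NICKNAMES = {"rob": "robert", "bob": "robert", "robby": "robert", "bobby": "robert"}


def normalize_name(name: str) -> str:
    """Lowercase, turn hyphens and punctuation into spaces, map known nicknames."""
    cleaned = re.sub(r"[^\w\s]|_", " ", name.lower())
    tokens = [NICKNAMES.get(tok, tok) for tok in cleaned.split()]
    return " ".join(tokens)


def names_match(a: str, b: str, threshold: float = 0.85) -> bool:
    """Treat two names as a likely match if their normalized forms are similar.

    The threshold is an assumed value, not a recommended setting.
    """
    ratio = SequenceMatcher(None, normalize_name(a), normalize_name(b)).ratio()
    return ratio >= threshold


def is_adjacent_transposition(a: str, b: str) -> bool:
    """True if b equals a with exactly one pair of neighboring characters swapped."""
    if len(a) != len(b) or a == b:
        return False
    diffs = [i for i in range(len(a)) if a[i] != b[i]]
    return (
        len(diffs) == 2
        and diffs[1] == diffs[0] + 1
        and a[diffs[0]] == b[diffs[1]]
        and a[diffs[1]] == b[diffs[0]]
    )


if __name__ == "__main__":
    print(names_match("Bobby Smith", "Robert Smith"))           # nickname variant
    print(names_match("Anne-Marie Jones", "Anne Marie Jones"))  # hyphen vs. space
    print(is_adjacent_transposition("1234", "1243"))            # swapped digits
```

Even a sketch like this shows why the work is hard. A loose threshold wrongly merges different people, a strict one rejects legitimate voters, and no nickname table covers every name. Records that fall in between still need a person to review them.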

New York University's Brennan Center for Justice underscored the challenge in a report that cited a September 2004 test run conducted by the New York City Board of Elections. Of 15,000 voter registration records sent to the state's DMV, 20 percent could not be matched initially with driver's licenses because of typos by city officials, and another 4 percent did not match because of minor errors, such as reversed numbers, made by the voter.

State and local officials have scrambled to avoid such glitches. Software tools can help spot and correct some of the errors, but election officers in many places are looking to lower-tech solutions, such as sending letters and making phone calls to registrants--labor-intensive work that requires a lot of overtime and extra staffing.

One does not have to look far down the policy agenda to find more potential outbreaks of database-itis. In Washington, some lawmakers looking for solutions to the illegal immigration problem have proposed requiring employers to use Social Security records to verify potential employees' eligibility to work in the United States. This would require dramatically expanding a voluntary immigration-enforcement program offered by the Department of Homeland Security and the Social Security Administration. But some business groups complain that the current voluntary program has an error rate as high as 20 percent--"false negatives" that inconvenience employers and employees alike. Nonetheless, two states, Georgia and Colorado, have raced ahead of the feds, enacting laws in 2006 to encourage or require employers to participate in the voluntary program, despite the problematic errors.

Databases are powerful tools. Making full use of them requires that policy makers respect and understand the inherent limits of such technology and adequately account for the time, cost and complexity of correcting human error. It's not that databases are inherently fallible. It's that data are.

Mark Stencel | Former Editor
mailbox@governing.com
