Democratizing Data

The District of Columbia is making practically all of its data available to the public. The effort to promote transparency and trust is generating excitement.
July 29, 2009
By Jerry Mechling  |  Contributor
A consultant and former faculty member of the Harvard Kennedy School

Need to improve government transparency and trust? Why not adopt the D.C. Data Feeds philosophy: Take all your data, hold back only those elements that would compromise privacy or security, and release the rest in computer-readable forms to the public?

The Information Age has made oceans of personal and governmental data available -- details of ATM and purchasing transactions, GPS readings showing where your car is now or has been, and video of everything from candidates saying things they wish they hadn't to all interactions between police and suspected felons. Far more data is collected today than even a decade ago.

Governments have used this data primarily for internal decision-making. They've created more finely grained financial management systems, more aggressive operational analysis (see the original New York City CompStat program and a variety of derivative CitiStat programs), and reports distributed to the public via Web sites and other channels. Standardizing and sharing data has been a major theme, but progress has been slow as agencies fight to maintain their uniqueness and independence. Data sharing is growing more popular, if rather less quickly than hoped.

What governments had not done before the recent D.C. Data Feeds program, and the federal and other efforts that followed it, is make nearly all of their unfiltered data directly available to the public. Some jurisdictions had made crime information available (ChicagoCrime was one of the first of these). Weather was another "data feed" widely accessible in real time.

What Washington, D.C., did differently was to "democratize data." Unless overridden by privacy or security concerns, the city decided to make all government data freely available to the public. Only in the District can you get all transactions on all the purchasing cards used by the city, all the time-reporting and invoicing data from vendors on IT staff-augmentation contracts, and data from hundreds of other sources collected routinely by city agencies.

Three elements in combination prompted the D.C. program:

o A need to collect and share data internally for the District's CapStat program (a classic performance measurement and improvement program).

o A need to improve not only productivity, but also the interrelated goals of transparency, accountability, civic engagement, and trust.

o The capability to store and communicate data more cost-effectively given the growth of digital data and networks.

Since the District was already moving to collect and share data via CapStat, why couldn't it share the same data directly with the public? For one thing, citizens can make better personal decisions when they possess information such as the sale prices of homes in their neighborhoods, the locations of parking meters and the types of coins they take, and crime concentrations that might indicate unsafe neighborhoods for walking at night. Moreover, releasing this data might begin to show that -- as should be true in any democracy -- government welcomes oversight of what it's doing.

Starting with a few databases, the D.C. program now makes more than 200 data feeds publicly available, including building permits, business licenses, crime, housing-code enforcement, liquor-license applications, construction projects, purchase orders and public-space permits. The program drew public attention to the data by running an "Apps for Democracy" contest, gathering citizen-made applications that made creative use of the data. The result was 47 applications whose value, based on what it would have cost the city to build them in-house, was estimated at $2.6 million. (To explore these applications, see the D.C. Digital Public Square.)

By making data available and useful, the program adds value for citizens, cuts the cost of Freedom of Information Act requests (because data no longer has to be assembled for individual requests after the fact) and, perhaps most importantly, shows that the city has nothing to hide. We know from Gallup and other polls that most people don't trust government to do the right thing most of the time, so there is clearly much room to improve trust in jurisdictions beyond the District of Columbia.

The D.C. approach was pushed by Mayor Adrian Fenty and Vivek Kundra, then the city's chief technology officer. When Kundra became chief information officer for the Obama administration, he initiated similar federal efforts. A key thrust that the Obama team carried from the campaign into the executive branch is the use of technology for transparency and civic engagement. Making data visible is critical for both.

It's too early to know what the ultimate impacts of "democratizing data" will be. Regardless, it is already clear that the idea is generating excitement. Political and technology leaders alike are now paying close attention to transparency and civic-engagement initiatives. Releasing data and making it analyzable could eventually become more important than any other technology-enabled innovation to date, reaching agencies and jurisdictions across the U.S. and around the globe.

(Would "democratizing data" be good for you? The D.C. Data Feeds program is a finalist in the Harvard Kennedy School's Innovations in American Government Awards program this year. Personally, I'm very interested in understanding whether and how the Data Feeds program (and similar programs) could and should be disseminated to other locations. If moves to democratize data interest you, please e-mail me.)