
Susanna Hall
IST 618: Survey of Telecommunication & Information Policy
Professor Raed Sharif
August 11, 2011

Final Paper

To what extent is Data.gov's public sector information valuable for the U.S. general public?

What is public sector information?

Public sector information (PSI) is information (content and data) funded, held, and produced by the United States federal government. Section 105 of the U.S. Copyright Act specifies the status of PSI: nearly all works produced by the federal government automatically enter the public domain and are not protected by copyright. This policy framework does not use intellectual property restrictions or cost-recovery policies (charging the public for the cost of publishing and distributing PSI) to hinder public access to and re-use of PSI. Therefore, information created by the U.S. federal government is free to use for any purpose. The benefits of this policy include supporting the transparency of government activities; promoting democratic ideals; encouraging the dissemination of information to enhance public health, safety, and general welfare; and supporting scientific and technical research functions (Uhlir, 2004).

Even though the U.S. Copyright Act frees PSI from copyright restrictions, federal departments and agencies have historically been inclined towards secrecy and have often restricted data by classifying it (Lakhani, Austin, & Yi, 2010). In addition, information managers have not been hired and data distribution systems have long been given low priority, leaving them nonexistent or unreliable. Moreover, the Copyright Act exempts only "nearly all" works: copyright protections do apply to works that are not directly produced by federal government employees or officers, such as acquired works, works produced by contractors, or works funded by grants (Linksvayer, 2011).

Recent years have seen a growing desire to make the U.S. government more accessible and transparent and to modernize PSI dissemination to the public. In a new world of Linux open source software, Wikipedia, social media, Web 2.0, and smartphone use, the open government and government 2.0 movements have emerged alongside President Obama's administration, which has brought a new sense of urgency to this issue. On his first full day in office, January 21, 2009, Obama launched the ambitious Open Government Initiative. His memorandum established three objectives:

1. Government should be transparent.
2. Government should be participatory.
3. Government should be collaborative.
(Lakhani et al., 2010)

A key part of the Obama Administration's Open Government Initiative has been the creation of the website Data.gov.

What is Data.gov?

The Data.gov initiative was established in March 2009 by the Federal Chief Information Officers Council and the E-Government and Information Technology Office at the Office of Management and Budget. Vivek Kundra was appointed as the first Chief Information Officer (CIO) of the United States and was charged with building Data.gov from the ground up. At a time when the government had 24,000 .gov domains that translated into 30 million web pages, Kundra aimed to create one web address that would centralize all PSI for simple and efficient, open and unrestricted public access. Data.gov would move as much raw government data to the web as possible without compromising national security or privacy and would encourage the general public to use the data in innovative ways. Guiding principles for this digital public sphere included:

Focus on access: strengthen democratic institutions through a transparent, collaborative, and participatory platform that makes data more discoverable. The intended audience included policy analysts, researchers, application developers, nonprofit organizations, entrepreneurs, and the general public, and the hope was that data visualizations and mash-ups would be created that could solve problems faced by the general public.
Open platform: modular architecture and API developed with agency and public input.
User feedback: used to set priorities and determine the most valuable datasets.
Shared responsibility: divided among agency program executives and data stewards to ensure information quality and security, protect privacy, and use best practices for information management.
(Office of E-Government and IT et al., 2010)

The site was launched to the public on May 21, 2009, but its development was not without debate. The main controversy concerned the format in which the data would be presented. Wholesale data is raw data, whereas retail data has been aggregated, formatted into displays, and explained, contextualized, and interpreted for the general public. Many agencies worried that presenting raw data to the public would result in data misinterpretations, the use of harmless data to create harmful data, and other adverse consequences. However, Kundra firmly believed in the wholesale data model and decided to make information quality control a priority in order to try to prevent misuse of the raw data.

Today, Data.gov is a repository of 389,914 searchable and downloadable machine-readable raw datasets from the Executive Branch of the federal government that can be used to build applications, conduct analyses, and perform research (www.data.gov/about). Each submitting department or agency retains version control over its datasets, but there are no controls placed on the end use of the data. There are three catalogs on Data.gov:

1. Raw data catalog: provides instant downloads of 3,506 machine-readable, platform-independent datasets. Clicking on the name of a dataset brings up metadata including an abstract, keywords, citation information, statistical methodology, data quality information, and external links. Download formats include XML, Text/CSV, Keyhole Markup Language and Compressed Keyhole Markup Language (KML/KMZ), ESRI Shapefile (for GIS data), Feeds, and XLS (a Glossary of Terms is available to define these open file formats and freely available apps).
2. Geodata catalog: provides records of geospatial data searchable by category (biology, atmosphere, agriculture, human health, etc.) or agency (EPA, NASA, FWS, etc.).
3. Tool catalog: provides hyperlinks to tools or agency web pages that allow users to mine datasets. It is meant to provide the public with simple, application-driven access to Federal data (http://www.data.gov/faq).

Analysis

It is telling that the intended audience for Data.gov (as stated in the CONOPS document) lists the general public last, after policy analysts, researchers, application developers, non-profit organizations, and entrepreneurs. Although the CONOPS table of core users states that the general public can access the data by downloading datasets, this process seems completely inaccessible to the general user. The first issue is the titles of the datasets: on the Government Apps homepage, for instance, one of the most popular apps is unhelpfully titled "US GAAP RSS Feed of XBRL Financials." The second issue is format: on the Raw Data homepage, one of the most popular datasets, the USGS's realtime "Worldwide M1+ Earthquakes, Past 7 Days," looks fascinating, but the raw text file in the CSV download is not tabulated and is difficult to read, and the KML file can only be opened if the user first downloads a data extraction app and has access to Google Earth (version 4.2 or higher). The tool catalog is not readily linked from the home page (though 2009 screenshots on the outdated "How to Use Data.gov" tutorial page show that a tools icon used to be present), and searches for KMZ and KML tools return about 50 results each, which refer to specific datasets rather than to tutorials or links that could help users download the appropriate data extraction app. Even a tech-savvy general user would likely give up in the face of such an unintuitive search interface. These apps may be free, but they are not easy to find and are not widely used by the general public. Therefore, this data is by no means simple, efficient, or easily discoverable by the general public.

The CONOPS core user table also notes that the general public can discover and access Federal data via third-party visualizations, applications, tools, or data infrastructure. I suspect that the developers of Data.gov understand that this is the real key to data accessibility for the general public, who are used to seeing retail data rather than working with raw data. In fact, for this data to be used most effectively for the public good, Data.gov must rely heavily on users who can be considered middle men: the policy analysts, researchers, application developers, non-profit organizations, and entrepreneurs referenced in the CONOPS table. These are the users at whom this digital public sphere is actually aimed, and these are the technology- and data-savvy users who are most wooed by the Data.gov team (via outreach and collaboration efforts). These users will most likely get the most out of the Data.gov interface, which has recently been redesigned to include Communities in which related datasets are grouped by major issue (such as Restore the Gulf, Energy, Law, Health, and Open Data) so that findability is increased for these middle men. In short, the value of these datasets can best be unlocked by people who actually know how to work with them. In the case studies below, I show two main ways that the Data.gov data has been interpreted by middle men in ways that attempt to benefit the general public.
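To make the gap between wholesale and retail data concrete, here is a minimal sketch in Python of the kind of small step a middle man might take: reading a few raw comma-separated earthquake records and turning them into a short, ranked table. The sample rows, the column names, and the function names are all invented for illustration; they are assumptions and do not reproduce the actual layout of the USGS feed.

```python
import csv
import io

# Hypothetical sample: these rows and column names are invented for
# illustration and are NOT the real layout of the USGS earthquake feed.
SAMPLE_CSV = """Datetime,Lat,Lon,Magnitude,Region
2011-08-10T03:15:00Z,36.1,-120.5,2.4,Central California
2011-08-10T07:42:00Z,-17.9,-178.3,5.1,Fiji Region
2011-08-11T01:05:00Z,61.2,-150.1,3.3,Southern Alaska
"""


def strongest_quakes(raw_text, min_magnitude):
    """Return the rows at or above min_magnitude, strongest first."""
    rows = list(csv.DictReader(io.StringIO(raw_text)))
    selected = [r for r in rows if float(r["Magnitude"]) >= min_magnitude]
    return sorted(selected, key=lambda r: float(r["Magnitude"]), reverse=True)


def format_table(rows):
    """Lay the selected rows out in aligned, human-readable columns."""
    lines = [f"{'Magnitude':<10}{'Region':<22}{'Time (UTC)'}"]
    for r in rows:
        lines.append(f"{r['Magnitude']:<10}{r['Region']:<22}{r['Datetime']}")
    return "\n".join(lines)


if __name__ == "__main__":
    # Show only the events a general reader is likely to care about.
    print(format_table(strongest_quakes(SAMPLE_CSV, 3.0)))
```

Even this small amount of filtering, sorting, and labeling is interpretive work that the raw download leaves to the user, which is why the general public tends to depend on someone else having done it first.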

CASE ONE: Apps for America 2, Quakespotter, and other citizen-developed apps

The Sunlight Foundation is a non-governmental, non-profit advocacy group for government accountability and transparency. In 2009 it organized the Apps for America 2: The Data.gov Challenge contest with the aim of catalyzing the development of useful applications and visualizations to make this information more comprehensible to more people (Lakhani et al., 2010). Sponsored by Google, O'Reilly Media, and TechWeb, the contest offered a first prize of $10,000 to the winning app that used at least one Data.gov dataset, looked great, would be useable over a long period of time, and would help citizens see things they couldn't see before the app existed. Forty-seven apps were created in three months for the contest, and the three winners were Datamasher.org (which mashes up datasets into state-by-state visualizations), Govpulse.org (which searches the Federal Register; the site is currently unavailable), and Quakespotter.org (a desktop app that shows where earthquakes are happening and matches them to relevant tweets). Whereas the Datamasher app seems aimed at data enthusiasts and Govpulse's Federal Register search certainly has a specialist audience in mind, Quakespotter (winner of the bonus prize of $2,500 for best data visualization) seems to be of more immediate interest to a global general public concerned with natural disasters and the safety of family and friends.

The Quakespotter Version 1.0 app downloads quickly and simply to a Mac or PC, and the user opens it to see glowing white dots representing the most recent seismic activity all over the world. Upon clicking on one of the dots, data (location, date, magnitude) from the United States Geological Survey (USGS) realtime dataset "Worldwide M1+ Earthquakes, Past 7 Days" (mentioned above) appears in the lower left corner. Clicking "Search Twitter" brings up more realtime data: five tweets about this specific earthquake. Unfortunately, the app is buggy: the Twitter feed was static and limited and included links that could not be clicked or copied, the URLs given for the USGS pages were bad links, and the app froze several times in quick succession. Furthermore, the Quakespotter.org website is bare bones, with no About, Help, How To, or Contact sections. This app had a lot of potential with its innovative mashup of realtime data from USGS and Twitter, but after the PR blitz of the contest results subsided, it seems to have been neglected rather than developed into a robust, reliable, and useable app.

The Data.gov homepage also includes a link to a showcase of 236 citizen-developed apps, which actually displays only nine apps. A quick survey of five of these apps was disappointing yet again. Two of the sites could not be reached: one was "temporarily unavailable" (FixMyCityDC) and the other "currently overloaded" (Visualizing Community Health Data). Two of the sites provided quick data but in a superficial, non-contextualized way (the Employment Marker Explorer and the National Obesity Comparison Tool), and one (the Plant Hardiness Zone Map) had an esoteric, poorly explained interface overloaded with advertisements. My conclusion? Citizen-developed apps based on Data.gov datasets are being enthusiastically created, but it seems that none of them has enjoyed much ongoing development or funding.[1]

[1] NB: The 51 mobile apps noted on the Data.gov homepage link to the U.S. government's app store. These apps, while created mainly by government agencies, are not officially created using Data.gov datasets.

CASE TWO: The 2009 Forbes list of America's safest cities

This case is mentioned in the Harvard Business School's Data.gov case study as a powerful example that shows only the beginning of what is possible with Data.gov datasets. I imagine it is highlighted because Forbes is one of the U.S.'s top national bi-weekly business magazines, well-known for its lists of the top colleges and richest celebrities. In this 2009 list, included in the magazine's popular lifestyle section, Zack O'Malley Greenburg used federal government information from the Bureau of Labor Statistics (2008 workplace death rates), the National Highway Traffic Safety Administration (2008 traffic death rates), and the Federal Bureau of Investigation (2008 uniform crime report), along with historical natural disaster data from the National Oceanic and Atmospheric Administration, the United States Geological Survey, the Department of Homeland Security, and the Federal Emergency Management Agency. He also used three non-governmental sources: SustainLane.com (urban sustainability rankings), Risk Management Solutions (a private catastrophic risk modeling company), and Sperling's Best Places (run by Bert Sperling). The Harvard Business School report suggests that the government data came exclusively from Data.gov, but it does not provide evidence (say, from an interview with the Forbes reporter), so there is no way of knowing how innovative this approach actually was: Greenburg could have compiled this federal data from long-existing agency websites or even from an old-fashioned almanac. This case is also underwhelming because journalists have long combined various datasets in order to rank cities, people, and businesses. Was this data uniquely available through the Data.gov initiative, or were the Harvard Business School professor-authors looking to highlight the potential value, rather than the actual value, of Data.gov?

Conclusion

President Obama took office in January 2009; Data.gov was created and launched within the following four months; and the Apps for America 2 contest, the Forbes list, and the Harvard Business School case study had all been rolled out by May 2010. This was a heady time within the open government movement, full of the hope and potential of a more transparent federal government. On the occasion of Data.gov's one-year birthday, Vivek Kundra wrote excitedly in The White House Blog that Data.gov had mobilized citizen-developers via an emerging app economy and had spawned a global movement of nations, states, and cities establishing open data platforms. Yet however genuinely thrilling this movement has been within the federal IT community and developer circles, the case studies herein have shown that Data.gov datasets have thus far provided more potential value than actual value to the U.S. general public.

Data.gov has recognized this weakness and continues to push forward. In October 2010, the General Services Administration contracted Socrata, Inc. to move Data.gov from agency servers into the cloud. For this Next Generation Data.gov, Socrata is slated to improve the site as a public utility so that ordinary, non-technical constituencies will be able to find the data (Socrata, 2010). However, today the future of Data.gov is at risk. In April 2011, Congress agreed to a budget cut of roughly 75% for the Electronic Government Fund (from $34 million to $8 million) that is likely to scale back or eliminate Data.gov (Matthews, 2011). The States News Service reported in April that funding is set to run out beginning on April 20 for seven signature websites, including Data.gov, USASpending.gov, and the IT Dashboard (2011). Earlier this month, the nation plunged into further budgetary uncertainty with the debt reduction law approved by Obama and Congress, which will see $900 billion in budget reductions for federal agencies over the next ten years, including their IT budgets (Lipowicz, 2011). Meanwhile, in June, Federal CIO and head of Data.gov Vivek Kundra announced that he will resign from the Obama administration in mid-August and take on a joint appointment at Harvard University's Kennedy School and the Berkman Center for Internet and Society (International Business Times, 2011). He will be replaced by former Microsoft executive Steven VanRoekel, who is beginning his new position with a positive outlook: "Rarely do you get to take over in a place where so much good work has been done and so much momentum is already established with teams charging ahead at full steam" (Lipowicz, 2011). Here's hoping that the momentum of which VanRoekel speaks will continue at full steam so that over the coming years, despite budget cuts, federal government data will become more truly transparent and its accessibility to the general public will be vastly improved.


References

Editorial memo: Proposed budget cuts will wipe out data transparency. (April 1, 2011). States News Service. Retrieved August 11, 2011, from Academic OneFile.

Greenburg, Z. O. (October 26, 2009). America's Safest Cities. Forbes. Retrieved August, 2011, from http://www.forbes.com/2009/10/26/safest-cities-ten-lifestyle-real-estate-metrosmsa.html

Kundra, V. (May 20, 2011). From Data to Apps: Putting Government Information to Work for You. The White House Blog. Retrieved August, 2011, from http://www.whitehouse.gov/blog/2011/05/20/data-apps-putting-government-information-work-you

Lakhani, K., Austin, R., & Yi, Y. (2010). Data.gov Case Study. Harvard Business School. Retrieved August, 2011, from http://www.data.gov/documents/hbs_datagov_case_study.pdf

Linksvayer, M. (2011). License or public domain for public sector information? Creative Commons Blog. Retrieved from https://creativecommons.org/weblog/entry/27895

Lipowicz, A. (August 4, 2011). New federal CIO gets praise, advice from community. Federal Computer Week. Retrieved August, 2011, from http://fcw.com/Articles/2011/08/04/VanRoekel-appointment-as-new-Fed-CIO-met-with-many-cheers-and-a-fewreservations.aspx?Page=2

Matthews, W. (April 13, 2011). Transparency websites hit by budget ax. NextGov.com. Retrieved August 11, 2011, from Academic OneFile.

Office of E-Government and IT and Office of Management and Budget. (2010). Data.gov Concept of Operations Version 1.0 (CONOPS). Retrieved August, 2011, from http://www.data.gov/documents/data_gov_conops_v1.0.pdf

Socrata, Inc. (2010). Socrata welcomes the opportunity to help Data.gov and Federal Agencies deliver universal data access. Retrieved August, 2011, from http://www.socrata.com/newsroom/press-releases/socrata-welcomes-the-opportunity-to-help-data-gov-andfederal-agencies-deliver-universal-data-access/

The Sunlight Foundation. (September 9, 2009). The Sunlight Foundation Names Apps for America2 Winners. Retrieved August, 2011, from http://sunlightfoundation.com/press/releases/2009/09/09/sunlight-names-apps-america2winners/

Uhlir, P. (2004). Policy guidelines for the development and promotion of governmental public domain information. UNESCO, at v-vi. Retrieved August, 2011, from http://unesdoc.unesco.org/images/0013/001373/137363eo.pdf

Vollmer, T. (February 25, 2011). State of Play: Public Sector Information in the United States. European Public Sector Information Platform Topic Report no. 25. Creative Commons. Retrieved August, 2011, from http://www.epsiplus.net/topic_reports/topic_report_no_25_state_of_play_public_sector_information_in_the_united_states

Washington's First CIO, Vivek Kundra to Resign in Mid August. (June 17, 2011). International Business Times, US ed. Retrieved August 11, 2011, from Academic OneFile.

