
Big Data in the Public Sector
Selected Applications and Lessons Learned

Institutions for Development Sector
Institutional Capacity of the State Division

DISCUSSION PAPER Nº IDB-DP-483

Authors:
Louisa Tomar
William Guicheney
Hope Kyarisiima
Tinashe Zimani

Coordinators:
Benjamin Roseth
Sebastián Acevedo

October 2016
http://www.iadb.org

Copyright © 2016 Inter-American Development Bank. This work is licensed under a Creative Commons IGO 3.0 Attribution-NonCommercial-NoDerivatives (CC-IGO BY-NC-ND 3.0 IGO) license (http://creativecommons.org/licenses/by-nc-nd/3.0/igo/legalcode) and may be reproduced with attribution to the IDB and for any non-commercial purpose. No derivative work is allowed.

Any dispute related to the use of the works of the IDB that cannot be settled amicably shall be submitted to
arbitration pursuant to the UNCITRAL rules. The use of the IDB's name for any purpose other than for attribution,
and the use of IDB's logo shall be subject to a separate written license agreement between the IDB and the user and
is not authorized as part of this CC-IGO license.

Note that the link provided above includes additional terms and conditions of the license.

The opinions expressed in this publication are those of the authors and do not necessarily reflect the views of the
Inter-American Development Bank, its Board of Directors, or the countries they represent.

Contact: Benjamin Roseth, broseth@iadb.org; Sebastián Acevedo, sacevedo@iadb.org.


ABSTRACT
This paper analyzes different ways in which big data can be leveraged to improve the efficiency and effectiveness of government. It describes five cases where massive and diverse sets of information are gathered, processed, and analyzed in three different policy areas: smart cities, taxation, and citizen security. The cases, compiled from extensive desk research and interviews with leading academics and practitioners in the field of data analytics, have been analyzed from the perspective of public servants interested in big data and thus address both the technical and the institutional aspects of the initiatives. Based on the case studies, a policy guide was built to orient public servants in Latin America and the Caribbean in the implementation of big data initiatives and the promotion of a data ecosystem. The guide covers aspects such as leadership, governance arrangements, regulatory frameworks, data sharing, and privacy, as well as considerations for storing, processing, analyzing, and interpreting data.

JEL Codes: O31; O33

Keywords: big data; innovation; service delivery

Table of Contents

Acronyms
Prologue
EXECUTIVE SUMMARY
INTRODUCTION
BIG DATA DEFINED
METHODOLOGY
SMART CITIES
TAXATION
CITIZEN SECURITY
POLICY GUIDE
REFERENCES

Acronyms
B2B Business to business

B2G Business to government

D4D Data for Development

DAS Domain Awareness System

ECLAC Economic Commission for Latin America and the Caribbean

GDP Gross domestic product

GTP Golden Tax Project

ICT Information and communications technology

IDB Inter-American Development Bank

IS Information systems

ISO International Organization for Standardization

IT Information technology

LAC Latin America and the Caribbean

LSE London School of Economics and Political Science

NYPD New York Police Department

PPP Public–private partnership

RTCC Real-Time Crime Centre

SAT State Administration of Tax

SPED Sistema Público de Escrituração Digital

TfL Transport for London

VAT Value-added tax

Prologue

It is by now no surprise that we live in a world of data. Data are produced in greater quantities and by more sources than ever before and analyzed faster and with greater sophistication than was imaginable just a few years ago. Every day, new tools are created to turn raw data into information, and information into visual representations. The reach and applicability of big data seem limitless.

For leaders in the public sector, however, investments in technology are often as synonymous with “boondoggle” as they are with “progress.” History is riddled with stories of information technology (IT) “upgrades” that end up over budget, behind schedule, and more trouble than they are worth. Because government investments are made with taxpayer dollars and the budgets of many Latin American and Caribbean (LAC) governments are sensitive to the volatility of the global commodities market, IT projects must be undertaken with great care. They must meet a strategic need and must be consistent with policy priorities, adaptable to legal and administrative frameworks, and feasible within capacity constraints.

It is in this complex context that the Institutional Capacity of the State Division of the Inter-American Development Bank (IDB) is studying the topic of big data. This report provides a first glance at a range of applications of big data in the public sector, focusing on three key areas: smart cities, taxation, and citizen security. The mix of developed- and developing-country contexts, with cases from multiple levels of government within the United States, the United Kingdom, China, and Brazil, provides readers with insights from diverse corners of the globe. The report is written for policymakers, taking into account the practical constraints they face and the tradeoffs inherent in investment decisions. While much work to date focuses on the “what” and the “why” of big data, this report aims to tackle the equally important issues of “who” and “how.”

This effort complements a wide range of IDB lending and knowledge-generation activities in support of digital government initiatives throughout the LAC region. From cybersecurity to open government to the digitalization of citizen services, the IDB promotes open, efficient, and effective public sector institutions throughout the region. Big data is one tool with great promise in this ever-evolving challenge. This report gives readers a useful review of the dynamics at play. We wish to thank the London School of Economics and Political Science (LSE) for its ongoing relationship with the IDB, and particularly the Development Management master’s students who are the main authors of the report.

Carlos Santiso
Chief
Institutional Capacity of the State Division
Inter-American Development Bank

EXECUTIVE SUMMARY

This report was commissioned by the Inter-American Development Bank (IDB) as a consultancy project for Development Management Master’s students at the London School of Economics and Political Science (LSE). It seeks to answer the question of whether and how big data can help governments improve policy design and service delivery, with emphasis on identifying keys to success as well as primary obstacles. The report identifies leading examples of big data’s current uses by national and local governments in the areas of smart cities, taxation, and citizen security, each of which represents a policy space relevant to countries in the LAC region. The report concludes with a policy guide that identifies how big data can be integrated into public-sector initiatives through strategic policymaking, regulatory improvements, and investments in technology and human capital.

The main contribution of the report is the policy guide, which emphasizes (i) institutional arrangements and (ii) investments in human and physical capital. With insights at both the micro and macro levels, the policy guide is meant to demonstrate the institutional and technical foundations that best facilitate big data’s practical applications in the public sector, with resource constraints in mind.

The case studies highlight big data’s need for innovative and flexible institutional arrangements, given the highly context-specific nature of its integration into public organizations. The opening, co-mingling, and sharing of government data across agencies is essential for creating a foundation from which big data insights can be derived.

In addition, the cases analyzed here indicate that commitment from leadership is the cornerstone of successful implementation of data projects. Management and key decision makers must first establish a clear, comprehensive vision for the use of data that falls within a larger development plan and includes accessible procedures and incentive alignment for creators, analyzers, and users of data. The engagement of key stakeholders inside government is an important factor to ensure (i) access to relevant and timely data and (ii) the cultural transformations needed in the organizations to bring about data-driven decision-making processes. In each of the case studies included in this report, big data was instituted for a particular agency to better achieve its specific objectives, that is, improving public transportation, addressing tax evasion and fraud, and reducing crime.

Finally, the development of data-driven environments in the public sector requires a delicate balance that considers protections from data misuse while not stifling important sharing and innovation. As the cases show, the existence of open data regulations and inter-ministerial data exchange mechanisms is a key condition for the exploitation of big data for policy objectives.

Hosting an effective data environment requires investment in technical tools, such as the cloud or data warehouses, where information can be stored and used by creators and consumers of the data. In many cases, government data that is already being stored requires specific sharing practices and regulations.

The human capital required to implement the technical tasks should be familiar with the strategic vision of the data usage and have a background in data science as well as the specific tools being employed. Once the strategic and technical groundwork is laid, establishing common procedures and training civil servants to interact efficiently with the data are fundamental steps to ensure its integration into everyday tasks.

INTRODUCTION

Data are being produced at unprecedented levels globally. By the end of 2015, there were a reported 3.2 billion Internet users and more than 4.6 billion users of mobile phones to communicate and transact (World Bank, 2016). Many innovations have been made toward expanding the technological capacity to generate, store, and analyze data from an array of sources and for a multitude of purposes. Information and communication technology (ICT) tools are faster, more efficient, and increasingly accessible to the poorest and most underserved segments of the world’s population. Individuals, firms, machines, and government agencies produce data at unprecedented rates. Some 2.5 quintillion bytes of data are produced every day, and approximately 90 percent of existing data was produced in the last two years alone (IBM, undated). The ever-increasing data footprint provides a range of possibilities for usage by government.

The researchers set out to determine whether big data could help governments improve policy design and service delivery by identifying leading examples of big data uses in the public sector. Because governments have a range of opportunities to use data, this publication focuses on the key institutional arrangements and technical considerations that facilitate the innumerable options available for data integration. Concurrently, this paper assesses the analytical processes most commonly used by public bodies to capture insights from the data they collect.

ICT for development continues to be a priority policy area for the LAC region. Some countries, such as Brazil and Mexico, continue to close the digital divide, while others are lagging behind. The Economic Commission for Latin America and the Caribbean (ECLAC), through its “eLAC” initiatives, has identified cloud computing and big data as the key focus areas for the post-eLAC15 agenda. Strengthening policy and promoting investments in these two areas offer potential tools for “changing production patterns, generating quality employment, creating local value-added, and enhancing the region’s competitiveness and integration into global markets” (CSTD, 2015: 20). Among the specific requests to ECLAC from policymakers were policies that facilitate “structural change that foster more knowledge- and innovation-intensive production and promote sustainable growth with social equality” (CSTD, 2015: 20). This report seeks to comment on the structural change that fosters more knowledge sharing and the range of possibilities that big data tools offer to enhance existing governance practices, with a particular focus on smart cities, taxation, and citizen security.

Each case study looks at high-level institutional arrangements and technical considerations that facilitated big data’s integration into that particular policy space. The report was compiled from desk research conducted over four months and interviews with leading academics and practitioners in the field of data analytics. The study approached the question of big data’s inclusion in policymaking with all the countries of LAC in consideration. Due to the differing degrees of technological capacity and regulatory preparedness of data creation, curation, and mining in the region, the research was analyzed beyond the basis of technological transfer of cutting-edge technologies.

The case studies were selected based on the availability of information regarding both technical aspects of information systems (infrastructure, human capital, and operational procedures) and how the integration of data systems was achieved. These include: (i) formal institutions, including policy, governance structures, and the legal-regulatory framework; and (ii) implications, including organizational and management structures and considerations for incorporating data usage into the day-to-day duties of civil servants.

There is much enthusiasm surrounding the capacity of big data to strengthen governments. Yet, much of the groundwork to facilitate this is the result of specific policy interventions under comprehensive management visions. To understand the institutional and organizational investments covered in the policy guide, a first step is to become familiar with the range of tools that fit under the umbrella of “big data.”

Much of the available information on big data focuses on its capacity to save and generate revenue within the private sector. Practical applications of big data in the public sector are relatively new and understudied. The researchers encountered a dearth of peer-reviewed, academic analyses on the impact of big data in either the public or the private sector. Because of the rapid growth and financial value of the data analytics industry, the researchers were careful to limit the inclusion of sources that were promotional in nature. There are two key areas of the report that the researchers were unable to analyze sufficiently: quantitative cost-benefit analyses of the financial investments made in integrating big data tools, as there is little publicly available information on this aspect; and the spectrum of data privacy and protection laws that governments and civil society are debating locally and globally.

BIG DATA DEFINED
The 3Vs of Big Data

Efforts to gather, store, and analyze large amounts of information are not new. Many governments and firms have been collecting large amounts of data about their citizens or customers to better understand their preferences and provide better services and products. The concept of “big data” began to gain momentum in the early 2000s, when Doug Laney, an analyst working for the META Group, an international investment and advisory company, articulated a now widely used broad definition of big data that continues to be expanded (SAS, 2016). Broadly defined, big data refers to the recent exponential growth in the quantity and variety of digital data, and the power of the hardware and software used to analyze it. Big data is characterized by the three Vs (IBM, undated; WEF, 2012):

● Volume: It was estimated that every two days in 2010, the volume of data being created was the same as that created in all of recorded history until 2003. Today, it is estimated that 2.5 quintillion bytes of data are created every day. Data are produced from a wide array of sources, including business transactions, social media and email, machine-to-machine interactions and sensors, photos, audio, video, and interpersonal communication.
● Velocity: Data streams in at an unprecedented speed and must be dealt with in a timely manner to be relevant. Technological innovations, such as RFID tags¹, smart metering, and sensors, offer data in real time, which greatly increases the potential for identifying useful patterns for immediate decision making.
● Variety: The types of formats that data may take are extremely diverse, from structured, numeric data to text documents, emails, and even video and audio. The processing power required to analyze such an array of data collectively is now available.

While the three Vs describe the physical characteristics of big data, the technological innovations for storing and processing diverse, large data sets represent its potential impact. Cloud-based data storage allows much larger volumes of information to “rest” together. New tools that run algorithmic analytics are increasingly powerful and accurate. Data mining and predictive analytics offer a range of possibilities for strengthening governments’ capacity to understand countries’ complex socioeconomic issues, from the spread of epidemics to unemployment trends and confidence in the economy.

¹ Radio Frequency Identification (RFID) tags are embedded data chips that take the place of barcodes.

As the case studies show, even with relatively small datasets, conducting the correct analysis or finding the hidden pattern can help a public entity reduce its operating costs and optimize its day-to-day effectiveness. Big data analytics can reduce the time it takes to spot bottlenecks and inefficiencies, thus allowing the public sector to address immediate issues in a streamlined, rapid, and targeted manner. Analysis of big data trends can strengthen or tailor specific policy interventions or public services by limiting manual processing time and providing more precise evidence for decision making.

Data may come as extremely large, complex, and dynamic information whose benefit is not always obvious, or they may be highly time sensitive. Additional important characteristics of big data analytics are outlined below:

● Machine learning: Data analysis is often highly automated. Patterns in the data are automatically identified by powerful analytics programs tied to powerful computers, which learn from the stream of information through probabilistic or geometric models rather than being explicitly programmed by an individual.
● Digital footprint: Big data is often a cost-free byproduct of digital interaction. By virtue of the ICT tools at the world’s disposal, everyday activities, from tweets to texts to credit card payments, leave behind digital footprints that are aggregated into big data.
● Variability: In addition to increases in the velocity and variety of data, data flows can be inconsistent with periodic peaks, requiring organizations to design highly adaptive systems to allocate their scarce data storage and processing resources efficiently. An example is the spike in social media and communication data following a natural disaster.
● Complexity: Because data come from multiple sources, in different formats, it is increasingly difficult to link, match, cleanse, and transform data across systems. Without the proper analytical methodology and protocols, big data analytics can lose its timeliness and value.
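
The difficulty of linking, matching, and cleansing data across systems can be made concrete with a minimal Python sketch. It is not drawn from any of the case studies in this report: the two record sets, their field names, and the helper functions `normalize_id`, `normalize_name`, and `link_records` are hypothetical, and real record linkage usually needs far more robust matching rules. The sketch only assumes that two agencies store the same identifier and business name in different formats.

```python
import re
import unicodedata


def normalize_id(raw: str) -> str:
    """Keep only digits so '12.345.678-9' and '123456789' compare equal."""
    return re.sub(r"\D", "", raw)


def normalize_name(raw: str) -> str:
    """Lower-case, strip accents, and collapse repeated whitespace."""
    decomposed = unicodedata.normalize("NFKD", raw)
    without_accents = "".join(
        ch for ch in decomposed if not unicodedata.combining(ch)
    )
    return " ".join(without_accents.lower().split())


def link_records(tax_rows, registry_rows):
    """Match rows from two hypothetical systems on the cleansed identifier."""
    registry_by_id = {normalize_id(r["id"]): r for r in registry_rows}
    linked, unmatched = [], []
    for row in tax_rows:
        key = normalize_id(row["taxpayer_id"])
        match = registry_by_id.get(key)
        if match is None:
            unmatched.append(row)  # no counterpart in the other system
            continue
        declared = normalize_name(row["name"])
        registered = normalize_name(match["business_name"])
        linked.append({
            "id": key,
            "declared_name": declared,
            "registry_name": registered,
            "names_agree": declared == registered,
        })
    return linked, unmatched


# Hypothetical sample data: same business, different formatting conventions.
tax_rows = [
    {"taxpayer_id": "12.345.678-9", "name": "Panadería  San José"},
    {"taxpayer_id": "98-765-432-1", "name": "Taller Lopez"},
]
registry_rows = [
    {"id": "123456789", "business_name": "PANADERIA SAN JOSE"},
]

linked, unmatched = link_records(tax_rows, registry_rows)
```

Without the two normalization steps, neither the identifier nor the name of the first record would match its counterpart, which is the kind of silent mismatch that erodes the value of cross-agency analytics.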

METHODOLOGY

In light of the scale of challenges and opportunities that cutting-edge data collection and analysis represent for the public sector throughout the LAC region, this paper presents case studies that focus on areas where processing and analyzing big data offers promising solutions in various areas associated with social and economic development: smart cities, taxation, and citizen security.

The case studies represent policy spaces that are common to all countries as well as those in which big data analytics has proven to be a valuable ICT tool for strengthening public sector capacity. They demonstrate efforts to remove endemic knowledge-sharing failures of public service delivery that many developing countries face. For example, many countries suffer from a low tax-to-GDP ratio and high levels of tax evasion, which dramatically reduce a government’s ability to generate revenue that funds important goods and services. The combination of powerful information systems and regulatory frameworks offers an opportunity to reverse this by increasing the tax base and detecting fraudulent activity. With regard to citizen security, many LAC countries are facing levels of violence that are adversely affecting the quality of life of many citizens. Combining predictive analytics with innovative management structures may provide real-time insights for curbing high levels of petty crime, organized crime, and trafficking. Additionally, recent global efforts to create “smarter” data-driven cities provide LAC municipalities with tools for addressing issues tied to high urbanization rates and overburdened or absent public transportation systems.

Emphasis was placed on selecting cases in which a significant amount of relevant information could be accessed to explore each topic thoroughly. In this study, we conducted desk research from international public and private publications regarding the current and potential uses for big data in the public sector and interviewed practitioners and academics in the field. The interviews focused on the practical application of big data in the public sector, with insights on less visible obstacles and the specific technologies being utilized.

To provide a comprehensive and holistic account of the factors that led to the successful implementation of big data systems, this report focuses on two key elements. The first comprises the institutional arrangements that are conducive to the successful establishment of information systems. This is further broken down into (i) formal institutions, including policy, governance structures, and the legal-regulatory framework; and (ii) implications, including externalities such as the behavioral impact of data-driven management structures on civil servants. The second comprises the technical aspects of information systems, including both infrastructure and physical capital. This covers human capital and operational procedures required to efficiently operate a big data system.

To properly showcase the complex interactions between the multiple factors presented above, different narrative structures were created for each case study. This reflects the heterogeneity of the context of each example, including the variety of institutional and technical dimensions. Ultimately, each case is designed to illustrate how governments incorporated big data analytics for improving policy design and service delivery, with a particular focus on the primary barriers to success and how they were overcome in each case. Because of the different contexts, investments, and outcomes, the case studies are descriptive. The policy guide at the end of this report lists the main recommendations.

One limitation of the study was the scant access to information on the financial costs and investments in the integration or scaling of big data technologies and human capital in the public and private spheres. Literature from private firms emphasizes reduced costs over the long term and positive externalities that arise from data creation, curation, and usage in the public and private sectors.

SMART CITIES

As a result of London’s impressive rate of population growth (12 percent in the past
decade) and the growing strain on the city’s infrastructure, Mayor Boris Johnson
launched the 2020 Vision Report in 2012. This report laid out a strategy focused on
creating complementary Smart City initiatives by leveraging the technological expertise of
the private sector, including human and social capital and the municipal authority’s ability to
coordinate with local firms. It underscored the need to transform the city’s data ecosystem into
one that is both centralized and open, embodied in the Smart London Plan, and to maximize the
benefits of daily collection and analysis of huge quantities of information by the city’s multiple
public and private organizations.
This case study identifies the challenges and opportunities faced by Transport for London
(TfL), the local government body responsible for the majority of London’s public transportation
networks, as it integrates itself within this new structure. In recent years, the organization has
overhauled its data management and processing operations to streamline data collection
and analytics, which has markedly improved its ability to understand the
behavior of its customers and identify components of the transport network that are particularly
critical and vulnerable.

Urbanization and Development: A Cutting-Edge Solution


To address the many challenges brought about by rapid global urbanization, a range of stakeholders have pushed for the implementation of innovative initiatives that would integrate multiple technological systems designed to address issues such as high energy consumption and outmoded public transit systems. In Latin America, over 80 percent of the population now lives in cities, leading to an unparalleled rate of urbanization (Paranagua, 2012). These economic centers now account for over two-thirds of the LAC region’s gross national product (GNP). UN-Habitat predicts that these urban centers will continue to expand and fuel Latin America’s growth for years to come (UN-Habitat, 2012). As populations continue to migrate from rural to urban areas, ensuring that cities are efficiently governed will be an important task for policymakers. Economic growth, public service delivery, environmental sustainability, and social welfare will require special attention.

Urban centers that make use of high-tech data configurations to address the issues mentioned above are increasingly being referred to as “smart cities.” They integrate investment in human and social capital (through educational opportunities and public forums and discussions) with the expansion of traditional (transportation) and modern (ICT) communication infrastructures to fuel sustainable economic growth while promising urban dwellers a higher quality of life (Berst, 2015). Smart cities typically have a combination of components, including the following:

● Technological factors: physical infrastructure; smart, virtual, and mobile technologies; digital networks
● Data collection tools: human-directed (surveillance through satellites, drones, CCTV), automated (digital devices, sensors, transponders, financial transactions), and volunteered (social media, crowdsourcing, etc.)
● Human factors: engineers, skilled statisticians and computer scientists, and specialized teams sensitive to data ethics and regulations
● Institutional factors: governance, policy, regulations, and directives oriented toward effectively promoting the use of data to address specific urban challenges

The research conducted on London’s Smart City initiative revealed that the keys to its success were (i) the establishment of a clear vision and implementation strategy, (ii) successful integration of necessary technological factors under the supervision of skilled professionals, and (iii) the development of a flexible institutional arrangement that is conducive to the creation of a governance structure that can efficiently absorb and process increasingly large volumes of data.

To showcase the importance of combining these factors, we will analyze the example of London, Western Europe’s most populous city, which has pursued a cutting-edge strategy in response to urban challenges that are particularly relevant to Latin American megacities.

London Context, Vision, Strategy

London boasts the sixth highest gross domestic product (GDP) of any metropolitan area, which is a testament to the city’s key position in the globalized economy. Yet, it is not immune to the many complex municipal infrastructure and public service delivery challenges facing cities across the globe (Hill, 2015).

As a result of London’s position as a true global city, more and more people are migrating to this economic hub in search of employment and lifestyle opportunities. The impressive rate of population growth in London’s metropolitan area, 12 percent since 2001 (Hill, 2015), rivals that of Latin American cities, and the population is expected to grow by over 1 million people in the next decade. The strain that this huge influx of people will place on the city’s infrastructure is of great concern to city managers. They face two main challenges: ensuring that the city’s physical infrastructures will allow them to provide high-quality public services, and maintaining a socioeconomic environment that encourages private sector growth and innovation. The similarities between London and Latin America’s rapidly urbanizing cities provide the impetus for analyzing the measures implemented by the City of London’s managers to find solutions to both its pressing short- and long-term challenges.

In 2012, Mayor Boris Johnson issued a report, entitled Vision 2020, which set forth a framework that he believed would allow London to thrive in the coming decades. To ensure that the city’s infrastructure could accommodate the growing population, his agenda focused on leveraging the private and civilian sectors. Mayor Johnson wanted to tap into a combination of the private sector’s impressive technological expertise, the human capital and networks of Londoners, and the municipal authority’s ability to coordinate with local firms. The Mayor’s Office issued two reports

12
detailing its strategy: the London Infrastructure by ensuring that the multiple organizations
Plan 2050 in 2013 (Mayor of London, 2015), that compose the Greater London Authority
designed to channel public resources to the pool their data together, the Greater London
large-scale projects where they are most Authority’s Intelligence Team—which is the
needed, and the Smart London Plan in 2014, independent public organization responsible
focused on ensuring that London continues to for conducting socioeconomic and statistical
leverage its technological expertise and transi- research and providing local authorities with
tions towards a data-driven city. This plan was the evidence they need to formulate policy
drafted by the Smart London Board, a recently and strategy—will have access to a much
created panel of experts, including academics, larger volume and variety of data.
business leaders, and entrepreneurs working To illustrate how these multiple policy
together to advise the Mayor on data policy. papers, public organizations, and data infra-
It is the primary framework for the numerous structures are intending to transform London
smart city initiatives that London will pursue. into a smarter city, this case study focuses on a
A core objective of the plan is to specific policy space—transportation—to pro-
transform London’s data environment into vide an in-depth analysis of the complexity of
one in which data are openly shared and integrating a large public organization into a
centralized. These data will be made avail- smart city plan, and the opportunities that it cre-
able to the general public through the London ates for the public sector. For a more detailed
Datastore and the London Dashboard, pub- summary of the Greater London Authority’s
lic open data repositories that provide real- governance structure, the London Infrastructure
time information on a range of topics, from Plan 2050, and the Smart London Plan, please
crime rates to Tube delays. Furthermore, refer to the Annex.

Institutional Arrangements

TfL is the local government body responsible for most components of the Greater London area’s public transportation network. The Greater London Authority oversees TfL, under the direction of a board of directors headed by the Mayor of London (GLA, undated). The board develops and applies policies to promote and encourage safe, integrated, and efficient transportation facilities. The body is organized in three main directorates: the London Underground, London Rail, and Surface Transport. Each is responsible for different aspects and modes of transportation. Each of these bodies consists of a number of subsidiaries that are responsible for managing specific components of the transportation system. London Buses, for example, is responsible for managing the red bus network and contracting services to private sector companies. TfL’s operations are centrally managed from the Surface Transport and Traffic Operations Centre, which uses real-time surveillance and analytics to monitor and coordinate responses to traffic congestion and incidents.

Transportation showcases the potential of combining open and big data initiatives, and how London’s strategy has allowed the city to realize both short- and long-term gains. By combining bottom-up accountability tools that obtain direct feedback from people with the extensive collection and analysis of a wide variety of digital data streams, TfL has pursued a development strategy that will increase public well-being and allow the London Infrastructure Delivery Board (LIDB) to plan large, costly infrastructure projects according to the evolving needs and preferences of stakeholders.

In 2013, an analysis conducted by TfL found that the demand for public transportation is likely to increase by 50 percent by 2050 (Mayor of London, 2015). Furthermore, the Intelligence Unit of the Greater London Authority (GLA) estimates that the cost of running the city’s transportation system will climb from 1.5 percent to 3 percent of the GLA’s total capital expenditure, an increase of £50 billion, and that this figure could be much higher if the transit system is not upgraded in a timely fashion. The scale of the investments that must be made to improve London’s infrastructure is likely to mirror the scale of the cost to Latin America’s megacities, which must also continue to meet the needs of their growing populations.

Data Collected

For more than a decade, TfL has worked to incorporate the data it collects daily from people using its multiple services into its organization. It began taking steps toward becoming a data-driven public service provider in 2005, when it formed a partnership with the Massachusetts Institute of Technology (MIT) to find new ways to exploit the data it was collecting, both to relay highly personalized information to customers on service disturbances along their commuting routes and to plan future upgrades to the transportation system. More recently, TfL has made large investments to radically improve its data management to allow its 517 full-time IT staff to take full advantage of the high volume of information they are collecting (Shah, 2014). In 2014, TfL selected services from the analytics firm Tibco to bring together data resources across the organization’s multiple directorates (Shah, 2014). The goal was to create a centralized data infrastructure and improve the data collection and analytics process (Rossi, 2015). Concurrently, it invested in tech firm SAP’s in-memory analytics software HANA to manage and improve decision-making in real time, allowing TfL to go from overnight processing to having the data processed almost immediately (Shah, 2016).

TfL’s recent push to centralize its data management operation by investing in cutting-edge analytics and information systems is now allowing it to collect enormous amounts of data from a variety of sources (POST, 2014). The data gathered from TfL’s many directorates and subsidiaries allow it to:
● Operate the largest smart ticketing system in England, the ‘Oyster’ card contactless payment system. Recording the time, location, and date of use of the card allows TfL to track people’s use of the mass transit system. The system has also been upgraded to allow people to pay with their contactless credit cards, further enhancing the data collection potential of this system (POST, 2014).
● Utilize fixed sensors that can provide data on the degree of congestion on a particular road.
● Combine mobile sensors, such as floating vehicle data, which use data collected through mobile phones, GPS trackers, or on-board navigation systems to give a picture of general traffic conditions on roads. These sensors are used to monitor the movement of London’s 19,000 buses.
● Include sensors on bike rental stations, such as the Barclays Cycle Hire, that provide real-time information on the availability of bicycles and the patterns of bike rentals.
● Provide a Wi-Fi system, which can be used to track the movement of people inside the Tube’s many hallways to provide a clear picture of the flow of people as they commute to and from work (Shah, 2014).

TfL analyzes the data collected for two main purposes: improving customer experience and conducting research to determine the upgrades that must be made to the transportation infrastructure.

Although there is little public information on TfL’s Customer Experience Analytics Team, this unit focuses on utilizing TfL’s data to increase users’ well-being by providing them with real-time information on traffic and transit congestion, scheduled service disruptions, and other conditions (Feldman, 2015). This highly specialized team studies the travel behavior of individuals using Oyster card data, developing personalized services for customers who request tailored information. It uses predictive analytics to mitigate platform and train congestion at stations, as well as innovative data mining tools and geospatial visualizations (Feldman, 2015).

While it is difficult to obtain a rigorous account of the exact analytical processes used by TfL’s data scientists and engineers, and the size and composition of this team, its efforts have already led to the creation of numerous services for its clients (Mayor of London, 2014). These include:
● The Barclays Cycle Hire, which has provided Londoners with over 25 million bike trips since 2010 and created an open data stream broadcast on TfL’s website that provides real-time information on the usage of the bicycle system.
● Tracking the movement and speed of all London buses, which led to the creation of the “Countdown” service. This service provides live bus arrival information for all 19,000 buses and allows TfL staff to quickly detect service disruptions.
● Providing its customers with e-mails regarding planned service disruptions along their most common travel routes.
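The source does not document how TfL derives a customer’s “most common travel routes” from Oyster records, so the following Python fragment is only a minimal sketch under stated assumptions. The `Journey` record, its field names, and the rule that a route is affected when its origin or destination station is disrupted are all hypothetical and chosen for illustration; they are not TfL’s actual schema or method.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Journey:
    """One completed trip rebuilt from a card's entry and exit taps (hypothetical schema)."""
    card_id: str
    origin: str
    destination: str


def most_common_routes(journeys, top_n=1):
    """Return, for each card, its top_n most frequent (origin, destination) pairs."""
    per_card = {}
    for j in journeys:
        per_card.setdefault(j.card_id, Counter())[(j.origin, j.destination)] += 1
    return {
        card: [route for route, _count in counts.most_common(top_n)]
        for card, counts in per_card.items()
    }


def cards_to_notify(journeys, disrupted_stations, top_n=1):
    """Cards whose usual routes start or end at a station with a planned disruption.

    Assumption: a route counts as affected only if its origin or destination
    is disrupted; intermediate stations are ignored in this sketch.
    """
    disrupted = set(disrupted_stations)
    notify = set()
    for card, routes in most_common_routes(journeys, top_n).items():
        if any(o in disrupted or d in disrupted for o, d in routes):
            notify.add(card)
    return notify
```

Called with a list of `Journey` records and a list of station names, `cards_to_notify` returns the set of card identifiers that would receive a disruption e-mail under these assumptions. A production system would also need consent handling and route modeling beyond origin and destination, which this sketch leaves out.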

Predictive Analytics
With regard to long-term infrastructure planning, TfL’s planning team has developed the London Land-Use and Transport Interaction Model (LonLUTI) and the London Transportation Studies Model (LTS) to prepare forecasts of growth in total travel, changes in travel patterns, and modes of transportation chosen, to identify where infrastructure upgrades must be made (TfL, 2014).

The LonLUTI is a predictive analytics model that combines data regarding land use by firms, residents, developers, and transportation service and infrastructure suppliers to forecast where demand for mass transit services will increase in the future (Feldman, 2015). Using these insights, the LIDB, with the support of the GLA’s Intelligence Unit, has been able to prioritize transportation-related infrastructure upgrades (Mayor of London, 2015). The top priorities are the following:
● An impressive railway system upgrade intended to connect 200,000 homes north of London to the center of the city (with £2 million of initial funding from the Government)
● An extensive program to upgrade the city’s underground railway system, increasing its passenger capacity by 30 percent
● Upgrading the city’s business junctions to improve facilities for cyclists and pedestrians
● The creation of two tunnels to connect neighborhoods currently isolated from the economic center

The establishment of the Smart London Plan has led TfL to share an increasingly large amount of its data on the London Datastore (Mayor of London, undated). The open data initiative has allowed private software developers and citizens to gain access to a wealth of information that can be used to provide Londoners with ad hoc services that rely on powerful analytics software and real-time information. These initiatives include:
● A number of third-party apps, such as Citymapper, which use the open data compiled and shared by TfL to provide their clients with optimized travel routes, real-time public transportation planning, and other services
● London Buses Countdown service, used to create over 60 transport apps that provide real-time information for TfL’s passengers
● Collaboration with academic and research institutions, including MIT, Oxford University, and University of Cambridge, to explore ways to use this large volume of data for future data analytics to support Smart London Initiatives (Feldman, 2015)
● TfL’s innovation portal, designed to encourage entrepreneurs to submit innovative technological solutions to pressing issues relating to London’s transportation network

Overall, TfL’s integration within London’s Smart City initiative reveals the challenges and opportunities faced by the public sector in this context. Even though it is undeniable that TfL has been at the forefront of the data-driven management of transportation networks for decades, Smart London’s emphasis on open data initiatives and the organization’s recent push to centralize and optimize its data collection and analytics processes are testaments to the potential of these types of policies for maximizing cost-efficiency in public services and dramatically increasing the value and utility of data. The fact that TfL has pursued a data-driven strategy for decades makes it difficult to evaluate the exact impact that the Smart London Plan will have on improving service delivery. However, the increasing level of cooperation between diverse local stakeholders showcases the role of smart city initiatives in improving the data ecosystem and regulatory framework of the urban environment in which they are implemented.

Considering that the Smart London Plan was only implemented recently, much remains to be seen regarding the extent to which the GLA’s other organizations will also benefit from TfL’s data deluge, but the current efforts and strategy papers are quite promising. In particular, the recent shift toward an open data regulatory framework has led public organizations to greatly increase the quantity of information that they share, even within a body as integrated as TfL (Card, 2015). This will greatly increase the value of the data collected and the return on investment of collecting and processing large quantities of information. Within TfL, this is evident in the fact that both TfL Planning and TfL’s Customer Experience team utilize the same information, collected from disparate partners, to ultimately achieve radically different outcomes (Feldman, 2015).

Smart Cities: Policy Papers


The main policy papers highlighted in the case study, namely the London Infrastructure Plan 2050 and the Smart London Plan, are summarized below.

The London Infrastructure Plan 2050
The London Infrastructure Plan 2050, the first of its kind, was drawn up by the Office of the Mayor of London to guide the development of London’s physical infrastructure for the next three decades. It focuses on upgrading the city’s mass transit system, increasing its hub capacity, managing energy demand to meet climate change goals, improving the city’s digital infrastructure, managing water supplies, and transitioning to a “green” infrastructure, among other challenges. The London Infrastructure Delivery Board, an independent authority, was established in 2014 to provide the leadership necessary to ensure the proper delivery of infrastructure projects. The Board will ensure the strategic continuity of the plan in consultation with utility companies, regulators, and the public infrastructure planning authorities by adapting the city’s regulatory framework to each stakeholder’s incentives and objectives. The report emphasizes the need to leverage London’s position as a world leader in cutting-edge technologies to promote data sharing through open data initiatives and ensure that the city can proactively address future challenges.

The Smart London Board and Smart London Plan


In March 2013, the Mayor of London established the Smart London Board, a panel of academics, business leaders, and entrepreneurs, to advise the Mayor and the London Enterprise Panel on how to make London smarter by integrating data and technology. The initial output of this board was the Smart London Plan, which is designed to promote the synergy between the capital’s systems, from local government to health care delivery and utilities, and state-of-the-art digital technology.² This objective will be achieved by identifying opportunities and priorities for the public sector, encouraging citizens to voice their concerns and maximize the return on their human and social capital, and incentivizing private investment in state-of-the-art technology. The Plan aims to allow the city to build on its innovation lead by promoting the establishment of evolving interventions designed to “integrate opportunities from new digital technologies into the fabric of London,” regulated by an overarching open data framework (Mayor of London, 2014: 3).

Overall, the Mayor of London’s smart city approach involves promoting the collaboration and engagement of multiple stakeholders, increasing efficiency in resource management, supporting technological innovation, and creating transparent open data initiatives to improve the lives of Londoners. The strategies to achieve this vision include investing in enterprises, training, infrastructure, environmental protection, the health and well-being of people, and the mass transit system. The plan is particularly well designed for London’s context because it allows public organizations that have invested heavily in data collection and processing in recent years to seamlessly integrate within London’s new cutting-edge data infrastructure and open data regulatory framework.

² Smart London Plan: http://www.london.gov.uk/sites/default/files/smart_london_plan.pdf

Institutional Framework
This section describes London’s larger governance structure, and how the multiple public bodies discussed in the smart city case study, including the GLA, the Mayor of London, the London Assembly, TfL, and the GLA’s Intelligence Unit, are linked together. London’s governance structure comprises a number of national, city, and sub-city actors (Figure 2). The GLA, which employs most of the actors discussed in our case study, is the top-tier administrative body for Greater London. It is composed of an elected official, the Mayor of London, and 25 elected members of the London Assembly. The GLA is funded primarily by direct government grants and collects some funds through the local Council tax. This large public organization’s governance structure is unique in the United Kingdom in terms of structure, election, and selection of powers.

The GLA has three functional bodies: Transport for London, the Mayor’s Office for Policing and Crime, and London Fire and Emergency Planning Authority. Each one is responsible for delivering specific public services. Even though TfL only accounts for roughly 1.5 percent of the GLA’s total capital expenditure, it accounts for 60 percent of the GLA’s total expenditure, which indicates the importance of its position within the overall structure (Rode et al., 2014).

Figure 2. Governance Structure of the Greater London Authority

[Figure: London multi-level governance. The diagram maps policy areas (economy; environment and planning; infrastructure and transport; education and culture; health and social services; security; other) to three levels: the national level (UK central government, 16 of 24 departments), the city level (Greater London Authority, including Transport for London, the Mayor’s Office for Policing and Crime / Metropolitan Police, and the London Fire and Emergency Planning Authority / London Fire Brigade), and the sub-city level (33 London boroughs). An accompanying bar shows Greater London expenditure for 2014-2015: Transport for London 60%, police and security 29%, Greater London Authority (Mayor and Assembly) 7%, fire and emergency 4%.]
Source: Rode et al. (2014).
The Greater London Authority’s Intelligence Team
At the core of London’s data ecosystem lies the Greater London Authority’s Intelligence Team, the organization responsible for conducting socioeconomic and statistical research to provide local authorities with the evidence they need to formulate policy and strategy. The Intelligence Unit, housed in City Hall, works to assist policymakers in formulating open data policies and, more importantly, to provide forecasts on the economy, labor markets, and demographic trends. It draws on data provided by a host of actors, including the cluster of public organizations that constitute the GLA, Transport for London, the London Fire and Emergency Planning Authority, and the Office of the Mayor. An extensive open data regulatory framework ensures that information is efficiently and transparently transferred from these organizations in accordance with the objectives outlined in the Smart London Plan (GLA, undated).

The GLA’s Intelligence Unit shares the data it has collected and processed on The London Datastore and The London Dashboard, two public open data repositories that provide real-time information on a range of topics, from crime rates to Tube delays. The Mayor of London is responsible for The London Datastore, but the assistant director of the Intelligence Unit directly manages it. The Intelligence Unit is also currently working on the City Data Strategy, “a much needed means of ensuring secure supply and sharing of meaningful ‘city data’ and providing data suppliers and value generators with the right set of motives and incentives to make the London data economy turn faster still” (GLA, undated). Unfortunately, there is scant information regarding this initiative, but it is a testament to the desire of local policymakers to create a data environment that fosters the exchange of data between private and public actors.

TAXATION

As is the case in many developing economies, taxation challenges in LAC countries include tax evasion, low collection rates, and weak tax administration. Tax evasion and income tax fraud are the main problems identified and addressed in this three-country case study. Brazil, China, and the United States have used big data to formulate, improve, and manage tax policy and administration in diverse environments through fairly similar policies and tools. Each state agency studied created efficiencies by enhancing its ability to audit transactions and tax filings and improving external monitoring and support of personal income and business tax filings.

Leveraging Big Data for Taxation: A Three-Country Study


Although LAC countries fare better than other developing countries (IMF, 2011), taxation in LAC is characterized by low collection, non-progressive bracketing with rampant evasion, and very weak tax administration (Corbacho, Fretes Cibils, and Lora, 2013). This is especially true for countries with a thriving informal sector, where business and other direct taxes are hard to collect because of the sector’s ambiguous, unregulated nature. It is therefore no surprise that tax revenue amounts to only 17 percent of GDP in LAC and that tax administration remains a key policy area of concern (Corbacho, Fretes Cibils, and Lora, 2013).³

Though challenges are complex and context-specific, optimizing revenue earnings, expanding the tax base, and combating weak tax administration and tax evasion are major objectives of all governments. A common contributing factor to this scenario is the high transaction cost for reporting, which leads to evasion. Tax subjects skirt provisions and obligations as they attempt to exploit systemic complexities, such as erroneous information on how to classify goods and services.

Many countries have begun to overcome these barriers by operationalizing official mandates (implementing agreed guidelines and policies as mandated) by which governments can create and access high volumes of tax-relevant data from multiple sources. These sources range from public records and the products of routine government functions to business or other processes that require government oversight. These data resources are valuable, as they can be used to collect taxes, support audit and monitoring efforts, and facilitate economic policy design and implementation. Currently, 159 out of the 193 UN member states utilize ICT-intensive systems for tax management (World Bank, 2016). Solutions are as varied as the unique challenges that each country faces.

³ IDB publication on taxes, firm size, and productivity in Latin American and Caribbean countries. Available at http://idbdocs.iadb.org/wsdocs/getdocument.aspx?docnum=35101307

As the cases from Brazil, China, and the United States illustrate, governments have harnessed the power of large-scale data to achieve efficient tax administration by streamlining business processes, creating electronic platforms to centralize information processing, and integrating platforms to facilitate the exchange of information among government departments.

CHINA
China is utilizing big data for taxation in various ways. The value-added tax (VAT) contributes more revenue than any other tax (Shuanglin, 2008), and was estimated at 43 percent of total tax revenue in 2012 (Wu, 2013). For many years, tax evasion was high due to loopholes in monitoring transactions, leading to a revenue loss of US$15.9 billion in 2012 (Qing, 2013). Some businesses were found not to have issued invoices, a key input in the tax payment and verification process. There was a need to standardize the e-commerce industry to address such gaps and avoid the use of fake documentation that enabled tax evasion. Specific challenges in administering the VAT in China were counterfeiting, invoice reselling, and outright theft of invoices (Xing and Whalley, 2014).

Today, China’s tax administration system makes extensive use of big data solutions that use multi- and cross-referencing to verify business information. Simultaneous monitoring and auditing efforts facilitate tax filing by business owners where data gaps or discrepancies exist. China’s National Development and Reform Commission piloted a nationwide program aimed at enabling the issuance of invoices via an online tax system to address the loopholes.

Governance Structure and Regulatory Framework

The focus of this case study is the Golden Tax Project (GTP), which was part of the broader economic reform program that China initiated in the 1970s. The GTP mandates the use of specific, sophisticated ICTs to improve compliance with China’s VAT laws (Winn and Zhang, 2010). The project, established in 1994, relies on a database built from the electronic VAT invoices collected as part of the management information system for tax administration (Xing and Whalley, 2014). The government also uses this information to analyze internal trade patterns and design policies based on the insights generated (Xing and Whalley, 2014).

As part of the ongoing GTP efforts, China’s State Administration of Taxation (SAT) created the Administrative Measures for Online Electronic Invoices (SAT Order No. 30), which standardizes the issuance and use of online electronic invoices. The Measures came into effect on April 1, 2013. The system serves the dual purpose of enhancing the efficiency of tax filing for taxpayers and of tax administration for tax authorities. It does more than electronically process documents; it is also a source of data that informs the monitoring, coordination, and general tax administration duties and tasks.

The GTP is a single platform, deployed at all provincial and municipal levels. It is mainly run over subnational computer networks managed by the respective tax authorities.

Technical Considerations

The GTP was deployed in three phases. The first phase was the National Computer Inspection Network System of VAT Special Invoices, in 1994 (Winn and Zhang, 2010). It involved an invoice cross-inspection system, focused on collecting invoices for transactions over 5,000 Chinese yuan, and was piloted in fifty major cities. Initially, tax authority staff entered data manually into the system. However, due to the errors caused by manual data entry, a second phase was rolled out in 1998.

This second phase was referred to as “one network, four subsystems,” as stated by Xu Shanda, a Vice Commissioner of the State Administration of Taxation. It was designed to enhance the system’s capacity to detect counterfeit invoices, entry errors, and fraud attempts across more tax department levels and at the provincial, prefecture, state, and national levels. A single computer network linked tax authorities at the provincial, municipal, and county levels, which enabled cross-referencing and enlarged the database (Winn and Zhang, 2010). This improvement of the system led to a decrease in dubious VAT-specific invoices from 300,000 to 20,000 per month (Yu, 2003).

The third phase of the system, constructed in 2006, includes subsystems that execute management, collection, inspection, punishment, implementation, remedy, and supervision functions (Figure 1). The system is able to generate VAT invoices automatically using the anti-counterfeiting subsystem, investigate sources of dubious invoices by location and business type, and monitor progress of the invoice to the next stage of processing using set criteria.
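The cross-referencing idea behind the invoice cross-inspection system can be sketched in a few lines of Python. This is an illustration only: the source does not describe the GTP’s actual matching rules, so the `Invoice` fields, the `cross_check` function, the tolerance, and the reasons for flagging an invoice are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Invoice:
    """A VAT invoice record (hypothetical fields, not the GTP schema)."""
    number: str
    seller_id: str
    buyer_id: str
    amount: float


def cross_check(issued, claimed, tolerance=0.01):
    """Compare invoices claimed for credit by buyers against invoices issued by sellers.

    Returns a dict mapping each dubious invoice number to the reason it was
    flagged. The rules below are illustrative assumptions only.
    """
    issued_by_number = {inv.number: inv for inv in issued}
    flagged = {}
    seen = set()
    for inv in claimed:
        if inv.number in seen:
            # The same invoice number was submitted for credit twice.
            flagged[inv.number] = "claimed more than once"
            continue
        seen.add(inv.number)
        record = issued_by_number.get(inv.number)
        if record is None:
            # No seller declared this invoice: possibly counterfeit.
            flagged[inv.number] = "no matching issued invoice"
        elif (record.seller_id, record.buyer_id) != (inv.seller_id, inv.buyer_id):
            # Parties differ: possibly resold or stolen.
            flagged[inv.number] = "party mismatch"
        elif abs(record.amount - inv.amount) > tolerance:
            # Amounts differ: possibly an entry error or alteration.
            flagged[inv.number] = "amount mismatch"
    return flagged
```

Keying the seller-side records by invoice number means each claimed invoice is checked with a single lookup, which is the property that makes cross-referencing practical once all tax offices share one network and one database.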

Figure 1: Management and Control Process of GTP: Phase III

[Figure: flowchart linking the stages Selling, Invoicing, VAT payment, Credit certification, Inspection, and Investigation. Step labels: input invoicing system; declare invoice data and return payment; obtain credit after credit invoice certification; counterfoil and credit invoices inputted into inspection system; invoices that do not pass enter the investigation system; problematic invoices after investigation result; stop selling if any problem.]

Source: Xing and Whalley (2014).

Infrastructure and Physical Capital Factors
Two factors that have enabled China to design and implement the system in such a vast economy are its globally competitive domestic ICT enterprises and its advanced ICT (Winn and Zhang, 2010). Countries interested in learning from China’s example can use existing international standards for e-invoicing to establish similar platforms.

The GTP presents an innovative fiscal and administrative reform process that is seen to have delivered dividends in ensuring compliance with tax laws, reduction of monitoring costs, and increased global competitiveness of local businesses due to the efficiencies derived from the use of and familiarity with such a system (Winn and Zhang, 2010). It is primarily a government-run and -controlled system that augments the government’s ability to monitor and oversee tax administration. Although it takes up to a month to verify whether an invoice is correspondingly used to file VAT returns, the system nevertheless provides an efficient tool for verification, and the SAT has adopted a progressive improvement model that fixes such gaps (IMF, 2011).

Across borders: On 27 August 2013, China signed the Multilateral Convention on Mutual Administrative Assistance in Tax Matters at the Organisation for Economic Co-operation and Development (OECD). This convention sets automatic exchange of information as the new, global standard. Mexico and Brazil are also signatories to the convention.

This is the most comprehensive multilateral tax instrument available. It provides all forms of administrative assistance including spontaneous exchange of information, simultaneous tax examinations, and assistance with tax collection. A valuable tool for governments to fight offshore tax evasion, the convention also ensures compliance with national tax laws and respects the rights of taxpayers by protecting the confidentiality of the information exchanged. In an increasingly globalized and interconnected world, tax cooperation and tax compliance are of crucial importance for all countries and citizens.

The Fapiao
The VAT is collected using a simple pre-payment electronic transaction support and monitoring system. Mandatory electronic invoices must be issued to verify payments. One such invoice is the fapiao.

How it works: Business owners register with the tax authorities to open an account with the relevant provincial tax authority. During business transactions, they fill out the details of the transaction and the taxes charged and issue the electronic fapiao. The relevant tax authority verifies the fapiao by matching the information provided against that held in the online system (business name, business address, unique business identification number, province, etc.). For tax purposes, the reported business income and information must match those held in the system depending on the invoices issued. Any discrepancies or inaccuracies are investigated and appropriate action is taken.

Fapiao are purchased in advance from the tax authority to the value of a month or a year’s worth of projected tax collections as recorded by the business entity. In essence, business owners pay the tax in advance. Business owners are required to issue fapiao, the tax invoice, reflecting the total value of each transaction. These have become very useful in monitoring and combating tax evasion, especially in a cash-based economy such as China’s. Firms can then claim refunds for any unissued invoices since the purchase is made in advance. This has

24
enhanced the government’s ability to verify tax the business process. Provinces are allowed
filings against reported transactions as a way of to pilot the system to assess its feasibility, and
thwarting tax evasion. In addition, it gives the they can recommend to the state tax agency
government a basis for projections of business any intrinsic or systemic adjustments necessary
volume for a financial year, which helps in fis- depending on the uniqueness of their experi-
cal planning (e.g., foreign exchange supply and ences. Furthermore, allowances were made for
control), support, and monitoring. users experiencing bad Internet connections;
Each fapiao has a unique number that they can issue the invoice manually and then
legitimizes it. Customers can verify its authen- upload it within 48 hours following a transac-
ticity through an instant text messaging service tion. This sustains the vital information provi-
to a government hotline before accepting it as sion function of the system for effective tax
issued by a business entity. Customers are administration.
particularly encouraged to do this because the Big data elements: This invoicing and
government uses lottery scratch cards tied to VAT system has inherent connections to busi-
these invoices. This incentivizes all participants ness licensing, foreign exchange controls, and
in the process to ask for these invoices in a bid other elements of the business environment.
to increase their chances of winning.   By monitoring the reported volumes of trade,
A unique feature of this collaboration is the the government can deduce the nature of trans-
interaction of telecommunications systems col- actions (whether goods or services) to ensure
laborating with government verification mecha- accurate reporting by business owners. This
nisms to close loopholes.  This is through a free has the immediate effect of ensuring effective
text messaging service platform run by the gov- taxation but also the long-term effect of providing
ernment that enables clients to verify, in real time, information to public officials regarding business
invoice legitimacy before accepting it. The data needs and transaction trends, and thus improving
captured on the invoice are what delivers the divi- policymaking. For example, business classifica-
dends in terms of enabling the revenue collection.  tion under Chinese law requires specific sets of
Controls: Any changes to the invoice information on registration, issuance of invoices,
must be authorized by the tax authorities. These and general tax filing. This creates an informa-
changes may include details unique to the busi- tion exchange and verification mechanism that
ness (name, license numbers, classification, utilizes business information and unique identity
etc.) The buyer also has the ability to check the to verify taxpayers even among different govern-
accuracy of the information on the invoice and ment agencies, specifically, licensing and taxa-
may reject it in case of inaccuracies, in which tion agencies. This system provides other indica-
case it cannot be used for tax submission. This tors and datasets for the government to use in
provides a systematic control against tax eva- tax auditing, monitoring, and other roles besides
sion or fraud even at the point of first contact of the recording of business transactions.
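
The matching step described in the box (checking an invoice's unique number and its business details against the records held by the tax authority) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the record layouts, the field names, and the `verify_fapiao` function are hypothetical and do not describe the actual Chinese system.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Business:
    """Hypothetical registry record held by the tax authority."""
    tax_id: str      # unique business identification number
    name: str
    address: str
    province: str


@dataclass(frozen=True)
class Fapiao:
    """Hypothetical invoice record as filled out by the issuer."""
    number: str      # unique invoice number
    tax_id: str
    name: str
    address: str
    province: str
    amount: float
    tax_charged: float


def verify_fapiao(invoice, registry, issued_numbers):
    """Return a list of discrepancies; an empty list means everything matched.

    registry maps tax_id -> Business; issued_numbers is the set of invoice
    numbers the authority has sold in advance (both are illustrative inputs).
    """
    problems = []
    if invoice.number not in issued_numbers:
        problems.append("unknown invoice number")
    business = registry.get(invoice.tax_id)
    if business is None:
        problems.append("unregistered business")
        return problems
    # Compare each descriptive field with the registered value.
    for field in ("name", "address", "province"):
        if getattr(invoice, field) != getattr(business, field):
            problems.append(f"{field} mismatch")
    return problems
```

In this sketch, any non-empty result would correspond to the discrepancies that the text says are investigated; the same lookup could in principle back a customer-facing check of the invoice number, although how the text messaging service works internally is not described in the source.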

BRAZIL
Brazil was one of the first countries to adopt the VAT system and is one of the fastest adopters of ICTs historically (Edicomgroup, 2016). Brazil’s states have different tax rates as part of their respective tax regimes, which presents both challenges and opportunities for tax administration. Brazil experiences high tax evasion, especially when business owners try to take advantage of the different tax regimes across states. In 2015, Brazil was ranked 120th out of 189 countries on the World Bank’s Ease of Doing Business Index (World Bank, 2014). This ranking is largely attributed to the complexity of navigating the country’s registration, licensing, and tax payment systems. Brazil’s tax system is legislated by 27 different authorities; thus, there is a real demand for a functioning tax administration environment that will attract foreign investment and ensure smoother domestic processes.

Governance, Regulatory Framework, and Standardization


Brazil’s federal government introduced the Public Digital Bookkeeping System (Sistema Público de Escrituração Digital, or SPED). The SPED project is part of the federal government’s efforts to optimize the key Brazilian public infrastructure aimed at standardizing the exchange of information through informatics. It aims to facilitate fiscal administration with respect to the integration and exchange of tax information, as provided in Article 37 of Brazil’s Constitution. This system was created in 2002 to standardize taxation procedures and computerize the relationship between tax authorities and taxpayers. It became operational in 2008; by 2012, most companies were using it to manage their taxes. It is run by the Brazilian Internal Revenue Service.

The system was created by decree,4 for the purpose of standardizing and sharing financial and tax information, subject to legal restrictions; streamlining and standardizing taxpayer obligations across different regulatory agencies; and rapidly identifying tax offenses, while enhancing the speed of access, processing, control, and oversight of tax operations. This program aims to change the way businesses and individuals file taxes by standardizing forms and other documents. This is done via an electronic platform that enables government to cross-check information with other databases and initiate audits in case of red flags. With respect to income tax, it pre-populates information forms with user details based on their official identity. Central to the system is the ability to identify any departures from legal tax filing provisions within the existing laws. This is facilitated by the integration of tax administration platforms at the federal, state, and municipal levels.

This system comprises the following five components (Thebrazilbusinesscom, 2016):
● Nota Fiscal Eletrônica (NF-e): a standard electronic fiscal document that is issued and stored electronically, with the validation of the issuer through digital signature.
● Conhecimento de Transporte Eletrônico (CT-e): a transport authorization document, issued and stored electronically, that enables the movement of goods or cargo.
● Escrituração Fiscal Digital (EFD): a digital file that contains information about taxpayers and their history, registration, tax calculations and payments, concessions due, and other information. Individuals file their taxes through a computer program and platform for this purpose.
● Escrituração Contábil Digital (ECD): an obligatory portal for filing and saving documents for all businesses.
● Nota Fiscal de Serviços Eletrônica (NFS-e): a digital document designed specifically for service providers, which is stored on the government’s website. Information contained in this document can be used by both clients and the tax authorities to verify business information and legitimacy.

All five components of the system store and process specialized data according to the nature of business, the license, and the purpose of the transaction. Digital signature and certification are the key elements of this process. All taxpayers are required to register and are subsequently assigned a unique identifier in the system, which enables them to use processes embedded in it. To move goods, the tax authority has to evaluate and provide authorization for the legal transport of merchandise. This system supports many business processes, such as freight-forwarding documents, cancellation of documents, rejection of documents, submission of accompanying documentation, and authorizations to partners.

One unique feature of this system is the integration of business and government platforms, which enables fast and simultaneous invoicing, tax filing, and authorization for tax purposes. The structure of the system provides a centralized data repository and common workspace for communications between companies and the government, as well as for a company and a competing, complementary, or subsidiary company’s systems through the various verification platforms and publicly held information. It enables file sharing under the law through easily accessible document interfaces. SPED has also improved business-to-government (B2G) and business-to-business (B2B) communication through the use of government-provided forms and rules. Inbuilt pre-authorization protocols ensure transparent reporting, data capture, and verification by the tax authority. Sender verification tied to pre-registered sender identities enables monitoring, follow-up, and auditing. The documents generated also act as a guarantee to buyers of the legitimacy of the seller.

One example of the use of big data is the multiple sources of data (business, client) captured on the invoices and the speed of information processing, especially where government and business enterprise databases interface in real time. The system’s ability to verify business legitimacy and authorize transactions in real time saves time and resources. The system has also made information sharing easier between government departments and states because of the uniformity of the formats used. Using the data generated by this system provides the government with valuable information about the economy.

Initially, businesses resisted this system because it appeared to pose an additional burden in the already heavily regulated business environment. However, the government’s efforts to provide strong leadership and technical support facilitated buy-in and investment in the necessary inputs to enable the system to operate efficiently. The system has increased voluntary compliance by companies in reporting and filing taxes because of its facilitation of audit action in case of red flags (Da Silva et al., 2013). Simply put, the system is set up under the assumption that, “by increasing the probability of detecting tax violations, taxpayers will declare a bigger portion of their income, or even their entire income” (Da Silva et al., 2013: 447).

The keys to the success of this initiative include supportive efforts such as training tax officers and other system users, clear communication of the benefits of the system after completion of setup, and investment in the ICT platforms that enable the system to function at the state and federal levels.

4
Decree Number 6.022/2007.

UNITED STATES
The United States has an advanced tax administration system for various sectors of the economy. Personal income tax evasion remains high, however, with 18–23 percent of total reportable income not properly reported to the Internal Revenue Service (IRS) (Cebula and Feige, 2012). U.S. citizens use their Social Security Number (SSN) as their tax identity. Because the SSN is a unique identifier and captures all the government-held data about a person, it is used to assess and collect taxes from all eligible individuals.

The IRS collects over US$2.4 trillion in taxes from nearly 250 million tax returns each year (Aggarwal, 2016). Citizens file taxes using online and manual systems, through private or public accountants, and through special software designed to guide and support the process. About 80 percent of all tax returns are received and processed electronically (Satran, 2013a). This showcases the large volume of digital data that is processed by the IRS. U.S. citizens can claim refunds based on allowable deductions granted in U.S. tax laws and policies.

Regulatory and Administrative Framework


The IRS is a bureau within the Department of the Treasury and is headed by a Commissioner.5 Its function and role are set forth in the Revenue Act of 1862. In response to emerging needs and trends in the dynamic tax and revenue environment, the IRS continually restructures and repositions itself through a formal process of hearings in the United States Congress. The Internal Revenue Service Restructuring and Reform Act of 1998 introduced provisions for the utilization of unique tax identification numbers to allow individual tax credit collection. It established divisions to serve specific regions and types of taxpayers, provided for the protection of citizens’ rights in tax administration, and created accountability mechanisms that allow the public to submit queries by introducing mandatory requirements for IRS staff to include contact information on any document submitted to individuals.

The IRS processes and issues tax refunds, 80 percent of which are managed electronically. This opens up the system to fraud, which occurs primarily through the falsification of documents, identity theft, and the use of fraudulent identities. Media reports acknowledge that tax fraud is a growing challenge the IRS faces (Hunter, 2015). According to the U.S. Treasury Department, the number of identified fraudulent federal returns increased by 40 percent from 2011 to 2012, an increase of more than $4 billion in illegitimate payouts done mainly through identity theft (Newcombe, 2016).

Data Formatting and Operational Procedures


According to Aggarwal (2016), “the IRS reportedly loses an estimated $300 billion each year in taxation error or cheating tactics.” To combat these losses, the IRS decided to use big data analytics. Robo-audits process tax returns by checking them against data from third-party records. Collection and analysis of these data “allow the IRS to generate and track unique

5
IRS Organizational Chart: available at https://www.irs.gov/pub/irs-news/irs_org_chart_2012_.pdf.

attributes regarding financial behavior, aid tax enforcement, and combat noncompliance” (Aggarwal, 2016: 279). Robo-audits are multiyear projects that cost US$3 billion and rely on third-party records of credit card and electronic data payment providers, social media, email, and other online activities for project execution (Satran, 2013b).

Data are obtained from a complex collection of digital information collected from many sources. The IRS utilizes employer-filed data to compare and verify identities and reported incomes to ensure compliance. Employer-filed data provides a second point of reference due to its independent nature, making it easier to identify real and fictitious identities created by fraudsters utilizing stolen but legitimate identities.

The IRS and state and security industry service providers6 have undertaken a collaborative effort to address identity theft and protect taxpayers. This was done through a collaborative agreement between software firms, payroll and tax financial product processors, and state tax administrators in an effort to recognize identity theft and refund fraud. The agreement includes “identifying new steps to validate taxpayer and tax return information at the time of filing. The effort will increase information sharing between industry and governments. There will be standardized sharing of suspected identity fraud information and analytics from the tax industry to identify fraud schemes and locate indicators of fraud patterns.”7 According to the IRS, the partnership also embodies a commitment to train system users to navigate and utilize the system, assuage public concerns through information provision and advocacy, and provide information on security and privacy.

The program compares tax return data with information from other state agencies, employers, and private firms to spot incorrect mailing addresses and stolen identities. Because so many returns are filed electronically, fraud-spotting systems look for suspicious Internet protocol (IP) addresses. For example, when tax auditors notice that similar IP addresses are submitting a series of returns for refunds which cannot be matched to any employer data, the returns are flagged for further scrutiny. To enhance the efficiency of robo-audit, the IRS was given the power by law to access credit card data to reference and validate identities and incomes. This was through a provision in the Housing Relief Act of 2009 (McNeil, 2010; Satran, 2013a), which provided a bigger spectrum of information to utilize during cross-referencing.

The IRS, in an attempt to address tax fraud issues, has instituted measures to bolster its internal audit systems for optimizing in-house technological and analytics departments. It began working closely with state tax administrators, cybersecurity experts, and independent tax and financial service providers to collectively address the issue. Public recognition of the importance of dealing with government revenue loss and the need to protect clients and taxpayers from this threat has bolstered this effort.
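
The IP-based check described above (a series of refund claims from one address that cannot be matched to employer-filed data) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the record fields, the `flag_suspicious_ips` function, and the threshold of three unmatched returns are hypothetical, and "similar" addresses are simplified here to identical addresses. It is not a description of the IRS's actual systems.

```python
from collections import defaultdict


def flag_suspicious_ips(returns, employer_records, min_unmatched=3):
    """Group refund claims by submitting IP and flag IPs with many unmatched claims.

    returns: iterable of dicts with the hypothetical keys
        'return_id', 'ip', 'ssn', and 'claims_refund'.
    employer_records: set of SSNs for which employer-filed data exist.
    min_unmatched: illustrative threshold, not an official rule.
    """
    unmatched_by_ip = defaultdict(list)
    for r in returns:
        # A refund claim with no employer-filed counterpart counts as unmatched.
        if r["claims_refund"] and r["ssn"] not in employer_records:
            unmatched_by_ip[r["ip"]].append(r["return_id"])
    # Keep only the addresses that reach the threshold.
    return {
        ip: ids
        for ip, ids in unmatched_by_ip.items()
        if len(ids) >= min_unmatched
    }
```

The returned mapping lists, for each flagged address, the returns that would be passed on for further scrutiny. A production system would presumably combine this signal with others mentioned in the text, such as mailing addresses and third-party payment records.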

6
Electronic Tax Administration Advisory Committee (ETAAC), Federation of Tax Administrators (FTA), which represented the states, Council for Electronic Revenue Communication Advancement (CERCA), and the American Coalition for Taxpayer Rights (ACTR).
7
IRS Press Release Number IR-2015-87, June 2015.

CITIZEN SECURITY

The New York Police Department’s (NYPD) introduction of COMPSTAT in the 1990s offers
insights into the value of introducing data analytics into policing under a new mayor
and police commissioner in New York City. COMPSTAT is a performance management
system that is used to reduce crime and achieve other citizen security goals, while emphasiz-
ing information sharing, responsibility, and accountability. This case study demonstrates how
big data can have far-reaching implications within a police force. The introduction of real-time
data analytics led to substantial personnel changes to reduce crime. The study highlights how
important strong political will and credible commitment by leadership to integrate data are in
realizing the benefits of big data over the long term. It also demonstrates how public–private
partnerships (PPP) between the government and technology companies help ensure that data
tools continue to evolve, despite resource constraints.

The Challenges of Combating Crime

Criminal enterprises invest heavily in sophisticated technology and innovations, which makes them increasingly difficult to monitor or apprehend. Due to the budgetary and technological constraints faced by police forces across the LAC region (Johnson, Forman, and Bliss, 2012), it is imperative to develop and implement highly cost-effective solutions to support crime reduction efforts.

The New York Police Department's COMPSTAT

The case of New York City in the 1990s was selected to explore how big data can be leveraged by different police forces to anticipate and prevent crime. This study offers an analysis of the potential benefits of integrating data into day-to-day police activities for reducing crime. In the early 1990s, the crime rates in New York City were extremely high. Between 1985 and 1990, the number of homicides increased from 1,392 to 2,262, an increase of more than 60 percent in only five years, with the annual total hovering above 2,000 through to 1992 (White, 2012).

One of the tools used by the NYPD to curb high crime rates and increase citizen security has been COMPSTAT, a system that allows police agencies to adopt innovative technologies and problem-solving techniques while updating traditional police structures. The system functions in the following way: every week, personnel from each of the NYPD’s 77 precincts, 9 police services, and 12 transit districts meet to present a wide range of crime-related data. To streamline the process, the police commissioners and executives receive all analytics information in advance of the meetings. During the meeting, each precinct commander shows activities and accomplishments to the police commissioner, deputy commissioners, and other top executives. Precincts are allocated to police departments based on geographical zones throughout New York’s five boroughs.

Alongside police management and officers, the analytics unit, which receives data from the police in the neighborhoods, provides a summary of the evolution of the amount of crime and its patterns, as well as a range of citywide and precinct-specific performance indicators. High-ranking personnel from investigative units, such as vice and narcotics, attend COMPSTAT meetings to further ensure comprehensive explanations of each precinct’s main challenges.

The strategic planning exercises during COMPSTAT meetings are then used to distribute street cops based on the crime analytics insights. This monitoring and evaluation system provides department leadership with the information needed to identify critical factors that lead to high crime rates and allocate resources to combat them.
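
As a rough illustration of the kind of summary the analytics unit prepares, the Python sketch below computes the percent change in reported incidents per precinct between two periods and ranks precincts by that change. The function names and the input format (a mapping from precinct to incident count) are assumptions made for this example; the source does not describe how COMPSTAT computes its indicators.

```python
def percent_change(previous, current):
    """Percent change in reported incidents per precinct between two periods.

    previous, current: dicts mapping a precinct label to an incident count.
    A precinct with no baseline count gets None, since the change is undefined.
    """
    changes = {}
    for precinct, now in current.items():
        before = previous.get(precinct, 0)
        if before == 0:
            changes[precinct] = None
        else:
            changes[precinct] = 100.0 * (now - before) / before
    return changes


def rank_precincts(changes):
    """Order precincts from largest increase to largest decrease."""
    defined = [(p, c) for p, c in changes.items() if c is not None]
    return sorted(defined, key=lambda pc: pc[1], reverse=True)
```

The same arithmetic applied to the citywide homicide figures quoted above, `percent_change({"NYC": 1392}, {"NYC": 2262})`, gives 62.5 percent for 1985 to 1990.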

Leadership, Vision, and Institutional Arrangements

Atop New York City’s institutional structure sits the mayor, who is in charge of appointing and removing unelected officials from office, and hence monitoring their performance to ensure that the city’s public organizations can achieve their mandates. Part of the mayoral mandate is the ability to appoint the police commissioner, who is responsible for organizing the police force and creating its overall strategy for policing. This strategy includes responsibility for reducing crime and maintaining the reputation of the police force (City of New York, 2004).

Crime was a key issue during the 1993 NYC mayoral election. Rudy Giuliani made crime reduction one of his key campaign messages. Once he was elected mayor, he appointed William Bratton as police commissioner. Their commitment to crime reduction included the introduction and utilization of data tools (Bureau of Justice Assistance and Police Executive Research Forum, 2013). In 1995, COMPSTAT was implemented with the purpose of supporting the NYPD in its mission of fighting crime through deterrence and the relentless pursuit of criminals.8 Within the first year, Bratton replaced four of the five police chiefs with “aggressive risk takers” (Henry, 2006a: 105), and ensured that the resulting organizational structure would effectively administer resources and integrate data analytics down the chain of command.

The NYPD’s managerial structure places Commissioner Bratton at the top of the police organizational hierarchy, followed by the deputy commissioners and the police chiefs directly below him. Within two weeks of setting up the new administration, precinct commanders were given “greater authority, discretion, and organizational power” (Henry, 2006a: 104) to effectively integrate and update the new data-driven resources and management systems. Through “weekly crime control and quality of life strategy meetings,” inept or incompetent managers were identified and “more than two-thirds of the department’s 76 precinct commanders were replaced” within the first year (Henry, 2006a: 105).

8
http://www.nyc.gov/html/nypd/html/administration/mission.shtml.

The precinct commander's role is to organize the police officers throughout their allocated geographical zone, including defining the number and type of officers needed in their respective jurisdictions. The devolution of control from top management alongside the introduction of COMPSTAT led to the management paradigm shift that helped achieve the impressive reduction in crime rates. COMPSTAT functions were integrated into officers’ mainstream mandates, making them accountable to one another and management, irrespective of their administrative and jurisdictional duties.

Further, Commissioner Bratton believed that it was possible for police to be proactive, that is, to anticipate and prevent crime before it occurred. This was contrary to the prevailing view at the time. Prior to the introduction of COMPSTAT, communication with senior management was conducted through memoranda (O’Connell, 2001), and the “NYPD had no functional system in place to rapidly and accurately capture crime statistics or use them for strategic planning. Crime statistics were often three to six months old by the time they were compiled and analyzed” (Henry, 2006a: 105). The biggest shift in the organization was the devolution of power to precinct commanders and the fact that all levels of management were required to communicate in person on a weekly basis.

Moreover, the units were responsible for submitting weekly crime reports to the centralized COMPSTAT Unit located in the Chief of Department’s Office (Kelling and Sousa, 2001). This helped create evidence-based performance accountability (Nagy and Podolny, 2008). COMPSTAT was an innovative, devolution-driven managerial structure backed up by data analytics.

COMPSTAT Data-Driven Management Structure

COMPSTAT’s operational standards and procedures are based on the following main principles:9
● Availability of timely and accurate intelligence
● Rapid response
● Implementing effective tactics
● Relentless follow-up

The first two aspects require accurate crime analytics. In the absence of timely and accurate information, the NYPD would not be able to anticipate or respond promptly to crime. The high level of cooperation at every echelon of the NYPD’s management chain ensured the application of the principles listed above, which transformed the police force into “a seamless web” (Henry, 2006b). This facilitates brainstorming and innovative problem solving and results in coherent strategies and plans across each individual, unit, or function (Henry, 2006a; Yuskel, 2014). The streamlined information network facilitated a reduction in response time to emerging crime trends.

The third principle, implementing effective tactics for tackling and deterring crime, relies on the effective identification of crime patterns as well as their appropriate responses. This allows management to mobilize resources to target crime in a cost-effective way. The data identify which precincts need additional support and when to deploy more officers to a particular geographical area (Perry et al., 2013). Some datasets were georeferenced and thus enabled police to identify where criminal activity was most likely to take place. This was facilitated by community engagement campaigns, which informed adjustments to patrol routes.

The fourth principle, optimization, embodies the willingness of senior managers to delegate power to lower-ranking officers through feedback loops and follow-up (Bratton, 1996). Optimization increased the ownership of tasks by all individuals in the police force. This strategy was based on the belief that police officers had the most knowledge about conditions in their communities and thus the best vantage point to create context-specific solutions. The decentralization of management encouraged all staff to take ownership over recognizing local trends.

This feeds into the final element: “relentless follow up” (Yuskel, 2014).10 Autonomy allowed senior management to hold lower-ranking officials to account via clear performance indicators augmented by data. In some cases, this led to the removal of precinct commanders who were unable to distribute and manage their personnel and resources adequately.

9
http://nypdnews.com/2016/04/compstat-keeping-nyc-safe-an-inside-look/

Technical Considerations
Predictive Policing

Predictive policing is the application of quantitative analysis techniques to identify likely targets for police intervention and prevent crime through statistical predictions (Perry et al., 2013). This model, grounded in criminology, suggests that criminals and victims are likely to follow a common pattern. Overlaps in patterns indicate the likelihood of crime. If criminals are successful in carrying out crimes, there is a high probability they will attempt to replicate the circumstances that previously made them successful (Perry et al., 2013). Additionally, the growing body of curated data can be used to rapidly investigate suspects and victims.

Data Analysis Overview


COMPSTAT’s analytical insights were derived from existing datasets at the department’s disposal. The data used in COMPSTAT were gathered from regular police operations. Datasets include both (i) structured data, such as gender, age, and race, and (ii) unstructured data, such as witness statements and criminal records. Crime maps are key to reaping benefits from the data. GIS tools create crime maps by combining crime and general data and existing intervention initiatives. Prior to the introduction of digital data visualization tools, crime maps were created manually and contained fewer datasets.

Further, data analysts can identify hot spots through maps, which display individual crimes and the crime density of a particular location. The hot spot maps, made available to the public, show levels of heat in an area to demonstrate crime density (see Map 1).11
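
One simple way to approximate the crime density that a hot spot map displays is to count georeferenced incidents per grid cell. The Python sketch below does this under stated assumptions: coordinates are taken to be planar (x, y) pairs in arbitrary units, the square grid and the `crime_density` and `hot_spots` functions are hypothetical, and GIS tools typically use more refined methods such as kernel density estimation.

```python
import math
from collections import Counter


def crime_density(points, cell_size):
    """Count incidents per square grid cell.

    points: iterable of (x, y) coordinates in any planar unit.
    cell_size: side length of a grid cell in the same unit.
    Returns a Counter mapping (column, row) cell indices to incident counts.
    """
    counts = Counter()
    for x, y in points:
        # floor() keeps negative coordinates in the correct cell.
        cell = (math.floor(x / cell_size), math.floor(y / cell_size))
        counts[cell] += 1
    return counts


def hot_spots(counts, top_n=3):
    """Return the top_n densest cells as (cell, count) pairs."""
    return counts.most_common(top_n)
```

Cells with the highest counts would be drawn with the most "heat"; the choice of cell size controls how localized a hot spot appears.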

10
However, it should be noted that this method has been shown to have negative externalities in combating crime, most notably racial profiling
of African-Americans.
11
https://maps.nyc.gov/crime/.

Map 1. Sample Hot Spot Maps

Source: Eck et al. (2005).

Further Developments

The NYPD's use of data has greatly evolved since the introduction of COMPSTAT. The steps taken in the 1990s laid the foundation for further data innovations, including the creation of a data warehouse. This data hub allows the NYPD to centralize its data collection and storage operations (D'Amico, 2006). To improve data analysis operations, the NYPD set up the Real Time Crime Center (RTCC) with the assistance of IBM. Today, data scientists and engineers conduct crime analysis alongside officers. The cost of setting up the center was US$11 million (D'Amico, 2006). The RTCC combines NYPD data with other public datasets, including Internet searches. Information can be shared easily with other crime-fighting agencies, such as the Department of Homeland Security. The centralization of data collection, storage, and analysis allowed the NYPD to achieve economies of scale and thus ensure the cost-effectiveness of the COMPSTAT system.

Another case of public–private collaboration was the creation of the Domain Awareness System (DAS), developed by Microsoft in conjunction with the NYPD. The DAS draws data from a wide range of government agencies (Joh, 2012). DAS provides street officers with specific information about their current location via mobile devices. This software uses information collected from archived police data, privately operated CCTV cameras, and license plate readers (Dahl, 2012) to offer a holistic snapshot of their surroundings.

POLICY GUIDE

This section provides a list of the main insights that emerge from the case studies. The
recommendations are disaggregated along multiple dimensions to give policymakers
a list of the challenges they must address when implementing big data solutions in the
public sector.
The first set of recommendations relates to institutional arrangements—the host of formal
institutions and resultant factors that structure and frame the use of big data in the operations
of public organizations. The policy guide starts with leadership, vision, and policy plans,
which are the foundation for introducing data analytics into existing processes. This is followed
by governance structures, organizational structure, and regulatory frameworks. Subsequently,
the technical considerations are broken down by data sharing and privacy and protection, data
storage and collection, data analysis and interpretation tools, and technical equipment for data
processing.

Institutional Arrangements

Leadership, Vision, and Policy Plans

The cases indicate that commitment from leadership is the cornerstone of successful data implementation. Leaders must first establish a clear, comprehensive vision for the use of data that falls within a larger development plan and includes accessible procedures and incentive alignment for creators, analyzers, and users of data.

● Smart Cities: Mayor Boris Johnson began by identifying key challenges and opportunities (population growth and the ensuing strain on mass transit, London’s position as a leader in technology). He then created clear policy papers with a long-term horizon and clear objectives. The creation of special panels of experts, such as the Smart London Board in this case, can also be an effective way of ensuring that the vision becomes reality at a relatively low cost, as they provide monitoring and timely recommendations.

● Taxation: In all of the cases, governments identified the tax challenge and led the effort to address it. The respective tax authorities were key in informing and leading the effort of big data leveraging, especially in linking monitoring and auditing functions to transaction and reporting processes of tax subjects. Through the use of legal instruments (circulars, laws, and policy guidelines), the tax authorities were able to utilize government structures and mandates to create a conducive environment for database integration, cross-referencing, and collaboration with a wide range of stakeholders, an effort that enriched the electronic information available. Governments also played a leading role in correlating big data integration with wider policy agendas and Internet infrastructure, as in the case of Brazil. This informed system design, training, and support to facilitate compliance.

● Citizen Security: The leadership of the recently elected mayor and the newly appointed police commissioner aimed to not only reduce crime but also prevent it. Their strategic vision was formulated through the introduction of COMPSTAT, a management and data analytics tool for increasing police effectiveness. At weekly meetings, all precinct commanders and their staff were required to present their crime data to management for analysis, strategic planning, and defense of resource allocation. The data provided insights on crime and was also used to monitor performance, measured by a reduction in crime. Fully implementing COMPSTAT required massive personnel changes, demonstrating the political will and leadership commitment.

Governance Structures

Governance structures allow a data vision to evolve into functional application. Public agencies must be flexible and must adapt to changing needs and dynamic opportunities. The most important shared characteristic in these cases is the level of inter-institutional collaboration and information exchange. As the case studies showed, this type of collaboration is key to facilitating data sharing and the incorporation of data-driven insights into public agencies’ policy process. Removing silos and creating data environments that allow many diverse and remote sectors of government to access critical, real-time information often lead to big data initiatives in the public sector.

Governance structures should also be designed to facilitate collaboration with the private sector and academia, since these might be good sources of knowledge, technical capacities, and data. Having broad internal and public feedback that considers preferences and institutional or budgetary constraints can strengthen long-term expectations, accountability, and ownership and limit potential disruptions caused by political turnover (Corduneanu-Huci, Hamilton, and Ferrer, 2012). Incentivizing the private sector to work closely with university and research organizations to tackle complex public problems is a promising way to align big data projects with financing for public interest objectives.12

● Smart Cities: The push to encourage the exchange of information between Transport for London’s many sub-bodies is a clear indication of the desire on the part of London’s senior officials to establish a leaner but more effective governance structure. At the municipal level, Smart London’s emphasis on consulting with stakeholders (citizens, boroughs, public firms) is playing a key role in ensuring that London evolves in the direction that suits the needs and preferences of as many people as possible, increasing the potential benefits and longevity.

● Taxation: Governments faced with revenue collection optimization challenges opted to emphasize and practice multi-institutional collaborations that facilitated information exchange. Mechanisms to make targeted taxpayers aware of new standards and guidelines were utilized to ensure buy-in and compliance.

● Citizen Security: Decentralization of managerial authority was a key component of the project, encouraging all staff to take ownership over recognizing local trends. This was carried out with the intention of enabling those with years of service, experience, and familiarity with their communities to manage the complex operational problems under their charge.

12 An example of this is the D4D challenge by the telecommunications company Orange. Teams were invited to use big data gathered from mobile phones to create solutions for a variety of policy issues, such as health and transportation (Tatevossian and Yuklea, 2014). Most of the teams that submitted solutions were from academia.

Organizational Structure

The research conducted identified many disrupting organizational factors as a result of the introduction of data. These range from highly skilled data scientists with competitive salaries working alongside public servants to substantial management overhauls informed by data evidence. In all cases, integrating data creates change. However obvious that may sound, the full breadth of such changes may not always be predictable from the outset. Commitment by leadership to the data vision and articulating its complexities and potential externalities can help ameliorate any tensions that evolve without undermining the big data project.

● Smart Cities: The decision of Transport for London (TfL) to have multiple analytics teams is an effective way of ensuring that it can achieve its two main goals: providing its customers with a high level of service, and upgrading and maintaining its transportation network. Even though TfL uses big data to achieve both of these objectives, its Customer Experience Team and Planning Team are composed of individuals with different skill sets, and the insights they obtain from their analyses are destined for very different audiences. Creating multiple analytics teams that are geared toward specific goals is an effective way of increasing the quality of service provided and ensuring that all of a public body’s main functions can be achieved simultaneously.

● Taxation: In all cases, a state bureau or agency was in charge of harnessing big data for taxation. This points to the need for a core entity to provide managerial support and oversee implementation. Diverse personnel were recruited to execute technical tasks, ensure compliance with national regimes, and propose reforms where necessary. A unique factor of the United States and Brazil cases was the close collaboration of different units within the same agency (Revenue) or wider public sector (Treasury, Trade, etc.). China’s case shows an additional mandate to an existing state administrative tax entity that issued guidelines and support to provinces and taxpayers. Additionally, in the case of Brazil, a private company, Invoiceware, which operates the Global Compliance Platform, provides support and guidance for navigating Brazil’s system to strengthen skills and compliance.

● Citizen Security: Leaders overhauled management and the organizational structure of the police units under them. Four out of five police chiefs and two-thirds of precinct commanders were replaced in the first year. This change in management was further perceived as a credible commitment to removing the entrenched culture based on patronage and favoritism (Nagy and Podolny, 2008), as data on crime reduction could verify performance.

Additionally, the Transit and Housing Police Departments were merged into the NYPD in 1995 (Henry 2006a: 104) to streamline security accountability. The NYPD added data analysts to the department to work alongside street cops to aggregate and interpret real-time data. This allowed information and ideas to flow across skill sets, functional domains, and geographic areas.

Regulatory Frameworks

Regulation is integral to proper data management and usage. It requires a delicate balance that considers protection from data misuse while not stifling important sharing and innovation. Governments may be required to create new rules in unfamiliar policy space that cover ethical usage and appropriate sanctioning mechanisms for non-compliance.

The existence of open data policies and regulations is a key condition for the exploitation of big data for policy objectives. In the absence of open data regulations, mechanisms for data sharing among public institutions can also facilitate its effective use. This is particularly relevant for sensitive data that cannot be easily opened. In addition, LAC countries should discuss regulations on access to data generated by public companies and private concessionaires as part of their open data and PPP agendas.

● Smart Cities: London officials recognized the need for an open data policy framework to ensure that London’s data ecosystem is conducive to innovation and allows both private and public firms, and the general population, to take advantage of the enormous amount of data being collected and analyzed. This is an essential step in ensuring the longevity and success of the Smart London Plan, as smart city initiatives involve the collaboration of many actors. As investments in physical and human capital can be made quickly and big data systems can be established rapidly, it is crucial that the appropriate legal framework is enacted so that the growth of the technological system is not delayed by regulatory issues.

● Taxation: Through statutory and administrative instruments, reporting, recording, and tax filing standards were put in place in the United States. State Administration Tax Orders and Acts of Congress are some of the tools utilized to provide regulatory guidance and to enable compliance. Brazil relied upon provisions in the federal constitution and an existing ICT plan to roll out electronic invoicing platforms across states, a key enabler of the SPED system. China’s State Administration of Taxation Order is an example of how sanctions, including fines and jail time, are used to dissuade would-be tax evaders. Sanctions, including fines and jail time, for failure to use the systems or attempts to manipulate them may help reduce tax evasion.

● Citizen Security: There was no significant change in the regulatory framework for data usage by the NYPD because it is governed by Federal law, which ideally limits the abuse of data regardless of jurisdiction. However, there are lingering concerns about misuse of data by government agencies. The New York District Court ordered the NYPD to delete information from its online records, as it found that the department was improperly investigating Muslims in relation to terror investigations (Kredo, 2016). Acknowledging the many ways in which data can be used unlawfully requires appropriate regulation and may also necessitate an independent adjudicator to rule on privacy issues. It is recommended that security ministries implementing the use of big data educate officers regarding ethical standards of use.

Technical Considerations

Data Sharing and Privacy and Protection

Sharing
Supplying common, structured repositories for government data across disparate agencies and service areas enables more complete and accurate information sharing and learning. It does so by removing silos and creating data environments that allow diverse and remote sectors of government to access critical, real-time information. Big data’s integration into public sector decision making requires at least three infrastructure investments: “(1) a platform for organizing, storing, and making data accessible; 2) computing technology and power that can process large-scale datasets; and 3) data formats that are structured and usable” (Bertot et al., 2014: 6). This is particularly true for human service agencies, which can best serve citizens by utilizing all public information on their behalf.

Due to the vast volume of data being produced, shared, and stored, formal standards regarding format and readability are essential for reaping the benefits. Data may be stored in a number of formats, for example, CSV, XML, Excel, and others, depending on the tool creating or housing the data. It is also useful to consider the types of data that will be open to the public versus those used by the public and private sectors because of their capacity to combine and store large quantities of data. Standards on the creation and dissemination of data are essential to ensuring the accuracy of evidence and insights: data must be open and machine-readable to be easily and rapidly processed.

The long-term potential of big data requires the ability to combine massive amounts of data for algorithmic analysis. The most expensive and riskiest aspect of upgrading to big data systems is the migration of data from existing data warehouses to the cloud, where larger volumes of data can be stored. The cloud can support hundreds of billions of data points and does not require investment in assets such as servers, their cooling, or maintenance. Establishing a secure repository in the cloud for government data initiatives represents an opportunity for ‘leapfrogging’ for countries in which data warehouses are not practical.

Privacy and Protection
Decisions regarding data privacy and protection vary by community and perceived values and potential for misuse. As the name suggests, big data incorporates all types of information indiscriminately. Portions of the data being produced are personal and, even if anonymized, have the potential to become identifiable. Privacy concerns exist for individuals, firms, and government, thus requiring a comprehensive framework for protection. It is important to engage the public and other stakeholders to ensure that the debate surrounding data privacy and protection illuminates public concerns while limiting the likelihood of overly restrictive regulations (Mcdonnell, 2016).

Although there is no single vision regarding data privacy, many countries are in the process of determining appropriate standards. In the United States, for example, memoranda are used to update security protocols for data release by federal agencies, providing adequate controls to ensure that information is “resistant to tampering, to preserve accuracy, to maintain confidentiality as necessary, and to ensure that the information or service is available as intended by the Agency and as expected by users” (Bertot et al., 2014).

Protection and organization of government data require an agreed upon classification and taxonomy system with proper security protocols in place for confidential information (Chakrabarti et al., 1998). The vast majority of government data utilized in the case studies and many other examples reviewed by the authors did not include high-risk data sets that pose a threat to citizen security. While it is advisable to build out regulatory protections and standards, this should not be done at the expense of capitalizing on valuable data access and mining opportunities that can lead to collaborative solutions with firms and civil society.

The case studies highlight that personal data, such as tax documents, are not anonymized for internal purposes, as authorities need to know whom they are investigating. Tax authorities such as the Internal Revenue Service face numerous threats from hackers and must continue to invest in tougher cybersecurity systems (IRS, 2016). At the same time, individuals look to the government to help protect their personal data from misuse. The European Union is currently reviewing new legislation on data protection.

Data protection rules are typically enforced via a regulator or privacy authority. Their mandates, independence from government, and authority vary by country. The range of authority may include conducting investigations, addressing complaints, and issuing fines. Technology itself can have a role in limiting mission creep. Specific technological design considerations limit data collection to restrict illegal or unauthorized data processing, mining, and access (Privacy International, 2016). Ultimately, the government, along with consumer protection advocates and civil society as a whole, must establish a privacy framework that promotes data sharing for purposes of greater well-being and access to public goods and services while setting the boundaries against misuse of personal and sensitive information. By engaging different stakeholders in this debate, the government has the opportunity to raise and address concerns and educate the public on the value of data sharing.
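The observation that anonymized data can still become identifiable can be made concrete with a small check. The sketch below applies k-anonymity, a widely used privacy measure that the paper itself does not name: a release is k-anonymous if every combination of indirectly identifying attributes is shared by at least k records. The records, the field names, and the choice of quasi-identifiers are hypothetical.

```python
from collections import Counter


def group_sizes(records, quasi_identifiers):
    """Count how many records share each combination of quasi-identifier values."""
    return Counter(tuple(r[q] for q in quasi_identifiers) for r in records)


def smallest_group_size(records, quasi_identifiers):
    """Return the k for which the data is k-anonymous (0 for an empty dataset)."""
    sizes = group_sizes(records, quasi_identifiers)
    return min(sizes.values()) if sizes else 0


def risky_records(records, quasi_identifiers, k):
    """Return records whose quasi-identifier combination is shared by fewer than k records."""
    sizes = group_sizes(records, quasi_identifiers)
    return [
        r for r in records
        if sizes[tuple(r[q] for q in quasi_identifiers)] < k
    ]


# Hypothetical release with names removed but indirect identifiers retained.
records = [
    {"postcode": "10001", "birth_year": 1980, "sex": "F", "tax_band": "B"},
    {"postcode": "10001", "birth_year": 1980, "sex": "F", "tax_band": "C"},
    {"postcode": "10002", "birth_year": 1975, "sex": "M", "tax_band": "A"},
]
quasi_identifiers = ["postcode", "birth_year", "sex"]

print(smallest_group_size(records, quasi_identifiers))
print(risky_records(records, quasi_identifiers, k=2))
```

In this toy dataset the third record is unique on postcode, birth year, and sex, so anyone who knows those three facts about a person could recover the tax band even though no name is published. Generalizing values (for example, publishing only a birth decade) is one common way to raise k before release.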

Data Storage and Collection

Effective data storage and collection require appropriate technologies for gathering and storing data for many current and potential future uses. Governments have vast quantities of data from various departments and agencies that, once accessible to public servants, can improve service delivery. Collaboration with the private sector to create the right systems and tools is common and has cost-saving potential.

● Smart Cities: TfL continues to integrate its multiple data streams and share its wealth of information with other public bodies. It opened a strategic data center in 2009 and plans to open another one soon. As its data is currently stored in 30 different centers, centralizing its data collection operations will allow it to increase economies of scale and efficiency by investing in cutting-edge information systems in fewer locations and facilitate the process of sharing information with other organizations that use the same data center. Creating these highly efficient and centralized data storage facilities is an essential component of the Smart London Plan. It is a cost-effective way to provide multiple public bodies with the most up-to-date technologies that allow them to benefit from each other’s data.

● Taxation: Each tax case demonstrated the need for a backup and server security system to prevent platforms in which multiple users interact from being compromised. To optimize the benefits of big data, governments need to make certain requirements compulsory. These electronic systems require investment in the internet and other ICT infrastructure to provide interaction of this kind. The need for end-to-end communication between government and business enterprises calls for setup of a mutually beneficial and collaborative design to facilitate smooth use. As has been shown, big data can increase the efficiency of tax administration by supporting the audit, monitoring, and referencing functions. However, all systems must be tailored to local challenges, facilities, capabilities, and visions. Phased implementation may be required to correct errors, eliminate redundancies, and strengthen capacity.

● Citizen Security: The NYPD collaborated with IBM to develop the Real-Time Crime Center (RTCC), a data warehouse capable of storing vast amounts of data from multiple precincts and agencies. This PPP allowed the NYPD to incorporate an innovative, tailored solution despite its operational and human capital constraints.

Data Analysis and Interpretation Tools

Data analysis and interpretation tools are highly context specific and depend on the level of sophistication required of the analytic insights. Assessing current capabilities and leveraging existing resources and human capital is necessary for developing a realistic budget and ensuring long-term funding for data investments.

● Smart Cities: TfL has continuously invested in IT equipment to provide its staff with the tools they need to complement the powerful big data systems that collect and analyze data. The recent adoption of SAP’s HANA in-memory analytics software to centralize the entire data collection and analytics process showcases the quality of software that is now available and the potential of PPPs. The London Land-Use and Transport Interaction model and the London Transport Studies model, the ‘workhorses’ of TfL’s Planning Unit, demonstrate the value of creating specialized models that can process large amounts of data for various purposes. The integration of multiple data streams through the use of cutting-edge software can therefore allow one organization to fully utilize its data to achieve widely different goals, increasing the cost-effectiveness of the entire system and the benefits that it yields.

● Taxation: Big data leveraging systems need to be set within the standards of national infrastructure and protocols. This facilitates smoother cross-referencing, file transfer, and analysis. This applies to ICT equipment, Internet standards, and electronic document formats. Countries that do not have sufficient domestic capabilities to construct such databases can use international, standardized, ISO-approved invoice formats. Another option is to allow private vendors to customize the vision for users, as in the case of Invoiceware in Brazil.

● Citizen Security: In collaboration with Microsoft, the NYPD developed the Domain Awareness System (DAS), which draws data from a wide range of government agencies, current and archived police data, privately operated CCTV cameras, and license plate readers. This information is sent to officers’ mobile devices while they are on patrol.
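The cross-referencing mentioned in the Taxation item can be sketched in a few lines. The example assumes that electronic invoices arrive in a standardized format with a seller identifier and an amount, and that each taxpayer has declared a sales total for the same period. The field names, identifiers, and figures are hypothetical and do not reflect the schema of SPED or of any other system discussed in this paper.

```python
from collections import defaultdict
from decimal import Decimal


def invoiced_totals(invoices):
    """Sum invoice amounts per seller; amounts are decimal strings."""
    totals = defaultdict(Decimal)  # Decimal() is zero, so missing sellers start at 0
    for invoice in invoices:
        totals[invoice["seller_id"]] += Decimal(invoice["amount"])
    return dict(totals)


def flag_discrepancies(invoices, declarations, tolerance=Decimal("0.00")):
    """Return {taxpayer_id: invoiced minus declared} where the gap exceeds the tolerance."""
    totals = invoiced_totals(invoices)
    flagged = {}
    for taxpayer_id in sorted(set(totals) | set(declarations)):
        invoiced = totals.get(taxpayer_id, Decimal("0"))
        declared = Decimal(declarations.get(taxpayer_id, "0"))
        gap = invoiced - declared
        if abs(gap) > tolerance:
            flagged[taxpayer_id] = gap
    return flagged


# Hypothetical electronic invoices and self-declared sales totals.
invoices = [
    {"seller_id": "A", "amount": "1000.00"},
    {"seller_id": "A", "amount": "250.50"},
    {"seller_id": "B", "amount": "400.00"},
]
declarations = {"A": "1250.50", "B": "300.00", "C": "50.00"}

print(flag_discrepancies(invoices, declarations))
```

A positive gap means the invoices show more sales than the taxpayer declared, while a negative gap means sales were declared for which no invoice was received. Either case is only a prompt for review by an auditor. Shared document formats matter here because the comparison works only if every invoice exposes the same fields.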

Technical Equipment for Data Processing

Choosing the appropriate technical equipment for data processing must take into account the internal and external participants along the chain of creators and users of data. These systems can be comparatively complex and require highly trained staff to use them properly. Much of the research suggests that the data scientists, engineers, and others brought in to use this equipment are not centralized teams. Rather, they are embedded within the particular policy space or department for close collaboration with intended users.

● Smart Cities: To properly use the powerful information systems at its disposal, TfL created two teams of highly skilled and specialized individuals: the Customer Experience Unit, which works to provide TfL’s customers with personalized information to optimize their experience, and the Planning Unit, which uses predictive analytics models to identify upgrades to the transportation infrastructure. Although the exact composition of these two teams is not precisely known, they both have a number of data scientists, software engineers, urban planning experts, and customer experience personnel depending on their exact function. These analytics teams are fundamental to the overall system operation.

● Mainstream Equipment: TfL’s recent information system upgrades have prompted senior management to provide IT staff with additional training so that the majority of its IT operations remain in-house, reducing costs over the long term. The learning and development team ensures that staff stay technologically literate and can understand new challenges and opportunities generated by the new data systems. By providing all relevant staff with training, TfL ensures that its employees will become an integral part of its new data-driven structure and maximize the potential benefits that arise from using big data. It also hedges against the risk that employees may feel that analytical tools will replace the need for them.

● Taxation: Each country reviewed released standards and guidelines for formatting, procedures, and processing rules. Sub-contracted training for public servants facilitated the introduction and rollout of the new systems. In addition to tax administration staff, Brazil provided training on SPED for business owners. The United States partnered with consumer representatives, vendors, and other stakeholders to inform and build capacity in the areas of data security, identity theft, and privacy to address tax fraud issues comprehensively.

● Citizen Security: Initially, the data analytics skills required under the original COMPSTAT system were less complicated than those required by the current DAS or RTCC programs. Ongoing training is required for analytics staff, as systems such as Microsoft’s DAS require more sophisticated skill sets for operability. Officers utilizing DAS require mobile devices that enable them to send and receive information from the central command.

REFERENCES

Aggarwal, A. 2016. Managing Big Data Integration in the Public Sector. Piscataway, NJ: IGI Global.

Berst, J. 2015. Smart Cities Readiness Guide. Redmond, WA: Smart Cities Council.

Bertot, J. C. et al. 2014. “Big Data, Open Government and e-Government: Issues, Policies and Recom-
mendations.” Information Polity 19(1,2): 5–16.

Bureau of Justice Assistance and Police Executive Research Forum. 2013. “COMPSTAT: Its Origins,
Evolution, and Future in Law Enforcement Agencies.” Washington, DC: Police Executive Re-
search Forum.

Card, J. 2015. “Open Data is at the Centre of London’s Transition into a Smart City.” Retrieved March
18, 2016, from http://www.theguardian.com/media-network/2015/aug/03/open-data-london-
smart-city-privacy.

Cebula, R. and J. Feige. 2012. “America’s Unreported Economy: Measuring the Size, Growth and De-
terminants of Income Tax Evasion in the U.S.” Crime, Law and Social Change 57(3): 265–85.

Chakrabarti, S., B. Dom, R. Aggrawal, and P. Raghavan. 1998. “Scalable Feature Selection, Classifi-
cation and Signature Generation for Organizing Large Text Databases into Hierarchical Topic
Taxonomies.” The VLDB Journal 7(3): 163–78.

City of New York. 2004. New York City Charter. New York, NY: City of New York.

Corbacho, A., V. Fretes Cibils, and E. Lora (eds.). 2013. More than Revenue: Taxation as a Develop-
ment Tool. (IDB Development in the Americas). London, United Kingdom: Palgrave Macmillan
Publishers.

Corduneanu-Huci, C., A. Hamilton, and I. M. Ferrer. 2012. Understanding Policy Change: How to Apply
Political Economy Concepts in Practice. Washington, DC: World Bank.

D’Amico J. 2006. “Stopping Crime in Real Time.” Retrieved March 1, 2016, from http://www.policechief-
magazine.org/magazine/index.cfm?fuseaction=search_rs&keyword=Stopping+Crime+in+Real
+Time&x=8&y=5.

Da Silva, A., G. Passos, M. Gallo, and M. Peters. 2013. “SPED: Public Digital Bookkeeping System: Influence on the Economic-financial Results declared by Companies/SPED E Sistema Público de Escrituração Digital: Influência nos resultados econômico-financeiros declarados pelas empresas.” Revista Brasileira De Gestão De Negócios 15(48): 445–61.

Dahl, E. 2014. “Local Approaches to Counterterrorism: The New York Police Department Model.” Journal of Policing, Intelligence and Counter Terrorism 9(2): 81–97. Available at http://www.tandfonline.com/doi/abs/10.1080/18335330.2014.940815.

Eck, J., S. Chainey, J. Cameron, M. Leitner, and R. Wilson. 2005. “Mapping Crime: Understanding Hot
Spots.” Washington, DC: U.S. Department of Justice Office of Justice Programs. Available at
http://discovery.ucl.ac.uk/11291/1/11291.pdf.

Edicomgroup. 2016. “Brazilian e-Invoicing | NF-e.” Available at: http://www.edicomgroup.com/en_US/solutions/einvoicing/LATAM_einvoicing/brazilian_einvoicing.html

Feldman, O. 2015. Big Data and Big Models for a Better Customer Experience. London, United King-
dom: Transport for London.

GLA (Greater London Authority). Undated. Open Data Charter. London, United Kingdom: GLA. Avail-
able at https://londondatastore-upload.s3.amazonaws.com/OPEN-DATA-CHARTER.pdf

Henry, V. E. 2006a. “Compstat Management in the NYPD: Reducing Crime and Improving Quality of
Life in New York City.” Resource Material Series No. 68: 100–16.

_____. 2006b. “Managing Crime and Quality of Life Using Compstat: Specific Issues in Implementation
and Practice.” Resource Material 68: 117–32.

Hill, D. 2015. “London’s Booming: How the City’s Population Surged Past Pre-war Peak.” Retrieved
March 18, 2016, from http://www.theguardian.com/cities/2015/jan/09/london-booming-popula-
tion-growth-success-challenge.

Hunter, M. 2015. “Tax-refund Fraud to hit $21 Billion, and there’s Little the IRS Can Do.” Retrieved April 6, 2016, from http://www.cnbc.com/2015/02/11/tax-refund-fraud-to-hit-21-billion-and-theres-little-the-irs-can-do.html.

IBM. Undated. The Four Vs of Big Data. Available at: http://www.ibmbigdatahub.com/infographic/four-vs-big-data.

IMF (International Monetary Fund). 2011. “Supporting the Development of More Effective Tax Sys-
tems. A Report to the G-20 Working Group by the IMF, OECD, UN, and World Bank.” Avail-
able at https://www.imf.org/external/np/g20/pdf/110311.pdf.

IRS (Internal Revenue Service). 2015. “IRS, Industry, States Take New Steps Together to Fight Iden-
tity Theft, Protect Taxpayers.” IR-2015-87. June 11, 2015. Washington, DC: IRS. Available at
https://www.irs.gov/uac/newsroom/irs-and-industry-and-states-take-new-steps-together-to-
fight-identity-theft-and-protect-taxpayers.

_____. 2016. “IRS Statement on E-filing PIN.” Retrieved February 9, 2016, from https://www.irs.gov/
uac/Newsroom/IRS-Statement-on-Efiling-PIN.

Joh, E. 2014. “Policing by Numbers: Big Data and the Fourth Amendment.” Washington Law Review
89(35). Available at SSRN: http://ssrn.com/abstract=2403028

Kelling, G. L. and W. H. Sousa. 2001. Do Police Matter? An Analysis of the Impact of New York City’s
Police Reforms. New York, NY: Manhattan Institute Center for Civic Innovation.

Kredo, A. 2016. “Court Requires NYPD to Purge Docs on Terrorists Inside U.S.” Retrieved August 28,
2016, from http://freebeacon.com/national-security/court-requires-nypd-purge-docs-terrorists-
inside-us/.

Mayor of London. Undated. London Datastore. London, UK: Mayor of London. Available at https://data.
london.gov.uk.

_____. 2013. “London Infrastructure Plan 2050: A Consultation.” London, UK: Mayor of London. Avail-
able at https://www.london.gov.uk/what-we-do/business-and-economy/better-Infrastructure/
london-infrastructure-plan-2050.

_____. 2014. Smart London Plan. London, UK: Mayor of London. Available at http://www.london.gov.
uk/sites/default/files/smart_london_plan.pdf

_____. 2015. “London Infrastructure Plan 2050 Update.” London, UK: Mayor of London. Available
at https://www.london.gov.uk/what-we-do/business-and-economy/better-Infrastructure/london-
infrastructure-plan-2050.

McNeil, T. 2010. “Recovery Act of 2009: Public Housing Capital Fund: Obligations and Number of Jobs
by ZIP Code.” Cityscape 12(2): 145–47.

Nagy, A. and J. Podolny. 2008. “William Bratton and the NYPD.” Yale Case 07–015. New Haven, CT:
Yale University.

Newcombe, T. 2016. “States Use Big Data To Nab Tax Frauders.” Available at http://www.governing.
com/columns/tech-talk/gov-states-big-data-tax-fraud.html.

O’Connell, P. E. 2001. “Using Performance for Accountability: The New York City Police Department.”
In M. A. Abramson and J. M. Kamensky (eds.), Managing for Results 2002.
PricewaterhouseCoopers Endowment Series on the Business of Government. Lanham, MD: Rowman & Littlefield
Publishers.

Paranagua, P. A. 2012. “Latin America Struggles to Cope with Record Urban Growth.” Retrieved March
20, 2016, from http://www.theguardian.com/world/2012/sep/11/latin-america-urbanisation-city-
growth.

Perry, W., et al. 2013. “Predictive Policing: The Role of Crime Forecasting in Law Enforcement
Operations.” Santa Monica, CA: RAND Corporation.

POST (Parliamentary Office of Science and Technology). 2014. Big and Open Data in Transport.
London, UK: Houses of Parliament.

Privacy International. 2016. Data Protection. Available at https://www.privacyinternational.org/node/44.

Qing, L. Y. 2013. “China Rolls out Tighter Rules for e-Invoicing.” Retrieved April 6, 2016, from
http://www.zdnet.com/article/china-rolls-out-tighter-rules-for-e-invoicing/.

Rode, P., G. Floater, N. Thomopoulos, J. Docherty, P. Schwinger, A. Mahendra, and W. Fang. 2014.
“Accessibility in Cities: Transport and Urban Form.” NCE Cities Paper 03. LSE Cities. London,
UK: London School of Economics and Political Science.

Rossi, B. 2015. “How TfL will Use Data about You to Keep London Moving as its Population Soars.”
Information Age. Retrieved March 18, 2016, from
http://www.information-age.com/it-management/strategy-and-innovation/123459878/how-tfl-will-use-data-about-you-keep-london-moving-its-population-soars.

SAS (Statistical Analysis System Institute). 2016. Big Data: What it is and Why it Matters. Retrieved
March 10, 2016, from http://www.sas.com/en_us/insights/big-data/what-is-big-data.html.

Satran, R. 2013a. “Next Target of IRS Robo-Audits: Small Business.” U.S. News & World Report.
Available at http://money.usnews.com/money/personal-finance/articles/2013/05/09/next-target-of-irs-robo-audits-small-business.

_____. 2013b. “Will the Data Boom Pay Dividends?” Yahoo News. Available at
https://www.yahoo.com/news/data-boom-pay-dividends-153416832.html?ref=gs.

Shuanglin, L. 2008. “China’s Value-added Tax Reform, Capital Accumulation, and Welfare
Implications.” China Economic Review 19(2): 197–214.

Tatevossian, A. and L. Yuklea. 2014. “The Second ‘Data for Development’ (D4D) Challenge in Africa.”
New York, NY: United Nations Global Pulse. Available at http://www.unglobalpulse.org/Orange-
data-for-development-Senegal.

TfL (Transport for London). 2014. “London Land-Use and Transport Interaction Model.” London, United
Kingdom: Transport for London. Available at http://content.tfl.gov.uk/the-london-land-use-and-
transport-interaction-model.pdf.

_____. Undated. “How We are Governed.” Available at
https://tfl.gov.uk/corporate/about-tfl/how-we-work/how-we-are-governed.

Thebrazilbusinesscom. 2016. “The Brazil Business.” Retrieved April 6, 2016, from
http://thebrazilbusiness.com/article/all-about-sped.

UN-Habitat. 2012. State of Latin American and Caribbean Cities Report. Nairobi, Kenya: UN-Habitat.
Available at http://unhabitat.org/?mbt_book=state-of-latin-american-and-caribbean-cities-2.

WEF (World Economic Forum). 2012. Big Data, Big Impact: New Possibilities for International
Development. Geneva, Switzerland: WEF. Available at
http://www3.weforum.org/docs/WEF_TC_MFS_BigDataBigImpact_Briefing_2012.pdf.

White, M. D. 2012. “The New York City Police Department, its Crime Control Strategies and
Organizational Changes, 1970-2009.” Justice Quarterly 31(1): 74–95.

Winn, J. and A. Zhang. 2010. “China’s Golden Tax Project: A Technological Strategy for Reducing VAT
Fraud.” Peking University Journal of Legal Studies 4: 1–33.

World Bank. 2014. Doing Business 2015: Going Beyond Efficiency. Washington, DC: World Bank.
Available at http://www.doingbusiness.org/~/media/GIAWB/Doing%20Business/Documents/
Annual-Reports/English/DB15-Chapters/DB15-Report-Overview.pdf.

_____. 2016. World Development Report 2016: Digital Dividends Overview. Washington, DC: World Bank.
Available at http://documents.worldbank.org/curated/en/961621467994698644/pdf/102724-WDR-WDR2016Overview-ENGLISH-WebResBox-394840B-OUO-9.pdf.

Wu, R. 2013. An Overview of E-invoicing in China and the Factors Affecting Individual’s Intention to
B2C E-invoicing Adoption. Espoo, Finland: Aalto University School of Business. Available at:
http://epub.lib.aalto.fi/en/ethesis/pdf/13389/hse_ethesis_13389.pdf.

Xing, W. and J. Whalley. 2014. “The Golden Tax Project, Value-added Tax Statistics, and the Analysis
of Internal Trade in China.” China Economic Review 30: 448–58.

Yu, K. 2003. “On the Problems of Golden Tax Project.” International Taxation 3: 65–7.

Yuskel, Y. 2014. “Implementation of Compstat in Police Organizations: The Case of Newark Police
Department.” Journal of International Social Research 7(35): 774–96.

Interviews

Title | Organization | Date of Interview
Professor (Wireless Communications), Director (Telecommunications) | King’s College London | January 16, 2016
Director, LSE Cities | The London School of Economics and Political Science | January 28, 2016
MSc Candidate, Smart Cities Policy – Think Tank | The London School of Economics and Political Science | February 8, 2016
LSE Fellow, Public Management and Governance | The London School of Economics and Political Science | February 8, 2016
Associate Professor, International Development and Research Associate, Institute for Fiscal Studies | The London School of Economics and Political Science | January 2016
Digital Analytics Consultant | Deloitte | February 4, 2016
Chief Executive Officer | SolidPartner | January 22, 2016
Hadoop Expert | U.S. National Archives and Records Administration | March 14, 2016
Engagement Director | Amplero | March 3, 2016
