Aadhaar from a technology point of view. Specifically, the possibilities of identification and authentication without consent using the Aadhaar number or biometric data, and unlawful access of Aadhaar data in the central repository are examined. The analysis suggests that privacy protection in Aadhaar will require an independent third party that can play the role of an online auditor; study of several modern tools and techniques from computer science; and strong legal and policy frameworks that can address the specifics of authentication and identification in a modern digital setting.
The Aadhaar project is the world's largest national identity scheme, launched by the Government of India, which seeks to collect biometric and demographic data of residents and store these in a centralised database. To date, about 1.1 billion users have enrolled in the system. However, serious concerns have been raised over the privacy and security issues related to the Aadhaar project. In this article, we examine some of these issues from a computer science perspective.
1.1 Background
Privacy concerns relating to the Aadhaar project have been the subject of much heated debate recently (Express News Service 2016; NDTV 2016a). Positions taken by the government and Unique Identification Authority of India (UIDAI) on these issues have been ambiguous. Arguing before a bench in the Supreme
Court, the Attorney General of India has claimed that Indian
citizens have no constitutional right to privacy (PTI 2015). This
is surprising not only because there are several interpretations
of constitutional provisions and judgments to the contrary
(Bhatia 2015; Kumar 2015) but also because it contravenes conventional
wisdom and best practices in digital authentication
and authorisation systems (Diffie and Hellman 1979).
The finance minister, while getting the Aadhaar bill passed as a money bill, announced that the government presupposes privacy as a fundamental right and claimed that the bill has tightened privacy provisions when compared to what was there in the previous version (Scroll Staff 2016). However, neither the government nor the UIDAI makes it clear what precisely are the privacy concerns that are being addressed, what precisely are the methods being deployed, and why the resulting proposal is secure. The UIDAI (2014) does describe the security measures it has put in place, but does not provide an analysis of the measures with respect to perceived threat levels and potential privacy breaches. This has resulted in an overall confusion about the impact on privacy engendered by the Aadhaar project.
We thank Reetika Khera for the many discussions, and Ambuj Sagar, Narayanan Kurur and the anonymous reviewer for suggestions on improving the manuscript. The first author thanks Manoj Prabhakaran for many helpful comments and Mihir Bellare for suggesting the use of fuzzy extractors.
Shweta Agrawal (shweta.a@gmail.com) is with the Department of Computer Science and Engineering, Indian Institute of Technology, Madras. Subhashis Banerjee (suban@cse.iitd.ac.in) and Subodh Sharma (svs@cse.iitd.ac.in) are with the Department of Computer Science and Engineering, Indian Institute of Technology, New Delhi.
On the other hand, several civil society activists and social commentators (Arun 2016; Krishna 2017; Mehta 2017; Jayaram 2015; Ramanathan 2016; Vombatkere 2016; Makkar 2016; Duggal 2011; Drèze 2016) have expressed concerns about the weak privacy provisions in the Aadhaar project and the bill. However, while alerting to the possibilities of opening doors to mass surveillance, we feel that some of the commentaries have been unbounded in their criticisms and not entirely specific in
Economic & Political Weekly EPW september 16, 2017 vol liI no 37 93
SPECIAL ARTICLE
their statements of concerns. The gist of most criticisms has been that the use of biometrics and a unique identification number (UIN), storage of biometric and demographic data, and authentication trails in a central repository are necessarily unsafe. However, whether breach of privacy is inevitable and whether there may exist technological and legal provisions which can make Aadhaar safe, are important questions that have not been adequately addressed.
We note that some crucial lacunae in the identification and authentication processes of Aadhaar have been pointed out by Centre for Internet & Society (CIS 2016), which also makes several important suggestions, including implementation of recommendations of Shah (Planning Commission 2012) and Sinha (Lok Sabha Secretariat 2012) committees. Despite these, thorough analyses of the possible ways in which privacy can be breached, and possible countermeasures both from technological and legal perspectives, remain missing. In this article, we endeavour to fill in some of this gap from a technology point of view.
1.2 Perspectives on Aadhaar: Pros and Cons
At its core, the Aadhaar Act attempts to create a method for identification of individuals so as to provide services, subsidies and other benefits to them. While the effectiveness of Aadhaar to the extent claimed in preventing leakages in social welfare schemes has been questioned (Khera 2011, 2015; Zhong 2016), the advantages of computerisation and reliably maintaining eligibility and distribution records in digital forms are well accepted (Masiero 2015; Khera 2013). Any digitisation requires indexes or unique IDs, and in social welfare schemes local unique IDs like ration or job card numbers are typically used. Standardising the digital record-keeping processes across geographies and verticals and linking the local IDs with the unique national identities provided by Aadhaar are tantamount to virtually collating the different digital record tables into one. Though the digital records may still be geographically distributed, real-time access to the data, using the Aadhaar IDs as handles, can then be provided to authorised central and state agencies for audit, monitoring, analysis, and planning purposes. Thus, the Aadhaar number provides a single index across all services that may use Aadhaar.
Additionally, the Aadhaar project may provide the necessary impetus to standardisation and digitisation of other domains as well, many of which are long overdue. The Aadhaar IDs can be used to create local IDs for digitisation of new verticals easily. Even more importantly, Aadhaar can facilitate linking of local IDs in currently isolated verticals like census, education, healthcare and immunisation records, birth and death records, land records, property registration, income tax, banking, loans and defaults, police verification and law enforcement, disaster management, security and intelligence and such others.
Thus, Aadhaar may not only enable efficient design, delivery, monitoring, and evaluation of services in each domain individually but also offers the possibility of using modern data analytics techniques for finding large-scale correlations in user data that may facilitate improved design of social policy strategies and early detection and warning systems for anomalies. For example, it may be tremendously insightful to be able to correlate education levels, family incomes, and nutrition across the entire population; or disease spread with income and education.
More generally, it may enable carrying out econometric analysis, epidemiological studies, automatic discovery of latent topics, and causal relationships across multiple domains of the economy (UN Global Pulse 2012; McNabb et al 2009; Krishnamurthy and Desouza 2014; Varian 2014; Einav and Levin 2014, 2013; Athey and Imbens 2015; Kleinberg et al 2015; McBride and Nichols 2015). Indeed, extending the scope of Aadhaar from just being an identification and authentication system for social welfare schemes to a system which generates large-scale data and facilitates automated analysis and planning, can potentially lead to far-reaching benefits.
At the same time, apart from the concerns of loss of privacy and civil liberties, the Aadhaar project has attracted considerable criticism for causing significant disruptions and exclusions in social welfare schemes (Johari 2016; NDTV 2016a, b; Drèze 2016; Yadav 2016a, b; Khera 2016; Somanchi et al 2017), both due to careless deployment and uncertainties in biometric matching.
We believe that all the above issues, both for and against, require careful analysis and rigorous evaluation; and that the technological, legal, and policy frameworks need to be considerably strengthened through debates and informed choices to evolve an effective national identity scheme.
1.3 Our Goal
In the modern digital era, privacy protection does not demand that data should not be collected, stored, or used, but that there should be provable guarantees that the data cannot be used for any purpose other than those that have been approved. Recent advances in the discipline of computer science offer several novel and powerful solutions to address many of the privacy and security challenges posed by the Aadhaar project. Our goal is to carefully examine the security concerns, survey the technological tools that may aid us, and provide a first order analysis of what might be feasible.
Our approach is as follows. We first capture the functionality desired by the Aadhaar project. Next, we analyse the security risks and vulnerabilities engendered by each entity and each communication link in the Aadhaar model. We examine the security measures proposed by UIDAI and discuss where these may be lacking. We elucidate recent tools from computer science, particularly from the fields of cryptography and security, which may assist in providing safeguards: this puts some stated concerns to rest while simultaneously raising multiple unforeseen issues.
2 The Aadhaar Model
In this section, we describe the various entities involved in Aadhaar and their interdependencies, which will enable us to reason about its privacy and security requirements. The Aadhaar authentication and identity verification system comprises the following entities (UIDAI 2016b):
authentication packets for authentication, and receives the authentication results.
[Figure 1: The Aadhaar Authentication Framework. Diagram label: Aadhaar User]
Unapproved profiling, tracking, and surveillance of individuals should not be possible. There should be sufficiently strong measures to prevent such breaches in privacy, with user-verifiable proof of the same. The technical implementation of privacy and security must be provably correct with respect to the legal framework. The legal framework, in turn, needs to be suitably enhanced with special provisions to protect the privacy of individuals and society in an advanced information technology setting.
3.3 Possible Ways of Breach of Privacy
In what follows, we briefly examine the various ways in which the privacy of an individual can be compromised in a setting such as in Aadhaar.
Correlation of identities across domains: It may become possible to track an individual's activities across multiple domains of service (AUAs) using their global Aadhaar IDs which are valid across these domains. This would lead to identification without consent.
Identity theft: This may happen through leakage of biometric and demographic data, either from the central repository, or from a POS or enrolment device.
Identification without consent using Aadhaar data: There may be unauthorised use of biometrics to illegally identify people. Such violations may include identifying people by inappropriate matching of fingerprint or iris scans or facial photographs stored in the Aadhaar database, or using the demographic data to identify people without their consent and beyond legal provisions.
Illegal tracking of individuals: Individuals may be tracked or put under surveillance without proper authorisation or legal sanction using the authentication and identification records and trails in the Aadhaar database, or in one or more AUAs' databases. Such records will typically also contain information on the precise location, time, and context of the authentication or identification and the services availed.
We wish to emphasise that insider attacks are the most dangerous threats in this context. For instance, the last three attacks above are much more likely if the attacker can collude with an insider with access to various components of the Aadhaar system.
3.4 Requirement Analysis for Privacy Protection
In view of the above, effective privacy protection not only requires protecting the Aadhaar system from external attacks but from internal attacks as well. This requires strong guarantees on securing the data, logs and the transaction trails in the Aadhaar and the AUA systems.
UIDAI cannot be trusted against possible system hacks, insider leaks, and tampering of authentication records and audit trails. Indeed, the identity verification and authentication providing applications running on UIDAI computer systems should be trustworthy even when the UIDAI systems and the network cannot be trusted.
Manual inspection of user data, authentication records, and audit trails should not be allowed. In special cases of properly authorised investigations, such inspections may only be possible through pre-approved, audited, and provably tamper-proof computer programmes, and an accurate tamper-proof record of the entire investigation and digitally signed authorisation chain must be maintained at all times.
The enrolment agencies and the enrolment devices cannot be trusted from data privacy and security points of view; neither can the POS devices and various AUAs, whether government or private, be trusted for data protection.
AUAs cannot be trusted with biometric and demographic data; neither can they be trusted with sensitive user data of private nature (for example, medical and immunisation records, etc). All provisions of data privacy and security that apply to UIDAI must also apply to the AUAs. Strong legal and policy frameworks are required to ensure this.
It should not be possible to correlate identities across application domains, except on suitably anonymised data through pre-approved, audited, and provably tamper-proof computer programs for carrying out data analysis.
In what follows, we discuss the various threats and vulnerabilities that result from the Aadhaar project in more detail and analyse the measures adopted by the UIDAI against these. We also suggest a few possibilities of enhancing the privacy and security protections.
4 Authentication without Consent
As we have already discussed, authentication without consent should not be possible under any circumstances. Additionally, it should be possible to revoke an authentication credential in case it is compromised, with the identity of the individual remaining intact. UIDAI defines Aadhaar authentication as follows:
Aadhaar authentication is the process wherein Aadhaar number, along with other attributes (demographic/biometrics/OTP) is submitted to UIDAI's Central Identities Data Repository (CIDR) for verification; the CIDR verifies whether the data submitted matches the data available in CIDR and responds with a Yes/No. No personal identity information is returned as part of the response. (UIDAI 2016a)
The UIDAI (2016a) goes on to define five types of Aadhaar-based authentication:
Type 1 authentication: Through this offering, service delivery agencies can use Aadhaar authentication system for matching the Aadhaar number and demographic attributes (name, address, date of birth, etc) of a resident.
Type 2 authentication: This offering allows service delivery agencies to authenticate residents through OTP delivered to their mobile number and/or email address present in CIDR.
Type 3 authentication: Through this offering, service delivery agencies can authenticate residents using one of the biometric modalities, either iris or fingerprint.
Type 4 authentication: This is a two-factor authentication offering with OTP as one factor and biometrics (either iris or fingerprint) as the second factor for authenticating residents.
Type 5 authentication: This offering allows service delivery agencies to use OTP, fingerprint, and iris together for authenticating residents.
Thus, we see that authentication is implemented in Aadhaar via the mechanisms of passwords and biometric information. However, in the usage of biometrics, we believe there is an implicit confusion between the concepts of identity verification and authentication. In the above usage, biometric information is used for authentication relying on the unstated assumption that this information is private. However, we argue that biometric data is public: for instance, people's fingerprints can be lifted without their consent from a variety of objects that they may touch and their iris data may be picked up by high resolution, directional cameras from a distance. Even DNA information can be obtained from the objects that users may touch (Houck and Houck 2008). Hence, fraudulent presentation of biometric data for authentication, without conscious participation by a user, is a definite possibility (Akhtar 2012).
Another difficulty with using biometrics as authentication credentials is that revoking biometrics like fingerprints or iris for a compromised user is problematic.1
The analysis in the prior section leads us to conclude that the usage of only biometrics in the context of Aadhaar authentication (Type 3 authentication above) has significant problems. Type 1 authentication is susceptible to the same problem, since it also uses public information for authentication. It will be necessary to use other factors, like trustworthy manual oversight, in conjunction with these modalities for authentication. The other types use at least one private modality and are hence safe.
We note that biometrics can certainly be very useful for identity verification. A careful case analysis must be performed to delineate whether identity verification or authentication is required in any given context, and UIDAI should appropriately change its authentication architecture to account for the above. Also, the legal and policy frameworks must make a clear distinction between authentication and identity verification.
5 The Aadhaar Number and the Possibility of Identification without Consent
The Aadhaar number is at the heart of the Aadhaar scheme and is one of the biggest causes of concern. Recall that the Aadhaar number is a single unique identifier that must function across multiple domains.
Given that the Aadhaar number must necessarily be disclosed for obtaining services, it becomes publicly available, not only electronically but also often in human readable forms as well, thereby increasing the risk that service providers and other interested parties may be able to profile users across multiple service domains. Once the Aadhaar number of an individual is (inevitably) known, that individual may be identified without consent across domains, leading to multiple breaches in privacy (Makkar 2016; CIS 2016; LSE 2005).
Another worrisome issue is that of identity theft, and its potential for damage now increases manifold. As an illustrative example, let us consider the United States (US) Social Security Number (SSN) (SSA 2017). The primary difference between Aadhaar and SSN is that the SSN does not have any biometric identifier attached and it does not support authentication. The SSN associated with a person provides a single interface to the person's dealings with a vast number of public and private bodies, very similar to how the usage of the Aadhaar number is being envisaged. While this facilitates use of administrative data for useful data analytics (McNabb et al 2009), the ease of obtaining the SSN from across public and private databases also results in an extremely high number of identity theft cases in the US (LSE 2005: 100).
The UIDAI does acknowledge the possibility of breach of privacy that can arise due to the use of a single identifier across multiple domains and recommends that the AUAs should use only domain specific identifiers in their dealings with people (UIDAI 2011: 7). Examples of domain specific identifiers are bank account numbers, passport numbers, driving licence numbers, ration card numbers, etc. The UIDAI mandates that the AUAs should maintain a mapping between their domain specific identifiers and the global Aadhaar numbers at their back end. The UIDAI does not maintain any such mapping and assumes that there cannot be any breach of privacy from the UIDAI because the mappings are unidirectional.
This, however, does not fully mitigate the risks, and the possibility of leakage of the Aadhaar number from an AUA, either from the database, or during know your customer (KYC) processes, or even while availing services, cannot be ruled out. In particular, there appear to be no safeguards or even guidelines, either technical or legal, on how the Aadhaar number should be maintained and used by various AUAs in a cryptographically secure way, and how to prevent the Aadhaar number of an individual from becoming public. In fact, in many of the schemes that require Aadhaar authentication, it is necessary to provide the Aadhaar number as a public identifier which violates UIDAI's own recommendations. With such weak provisions, identification without consent and correlation of identities across application domains without approval remain as real possibilities. Additionally, since the Aadhaar number is supposed to be valid for life (UIDAI 2011), it cannot easily be revoked in case of an identity theft or if the Aadhaar number is compromised in any other way.
Thus, linking individuals across domains with a global identifier for legitimate data analysis, and the possible loss of privacy because of the correlation of identity across domains that such a global identifier facilitates, are conflicting requirements. An alternative and more principled strategy to resolve the conflict would be for the UIDAI to issue different local identifiers (different Aadhaar numbers) for different domains, but to cryptographically embed into all local identifiers a unique master identifier. Several alternatives are possible. One may design the identifiers so that no linking across domains is
possible at all and it is impossible to isolate the global signature from any of the local identifiers. The linking then becomes unidirectional, but in the reverse direction to what UIDAI has currently suggested.
Alternatively, one may allow limited linking across domains, either bidirectional or even unidirectional. The London School of Economics and Political Science (LSE 2005) identity report actually suggests such a scheme. Correlation across multiple domains using the master identifier, through cryptographically secure and pre-approved data analytics software, will always be possible in such a scheme. Sufficiently strong cryptographic measures should be used to embed the master identifier into the local ones to protect against possible external correlation attacks. Also, a major shift in the policy framework is necessary to reverse the direction of linking.
6 Protection of User Data
In Section 2, we discussed that a major threat to privacy of users arises from the possibility of insider attacks. In this section, we discuss the possibilities of securing Aadhaar from such threats.
6.1 Threat Levels
In what follows, we outline the various levels of threat that are possible and measures that can be taken in each case.
Among others, this scenario is common in internet banking, where the application and authentication servers are usually the same; in campus networks, where snooping and attacks are fairly common; and in various internet and mobile application-based services that use Google or Facebook for authentication. The basic security requirements in such situations are that the authentication servers and the application servers must authenticate themselves to each other and to the clients, to prevent possible man-in-the-middle attacks (Wikipedia 2016f); and user credentials and other critical data must never travel over the network in unencrypted form. The above requirements can be met via a slew of known techniques, almost all of which rely on public key cryptography (PKI) (Wikipedia 2016h).
This is a more challenging security situation where, in addition to the above, one also has to worry about data leaks from the servers, either due to hacking or even due to insider leaks. Some common countermeasures are: (i) the authentication servers must never store any user credentials and may only store a Hash (Wikipedia 2016a), a value computed from user credential(s) using a non-invertible function, and use it for matching. Then, user credentials can never leak; (ii) all critical data, records and logs must be stored only in encrypted forms on the servers. The decryption keys should not be easily accessible; and (iii) there must be provisions for tamper detection for both data and programs.
Popular solutions to realise the above-mentioned countermeasures, such as secure hash algorithms (SHA-n) (Wikipedia 2016i, a) and Kerberos authentication protocol (Wikipedia 2016d) do exist and are frequently employed.
In even stricter situations, one may require in addition that the authentication servers must never store any information about user credentials, not even a hash. Also, no process at the authentication servers should be able to glean any information whatsoever about user credentials from the information exchange during an authentication process. Stronger guarantees for tamper detection should be employed. In particular, the authentication and other servers must be able to prove to any designated auditor that they have not been tampered with and are running only pre-approved and inspected computer programs. The servers must also be able to prove that none of their data, including records and log files, have been manually inspected or modified.
In almost all internet applications, including banking, it is tacitly assumed that the client access devices (mobiles and handhelds, laptops and desktop computers) are trusted, and the responsibility of data protection in these devices is passed on to the users. However, in special situations where the access devices are not owned by the users but are supplied by service providers, the users may have a right to be assured that data and credentials cannot be compromised from the access devices. Examples of such access devices are ATMs, Aadhaar enrolment stations, and other POS terminals. In all such cases, one may require that a client terminal or a POS device must be able to prove at all times to the server, and also to any approved third party auditor, that it has not been tampered with and does only what it is supposed to do. It should also be able to provide such a proof to a discerning user.
6.2 Analysis of UIDAI Measures
The security and privacy infrastructure of UIDAI has the following main features (UIDAI 2014):
(i) There is 2048-bit PKI (Wikipedia 2016h) encryption of biometric data in transit and end-to-end encryption from enrolment/POS to CIDR.
(ii) There are trusted network carriers (ASAs) between CIDR and AUAs. Effective precaution has been taken against denial of service (DOS) attacks.
(iii) HMAC (Wikipedia 2016c) based tamper detection of PID (personal identity data) blocks, which encapsulate biometric
and other data at the field devices, is one of the security features of the UIDAI infrastructure.
(iv) There is registration and authentication of AUAs.
(v) Within CIDR, only a SHA-n Hash (Wikipedia 2016i) of Aadhaar number is stored.
(vi) Audit trails are stored SHA-n encrypted (Wikipedia 2016o), possibly also with HMAC (Wikipedia 2016c) based tamper detection.
(vii) Only hashes of passwords and PINs are stored. Biometric data are stored in original form though.
(viii) Authentication requests have unique session keys and HMAC (Wikipedia 2016c). There is protection against replay attacks.
(ix) Resident data is stored using 100-way sharding (horizontal partitioning) (Wikipedia 2016j). First two digits of Aadhaar number are used as shard keys.
(x) All enrolment and update requests link to partitioned databases using RefIDs (coded indices).
(xi) All system accesses, including administration, are through a hardware security module (HSM) (Wikipedia 2016b) which maintains an audit trail.
(xii) All analytics are carried out only on anonymised data.
While these measures appear to be quite reasonable against external attacks, they may not be enough to forestall insider attacks. Though the safeguards adequately address the threat scenario, they are not adequate for the threat levels described in Section 6.1. For something as important as the national identity project, one will have to assume that the biggest security and privacy threats come from insider leaks. These include possible unauthorised and surreptitious examination of data, transaction records, logs and audit trails by personnel with access, leading to profiling and surveillance of targeted groups and individuals, perhaps at the behest of interested and influential parties in the state machinery itself. Hence, one would ideally like to have provisions to guard against the threat levels described in Section 6.1.
There are a number of apparent weaknesses in the system. Most of the security measures are based on cryptographic encryption techniques that require cryptographic keys to decode. Protection of these keys is of great importance, and it is necessary to have suitable measures to do so. Currently, we do not find mention of any such measures, and we believe that assuming trust in this context is a significant vulnerability.
We do not believe that HSMs (Wikipedia 2016b), which are also under the administrative control of the same organisa-
There appears to be no proper tamper detection and runtime audit of the field devices, including enrolment stations, to ensure that they are functioning true to specifications, and that there is no possibility of data leakage from the field devices. Without such measures it will have to be assumed that leakage of data is always possible.
Finally, we note that user biometric data are stored in the central repository, perhaps encrypted, but this still violates an important safeguard that we mentioned in Section 6.1 that user credentials should never be stored on the server. Unless there are some specific reasons to store the original biometric data, it may be safer to store only non-invertible intermediate representations which are sufficient for matching (Tulyakov et al 2005; Dodis et al 2004).
6.3 Possible Measures against Insider Attacks
Our starting point is that the environment in which the CIDR programs (code) are executed cannot be assumed to be trusted. One must address the possibility that the attacker has full access to the computer programmes that may be running on the UIDAI database. This may include both the source code and the runtime environment.
How can one hope to secure such a system against insider attacks? We believe that two independent lines of defence are required: First, there has to be an independent third party that can play the roles of an online auditor and keeper of cryptographic keys; and second, several modern tools and techniques from computer science offer (partial) solutions to these problems. These need to be studied, evaluated and appropriately deployed. In what follows, we briefly describe each of these.
Note that although critical data and transaction logs are maintained encrypted within the UIDAI, the decryption keys are also stored in the UIDAI systems. Since the decryption must happen routinely, the computer programs running in the UIDAI systems must be able to access these keys. There is no reason to believe that these keys cannot be retrieved with the collusion of multiple parties within the UIDAI in which case the data may be illegally accessed.
Distributed key management: At least a part of every crucial decryption key must remain with the third party, and a distributed key management protocol (Wikipedia 2016e) must be put in place. The third party must share the portion(s) of the key(s) it holds with a corresponding computer program in the CIDR at run-time, through a secure channel, only after
tion, offer adequate protection against insider attacks for authenticating the genuineness of the program using a secure
something as crucial as the national identity verification and certificate and verifying that the program has not been
authentication system. tampered with.
There appears to be no well-defined and cryptographically
sound approval procedure for data inspection, whether for in- Audit and approval of UIDAI programs: To enable the above,
vestigation or for analytics. This makes the system extremely it will be necessary for the auditor to examine, approve and
open to abuse. There appears to be no well-defined procedure cryptographically sign every program that may run in the
for audit and approval of various UIDAI programs and software. CIDR. Thereafter, these programs should periodically during
In particular, one would like to be able to establish that the run-time and on demand cryptographically prove to the
programs have not been tampered with and are doing precisely auditors programs that they are genuine and have not been
what they are supposed to do. tampered with.
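The key-release rule described under "Distributed key management" and "Audit and approval of UIDAI programs" can be illustrated with a minimal sketch. It is not a description of any UIDAI or auditor system. All names here (`split_key`, `combine_shares`, `Auditor`) are hypothetical, the key is split by a simple two-party XOR scheme that we assume for illustration, and the auditor is handed the program bytes directly, whereas a real deployment would rely on signed measurements over a secure channel.

```python
import hashlib
import secrets


def split_key(key: bytes):
    """Split a key into two XOR shares; neither share alone reveals the key."""
    share_a = secrets.token_bytes(len(key))
    share_b = bytes(k ^ a for k, a in zip(key, share_a))
    return share_a, share_b


def combine_shares(share_a: bytes, share_b: bytes) -> bytes:
    """Recombine the two XOR shares into the original key."""
    return bytes(a ^ b for a, b in zip(share_a, share_b))


class Auditor:
    """Hypothetical third party: approves programs and holds one key share."""

    def __init__(self, key_share: bytes):
        self._share = key_share
        self._approved = set()  # SHA-256 digests of approved program code

    def approve(self, program_code: bytes) -> None:
        """Record the digest of a program that the auditor has examined."""
        self._approved.add(hashlib.sha256(program_code).hexdigest())

    def release_share(self, program_code: bytes) -> bytes:
        """Hand over the share only if the code matches an approved digest."""
        digest = hashlib.sha256(program_code).hexdigest()
        if digest not in self._approved:
            raise PermissionError("program is not approved or has been modified")
        return self._share
```

In this sketch the decryption key exists in full only after an approved, unmodified program has obtained the auditor's share; a program that differs by even one byte is refused. It does not address the residual risk that the recombined key then sits in memory.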
Economic & Political Weekly EPW september 16, 2017 vol liI no 37 99
Audit of data inspection: All data inspection, including those through special purpose programs for data analytics, should be digitally approved by the auditor.

There have to be proper legal provisions for setting up such online third-party audit and key-management systems.

Even with the above measures in place, the complete decryption keys will have to reside in the memory of the UIDAI computer systems at some point during the execution. A well-trained system administrator, with access to the hardware and the operating system, may still be able to access the decryption keys from the system's memory. There are a variety of tools in computer science that may provide a defence against such attacks at the time of execution. We describe some of them below.

Storing hash of biometric data: Since the Aadhaar database stores sensitive biometric data of individuals, a useful strategy to protect this data is to store only a non-invertible hash of biometric data, which converts a string representing biometric data to a nearly uniform random string which does not leak any information about the individual. Some techniques to achieve this are fuzzy extractors (Dodis et al 2004) and symmetric hashing (Tulyakov et al 2005).

Tamper-proof code: A significant cause of concern is that a malicious insider may be able to modify the code so that it behaves arbitrarily. Such attacks are dangerous not just in terms of denial of service but also because arbitrary behaviour may lead to leakage of secrets embedded in the code.

Third-party audit will be required to set up the processes to ensure that the code is tamper free. The third-party auditors can rely on known practices in the formal verification and validation literature (such as CFI, model checking, static code analysis, etc) (Wikipedia 2016k) to realise the countermeasures sought.

Homomorphic and functional encryption: Another security threat is the possibility of server breaches, whether the attack is launched from inside or outside the organisation. To prevent a server breach from leaking valuable user data, critical data needs to be stored on the server in an encrypted form. In order to perform analytics directly over encrypted data, one could resort to homomorphic and functional encryption techniques (Sahai and Waters 2005; Gentry 2009) or symmetric searchable encryption (Bellare et al 2007; Curtmola et al 2011).

White-boxing and code obfuscation: Another useful class of defences against insider attacks comes from techniques developed in the area of white-box cryptography. Typically, one assumes that attacks are black box, that is, an attacker has access to the input and the output of a program, but not to the internal workings of the program. However, an insider may have full access to the source code and the binary file running on the system, and also the corresponding memory pages during execution. Additionally, the attacker can also possibly make use of debuggers and emulators, intercept system calls, and tamper with the binary and its execution. Such attacks are called white-box attacks, and white-box cryptography (Wyseur 2009) aims to implement cryptographic procedures in software that transform and obfuscate code and data in such a way that the cryptographic assets remain secure even when subject to white-box attacks.

6.4 Securing Field Devices
Finally, client access devices (or POS devices) can broadly be understood to have the same critical components that CIDR servers have: hardware (the device itself) and the application(s) running on the device. Solutions to secure client devices are no different from the solutions for servers that we discussed above.
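To make the idea of analytics over encrypted data from Section 6.3 concrete, the sketch below shows a server adding two encrypted values without ever decrypting them. It is an assumption-laden illustration and not the fully homomorphic construction cited above (Gentry 2009): we swap in the much simpler Paillier cryptosystem, which supports only addition, and use toy primes that offer no security. The function names (`keygen`, `encrypt`, `decrypt`, `add_encrypted`) are hypothetical.

```python
import math
import secrets

# Toy primes for illustration only; real use needs primes of 1024+ bits.
P, Q = 1009, 1013


def keygen(p: int = P, q: int = Q):
    """Return (public modulus n, private key (lam, mu)) with generator g = n + 1."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p - 1, q - 1)
    mu = pow(lam, -1, n)  # modular inverse; this shortcut holds for g = n + 1
    return n, (lam, mu)


def encrypt(n: int, m: int) -> int:
    """Encrypt m (taken modulo n) under public modulus n with fresh randomness."""
    n2 = n * n
    while True:
        r = secrets.randbelow(n - 1) + 1  # r in 1..n-1
        if math.gcd(r, n) == 1:
            break
    return (pow(n + 1, m % n, n2) * pow(r, n, n2)) % n2


def decrypt(n: int, priv, c: int) -> int:
    """Recover the plaintext from ciphertext c using the private key."""
    lam, mu = priv
    u = pow(c, lam, n * n)
    return ((u - 1) // n * mu) % n


def add_encrypted(n: int, c1: int, c2: int) -> int:
    """Multiply ciphertexts; the result decrypts to the sum of the plaintexts."""
    return (c1 * c2) % (n * n)
```

A party holding only the modulus `n` can call `add_encrypted` to aggregate ciphertexts, while only the holder of the private key can read the total. Richer analytics would need the fully homomorphic or functional schemes cited in the text.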
Note
1 We note that there is a notion of cancellable biometrics, but this is still in the research domain (Patel et al 2015; Tulyakov et al 2005) and may not yet integrate well with commercial matching software.

References
Akhtar, Zahid (2012): Security of Multimodal Biometric Systems against Spoof Attacks, PhD Diss, Department of Electrical and Electronic Engineering, University of Cagliari, https://pralab.diee.unica.it/sites/default/files/Akhtar_PhD2012.pdf.
Arun, Chinmayi (2016): Privacy Is a Fundamental Right, Hindu, 18 March, http://www.thehindu.com/opinion/lead/lead-article-on-aadhaar-bill-by-chinmayi-arun-privacy-is-a-fundamental-right/article8366413.ece.
Athey, Susan and Guido W Imbens (2015): Machine Learning for Estimating Heterogeneous Causal Effects, Working Paper No 3350, Stanford University, https://www.gsb.stanford.edu/faculty-research/working-papers/machine-learning-estimating-heretogeneous-casual-effects.
Bellare, Mihir, Alexandra Boldyreva and Adam O'Neill (2007): Deterministic and Efficiently Searchable Encryption, Advances in Cryptology – CRYPTO 2007, pp 535–52.
Bhatia, Gautam (2015): Sorry, Mr Attorney-General, We Do Actually Have a Constitutional Right to Privacy, Wire, 28 July, https://thewire.in/7398/sorry-mr-attorney-general-we-do-actually-have-a-constitutional-right-to-privacy/.
CIS (2016): List of Recommendations on the Aadhaar Bill, 2016: Letter Submitted to the Members of Parliament, Centre for Internet & Society, https://cis-india.org/internet-governance/blog/list-of-recommendations-on-the-aadhaar-bill-2016.
Costan, Victor and Srinivas Devadas (2016): Intel SGX Explained, IACR Cryptology ePrint Archive, 86, https://eprint.iacr.org/2016/086.pdf.
Curtmola, Reza, Juan Garay, Seny Kamara and Rafail Ostrovsky (2011): Searchable Symmetric Encryption: Improved Definitions and Efficient Constructions, Journal of Computer Security, Vol 19, No 5, pp 895–934.
Diffie, Whitfield and Martin E Hellman (1979): Privacy and Authentication: An Introduction to Cryptography, Proceedings of the IEEE, Vol 67, No 3, pp 397–427, http://ieeexplore.ieee.org/stamp/stamp.jsp?arnumber=1455525.
Dodis, Yevgeniy, Leonid Reyzin and Adam Smith (2004): Fuzzy Extractors: How to Generate Strong Keys from Biometrics and Other Noisy Data, International Conference on the Theory and Applications of Cryptographic Techniques, Switzerland, Conference proceedings EUROCRYPT 2004, pp 523–40.
Drèze, Jean (2016): The Aadhaar Coup, http://www.thehindu.com/opinion/lead/jean-dreze-on-aadhaar-mass-surveillance-data-collection/article8352912.ece.
Duggal, Pavan (2011): Does the UID Project Infringe on Privacy?, http://www.business-standard.com/article/opinion/does-the-uid-project-infringe-on-privacy-111080300006_1.html.
Einav, Liran and Jonathan D Levin (2013): The Data Revolution and Economic Analysis, Working Paper 19035, National Bureau of Economic Research, http://www.nber.org/papers/w19035.
Einav, Liran and Jonathan D Levin (2014): Economics in the Age of Big Data, Science, Vol 346, No 6210, http://science.sciencemag.org/content/346/6210/1243089.
Express News Service (2016): Aadhar Bill Passed in Lok Sabha, Opposition Fears Surveillance, Indian Express, 12 March, http://indianexpress.com/article/india/india-news-india/aadhar-card-uid-bill-lok-sabha-arun-jaitley/.
Gentry, Craig (2009): Fully Homomorphic Encryption Using Ideal Lattices, Proceedings of the Forty-first Annual ACM Symposium on Theory of Computing (STOC 2009), pp 169–78.
Houck, Max and Lucy Houck (2008): What Is Touch DNA?, Scientific American, http://www.scientificamerican.com/article/experts-touch-dna-jonbenet-ramsey/.
Jayaram, Malavika (2015): Aadhaar Debate: Privacy Is Not an Elitist Concern – It's the Only Way to Secure Equality, Scroll.in, 15 August, http://scroll.in/article/748043/aadhaar-debate-privacy-is-not-anelitist-concern-its-the-only-way-to-secure-equality.
Johari, Aarefa (2016): In Drought-hit Saurashtra, Poor Internet Network Can Often Mean No Food Rations, Scroll.in, 29 June, http://scroll.in/article/810683/in-drought-hit-saurashtra-no-internetcan-often-mean-no-food-rations.
Khera, Reetika (2011): The UID Project and Welfare Schemes, Economic & Political Weekly, Vol 46, No 9.
Khera, Reetika (2013): Lessons from the East Godavari Pilot, Hindu, 11 April, http://www.thehindu.com/opinion/lead/lessons-from-the-east-godavari-pilot/article4603273.ece.
Khera, Reetika (2015): Five Myths about Aadhaar, Outlook, 18 September, http://www.outlookindia.com/website/story/five-myths-about-aadhar/295364.
Khera, Reetika (2016): Aadhaar-enabled Exclusion and Corruption, Deccan Herald, 27 November, http://www.deccanherald.com/content/583315/aadhaar-enabled-exclusion-corruption.html.
Kleinberg, Jon, Jens Ludwig, Sendhil Mullainathan and Ziad Obermeyer (2015): Prediction Policy Problems, American Economic Review, Vol 105, No 5, pp 491–95, http://www.aeaweb.org/articles?id=10.1257/aer.p20151023.
Krishna, Gopal (2017): Will Aadhaar Cause Death of Civil Rights?, Business Today, 23 March, http://www.businesstoday.in/magazine/columns/will-aadhaar-cause-death-of-civil-rights/story/248331.html.
Krishnamurthy, Rashmi and Kevin C Desouza (2014): Big Data Analytics: The Case of Social Security Administration, Information Polity, Vol 19, pp 165–78, http://ssrn.com/abstract=2757871.
Kumar, Ashwani (2015): Privacy, a Non-negotiable Right, Hindu, 10 August, http://www.thehindu.com/opinion/lead/privacy-a-nonnegotiable-right/article7519148.ece.
Lok Sabha Secretariat (2011): The National Identification Authority of India Bill, 2010, Standing Committee on Finance (2011–12), 42nd Report, Ministry of Planning, www.prsindia.org/uploads/media/UID/uid%20report.pdf.
LSE (2005): The Identity Project: An Assessment of the UK Identity Cards Bill and Its Implications, The London School of Economics and Political Science, http://eprints.lse.ac.uk/684/.
Makkar, Sahil (2016): Aadhaar Is Actually Surveillance Tech: Sunil Abraham, Business Standard,
12 March, http://www.business-standard.com/article/opinion/aadhaar-is-actually-surveillance-tech-sunil-abraham-116031200790_1.html.
McNabb, Jennifer, David Timmons, Jae Song and Carolyn Puckett (2009): Uses of Administrative Data at the Social Security Administration, Social Security Bulletin, Vol 69, No 1, https://www.ssa.gov/policy/docs/ssb/v69n1/v69n1p75.html.
Masiero, Silvia (2015): PDS Computerisation: What Other States Can Learn from Kerala, Ideas for India, 6 July, http://www.ideasforindia.in/article.aspx?article_id=1474.
McBride, Linden and Austin Nichols (2015): Improved Poverty Targeting through Machine Learning: An Application to the USAID Poverty Assessment Tools, Economics That Really Matters, http://www.econthatmatters.com/wp-content/uploads/2015/01/improvedtargeting_21jan2015.pdf.
Mehta, Pratap Bhanu (2017): Big Brother Is Winning, Indian Express, 8 February, http://indianexpress.com/article/opinion/columns/digitisation-power-of-state-surveillance-transparency-4513022/.
NDTV (2016a): Truth v Hype: Aadhaar's One Billion Challenge, NDTV, 9 April, http://www.ndtv.com/video/news/truth-vs-hype/truth-vs-hype-aadhaar-s-one-billion-challenge-411279.
NDTV (2016b): , NDTV, 16 July, http://khabar.ndtv.com/video/show/ndtv-special-ndtv-india/what-should-they-do-who-dont-get-ration-423998.
Patel, V M, N K Ratha and R Chellappa (2015): Cancelable Biometrics: A Review, IEEE Signal Processing Magazine, Vol 32, No 5, pp 54–65.
Planning Commission (2012): Report of the Group of Experts on Privacy, Chaired by Justice A P Shah, http://planningcommission.nic.in/reports/genrep/repprivacy.pdf.
PTI (2015): Right to Privacy Not a Fundamental Right, Cannot be Invoked to Scrap Aadhaar: Centre Tells Supreme Court, Economic Times, 23 July, http://articleshttp://economictimes.indiatimes.com/news/politics-and-nation/right-to-privacy-not-a-fundamental-right-cannot-be-invoked-to-scrap-aadhar-centre-tells-supreme-court/articleshow/48178526.cms.
Ramanathan, Usha (2016): Opinion: Data Is the New Gold and Aadhaar Is the Tool to Get It, Scroll.in, 30 December, https://scroll.in/article/825049/data-is-the-new-gold-and-aadhaar-is-the-tool-to-get-it.
Sahai, Amit and Brent Waters (2005): Fuzzy Identity-Based Encryption, Advances in Cryptology – EUROCRYPT 2005, pp 457–73.
Somanchi, Anmol, Srujana Bej and Mrityunjay Pandey (2017): Well done ABBA?, Economic & Political Weekly, Vol 52, No 7.
Scroll Staff (2016): Jaitley Admits Right to Privacy but Brazens It Out on Money Bill Manoeuvre for Aadhaar, Scroll.in, 16 March, http://scroll.in/article/805236/jaitley-admits-right-to-privacy-butbrazens-it-out-on-money-bill-manoeuvre-for-aadhar.
SSA (2017): New or Replacement Social Security Number and Card, Social Security, Social Security Administration, https://www.ssa.gov/ssnumber/
Tulyakov, Sergey, Faisal Farooq and Venu Govindaraju (2005): Symmetric Hash Functions for Fingerprint Minutiae, Pattern Recognition and Image Analysis, pp 30–38.
UIDAI (2011): Aadhaar Security Policy & Framework for UIDAI Authentication, (Version 1.0), http://uidai.gov.in/images/authDoc/d34securitypolicyframeworkv1.pdf, accessed on 31 July 2016.
UIDAI (2014): Aadhaar Technology and Architecture: Principles, Design, Best Practices, & Key Lessons, http://www.cse.iitd.ac.in/~suban/reports/UIDAI_REPORTS/AadhaarTechnologyArchitecture_March2014.pdf, accessed on 31 July 2016.
UIDAI (2016a): Aadhaar Authentication Overview, http://www.cse.iitd.ac.in/~suban/reports/UIDAI_REPORTS/auth.pdf.
UIDAI (2016b): Operation Model, https://authportal.uidai.gov.in/web/uidai/home-articles?urlTitle=operation-model&pageType=authentication, accessed on 2 August 2017.
UIDAI (2017): AUA Audit Compliance Checklist, https://authportal.uidai.gov.in/static/AUA%20Compliance%20Checklist.pdf, accessed on 2 August 2017.
UN Global Pulse (2012): Big Data for Development: Challenges and Opportunities, http://www.unglobalpulse.org/sites/default/files/BigDataforDevelopment-UNGlobalPulseJune2012.pdf.
Varian, Hal R (2014): Big Data: New Tricks for Econometrics, Journal of Economic Perspectives, Vol 28, No 2, pp 3–28, http://www.aeaweb.org/articles?id=10.1257/jep.28.2.3.
Vombatkere, Sudhir (2016): How Aadhaar Neglects Personal Privacy and National Security, Mainstream, Vol LIV, No 13, http://www.mainstreamweekly.net/article6283.html.
Wikipedia (2016a): Cryptographic Hash Function, https://en.wikipedia.org/wiki/Cryptographic_hash_function, accessed on 30 July 2016.
Wikipedia (2016b): Hardware Security Module, https://en.wikipedia.org/wiki/Hardware_security_module, accessed on 30 July 2016.
Wikipedia (2016c): Hash-based Message Authentication Code, https://en.wikipedia.org/wiki/Hash-based_message_authentication_code, accessed on 30 July 2016.
Wikipedia (2016d): Kerberos (Protocol), https://en.wikipedia.org/wiki/Kerberos_(protocol), accessed on 30 July 2016.
Wikipedia (2016e): Key Management, https://en.wikipedia.org/wiki/Key_management, accessed on 30 July 2016.
Wikipedia (2016f): Man-in-the-middle Attack, https://en.wikipedia.org/wiki/Man-in-the-middle_attack, accessed on 30 July 2016.
Wikipedia (2016g): Secure Multi-party Computation, https://en.wikipedia.org/wiki/Secure_multi-party_computation, accessed on 30 July 2016.
Wikipedia (2016h): Public Key Infrastructure, https://en.wikipedia.org/wiki/Public_key_infrastructure, accessed on 30 July 2016.
Wikipedia (2016i): Secure Hash Algorithm, https://en.wikipedia.org/wiki/Secure_Hash_Algorithms, accessed on 30 July 2016.
Wikipedia (2016j): Shard (Database Architecture), https://en.wikipedia.org/wiki/Shard_(database_architecture), accessed on 30 July 2016.
Wikipedia (2016k): Static Programme Analysis, https://en.wikipedia.org/wiki/Static_program_analysis, accessed on 30 July 2016.
Wikipedia (2016o): Secure Hash Algorithms, https://en.wikipedia.org/wiki/Secure_Hash_Algorithms, accessed on 30 July 2016.
Yadav, Anumeha (2016a): Rajasthan Presses on with Aadhaar After Fingerprint Readers Fail: We'll Buy Iris Scanners, Scroll.in, 10 April, http://scroll.in/article/806243/rajasthan-presses-on-with-aadhaarafter-fingerprint-readers-fail-well-buy-iris-scanners.
Yadav, Anumeha (2016b): Rajasthan's Living Dead: Thousands of Pensioners without Aadhaar or Bank Accounts Struck Off Lists, 6 August, Scroll.in, http://scroll.in/article/813132/rajasthans-living-dead-thousandsof-pensioners-without-aadhaar-or-bank-accounts-struck-off-lists.
Wyseur, Brecht (2009): White-Box Cryptography, Diss, Katholieke Universiteit Leuven, https://www.esat.kuleuven.be/cosic/publications/thesis-152.pdf.
Zhong, Raymond (2016): Is the Indian Government Saving as Much as It Says on Gas Subsidies?, 21 March, https://blogs.wsj.com/indiarealtime/2016/03/21/is-the-indian-government-saving-as-much-as-it-says-on-gas-subsidies/.