Data Mining Techniques to Detect Suspicious Identities Using Automated Targeting System

Himanshu Kackar

Himanshu Kackar is a student of Master of Business Administration from Vinod Gupta School of Management, IIT Kharagpur

Introduction
The fight against terrorism requires the
government to find new approaches to intelligence
gathering and analysis. At the same time, advances
in technology provide new opportunities to collect
and use information. “Data mining” is one
technique that has significant potential for use in
countering terrorism. Data-mining and automated
data-analysis techniques are not new; they are
already being used effectively in the private sector
and in government. They have generated concern
and controversy, because they allow the
government far greater ability to use and analyze
private information effectively. This makes private
data a more attractive and powerful resource for
the government and increases the potential for
government intrusion on privacy. Recent high-
profile government programs that would explore or
employ data-mining and data-analysis techniques
for counterterrorism have caused public concern,
but the debate has not always been fully informed.
Resolving this debate intelligently and rationally is
important if we are to move forward in protecting
both our security and our liberties.

Legislative action on data mining has had an
“all-or-nothing” quality. For example, the American
government terminated the controversial Terrorism
Information Awareness (earlier called Total
Information Awareness) (TIA) research program at
the Department of Defense Advanced Research
Projects Agency (DARPA), rather than dealing with
concerns by imposing conditions or controls on the
program. Policy on data mining and related
techniques that impact privacy should not rely
solely on prohibition. Policymakers must make
informed decisions about how to oversee and
control government use of private information most
effectively when using these techniques. To make
these decisions, policymakers should take the
following under consideration:
• Is the proposed program for research or for
application?
• If research, does the program include research
on privacy protection?
• If application, what type of data analysis will be
used?
• What data will be accessed?
• What level of errors (false positives and false
negatives) is the analysis expected to
generate?
• For what purpose is the analysis being used and
how narrowly tailored is it to that purpose?
• Are there ways to assure that its use will not be
expanded beyond this purpose without further
debate?
• Is the data mining or automated analysis to be
used only as an analytical or investigatory tool,
or will decisions that affect individuals be made
based on data-analysis results alone?
• What controls are being applied to collection,
use, retention, and dissemination of identities?
• Is technology that can assist with privacy
protection being used?
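
The question about error levels carries particular weight because the activity being searched for is extremely rare. The following is a minimal sketch with purely hypothetical numbers (the population size, the number of real targets, and both error rates are illustrative assumptions, not figures from any real program). It shows how a screen that looks accurate can still flag far more innocent people than real targets.

```python
def screening_outcomes(population, true_targets, sensitivity, false_positive_rate):
    """Return (true_positives, false_positives, precision) for a screening rule.

    All inputs are hypothetical; they are not measurements of any real system.
    """
    innocents = population - true_targets
    true_positives = true_targets * sensitivity
    false_positives = innocents * false_positive_rate
    flagged = true_positives + false_positives
    # Precision: the share of flagged records that are real targets.
    precision = true_positives / flagged if flagged else 0.0
    return true_positives, false_positives, precision


# Illustrative assumption: 1,000,000 records, 10 real targets, a rule that
# catches 90% of targets and wrongly flags 1% of everyone else.
tp, fp, precision = screening_outcomes(1_000_000, 10, 0.90, 0.01)
print(f"true positives: {tp:.0f}")
print(f"false positives: {fp:.0f}")
print(f"share of flagged records that are real targets: {precision:.4%}")
```

Under these assumed numbers, roughly ten thousand innocent records are flagged alongside nine real ones, which is why the expected error level belongs among the questions policymakers ask.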

Background and Some Terminology
Data mining involves the use of
sophisticated data analysis tools to discover
previously unknown, valid patterns and
relationships in large data sets. These tools can
include statistical models, mathematical algorithms,
and machine learning methods. Consequently, data
mining consists of more than collecting and
managing data; it also includes analysis and
prediction.

Data mining enables corporations and
government agencies to analyze massive volumes
of data quickly and relatively inexpensively. The use
of this type of information retrieval has been driven
by the exponential growth in the volumes and
availability of information collected by the public
and private sectors, as well as by advances
in computing and data storage capabilities. In
response to these trends, generic data mining tools
are increasingly available for — or built into —
major commercial database applications. Today,
mining can be performed on many types of data,
including those in structured, textual, spatial, Web,
or multimedia forms.

Data mining applications can use a variety of
parameters to examine the data. They include
association (patterns where one event is connected
to another event), sequence or path analysis
(patterns where one event leads to another event,
such as the birth of a child and purchasing diapers),
classification (identification of new patterns),
clustering (finding and visually documenting groups
of previously unknown facts), and forecasting
(discovering patterns from which one can make
reasonable predictions regarding future activities).
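
To make the idea of association concrete, the following minimal sketch counts how often pairs of items occur together and reports rules of the form "A implies B" with their support (how often the pair appears) and confidence (how often B appears when A does). The toy transactions, the thresholds, and the function name `association_rules` are assumptions chosen for illustration; real tools use more scalable algorithms.

```python
from collections import Counter
from itertools import combinations


def association_rules(transactions, min_support=0.5, min_confidence=0.6):
    """Find pairwise rules lhs -> rhs with their support and confidence."""
    n = len(transactions)
    item_counts = Counter()
    pair_counts = Counter()
    for basket in transactions:
        items = sorted(set(basket))
        item_counts.update(items)
        pair_counts.update(combinations(items, 2))

    rules = []
    for (a, b), count in pair_counts.items():
        support = count / n
        if support < min_support:
            continue
        # Test the rule in both directions: a -> b and b -> a.
        for lhs, rhs in ((a, b), (b, a)):
            confidence = count / item_counts[lhs]
            if confidence >= min_confidence:
                rules.append((lhs, rhs, support, confidence))
    return rules


# Hypothetical toy data, echoing the diapers example in the text.
transactions = [
    ["diapers", "baby food", "milk"],
    ["diapers", "baby food"],
    ["milk", "bread"],
    ["diapers", "milk"],
]
for lhs, rhs, support, confidence in association_rules(transactions):
    print(f"{lhs} -> {rhs}: support={support:.2f}, confidence={confidence:.2f}")
```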

Data mining has become increasingly
common in both the public and private sectors.
Industries such as banking, insurance, medicine,
and retailing commonly use data mining to reduce
costs, enhance research, and increase sales. For
example, the insurance and banking industries use
data mining applications to detect fraud and assist
in risk assessment (e.g., credit scoring). Using
customer data collected over several years,
companies can develop models that predict
whether a customer is a good credit risk, or whether
an accident claim may be fraudulent and should be
investigated more closely.

The proliferation of data mining has raised
implementation and oversight issues, including
concerns about the quality of the data being
analyzed, the interoperability of
the databases and software, and potential
infringements on privacy.

In the public sector, data mining applications
were initially used as a means to detect fraud and
waste, but their use has since expanded to
purposes such as measuring and improving
program performance. The most frequent
public-sector uses of data mining are in the following
areas:
• improving service or performance;
• detecting fraud, waste, and abuse;
• analyzing scientific and research information;
• managing human resources;
• detecting criminal activities or patterns; and
• analyzing intelligence and detecting terrorist activities.

Why Data Mining for Counterterrorism?
Although all traditional intelligence collection
methods remain important, understanding the
terrorists and predicting their actions requires us to
rely more on making sense of many small pieces of
information. The September 11, 2001, attacks
illustrate this point. Even in hindsight, there is no
single source, other than perhaps an extraordinarily
well-placed human asset, that could have provided
the full or even a large part of the picture of what
was being planned. There were a number of clues
that, if recognized, combined, and analyzed, might
have given enough information to track down the
terrorists and stop their plan. Therefore, although the
focus should remain on improving the ability to collect
human and other traditional sources of intelligence, an
edge can be gained from greater access to information
and better analysis. For counterterrorism, small dots of
data in a sea of information must be found and
assembled into a picture.
Data-mining and automated data-analysis
techniques are not a complete solution. They are
only tools, but they can be powerful tools for this
new intelligence requirement. Although instinct and
continual hypothesizing remain irreplaceable parts
of the analytic process, these techniques can assist
analysts and investigators by automating some low-
level functions that they would otherwise have to
perform manually. These techniques can help
prioritize attention and provide clues about where
to focus, thereby freeing analysts and investigators
to engage in the analysis that requires human
judgment. In addition, data mining and related
techniques are useful tools for some early analysis
and sorting tasks that would be impossible for
human analysts. They can find links, patterns, and
anomalies in masses of data that humans could
never detect without this assistance. These can
form the basis for further human inquiry and
analysis.
One initial potential benefit of the data-
analysis process is that the use of large databases
containing identifying information assists in the
important task of accurate identification. More
information makes it far easier to resolve whether
two or more records represent the same or different
people. For example, an investigator might want to
determine whether the John Doe boarding a plane is
the same person as the Jack Doe on a terrorist
watch list or the J.R. Doe that shared a residence
with a suspected terrorist. If the government has
only names, it is virtually impossible to resolve
these identities for certain; if the government has a
social security number, a date of birth, or an
address, it is easier to make that judgment
accurately. The task of identity resolution is far
easier to perform when there are large data sets of
identifying information to call on. Not incidentally,
identity resolution also makes the government
better at determining when a person in question is
not the one suspected of terrorist ties, thereby
potentially reducing inconvenience to that person.
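
The identity-resolution point can be illustrated with a minimal sketch. The records, field names, and weights below are hypothetical, and `difflib.SequenceMatcher` from the Python standard library stands in for whatever name-matching method a real system would use. The sketch only shows why names alone are weak evidence and why agreement on a date of birth or an address changes the picture.

```python
from difflib import SequenceMatcher


def name_similarity(a, b):
    """Rough string similarity in [0, 1]; a stand-in for real name matching."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def same_person_score(rec_a, rec_b):
    """Combine name similarity with agreement on other identifiers.

    The weights are arbitrary illustrative choices, not calibrated values.
    """
    score = 0.4 * name_similarity(rec_a["name"], rec_b["name"])
    for field, weight in (("dob", 0.3), ("address", 0.3)):
        a, b = rec_a.get(field), rec_b.get(field)
        if a and b and a == b:
            score += weight
    return score


# Hypothetical records echoing the John Doe / Jack Doe / J.R. Doe example.
passenger = {"name": "John Doe", "dob": "1970-01-01", "address": "12 Elm St"}
watch_list = {"name": "Jack Doe", "dob": "1982-06-15", "address": "9 Oak Ave"}
co_resident = {"name": "J.R. Doe", "dob": "1970-01-01", "address": "12 Elm St"}

print(round(same_person_score(passenger, watch_list), 2))
print(round(same_person_score(passenger, co_resident), 2))
```

With only names, the first comparison can never score above the name weight. The second scores higher because the date of birth and address agree, even though the names look less alike on paper.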
Pattern-based data analysis also has potential
for counterterrorism in the longer term. Data-mining
research must find ways to identify useful patterns
that can predict an extremely rare activity: terrorist
planning and attacks. It must also identify how to
separate the “signal” of a pattern from the “noise” of
innocent activity in the data. One possible
advantage of pattern-based searches, if they can be
perfected, would be that they could provide clues to
“sleeper” activity by unknown terrorists who have
never engaged in activity that would link them to
known terrorists. Unlike subject-based queries,
pattern-based searches do not require a link to a
known suspicious subject.

Types of pattern-based searches that could
prove useful include searches for particular
combinations of lower-level activity that together
are predictive of terrorist activity. For example, a
pattern of a “sleeper” terrorist might be a person in
the country on a student visa who purchases a
bomb-making book and 50 medium-sized loads of
fertilizer. Or, if the concern is that terrorists will use
large trucks for attacks, automated data analysis
might be conducted regularly to identify people who
have rented large trucks, used hotels or drop boxes
as addresses, and fall within certain age ranges or
have other qualities that are part of a known
terrorist pattern. Significant patterns in e-mail
traffic might be discovered that could reveal
terrorist activity and terrorist “ringleaders.” Pattern-based
searches might also be very useful in
response and consequence management. For
example, searches of hospital data for reports of
certain combinations of symptoms, or of other
databases for patterns of behavior, such as
pharmaceutical purchases or work absenteeism,
might provide an early signal of a terrorist attack
using a biological weapon.
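
A pattern-based search of the kind described above can be sketched as a filter that requires several low-level conditions to hold together. The record fields, the age range, and the sample data below are hypothetical, and a match would be no more than a lead for human review.

```python
def matches_pattern(record, min_age=20, max_age=40):
    """True when a record meets every condition of the illustrative pattern."""
    return (
        record["rented_large_truck"]
        and record["address_type"] in {"hotel", "drop box"}
        and min_age <= record["age"] <= max_age
    )


# Hypothetical records; only the first satisfies all three conditions.
records = [
    {"id": 1, "rented_large_truck": True, "address_type": "hotel", "age": 27},
    {"id": 2, "rented_large_truck": True, "address_type": "residence", "age": 31},
    {"id": 3, "rented_large_truck": False, "address_type": "drop box", "age": 25},
    {"id": 4, "rented_large_truck": True, "address_type": "drop box", "age": 58},
]

flagged = [r["id"] for r in records if matches_pattern(r)]
print(flagged)  # -> [1]
```

Unlike a subject-based query, nothing in this filter refers to a known suspect. That is both its appeal and the reason its false-positive rate needs scrutiny.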

Automated Targeting System
The Department of Homeland Security (DHS),
Customs and Border Protection (CBP), has developed
the Automated Targeting System (ATS). ATS is one
of the most advanced targeting systems in the
world. Using a common approach for data
management, analysis, rules-based risk
management, and user interfaces, ATS supports all
CBP mission areas and the data and rules specific to
those areas.
Customs and Border Protection (CBP)
developed ATS, an intranet-based enforcement and
decision support tool that is the keystone for all CBP
targeting efforts. ATS compares traveler, cargo, and
conveyance information against intelligence and
other enforcement data by incorporating risk-based
targeting scenarios and assessments. CBP uses ATS
to improve the collection, use, analysis, and
dissemination of information that is gathered for the
primary purpose of targeting, identifying, and
preventing potential terrorists and terrorist
weapons from entering the United States. ATS also
identifies other violations of U.S. law. ATS allows CBP
officers charged with enforcing U.S. law and
preventing terrorism and other crime to focus their
efforts on travelers, conveyances, and cargo
shipments that most warrant greater scrutiny. ATS
standardizes names, addresses, conveyance
names, and similar data so these data elements can
be more easily associated with other business data
and personal information to form a more complete
picture of a traveler, import, or export in context
with previous behavior of the parties involved.
Traveler, conveyance, and shipment data are
processed through ATS and are subject to a real-
time, rules-based evaluation.
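
The standardization step described above can be sketched in a few lines. This is not ATS's actual procedure, which the source does not describe. It is a minimal illustration of how names and addresses might be normalized so that differently written records can be associated, and the abbreviation table and function name are hypothetical.

```python
import re
import unicodedata

# Hypothetical abbreviation table; a production system would use a far larger one.
ABBREVIATIONS = {"street": "st", "avenue": "ave", "company": "co", "limited": "ltd"}


def standardize(text):
    """Normalize accents, case, punctuation, whitespace, and common abbreviations."""
    # Split accented letters into base letter plus combining mark, then drop the marks.
    text = unicodedata.normalize("NFKD", text)
    text = "".join(ch for ch in text if not unicodedata.combining(ch))
    # Lowercase and replace punctuation with spaces.
    text = re.sub(r"[^\w\s]", " ", text.lower())
    # Collapse whitespace and apply the abbreviation table word by word.
    words = [ABBREVIATIONS.get(word, word) for word in text.split()]
    return " ".join(words)


print(standardize("  ACME Shipping Company, Limited "))  # -> acme shipping co ltd
print(standardize("12 Élm Street"))                      # -> 12 elm st
```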

ATS consists of six modules that provide
selectivity and targeting capability to support CBP
inspection and enforcement activities.
• ATS-Inbound – inbound cargo and conveyances
(rail, truck, ship, and air)
• ATS-Outbound – outbound cargo and
conveyances (rail, truck, ship, and air)
• ATS-Passenger (ATS-P) – travelers and
conveyances (air, ship, and rail)
• ATS-Land (ATS-L) – private vehicles arriving by
land
• ATS-International (ATS-I) – cargo targeting for
CBP's collaboration with foreign customs
authorities
• ATS-Trend Analysis and Analytical Selectivity
Program (ATS-TAP) – analytical module
Five of these modules are operational and
subject to recurring system maintenance. They
are the ATS cargo modules for import and export
(ATS-Inbound and ATS-Outbound); the ATS-Passenger
module; the ATS-Land module; and the ATS-Analytical
module. The ATS-International module is being
developed to support collaborative efforts with
foreign customs administrations.

ATS System Overview


• ATS-Inbound is the primary decision support
tool for inbound targeting of cargo. This system
is available to CBP officers at all major ports
(air/land/sea/rail) throughout the United States,
and also assists CBP personnel in the Container
Security Initiative (CSI) decision-making process.
ATS-Inbound provides CBP officers and Advance
Targeting Units (ATU) with an efficient, accurate,
and consistent method for targeting and
selecting high risk inbound cargo for intensive
examinations. ATS-Inbound assists in identifying
imported cargo shipments, which pose a high
risk of containing weapons of mass effect,
narcotics, or other contraband. ATS-Inbound
increases the effectiveness of CBP officers
dealing with imported cargo by improving the
accuracy of the targeting of weapons of mass
effect, narcotics or other contraband,
commercial fraud violations, and other violations
of U.S. law. The approach is to process data
pertaining to entries and manifests against a
variety of rules to make a rapid automated
assessment of the risk of each import. Entry and
manifest data is received from the Automated
Manifest System (AMS), Automated Broker
Interface (ABI), and the Automated Commercial
Environment (ACE).
• ATS-Outbound is the outbound cargo targeting
module of ATS that assists in identifying exports
which pose a high risk of containing goods
requiring specific export licenses, narcotics, or
other contraband. ATS-Outbound uses Shippers’
Export Declaration (SED) data that exporters file
electronically with CBP's Automated Export System (AES). The SED data
extracted from AES is sorted and compared to a
set of rules and evaluated in a comprehensive
fashion. This information assists CBP officers with
targeting and/or identifying exports with
potential aviation safety and security risks, such
as hazardous materials and Federal Aviation
Administration (FAA) violations. In addition, ATS-
Outbound identifies the risk of specific exported
cargo for such export violations as smuggled
currency, illegal narcotics, stolen vehicles or
other contraband.
• ATS-Passenger (ATS-P) is the module used at
all U.S. airports and seaports receiving
international flights and voyages to evaluate
passengers and crewmembers prior to arrival or
departure. It assists the CBP officer’s decision-
making process about whether a passenger or
crewmember should receive additional screening
prior to entry into or departure from the country
because the traveler may pose a greater risk for
violation of U.S. law. The system analyzes the
Advance Passenger Information System (APIS)
data from TECS, Passenger Name Record (PNR)
data from the airlines, TECS crossing data, TECS
seizure data, and watched entities. ATS-P
processes available information from these
databases to develop a risk assessment for each
traveler. The risk assessment is based on a set
of national and user-defined rules, which are
composed of rule sets that pertain to specific
operational/tactical objectives or local
enforcement efforts.
• ATS-Land (ATS-L) is a module of ATS that
provides for the analysis and rule-based risk
assessment of private passenger vehicles
crossing the nation's borders. By processing and
checking the license plate numbers of vehicles
seeking to cross the border, ATS-L allows CBP
officers to cross-reference the TECS crossing
data, TECS seizure data, and state Department
of Motor Vehicles (DMV) data to employ the
weighted rules-based assessment system of ATS.
In this way ATS-L provides, within seconds, a risk
assessment for each vehicle that assists CBP
officers at primary booths in determining
whether to allow a vehicle to cross without
further inspection or to send the vehicle for
secondary evaluation.
• ATS-International (ATS-I) is being developed
to provide foreign customs authorities with
controlled access to automated cargo targeting
capabilities and provide a systematic medium for
exchanging best practices and developing and
testing targeting concepts. The exchange of best
practices and technological expertise can
provide vital support to other countries in the
development of effective targeting systems that
can enhance the security of international supply
chains and fulfill the objective of harmonizing
targeting methodologies. If information from
foreign authorities is run through the ATS-I
module, it may also, consistent with applicable
cooperative arrangements with that foreign
authority, be retained in ATS-I by CBP to enhance
CBP's targeting capabilities.
• ATS-Trend Analysis and Analytical
Selectivity (ATS-TAP) improves CBP's ability
to examine, locate, and target for action
violators of U.S. laws, treaties, quotas, and
policies regarding international trade.
ATS-Analytical offers trend analysis and targeting
components. The trend analysis function
summarizes historical statistics that provide an
overview of trade activity for commodities,
importers, manufacturers, shippers, nations, and
filers to assist in identifying anomalous trade
activity in aggregate.
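
The idea of spotting anomalous activity against summarized historical statistics can be sketched with a simple z-score test. The source does not say which statistics the analytical module uses. The monthly volumes, the threshold of three standard deviations, and the function names below are therefore assumptions chosen only to illustrate the general technique.

```python
from statistics import mean, stdev


def z_score(history, latest):
    """Number of standard deviations between `latest` and the historical mean."""
    mu = mean(history)
    sigma = stdev(history)  # sample standard deviation; needs at least two points
    if sigma == 0:
        return 0.0 if latest == mu else float("inf")
    return (latest - mu) / sigma


def is_anomalous(history, latest, threshold=3.0):
    """Flag a value that deviates from history by more than `threshold` sigmas."""
    return abs(z_score(history, latest)) > threshold


# Hypothetical monthly shipment counts for one importer.
monthly_volume = [100, 104, 98, 101, 97, 103]
print(is_anomalous(monthly_volume, 102))  # within the usual range
print(is_anomalous(monthly_volume, 160))  # far above the usual range
```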
ATS supports the decision-making process and
reinforces the role of the trained professionals
making independent decisions necessary to identify
violations of U.S. law at the border.
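
A weighted, rules-based assessment of the kind described for the ATS modules can be illustrated with a minimal sketch. The rule names, weights, threshold, and record fields below are hypothetical and are not drawn from ATS itself. The sketch only shows how rules that fire contribute weights to a score, which then yields a recommendation for an officer to review.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass
class Rule:
    name: str
    weight: int
    applies: Callable[[Dict], bool]


# Hypothetical rules and weights, invented for illustration only.
RULES: List[Rule] = [
    Rule("plate_in_prior_seizure", 50, lambda v: v.get("prior_seizure", False)),
    Rule("registration_mismatch", 30, lambda v: v.get("plate_state") != v.get("dmv_state")),
    Rule("frequent_recent_crossings", 10, lambda v: v.get("crossings_last_30_days", 0) > 20),
]


def assess(vehicle: Dict, referral_threshold: int = 40) -> Tuple[int, List[str], str]:
    """Sum the weights of all rules that fire and recommend a lane."""
    fired = [rule for rule in RULES if rule.applies(vehicle)]
    score = sum(rule.weight for rule in fired)
    recommendation = "secondary" if score >= referral_threshold else "primary"
    # The fired rule names are returned so an officer can see why a score arose.
    return score, [rule.name for rule in fired], recommendation


vehicle = {
    "prior_seizure": False,
    "plate_state": "TX",
    "dmv_state": "CA",
    "crossings_last_30_days": 25,
}
print(assess(vehicle))
```

Returning the names of the rules that fired, and not just the score, keeps the result reviewable and leaves the final decision with the officer.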

Conclusion
ATS is a decision support tool used by CBP
officers to identify individuals, cargo and
conveyances that may require additional scrutiny
based on observations related to data describing
those individuals. The ATS system supports CBP
officers in identifying individuals or cargo that may
be a risk to U.S. law enforcement, but it does not
replace their judgment in determining whether the
individual or goods/merchandise, as applicable,
should be allowed into the country. ATS offers
equitable risk assessment using a secure encrypted
network; however, it is the policies and procedures
and laws that govern the inspection and other law
enforcement processes that ultimately protect
individual privacy rights. The professionalism
applied by CBP officers serves to further protect
individual privacy rights.