
Whitepaper

Integrated Security Analytics - Revolutionizing DLP Incident Response
AN OVERVIEW OF DLP INCIDENT RISK RANKING

By Dr. Lidror Troyansky, Forcepoint™ Research Fellow

Introduction
Security systems generate a large number of alerts, but only a small subset of them represents critical risks to high-value business data.

Noise, whether it comes from personal communication, broken business processes or false positives, makes the task of identifying genuine data theft risks challenging, even for well-resourced security operations teams.

- The recent breach of a major retail chain highlighted the difficulty and risk that noise poses. Although a security product was in place that identified and sent alerts on data theft activity, the security operations team was unable to respond effectively because critical events were lost in a swamp of alert noise.
- Effectively identifying a small number of critical security alerts amongst a large incident data set involves a high level of domain expertise and valuable operational time.
- Hiring a team of data scientists is one option to address this challenge. A more practical and cost-effective approach is to use a security tool that automatically identifies and prioritizes high-risk activity, and then presents it in a format that the security operations team can quickly act upon.

At Forcepoint, we started a research and development project in 2013 with the goal of revolutionizing how our customers approach DLP incident response and use DLP data sets to identify security risks caused by broken business processes.

The team working on the project included both data scientists and a dedicated group of senior software engineers. Their research covered advanced risk modelling techniques, artificial intelligence and data analytics used in threat intelligence systems to automate the identification and prioritization of threats.

The result of the research is an integrated security analytics system tuned for a DLP data set that enables security operations teams to be more responsive and effective with fewer resources.

In this whitepaper, we’ll provide an overview of the techniques applied and the challenges addressed by the security analytics system and its new Incident Risk Ranking feature.


Overview
Incident Risk Ranking uses the new security analytics system to
identify the deliberate exfiltration of critical data and other high risk
data activity scenarios. At a high level it:

1. Correlates and groups related incidents and events into meaningful DLP cases
2. Builds organization-level and employee-level behavioral baselines to identify activity anomalies
3. Utilizes artificial intelligence to classify DLP cases in terms of the risk they pose to the organization (e.g. data theft, broken business processes and unintentional leaks)
4. Enriches the data set to provide business context for each DLP case (who, what, why and when)
5. Assigns a data loss risk score to each case and presents a stack-ranked operational view of risk.

The end result of this process is the Incident Risk Ranking report,
which presents a stack ranked list of the top 10-20 data theft risk
cases for the previous 24 hours, ready for investigation.

Incident Risk Ranking Report
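
As an illustration of the final step only (scoring and stack ranking), the following minimal Python sketch selects the highest-scoring cases from the previous 24 hours. The `Case` fields, the score scale and the tie-breaking rule are assumptions made for this example; the whitepaper does not describe the product's internal data model.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Case:
    case_id: str             # hypothetical identifier
    risk_score: float        # hypothetical data loss risk score, higher = riskier
    last_activity: datetime  # time of the most recent incident in the case


def top_risk_cases(cases, now, window_hours=24, limit=10):
    """Return the highest-scoring cases active within the reporting window."""
    cutoff = now - timedelta(hours=window_hours)
    recent = [c for c in cases if c.last_activity >= cutoff]
    # Stack rank: highest risk first; ties broken by most recent activity.
    recent.sort(key=lambda c: (c.risk_score, c.last_activity), reverse=True)
    return recent[:limit]


now = datetime(2017, 1, 23, 12, 0)
cases = [
    Case("A", 9.1, now - timedelta(hours=2)),
    Case("B", 4.0, now - timedelta(hours=5)),
    Case("C", 9.8, now - timedelta(hours=30)),  # outside the 24-hour window
    Case("D", 7.3, now - timedelta(hours=1)),
]
report = top_risk_cases(cases, now, limit=2)
print([c.case_id for c in report])
```

Case C has the highest score but falls outside the reporting window, so the sketch reports A and D.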


Quantifying Risk
Each person has an intuitive notion regarding risk, but assigning a meaningful and consistent risk metric is difficult. Some clear, high-risk cases are easy to discern, such as a file with thousands of credit card numbers that was sent in the middle of the night to a dubious destination by an employee with a poor record. It is much harder to decide about cases with an ambiguous data classification or incidents within the “gray area.” These can stem from an employee’s mistake, from broken business processes or from sophisticated insiders who attempt to make their activity look “normal.”

Systematic approaches to risk quantification and management were first developed in the insurance industry and were based on the expectation value of the loss. Broadly speaking, this can be expressed as:

Risk = (Probability of “bad” events) × (Amount of loss associated with the events)

To this day, insurance underwriting is still based on this basic formula, which is also widely used for quantifying other risks, and is, by and large, the benchmark for risk quantification.

The intimate acquaintance of content-aware DLP with sensitive content, whether it’s intellectual property or regulated data sets, allows the system to assess the potential damages or losses associated with cases in which a certain type of content is stolen or otherwise exposed.

In general, the impact can be assessed using the classification and the size of the exposed data: an incident with a single credit card number is much less severe than an incident with a hundred credit card numbers, which is in turn less severe than stealing credentials for a database with millions of sensitive records.

In order to assess the risk, we also need to assess the probabilities of the various possible scenario classes:

- Was it deliberate data theft? In this case, the impact can be very large and there is an urgent need to address the problem.
- Was it a broken business process, where information is exchanged in a non-secure manner? In this case, the risk is enduring and requires systematic, yet not urgent, action.
- Or was it a one-time mistake?

On the other hand, false positives and events of low importance, such as personal communication, also have costs associated with the time and attention diverted to their analysis, as well as the resulting opportunity costs associated with missing high-impact events. That’s why it’s so important to be able to identify superfluous incidents whenever possible.

In order to assess the probabilities, our researchers have developed an advanced tool based on a technology called Bayesian Belief Networks. It utilizes a spectrum of observables and indicators to assess the plausibility of various scenarios by combining DLP domain expert knowledge, deep learning techniques and statistical inference.

The key to Bayesian Belief Networks is the ability to see beyond the single alert or incident. Before assessing the risks, the system first correlates related incidents into cases that aggregate various incidents based on key attributes such as the source, destination and data types, as well as more subtle patterns that take into account various similarity measures between incidents.
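
The expected-loss formula given earlier in this section, together with the idea that impact depends on the classification and size of the exposed data, can be sketched in a few lines of Python. The loss-per-record figures, the classification names and the 0.2 probability are hypothetical values chosen only to show the arithmetic; they are not taken from the product.

```python
def expected_loss(probability: float, loss: float) -> float:
    """Risk = (probability of a "bad" event) * (loss associated with it)."""
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must be between 0 and 1")
    return probability * loss


# Hypothetical loss per exposed record, in arbitrary units, by classification.
LOSS_PER_RECORD = {"credit_card": 150.0, "sensitive_record": 150.0}


def impact(classification: str, records_exposed: int) -> float:
    """Impact grows with the sensitivity and the amount of exposed data."""
    return LOSS_PER_RECORD[classification] * records_exposed


p_theft = 0.2  # assumed probability that the case is deliberate data theft
single_card = expected_loss(p_theft, impact("credit_card", 1))
hundred_cards = expected_loss(p_theft, impact("credit_card", 100))
# Stolen database credentials expose every record the database holds.
db_credentials = expected_loss(p_theft, impact("sensitive_record", 2_000_000))

print(single_card, hundred_cards, db_credentials)
```

With the probability held fixed, the ordering of the three results mirrors the severity ordering described above: one card, then a hundred cards, then credentials for a large database.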


After constructing the various cases, the probabilities of the various scenario classes or possible explanations are assessed using special Bayesian Belief Networks that were developed and designed for these specific classes. The various explanations compete with each other, and eventually the product obtains the likelihood of each scenario, as illustrated schematically in Figure 1.

Bayesian networks simplify the assessment of likelihood based on multiple observables and indicators by using the notion of conditional independence in various hierarchical levels. The system can group a relatively small number of observables and indicators to assess the plausibility of various hypotheses. For example, an employee who sent their resume to another company’s HR department and sent an email that suggests a negative disposition toward their boss is likely to leave soon. This could indicate that an incident in which the employee sent source code to a personal Gmail account is more likely to be a data theft incident than a case in which they merely wanted to work on the code in their spare time.
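
The resume example can be worked through with a toy two-level network, sketched below in Python. Two observables (resume sent, negative email) update an intermediate "likely to leave" node, which in turn shifts the probability that a source-code incident is data theft. Every probability here is an invented illustration, and the two observables are assumed to be conditionally independent given the intermediate node; the actual networks in the product are not published in this whitepaper.

```python
def posterior_leaving(prior, likelihoods_if_leaving, likelihoods_if_staying):
    """P(likely to leave | observables), assuming the observables are
    conditionally independent given the 'likely to leave' node."""
    p_leave = prior
    p_stay = 1.0 - prior
    for p_obs_leave, p_obs_stay in zip(likelihoods_if_leaving, likelihoods_if_staying):
        p_leave *= p_obs_leave
        p_stay *= p_obs_stay
    return p_leave / (p_leave + p_stay)


def p_data_theft(p_leaving, p_theft_if_leaving, p_theft_if_staying):
    """Marginalize over the intermediate node to get P(data theft)."""
    return p_theft_if_leaving * p_leaving + p_theft_if_staying * (1.0 - p_leaving)


PRIOR_LEAVING = 0.05          # assumed base rate of employees about to leave
# Assumed P(observable | leaving) and P(observable | staying) for:
#   observable 1: sent own resume to another company's HR
#   observable 2: sent an email suggesting a negative disposition
IF_LEAVING = [0.60, 0.40]
IF_STAYING = [0.02, 0.05]
# Assumed P(source-code-to-Gmail incident is data theft | leaving / staying).
THEFT_IF_LEAVING = 0.50
THEFT_IF_STAYING = 0.05

p_leave_posterior = posterior_leaving(PRIOR_LEAVING, IF_LEAVING, IF_STAYING)
without_evidence = p_data_theft(PRIOR_LEAVING, THEFT_IF_LEAVING, THEFT_IF_STAYING)
with_evidence = p_data_theft(p_leave_posterior, THEFT_IF_LEAVING, THEFT_IF_STAYING)

print(round(p_leave_posterior, 3), round(without_evidence, 3), round(with_evidence, 3))
```

Under these assumed numbers, the two observables move the data theft hypothesis from a low prior to a substantially higher posterior, which is the qualitative behavior the paragraph above describes.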

Figure 1: Bayesian Network
[Diagram not reproduced. Node labels recoverable from the figure: observables (e.g. "Send his own CV (Resume)", "Suspected Disgruntled Behavior", "Other") and indicators ("Unusual Volume", "Mail To Self", "Previous Attempts", "Unusual Hours"); "AP-DATA Indicators", "AP-WEB Indicators" and "Prior Baselines"; intermediate nodes "Suspected User", "Suspected User Group (e.g. “On Notice”)", "Sensitive Data", "Suspected Destination", "Suspected Disposition" and "Likely To Leave"; scenario nodes "Suspected Data Theft", "Possible Benign Scenarios", "Unintentional" and "False Positives"; and the output "Final Probability for Each Scenario".]


In some cases, however, the content itself may provide a strong indication that the incident is a data theft incident. For example, there is no conceivable reason to send the Security Account Manager (SAM) database to an external Gmail address. The results are automatically explained on cards in the TRITON® AP-DATA Incident Risk Ranking report:

Case Card: Incidents Impacting Risk Score

Case Card: Source Creating the Incident

In other cases, it would be hard (or impossible) to determine whether it was data theft or an unintentional leak, and the system would render these as “uncategorized”:

Case Card: Incidents Which Cannot Be Categorized

Folding, chaining and grouping incidents
Grouping incidents is an effective way to summarize data and overcome the deluge of incidents. In principle, an incident group is a collection of incidents that can be meaningfully described. TRITON AP-DATA defines four basic types of groups:

- Basic cases and folding (grouping of related incidents)
- Incident chains and processes
- Superfluous incidents
- Behavioral baselines and anomalies

A basic case comprises one or more incidents that, from the user’s perspective, should be referred to as a single transaction. Examples would include copying a directory that contains sensitive data within multiple files to removable media, or uploading a single file to cloud storage where the web application splits that file into multiple data chunks. In these instances, all of the incidents are folded into a single case.

The risk for the case is evaluated by first assessing the total impact of all the incidents in the case and the probabilities for various scenarios (data theft case, false positive, etc.). The following card summarizes a case with 50 incidents involving credit card data:

Case Card: Number of Clustered Incidents in This Case
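
The folding of related incidents into a basic case, as described in this section, could be sketched as follows in Python. The grouping key (source, destination, data type) follows the key attributes named in the text, while the `Incident` fields and the five-minute gap are assumptions made for this example; the product's actual similarity measures are not specified here.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Incident:
    source: str
    destination: str
    data_type: str
    timestamp: datetime


def fold_incidents(incidents, max_gap=timedelta(minutes=5)):
    """Fold incidents sharing source, destination and data type into cases
    when consecutive incidents are no more than max_gap apart."""
    by_key = defaultdict(list)
    for inc in incidents:
        by_key[(inc.source, inc.destination, inc.data_type)].append(inc)

    cases = []
    for group in by_key.values():
        group.sort(key=lambda inc: inc.timestamp)
        current = [group[0]]
        for inc in group[1:]:
            if inc.timestamp - current[-1].timestamp <= max_gap:
                current.append(inc)
            else:
                # Gap too large: close the current case and start a new one.
                cases.append(current)
                current = [inc]
        cases.append(current)
    return cases


start = datetime(2017, 1, 23, 2, 0)
# 50 file copies to removable media a few seconds apart, plus one unrelated upload.
incidents = [
    Incident("user_x", "usb", "credit_card", start + timedelta(seconds=3 * i))
    for i in range(50)
]
incidents.append(Incident("user_y", "cloud", "source_code", start))
cases = fold_incidents(incidents)
print(sorted(len(c) for c in cases))
```

The 50 copy incidents collapse into one case, so an analyst reviews a single item rather than 50 separate alerts.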


Incident chains and processes
At the next level, the system looks at multiple incidents that together tell a story. Chain-like cases are a sequence of incidents from the same source, as illustrated in Figure 2.

Figure 2: User activity is linked, categorized and a risk score applied
[Diagram not reproduced. Steps shown: "User X sent his own resume"; "Attempt to send suspected data by web"; "Aggregate sensitive data"; "Attempt to copy large directory to removable media".]

This sequence of incidents constitutes a case that can be characterized as a chain. The context provided by previous incidents highlights the intention of the subsequent incidents, in this case a data theft attempt.

Other cases involve incidents that were created as part of a process, such as a sequence of events generated by an individual, a group of users or a machine that is used to achieve a certain goal or related to a certain theme, whether legitimate or illegitimate. Notable examples of such processes are business processes (in particular, broken business processes, where sensitive data is rendered unprotected) and deliberate data theft activity.

Superfluous incidents
Groups can include incidents that shouldn’t have been there in the first place, for example false positives and personal communication. While 100% accurate automatic identification of false positives is virtually impossible, the system can assess the probability of false positives or, alternatively, the confidence level that an incident is a true positive. This is accomplished by using:

- Statistical methods
- Deviations from baselines
- Prior information about the precision of the classifiers and rules in the various DLP sensitivity levels (“Wide”, “Default” and “Narrow”)

Behavioral baselines and anomalies
Baselines provide references for normal (or common) behavior. Baselines are time-dependent and can be associated with sources, destinations, channels and content at various levels of granularity. For example, the system can consider baselines for specific users, user groups or the entire organization, covering standard working hours, combinations of channels, rules, number of matches, destinations and transaction sizes, as well as anomalies or deviations from those baselines that are statistically significant.

While anomalies provide an important set of indicators, most behavioral anomalies are benign, as people often change their behavior. For instance, when you start working on a new topic, with new suppliers or customers, or when you travel to places you’ve never been to before, you create anomalies that may or may not become the new normal. Incorporating baselines and anomalies within a powerful probabilistic framework, such as Bayesian Belief Networks, allows the system to digest the relevant information from baseline indicators without creating the deluge of false positives typical of products that alert on each anomaly.
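
A behavioral baseline of the kind described in this section can be illustrated with a simple z-score against a user's own history, sketched below in Python. The z-score and the threshold of 3 are stand-ins chosen for this example, since the whitepaper does not name the statistical test it uses, and the daily volumes are invented data.

```python
from statistics import mean, stdev


def anomaly_score(history, observed):
    """Number of standard deviations the observation lies from the baseline."""
    baseline = mean(history)
    spread = stdev(history)
    if spread == 0:
        return 0.0 if observed == baseline else float("inf")
    return (observed - baseline) / spread


def is_anomalous(history, observed, threshold=3.0):
    """Flag deviations at or beyond the (assumed) threshold."""
    return abs(anomaly_score(history, observed)) >= threshold


# Hypothetical daily outbound mail volume (MB) for one user over two weeks.
daily_mb = [12, 15, 11, 14, 13, 12, 16, 14, 13, 15, 12, 14, 13, 15]

print(is_anomalous(daily_mb, 14))   # a typical day
print(is_anomalous(daily_mb, 250))  # a large, unusual transfer
```

In keeping with the text, a flag like this would be treated as one indicator feeding the probabilistic model rather than as an alert in its own right, since most anomalies are benign.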


Conclusion
By combining a Bayesian network-based expert system,
machine learning and behavioral baseline analysis, Incident
Risk Ranking delivers a new, systematic approach to risk
quantification and management for DLP incident data sets.
It enables security operations teams to identify and respond
rapidly to high-risk interactions with business critical data sets.

Incident Risk Ranking is the first use case supported by TRITON AP-DATA integrated security analytics. Additional use cases, including automated broken business process detection and policy tuning, will be delivered in forthcoming releases.

Forcepoint’s goal with integrated security analytics is to increase the ability of our customers’ security operations teams to reduce enterprise data security risk with the same or even a reduced budget. This capability is provided without additional charge in all Forcepoint TRITON DLP products and add-on modules.

If you would like to find out more about how integrated security
analytics can transform your data security program, register
interest via this webpage:

www.forcepoint.com/DLPIncidentRiskRanking

CONTACT
www.forcepoint.com/contact

ABOUT FORCEPOINT
Forcepoint™ is a trademark of Forcepoint, LLC. SureView®, ThreatSeeker® and TRITON® are registered trademarks of Forcepoint, LLC. Raytheon is a registered trademark of Raytheon Company. All other trademarks and registered trademarks are property of their respective owners.
[whitepaper_incident_risk_ranking_enus] 200053.012317
