
Predicting Android Application Security and Privacy Risk With Static Code Metrics
Akond Rahman*, Priysha Pradhan*, Asif Partho**, and
Laurie Williams*
North Carolina State University*, Nested Apps**
Contact: aarahman@ncsu.edu

1
Motivation
• Mobile applications are susceptible to security
and privacy risk

• Can we help app developers assess security
and privacy risk?
http://www.cbsnews.com/news/mobile-phone-apps-malware-risks-how-to-prevent-hacking-breach/
http://www.techrepublic.com/article/bad-news-android-devs-40-percent-of-apps-in-the-market-are-leaving-sensitive-backdoors-exposed/

2
Research Objective
The goal of this paper is to aid Android
application developers in assessing the security
and privacy risk associated with Android
applications by using static code metrics as
predictors.

3
Our Contribution
• An evaluation of how static code metrics can
be used to predict the security and privacy risk
with the help of statistical learners

4
Research Question
• RQ: How effectively can statistical learners be
used to predict security and privacy risk using
static code metrics?

5
Methodology
• Dataset from Krutz et al.
• Check if AndroRisk scores are available
• Clustering
• Feature selection, statistical learners, cross validation
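The cross-validation step can be illustrated with a minimal sketch. It assumes kNN as the learner (one of the four learners named in the results slides); the fold count, the value of k, the helper names `knn_predict` and `cross_validate`, and the toy data are all hypothetical and are not taken from the study.

```python
import random
from collections import Counter


def knn_predict(train_X, train_y, x, k=3):
    """Predict the label of x by majority vote among its k nearest neighbours."""
    # Squared Euclidean distance from x to every training row.
    dists = sorted(
        (sum((a - b) ** 2 for a, b in zip(row, x)), label)
        for row, label in zip(train_X, train_y)
    )
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]


def cross_validate(X, y, folds=10, k=3, seed=0):
    """Return the mean accuracy of kNN over `folds` cross-validation folds."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    accuracies = []
    for f in range(folds):
        # Every `folds`-th shuffled index goes into the held-out fold.
        test_idx = idx[f::folds]
        held_out = set(test_idx)
        train_idx = [i for i in idx if i not in held_out]
        train_X = [X[i] for i in train_idx]
        train_y = [y[i] for i in train_idx]
        correct = sum(
            knn_predict(train_X, train_y, X[i], k) == y[i] for i in test_idx
        )
        accuracies.append(correct / len(test_idx))
    return sum(accuracies) / len(accuracies)


# Hypothetical toy data: two well-separated groups of two-metric vectors.
X = [[i, i] for i in range(10)] + [[100 + i, 100 + i] for i in range(10)]
y = ["L"] * 10 + ["H"] * 10
mean_accuracy = cross_validate(X, y, folds=5, k=3)
print(mean_accuracy)
```

The sketch reports accuracy only to stay short; the slides report precision and recall per risk level instead.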

6
Dataset
• Dataset from Krutz et al. included 4,416 Android
applications
• 1,407 applications included AndroRisk scores.
AndroRisk is a tool that is part of the AndroGuard
toolchain
• Five risk levels: very low (VL), low (L), medium (M),
high (H), very high (VH)

http://blog.k3170makan.com/2014/11/automated-dex-decompilation-using.html

7
Dataset
• Dataset from Krutz et al. included 21 code metrics
– Bad coding practice: Blocker practices, Critical practices, Major practices, Minor practices, Total bad coding practices
– Duplication: Duplicated blocks, Duplicated files, Duplicated lines
– Object-oriented: Class complexity, Comment lines, Complexity, Density of comment lines, Files, File complexity, Function complexity, Lines, Lines of code, Methods, Number of classes, Percentage of comments, Percentage of duplicated lines

https://www.sonarqube.org/community/logos/

8
Empirical Findings: Feature Selection
• One principal component explains 98.9% of the variance
• Top contributing static code metrics
– lines of code
– complexity
– total bad coding practices
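The idea of a single principal component that captures most of the variance, with a few metrics contributing most to it, can be sketched as follows. This is an illustration only: it assumes principal component analysis on unscaled metrics (the slides do not say whether the metrics were scaled), uses power iteration rather than any particular library, and the function name and metric values are hypothetical.

```python
import math


def first_principal_component(rows, iterations=200):
    """Return (loadings, explained_variance_ratio) for the first component."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    # Sample covariance matrix (d x d).
    cov = [
        [sum(c[a] * c[b] for c in centered) / (n - 1) for b in range(d)]
        for a in range(d)
    ]
    # Power iteration converges to the eigenvector with the largest eigenvalue
    # (assuming the start vector is not orthogonal to it).
    v = [1.0 / math.sqrt(d)] * d
    for _ in range(iterations):
        w = [sum(cov[a][b] * v[b] for b in range(d)) for a in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # Rayleigh quotient gives the eigenvalue, i.e. the variance along v.
    eigenvalue = sum(
        v[a] * sum(cov[a][b] * v[b] for b in range(d)) for a in range(d)
    )
    total_variance = sum(cov[a][a] for a in range(d))
    return v, eigenvalue / total_variance


# Hypothetical metric values for five apps (not data from the study).
names = ["lines_of_code", "complexity", "comment_lines"]
rows = [
    [1200, 150, 30],
    [5400, 610, 45],
    [800, 90, 28],
    [9900, 1180, 52],
    [3100, 340, 40],
]
loadings, ratio = first_principal_component(rows)
# Rank metrics by the absolute size of their loading on the component.
ranked = sorted(zip(names, loadings), key=lambda p: abs(p[1]), reverse=True)
print(ratio, ranked)
```

Without scaling, metrics with large raw ranges such as lines of code tend to dominate the first component, which is worth keeping in mind when reading the ranking.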

9
Empirical Findings: Prediction Performance (Precision)
[Figure: precision (0 to 1) per risk level (VL, L, M, H, VH) for CART, kNN, r-SVM, and RF]

10
Empirical Findings: Prediction Performance (Recall)
[Figure: recall (0 to 1) per risk level (VL, L, M, H, VH) for CART, kNN, r-SVM, and RF]
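Precision and recall per risk level can be computed from true and predicted labels as sketched here. For one level, precision is the share of apps predicted at that level that truly belong to it, and recall is the share of apps truly at that level that were predicted as such. The function name and the label sequences are hypothetical and do not reproduce the study's results.

```python
def per_class_precision_recall(y_true, y_pred, labels):
    """Return {label: (precision, recall)} for each risk level."""
    scores = {}
    pairs = list(zip(y_true, y_pred))
    for label in labels:
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        # Report 0.0 when a level is never predicted or never occurs.
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        scores[label] = (precision, recall)
    return scores


# Hypothetical labels for eight apps (not results from the study).
levels = ["VL", "L", "M", "H", "VH"]
y_true = ["VL", "VL", "L", "M", "M", "H", "VH", "VH"]
y_pred = ["VL", "L", "L", "M", "H", "H", "VH", "VH"]
scores = per_class_precision_recall(y_true, y_pred, levels)
for level in levels:
    print(level, scores[level])
```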

11
Empirical Findings: RQ
• Summary
– r-SVM can be used to build a model that takes
static code metrics as input and predicts security
and privacy risk.

12
Implications
• Static code metrics: bad coding practice, lines
of code
• IDE enhancement: extend existing Android-specific
IDEs such as Android Studio and AIDE

13
https://www.cloudbees.com/jenkinsworld/home
https://puppet.com/puppetconf
https://git-merge.com/
Limitations
• Generalization
• Use of static code metrics
• Selection of statistical learners

14
Conclusion
• With proper use of statistical learners, static
code metrics can be useful for predicting security
and privacy risk for Android applications, even
though they are not comprehensive predictors of
that risk.

15
