2010 Seventh International Conference on Information Technology

Continuous Biometric User Authentication in Online Examinations


Eric Flior, Kazimierz Kowalski
Department of Computer Science, California State University Dominguez Hills
1000 Victoria Street, Carson, CA 90747
eflior1@cp.csudh.edu, kkowalski@csudh.edu

978-0-7695-3984-3/10 $26.00 © 2010 IEEE
DOI 10.1109/ITNG.2010.250

Abstract
Online examinations pose a unique problem for distance-based
education, in that it can be very difficult to provide
true user authentication. Due to the inherent anonymity
of being online, compared to taking an examination in a
classroom environment, students may attempt to
artificially boost their scores in online examinations by
having another individual take the exam for them, which
a typical user/password authentication scheme cannot
detect. This paper presents a method for
providing continuous biometric user authentication in
online examinations via keystroke dynamics.

Key Words: correlation, cosine, dynamics, keystroke,
multi-factored, signature

1. Introduction
When giving an online examination, there
are security factors to consider beyond simple password
authentication for access to the examination. It is not
unreasonable to assume that a student may willingly give
their password to someone else, with the intent that the
other person will take the examination for the student.
With this in mind, a system must be developed in order
to determine that the person taking the examination is, in
fact, the student registered to take the examination. While
it may be infeasible to guarantee with 100% confidence
that the person taking the examination is the student, there
are methods which can be used to provide an estimate of
how certain it is that the person taking the examination is
who they claim to be.
One way we can accomplish this is a biometric method
in which we monitor the keystroke dynamics of the person
taking the examination. Characteristics of keystroke
dynamics vary from person to person, and are thought to
be as individual as a signature. By measuring the flight
time, or the time it takes the user to go from one
key-down event to another, a profile can be built of a user's
typing signature. When we compare this recorded
signature to the keystroke dynamics of the person taking
the examination, we can make a determination about
whether or not the person taking the test is the registered
user.
This paper presents information about using keystroke
dynamics to obtain biometric authentication of a user, and
a software project, which uses HTML, PHP, MySQL and
JavaScript to implement an online examination where
keystroke dynamics are used in order to authenticate the
user.

2. Authentication Methods
Currently, there are four primary methods of user
authentication: 1) knowledge factors, or
something unique that the user knows; 2) ownership
factors, or something unique that the user has; and
inherence factors, which are either 3) something unique
that the user is or 4) something unique that the user
does [1]. However,
when considering online examinations, each of these
methods has a number of drawbacks.


2.1 Knowledge factors
With regard to something unique that the user knows,
this authentication method requires the user to know a
unique sequence of numbers or characters.
In an environment where a user does not want an
unauthorized user to access their account, for instance, in
online banking, implementing a strong password policy
can help provide authentication security for the user.
However, if the user will freely give their password away,
no password policy, however strong, can prevent an
unauthorized user from gaining access.

2.2 Ownership factors


In the same vein, if the user is required to have some
token, such as an ATM card, dongle, or key, an
unscrupulous user can easily transfer this token to the
unauthorized user, circumventing the authentication
scheme.

2.3 Inherence factors
The next methods, known collectively as inherence
factors, provide a very accurate means of authentication.
They do, however, have drawbacks in that they can be
unreasonably intrusive, expensive, and difficult to
implement.
2.3.1. Something the user is. The third method provides
a very reliable method of authenticating a user, as most
metrics used, such as fingerprint, voiceprint, retinal
pattern, etc. are relatively difficult to duplicate. In the
case of online examinations, however, there is an inherent
difficulty in implementing these authentication methods
due to the hardware requirement [2].
At the time of this writing, while fingerprint readers are
becoming more popular, many are still prohibitively
expensive. Other biometrics, such as DNA sampling, are
simply too intrusive, expensive, and time consuming to
consider for online authentication.
Considering that many computers have built-in
microphones, voice recognition is promising; however, it
may be rather difficult to distinguish a live user from a
recording [3].
2.3.2. Something the user does. The final method,
something unique that the user does, is perhaps the most
promising of the four methods for providing continuous
user authentication in online examinations.
Examples of something unique that the user does
include a user's handwriting, walking gait, or typing
rhythm. Authentication via handwriting, or sometimes
simply a signature, requires that the exam taker have
access to a tablet device, which can be cost prohibitive.
In addition, given the wide variation of handwriting font,
style, and size, and considering the variations which can
be displayed by an individual, developing a fast and
efficient computer program for handwriting authentication
is relatively difficult.
Since most computers have a keyboard as an input
device, it is rather natural to examine the typing rhythm,
or keystroke dynamics, of a particular user in order to
perform authentication. This method, unlike many of the
ones discussed, has the unique advantage of being able to
be applied continuously throughout the examination. This
helps prevent a situation in which a user accesses a system
by legitimately authenticating themselves, and then gives
access to an unauthorized user.

3. Keystroke Dynamics
There are a number of factors to be considered when
performing biometric user authentication via keystroke
dynamics.
D. Gunetti and C. Picardi, of the University of Torino,
claim that "keystroke dynamics, unlike other biometric
information, convey an unstructured and very small
amount of information. From two consecutive keystrokes
we may just extract the digraph latency, and the
amount of time each key is held down (the keystroke
duration), a pretty shallow kind of information. Moreover,
this information may vary not only because of the intrinsic
instability of behavioral characteristics, but because
different keyboards can be used, different environmental
conditions exist, and, above all, because typing rhythms
also depend on the entered text." [4]
In order to combat this problem of limited information,
the role of keystroke dynamics in biometric authentication
is often limited to fixed sections of text. This can be quite
limiting in the domain of an online examination, as each
exam taker is expected to provide a unique test answer. In
Section 4, we present a method for providing
continuous user authentication with unique trial texts.
Despite these limitations, there are a number of metrics
which can be recorded and used for user verification.
These include, but are not limited to: typing speed;
keystroke seek-time; flight-time; characteristic
sequences of keystrokes; and examination of
characteristic errors [5]. We discuss these metrics in the
following sections.


3.1. Typing speed
A typical user has a maximum typing speed, which is
directly related to their typing skill. Typing speed is
typically measured in Words per Minute (WPM) and represents
the number of 5-character sequences a user can type in
one minute. For the purposes of user authentication, it is
preferable to determine a user's maximum Keystrokes per
Minute rather than their WPM.
Depending on a user's skill and experience with typing,
this maximum Keystrokes per Minute represents an upper
bound on the number of keystrokes a user will
typically type in one minute. That is to say, a user may,
and often will, enter fewer than their maximum
Keystrokes per Minute, but it is rather unlikely that they
will enter more. Thus, if a user provides keystrokes at a
rate considerably larger than their recorded maximum
Keystrokes per Minute, this may provide an indication
that the user is not who they claim to be.
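
The upper-bound check described above can be sketched in JavaScript as follows. This is a minimal illustration rather than the project's code: the function names, the millisecond timestamps, and the 25% tolerance margin are all assumptions introduced here, since the text only says the observed rate should be "considerably larger" than the recorded maximum.

```javascript
// Hypothetical helper: estimates Keystrokes per Minute from an array of
// key-down timestamps given in milliseconds.
function keystrokesPerMinute(timestampsMs) {
  if (timestampsMs.length < 2) return 0;
  const elapsedMs = timestampsMs[timestampsMs.length - 1] - timestampsMs[0];
  if (elapsedMs <= 0) return 0;
  // n timestamps span n - 1 intervals; scale the rate to one minute.
  return ((timestampsMs.length - 1) * 60000) / elapsedMs;
}

// Hypothetical check: flags a rate that exceeds the recorded maximum by more
// than an assumed tolerance factor (1.25 is an illustrative value only).
function exceedsRecordedMaximum(observedKpm, recordedMaxKpm, tolerance = 1.25) {
  return observedKpm > recordedMaxKpm * tolerance;
}
```

A flag raised by such a check would only be one indication among several, since a user may legitimately type in short bursts above their usual rate.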

3.2. Keystroke seek-time
Depending on each user's mastery of typing, different
letters will take a different amount of time in order for the
user to locate and press a particular key. This can be
rather unique, as a typical keyboard has 105 keys, which
gives at most 105! potential orderings of the keys by
seek-time, assuming the seek-time for each key is different.
Given that there are so many different potential
combinations of seek-time, a dramatic difference in key
seek-times can suggest the presence of an unauthorized
user.

3.3. Flight-time
Flight-time, which is the time between two key-up or
two key-down events, is another metric which can be used
to determine a profile of a user. Flight-time also includes
the amount of time that a user holds a key down, known as
hold-time.
Flight-time varies greatly from one user to another, as
the flight-time is closely related to the physiological
makeup of the user's hands. A right-handed user may, for
instance, have a shorter hold-time on keys on the right half
of the keyboard when compared with their hold-time for
keys on the left half of the keyboard. Injuries and other
physical abnormalities may also express themselves
through the flight-time metric.
Due to the physiological nature of variations in
flight-time, we will focus on flight-time as the metric used for
user authentication in our proof of concept system.

3.4. Characteristic sequences of keystrokes
In a given language which can be typed on a keyboard,
there are a series of sequences of keys which are
repeatedly typed. In the English language, these include
short words such as "the", which are typed while requiring
very little thought from a user. In addition, there are a
number of frequently typed sequences of keys which are
not words, but form common parts of words; for instance,
many words begin with the same prefix, or end with the
same suffix. In addition, commonly typed words, for
instance, the name of the user, are deeply ingrained in the
user's typing pattern.
These sequences of keystrokes, if captured, can
provide another method of verifying a user's identity.

3.5. Examination of characteristic errors
In addition to having characteristic sequences of
keystrokes, a user may also make a number of
characteristic errors. These may include holding the
Shift key for too long, resulting in backspacing, or simply
common typographical errors.
If these common errors can be recorded, they also
provide a reference against which the user's identity can
be checked.

4. Implementing Continuous Keystroke Dynamic Authentication
The first step in performing biometric identification
using keystroke dynamics requires determining a profile
of the user. This is much like storing a signature card at a
bank, and provides a reference against which later tests
can be made.
R. Joyce and G. Gupta describe the process: "To obtain
a reference signature, we follow an approach similar to
that used by the banks and other financial institutions. A
new user goes through a session where he/she provides a
number of digital signatures by typing in the four strings
several times. Note that in the present environment the
digital signature has four components, one component for
each string that the user types. The system requires a new
user to provide eight reference signatures by typing
his/her username, password, first name and last name
eight times. The number 8 was chosen to provide
sufficient data to obtain an accurate estimation of the
user's mean digital signature as well as information about
the variability of his/her signatures." [6]
These digital signatures are then processed and stored
for later use. Once the signature has been recorded and
processed, the data is compared against a new signature
generated at the time of verification. As such, we must be
able to determine the correlation between the newly
created signature and the recorded and stored signature.

4.1. Cosine Correlation
One method of comparing new data against the
recorded signature was developed by the noted Polish
mathematician, Hugo Steinhaus, co-founder of the Lwów
School of Mathematics. In implementing this method,
called the cosine correlation, we attempt to determine the
correlation between the current trial signature and the
reference signature.
The correlation, $r$, is determined as follows:

$$r = \frac{\sum_{i=1}^{n} k_i t_i}{\sqrt{\sum_{i=1}^{n} k_i^2}\,\sqrt{\sum_{i=1}^{n} t_i^2}}$$

where $k$ is a
vector of length $n$ which stores the flight times between
keystrokes in the reference signature, and $t$ is a vector of
length $n$ which stores the flight times between keystrokes
in the trial signature. Each $k_i$, $t_i$ refers to the flight time
between two keystrokes. A low r value implies a positive
correlation, and should result in the user being
authenticated.


4.2 Proof of Concept
We have developed a proof of concept software system
which incorporates HTML, PHP, MySQL, and JavaScript
to create an implementation for administering an online
examination where keystroke dynamics are used in order
to authenticate the user.
In order to implement continuous authentication via
keystroke dynamics, the system uses PHP and JavaScript
embedded in HTML. JavaScript is used to record the
time between key presses, and also to calculate the cosine
correlation between the recorded signatures and the trial
signatures. PHP is used to provide an interface to
the MySQL database, and to allow information to be
passed from one page to another.
When the user provides their signature on the
registration.php page, JavaScript is used to record the time
between key-down events. The sample text
provided must be at least 500 characters long. The
signature requires that the user not backspace or delete
during the registration, and doing so will cause the user to
have to begin the registration again. The time between
successive keystrokes is stored as an element of an array,
which provides the basis for the signature. Upon
completing the registration, the array, in a comma-delimited
string representation, and its length are sent via
PHP to registration2.php.
At registration2.php, PHP is used to turn the string
from registration.php back into an array, and the array is
divided into 10 discrete signatures of 50 characters each.
These signatures are then stored in MySQL as
comma-delimited strings.
When the examination begins, exam.php retrieves the
10 signatures associated with the user which were
previously generated and stored in MySQL. It then passes
each of those signatures to JavaScript, which stores each
signature as an integer array containing the keystroke
dynamics. When the user enters the answer to their exam
question, the system monitors the user's keystrokes.
When the user has completed a series of 50 keystrokes
with no deletion or significant pauses, the system uses
JavaScript to determine the cosine correlation between the
signature of the last 50 keystrokes and all 10 stored
signatures. If the average value lies over a certain
threshold, a counter containing the number of failed
authentications is incremented. When the user continues
to the next successive question, both the answer and the
number of failed authentications are passed to the next
page.
At exam2.php, the answer from the previous question
and the number of failed authentications from the previous
question are stored in the MySQL database. The user is
presented with a new question, and the process of
recording the keystroke dynamics for this trial is repeated.
This process is repeated until the examination is
completed.

4.2.1. Proof of concept system testing. For testing
purposes, a PHP script is run, which simulates the work
which would be done prior to administering the signature
generation. A database is created in MySQL, which
contains the following tables: students, which contains
information about the student; dynamics, which contains
the keystroke signatures; and answers, which contains
information about the students' responses and any failed
authentications. In addition, virtual students are randomly
generated, and assigned random and unique student ID
numbers. This information is stored in the students table
of the database.
Once the database has been created and populated with
student information, we simulate the generation of
keystroke dynamic signatures. In their class, students
would be directed to the index page, index.html. There,
they encounter a PHP script, and are required to enter in
their unique testing identification number. This is a
number separate from their student identification number,
and acts as a password for access to the testing system,
providing multi-factored user authentication.
The first time the user logs in, they are directed to a
PHP script where they are asked to copy a pre-determined
text into a text box. The system uses JavaScript to record
the keystroke dynamics of the user in 10 discrete 50
keystroke blocks. Upon completion of entering the text,
the user is directed to another PHP script. On this page,
the keystroke dynamic information is stored in the proper
field of the dynamics table, and the user is notified that
their information has been recorded.
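
The signature generation step can be sketched as follows. This is a minimal sketch under stated assumptions, not the project's code: the function names are hypothetical, each signature is assumed to hold 50 flight times, and the key-down timestamps are assumed to have been collected already (in a browser they might come from a `keydown` event listener). The comma-delimited form mirrors the storage format the text describes.

```javascript
// Converts key-down timestamps (ms) into flight times, i.e. the time
// between each pair of successive key-down events.
function flightTimes(keyDownTimestampsMs) {
  const flights = [];
  for (let i = 1; i < keyDownTimestampsMs.length; i++) {
    flights.push(keyDownTimestampsMs[i] - keyDownTimestampsMs[i - 1]);
  }
  return flights;
}

// Splits the flight times into consecutive blocks, one block per signature.
// An incomplete trailing block is dropped (an assumption of this sketch).
function splitIntoSignatures(flights, blockSize = 50, blockCount = 10) {
  const signatures = [];
  for (let b = 0; b < blockCount; b++) {
    const block = flights.slice(b * blockSize, (b + 1) * blockSize);
    if (block.length < blockSize) break;
    signatures.push(block);
  }
  return signatures;
}

// Comma-delimited representation, as used when passing and storing signatures.
function serializeSignature(signature) {
  return signature.join(',');
}

// Inverse of serializeSignature: turns the stored string back into numbers.
function parseSignature(text) {
  return text.split(',').map(Number);
}
```

With these assumptions, 501 key-down events yield 500 flight times, which split into the 10 blocks of 50 described above.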
At this time, the user can log off the system, and their
biometric information will be ready for comparison at the
time of the examination. When the user loads index.html
at the time of the examination, the system recognizes that
their biometric information is already contained in the
system, and directs them to the first page of the
examination.
The examination can be set up to be either hard-coded
with the examination question, or to select a random
question from a database of questions. When the student
logs on to exam.php, they are presented with an essay
style question, and a text box in which to enter their
answer. The system records the answer, and generates a
signature every 50 keystrokes, which is compared against
the 10 signatures which are stored in the system. The
cosine correlation is determined, and if the values lie
above a certain threshold, an alert is generated and stored
in the MySQL database.
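
The comparison step can be sketched as follows, with several assumptions that the text does not confirm. `cosineCorrelation` computes the standard cosine measure between two equal-length vectors, which is largest (close to 1) when the vectors are similar. Because the text counts a failure when the value lies above a threshold, this sketch assumes the thresholded score is one minus the cosine, so that a low score means a close match. The function names and any threshold value are hypothetical.

```javascript
// Standard cosine measure between a reference signature k and a trial
// signature t, both arrays of flight times of equal length.
function cosineCorrelation(k, t) {
  if (k.length !== t.length || k.length === 0) {
    throw new Error('signatures must be non-empty and of equal length');
  }
  let dot = 0;
  let sumK = 0;
  let sumT = 0;
  for (let i = 0; i < k.length; i++) {
    dot += k[i] * t[i];
    sumK += k[i] * k[i];
    sumT += t[i] * t[i];
  }
  if (sumK === 0 || sumT === 0) {
    throw new Error('signatures must not be all zeros');
  }
  return dot / Math.sqrt(sumK * sumT);
}

// ASSUMPTION: the score compared against the threshold is 1 - cosine,
// so that low values indicate similar typing rhythm.
function dissimilarity(k, t) {
  return 1 - cosineCorrelation(k, t);
}

// Averages the score of the trial signature against every stored signature.
function averageDissimilarity(storedSignatures, trial) {
  const total = storedSignatures.reduce(
    (sum, stored) => sum + dissimilarity(stored, trial), 0);
  return total / storedSignatures.length;
}

// Returns the updated count of failed authentications for one trial block.
function updateFailedCount(failedCount, storedSignatures, trial, threshold) {
  return averageDissimilarity(storedSignatures, trial) > threshold
    ? failedCount + 1
    : failedCount;
}
```

Averaging over all stored signatures, rather than taking the best single match, follows the description in the text; a suitable threshold would have to be chosen empirically.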
After completing the essay question, the user is
directed to the successive question in the examination
where the essay answer for the previous question is
recorded, and the process repeats.
Failing an authentication can be made visible to the
user or kept hidden from them. One method of making a failed
authentication visible to the user is to generate a
JavaScript event which turns the background of the page
red to notify the user that they have failed an
authentication.
Knowing ahead of time that the system will be
determining whether or not the student is actually
answering the question provides a deterrent effect,
impressing on the students that the work must be their
own. However, there is a downside to this, in that there is
a psychological effect in any false alarm generated
while the student is answering an exam question, which
may cause the student to lose concentration, or become
confused or upset at being wrongly flagged. In addition,
knowing that the exam is using keystroke dynamics to
authenticate the user may simply cause the user to
circumvent the system by having a collaborator tell the
student an answer, and have the student type the essay.
Upon completion of the exam, the student is directed to
a completion page, where the final answer is recorded.
The student is notified of the system's recognition that they
have successfully completed the exam, and is allowed to
log off.
The administrator of the exam can look at the answers
table after the exam has finished, and extract the answers
recorded by each student. It is expected that each student
will generate a small number of alerts during the process
of taking the examination, but an abnormally high number
of alerts generated will give the administrator reason to
suspect that the person who wrote the examination is not
the student registered in the class.

5. Conclusion
While our proof of concept system used HTML, PHP,
JavaScript and MySQL, there are a number of
programming technologies which can be used to gather
data regarding keystroke dynamics. We found that using
keystroke dynamics for biometric authentication of a user
taking an online examination is feasible for multi-factor
user authentication. Steinhaus' method of cosine
correlation gives us a way to perform continuous user
authentication via keystroke dynamics in an online
examination scenario.
The problem of requiring a fixed text for authentication
via keystroke dynamics can be overcome by generating
multiple signatures from one set of text, and using the
average value of the cosine correlation. In this manner,
variations from one signature to another are diminished
and can give a more accurate correlation between the trial
signature and the recorded signature. This allows us to
provide a level of certainty that a user who sits an online
examination is, in fact, the one who was supposed to take
the examination.

6. References
[1] R. Anderson, Security Engineering: A Guide to Building
Dependable Systems, Wiley Publishing, Inc., Indianapolis, IN,
2008
[2] Y. Levy, M. Ramin, A Theoretical Approach for Biometrics
Authentication of e-Exams,
http://telempub.openu.ac.il/users/chais/2007/morning_1/M1_6.pdf
[3] T. Kinnunen, V. Hautamaki, P. Franti, On the Fusion of
Dissimilarity-Based Classifiers for Speaker Identification, 8th
European Conference on Speech Communication and
Technology, p.2641-2644, 2003
[4] D. Gunetti, C. Picardi, Keystroke analysis of free text, ACM
Transactions on Information and System Security (TISSEC), v.8
n.3, p.312-347, August 2005
[5] J. Ilonen, Keystroke Dynamics, Lecture in Advanced Topics
in Information Processing, http://www.it.lut.fi/kurssit/0304/010970000/seminars/Ilonen.pdf
[6] R. Joyce, G. Gupta, Identity authentication based on
keystroke latencies, Communications of the ACM, v.33 n.2,
p.168-176, Feb. 1990
