You are on page 1of 9

Fusion-Based Approach for Long-Range Night-Time Facial

Robert B. Martin, Mikhail Sluch, Kristopher M. Kafka, Andrew Dolby, Robert Ice
and Brian E. Lemoff
WVHTC Foundation, 1000 Technology Drive, Suite 1000, Fairmont, WV, USA 26554

Long range identification using facial recognition is being pursued as a valuable surveillance tool. The capability to
perform this task covertly and in total darkness greatly enhances the operators ability to maintain a large distance
between themselves and a possible hostile target. An active-SWIR video imaging system has been developed to produce
high-quality long-range night/day facial imagery for this purpose. Most facial recognition techniques match a single
input probe image against a gallery of possible match candidates. When resolution, wavelength, and uncontrolled
conditions reduce the accuracy of single-image matching, multiple probe images of the same subject can be matched to
the watch-list and the results fused to increase accuracy. If multiple probe images are acquired from video over a short
period of time, the high correlation between the images tends to produce similar matching results, which should reduce
the benefit of the fusion. In contrast, fusing matching results from multiple images acquired over a longer period of
time, where the images show more variability, should produce a more accurate result. In general, image variables could
include pose angle, field-of-view, lighting condition, facial expression, target to sensor distance, contrast, and image
background. Long-range short wave infrared (SWIR) video was used to generate probe image datasets containing
different levels of variability. Face matching results for each image in each dataset were fused, and the results
Keywords: Image Fusion, Face Recognition, SWIR, Night Vision, Surveillance, Biometrics, Active Imaging

As tensions increase in areas around the world, new technologies are continually being created and evaluated to protect
oneself or property. Video surveillance systems are very popular because they put a standoff distance between the target
and the operator, can be automated for 24 hour operation, and also can be used for post-processing. When teamed with
add-on software packages, these surveillance systems become a very formidable tool for defense applications. Facial
recognition systems are an example of this and are becoming increasingly popular in many communities including the
military, commercial, and private sectors. As the desire for this technology grows, so does the requirement for its
accuracy. The ability to detect and identify individuals at long ranges, with high accuracy, in short periods of time,
whether day or night, in all weather conditions, while remaining completely tactical and covert would prove highly
favorable in this application. For this reason, the West Virginia High Technology Consortium Foundation (WVHTCF),
under a research contract from the Office of Naval Research (ONR), with funding and oversight from the Office of the
Secretary of Defense Deployable Force Protection Program (DFP), is developing the Tactical Imager for Night/Day
Extended Range Surveillance (TINDERS)
. This is an actively illuminated short-wave infrared (SWIR) video
imaging system that uses illumination that it invisible to conventional silicon optics and the naked eye. This system
provides imagery suitable for facial recognition at ranges of up to 400 meters in complete darkness and imagery for
tracking purposes beyond three kilometers. Optimization of the face recognition performance can be pursued in three
primary ways. First, the optical hardware can be optimized to produce the highest quality SWIR imagery. Second, the
face matching algorithms can be optimized to produce the most accurate matching of a single SWIR probe image to a
database of visible-spectrum mug-shots. Most of the effort to date has focused on these first two approaches. The third
approach, which is discussed in this paper, is to fuse the matching results for multiple SWIR probe images of the same
person to increase the identification accuracy. In principle, the longer an individual is monitored by the video
surveillance system, the more facial images can be captured and processed, and the more accurate the fused matching
result will be. This paper investigates the hypothesis that it is more beneficial to have a low correlation than a high
correlation between the fused probe images. The data suggests that the fusion of multiple images with low correlation
results in a somewhat larger improvement in accuracy than the fusion of multiple images with high correlation. The
results of this work may ultimately lead to the integration of new features in the TINDERS system, and may also be
generalizable to other face recognition systems based on video surveillance.
Automatic Target Recognition XXIV, edited by Firooz A. Sadjadi, Abhijit Mahalanobis, Proc. of SPIE Vol. 9090, 909006
2014 SPIE CCC code: 0277-786X/14/$18 doi: 10.1117/12.2052725
Proc. of SPIE Vol. 9090 909006-1
Downloaded From: on 09/27/2014 Terms of Use:

2.1 System
In 2009, TIN
detection an
and ruggedne
larger netwo
allowing for
recorded SW
2.2 SWIR I
angle is ma
Operating wi
completely in
times as defi
1M at the illu
when compar
While typica
skin reflectiv
nose, lips, an
acquired sen
against an ex
Figure 1
2.3 Hardwa
computer. F
NDERS began
on, this unit w
d identificatio
t of the second
unctions as a n
rk. It can be
r either manua
n can be attemp
WIR video data.
s an actively i
atched to the
scent light em
ithin a commo
nvisible to the
ined by the AN
uminator outpu
red to 800 nm.
al visible spectr
As illustrated
vity, lack of a n
nd eyes remai
nsor data again
xisting database
. Visible Spectru
RS system is co
igure 2 depicts
as a laborator
was mobilized
on in complete
d generation sy
ight/day video
tasked or slew
al or automatic
pted. The TIN

lluminated SW
imagers fiel
mitting diode (
on telecommun
naked eye and
NSI Z136 and
ut and Class 1
rum facial reco
in Figure 1, no
noticeable pup
in the same as
nst existing da
e of visible spe
um and SWIR M
omprised of th
s both a concep
ry prototype pr
and transporte
e darkness. T
ystem. Improv
surveillance s
wed to a spec
c person detec
NDERS system
WIR system w
ld-of-view (FO
(SLED) fed to
nication wavele
d conventional
IEC 60825 las
at the target.
ognition has ma
oticeable differ
il, and high ha
s in the visible
atabases. For
ectrum mug sho
Mug Shot . (left)
hree component
ptual illustratio
roviding for in
ed across the c
The success o
ements were m
system that can
cific location o
ction to be ac
m has a large st
ith optical zoo
OV) with mo
o an erbium-d
ength band tha
l silicon image
ser eye safety
Safe operation
ade major adva
rences between
air reflectivity.
e spectrum. F

Visible spectrum
ts: the optical h
on of the compo
nitial proof of c
country where
of this system
made in size, w
n be deployed a
of interest. Th
torage capacity
om capability.
otorized optics
doped fiber am
at is greater tha
ers. The system
. T
n at this wavel
ances, operatio
n SWIR and v
However, the
For a system t
e challenge is
m mug shot (righ
head unit, the e
onents and a pi
concept. After
e it demonstrat
was then imp
weight, durabili
as a standalone
he system prov
Once a person
y allowing for
The illumina
s. The sourc
mplifier, runnin
an 1400 nm, th
m remains com
The eye-safety
length allows f
on in the SWIR
e landmarks, su
to be most use
to match the
ht) SWIR probe
electronics enc
icture of the de
r successful la
ted long range
proved upon w
ity, mobility, u
e system or as
vides live vide
n is detected,
the post-proce
ation beam div
ce is a fiber
ng at constant
mpletely eye sa
y classification
for 65 times m
R band does off
m imagery incl
uch as the edge
eful, it must m
SWIR facial

closure, and the
eployable unit.
e person
with the
part of a
eo output
essing of
t power.
system is
afe at all
is Class
ore light
ffer some
lude low
es of the
match its
e control

Proc. of SPIE Vol. 9090 909006-2
Downloaded From: on 09/27/2014 Terms of Use:
O p t i c a l H e a d U n i t
I )
I l l u m i n a t o r
O p t i c a l
F i b e r
I m a g e r
0 0 0
M o t i o n C o n t r o l l e r s
D S P B o a r d
C o m m u n i c a t i o n s
A r r a y
P a n / T i l t
U n i t
E l e c t r o n i c s E n c l o s u r e
O p t i c a l
S o u r c e
N e t w o r k i n g /
c o m e t s
P o w e r
S u p p l i e s
U m b i l i c a l C a b l e
E t h e r n e t o r I A N
C o m p u t e r
E t h e r n e t o r I A N ( i n c l u d i n g M E D
O p t i o n a l
N e t w o r k
C o n n e c t e d
C o n t r o l D e v i c e

Figure 2. TINDERS System. (left) Conceptual illustration of TINDERS system (right) Deployable TINDERS system
showing optics head, electronics enclosure and control computer.
The optical head unit contains all the optics including the mechanisms to move them. Included within is a laser range
finder, a focal plane array, and a video processing board. All components are encased in a climate controlled and
environmentally sealed enclosure mounted to a FLIR pan/tilt unit. Typically it is deployed on the tripod as depicted
above but can also be mounted on a mast or tower.
The electronics enclosure contains the supporting power supplies, communication electronics, and illumination source.
As previously mentioned, this is a fiber-coupled system, meaning the SLED, amplifier, and filter are contained within
this thermal electrically cooled enclosure. The Mil Spec connections facilitate 24 hour operation in all conditions.
The system is currently controlled by a semi-rugged computer. The computer does not need to be co-located with the
system. Both simply need to be plugged into any ordinary 100 Mbit/s network and 110 VAC. The unit offers a large
storage capacity for video post-processing as well as in situ operation.
2.4 Control Software
TINDERS can be controlled by an operator from the control computer console or a client running Windows Remote
Desktop Connection if the system is network connected. The operator is presented with a number of controls to perform
their desired task whether it is surveillance, detection, identification, or tracking. The system has integrated person and
face detection algorithms allowing for tracks to be initiated with little operator involvement. Once these are initiated, the
operator then can attempt facial recognition as discussed in the next section. TINDERS, as previously mentioned, can
also be integrated into a larger network of sensors and report its findings to a database or central location for further
2.5 Facial Recognition
The TINDERS control software and the facial recognition package are both contained within one graphical user
interface (GUI). The facial recognition platform is powered by a commercially available package developed by
MorphoTrust USA. It is based on their Face Examiner Workstation
, but with the addition of custom software that pre-
processes the SWIR images before they are matched against a gallery that is pre-populated with a database of visible-
spectrum mug shots.
Selection of video frames for face recognition can be done in either an automated or manual mode. In the automated
mode, faces are detected in video and automatically submitted to the face recognition process. In manual mode, the
operator clicks a button to cue a predefined number of SWIR video frames to be submitted to the facial recognition
process. Alternatively, the user can load previously-saved files containing SWIR facial imagery. The user can then
manually mark the eye positions. Matching is then initiated. Each probe image is then matched against each gallery
image and assigned a score. These are then ranked based on the fused score.
Proc. of SPIE Vol. 9090 909006-3
Downloaded From: on 09/27/2014 Terms of Use:
. '
i l '
I i
q 6 I l

Figure 3 illustrates the TINDERS GUI with integrated facial recognition when used in manual mode. The image on the
left represents the live video feed. The image on the right is a probe image that was searched against the visible
spectrum database. The visible images below show the possible matches with the correct subject scoring Rank 1.

Figure 3. Screenshot of the TINDERS GUI, showing user controls, facial recognition in manual mode, correct match,
and other controls.
2.6 Fusion
The fusion process involves combining the face matching results from multiple probe images to increase the
identification confidence level. The live video feed supplied by the TINDERS system supplies up to 30 frames per
second of SWIR facial imagery. As long as the images belong to the same person, they can continue to be matched and
the matching results fused. For the results presented in this paper, the maximum score fusion technique is used in
which each visible gallery image is assigned a fused matching score equal to its highest single-probe matching score
across all of the probe images.
The probe images submitted to facial recognition do not necessarily have to be identical to one another; however, in
most TINDERS experiments that have been analyzed to date, test subjects were stationary and facing the camera with a
neutral facial expression, having a pose angle and facial expression very similar to the enrolled visible-spectrum gallery
images. As a result, the fusion process in these experiments has combined the matching results for a set of probe images
that closely resemble one another, i.e. probe images with high correlation. While the fused results have always been
more accurate than the single-probe results, it is possible that the lack of variability in the probe images has limited the
performance improvement provided by fusion. Thus, an experiment was performed to investigate the impact of probe
image variability on fusion performance.
3.1 Database
The visible-spectrum gallery used for this experiment included enrollment images from two datasets. The first dataset
was collected in summer through winter of 2012 in conjunction with a research group at West Virginia University
(WVU). This dataset consists of 104 individuals. The second dataset, which was collected specifically for this
experiment, included 10 individuals. Thus, the total database size used in this experiment was 114 faces. One individual
inadvertently appeared in both datasets; however, this does not affect our ability to test the hypothesis, as the duplicate
record affects all matching scores in the same way.
3.2 Experimental Set Up
In order to determine whether it is more favorable to have a fusion of images of high or low correlation, sets of images
need to be produced having both high and low correlation. These sets need to be produced in a repeatable manner. A
Proc. of SPIE Vol. 9090 909006-4
Downloaded From: on 09/27/2014 Terms of Use:
i i

test sequence
dataset with
The data was
one end of th
meters at the
variability in
were strategi
looking at ea
3.3 Image A
After the visi
The subject
system was a
was recordin
the recording
mark and say
center mark a
speak out loa
the subject w
160 meter ma
At this point
neutral angul
Figure 4.
facial rec
Figure 5.
facial rec
Several aspe
and all of the
examples of
image blurrin
collection wa
e was develop
specific pose a
s collected in t
he structure. O
e other end of t
n background.
ically placed in
ach one.
ible mug shot w
was instructed
already running
ng live video.
g of ~ 300 vide
y out loud, O
and repeat. In
ad introduced v
was instructed t
Example imag
t, the three exe
lar progression
High Correlati
Low Correlati
cts of the expe
e 160-m image
overexposed im
ng due to came
as not repeated
ped that used t
angles and facia
the outdoor pa
One seat was p
the structure.
The parking g
n front of both
was acquired i
d to stare with
g, had the illum
The subject w
eo frames at 30
One-one thousa
total there wer
variability into
to repeat this p
gery from 70-m
ercises are rep
n. This comple
ion. SWIR Fac
ion. SWIR Fac
erimental cond
ry was inadver
magery at 70-m
era motion, pa
d. While the
al expressions.
arking structure
placed at a ran
At both locatio
garage is dimly
seats that force
indoors, each s
h a neutral exp
minator on, ha
was asked to m
0 frames per se
and, two-one th
re nine marks t
o the facial exp
rogression wit
m can be seen i
peated: ten sec
etes the acquisi
cial Imagery at 7
cial Imagery at 7
ditions resulted
rtently overexp
m and 160-m.
articularly at 16
overall face m
system to rec
. The video wa
e of the WVHT
nge of 70 mete
ons, back drop
y lit offering ve
ed the subjects
subject was ush
pression direct
ad the imager z
maintain this po
econd. The su
housand. The
that gave a com
pression record
thout speaking
in Figures 4 an
cond frontal ne
ition process.
70-m range show
70-m range show
d in less than o
posed, with the
In addition, h
60 m. Due to t
matching perfor
cord SWIR vid
as then post-pr
TCF facility.
ers from the sy
ps were placed
ery little ambie
s to turn their h
hered to the 70
tly into the TI
zoom set to the
osition and exp
ubject was then
e subject was t
mplete angular
ed from frame
the words out
nd 5. The subj
eutral expressi
wing high correla
wing low correla
optimal imager
e effect most pr
high winds in th
time constraint
rmance on this
deo of all 10
rocessed for im
ystem and ano
behind the sub
ent lighting. S
heads approxim
0 meter mark in
INDERS syste
e minimum fie
pression for 10
n instructed to
then instructed
view of the fa
e to frame. On
loud, maintain
ject was then a
on, talking ang
ation between im
ation between im
ry. First, some
ronounced at 1
he parking stru
ts and harsh w
s dataset is exp
subjects in the
mage extraction
S system was
other at a range
bjects to minim
Several chalk m
mately 20 degre
n the parking s
em. At this p
ld of view (FO
0 seconds, allow
look at the bot
d to look at the
ce. Having the
ce this was com
ning a constant
asked to procee
gular progress

mages to be fused

mages to be fused
e of the 70-m
60-m. Figure
ucture resulted
winter weather,
pected to be f
e second
set up at
e of 160
mize any
ees when
oint, the
OV), and
wing for
ttom left
e bottom
e subject
t, neutral
ed to the
sion, and
d for
d for
6 shows
in some
the data
far lower
Proc. of SPIE Vol. 9090 909006-5
Downloaded From: on 09/27/2014 Terms of Use:

than typical
correlation fu
Figure 6.
Earlier intern
degrees is co
angles is far
facial imager
3.4 Processi
Once the vid
imaging cond
extracted as w
extracted fro
correlation (h
sets are listed
6 frontal n
6 frontal neu

Each set of p
person visibl
probe images
locations, it w
neutral set w
images, wher
location requ
marked each
For each tria
for all ten tes
table above.
used imagery s
Image overexp
omparable to th
lower. Thus,
ry with frontal
ing for Facial
deo was acqui
ditions. At bot
well as each o
om the recorde
high variability
d in Table 1.
6 fron
15 fron
24 fron
neutral, 9 neutra
utral, 9 neutral
angle progr
probe images w
le-spectrum gal
s, eye locations
was important
were included in
re severe overe
uired that the en
al, the rank and
st subjects and

erformance at
should still illu

posure. Exampl
experiments ha
hat of frontal im
any positive e
facial imagery
ired, probe im
th distances, a
f the 9 angular
ed video, they
y) set, there w
Table 1. De
ntal neutral
ntal neutral
ntal neutral
al angle progre
angle progress
ression (24 tota
was then submit
llery, and the f
s were marked
to ensure that a
n all sets at the
exposure led to
ntire processin
d fused score ob
once for each
these distance
strate the effec
es shown from d
ave indicated t
magery; howev
ffect observed
y may be signif
mages were ex
large number o
r poses with bo
were grouped
was a high-corr
escriptions of th
ession (15 total
sion, and 9 talk
tted to the TIN
fused matching
d manually. Be
any probe imag
same distance
o low matching
ng chain be per
btained by the
of the 70-m pr
es, the relative
ct of image cor
distances of both
that matching p
ver, matching p
on fused matc
tracted for eac
of highly-corre
oth neutral and
into several s
relation set con
e sets of probe im
6 fronta
NDERS face re
g scores for the
ecause the matc
ge included in
e) had the same
g performance,
formed 4 times
subject of inte
robe sets and 4
e performance

h 70 m (left) and
performance of
performance o
ching scores re
ch subject cor
elated frontal n
d talking expre
sets with high
ntaining the sa
mages used in th
al neutral, 9 neu
angle p
cognition softw
e top 10 ranked
ching scores de
more than one
e marked eye p
the sensitivity
s, independentl
erest was recor
times for each
between high
d 160 m (right).
f facial imager
f facial imager
esulting from th
rresponding to
essions. Once
and low corre
ame number of
he experiment
160 Meters
6 frontal neutra
4 frontal neutr
utral angle prog
progression (24

ware for match
d candidates we
epend upon the
e set (e.g. the im
positions in eac
y of the results
ly, with eye-lo
rded in a spread
h of the 160-m
h-correlation a

ry at pose angle
ry with 20-deg
he fusion of 20
o each of the d
ion probe imag
the probe imag
elation. For ea
f probe images
gression, and 9
4 total)
hing against the
ere displayed.
e marked eye
mages in the 6
ch set. For the
to the choice o
ocations indepe
dsheet. This w
probe sets liste
and low-
es of 10-
gree pose
ges were
ges were
ach low-
s. These
9 talking
e 114-
For all
e 160-m
of eye
was done
ed in the
Proc. of SPIE Vol. 9090 909006-6
Downloaded From: on 09/27/2014 Terms of Use:
C u m u l a t i v e M a t c h C h a r a c t e r i s t i c a t 7 0 m
0 . 9
0 . 8
0 . 7
@ 0 . 6
0 . 5
s 0 . 4 -

0 . 3 -
0 . 2 -
0 . 1 -
3 4 5 6 7 8 9 1 0
R a n k
0 . 9
0 . 8
0 . 7
@ 0 . 6
0 . 5
s 0 . 4

0 . 3
0 . 2
0 . 1
C u m u l a t i v e M a t c h C h a r a c t e r i s t i c a t 7 0 m
6 N e u t r a l s
- P - 1 5 N e u t r a l s
- - 2 4 N e u t r a l s
0 o
0 1 2 3 4 5 6
R a n k
7 8 9 1 0

3.5 Cumulative Match Characteristics
The cumulative match characteristic (CMC) curve
is a convenient tool to compare the face recognition accuracies
resulting from the fusion of the different sets of probe images. Because the CMC curve directly shows how often a
genuine match is highly ranked by the system, it is better suited to the current experiment than other types of metrics,
such as the receiver operating characteristic
. Overlaying the CMC curves for different sets of probe images will allow
for easy visualization of the relative matching performance for the fusion of the different sets.
To calculate a CMC curve, the fraction of subjects correctly ranked better than or equal to each rank is computed. For
instance the Rank 1 value of the CMC is just the fraction of subjects who were correctly identified at Rank 1. For Rank
2, the fraction of subjects correctly identified at Rank 2 or higher is computed, etc. Equation 1 provides the general
formula for the CMC, where P(r) is the fraction of test subjects whose fused gallery image matching score had a rank of
r out of all gallery images in the database, and k is the rank coordinate on the horizontal axis of the CMC curve.
CHC(k) = P(r);
k = 1, , m, (1)

3.6 Results for 70-m data
The CMC results for the high-correlation probe sets are shown in Figure 7. The left plot shows the baseline CMC for the
fusion of a set of 6 frontal, neutral-expression facial images like those in Figure 4, while the right plot overlays the CMC
curves for similar sets with 15 and 24 images. The Rank 1 performance for all three sets is 40%, while the Rank 10
performance of the 15 and 24-image set is 80% as compared to 70% for the 6-image set. This is consistent with a
possible small benefit to fusing results for a larger number of very similar probe images, although this result is within the
statistical error of the experiment. Note that with a database size of 114 faces, the purely random chance of guessing a
Rank 1 match would be 0.9% and the chance of guessing Rank 10 or better would be ~ 9%.

Figure 7. 70-m CMC results for highly-correlated sets of probe images. (left) CMC results for the fusion of 6 frontal,
neutral-expression images. (right) CMC results for the fusion of 15 and 24 highly-correlated frontal, neutral-expression
images overlaid on the CMC from the left plot.
Figure 8 compares the CMC curves for low-correlation probe sets to high-correlation probe sets acquired at 70-m range.
The left plot shows the CMC for the 70-m set that includes 6 neutral-frontal images and 9 neutral angle progression
images, while the right plot shows the set that also includes the 9 talking angle progression images. Both plots overlay
CMC curves for high-correlation probe sets for comparison. In both cases, a clear performance improvement is seen
when the fused dataset includes both neutral frontal and a progression of angled poses. In particular, both of the low-
correlation image sets have 90% rank 4 performance and 100% rank 10 performance, while the high-correlation image
sets have no better than 60% rank 4 performance and 80% rank 10 performance. This is despite the fact that the low-
correlation SWIR probe image sets fuse the original 6 neutral-frontal images with additional SWIR images that have a
20 pose angle, which typically have low matching performance against frontal neutral visible gallery images.
Proc. of SPIE Vol. 9090 909006-7
Downloaded From: on 09/27/2014 Terms of Use:
C u m u l a t i v e M a t c h C h a r a c t e r i s t i c a t 7 0 m
0 . 9
0 . 8
0 . 7
0 . 6
0 . 5
0 . 4 -

0 . 3
0 . 2
0 . 1
a - 6 N e u t r a l s
t N . P r o g r e s s i o n
- 9 - 1 5 N e u t r a l s
3 4 5 6 7 8 9 I i i '
R a n k
0 . 9
0 . 8
0 . 7
0 . 6
0 . 5
0 . 4

0 . 3
0 . 2
0 . 1
C u m u l a t i v e M a t c h C h a r a c t e r i s t i c a t 7 0 m
D 4
6 N e u t r a l s
N . & T . P r o g r e s s i o n
X 2 4 N e u t r a l s
3 4
R a n k
7 8 9 1 0
C u m u l a t i v e M a t c h C h a r a c t e r i s t i c a t 1 6 0 m
0 9
0 8
0 7
1 6 N e u t r a l s
2 4 N e u t r a l s
N . & T . P r o g r e s s i o n s
3 0 6
A 0 A
0 3
0 2
0 1
0 1 2 3 4 5 6 7 8 9 1 0
R a n k
C u m u l a t i v e M a t c h C h a r a c t e r i s t i c a i 1 6 0 m A n a l y s i s
0 9
0 8
N e u t r a l s
2 4 N e u t r a l s
N . & T . P r o g r e s s i o n s
0 7
3 0 6
0 3
0 2
3 4 5 6 7 8 9 1 0
R a n k
0 9
0 8
0 7
3 0 6
E 0 A
0 3
0 2
0 1
C u m u l a t i v e M a t c h C h a r a c t e r i s t i c a t 1 6 0 m A n a l y s i s 2
. 1 6 N e u t r a l s
t 2 4 N e u t r a l s
& T . P r o g r e s s i o n s
2 3 4 5 6 7 8 9 1 0
R a n k
0 a
0 1
T 6 N e u t r a l s
I 1 2 4 N e u t r a l e
T N . & T . P r o g r e s s i o n s
2 2 4 5 6 2 8 9 1 6
R a n k

Figure 8. Comparison of low-correlation to high-correlation CMC curves at 70-meters range. (left) CMC curve for set
containing 6 neutral frontal and 9 neutral angle progression images overlaid with CMC curves for 6 and 15 neutral frontal
images. (right) CMC curve for set containing 6 neutral frontal, 9 neutral angle progression, and 9 talking angle progression
images overlaid with CMC curves for 6 and 24 neutral frontal images.
3.7 Results for 160-m data
As mentioned above, the quality of the 160-m images was severely degraded by overexposure and high winds, reducing
the face matching performance well below typical TINDERS performance at this distance. When image quality is poor,
face matching results are very sensitive to the marked eye locations, so 4 analysis groups of CMC curves were
generated, each group having independent, but internally-consistent eye locations. Figure 9 shows the CMC curves for
each of the four groups. Each plot includes a CMC for the 6 and 24 neutral-frontal probe image set as well as the 24-
image set containing 6 neutral-frontal, 9 neutral angle progression, and 9 talking angle progression images.

Figure 9. Comparison of low-correlation (green) to high-correlation (blue and red) CMC curves at 160-meters range. The
low-correlation probe set includes 6 neutral-frontal images, 9 neutral angle progression, and 9 talking angle progression
images. High-correlation probe sets include 6 and 24 neutral-frontal images. Image quality was degraded by overexposure
and motion blur due to high winds. Each plot uses the same images but with independent manual marking of eye locations.
Note the large variability in the data between plots.
(a) (b)
Proc. of SPIE Vol. 9090 909006-8
Downloaded From: on 09/27/2014 Terms of Use:

As can be seen from Figure 9, there was a large amount of statistical variability between the four analysis groups, and it
was not possible to discern a statistically significant difference in matching performance between the high-correlation
and low-correlation probe image sets at 160 m. While the degraded imagery led to overall low matching performance,
achieving only 10% - 20% rank 1 success, and only 50% - 60% rank 10 success, the success rates were still far better
than random chance, which would have predicted only 0.9% and 9% rank 1 and rank 10 success rates, respectively,
indicating that the facial features still played an important role in determining the CMC curves. One possible
explanation for the lack of observed benefit for fusing low-correlation images is the large 20 pose angle used in the
angle progression images.

By fusing the face matching results from multiple probe images of the same person, it is possible to improve overall
matching performance. It is logical to expect that probe images that closely resemble each other will produce similar
matching scores against the gallery, and thus derive limited overall benefit from fusion. In contrast, it is logical to
expect that probe images with varying pose angle and facial expression, within some useable range, will produce more
variation in matching scores, thus deriving a larger benefit from fusion. It is also clear that probe images with very high
pose angle or extreme expressions, where single-probe matching performance is extremely low, would not be expected
to improve matching performance when fused with other probe images within the useable range. The 70-m results
shown in Figure 8 clearly show that the fusion of 24 images with variable pose angles and expressions results in higher
matching performance than the fusion of 24 images with very similar pose angle and expression. These results indicate
that even images with pose angle of 20 can have a positive impact on matching performance, when the image quality is
sufficiently high. In contrast, the 160-m results shown in Figure 9, which used low-quality, overexposed probe images,
showed no statistically-significant improvement to matching performance when images with varying 20 poses were
added to the fused probe image set. One possible explanation for this is that the range of useable pose angles decreases
as image quality decreases. It is possible that a low-correlation probe image set of the same quality, using a smaller pose
angle such as 10, might have resulted in a larger improvement to the fused matching score at 160 m. Experiments that
include a larger number of subjects, automatic eye location, and more low-to-intermediate pose angles will be needed to
better understand and quantify the relationship between image correlation and fused face matching performance.
This research was performed under contract N00014-09-C-0064 from the Office of Naval Research, with funding and
oversight from the Deployable Force Protection Science and Technology Program. The authors would like to
acknowledge important technical contributions from Jason Stanley, Kenneth Witt, William McCormick, and the
MorphoTrust USA team, as well as the cooperation of the WVU Center for Identification Technology Research.
[1] Robert B. Martin, Mikhail Sluch, Kristopher M. Kafka, Robert V. Ice, and Brian E. Lemoff, Active-SWIR
signatures for long-range night/day human detection and identification, Proc. SPIE 8734, Active and Passive
Signatures IV, 87340J (May 23, 2013).
[2] Brian E. Lemoff, Robert B. Martin, Mikhail Sluch, Kristopher M. Kafka, William B. McCormick, and Robert V.
Ice, Long-range night/day human identification using active-SWIR imaging, Proc. SPIE 8704, Infrared
Technology and Applications XXXIX, 87042J (June 18, 2013).
[3] Brian E. Lemoff, Robert B. Martin, Mikhail Sluch, Kristopher M. Kafka, William B. McCormick, and Robert V.
Ice, Automated night/day standoff detection, tracking, and identification of personnel for installation protection,
Proc. SPIE 8711, Sensors, and Command, Control, Communications, and Intelligence (C3I) Technologies for
Homeland Security and Homeland Defense XII, 87110N (June 6, 2013).
[4] Laser Safety, Wikipedia. 2013. Web. 12 March 2013. <>
[5] MorphoTrust USA Face Examiner Workstation web page.
rWorkstation.aspx .
Biometrics Testing and Statistics:
Proc. of SPIE Vol. 9090 909006-9
Downloaded From: on 09/27/2014 Terms of Use: