by Joseph M. Cocchini
ABSTRACT
Modern-day generation and use of sensitive digital data continues to grow at an aggressive rate (Craglia, et al., 2008). In the face of this mounting pressure, data-protective workhorse cryptography algorithms face becoming, at the least, suspect and, at the most, obsolete. DNA-based and DNA-derived solutions are presently emerging as viable platforms on which to regain lost ground and from which to advance the state of the current art (Bardou, et al., 2012). DNA's massively parallel processing capabilities, storage capacity and density, computational abilities and natural one-way transcription properties make it a contender in the archival storage, parallel processing and digital data encryption and decryption environments ("The emerging science," 2009). Moving to real-time (lab-less, in situ sampling) reading of in situ DNA remains one of the largest hurdles to practical implementation, but progress is being made toward that end (Toumazou, et al., 2012). This paper looks at human cryptographic history leading up to DNA's digital-data-specific componentry [and characteristics of same], the mechanics of commoditizing DNA for use as a digital data storage and encryption medium, practical strides made toward using DNA as a storage and retrieval environment and commercial DNA reading technologies.
TABLE OF CONTENTS
ABSTRACT
TABLE OF CONTENTS
TABLE OF FIGURES
TABLE OF TABLES
INTRODUCTION
HISTORICAL PERSPECTIVE
DNA AS AN INFORMATION MEDIUM
    Codons
    Longevity of DNA
    DNA Digital Storage Capacity
    DNA Energy Efficiency
    Data Error Rates in DNA
    DNA Microarrays
    Reading and Writing in DNA
    DNA Barcoding of Living Organisms
    Biomolecular or DNA Computing
    Breaking DES Using a Molecular Computer
DNA-BASED CRYPTOGRAPHY
    DNA Encoding
    DNA-Based Steganography and Watermarking
    DNA-Based Data Encryption Using Yet Another Encryption Algorithm (YAEADNA)
        Inputs Pseudo-Code
        Algorithm Pseudo-Code
        Output Pseudo-Code
    DNA/Amino Acid-Based Encryption of the Playfair Cipher
        Inputs Pseudo-Code
        Algorithm Pseudo-Code
        Processing Pseudo-Code
        Output Pseudo-Code
        Experiment
        Opportunities for Additional Security (Sabry, et al., 2010)
    DNA-Based Encryption using the DNA-Crypt Algorithm
    One-Time Pads
    DNA-Based Data Encryption Using Traditional (RSA) Encryption Methodology
        Key Generation Detail
        Data Pretreatment Detail
        Encryption Detail
        Decryption Detail
        Data Post-Treatment Detail
        Security Implications
    DNA Detection and Analysis
        Lab-Free Contact-Based DNA Testing
        Lab-Free Contact-Less (Remote) Molecular Analysis
CONCLUSION
REFERENCES
TABLE OF FIGURES
Figure 1. Example of a Spartan-Greek scytale (Ribeiro, 2012).
Figure 2. Example of a Demaratus-like steganographic tablet. The wax layer is removed and text imprinted (and colored for demonstration purposes) on the wood substrate. The wax is then replaced (and inked in white for demonstration purposes) (Schovanek, 2010).
Figure 3. Example of a simple character substitution schema (Stallings, 2010).
Figure 4. Example of an Alberti Cipher wheel (Ribeiro, 2012).
Figure 5. Example of a Jefferson wheel (Ribeiro, 2012).
Figure 6. A basic one-time pad encryption/decryption schema (Stallings, 2010).
Figure 7. Example of an Enigma machine (Ribeiro, 2012).
Figure 8. Example of MIT CTSS system (Lelusz, 2009).
Figure 9. DES encryption schema (Smyth, 2007).
Figure 10. DES encryption schema (Stallings, 2010).
Figure 11. Schematic representation of a single DNA nucleotide or monomer.
Figure 12. Organic representation (dyed) of a segment of DNA polynucleotide structure (Van Voorst, Finzel, 2012).
Figure 13. Nucleotide sequences dictating specific amino acid outcomes.
Figure 14. Electron photomicrograph of bacterium isolated from ~25 million-year-old Dominican amber (Orkand, et al., 1998).
Figure 15. DNA microarray showing magnification of a subsection (DNA Microarray Virtual Lab, 2012).
Figure 16. Schematic of oligonucleotide probes within a DNA microarray.
Figure 17. DNA microarray droplet depositing from inkjet-like printhead (Lausted, et al., 2004).
Figure 18. DNA microarray printing platform (Lausted, et al., 2004).
Figure 19. DNA information storage decoding/encoding schema (Church, et al., 2012).
Figure 20. Illumina HiSeq 2000 real-time DNA sequencer used to read Church and team's encoded microarray (Church, et al., 2012).
Figure 21. Sample DNA sequencer output of nucleotide readings from a single land fragment (How do we sequence DNA?, 2012).
Figure 22. Example of a UPC barcode.
Figure 23. Example DNA barcoding of animals and insect species and the subtle variations between seemingly identical species (Luoma, 2008).
Figure 24. Graphic schema of a seven-node Hamiltonian Path Problem showing fourteen possible routes, with the red line representing the only correct path (Chen, Wood, 2000).
Figure 26. Function schema of steganographic algorithms (Heider, Barnekow, 2007).
Figure 27. Schema of a secret-key stegosystem with passive adversary showing embedded text E, covertext C, stegotext S, Alice's private random source R, and secret key K shared by Alice and Bob, with Alice sending either covertext C or stegotext S (Cachin, 1998).
Figure 28. A basic steganographic implementation in DNA. The message synthesizing process is shown (a). The encoding rule is shown (b). The PCR result is shown (c). The DNA-based ciphertext and corresponding plaintext is shown (d) (Cachin, 1998).
Figure 29. Computational schema of one YAEADNA encryption round (Amin, et al., 2006).
Figure 30. Pearson correlation analysis between ciphertext and corresponding plaintext (Amin, et al., 2006).
Figure 31. Test image results (not to scale) before and after processing [from left to right] of the underlying data to show randomness of octet distribution within a given DNA sequence (Amin, et al., 2006).
Figure 32. Flowchart of DNA-based Playfair algorithm (Sabry, et al., 2010).
Figure 33. Sample encryption processing (Sabry, et al., 2010).
Figure 34. Sample decryption processing (Sabry, et al., 2010).
Figure 35. Structure of the encoder and decoder for a Hamming code (Bandakkanavar, 2010).
Figure 36. A basic one-time pad encryption/decryption schema (Stallings, 2010).
Figure 37. A DNA-based one-time pad A, plain text B, cipher text and primer DNA polymerase primer (black box) (Wong, et al., 2003).
Figure 38. Example of DNA tiles carrying data representing binary 1 (light blue), binary 0 (white), start block or attachment point (dark blue fading to lower right) and end marker or attachment point (dark blue fading to upper left). The sequences can be ligated (attached by an end or ends) to longer DNA strands by using the start and stop points, or sticky ends (Leier, et al., 2000).
Figure 39. Assembly of DNA binary strands. Strands are composed of shorter, concatenated tiles by overlapping sticky start (s) and end (e) terminators with arbitrary quantities of DNA bits in between. Bit strands containing up to 32 bits were yielded in this process (Leier, et al., 2000).
Figure 40. Example of gel-electrophoresis (Electrophoresis, 2011).
Figure 41. Self assembly schema of a random DNA tile assembly (Hirabayashi, et al., 2010).
Figure 42. Scaffold construction schema by sample input message: M = 00011011 (Hirabayashi, et al., 2010).
Figure 43. Encryption scheme flowchart (Cui, 2008).
Figure 44. Data pre/post treatment schema (Cui, 2008).
Figure 45. Genalysis proprietary microchip for real-time DNA analysis (Toumazou, et al., 2012).
Figure 46. Excitation schema of Genalysis proprietary ISFET mechanism (Toumazou, et al., 2012).
Figure 47. Schema of hydrogen ion release upon extending an existing DNA strand with one or more additional nucleotides (Toumazou, et al., 2012).
Figure 48. Genalysis external testing process overview (Toumazou, et al., 2012).
Figure 49. Schematic representation of single molecule experiment to detect malachite green adsorbed on a planar metal surface (Neacsu, et al., 2006).
Figure 50. Scanning electron microscopy image of a wrinkled Raman-active gold surface (Zhang, et al., 2011).
TABLE OF TABLES
Table 1. Amino acids, corresponding codes and codons that initiate their generation.
Table 2. ASCII table of Western type characters.
Table 3. Comparative sequential numbering represented in Quaternary, Binary and Decimal.
Table 4. Viable lifespans of different types of digital data storage mediums (Conway, 1996).
Table 5. Viable type character densities over a variety of mediums (Conway, 1996).
Table 6. Data bit densities across a variety of mediums (Church, Kosuri, 2012).
Table 7. Raw, uncorrected data bit error rates across a variety of mediums.
Table 8. Associations of DNA nucleotide bases to binary number equivalents.
Table 9. Plain text character frequency in DNA strand (Amin, et al., 2006).
Table 10. DNA representation of bits (Sabry, et al., 2006).
Table 11. Amino acids and their corresponding 64 codons (Sabry, et al., 2006).
Table 12. Distribution of the alphabet with corresponding, interchangeable codons (Sabry, et al., 2010).
Table 13. New distribution of the alphabet with corresponding codons, after application of the Playfair Cipher (Sabry, et al., 2010).
Table 14. Performance results of experiment processing stages (Sabry, et al., 2010).
Table 15. Associations of DNA nucleotide bases to binary number equivalents.
INTRODUCTION
In the area of information security, the need for reliable protection, storage and consumption of sensitive data has been codified into the CIA Triad: Confidentiality, Integrity and Availability. In this context, confidentiality implies two overarching classes of information consumer: those subscribers that merit access to a given data set within given circumstances, and those that do not. Integrity in this context implies a reliable lack of adulteration of any given data set. Availability suggests a minimum standard of secure, reliable functionality in all conduit systems serving up any given data set to vetted subscribers of same (Confidentiality, Integrity & Availability, 2009).
Mankind has developed a wide variety of methods over time to enforce and make available the elements of the Triad. But with each advance, the opportunity to render that advance null offers itself (Mathai, 2012) (Ribeiro, 2012).
As modern, workhorse cryptography algorithms (such as DES and, more recently, MD5) are broken or become suspect, new solutions for information security are being sought out to offer protection for sensitive data (Bardou, et al., 2012).
As a result, DNA-based digital data storage, cryptography and steganography are emerging as technologies with significant potential for more capable versions of existing encryption implementations as well as new and potentially unbreakable algorithms (Cui, et al., 2009).
HISTORICAL PERSPECTIVE
Modern data protection has its roots in all manner of mechanical, arithmetical and mathematical constructs (Ribeiro, 2012).
In ~700BC, the Spartans and Greeks used scytales to send sensitive battle-related military messages back and forth. Both sender and receiver possessed identical, hexagonal sticks, around which a strip of leather or parchment was wrapped in a spiral. The sender would write out a message on the spiral-wound material, unwind it and send it along to the intended receiver, who upon receipt would wrap the material around a stick of identical proportion in order to decode the message (Ribeiro, 2012).
Figure 1. Example of a Spartan-Greek scytale (Ribeiro, 2012).
Herodotus refers to an example of steganography in his book of ~440BC, Histories, in which the character Demaratus sent a warning about a forthcoming attack to Greece by writing it directly on the wooden backing of a wax tablet before covering the backing over with the intended beeswax writing surface (Arnold, 2000).
Figure 2. Example of a Demaratus-like steganographic tablet. The wax layer is removed and text imprinted (and colored for demonstration purposes) on the wood substrate. The wax is then replaced (and inked in white for demonstration purposes) (Schovanek, 2010).
Julius Caesar employed a simple substitution cipher ~50BC to secure military communications. This relatively weak encryption method shifts each plaintext character a fixed number of positions through the alphabet, using the same shift value for every character (Mathai, 2012).
Figure 3. Example of a simple character substitution schema (Stallings, 2010).
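The shift described above can be sketched in a few lines of Python. This is a minimal illustration rather than a reconstruction of any historical procedure: the function name `caesar_shift`, the 26-letter uppercase alphabet and the shift value of 3 are all assumptions made for the example.

```python
import string

ALPHABET = string.ascii_uppercase  # assumed 26-letter alphabet for illustration


def caesar_shift(text, shift):
    """Shift every letter by the same number of positions; keep other characters."""
    shifted = []
    for char in text.upper():
        if char in ALPHABET:
            shifted.append(ALPHABET[(ALPHABET.index(char) + shift) % 26])
        else:
            shifted.append(char)
    return "".join(shifted)


ciphertext = caesar_shift("ATTACK AT DAWN", 3)
recovered = caesar_shift(ciphertext, -3)

# The weakness: only 26 shifts exist, so every candidate plaintext can be listed.
candidates = [caesar_shift(ciphertext, -k) for k in range(26)]
```

Because the same shift is applied to every character, an eavesdropper needs at most 26 trial decryptions, which is why the method is described as relatively weak.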
In 1467, Leon Battista Alberti developed what appears to have been the first polyalphabetic substitution cipher. The underlying mechanism consisted of two concentric metal discs of differing diameters, allowing for the alignment of characters in the plaintext and ciphertext (Ribeiro, 2012).
Figure 4. Example of an Alberti Cipher wheel (Ribeiro, 2012).
Thomas Jefferson invented what is known as the Jefferson Wheel in 1797. The wheel consisted of 26 discs on a shared spindle, with the letters of the alphabet embossed on their rims. Turning the discs would scramble and unscramble the plaintext message (Ribeiro, 2012).
Figure 5. Example of a Jefferson wheel (Ribeiro, 2012).
Auguste Kerckhoffs was a military cryptologist who proposed a sea change [at the time] in the thinking about contemporary encryption practices in the late 19th century. Kerckhoffs suggested that an encryption algorithm should be assumed to be known, and that the key alone should be assumed to be secret. In this way, if a key is compromised, it can be changed, reestablishing secure communications without the need to abandon the encryption method itself (Mathai, 2012).
Figure 6. A basic one-time pad encryption/decryption schema (Stallings, 2010).
The One Time Pad (OTP) is a technique that offers 'perfect secrecy' by using a truly random key only once per communication, whose length is the same as the plaintext in question. While the initial theory has been attributed to a paper written by Frank Miller in 1882, the practical implementation of the OTP is attributed to U.S. Army Signal Corps Commander Joseph Mauborgne and Bell Labs' Gilbert Vernam some thirty-five years later. This approach offers perfect secrecy because even if a single key is compromised, it does not reveal anything about future or past transmissions. The strength of the technique lies in randomness and one time use of the keys (Stallings, 2010).
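A minimal sketch of the one-time pad idea follows, assuming the key is combined with the plaintext by bytewise XOR (a common Vernam-style construction; the combining operation is not specified in the text above). The function names are hypothetical, and Python's `secrets` module supplies cryptographically strong pseudo-random bytes rather than the truly random key the technique formally requires.

```python
import secrets


def otp_encrypt(plaintext):
    """Return (key, ciphertext); the key is as long as the plaintext and used once."""
    key = secrets.token_bytes(len(plaintext))  # assumption: stand-in for a truly random pad
    ciphertext = bytes(p ^ k for p, k in zip(plaintext, key))
    return key, ciphertext


def otp_decrypt(ciphertext, key):
    """XOR with the same key restores the plaintext."""
    if len(key) != len(ciphertext):
        raise ValueError("key must be exactly as long as the ciphertext")
    return bytes(c ^ k for c, k in zip(ciphertext, key))


message = b"MEET AT NOON"
key, ciphertext = otp_encrypt(message)
recovered = otp_decrypt(ciphertext, key)
```

Each call to `otp_encrypt` draws a fresh key, mirroring the requirement that a pad is never reused; reusing a key across two messages would forfeit the secrecy property described above.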
Figure 7. Example of an Enigma machine (Ribeiro, 2012).
Invented by Arthur Scherbius ~1918, Enigma was Germany's main cryptographic technology during WW II. It consisted of a basic keyboard, a display that would reveal the cipher text letter and a scrambling mechanism such that each plain text letter entered as input via the keyboard was transcribed to its corresponding cipher text letter (Mathai, 2012).
Figure 8. Example of MIT CTSS system (Lelusz, 2009).
MIT's Compatible Time-Sharing System (CTSS) was the first known system to employ a formal username/password combination to control access. In addition, CTSS may have been the first system to be compromised by a password breach. In 1966, a software bug resulted in the exposure of the CTSS master password table (Ribeiro, 2012).
Figure 9. DES encryption schema (Smyth, 2007).
The National Bureau of Standards (NBS) published the Data Encryption Standard (DES) in 1977, using what was then state-of-the-art 56-bit encryption. Not even supercomputers of the time could crack DES, which remained a standard for roughly twenty years until it was broken in ~fifty-six hours in 1998 by the Electronic Frontier Foundation. Cable television content providers HBO, Cinemax and others introduced the Videocipher II in 1985, which was a video scrambling system based on DES (Ribeiro, 2012).
Figure 10. DES encryption schema (Stallings, 2010).
The National Institute of Standards and Technology (NIST) published the Advanced Encryption Standard (AES) in 2001. AES uses 128-bit encryption, and it is estimated that cracking it would require 2^55 (more than 36 quadrillion) years to accomplish, using non-quantum computing resources. AES is actively in use today (Stallings, 2010).
Figure 11. Schematic representation of a single DNA nucleotide or monomer.
DNA nucleotides do not generally exist in nature as freestanding molecules, but more commonly pair off, forming a twisting or helix structure as they do so. DNA monomers pair off by virtue of mutual, complementary attraction (hydrogen bonding) between connective points to form polymers, or
Figure 12. Organic representation (dyed) of a segment of DNA polynucleotide structure (Van Voorst, Finzel, 2012).
Codons
Abstractly speaking, a strand of DNA is roughly equivalent to a tape measure onto which groupings of symbols (bases) have been printed in a repetitive pattern a great many times. Through a lengthy and complex process, these bases are eventually translated into chains of amino acids (Shimanovsky, et al., 2003).
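Amino acid                      Code   Nucleotide codon
Alanine                         A      GCT GCC GCA GCG
Arginine                        R      CGT CGC CGA CGG AGA AGG
Asparagine                      N      AAT AAC
Aspartic acid                   D      GAT GAC
Cysteine                        C      TGT TGC
Glutamine                       Q      CAA CAG
Glutamic acid                   E      GAA GAG
Glycine                         G      GGT GGC GGA GGG
Histidine                       H      CAT CAC
Isoleucine                      I      ATT ATC ATA
Leucine                         L      TTA TTG CTT CTC CTA CTG
Lysine                          K      AAA AAG
Methionine                      M      ATG
Phenylalanine                   F      TTT TTC
Proline                         P      CCT CCC CCA CCG
Serine                          S      TCT TCC TCA TCG AGT AGC
Threonine                       T      ACT ACC ACA ACG
Tryptophan                      W      TGG
Tyrosine                        Y      TAT TAC
Valine                          V      GTT GTC GTA GTG
Aspartic acid or asparagine     B      Random codon from D and N
Glutamic acid or glutamine      Z      Random codon from E and Q
Unknown                         X      Random codon
Stop                                   TAA TAG TGA
Table 1. Amino acids, corresponding codes and codons that initiate their generation.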
In this scenario, each subgroup of triplet bases (codons) is treated as being representative of a character in a finite alphabet. This is not unlike the use of groups of bits of computer data to represent the characters of the Western alphabet. In fact, the amino acid code and codon table bears a strong visual and functional resemblance to the ASCII table of Western type characters (Shimanovsky, et al., 2003).
Character   Decimal number   Binary number
A           65               0100 0001
B           66               0100 0010
C           67               0100 0011
D           68               0100 0100
E           69               0100 0101
F           70               0100 0110
G           71               0100 0111
H           72               0100 1000
I           73               0100 1001
J           74               0100 1010
K           75               0100 1011
L           76               0100 1100
M           77               0100 1101
N           78               0100 1110
O           79               0100 1111
P           80               0101 0000
Q           81               0101 0001
R           82               0101 0010
S           83               0101 0011
T           84               0101 0100
U           85               0101 0101
V           86               0101 0110
W           87               0101 0111
X           88               0101 1000
Y           89               0101 1001
Z           90               0101 1010
Table 2. ASCII table of Western type characters.
DNA is, in essence, a natural platform for the encoding of data, with the lowest coding denominator being the codon. These natural groupings are a form of minimal codeword from which more complex instruction sets are derived. Codons are the fundamental unit of code storage within DNA, comprised of discrete triplets of nucleotides representing data values based on the sequence within each triplet (Shimanovsky, et al., 2003).
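The codon-as-codeword idea can be illustrated with a short lookup sketch. It assumes the standard genetic code and the one-letter amino acid codes; the names `CODON_TABLE`, `split_codons` and `translate` are hypothetical, and `*` is used here to mark a stop codon.

```python
from itertools import product

# Standard genetic code, listed with bases in T, C, A, G order for each of the
# three codon positions (the last position varies fastest).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def split_codons(sequence):
    """Cut a nucleotide string into consecutive triplets, ignoring a trailing remainder."""
    usable = len(sequence) - len(sequence) % 3
    return [sequence[i:i + 3] for i in range(0, usable, 3)]


def translate(sequence):
    """Map each codon to its one-letter amino acid code ('*' marks a stop codon)."""
    return "".join(CODON_TABLE[codon] for codon in split_codons(sequence.upper()))


protein = translate("ATGGCTTGGTAA")  # ATG, GCT, TGG, TAA
```

The lookup works the same way a character table does for bytes: a fixed-width group of symbols (three bases rather than eight bits) selects one entry from a finite alphabet.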
Figure 13. Nucleotide sequences dictating specific amino acid outcomes.
Adenine (A), Cytosine (C), Thymine (T) and Guanine (G) form the basis for a quaternary or Base4 numbering system.
Quaternary   Binary   Decimal
0            0        0
1            1        1
2            10       2
3            11       3
10           100      4
11           101      5
12           110      6
13           111      7
20           1000     8
21           1001     9
22           1010     10
23           1011     11
30           1100     12
31           1101     13
32           1110     14
33           1111     15
100          10000    16
101          10001    17
102          10010    18
103          10011    19
110          10100    20
111          10101    21
112          10110    22
Table 3. Comparative sequential numbering represented in Quaternary, Binary and Decimal.
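As a sketch of how a Base4 system carries binary data, the following Python converts integers to quaternary and packs each byte into four bases (two bits per base). The digit assignment A=0, C=1, T=2, G=3 simply follows the order in which the bases are listed above and is an assumption for illustration; the function names are hypothetical.

```python
DIGIT_TO_BASE = "ACTG"  # assumed mapping: A=0, C=1, T=2, G=3


def to_quaternary(value):
    """Return the base-4 digit string of a non-negative integer."""
    if value == 0:
        return "0"
    digits = []
    while value > 0:
        digits.append(str(value % 4))
        value //= 4
    return "".join(reversed(digits))


def bytes_to_dna(data):
    """Encode each byte as four bases, most significant bit pair first."""
    bases = []
    for byte in data:
        for shift in (6, 4, 2, 0):
            bases.append(DIGIT_TO_BASE[(byte >> shift) & 0b11])
    return "".join(bases)


def dna_to_bytes(dna):
    """Invert bytes_to_dna; expects a length that is a multiple of four."""
    decoded = bytearray()
    for i in range(0, len(dna), 4):
        byte = 0
        for base in dna[i:i + 4]:
            byte = (byte << 2) | DIGIT_TO_BASE.index(base)
        decoded.append(byte)
    return bytes(decoded)


encoded = bytes_to_dna(b"A")  # ASCII 65 = 0100 0001
```

Under this assumed mapping one byte always occupies four bases, so a strand of n bases carries at most 2n bits.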
Longevity of DNA
Data at rest remains accessible [and therefore of value] to the extent that the storage medium in which it is contained remains both viable and accessible. The ideal in terms of professional archiving is for a given storage medium to remain viable and accessible ad infinitum (Conway, 1996). DNA has stood the test of time, in that it has maintained the same underlying molecular construct during the billions of years since life emerged (Bancroft, et al., 2001).
Medium
Clay Tablet
Papyrus
Printed Book
Newspaper
Magnetic Disk
Optical
DNA
Table 4. Viable lifespans of different types of digital data storage mediums (Conway, 1996).
Stored under favorable conditions, DNA can exhibit an effectively limitless shelf life. Microbiologist Raul Cano announced in 1995 the retrieval of two different, viable (able to be made to reproduce) bacterial spores found in a piece of Dominican amber, the origins of which are thought to be 25 to 40 million years old (Lambert, et al., 1998).
Figure 14. Electron photomicrograph of bacterium isolated from ~25 million-year-old Dominican amber (Orkand, et al., 1998).
Medium
Clay Tablet
Papyrus
Printed Book
Newspaper
Magnetic Disk
Optical
DNA
Table 5. Viable type character densities over a variety of mediums (Conway, 1996).
* Extrapolated from Conway's figures for characters per square inch using the average caliper-measured thickness of the medium in question.
** Not part of Conway's figures.
The data densities that are currently possible with DNA vastly exceed comparable volumetric storage capacities of current electronic, magnetic, optical and experimental media, making it a potentially attractive storage medium for encrypted archival data (Church, Kosuri, 2012).
Medium                             Date
Compact Disk (CD)                  1982
DVD-QL                             2000
Blu-Ray (QL)                       2010
Oracle StorageTek Magnetic Tape    2010
Hard Disk                          2012
12-atom memory                     2012
Xe positioning                     1991
Quantum Holography                 2008
Super-resolution GFP               2011
DNA in E. coli                     2005
Mycoplasma                         2010
Single Strand DNA (ssDNA)          2012
Table 6. Data bit densities across a variety of mediums (Church, Kosuri, 2012).
Based on a demonstrable data density approaching 5.5 petabits (5.5 million billion bits) per cubic millimeter, all the data in the world could theoretically be stored using several grams of DNA (Church, Kosuri, 2012).
Medium
Optical Compact Disc
In vitro (microarray) DNA
In vivo (live) cell DNA
Magnetic Disk Subsystem
Table 7. Raw, uncorrected data bit error rates across a variety of mediums.
Uncorrected data error rates in DNA are also favorable when compared to current data storage technologies. Because DNA as a storage medium permits each segment of information to be stored in an enormous number of identical molecules, the resulting redundancy tends to mitigate data losses that could foreseeably take place due to progressive, random decay (Bancroft, et al., 2001).
In the case of the 2012 Church-Kosuri experiment in which an entire book, including images and formatting, was written to DNA, the observable error rate was 10 bits per 5.7 million bits encoded, a factor of approximately 1.8x10^-6, with the bulk of those errors occurring primarily in areas not containing payload information (splice areas), making the effective error rate lower still (Church, Kosuri, 2012).
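The quoted factor follows directly from the two figures given in the paragraph above, as this short check shows. The function name `raw_bit_error_rate` is hypothetical; the inputs are the 10 erroneous bits and 5.7 million encoded bits stated in the text.

```python
def raw_bit_error_rate(bit_errors, bits_encoded):
    """Fraction of encoded bits that were read back incorrectly."""
    return bit_errors / bits_encoded


rate = raw_bit_error_rate(10, 5.7e6)
formatted = f"{rate:.1e}"           # scientific notation, one decimal place
errors_per_million = rate * 1e6     # same rate expressed per million bits
```

Expressed per million bits, this is a little under two erroneous bits for every million written, before any error correction is applied.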
By comparison, the raw bit error rate on a typical optical compact disc, which stores data at a significantly lower bit density, is between 10^-5 and 10^-6 before electronic correction. That is to say, the demonstrated, uncorrected (raw) bit error rate in the Church book experiment is nearly two times lower than the raw bit error rate of a typical optical compact disc before electronic correction (National Chung Hsing University, 2012).
DNA Microarrays
In the context of digital data storage and retrieval, DNA digital data microarrays are composed of typically planar (flat) substrates onto which an extremely fine grid of DNA material is deposited in the form of dots, accompanied by the chemistry required to enable the process (Chan, et al., 2009).
Figure 15. DNA microarray showing magnification of a subsection (DNA Microarray Virtual Lab, 2012).
A microarray substrate may be composed of glass, silicon chips or nylon membranes. The DNA is then printed, spotted, or synthesized directly onto the substrate in the form of DNA probes. In this context, a DNA probe is a relatively small (stretches of 50-100 nucleotides), single-stranded molecule of DNA bases (Microarrays Factsheet, 2007).
Figure 16. Schematic of oligonucleotide probes within a DNA microarray.
Figure
17.
DNA
microarray
droplet
depositing
from
inkjet-like
printhead
(Lausted,
et
al.,
2004).
The
printing
system
is
enhanced
by
control
systems
that
ensure
the
intended
series
of
nucleotide
dots
are
deposited
on
the
glass
substrate,
thereby
limiting
catastrophic
data
storage
media
errors.
Using
this
printing
process,
researchers
have
been
able
to
synthesize
(print)
oligonucleotide
probes
(stretches
of
50-100
nucleotides)
in
a
manner
identical
to
that
of
using
an
inkjet
printer
to
print
words
on
a
page
of
paper.
Figure
18.
DNA
microarray
printing
platform
(Lausted,
et
al.,
2004).
Church
and
his
team
converted
an
HTML-coded
draft
of
a
book
that
included
53,426
words,
11
JPG
images
and
1
JavaScript
program
into
a
5.27
megabit
bit
stream.
That
bit
stream
was
then
encoded
onto
54,898
oligonucleotides
(oligos)
each
of
which
in
turn
encoded
a
96-bit
data
block,
a
19-bit
address
specifying
the
location
of
the
data
block
in
the
bit
stream,
and
flanking
common
sequences
(start
and
stop
blocks)
for
amplification
and
sequencing.
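The block-and-address layout described above can be illustrated with a minimal Python sketch. It only shows how a bit stream might be cut into 96-bit data blocks, each tagged with a 19-bit address so that the blocks can be reordered after sequencing. The function names, the zero-padding of the final block and the string representation of bits are assumptions made for illustration; they are not taken from Church's published method, and the flanking amplification sequences are omitted.

```python
def split_into_oligo_payloads(bits: str, block_bits: int = 96, address_bits: int = 19):
    """Cut a bit string into (address, data block) pairs, one pair per oligo."""
    if any(b not in "01" for b in bits):
        raise ValueError("bits must contain only '0' and '1'")
    payloads = []
    for index, start in enumerate(range(0, len(bits), block_bits)):
        if index >= 2 ** address_bits:
            raise ValueError("bit stream too long for the address width")
        # Assumption: the last block is zero-padded to the full block length.
        block = bits[start:start + block_bits].ljust(block_bits, "0")
        address = format(index, f"0{address_bits}b")
        payloads.append((address, block))
    return payloads


def reassemble(payloads, original_length: int) -> str:
    """Order the blocks by address and strip the padding again."""
    ordered = sorted(payloads, key=lambda pair: int(pair[0], 2))
    return "".join(block for _, block in ordered)[:original_length]


if __name__ == "__main__":
    stream = "10" * 100                      # a 200-bit toy bit stream
    pieces = split_into_oligo_payloads(stream)
    print(len(pieces), "payloads; first address:", pieces[0][0])
    print(reassemble(list(reversed(pieces)), len(stream)) == stream)
```

A 19-bit address can distinguish 2^19 = 524,288 blocks, which comfortably covers the 54,898 oligonucleotides reported for the book.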
Figure
19.
DNA
information
storage
decoding/encoding
schema
(Church,
et
al.,
2012).
To
read
the
encoded
book,
Church's
team
applied
limited-cycle
Polymerase
Chain
Reaction
(PCR)
amplification
to
the
oligos
in
order
to
produce
a
pool
of
DNA
replications
large
enough
to
draw
from.
Developed
by
Kary
Mullis
in
the
early
1980s,
PCR
is
a
biochemical
process
whereby
one
or
more
strands
of
DNA
are
amplified
by
several
orders
of
magnitude,
producing
thousands
to
millions
of
copies
of
a
particular
DNA
sequence
(Bartlett,
Stirling,
2003).
The
results
were
then
sequenced,
decoded
and
read
back
on
an
Illumina
HiSeq
DNA
sequencer.
Figure
20.
Illumina HiSeq 2000
real
time
DNA
sequencer
used
to
read
Church
and
team's
encoded
microarray
(Church,
et
al.,
2012).
In
this
application,
the
sequencer
plots
nucleotide
sequences
detected
in
one
'lane'
of
a
sample.
The
sequencer
computer
interprets
and
prints
the
nucleotide
sequence
across
the
top
of
the
plot
(How do we sequence DNA?, 2012).
Figure
21.
Sample
DNA
sequencer
output
of
nucleotide
readings
from
a
single
lane
fragment
(How
do
we
sequence
DNA?,
2012).
Once
the
sequence
readings
were
retrieved
and
decoded,
Church
and
his
team
recovered
all
data
blocks,
with
a
total
of
10
bit
errors
out
of
5.27
million,
which
were
predominantly
located
within
non-information
bearing
areas
of
the
sequences
(Church,
Kosuri,
2012).
The
bit
density
of
Church's
effort
approaches
5.5
petabits,
or
5.5
million
billion
bits,
per
cubic
millimeter.
This
level
of
data
concentration
within
the
given
physical
footprint
effectively
eliminates
near-term
concerns
as
to
the
amount
of
data
and
information
that
can
be
coded
into
a
DNA
construct.
Figure
22.
Example
of
a
UPC
barcode.
These
barcodes
can
then
be
quickly
processed
from
thousands
of
specimens
and
unambiguously
analyzed.
In
relatively
prolific
use
since
2000,
DNA
barcoding
is
able
to
produce
a
level
of
granularity
such
that,
what
was
once
thought
to
be
one
species
of
butterfly
is
really
ten
species
of
butterflies,
demonstrating
the
level
of
granularity
possible
(Hollingsworth,
et
al.,
2009).
Figure
23.
Example
DNA
barcoding
of
animals
and
insect
species
and
the
subtle
variations
between
seemingly
identical
species
(Luoma,
2008).
The
International
Barcode
of
Life
(iBOL)
organizes
collaborators
from
more
than
150
countries
to
participate
in
a
variety
of
campaigns
to
census
diversity
among
plant
and
animal
groups.
The
10-year
Census
of
Marine
Life,
completed
in
2010,
provided
the
first
comprehensive
list
of
more
than
190,000
marine
species
and
identified
6,000
potentially
new
species
(Hollingsworth,
et
al.,
2009).
DNA
barcoding
relies
on
short,
highly
variable
regions
of
the
mitochondrial
and
chloroplast
genomes;
the
complete
sets
of
genetic
material
present in
living
cells.
With
thousands
of
copies
per
cell,
mitochondrial
and
chloroplast
sequences
are
readily
amplified
by
polymerase
chain
reaction
(PCR),
even
from
very
small
or
degraded
specimens
(Hollingsworth,
et
al.,
2009).
A
sample
of
tissue
is
collected,
preserving
the
specimen
whenever
possible
and
noting
its
geographical
location
and
local
environment.
A
small
leaf
disc,
a
whole
insect,
or
a
sample
of
muscle
are
suitable
sources.
DNA
is
extracted
from
the
tissue
sample,
and
the
barcode
portion
of
the
DNA
is
amplified
by
PCR.
The
sequencing
results
are
then
used
to
search
a
DNA
database
(Hollingsworth,
et
al.,
2009).
Figure
24.
Graphic
schema
of
a
seven-node
Hamiltonian
Path
Problem
showing
fourteen
possible
routes,
with
the
red line
representing
the
only
correct
path
(Chen,
Wood,
2000).
Whereas
modern
day,
electrically
supplied
computers
produce
electrical
impulses
to
represent
bits
of
information,
the
DNA
computer
examines
the
patterns
of
combination
or
growth
of
nucleotide
molecules
or
strings
(probes).
DNA
computers
use
the
bases
A,
C,
G
and
T
as
their
memory
units,
along
with
recombinant
DNA
techniques
to
carry
out
the
fundamental
operations.
A
program
on
a
DNA
computer
is
executed
as
a
series
of
biochemical
operations
which
have
the
effect
of
synthesizing,
extracting,
modifying
and
cloning
the
DNA
strands
in
question
(Boneh,
et
al.,
1995).
Breaking
DES
Using
a
Molecular
Computer
By
observing
and
building
on
Adleman's
Hamiltonian
Path
work,
these
DNA
computing
principles
were
applied
to
the
task
of
breaking
the
Data
Encryption
Standard
(DES)
on
a
theoretical
basis
using
a
molecular
(DNA-based)
computer
as
early
as
1995.
Based
on
the
availability
of
a
plaintext-ciphertext
pair,
DES
was
predicted
to
be
recoverable
within
four
months
from
experiment
beginning
to
end.
Additionally,
the
parallel
processing
characteristics
of
the
DNA
computer
model
made
it
likely
that
even
with
only
a
pool
of
plaintext
candidates
to
choose
from
(with
no
1:1
match),
it
would
still
be
possible
to
recover
the
DES
key
in
question
in
the
same
period
of
time
(Boneh,
et
al.,
1995).
Figure
25.
DES
circuit
schema
(Boneh,
et
al.,
1995).
DNA-BASED
CRYPTOGRAPHY
Through
the
use
of
DNA
computing,
the
DES
cryptographic
protocol
had
been
shown
to
be
capable
of
being
broken
(Boneh,
et
al.,
1995).
As more modern cryptography algorithms are broken, new directions are being sought to provide the needed protection for sensitive data.
Whereas
DNA-based
computing
has
been
used
to
solve
traditional
mathematical
problems,
DNA-based
cryptography
addresses
the
issues
of
using
a
biological
system
as
a
practical
support
for
any
given
cryptographic
system.
Developing
DNA
computing
into
a
viable
cryptography
and
steganography
platform
offers
the
potential
for
a
new
generation
of
powerful, or even unbreakable, algorithms
(Sabry,
et
al.,
2010).
As
of
this
writing,
research
in
DNA
cryptography
remains
represented
more
by
theory
than
practical
application,
constrained
by
high
tech
lab
requirements
and
labor-intensive
procedures.
These
factors
currently
prevent
DNA-based
computing
from
entering
the
mainstream
of
the
current
information
security
environment,
but
are
dissipating
as
limiting
factors
(Sabry,
et
al.,
2010).
DNA
Encoding
DNA-encoded
digital
information
can
be
copied
such
that
the
possibility
of
successful
theft
of
intellectual
property
stored
in
this
manner
is
relatively
low
(Shimanovsky,
et
al.,
2003).
In
its
most
simplistic
form,
adding
or
hiding
digital
data
in
a
DNA
sequence
requires
only
a
flat
encoding
of
2
bits
per
nucleotide.
Nucleotide Base    Base-2 Numbering
Adenine (A)        00
Thymine (T)        01
Cytosine (C)       10
Guanine (G)        11
Using
this
reference
system,
binary
segments
can
be
added
to
DNA
for
purposes
of
hiding
data,
annotating
a
given
DNA
sequence,
watermarking,
and
so
on.
The
application
of
data
compression,
data
encryption
and
checksum
verification
on
such
data
would
work
in
the
same
fundamental
manner
in
which
compression,
encryption
and
verification
work
in
an
electronically-based
digital
data
storage
and
conveyance
system
(Shimanovsky,
et
al.,
2003).
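The flat 2-bits-per-nucleotide encoding can be sketched in a few lines of Python. The mapping follows the reference table above (A = 00, T = 01, C = 10, G = 11); the function names and the choice to work on whole bytes are illustrative assumptions, not part of the cited work.

```python
# Mapping taken from the nucleotide/base-2 table in the text.
BITS_TO_BASE = {"00": "A", "01": "T", "10": "C", "11": "G"}
BASE_TO_BITS = {base: bits for bits, base in BITS_TO_BASE.items()}


def bytes_to_dna(data: bytes) -> str:
    """Encode each byte as four nucleotides (2 bits per nucleotide)."""
    bits = "".join(format(byte, "08b") for byte in data)
    return "".join(BITS_TO_BASE[bits[i:i + 2]] for i in range(0, len(bits), 2))


def dna_to_bytes(sequence: str) -> bytes:
    """Decode a nucleotide string produced by bytes_to_dna."""
    if len(sequence) % 4 != 0:
        raise ValueError("sequence length must be a multiple of 4")
    bits = "".join(BASE_TO_BITS[base] for base in sequence)
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))


if __name__ == "__main__":
    encoded = bytes_to_dna(b"A")      # 'A' is 01000001 in binary
    print(encoded)                    # TAAT
    print(dna_to_bytes(encoded))      # b'A'
```

Because the result is an ordinary bit stream in a different alphabet, compression, encryption or a checksum can be applied to the bytes before encoding, as the paragraph above notes.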
Figure
26.
Function
schema
of
steganographic
algorithms
(Heider,
Barnekow,
2007).
The
first
published
research
on
hiding
an
artificial
message
in
living,
in
situ
DNA
was
authored
by
Taylor
et
al.
in
1999
(Jiao,
Goutte,
2009).
This
research
involved
the
storing
of
a
message
in
a
sample
of
human
DNA.
The
same
researchers
subsequently
published
materials
in
2001
describing
the
potential
for
long-term
storage
of
artificial
information
in
DNA
using
codon
encoding
to
represent
the
Western
alphabet
(Shimanovsky,
et
al.,
2003).
Research
was
published
in
2009
identifying
living,
in
situ
DNA
as
a
viable
medium
for
long-term
and
ultra
compact
information
storage
and
a
steganographic
message
medium
for
hidden
messages.
In
this
model,
message-encoded
artificial
DNA
is
added
to
the
genome
of
one
or
more
living
organisms,
such
as
common
bacteria.
This
approach
yields
a
medium
for
the
storage
and
conveyance
of
very
high
densities
of
information.
These
information
sets
can
take
the
form
of
digital
watermarks,
secure
public
keys
for
decrypting
hidden
information
in
steganocryptography,
and
so
forth
(Shimanovsky,
et
al.,
2003).
Along
with
the
camouflaging
effect
afforded
by
the
cover
medium,
information
encoding
can
be
used
to
secure
the
message
in
a
further
effort
to
conceal
its
contents
from
unauthorized
access
(watermarking).
Encoding
and
decoding
of
that
information
are
generally
based
on
Kerckhoffs's Principle,
which
states
that
the
security
of
a
cryptosystem
should
depend
solely
on
the
secrecy
of
the
respective
key
and
the
respective
private
randomizer
(Cayre,
Bas,
2008).
Figure
27.
Schema
of
a
secret-key
stegosystem
with
passive
adversary
showing
embedded
text
E,
covertext
C,
stegotext
S,
Alice's
private
random
source
R,
and
secret
key
K
shared
by
Alice
and
Bob,
with
Alice
sending
either
covertext
C
or
stegotext
S
(Cachin,
1998).
The
mechanisms
behind
DNA-based
steganographic
watermarking
are
(a) using traditional encryption key-based techniques on the target information and (b)
concealing
and
encrypting
the
target
information
in
large
numbers
of
irrelevant
DNA
sequence
chains.
This
approach
makes
it
extraordinarily
difficult
to
ascertain
the
correct
beginning
and
end
points
of
target
information,
thereby
making
it
difficult
(in
combination
with
encryption)
to
decode
as
a
result.
Only
a
recipient
with
advance
knowledge
of
the
correct
DNA
indexing
and
decryption
information
would
be
able
to
reliably
find
the
correct
DNA
fragment,
thereby
decoding
the
message
(Cui,
2009).
Figure
28.
A
basic
steganographic
implementation
in
DNA.
The
message
synthesizing
process
is
shown
(a).
The
encoding
rule
is
shown
(b).
The
PCR
result
is
shown
(c).
The
DNA-based
ciphertext
and
corresponding
plaintext
is
shown
(d)
(Cachin,
1998).
Figure
29.
Computational
schema
of
one
YAEADNA
encryption
round
(Amin,
et
al.,
2006).
Inputs Pseudo-Code
The inputs for this model are:
A (plaintext character)
RND (G)
Algorithm Pseudo-Code
1. In a plain text file, each character is sequentially replaced by its corresponding ASCII code:
A ← ASCII [A]
3. Starting from a random location in a binary file contained within a single-strand DNA sequence, a search is performed for a quadruple DNA sequence representing a plain text character. This sequence has the same nucleotide sequence as the ASCII code of the respective plain-text character.
4. A sequential search is performed starting from a random location X:
X ← RND [G]
5. If the desired pattern is found, its location is then written to a pointer locations (PTR) output file:
6. Repeat the procedure for all other characters, beginning with Step 2
Output Pseudo-Code
The output for this model is one or more pointer coordinates of found quadruple DNA nucleotide sequences representing the binary octets in question (Amin, et al., 2006).
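The pointer-based procedure in the pseudo-code above can be approximated with the following Python sketch. It is a simplified reading, not the authors' implementation: the bit-to-nucleotide mapping (00 = A, 01 = C, 10 = G, 11 = T), the wrap-around search and all function names are assumptions chosen for illustration.

```python
import random

# Assumed 2-bit mapping; the cited work may use a different assignment.
BITS_TO_BASE = {"00": "A", "01": "C", "10": "G", "11": "T"}
BASE_TO_BITS = {base: bits for bits, base in BITS_TO_BASE.items()}


def char_to_quad(ch: str) -> str:
    """Turn one 8-bit character into its four-nucleotide (quadruple) form."""
    if ord(ch) > 255:
        raise ValueError("only 8-bit characters are supported")
    bits = format(ord(ch), "08b")
    return "".join(BITS_TO_BASE[bits[i:i + 2]] for i in range(0, 8, 2))


def encrypt(plaintext: str, genome: str, rng=None) -> list:
    """Replace each character by the location of a matching quadruple."""
    rng = rng or random.Random()
    pointers = []
    for ch in plaintext:
        quad = char_to_quad(ch)
        start = rng.randrange(len(genome))     # X <- RND[G]
        pos = genome.find(quad, start)         # sequential search from X
        if pos == -1:
            pos = genome.find(quad)            # assumption: wrap to the start
        if pos == -1:
            raise ValueError(f"quadruple {quad} not present in the sequence")
        pointers.append(pos)
    return pointers


def decrypt(pointers: list, genome: str) -> str:
    """Read the quadruple at each pointer and map it back to a character."""
    chars = []
    for pos in pointers:
        bits = "".join(BASE_TO_BITS[base] for base in genome[pos:pos + 4])
        chars.append(chr(int(bits, 2)))
    return "".join(chars)
```

In this reading the DNA sequence acts as the shared secret and the ciphertext is only a list of positions, so the same character will usually map to different pointers each time it occurs.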
23618
25392
27895
34295
34984
35244
39016
40134
41598
43075
43545
46915
47593
47631
50643
50722
51632
60091
61618
62262
62918
91953
92378
118728
130831
150386
150665
[-]
[,]
[s]
[2]
[p]
[9]
[>]
[C]
[L]
[5]
[M]
[!]
[U]
[']
[E]
[/]
[.]
[B]
[G]
[K]
[:]
[S]
[I]
[%]
[4]
[)]
[Q]
Frequency in DNA
Character
Frequency in DNA
164485
164592
166484
167323
171488
171570
185988
187115
190084
194813
198309
199178
202537
208559
210222
210841
216379
216498
217607
217891
223088
223518
224830
225581
225977
229001
229883
[P]
[7]
[N]
[;]
[?]
[3]
[_]
[8]
[]]
[y]
[O]
[x]
[D]
[t]
[u]
[W]
[(]
[z]
[R]
[T]
[0]
[@]
[w]
["]
[^]
[J]
[]
[H]
231686
247484
249825
253181
253181
259677
261141
264718
265877
267487
274382
277367
281713
282656
287690
291964
306117
313004
313137
313743
313988
332166
336562
338352
347070
349870
391919
391919
Table 9. Plain text character frequency in DNA strand (Amin, et al., 2006).
The
full
text
of
the
novel,
Uncle
Tom's
Cabin
was
used
as
the
basis
of
the
encryption
effort.
The
table
above
illustrates
the
82
characters
occurring
in
the
novel,
along
with
their
frequency
of
occurrence
(Amin,
et
al.,
2006).
A
correlation
analysis
was
performed
between
randomly
selected
locations
for
all
1,015,120
characters
of
the
plaintext
in
an
effort
to
determine
indirect
relationships
(Amin,
et
al.,
2006).
Figure
30.
Pearson
correlation
analysis
between
ciphertext
and
corresponding
plaintext
(Amin,
et
al.,
2006).
The
results
of
the
Pearson
correlation
analysis
indicate
a
majority
of
the
locations
in
the
plaintext
lack
any
significant
relationship
to
their
respective
locations
in
the
related
DNA
sequence,
with
71
characters
having
a
correlation
coefficient
between
-0.1
and
0.1,
and
the
other
11
characters
having
a
factor
between
-0.12
and
0.48
(Amin,
et
al.,
2006).
The
encryption
process
was
subsequently
tested
on
images
in
order
to
exemplify
the
randomness
of
the
encrypted
octet
locations
within
the
respective
DNA
sequence
(Amin,
et
al.,
2006).
Figure
31.
Test
image
results
(not
to
scale)
before
and
after
processing
[from
left
to
right]
of
the
underlying
data
to
show
randomness
of
octet
distribution
within
a
given
DNA
sequence
(Amin,
et
al.,
2006).
The
350X258
pixel
image
above
and
to
the
left
was
used
as
the
test
subject.
The
test
image
data
was
then
translated
into
a
single
DNA
strand.
The
locations
of
the
DNA
octets
were
cataloged
and
inserted
into
an
88,410,189-nucleotide-long sequence.
The
locations
data
were
then
retrieved
and
reshaped
into
a
350X258
pixel
matrix
and
their
values
rescaled
to
represent
a
full
range
color
map.
The
result
of
this
latter
effort
is
seen
in
the
image
above
and
to
the
right
(Amin,
et
al.,
2006).
Bit 1    Bit 2    DNA
0        0        A
0        1        C
1        0        G
1        1        T
In
addition
to
the
DNA
bases,
their
twenty
respective
amino
acid
elements
were
employed.
Three
additional
codons
were
used
to
represent
the
stopping
point
for
the
coding
region
(Sabry,
et
al.,
2006).
Amino acid          Nucleotide codons
Alanine/A           GCU, GCC, GCA, GCG
Arginine/R          CGU, CGC, CGA, CGG, AGA, AGG
Asparagine/N        AAU, AAC
Aspartic acid/D     GAU, GAC
Cysteine/C          UGU, UGC
Glutamine/Q         CAA, CAG
Glutamic acid/E     GAA, GAG
Glycine/G           GGU, GGC, GGA, GGG
Histidine/H         CAU, CAC
Isoleucine/I        AUU, AUC, AUA
START               AUG
Leucine/L           UUA, UUG, CUU, CUC, CUA, CUG
Lysine/K            AAA, AAG
Methionine/M        AUG
Phenylalanine/F     UUU, UUC
Proline/P           CCU, CCC, CCA, CCG
Serine/S            UCU, UCC, UCA, UCG, AGU, AGC
Threonine/T         ACU, ACC, ACA, ACG
Tryptophan/W        UGG
Tyrosine/Y          UAU, UAC
Valine/V            GUU, GUC, GUA, GUG
STOP                UAA, UGA, UAG
Table 11. Amino acids and their corresponding 64 codons (Sabry, et al., 2006).
Because
this
model
starts
with
the
binary
form
of
the
plaintext,
any
number
or
special
characters
may
be
represented,
in
contrast
to
the
original
Playfair
Cipher,
which
allowed
for
capital
letters
of
the
alphabet
only
(Sabry,
et
al.,
2006).
Figure
32.
Flowchart
of
DNA-based
Playfair
algorithm
(Sabry,
et
al.,
2010).
In
the
table
below,
there
are
only
20
amino
acids
along
with
one
START
and
one
STOP,
whereas
25
letters
are
needed
to
construct
the
Playfair
matrix
(I
and
J
are
assigned
to
one
cell).
As
a
result,
the
letters
B,
O,
U,
X,
Z
will
share
certain
codons.
Since
the
START
and
amino
acid
Methionine/M
have
the
same
codon
(AUG),
the
Methionine/M
amino
acid
will
not
be
used.
Counting
the
number
of
codons
of
each
character,
we
will
find
the
number of
codons
that
can
be
used
interchangeably
to
represent
a
given
character
varies
between
1
and
4
codons.
This
variable
(between
1
and
4)
represents
the
'ambiguity'
of
the
character,
or
[AMBIG]
variable;
a
significant
modification
to
the
traditional
Playfair
Cipher
(Sabry,
et
al.,
2006).
Table
12.
Distribution
of
the
alphabet
with
corresponding,
interchangeable
codons
(Sabry,
et
al.,
2010).
This
completes
the
assignment
of
codons
required
to
service
the
characters
in
the
English
alphabet,
so
an
amino
acid-based
plaintext
message
can
be
processed
using
the
traditional
Playfair
Cipher
process
using
the
secret
key.
The
output
from
the
Playfair
Cipher
being
applied
to
the
amino
acid
table
represents
the
ciphertext
in
the
table
below
(Sabry,
et
al.,
2006).
Table
13.
New
distribution
of
the
alphabet
with
corresponding
codons,
after
application
of
the
Playfair
Cipher
(Sabry,
et
al.,
2010).
The idea that one character can have more than one DNA representation itself adds to the confusion property, which enhances the algorithm's strength
(Sabry,
et
al.,
2006).
Figure
33.
Sample
encryption
processing
(Sabry,
et
al.,
2010).
As
with
the
traditional
Playfair
Cipher,
the
decryption
process
is
the
inverse
of
the
encryption
process.
There is, however, the additional complication of the [AMBIG] variable in this DNA-based version which, left untended,
could
make
it
impossible
to
choose
which
codon
to
use
in
relationship
to
its
related
amino
acid
character.
The
problem
of
codon-amino
acid
mapping
is
addressed
here
by
adding
two
additional
identification
bits
for
each
amino
acid
character
indicating
the
correct
codon
to
choose
in
the
decryption
effort.
These
2
bits
can
be
converted
to
DNA
form
using
the
variables
in
Table
10.
Since
the
[AMBIG]
factor
can
range
from
1
to
4,
the
identification
bits
must
be
able
to
represent
a
number
range
from
0
to
3
(Sabry,
et
al.,
2006).
Figure
34.
Sample
decryption
processing
(Sabry,
et
al.,
2010).
Inputs Pseudo-Code
Algorithm Pseudo-Code
1. Prepare the secret key: Remove the spaces from [P] in order to avoid an attacker's trace to a character which is repeated many times within the message
Processing Pseudo-Code
1. Binary form [BP] = BINARY [P] (Replace each character by its 8-bit binary representation)
2. DNA form [DP] = DNA [BP] (Replace each 2 bits by their DNA character)
3. Amino acids form [AP] = AMINO [DP] (Replace each 3 DNA characters by their amino acid character, keeping track of the ambiguity of each amino acid [AMBIG])
4. Construct the Playfair 5X5 matrix and add [K] row by row, then add the rest of the alphabet characters
5. Amino acid form of cipher text [AC] = PLAYFAIR [AP]
6. DNA form of cipher text [DC] = DNA [AC]
Output Pseudo-Code
Add [DC] and [AMBIG] together in the suitable form to give the final cipher text [C]
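The matrix-construction step of the pseudo-code can be illustrated with a short Python sketch of the classic letter-based Playfair cipher. It shows only how a key fills the 5X5 matrix row by row and how a single letter pair is substituted; the amino acid conversion and the [AMBIG] bookkeeping of the DNA-based variant are not modelled, and the function names are illustrative assumptions.

```python
import string


def build_playfair_matrix(key: str) -> list:
    """Fill a 5x5 matrix with the key letters, then the rest of the alphabet.

    I and J share one cell, as in the traditional cipher.
    """
    cells = []
    for ch in key.upper() + string.ascii_uppercase:
        if ch not in string.ascii_uppercase:
            continue                      # drop spaces and other characters
        ch = "I" if ch == "J" else ch
        if ch not in cells:
            cells.append(ch)
    return [cells[row * 5:(row + 1) * 5] for row in range(5)]


def encrypt_digraph(matrix: list, a: str, b: str) -> str:
    """Apply the three Playfair rules (row, column, rectangle) to one pair."""
    if a == b:
        raise ValueError("repeated letters must be separated before encryption")
    where = {ch: (r, c) for r, row in enumerate(matrix) for c, ch in enumerate(row)}
    (ra, ca), (rb, cb) = where[a], where[b]
    if ra == rb:                          # same row: take the letter to the right
        return matrix[ra][(ca + 1) % 5] + matrix[rb][(cb + 1) % 5]
    if ca == cb:                          # same column: take the letter below
        return matrix[(ra + 1) % 5][ca] + matrix[(rb + 1) % 5][cb]
    return matrix[ra][cb] + matrix[rb][ca]  # rectangle: swap the columns


if __name__ == "__main__":
    for row in build_playfair_matrix("CHARLES DICKENS"):
        print(" ".join(row))
```

With a key such as CHARLES DICKENS, the key contributes its 11 distinct letters and the remaining 14 cells are filled alphabetically.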
Experiment
The
first
150
kilobytes
(KB)
of
paragraph
content
of
the
book,
A
Tale
of
Two
Cities
was
selected.
The
secret
key
selected
was
CHARLES
DICKENS,
resulting
in
an
11
byte
(11B)
key
(Sabry,
et
al.,
2010).
The
amino
acid
table
was
loaded
(see
Table
12),
and
the
secret
key
formatted,
removing
spaces
and
non-English
characters,
if
present.
The
plaintext
was
formatted,
removing
spaces
and
separating
any
repeated,
adjacent
characters
through
the
addition
of
a
tilde
(~)
between
pairs
(Sabry,
et
al.,
2010).
The
characters
were
converted
to
binary
form,
which
form
was
then
converted
to
DNA,
and
finally
to
an
amino
acid
sequence,
recording
data
for
the
[AMBIG]
variables.
The
Playfair
Cipher
was
then
applied,
converting
the
amino
acid
sequence
back
into
DNA
form,
with
the
[AMBIG]
bit
embedded
therein
(Sabry,
et
al.,
2010).
Table
14.
Performance
results
of
experiment
processing
stages
(Sabry,
et
al.,
2010).
The
time
required
to
load
the
amino
acids
table
and
prepare
the
secret
key
is
omitted
due
to
the
short
duration
of
the
events
(Sabry,
et
al.,
2010).
Secret Key: the more random and lengthy the key, the more difficult the cipher will be to break.
Replacing the English alphabet with an amino acid sequence: an amino acid sequence may be used to finish populating the Playfair matrix after secret key application.
Insert the ciphertext into a host DNA strand for insertion into a microdot: once in DNA form, the ciphertext can be converted to a steganographic medium.
Use of conventional XOR-ing: an additional key may be defined, which can then be XOR-ed with either the amino acid or DNA versions of the ciphertext, assuring randomness.
Figure
35.
Structure
of
the
encoder
and
decoder
for
a
Hamming
code.
(Bandakkanavar, 2010).
While
mutations
may
occur
infrequently,
destroying
encrypted
information,
an
integrated
fuzzy
controller
decides
on
a
set
of
heuristics
based
on
three
input
dimensions
and
recommends
whether
or
not
to
use
a
correction
code
(Heider,
Barnekow,
2007).
One-Time
Pads
The
concept
of
the
one-time
pad
(OTP)
was
first
introduced
by
U.S.
Army
Signal
Corps officer
Joseph
Mauborgne
as
an
improvement
to
the
Vernam
cipher.
The
scheme
uses
a
random
key
that
is
as
long
as
the
message
itself,
eliminating
the
need
to
repeat
the
key
(Stallings,
2010).
Additionally,
the
key
is
used
to
encrypt
or
decrypt
only
one
message,
after
which
it
is
discarded.
The
result
is
an
encrypted
output
that
bears no statistical relationship whatsoever to the plaintext.
Without
a
statistical
thread
to
pull,
breaking
an
OTP-encrypted
message
is
considered
impossible,
a
concept
referred
to
in
cryptography
as
perfect
secrecy
(Stallings,
2010).
OTPs
are
largely
impractical
using
modern
hardware
and
software
due
to
required
key
lengths
and
distribution
requirements.
They
have
however
become
far
more
relevant
with
the
advent
of
theoretical
quantum
computing.
In
the
quantum-computing
model,
the
possibility
[if
not
likelihood]
exists
that
any
traditional
encryption
system
can
be
broken
in
short
order,
with
the
exception
of
those
schemes
based
on
OTP
encryption.
As
a
result,
it
is
the
OTP,
combined
with
DNA's
extraordinary
storage
and
computing
capabilities
that
make
the
combination
an
attractive
encryption
model
(Xiao,
et
al.,
2006).
Pak
Chung
Wong
et
al.
developed
an
OTP
algorithm
to
be
used
in
DNA
to
store
data
in
living
organisms.
The
data
are
translated
into
a
DNA
sequence,
which
is
then
inserted
into
the
nucleotide
sequences
of
living
organisms.
The
insert
sequence
is
flanked
by
two
primer
sequences
that
are
not
pre-existent
in
the
target
genome.
The
sequence
is
introduced
to
a
living
cell
where
it
coexists,
and
by
association
is
replicated
along
with
the
genome's
native
DNA
(Wong,
et
al.,
2003).
Figure
37.
A
DNA-based
one-time
pad
A,
plain
text
B,
cipher
text
and
primer
DNA
polymerase
primer
(black
box),
(Wong,
et
al.,
2003).
Wong
et
al.
employed
a
substitution
cipher
to
encode
a
song
text
into
the
genome
of
a
living
organism,
Deinococcus
radiodurans.
This
organism
was
chosen
specifically
for
its
ability
to
withstand
ionizing
radiation,
to
show
that
the
song
information
could
foreseeably
remain
intact
and
retrievable
for
several
centuries
(Wong,
et
al.,
2003).
Figure
38.
Example
of
DNA
tiles
carrying
data
representing
binary
1
(light
blue),
binary
0
(white),
start
block
or
attachment
point
(dark
blue
fading
to
lower
right)
and
end
marker
or
attachment
point
(dark
blue
fading
to
upper
left).
The
sequences
can
be
ligated
(attached
by
an
end
or
ends)
to
longer
DNA
strands
by
using
the
start
and
stop
points,
or
sticky
ends
(Leier,
et
al.,
2000).
In
this
model,
DNA
strands
were
assembled
by
concatenating
short,
double-stranded
DNA
molecules
representing
0
(0-DNA
bit),
and
1
(1-DNA
bit),
sequence
start
point
and
sequence
end
point.
The
start
and
end
points
are
considered
sticky,
allowing
the
binary
strands
to
be
polymerized
(combined)
through
DNA
annealing
(reformation
of
DNA
strands
from
heat-exposed
DNA
fragments)
and
ligation
(joining)
(Leier,
et
al.,
2000).
Figure
39.
Assembly
of
DNA
binary
strands.
Strands
are
composed
of
shorter,
concatenated
tiles
by
overlapping
sticky
start
(s)
and
end
(e)
terminators
with
arbitrary
quantities
of
DNA
bits
in
between.
Bit
strands
containing
up
to
32
bits
were
yielded
in
this
process
(Leier,
et
al.,
2000).
Using
this
method
the
information
content
can
be
decrypted
and
read
directly
by
PCR
(amplification)
and
subsequent
gel-electrophoresis,
requiring
no
additional
work
such
as
subcloning
or
sequencing
(Leier,
et
al.,
2000).
Figure
40.
Example
of
gel-electrophoresis
(Electrophoresis,
2011).
Miki
Hirabayashi,
et
al.
outlined
a
process
in
2010
whereby
self-assembling
DNA
is
made
to
function
as
a
truly
random
number
generator
for
purposes
of
producing
OTPs
(Hirabayashi,
et
al.,
2010).
Figure
41.
Self
assembly
schema
of
a
random
DNA
tile
assembly
(Hirabayashi,
et
al.,
2010).
In
this
DNA-based
cryptosystem,
each
tile
is
randomly
set
throughout
the
entire
DNA
assembly.
Each
tile
has
both
XOR
calculation
and
random
number
capabilities.
Each
tile
has
a
sticky
end
for
XOR
calculation,
one
sticky
end
for
random
number
generation,
and
two
sticky
ends
for
connection
(Hirabayashi,
et
al.,
2010).
Figure
42.
Scaffold
construction
schema
by
sample
input
message:
M
=
00011011
(Hirabayashi,
et
al.,
2010).
When
the
tile
assembly
begins,
the
value
of
the
random
operation
tile
is
determined
by
a
probability
factor
of
0.5
at
each
slot,
of
being
either
a
1
or
a
0.
In
other
words,
the
probability
of
being
one
or
the
other
is
exactly
50/50.
This
probability
factor
is
achieved
by
adding
the
same
concentration
of
random
operation
tiles
(Hirabayashi,
et
al.,
2010).
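The logic that the tiles carry out chemically, an XOR of each message bit with a fair random bit, can be mimicked in software with the short Python sketch below. It models only the arithmetic of the one-time pad, not the tile chemistry; the sample message is the one shown in Figure 42, and the use of Python's secrets module as the random source is an assumption made for illustration.

```python
import secrets


def random_pad(length: int) -> list:
    """Draw one pad bit per slot, each 0 or 1 with probability 0.5."""
    return [secrets.randbelow(2) for _ in range(length)]


def xor_bits(left: list, right: list) -> list:
    """Bitwise XOR of two equally long bit lists."""
    if len(left) != len(right):
        raise ValueError("the pad must be exactly as long as the message")
    return [a ^ b for a, b in zip(left, right)]


if __name__ == "__main__":
    message = [int(bit) for bit in "00011011"]   # sample message M from Figure 42
    pad = random_pad(len(message))               # used once, then discarded
    cipher = xor_bits(message, pad)
    print("cipher:   ", "".join(map(str, cipher)))
    print("recovered:", "".join(map(str, xor_bits(cipher, pad))))
```

Applying the same pad a second time restores the message, which is why encryption and decryption are the same operation in a one-time pad.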
Figure
43.
Encryption
scheme
flowchart
(Cui,
2008).
It
is
difficult
to
obtain
M
from
C
unless
one
has
KB.
It
is
important
to
note
here
that
KA,
KB
and
C
are
not
limited
to
digital
data,
but
can
be
any
method,
material,
data,
etc.
such
as
a
DNA
sequence.
E
and
D
are
also
not
limited
to
mathematical
calculations,
but
can
be
any
physical
or
chemical
or
biological
or
mathematical
process
such
as
RSA
cryptography.
The
intended
receiver
Bob
has
a
pair
of
keys
(e,
d)
(Cui,
2008).
Figure
44.
Data
pre/post
treatment
schema
(Cui,
2008).
The
set
of
two
primer
pairs
identifying
the
critical
start
and
stop
points
in
the
overall
message
payload
are
cooperatively
generated
by
both
Alice
and
Bob.
Because
of
this,
should
an
attacker
acquire
one
primer
pair,
the
amplification
process
required
to
bring
the
message
payload
to
a
stage
at
which
it
can
be
extracted
will
not
work
correctly,
because
of
the
omission
of
the
correct,
corresponding
primer
pair
(Cui,
2008).
Nucleotide Base    Base-2 Numbering
Adenine (A)        00
Thymine (T)        01
Cytosine (C)       10
Guanine (G)        11
Encryption
Detail
Coding
the
binary
ciphertext
C
into
a
binary
DNA
format
composed
of
64
nucleotides
flanked
by
forward
and
reverse
PCR
primer
pairs
constitutes
the
construction
of
the
secret-message
DNA
sequence.
The
primer
pairs
are
required
in
order
to
enable
the
insertion
and
acceptance
of
the
secret-message
sequence
into
a
larger
DNA
sequence,
and
to
delineate
the
encrypted
messages
beginning
and
end
points
from
within
that
larger
environment
(Cui,
2008).
Alice
then
generates
an
overall
message
payload
in
the
form
of
a
number
of
dummy
DNA
sequences
that
have
the
same
overarching
structure
as
the
secret-message
DNA
sequence.
The
secret-message
DNA
sequence
is
then
placed
in
among
the
dummy
DNA
sequences
for
camouflage.
The
message
payload
is
then
sent
to
Bob
using
an
open
communications
channel
(Cui,
2008).
Decryption
Detail
After
the
intended
receiver
(Bob)
gets
the
message
payload,
he
has
the
means
for
locating
the
starting
and
ending
points
(associated
primer
pairs)
for
the
secret-message
DNA
sequence,
within
the
overall
message
payload.
Bob
then
translates
the
secret-message
DNA
sequence
into
the
binary
ciphertext
C,
using
his
primer
pair
information
to
tell
him
where
the
correct
start
and
stop
points
of
the
secret-message
DNA
sequence
are
within
the
overall
message
payload.
From
there,
Bob
can
decrypt
the
binary
ciphertext
C
into
binary
plaintext
M,
using
his
secret
key
d
(Cui,
2008).
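The hiding and retrieval steps described above can be sketched in Python as follows. This is a software analogy only: the primer sequences are hypothetical placeholders, primer matching is reduced to a prefix and suffix test rather than real PCR (which would involve reverse complements), and the encryption of M into C is assumed to have happened already. The nucleotide mapping follows the table in the text (A = 00, T = 01, C = 10, G = 11).

```python
import random

BITS_TO_BASE = {"00": "A", "01": "T", "10": "C", "11": "G"}
BASE_TO_BITS = {base: bits for bits, base in BITS_TO_BASE.items()}

# Hypothetical primers standing in for the secret primer information.
FORWARD_PRIMER = "ACGTTGCAAGCTTGCATGCC"
REVERSE_PRIMER = "GGATCCTCTAGAGTCGACCT"


def bits_to_dna(bits: str) -> str:
    if len(bits) % 2 != 0:
        raise ValueError("bit string length must be even")
    return "".join(BITS_TO_BASE[bits[i:i + 2]] for i in range(0, len(bits), 2))


def dna_to_bits(sequence: str) -> str:
    return "".join(BASE_TO_BITS[base] for base in sequence)


def random_dna(length: int, rng: random.Random) -> str:
    return "".join(rng.choice("ACGT") for _ in range(length))


def build_payload(cipher_bits: str, fwd: str, rev: str, dummies: int, rng: random.Random) -> list:
    """Hide the primer-flanked secret strand among same-length dummy strands."""
    secret = fwd + bits_to_dna(cipher_bits) + rev
    strands = [secret]
    while len(strands) < dummies + 1:
        dummy = random_dna(len(secret), rng)
        if not (dummy.startswith(fwd) and dummy.endswith(rev)):
            strands.append(dummy)
    rng.shuffle(strands)
    return strands


def extract_cipher_bits(strands: list, fwd: str, rev: str):
    """Find the strand flanked by the known primers and read its payload."""
    for strand in strands:
        if strand.startswith(fwd) and strand.endswith(rev):
            return dna_to_bits(strand[len(fwd):len(strand) - len(rev)])
    return None
```

A 128-bit ciphertext yields the 64-nucleotide message region described in the text; without both primers, every strand in the payload looks equally plausible.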
Security
Implications
As
with
traditional
encryption
mediums,
it
must
be
assumed
that
an
attacker
has
knowledge
of
the
biological
base
on
which
DNA-based
data
encryption
is
premised,
and
has
access
to
the
necessary
tooling
required
to
manipulate
the
encoded
medium,
and
the
data
it
contains.
What
is
not
known
by
the
attacker
is
the
encryption
key
KA
(one
half
of
the
associated
primer
pairs),
Bob's
public
key
e,
the
decryption
key
KB
(the
other
half
of
the
associated
primer
pairs)
and
Bob's
secure
key
d
(Cui,
2008).
So
in
this
encryption
model,
security
is
derived
from
two
areas:
Figure
45.
Genalysis
proprietary
microchip
for
real-time
DNA
analysis
(Toumazou,
et
al.,
2012).
By
performing
simultaneous
amplification
and
detection
of
nucleic
acids,
the
Genalysis
CMOS
microchip
reduces
the
time
to
product
significantly
(Toumazou,
et
al.,
2012).
Figure
46.
Excitation
schema
of
Genalysis
proprietary
ISFET
mechanism
(Toumazou,
et
al.,
2012).
DNAE
has
developed
a
variation
on
the
Field
Effect
Transistor
(FET)
called
an
Ion-sensitive
Field
Effect
Transistor
(ISFET),
in
which
real-time
sensitivity
to
released
hydrogen
ions
has
been
cultivated.
In
the
Genalysis
system
scenario,
the
presence
of
hydrogen
ions
drives
electrical
signal
generation
from
the
ISFET.
The
greater
the
number
of
hydrogen
ions
detected
by
the
ISFET,
the
greater
the
electrical
signal
the
ISFET
produces.
The
ability
to
produce
detective
ISFET
sensor
arrays
in
densities
measured
in
the
tens
of
millions
is
the
primary
factor
in
the
ability
to
detect
and
analyze
target
DNA
sequences
in
a
relatively
short
period
of
time
(Toumazou,
et
al.,
2012).
Figure
47.
Schema
of
hydrogen
ion
release
upon
extending
an
existing
DNA
strand
with
one
or
more
additional
nucleotides
(Toumazou,
et
al.,
2012).
The
increase
in
hydrogen
ions
is
achieved
through
DNA
amplification
as
a
result
of
hydrogen
ions
being
released
whenever
a
strand
of
DNA
is
extended
by
the
addition
of
a
nucleotide
(Toumazou,
et
al.,
2012).
A
sample
preparation
kit
is
first
opened,
then
used
to
acquire
a
sample
of
the
target
DNA
in
the
form
of
a
swab
or
saliva
sample.
The
balance
of
the
preparation
kit
is
used
to
prepare
the
sample
for
deposition
of
a
purified
target
DNA
sample
onto
an
electronic
substrate
(CMOS
microchip).
Figure
48.
Genalysis
external
testing
process
overview
(Toumazou,
et
al.,
2012).
That
substrate
is
then
interfaced
to
an
external,
available
Universal
Serial
Bus
(USB)
port
present
on
a
computing
device
running
the
Genalysis
analysis
software. Test
results
are
typically
available
within
~30
minutes
from
the
time
analysis
begins
(Toumazou,
et
al.,
2012).
Figure
49.
Schematic
representation
of
single
molecule
experiment
to
detect
malachite
green
adsorbed
on
a
planar
metal
surface
(Neacsu,
et
al.,
2006).
In
the
example
shown
above,
a
helium-neon
(HeNe)
laser
is
focused
on
a
gold
(Au)
probe.
The
reflected
(scattered)
light
is
then
detected
using
a
charge-coupled
device
(CCD)
sensor.
Figure
50.
Scanning
electron
microscopy
image
of
a
wrinkled
Raman-active
gold
surface
(Zhang,
et
al.,
2011).
The
development
of
wrinkly
film
nanoporous
gold
surfaces
from
which
to
perform
Raman
spectroscopy
analysis
has
largely
addressed
low
stability
and
poor
reproducibility
in
single
molecule
identification,
analysis
and
test
results
(Zhang,
et
al.,
2011).
CONCLUSION
Mankind
has
been
dependent
on
DNA
for
its
information
and
related
personnel
security
identification,
verification
and
authentication
efforts
since
civilized
life
began.
In
the
beginning,
only
DNA-based
outcomes
and
resultants
were
available
for
use:
hair
and
skin
color;
eye
color
and
features;
height;
bipedal
gait;
facial
profile
and
features;
geometry
of
the
hand
and
its
features;
vocal
features;
and
so
on.
With
the
passage
of
time
we
are
in
process
of
coming
full
circle,
able
now
to
read
and
write
from
and
to
the
DNA
medium
itself,
in
theory
and
in
practice, bypassing
the
outcomes
and
resultants
of
DNA
in
favor
of
direct
access
to
the
information
it
contains,
and
can
be
made
to
contain.
With
the
advent
of
demonstrably
secure
approaches
to
reading
and
writing
from
and
to
DNA,
the
cycle
is
accelerating.
The
ever-decreasing
cost
of
these
technologies
will
further
accelerate
real-time
use
of
DNA
as
a
secure
identification,
authentication
and
information
storage
medium.
Modern-day
societal
norms
have
associated
with
DNA
an
almost
unique
sense
of
certainty
that
is
not
entirely
possessed
by
other
information
security
identification,
verification
and
authentication
systems
currently
available.
Whereas
these
other
systems
seem
at
least
potentially
vulnerable
to
subversion,
DNA
is
commonly
viewed
as
being
absolutely
elemental
and
as
a
result,
undeniable.
That
this
perception
is
not
entirely
accurate
is
perhaps
of
little
consequence,
since
information
security
is
often
composed
as
much
of
sense
and
circumstance
as
it
is
of
reality.
Ever-increasing
technical
strides,
combined
with
the
ever-spreading
availability
of
the
technology
will
make
DNA
more
pertinent
than
ever
as
a
tool
for
establishing
identity,
authenticity
and
integrity
in
information
security.
Progress
in
many
related
areas
of
research
is
in
fact
proceeding
so
quickly
as
to
make
the
description
of
these
ideas
and
systems
in
the
future
tense
something
of
a
challenge.
The
massive
storage
capacity
and
density
of
DNA,
combined
with
its
parallel
computing
capabilities
and
energy
efficiency
will
make
it
possible
to
apply
traditional,
hybrid-traditional
and
entirely
new
encryption
schemes
to
areas
and
applications
that
were
previously
too
impractical,
unwieldy
or
resource-intensive
to
contemplate:
The
relatively
low
temperature
required
to
manufacture
paper
will
make
DNA-infused
stock
for
use
in
high
security
hardcopy
documents
a
reality.
The
unparalleled
data
diffusion
and
obfuscation
opportunities
in
DNA
will
allow
for
multi-access
level
electronic
documents
in
which
a
single
distribution
produces
different
readable
versions
of
the
document
based
on
recipient
security
clearance.
What
were
once
wholly
impractical
OTP
applications
will
become
relatively
commonplace,
significantly
altering
the
electronic
distribution
mechanics
of
pay
periodicals,
books,
manuscripts,
etc.
DNA
data
tagging
of
animal
and
plant
kingdoms
is
already
underway
in
full.
With
the
advent
and
refinement
of
near-real-time
and
real-time,
close-proximity,
non-contact
methods
of
reading
human
DNA,
reading
Homo
sapiens
barcodes
for
security
purposes
will
become
viable.
The
sheer,
secure
digital
data
storage
capacity
of
DNA
will
contribute
to
the
gradual
obsolescence
of
mechanically-based
storage
mediums
for
archival
storage
of
sensitive
data.
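
The one-time pad (OTP) applications mentioned above can be illustrated with a minimal sketch. It assumes a simple 2-bit-per-nucleotide mapping (A=00, C=01, G=10, T=11) and an in-software XOR; neither is prescribed by this paper, and the function names are hypothetical. It shows only why an OTP needs a pad as long as the message, which is the storage burden that DNA's density is expected to relieve.

```python
import secrets

# Assumed mapping for illustration only: A=00, C=01, G=10, T=11.
BASES = "ACGT"


def bytes_to_dna(data: bytes) -> str:
    """Encode each byte as four nucleotides, most significant bit pair first."""
    return "".join(
        BASES[(byte >> shift) & 0b11] for byte in data for shift in (6, 4, 2, 0)
    )


def dna_to_bytes(strand: str) -> bytes:
    """Decode a nucleotide string produced by bytes_to_dna back into bytes."""
    if len(strand) % 4 != 0:
        raise ValueError("strand length must be a multiple of 4")
    out = bytearray()
    for i in range(0, len(strand), 4):
        byte = 0
        for base in strand[i:i + 4]:
            byte = (byte << 2) | BASES.index(base)
        out.append(byte)
    return bytes(out)


def otp_xor(message: bytes, pad: bytes) -> bytes:
    """XOR a message with a pad of exactly the same length."""
    if len(pad) != len(message):
        raise ValueError("a one-time pad must be exactly as long as the message")
    return bytes(m ^ p for m, p in zip(message, pad))


def encrypt_to_dna(message: bytes, pad: bytes) -> str:
    """Encrypt with the pad, then express the ciphertext as nucleotides."""
    return bytes_to_dna(otp_xor(message, pad))


def decrypt_from_dna(strand: str, pad: bytes) -> bytes:
    """Read the nucleotides back into bytes, then remove the pad."""
    return otp_xor(dna_to_bytes(strand), pad)


if __name__ == "__main__":
    message = b"periodical"
    pad = secrets.token_bytes(len(message))  # fresh random pad, used only once
    strand = encrypt_to_dna(message, pad)
    assert decrypt_from_dna(strand, pad) == message
```

The pad must be truly random, kept secret, and never reused; the sketch draws it from the operating system's random source rather than from any DNA-based process.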
REFERENCES
(2009).
The
emerging
science
of
DNA
cryptography.
MIT
Technology
Review,
Retrieved
from
http://www.technologyreview.com/view/412610/the-emerging-science-of-dna-cryptography/
Adleman,
L.
(1994).
Molecular
Computation
of
Solutions
to
Combinatorial
Problems.
Science,
266(5187),
1021-1024.
Amin,
S.,
Saeb,
M.,
&
El-Gindi,
S.
(2006,
November).
A
DNA-Based
Implementation
of
YAEA
Encryption
Algorithm.
IASTED
International
Conference
On
Computational
Intelligence,
San
Francisco,
CA.
Arnold,
J.
(2000).
History:
A
Very
Short
Introduction.
Oxford:
Oxford
University
Press.
Aron,
J.
(2011,
October
16).
Keeping
a
Lid
On
Your
Digital
DNA.
New
Scientist,
2834.
Retrieved
October
14,
2012,
from
http://www.newscientist.com/article/mg21228346.500-keeping-a-lid-on-your-digital-dna.html?DCMP=OTC-rss&nsref=online-news
Balado,
F.
(2010).
On
the
Shannon
Capacity
of
DNA
Data
Embedding.
2010
IEEE
International
Conference
on
Acoustics,
Speech,
and
Signal
Processing:
Proceedings,
1,
1766-1769.
Bandakkanavar,
R.
(2010,
December
10).
Implementation
of
Hamming
Code.
KRAZYTECH.
Retrieved
November
4,
2012,
from
http://krazytech.com/projects/implementation-of-hamming-code
Bancroft,
C.,
Bowler,
T.,
Bloom,
B.,
&
Clelland,
C.
T.
(2001).
Long-Term
Storage
of
Information
in
DNA.
Science,
293(5536),
1763-1765.
Bartlett,
J.,
&
Stirling,
D.
(2003).
A
Short
History
of
the
Polymerase
Chain
Reaction.
PCR
Protocols,
226,
3-6.
Bardou,
R.,
Focardi,
R.,
Kawamoto,
Y.,
Simionato,
L.,
Steel,
G.,
&
Tsay,
J.
(2012).
Efficient
Padding
Oracle
Attacks
on
Cryptographic
Hardware.
Advances
in
Cryptology
CRYPTO
2012
32nd
Annual
Cryptology
Conference,
Santa
Barbara,
CA,
USA,
August
19-23,
2012.
Proceedings,
7417,
608-625.
Baum,
E.
(1995).
Building
an
Associative
Memory
Vastly
Larger
Than
The
Brain.
Science,
268(5210),
583-585.
Beale,
B.
(2003,
February
25).
Tiny
Self-Powered
DNA
Computer
Unveiled
News
in
Science
(ABC
Science).
ABC.net.au.
Retrieved
October
14,
2012,
from
http://www.abc.net.au/science/articles/2003/02/25/792007.htm
Benenson,
Y.,
Gil,
B.,
Ben-Dor,
U.,
Adar,
R.,
&
Shapiro,
E.
(2004).
An
Autonomous
Molecular
Computer
for
Logical
Control
Of
Gene
Expression.
Nature,
429(6990),
423-429.
Boneh,
D.,
Dunworth,
C.,
&
Lipton,
R.
(1995).
Breaking
DES
Using
a
Molecular
Computer.
Technical
Report
CS-TR-489-95,
Department
of
Computer
Science,
Princeton
University.
Brown,
K.
(Director)
(1999,
October
20).
How
Long
Will
It
Last?
Life
Expectancy
of
Information
Media.
ARMA
International
Conference.
Lecture
conducted
from
Image
Permanence
Institute,
Cincinnati,
Ohio.
Cachin,
C.
(1998,
May).
An
Information
Theoretic
Model
for
Steganography.
Proceedings
of
2nd
Workshop
on
Information
Hiding,
Cambridge,
MA.
Cayre,
F.,
&
Bas,
P.
(2008).
Kerckhoffs-Based
Embedding
Security
Classes
for
WOA
Data
Hiding.
IEEE
Transactions
on
Information
Forensics
and
Security,
3(1),
1-15.
Chan,
Y.,
Nguyen,
A.,
Niu,
L.,
&
Corn,
R.
(2009).
Fabrication
of
DNA
Microarrays
with
poly-L-Glutamic
Acid
Monolayers
on
Gold
Substrates
for
SPR
Imaging
Measurements.
Langmuir,
25(9),
5054-5060.
Chen,
J.,
&
Wood,
D.
(2000,
February).
Computation
with
Biomolecules.
Proceedings
of
the
National
Academy
of
Sciences
of
the
United
States
of
America.
Church,
G.,
Gao,
Y.,
&
Kosuri,
S.
(2012).
Next-Generation
Digital
Information
Storage
in
DNA.
Science,
10,
1-2.
Retrieved
September
5,
2012,
from
http://www.sciencemag.org/content/early/2012/08/15/science.1226355.full.pdf
Confidentiality,
Integrity
&
Availability.
(2009).
IS
Handbook.
Retrieved
November
7,
2012,
from
http://ishandbook.bsewall.com/risk/Methodology/CIA.html
Conway,
P.
(1996).
Preservation
in
the
Digital
World.
Washington,
D.C.:
Commission
on
Preservation
and
Access.
Craglia,
M.,
Goodchild,
M.,
Annoni,
A.,
&
Camara,
G.
(2008).
Next-Generation
Digital
Earth.
International
Journal
of
Spatial
Data
Infrastructures
Research,
3,
146-167.
Cui,
G.,
Li,
C.,
Li,
H.,
&
Li,
X.
(2009,
August).
DNA
Computing
and
Its
Application
to
Information
Security
Field.
2009
Fifth
International
Conference
on
Natural
Computation.
Cui,
G.,
Qin,
D.,
Wang,
Y.,
&
Zhang,
X.
(2008).
An
Encryption
Scheme
Using
DNA
Technology.
College
of
Electrical
Information
Engineering,
Zhengzhou
University
of
Light
Industry.
Deoxyribonucleic
Acid
(DNA)
Fact
Sheet.
(n.d.).
National
Human
Genome
Research
Institute
(NHGRI)
-
Homepage.
Retrieved
April
30,
2012,
from
http://www.genome.gov/25520880
Di
Cristofaro,
E.
(2011).
Sharing
Sensitive
Information
with
Privacy.
(Master's
thesis,
University
of
California,
Irvine).
Retrieved
from
http://www.emilianodc.com/PAPERS/dissertation.pdf
DNA
Barcoding
101.
(2012).
Retrieved
October
7,
2012,
from
http://www.dnabarcoding101.org/introduction.html
DNA
Microarray
Virtual
Lab.
(2012).
Learn
Genetics.
Retrieved
November
10,
2012,
from
http://learn.genetics.utah.edu/content/labs/
Electrophoresis
|
Labplanet
Blog.
(2011,
January
3).
Labplanet
Blog
|
Just
another
WordPress
site.
Retrieved
November
15,
2012,
from
http://blog.labplanet.com/2011/01/03/electro
Fullerton,
E.,
Margulies,
D.,
Moser,
A.,
&
Takano,
K.
(2012).
Advanced
Magnetic
Recording
Media
for
High-Density
Data
Storage
-
Electroiq.
Solid
State
Technology
-
Electronics
Manufacturing
Industry
News
for
Semiconductors,
Advanced
Packaging
and
Nanotechnology.
Retrieved
September
29,
2012,
from
http://www.electroiq.com/articles/sst/print/volume-44/issue-9/features/thin-film-technology/advanced-magnetic-recording-media-for-high-density-data-storage.html
Gehani,
A.,
&
LaBean,
T.
(2000).
DNA-Based
Cryptography.
Informally
published
manuscript,
Department
of
Computer
Science,
Duke
University,
Durham,
NC.
Hall,
D.,
&
Seaton,
S.
(2006).
Flexible
Protein
Microarray
Inkjet
Printing.
GEN
-
Genetic
Engineering
and
Biotechnology
News,
26(18).
Retrieved
October
1,
2012,
from
http://www.genengnews.com/gen-articles/flexible-protein-microarray-inkjet-printing/1897/
Harizopoulos,
S.,
Shah,
M.,
Meza,
J.,
&
Ranganathan,
P.
(2009,
January).
Energy
Efficiency:
The
New
Holy
Grail
of
Data
Management
Systems
Research.
4th
Biennial
Conference
on
Innovative
Data
Systems
Research,
Asilomar,
CA.
Heider,
D.,
&
Barnekow,
A.
(2007).
DNA-Based
Watermarks
Using
the
DNA-Crypt
Algorithm.
BMC
Bioinformatics,
8(176),
1-11.
Hirabayashi,
M.,
Kojima,
H.,
&
Oiwa,
K.
(2010).
Design
of
True
Random
One-Time
Pads
in
DNA
XOR
Cryptosystems.
Natural
Computing,
2,
174-183.
Hollingsworth,
P.,
et
al.
(2009).
A
DNA
Barcode
for
Land
Plants.
Proceedings
of
the
National
Academy
of
Sciences
of
the
United
States
of
America,
106,
12794-12797.
How
do
we
Sequence
DNA?
(2012).
University
of
Michigan
DNA
Sequencing
Core.
Retrieved
October
7,
2012,
from
http://seqcore.brcf.med.umich.edu/doc/educ/dnapr/sequencing.html
Jambhekar,
N.
(2011).
Steganography:
To
Preserve
Document
Security.
Golden
Research
Thoughts,
1(6).
Retrieved
November
4,
2012,
from
http://www.aygrt.net/PublishArticles/107.aspx
Jiao,
S.,
&
Goutte,
R.
(2009).
Hiding
Data
in
DNA
of
Living
Organisms.
Natural
Science,
1(3),
181-184.
Kamali,
B.
(1995).
Error
Control
Coding.
IEEE
Potentials,
14(2),
15-19.
Lambert,
L.,
Cox,
T.,
Mitchell,
K.,
Rossello-Mora,
R.,
Cueto,
C.
D.,
Dodge,
D.,
et
al.
(1998).
Staphylococcus
Succinus
Sp.
Nov.,
Isolated
From
Dominican
Amber.
International
Journal
of
Systematic
Bacteriology,
48,
511-518.
Lausted,
C.,
Dahl,
T.,
Warren,
C.,
King,
K.,
Smith,
K.,
Johnson,
M.,
et
al.
(2004).
Posam:
A
Fast,
Flexible,
Open-Source,
Inkjet
Oligonucleotide
Synthesizer
and
Microarrayer.
Genome
Biology,
5(8),
R58
1-17.
Leier,
A.,
Banzhaf,
W.,
&
Rauhe,
H.
(2000).
Cryptography
with
DNA
Binary
Strands.
Biosystems,
57(1),
13-22.
Lelusz,
M.
(2009,
May
9).
inleo.blog
wirtualizacja,
cloud
computing,
storage,
serwery,
it
Zastosowanie
maszyn
wirtualnych
w
przedsiebiorstwie
Rys
historyczny
technologii
wirtualizacji.
inleo.blog
wirtualizacja,
cloud
computing,
storage,
serwery,
it.
Retrieved
November
15,
2012,
from
http://blog.inleo.pl/?p=68
Leo,
R.
A.
(2012,
August
16).
Writing
the
Book
in
DNA
|
HMS.
Home
|
HMS.
Retrieved
September
23,
2012,
from
http://hms.harvard.edu/content/writing-book-dna
LeProust,
E.,
Peck,
B.,
Spirin,
K.,
McCuen,
H.
B.,
Moore,
B.,
Namsaraev,
E.,
et
al.
(2010).
Synthesis
of
High-Quality
Libraries
of
Long
(150mer)
Oligonucleotides
by
a
Novel
Depurination
Controlled
Process.
Nucleic
Acids
Research,
38(8),
2522-2540.
Liu,
C.,
Shi,
L.,
Xu,
X.,
Li,
H.,
Xing,
H.,
Liang,
D.,
et
al.
(2012).
DNA
Barcode
Goes
Two-Dimensions:
DNA
QR
Code
Web
Server.
PLoS
ONE,
7(5),
1-7.
Luoma,
J.
(2008,
June
3).
DNA
Technology:
Discovering
New
Species
by
Jon
R.
Luoma:
Yale
Environment
360.
Yale
Environment
360:
Opinion,
Analysis,
Reporting
&
Debate.
Retrieved
November
3,
2012,
from
http://e360.yale.edu/feature/dna_technology_dis
Mathai,
J.
(2012).
History
of
Computer
Cryptography
and
Secrecy
Systems.
Informally
published
manuscript,
Computer
and
Information
Science,
Fordham
University,
New
York,
NY,
Retrieved
from
http://www.dsm.fordham.edu/~mathai/crypto.html
Microarrays
Factsheet.
(2007,
July
27).
National
Center
for
Biotechnology
Information.
Retrieved
October
1,
2012,
from
http://www.ncbi.nlm.nih.gov/About/primer/microarrays.html
Neacsu,
C.,
Dreyer,
J.,
Behr,
N.,
&
Raschke,
M.
(2006).
Scanning-probe
Raman
Spectroscopy
with
Single-Molecule
Sensitivity.
Physical
Review
B,
73,
193406-1
through
193406-4.
Ning,
K.
(2009).
A
Pseudo
DNA
Cryptography
Method.
Manuscript
submitted
for
publication,
Library,
Cornell
University,
Ithaca,
NY.
O'Brien,
J.
(1998,
February).
SCAA
-
Electronic
Records:
Basic
Concepts
in
Preservation
and
Access.
SCAA
-
Saskatchewan
Council
for
Archives
and
Archivists.
Retrieved
September
23,
2012,
from
http://scaa.usask.ca/e-paper.html
Optical
Storage
Technology
-
The
Compact
Disc.
(2004,
April
12).
National
Chung
Hsing
University.
Retrieved
September
29,
2012,
from
benz.nchu.edu.tw/~imtech/course/ods/Chapter%202%20-%20The%20Compact%20Disc.pdf
Pääbo,
S.,
Gifford,
J.,
&
Wilson,
A.
(1988).
Mitochondrial
DNA
sequences
from
a
7000-year
old
brain.
Oxford
Journals
-
Nucleic
Acids
Research,
16(20),
9775-9787.
Pierik,
A.,
Dijksman,
F.,
Raaijmakers,
A.,
Wismans,
T.,
&
Stapert,
H.
(2008).
Quality
Control
of
Inkjet
Technology
for
DNA
Microarray
Fabrication.
Biotechnology
Journal,
3(12),
1581-1590.
Provos,
N.,
&
Honeyman,
P.
(2003).
Hide
and
Seek:
An
Introduction
to
Steganography.
Security
&
Privacy
Magazine,
IEEE,
1(3),
32-44.
Sabry,
M.,
Hashem,
M.,
Nazmy,
T.,
&
Khalifa,
M.
(2010).
A
DNA
and
Amino
Acids-Based
Implementation
of
Playfair
Cipher.
International
Journal
of
Computer
Science
and
Information
Security,
8(3),
126-133.
Schovanek,
M.
(2010,
November
7).
Marek
Schovanek
News
News.
Marek
Schovanek.
Retrieved
November
12,
2012,
from
http://www.marekschovanek.com/wordpress/?cat=1&paged=2
Securing
Data
at
Rest:
Developing
a
Database
Encryption
Strategy.
(2002).
RSA
-
Security,
Compliance
and
Risk
Management
Solutions.
Retrieved
November
15,
2012,
from
www.rsa.com/products/bsafe/whitepapers/DDES_WP_0702.pdf
Shimanovsky,
B.,
Feng,
J.,
&
Potkonjak,
M.
(2003).
Hiding
Data
in
DNA.
XAP
Corporation,
Department
of
Computer
Science,
University
of
California,
Los
Angeles.
Smyth,
D.
(2007,
July
18).
An
inside
look
at
Symmetric
Encryption.
DotNetSlackers:
ASP.NET
News
and
Articles
For
Lazy
Developers.
Retrieved
November
15,
2012,
from
http://dotnetslackers.com/articles/security/AnInsideLookAtSymmetricEncryption.aspx
RedWeb
Technologies
|
ADNAS.
(2012).
ADNAS
|
Crime
Fighting
Has
Never
Had
a
Weapon
Like
This.
Retrieved
October
13,
2012,
from
http://www.adnas.com/redweb
Ribeiro,
R.
(2012,
May).
A
History
of
Encryption
Through
the
Ages
[Infographic].
In
Biztech.
Retrieved
November
12,
2012,
from
http://www.biztechmagazine.com/article/2012/05/history-encryption-through-ages-infographic
Reif,
J.
(2002).
The
Emergence
of
the
Discipline
of
Biomolecular
Computation
in
the
US.
New
Generation
Computing,
20(3),
217-236.
Stallings,
W.
(2010).
Cryptography
and
Network
Security:
Principles
and
Practice
(2nd
ed.).
Upper
Saddle
River,
N.J.:
Prentice
Hall.
Toumazou,
C.
(2012).
DNA
Electronics
-
Real-Time
Disposable
Gene
Tests.
DNA
Electronics
-
Real-time
disposable
gene
tests.
Retrieved
November
9,
2012,
from
http://dnae.co.uk/platforms/genalysis/lab-free-dna-testing/
Trummel,
K.,
&
Weisinger,
J.
(1986).
The
Complexity
of
the
Optimal
Searcher
Path
Problem.
Operations
Research,
34(2),
324-327.
Van
Voorst,
J.,
&
Finzel,
B.
(2012,
May).
A
Searchable
Database
of
Macromolecular
Conformation.
2012
chemistry
biology
interface
training
symposium,
Minneapolis,
MN.
Wong,
P.
C.,
Wong,
K.,
&
Foote,
H.
(2003,
January).
Organic
Data
Memory
Using
the
DNA
Approach.
Communications
of
the
ACM,
46,
95-98.
Xiao,
G.,
Lu,
M.,
Qin,
L.,
&
Lai,
X.
(2006).
New
Field
of
Cryptography:
DNA
Cryptography.
Chinese
Science
Bulletin,
51(12),
1413-1420.
Zhang,
L.,
Lang,
X.,
Hirata,
A.,
&
Chen,
M.
(2011).
Single-Molecule
Spectroscopy
-
A
New
Gold
Standard?
ACS
Nano,
5,
4407-4413.