Beruflich Dokumente
Kultur Dokumente
James A. Foster Laboratory for Applied Logic Dept. of Computer Science University of Idaho May 8, 1997
Chemistry of DNA
DNA molecules: paired strands of nucleotides bases attached to sugarphosphate backbones Nucleotides bases: Adenine binds with Thymine, Guanine binds with Cytosine Backbone: 5 carbons, similar to linked list One molecule's 50 binds to next ones 30 10 binds to nucleotide Paired Strands: Bases bond to complementary strand Sequences: listed 50 to 30
Laboratory for Uphill Computing
May 8, 1997 jaf
T A
G C
C G
A T
GCCA
TGCATTC
CGGT
ACGTAAG
Note:
GCCATGCATTC
CGGTACGTAAG
s is complement of s is
May 8, 1997 jaf
svt
for
0) Given
5 3
PCR Illustration
s s t
3 5
1) Denature (heat)
1111111111111111 0000000000000000 11 00 1111111111111111 0000000000000000 11111 00000 111111111111 000000000000 11111 00000 111111111111 000000000000 11111 00000 1111111111 0000000000
t s s t
111111111111111111 000000000000000000 111111111111111111 000000000000000000 11111 00000 111111111111 000000000000 11111 00000 1111111111 0000000000 11111 00000
t t s
11111 00000
111111111111111111 000000000000000000 111111111111111111 000000000000000000 11111 00000 111111111111 000000000000 11111 00000 1111111111 0000000000 11111 00000
t t t s
11111 00000
Laboratory for Uphill Computing
as DNA
Recall: NP problems are easy to solve given a short hint". This algorithm checks all possible hints" in parallel, with a polynomial number of operations.
Given: Directed graph G, nodes f , t Question: Is there a directed path, visiting every node exactly once, from f to t in G?
2 5
3
A possible Hamitonian path: (F,2,4,6,3,5,T)
Long enough not to bind to each other Short enough for PCR to work Edge u; v: build sequence S with 10-base su x from S , 10-base pre x from S except use all of S and S
uv u v f t
Path a
b c:
10
Example Examples of Adlemans encoding Nodes S2: GTCACACTTC GGACTGACCT S2: AGGTCAGTCC GAAGTGTGAC S4: TGTGCTATGG GAACTCAGCG S4: CGCTGAGTTC CCATAGCACA S5: CACGTAAGAC GGAGGAAAAA S5: TTTTTCCTCC GTCTTACGTG Edges (2,4): GGACTGACCT TGTGCTATGG (4,5): GAACTCAGCG CACGTAAGAC Paths (2.4.5):
11
AGGTCAGTCC GAAGTGTGAC
CGCTGAGTTC CCATAGCACA
TTTTTCCTCC GTCTTACGTG
uv
Overlapping segments will bind and leave sticky ends" to promote further binding
Example
Edge (2,4) Node 4
GGACTGACCT TGTGCTATGG CGCTGAGTTC CCATAGCACA
12
t
Run PCR on tube T using S and S as primers, put products in T 0 Separate strands with 20n + 10 bases from T 0, put products in tube R Note: by construction, nodes can be visited at most once with high probability If any DNA is left in R, return yes", else no"
13
A Mathematical Model
Primitives: tubes of DNA or similar Operations:
Remove T; T 0;
of form
i
,
placing them in T 0
ig
;T
Copy T;
Tig
We implicitly assume operations such as separation by size, ampli cation, ligation, annealing, and denaturing where needed
Laboratory for Uphill Computing
May 8, 1997 jaf
14
Solving dHP
Representation: as before
Input: for each node v and edge
Tv contains Sv and Sv 0 Tuv contains Suv and Suv 0 MixfTi; Tuv g,T RemoveT ,T0,fSf g RemoveT0,T 0,fSt g Move length 20n + 10 strings from T 0 to T 00 if DetectT 00
then return ``Yes'' else return ``No''
u; v,
Complexity: linear in number of nodes for Mix Note: S and S are DNA strands representing node v and edge u; v in input graph
v uv
15
Listing Permutations
Problem: input n, list all permutations of items Representation: p1i1p2i : : : p i where codes position j ", i 1; 2; : : : ; n
x n n j 2 f
pj
en-
Input: T with all valid strings for j to n CopyT ,fT1; T2; : : : ; Tng
=1
i is any string other than i MixfT1; T2; : : : ; Tn g,T first j is are distinct in each string T contains permutations of f1; 2; : : : ; ng
:
=1 = +1
16
Solving SAT
The Problem
^_ l F ~ = x
p q n i
=1 =1
j
i;j
x3
17
v1
v2
Etc.
vn
x1
xn
Variables: x1, x2, ..., xn Extra nodes: v1, v2, ..., vn Paths: sequences of literals over these variables
18
19
Computational Complexity
Let P-DNA be problems solvable with polynomial steps in this model Thm Beaver: P-DNA = PSPACE Generalized Turing-complete models splicing systems, DNA TMs exist But, P-DNA computations still require exponential volume, and perhaps lots of clock time
20
Disadvantages
Steps" are manual and slow Reaction time proportional to volume of reactants: real time can be much slower than number of steps Required volume can be huge Processes can introduce errors Processes do not scale up well
21
22
Advantages
Massive parallelism Attack special instances e.g. Keller graphs for MC Very low energy consumption 10,19 J versus 10,9 J per basic operation with no inherent lower bound Way cool
23
L. Adleman, Molecular Computation of Solutions to Combinatorial Problems", Science, 266:1021 1024, 1994 L. Adleman, On Constructing a Molecular Computer", manuscript, ftp: usc.edu pub csinfo papers adlemanmolecular computer.ps, 1995 D. Beaver, A Universal Molecular Computer", Technical report, Penn. State U., 1995 J. Hartmanis, On the Weight of Computations", Bull. Euro. Assoc. for Theoretical Comp. Sci., 55:136 138, 1995 J. H. Reif, Parallel Molecular Computation", in Proc. 7th ACM Symp. on Parallel Alg. and Arch., pp. 213 223, 1995 Also see URLs from my homepage bookmarks
Further Reading
24