
INSTITUTE OF PHYSICS PUBLISHING Nanotechnology 17 (2006) R27–R39

NANOTECHNOLOGY doi:10.1088/0957-4484/17/2/R01


DNA computing: applications and challenges

Z Ezziane
Dubai University College, College of Information Technology, PO Box 14143, Dubai, UAE, Middle East

Received 17 August 2005
Published 21 December 2005
Online at

Abstract
DNA computing is a discipline that aims at harnessing individual molecules at the nanoscopic level for computational purposes. Computation with DNA molecules is of inherent interest to researchers in computing and biology. Given its vast parallelism and high-density storage, DNA computing approaches are employed to solve many combinatorial problems. However, the exponential scaling of the solution space prevents applying an exhaustive search method to problem instances of realistic size, and therefore artificial intelligence models are used in designing methods that are more efficient. DNA has also been explored as an excellent material and a fundamental building block for building large-scale nanostructures, constructing individual nanomechanical devices, and performing computations. Molecular-scale autonomous programmable computers have been demonstrated that allow both input and output information to be in molecular form. This paper reviews recent advances in DNA computing and presents major achievements and challenges for researchers in the foreseeable future.

1. Introduction
DNA (deoxyribonucleic acid) computing research was inspired by the similarity between the way DNA works and the operation of a theoretical device known as a Turing machine. Turing machines process information and store it as a sequence, or list, of symbols, which is naturally related to the way biological machinery works. Biomolecular computing, where computations are performed by biomolecules, is challenging traditional approaches to computation both theoretically and technologically. The idea that molecular systems can perform computations is not new and was indeed more natural in the pre-transistor age. Most computer scientists know of von Neumann's discussions of self-reproducing automata in the late 1940s, some of which were framed in molecular terms (McCaskill 2000). Important was the idea, appearing less natural in the current age of dichotomy between hardware and software, that the computations of a device can alter the device itself. This vision is natural at the scale of molecular reactions, although it may appear as a fantasy to those running huge chip
0957-4484/06/020027+13$30.00 2006 IOP Publishing Ltd

production facilities. Alan Turing also looked beyond purely symbolic processing to natural bootstrapping mechanisms in his work on self-structuring in molecular and biological systems (McCaskill 2000). In biology, the idea of molecular information processing took hold starting from the unravelling of the genetic code and translation machinery and extended to genetic regulation, cellular signalling, protein trafficking, morphogenesis and evolution, all of which have progressed independently of developments in the neurosciences. The essential role of information processing in evolution, and the ability to address these issues on laboratory timescales at the molecular level, were first addressed by Adleman's key experiment (Adleman 1994), which demonstrated that the tools of laboratory molecular biology could be used to program computations with DNA in vitro. DNA computing approaches can be performed either in vitro (purely chemical) or in vivo (i.e. inside cellular life forms). The huge information storage capacity of DNA and the low energy dissipation of DNA processing led to an explosion of interest in massively parallel DNA computing. For serious proponents of the field, however, there never was

Printed in the UK

Topical Review


Figure 1. Example of a DNA molecule.

Figure 2. A DNA molecule with sticky ends.

a question of brute-force search with DNA indefinitely solving the problem of exponential growth in the number of alternative solutions. Artificial intelligence methods are used to address the combinatorial issue in DNA computing (Impagliazzo et al 1998, Sakamoto et al 1999), which will be discussed later in this review. Quantum computing is often compared with DNA computing. Quantum computing involves high physical technology for the isolation of mixed quantum states necessary to implement efficient computations solving combinatorially complex problems such as factorization. DNA computing, by contrast, operates in natural noisy environments, such as a glass of water. It involves an evolvable platform for computation in which the computer construction machinery itself is embedded. Since DNA computing is linked to molecular construction such as nanomechanical devices and other nanoscale structures, the computations may eventually also be employed to build three-dimensional self-organizing partially electronic or, more remotely, even quantum computers (McCaskill 2000).

2. The structure of DNA

DNA is the major example of a biological molecule that stores information and can be manipulated, via enzymes and nucleic acid interactions, to retrieve information. Just as a string of binary data is encoded with zeros and ones, a strand of DNA is encoded with four bases (known as nucleotides), represented by the letters A, T, C, and G. Each strand, according to chemical convention, has a 5′ and a 3′ end; hence, any single strand has a natural orientation. Figure 1 presents a DNA molecule composed of ten pairs of nucleotides. Bonding occurs by the pairwise attraction of bases; A bonds with T and G bonds with C. The pairs (A, T) and (G, C) are therefore known as complementary base pairs. DNA computing relies on developing algorithms that solve problems using the information encoded in the sequence of nucleotides that make up DNA's double helix, and then breaking and making new bonds between them to reach the answer. The nucleotides are spaced every 0.35 nm along the DNA molecule, giving DNA a remarkable data density estimated at one bit per cubic nanometre, and potentially exabytes (10¹⁸) of information in a gram of DNA (Chen et al 2004). In two dimensions, assuming one base per square nanometre, the data density is over one million Gbits per square inch, whereas the data density of a typical high-performance hard drive is about 7 Gbits per square inch (Ryu 2000). DNA computing is also massively parallel and can reach approximately 10²⁰ operations s⁻¹, compared with existing teraflop supercomputers. Another important property of DNA is its double-stranded nature. The bases A and T, and C and G, can bind together, forming base pairs. Therefore, every DNA sequence has a natural complement. For example, if sequence S is AATTCGCGT, its complement, S′, is TTAAGCGCA. Both S and S′ will hybridize to form double-stranded DNA. This

complementarity can be used for error correction. If an error occurs in one of the strands of double-stranded DNA, repair enzymes can restore the proper DNA sequence by using the complementary strand as a reference. In DNA replication, there is one error for every 10⁹ copied bases, whereas hard drives have one error for every 10¹³ with Reed–Solomon correction (Ryu 2000). From the basic principle of base pair complementarity, DNA contains two elements crucial to any computer: (1) a processing unit in the form of enzymes that denature, replicate and anneal DNA, which are operations capable of cutting, copying, and pasting; and (2) a storage unit encoded in DNA strings (Thaker 2004). Hence, when enzymes work on multiple DNA molecules at the same time, DNA computing becomes massively parallel and ultimately very powerful. The power of DNA computing comes from its memory capacity and parallel processing. For example, in bacteria, DNA can be replicated at a rate of about 500 base pairs a second, which is 10 times faster than in human cells. This represents about 1000 bits s⁻¹, but when many copies of the replication enzymes work on DNA in parallel, the number of DNA strands increases exponentially (2ⁿ after n iterations). Consequently, after 30 iterations the rate increases to about 1 Tbits s⁻¹ (1000 × 2³⁰ ≈ 10¹² bits s⁻¹).

2.1. Matching DNA sticky ends
Restriction enzymes catalyse the cutting of both strands of a DNA molecule at very specific DNA base sequences, called recognition sites. Recognition sites are typically 4–8 DNA base pairs long. Figure 2 shows a DNA molecule in which four nucleotides at the left end and five at the right end are not paired with nucleotides from the opposite strand. This molecule has sticky ends. There are over 100 different restriction enzymes, each of which cuts at its specific recognition site(s). A restriction enzyme cuts DNA so as to leave short sticky ends that will match and attach to the sticky ends of any other DNA that has been cut with the same enzyme. DNA ligase joins the matching sticky ends of DNA pieces from different sources that have been cut by the same restriction enzyme. Many restriction enzymes work by finding palindromic sections of DNA (regions where the order of nucleotides at one end is the reverse of the sequence at the opposite end). The process of joining matching sticky DNA ends is used extensively in the field of DNA technology to produce substances such as insulin and interferon, and to splice genes that alter a cell or organism from its original DNA for some benefit. For example, gene splicing has been used in agriculture to delay the ripening of tomatoes, to make more nutritious corn, to make rice that contains carotenes and to produce plants with natural pesticides.
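The base-pairing and sticky-end matching rules described in this section can be illustrated with a few lines of Python. This is a minimal sketch under stated assumptions, not a chemical model: strands are plain strings, the function names `complement` and `sticky_ends_match` are hypothetical, and two overhangs are assumed to anneal when one is the reverse complement of the other (both written 5′ to 3′).

```python
# Illustrative sketch only: strands are plain strings, not a model of the chemistry.

PAIR = {"A": "T", "T": "A", "C": "G", "G": "C"}


def complement(strand):
    """Return the base-by-base complement, position for position (S -> S' in the text)."""
    return "".join(PAIR[base] for base in strand)


def sticky_ends_match(overhang_a, overhang_b):
    """Hypothetical helper: assume two single-stranded overhangs, both written
    5'->3', can anneal when one is the reverse complement of the other."""
    return overhang_a == complement(overhang_b)[::-1]


if __name__ == "__main__":
    print(complement("AATTCGCGT"))            # the example sequence S from the text
    print(sticky_ends_match("AATT", "AATT"))  # a palindromic overhang pairs with itself
    print(sticky_ends_match("AATT", "GGCC"))  # overhangs left by different cuts
```

A palindromic overhang is its own reverse complement, which is consistent with the remark above that fragments cut by the same enzyme can be joined regardless of their source.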

3. DNA computers
A DNA computer is a collection of specially selected DNA strands whose combinations will result in the solution to

Figure 3. The von Neumann architecture for a self-replicating system. (Diagram labels: Universal Computer, Universal Constructor.)

(Diagram labels belonging to figure 4: Molecular Computer; Molecular Constructor, comprising Molecular Positional Capability and Tip Chemistry.)

some problem, and a nanocomputer is considered as a machine that uses DNA to store information and perform complex calculations. Benenson et al (2003) noted the unique properties of DNA as a fundamental building block in the fields of supramolecular chemistry, nanotechnology, nanocircuits, molecular switches, molecular devices, and molecular computing. Many designs for miniature computers aimed at harnessing the massive storage capacity of DNA have been proposed over the years. Earlier schemes relied on a molecule known as ATP, which is a common source of energy for cellular reactions, as a fuel source. However, Benenson et al (2003) designed a new model in which a DNA molecule provides both the initial data and sufficient energy to complete the computation. Both models of the molecular computer are so-called automata. Given an input string composed of two different states, an automaton uses predetermined rules to arrive at an output value that answers a particular question. A specific enzyme then acts as the computer's hardware by cutting a piece of the input molecule and releasing the energy stored in the bonds. This heat energy then powers the next computation (Graham 2003).

Positional control combined with appropriate molecular tools should enable researchers and practitioners to build a truly overwhelming range of molecular structures. One of the outcomes will be a general-purpose programmable device that is able to make copies of itself. von Neumann carried out a detailed analysis of self-replicating systems in a theoretical cellular automata model. In this model, as depicted in figure 3, he used a universal computer for control and a universal constructor to build more automata. The universal constructor was a robotic arm that, under computer control, could move in two dimensions and alter the state of the cell at the tip of its arm. By sweeping systematically back and forth, the arm could eventually build any structure that the computer instructed it to. In his three-dimensional model, von Neumann retained the idea of a positional device and a computer to control it. The architecture for Drexler's assembler, as depicted in figure 4, is a specialization of the more general architecture proposed by von Neumann. Similarly, there is a computer and a constructor. However, the computer has shrunk to a molecular computer, while the constructor combines two features: a robotic positional device and a well-defined set of chemical operations that take place at the tip of the positional device ( MITtecRvwSmlWrld/article.html).

The complexity of a self-replicating system must be reasonable and acceptable. In addition, the complexity of an assembler, in terms of bytes, should not be beyond the complexity that can be dealt with by today's engineering capabilities. As indicated in table 1, the primary observation to

Figure 4. Drexler's architecture for an assembler.

Table 1. Complexity of self-replicating systems (Megabytes).

System                                 Complexity (Mbytes)
von Neumann's universal constructor    About 0.63
Internet Worm                          About 0.63
Mycoplasma genitalium                  0.14
E. coli                                1.16
Drexler's assembler                    12.5
Human                                  800
NASA Lunar Manufacturing Facility      13 000

be drawn from these data is that simpler designs and proposals for self-replicating systems both exist and are well within current design capabilities. The engineering effort required to design systems of such complexity will be significant, but should not be greater than that involved in the design of existing systems such as computers. Self-replication is used as a means to an end, not as an end in itself. A system able to make copies of itself but unable to make much of anything else would not be very useful. The purpose of self-replication in the context of manufacturing is to permit the low-cost replication of a flexible and programmable manufacturing system. Hence, the objective is to build a system that can be reprogrammed to make a very wide range of molecularly precise structures.

3.1. Self-assembling nanostructures with DNA
DNA molecular structures and intermolecular interactions are particularly amenable to the design and synthesis of complex molecular objects. Winfree et al (1998) used a molecular self-assembly approach to the fabrication of objects specified with nanometre precision. Their results demonstrated the potential of using DNA to create self-assembling periodic nanostructures, thereby leading the way to nanotechnology. A few years later, Mao et al (2000) reported a one-dimensional algorithmic self-assembly of DNA triple-crossover molecules that can be used to execute four steps of a logical (cumulative XOR) operation on a string of binary bits. Their results suggest that computation by self-assembly may be scalable. Figure 5 depicts a simplified version of the implementation of the XOR cellular automaton using the Sierpinski rules (Rothemund et al 2004). Figure 5 has five parts: (A), (B), (C), (D), and (E). On the left of (A), two time steps of the execution are shown as a space–time history, in which cells are updated synchronously according to the XOR function.
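The synchronous XOR update just described can be simulated directly. The following is a minimal sketch, assuming that each new cell is the XOR of the old cell and its left neighbour and that the cell to the left of the row is always 0; the text states only that cells are updated by XOR, so this neighbourhood and the names `xor_step` and `run` are illustrative assumptions.

```python
def xor_step(row):
    """One synchronous update: new cell i is the XOR of old cells i-1 and i."""
    padded = [0] + list(row)  # assumption: the cell left of the row is always 0
    return [padded[i] ^ padded[i + 1] for i in range(len(row))]


def run(initial, steps):
    """Return the space-time history: the initial row followed by `steps` updates."""
    history = [list(initial)]
    for _ in range(steps):
        history.append(xor_step(history[-1]))
    return history


if __name__ == "__main__":
    # A single 1 at the left edge plays the role of the nucleating structure.
    for row in run([1, 0, 0, 0, 0, 0, 0, 0], 7):
        print("".join("#" if cell else "." for cell in row))
```

Started from a single 1, the history reproduces Pascal's triangle modulo 2, which is the Sierpinski pattern that error-free tile growth is expected to produce.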
The right side of (A) shows the Sierpinski triangle. Part (B) translates the space–time history into a tiling, in which for each possible input pair a tile T-xy is generated so that it bears the inputs represented as shapes on the lower

Figure 5. The XOR cellular automaton implementation using tile-based self-assembly. (Panel labels: z = x XOR y; tiles T-xy; initial conditions for the computation are provided by nucleating structures (0s and 1s); error-free growth results in the Sierpinski pattern; error-prone growth including mismatch errors.) (This figure is in colour only in the electronic version)

half of each side and the output as shapes duplicated on the top half of each side. Part (C) shows the four Sierpinski rule tiles, T-00, T-11, T-01, and T-10, which represent the four entries of the truth table for XOR: 0 XOR 0 = 0, 1 XOR 1 = 0, 0 XOR 1 = 1, and 1 XOR 0 = 1. Part (D) shows error-free growth, which results in the Sierpinski pattern, and part (E) uses symbols to indicate mismatch errors.

DNA nanostructures provide a programmable methodology for bottom-up nanoscale construction of patterned structures, utilizing macromolecular building blocks called DNA tiles, which are based on branched DNA. These tiles have sticky ends that match the sticky ends of other DNA tiles, facilitating further assembly into larger structures known as DNA tiling lattices. In principle, DNA tiling assemblies can be made to form any computable two- or three-dimensional pattern, however complex, with the appropriate choice of the tiles' component DNA (Reif et al 2005). One potential approach is to use patterned DNA as scaffolds or templates for organizing and positioning molecular electronics and other components, such as molecular sensors, with precision and specificity. The programmability lets this scaffolding have the patterning required for fabricating complex devices made of these components. Sung et al (2004) discussed the fabrication and characterization of an original class of nanostructures based on DNA scaffolds. They reported on the self-assembly of one- and two-dimensional DNA scaffolds, which served as templates for the targeted deposition of ordered nanoparticles and molecular arrays. Turberfield (2003) proposed to use self-assembling DNA

nanostructures as scaffolds for constructing and positioning molecular-scale electronic devices and wires. A principal challenge in DNA tiling self-assemblies is the control of assembly errors. This is particularly relevant to computational self-assemblies, which, with complex patterning at the molecular scale, are prone to a quite high rate of error, ranging from approximately 0.5% to 5% (Reif et al 2005). Limiting or eliminating these errors is a major challenge for nanostructure self-assembly.

3.2. DNA nanomachines
DNA has been explored as an excellent material for building large-scale nanostructures, constructing individual nanomechanical devices, and performing computations (Seeman 2003). A variety of DNA nanomechanical devices have previously been constructed that demonstrate motions such as open/close (Yurke et al 2000, Simmel and Yurke 2001, 2002, Liu and Balasubramanian 2003), extension/contraction (Li and Tan 2002, Alberti and Mergny 2003, Feng et al 2003), and motors/rotation (Mao et al 1999, Yan et al 2002, Niemeyer and Adler 2002), mediated by external environmental changes such as the addition and removal of DNA fuel strands (Li and Tan 2002, Alberti and Mergny 2003, Simmel and Yurke 2001, 2002, Yan et al 2002, Yurke et al 2000) or a change in the ionic composition of the solution (Mao et al 1999, Liu and Balasubramanian 2003). The DNA walker could ultimately be used to carry out computations and to precisely transport nanoparticles of


material. The walker can be programmed in several ways to this end. For example, information can be encoded in the walker fragments as well as in the track so that, while moving, the walker simultaneously carries out computation. Yin et al (2005a, 2005b) designed an autonomous DNA walking device in which a walker moves unidirectionally along a linear track. Sherman and Seeman (2004) constructed a DNA walking device controlled by DNA fuel strands. Reif (2003) designed an autonomous DNA walking device and an autonomous DNA rolling device that move in a random bidirectional fashion along DNA tracks. Shin and Pierce (2004) designed a DNA walker for molecular transport. Recently, Yin et al (2005a, 2005b) encoded computational power into a DNA walking device embedded in a DNA lattice and thereby accomplished the design of an autonomous nanomechanical device capable of universal computation and translational motion.

Implementing controllable molecular nanomachines made of DNA is one of the objectives of DNA computing and DNA nanotechnology (Takahashi et al 2005). Control of DNA machines has been implemented using different methods: (1) DNA strands that hybridize with target machines and drive their state transitions, (2) DNA strands used as catalysts for the formation of double helices in such nanomachines, and (3) the B–Z transition of DNA, which is capable of switching the conformation of the DNA motor (Mao et al 1999). Various approaches have implemented the first method. Yurke et al (2000) reported the construction of a DNA machine in which DNA is used not only as a structural material, but also as fuel. Simmel and Yurke (2001) described a DNA-based molecular machine that has two movable arms, which are pushed apart when a strand of DNA, the fuel strand, hybridizes with a single-stranded region of the molecular machine. Yan et al (2002) implemented a robust DNA mechanical device controlled by hybridization topology.

Implementations of the second method have also been reported. Seelig (2004) presented experimental results on the control of the decay rates of a metastable DNA fuel, discussed how the fuel complex can serve as the basic ingredient for a DNA hybridization catalyst, and proposed a method for implementing arbitrary digital logic circuits. Turberfield and Mitchel (2003) described kinetic control of DNA hybridization, which has the potential to increase the flexibility and reliability of DNA self-assembly by inhibiting the hybridization of complementary oligonucleotides. The proposed DNA catalysts were shown to be effective in promoting hybridization and in using DNA as a fuel to drive free-running artificial molecular machines.

4. DNA computing
DNA computing is a novel and fascinating development at the interface of computer science and molecular biology. It has emerged in recent years, not simply as an exciting technology for information processing, but also as a catalyst for knowledge transfer between information processing, nanotechnology, and biology. This area of research has the potential to change our understanding of the theory and practice of computing.

4.1. Biomolecular computing
Biomolecular computers are molecular-scale, programmable, autonomous computing machines in which the input, output, software, and hardware are made of biological molecules (Benenson and Shapiro 2004). Biomolecular computers hold the promise of direct computational analysis of biological information in its native biomolecular form, avoiding its conversion into an electronic representation (Adar et al 2004). This has led to the pursuit of autonomous, programmable computers that are considered as finite automata (McAdams and Arkin 1997). An automaton can be stochastic, namely it can have two or more competing transitions for each state-symbol combination, each with a prescribed probability. A stochastic automaton is useful for processing uncertain information, such as most biological information. Because of the stochastic nature of biomolecular systems, a stochastic biomolecular computer would be more favourable for analysing biological information than a deterministic one (McAdams and Arkin 1997). Stochastic molecular automata have been constructed in which stochastic choice is realized by means of competition between alternative paths, and choice probabilities are programmed by the relative molar concentrations of the software molecules coding for the alternatives. This approach was used in the construction of a molecular computer capable of probabilistic logical analysis of disease-related molecular indicators (Adar et al 2004). Benenson et al (2001) described a programmable finite automaton comprising DNA and DNA-manipulating enzymes that solves computational problems autonomously. The automaton's hardware consists of a restriction nuclease and ligase, the software and input are encoded by double-stranded DNA, and programming amounts to choosing appropriate software molecules.

Their experiments used 10¹² automata, which shared identical software and ran independently and in parallel on inputs in 120 μl of solution at room temperature, at a combined rate of 10⁹ transitions s⁻¹ with a transition fidelity greater than 99.8%, consuming less than 10⁻¹⁰ W. It has also been demonstrated that a single DNA molecule can provide both the input data and all of the necessary fuel for a molecular automaton (Benenson et al 2003). Those experiments showed 3 × 10¹² automata μl⁻¹ performing 6.6 × 10¹⁰ transitions s⁻¹ μl⁻¹ with a transition fidelity of 99.9%, dissipating about 5 × 10⁻⁹ W μl⁻¹ as heat at ambient temperature. An autonomous biomolecular computer was described recently (Benenson et al 2004) that analyses the levels of messenger RNA (mRNA) species and, in response, generates a molecule capable of affecting levels of gene expression. The designed biomolecular computer works at a concentration of close to 10¹² computers μl⁻¹. The modularity of the design facilitates improving each component of the biomolecular computer independently. The authors demonstrated how computer regulation by other biological molecules such as proteins, and the output of other biologically active molecules such as RNA interference, can all be explored concurrently and independently. Progress in the development of molecular computers may lead to a 'Doctor in a Cell', that is, a biomolecular computer that operates inside the living


organism, for example the human body, programmed with medical information to diagnose potential diseases and produce the required drugs in situ. This will ultimately lead to a device capable of processing DNA inside the human body, finding abnormalities and creating healing drugs. However, major changes will be needed for the molecular computer to operate in vivo (Shapiro et al 2004).

The Shapiro Lab is renowned for the creation of biomolecular computing devices, which are so tiny that more than a trillion fit into one drop of water. These devices are made entirely of DNA and other biological molecules. A recent version was programmed by Shapiro and his research team to identify signs of specific cancers in a test tube, to diagnose the type of cancer and to release drug molecules in response. Though cancer-detecting computers are still in the very early stages, and can thus far only function in test tubes, Shapiro and his research team envision future biomolecular devices that may be injected directly into the human body to detect and prevent or cure disease. Recent research at the Shapiro Lab mainly deals with the energy consumption of a computer. They were able to construct a molecular computer whose sole energy source is its input, a combination that is unthinkable in the realm of electronic computers. This energy is extracted as the input data molecule is destroyed during computation (http:// udi/). Recently, they initiated the BioSPI project, which is concerned with developing predictive models for molecular and biochemical processes. Such processes, carried out by networks of proteins, mediate the interaction of cells with their environment and are responsible for most of the information processing inside cells. To this end, they developed a new computer system, called BioSPI, for the representation and simulation of biochemical networks (Shapiro et al 2002).

4.2. Solving problems using DNA computing

4.2.1. Finite state problems.
To compete with silicon, it is important to develop the capability of biomolecular computation to quickly execute basic operations, such as arithmetic and Boolean operations, that are executed in single steps by conventional machines. In addition, these basic operations should be executable in a massively parallel fashion (Reif 1998). Guarnieri and Bancroft (1999) developed a DNA-based addition algorithm employing successive primer extension reactions to implement the carries and the Boolean logic required in binary addition (similar methods can be used for subtraction). Guarnieri, Fliss, and Bancroft (Guarnieri et al 1996) prototyped the first biomolecular computation addition operations (on single bits) in recombinant DNA. They presented the development of a DNA-based algorithm for addition. The DNA representation of two non-negative binary numbers was presented in a form permitting a chain of primer extension reactions to carry out the addition operation. They demonstrated the feasibility of this algorithm by biochemically executing a simple example. However, it suffered from some limitations: (1) only two numbers were added, so it did not take advantage of the massive parallel processing capabilities of biomolecular computation; and (2) the outputs were encoded distinctly from the inputs, hence it did not allow for repetitive operations.
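The carries and Boolean logic that binary addition requires can be made concrete in software. The sketch below is an ordinary ripple-carry adder on lists of bits; it does not model primer extension or any DNA encoding, and the function names and the least-significant-bit-first representation are assumptions chosen for illustration.

```python
def full_adder(a, b, carry_in):
    """Return (sum_bit, carry_out) for one bit position using XOR, AND and OR."""
    sum_bit = a ^ b ^ carry_in
    carry_out = (a & b) | (carry_in & (a ^ b))
    return sum_bit, carry_out


def add_binary(x_bits, y_bits):
    """Add two non-negative numbers given as bit lists, least significant bit first."""
    n = max(len(x_bits), len(y_bits))
    x = list(x_bits) + [0] * (n - len(x_bits))  # pad the shorter operand with zeros
    y = list(y_bits) + [0] * (n - len(y_bits))
    result, carry = [], 0
    for a, b in zip(x, y):
        s, carry = full_adder(a, b, carry)  # the carry ripples to the next position
        result.append(s)
    result.append(carry)                    # final carry becomes the top bit
    return result


def to_int(bits):
    """Convert a least-significant-bit-first bit list back to an integer."""
    return sum(bit << i for i, bit in enumerate(bits))
```

Because `add_binary` returns its result in the same representation as its inputs, the output can be passed straight into another addition. This is the property that limitation (2) above says the early DNA implementation lacked.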

Subsequently proposed methods (Orlian et al 1998, Leete et al 1997, Gupta et al 1997) for basic operations such as arithmetic (addition and subtraction) allow the output of these operations to be chained into the inputs of further operations, and allow operations to be executed in a massively parallel fashion. Rubin et al (1997) presented an experimental demonstration of a biomolecular computation method for chained integer arithmetic.

4.2.2. Combinatorial problems.
DNA computing methods have been employed in complex computational problems such as the Hamiltonian path problem (HPP) (Adleman 1994), the maximal clique problem (Ouyang et al 1997), the satisfiability problem (SAT) (Liu et al 2000), and chess problems (Faulhammer et al 2000). The advantage of these approaches is the huge parallelism inherent in DNA-based computing, which has the potential to yield vast speedups over conventional electronic-based computers for such search problems. The computational problem considered by Adleman (1994) was a simple instance of the directed Hamiltonian path problem (HPP), a close relative of the travelling salesman problem (TSP). The technique used for solving the problem was a new technological paradigm, termed DNA computing. Adleman's experiment represents a landmark demonstration of data processing and communication on the level of biological molecules. It was the first DNA computer set up to solve the TSP. This problem uses the scenario of a door-to-door salesman who must visit several connected cities without going through any city twice. To solve this problem using DNA, the first step is to assign a genetic sequence to each city. For example, the city of Los Angeles might be coded GCACAGT. If two cities connect, then the connecting genetic sequence is assigned the first three letters of one city and the last three letters of the other. For example, if Los Angeles connected to New York, the first three letters of Los Angeles (GCA) would connect to the last three letters of New York (CGT).
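A small in-silico sketch can make this kind of encoding, and the generate-and-filter search it supports, more tangible. Everything below is illustrative: only the Los Angeles code GCACAGT and the New York ending CGT come from the text, while the other city codes, the connections in `EDGES`, the orientation of the connecting strand (last three letters of the departure city followed by the first three of the arrival city) and all function names are assumptions. Exhaustive enumeration stands in for the random joining of strands in a test tube.

```python
# Hypothetical city codes; only "GCACAGT" and the "CGT" ending appear in the text.
CITY_CODES = {
    "LA": "GCACAGT",
    "NY": "TTGACGT",
    "CHI": "ACTGGAC",
    "MIA": "CGATTCA",
}

# Hypothetical one-way connections between cities.
EDGES = [("LA", "CHI"), ("CHI", "MIA"), ("MIA", "NY"),
         ("LA", "MIA"), ("CHI", "NY"), ("NY", "CHI")]


def edge_strand(src, dst):
    """Assumed encoding: last three letters of the departure city
    followed by the first three letters of the arrival city."""
    return CITY_CODES[src][-3:] + CITY_CODES[dst][:3]


def candidate_walks(n_stops):
    """Enumerate every walk with n_stops cities that follows the connections.
    This exhaustive step stands in for random strand joining."""
    walks = [[city] for city in CITY_CODES]
    for _ in range(n_stops - 1):
        walks = [w + [dst] for w in walks for src, dst in EDGES if src == w[-1]]
    return walks


def hamiltonian_paths(start, end):
    """Filter step: keep walks with the right endpoints that visit each city once."""
    n = len(CITY_CODES)
    return [w for w in candidate_walks(n)
            if w[0] == start and w[-1] == end and len(set(w)) == n]


def path_strand(path):
    """Concatenate the connecting strands along a path."""
    return "".join(edge_strand(a, b) for a, b in zip(path, path[1:]))
```

For example, `hamiltonian_paths("LA", "NY")` filters the candidate walks down to those that start in LA, end in NY and visit every city exactly once. The number of candidate walks grows exponentially with the number of cities, which is the scaling problem that a purely exhaustive approach runs into.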
The TSP seems a simple puzzle; however, the most advanced supercomputers would take years to calculate the optimal route for 50 cities (Parker 2003). Adleman solved the problem for seven cities within a second, using DNA molecules in a standard reaction tube. He represented each of the seven cities as separate, single-stranded DNA molecules, 20 nucleotides long, and all possible paths between cities as DNA molecules composed of the last ten nucleotides of the departure city and the first ten nucleotides of the arrival city. Mixing the DNA strands with DNA ligase and adenosine triphosphate (ATP) resulted in the generation of all possible random paths through the cities. However, the majority of these paths were not applicable to the situation because they were either too long or too short, or they did not start or finish in the right city. Adleman then filtered out all the paths that did not start and end with the correct molecule and those that did not have the correct length and composition. Any remaining DNA molecules represented a solution to the problem. The DNA computer provides enormous parallelism: in one fiftieth of a teaspoon of solution, approximately 10¹⁴ DNA strands representing flight numbers were simultaneously concatenated in about one second. The Adleman approach to the HPP is shown in figure 6. An instance of the HPP which is solved

Topical Review

Start Generate all possible n-bit strings in S

Generate strands encoding random paths

Let j = 1, w and x represent literals

Keep only the potential solutions

Y j<=n? i=1 N Y S nonempty?

Monitor the quantities of DNA generated for the specific graph

The instance is satisfiable

Remove strands that do not encode the HPP

Identify uniquely the HP solution

Strand encodes HPP?>


Discard the strands

Generate a new set S by merging extracted strings, increment j

N i<=l?

Y wi = xj ?

Figure 6. Adlemans approach to HPP.

Extract from S strings encoding wi =0

Extract from S strings encodings wi = 1

1 6

Increment i

3 5

Figure 8. A solution to the SAT.

Figure 7. Instance of the HPP solved by Adleman.

by Adleman is depicted in gure 7, with the Hamiltonian path (HP) highlighted by a dashed line. The DNA sequences were set to replicate and create trillions of new sequences based on the initial input sequences in a matter of seconds (DNA hybridization). The theory holds that the solution to the problem was one of the new sequence strands. By process of elimination, the correct and nal solution would be found. Based on Adlemans method, the amount of DNA scales exponentially, for example, solving a 200-city TSP would take probably an amount of DNA that weighed more than the earth. The error rate for each operation is another hurdle for DNA computing as the number of iterations increases (Ryu 2000). Lipton (1995) argued that all NP (non-deterministic polynomial time) problems could be efciently reduced to the HPP. He also demonstrated how DNA computing solves a twovariable SAT problem. Lipton (1995) proposed a solution to the SAT. Figure 8 depicts the approach followed in order to solve the SAT problem. An initial set S contains many strings, each encoding a single n-bit sequence. All possible n-bit sequences are represented in S. An instance, I, of SAT consists of a set of clauses. The problem is to assign a Boolean value to a variable in W such that at least one variable in each clause has the value true. If this is the case then I is satisable. Sakamoto et al (1999) showed that many NP-complete problems can be solved by a single series of successive

transitions, combined with parallel overlap assembly and some other operations. A disadvantage of this approach is that additional time is needed on successive transitions when more transitions are required. Sakamoto et al (2000) also reported the use of hairpin formation by single-stranded DNA molecules in order to explore the feasibility of autonomous molecular computing. The SAT problem was solved by using molecular biology techniques. The SAT problem of a given Boolean formula was examined autonomously, based on hairpin formation by the molecules that represent the formula. Their computation algorithm can test several clauses in the given formula simultaneously, which could reduce the number of laboratory steps required for computation. As a computer method, the main reason for using DNA sequences comes from the fact that they could encode each possibility in a single DNA molecule. This meant that all possibilities took up comparatively small space for much larger N than in normal computing. Liu et al (2000) solved the SAT problem using only 91 steps to nd solutions, while a normal sequential computer would have slowly wound through 1.6 million steps. Further work is required to discover smaller molecules to serve the same purpose, and improved chemical methods to prepare and lter out the nal solutions. The impact of DNA computing on cryptography remains to be determined. Beaver (1994) has estimated that to factor a 1000-bit number following Adlemans original approach, the required amount of solution would be 10200 000 l. However, Adleman has observed that a DNA computer sufcient to search for 256 DES keys would occupy only a small set of test tubes (Adleman 1996). Subsequently to Adlemans experiment (Adleman 1994), various combinatorial problems have been solved using DNA R33

Topical Review

computing; however, apparently, there has been a lack of progress in solving NP-problems since 2000. 4.3. Classes of DNA computing Essentially, three classes of DNA computing are now apparent: (1) intramolecular, (2) intermolecular, and (3) supramolecular. The Japanese Project lead by Hagiya (Takahashi et al 2005) focuses on intramolecular DNA computing, constructing programmable state machines in single DNA molecules, which operate by means of intramolecular conformational transitions. Intermolecular DNA computing, of which Adlemans experiment is an example, focusing on the hybridization between different DNA molecules as a basic step of computations. Finally, supramolecular DNA computing, as pioneered by Winfree (2003), harnesses the process of selfassembly of rigid DNA molecules with different sequences to perform computations. 4.3.1. Example of intramolecular DNA computing. Sakamoto et al (1999) described one example of intramolecular implementation and named it successive localized polymerization, and it is described by a single-stranded DNA molecule of the form stopper state1 state1 stopper state2 state2 stopper staten staten , where in each pair (statei statei ) of states, statei denotes the state before a transition, and statei the state after the transition. Each state is represented by an appropriate number of bases, called a state sequence. This process of state transitions can be repeated in a single tube by a simple thermal program consisting of thermal cycles for denaturation, annealing, and polymerization. The state machine DNA is assumed to form a hairpin, and transitions occur in an intramolecular manner rather than intermolecular ones. This approach might enhance the power of Adlemans approach to DNA computing (Adleman 1994, Lipton 1995). For solving instances of NP-complete problems, they rst generate the space of candidate solutions in a tube, where each candidate is represented by a DNA molecule. 
Hybridization and ligation are employed for the generation of the candidate space; recently, the technique of parallel overlap assembly has also been used (Ouyang et al 1997). The candidate space is then explored by a number of laboratory steps that together implement the condition for a candidate to be a real solution. The intramolecular method can be employed in this second step of exploring the candidate space and extracting the real solutions. NP-complete problems are solved by a single series of successive state transitions as described above. Since a series of state transitions can be considered as one big step, the number of laboratory steps needed to explore the candidate space is constant, i.e. O(1).

4.3.2. Example of supramolecular DNA computing. Supramolecular assembly is the creation of molecular assemblies that are beyond the scale of one molecule. The self-assembly of small molecular building blocks programmed to form larger, nanometre-sized elements is an important goal of molecular nanotechnology. This approach is motivated by the magnificent examples occurring in nature: for instance, the supramolecular complex of the E. coli ribosome, consisting of 52 protein and three RNA molecules.

Innovation and application of supramolecular assemblies have reached impressive new heights. For example, organizations involving nucleic acids have been used for drug or DNA delivery, and can also be efficient as sensors for detection purposes. The interactions of various low-molecular-weight substances with DNA are naturally relevant mechanisms in the cellular cycle and so are also used in medicinal treatment (Bischoff and Hoffmann 2002). Depending on the particular drug structure, DNA-binding modes such as groove-binding, intercalating and/or stacking give rise to supramolecular assemblies of the polynucleotides, as well as influencing DNA-protein binding.

5. Intelligent systems based on DNA computing

5.1. Smart DNA chips
A gene expression experiment with a single DNA chip can provide a visual display of how thousands of genes are expressed simultaneously, together with a huge amount of information on those genes. This field has critical implications for vital pathogenetic applications such as drug design and disease classification. To capitalize on the abundance of new information made available by DNA chips, a key challenge remains: how to design and develop intelligent machine learning techniques that can effectively explore such a vast amount of information. Overfitting is a leading concern with machine learning approaches to DNA chip data. These medical data are characterized by class imbalance, non-linear response, high noise, and large numbers of attributes.

Pomeroy et al (2002) published DNA chip data for 60 cancer patients. Their attempts to model the data using unsupervised learning techniques were unsuccessful at predicting patient survival; however, they claim statistically significant success using nearest neighbour and other supervised learning techniques. Li et al (2001) obtained good results using three nearest neighbours after selecting genes with a multi-run evolutionary approach on similarly sized DNA expression data.

Intelligent DNA chips have been applied to the prediction and diagnosis of cancer, with the expectation that they will make both more accurate. To classify cancer precisely, Cho and Won (2003) had to select cancer-related genes, because the gene data extracted from DNA chips are very noisy. Their approach explored many features and classifiers on three benchmark datasets in order to evaluate systematically the performance of feature selection methods and of machine learning classifiers such as k-nearest neighbour and support vector machine.
Kung and Mak (2005) also studied intelligent DNA chips and showed that machine learning techniques offer a viable approach to identifying and classifying biologically relevant groups in genes and conditions. The enormous width of DNA gene chip data makes overfitting an ever-present danger, particularly with powerful machine learning approaches. Langdon and Buxton (2004) used genetic programming in combination with leave-one-out cross-validation and a principled objective function to evolve many non-linear functions of gene expression values. The approach was to whittle down the thousands of data attributes (gene expression measurements) into a few predictive ones.
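Two of the techniques named above, nearest-neighbour classification and leave-one-out cross-validation, can be illustrated with a small sketch. It does not reproduce the pipeline of any cited study: the "expression profiles" are synthetic random numbers, and the function names, the choice of three neighbours, and the Euclidean distance are assumptions made for illustration.

```python
import math
import random
from collections import Counter

def knn_predict(train, query, k=3):
    """Majority vote among the k training samples nearest to the query."""
    nearest = sorted(train, key=lambda s: math.dist(s[0], query))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]

def leave_one_out_accuracy(samples, k=3):
    """Hold out each sample in turn, classify it from the rest, and score the result."""
    hits = 0
    for i, (features, label) in enumerate(samples):
        rest = samples[:i] + samples[i + 1:]
        hits += knn_predict(rest, features, k) == label
    return hits / len(samples)

def make_toy_data(n_per_class=10, n_genes=5, seed=0):
    """Synthetic profiles (not real chip data): class 1 is shifted upwards in every gene."""
    rng = random.Random(seed)
    data = []
    for label, shift in ((0, 0.0), (1, 3.0)):
        for _ in range(n_per_class):
            data.append(([rng.gauss(shift, 1.0) for _ in range(n_genes)], label))
    return data

samples = make_toy_data()
print(leave_one_out_accuracy(samples, k=3))
```

Because each held-out sample is never part of its own training set, the reported accuracy is less optimistic than one measured on the training data, which is why this kind of validation is attractive when there are far more attributes than patients.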

Intelligent DNA memory has been designed by Chen et al (2004) as an attempt to capture global information about a population of organisms, or whole-genome gene expression under certain conditions. Furthermore, the DNA memory incorporates intelligent processing and reasoning capabilities into the test tube. After the data gathering and analysis stage is complete, the high storage capacity and parallelism of DNA are used to draw inferences on the entire in vitro knowledge base.

Sakakibara and Suyama (2000) proposed DNA chips with logical operations, called intelligent DNA chips. They combined the DNA computing method for representing and evaluating Boolean functions with the DNA Coded Number (DCN) method, and implemented DNA chips with executable logical operations. The developed DNA chips are considered intelligent because they not only detect gene expression but also find logical formulae of gene expressions. Such intelligent DNA chips would be able to provide logical inference for diagnoses based on detected gene expression patterns.

5.2. Applying artificial intelligence methods in DNA computing
Since Adleman's solution to the HPP (Adleman 1994), DNA and RNA solutions of some NP-complete problems have been proposed, such as the 3-SAT problem (Braich et al 2002), the maximal clique problem (Ouyang et al 1997), and the knight problem (Faulhammer et al 2000). The power of parallel, high-density computation by molecules in solution allows DNA computers to solve hard computational problems such as NP-complete problems in polynomially increasing time, while a conventional Turing machine requires exponentially increasing time (Impagliazzo et al 1998, Sakamoto et al 1999). However, all the current DNA computing strategies are based on enumerating all candidate solutions, and then using selection processes to eliminate incorrect DNA. This algorithm requires that the size of the initial data pool increases exponentially with the number of variables in the calculation.
For example, to calculate a DNA solution of an NP-complete problem, the number of molecules in the solution increases exponentially with respect to the problem size. As the problem size keeps increasing, the brute-force method becomes infeasible. Therefore, the design of artificial intelligence techniques in DNA computing will serve to break the barrier of this brute-force method and obtain a final solution from a very small initial data pool, avoiding the enumeration of all candidate solutions.

5.2.1. Evolutionary and genetic algorithms. Evolution is a concept of obtaining adaptation through the interplay of selection and diversity. The tendency of evolving populations to minimize the sampling of large numbers of low-fitness individuals suggests that a DNA-based evolutionary approach might be an effective alternative to exhaustive search. Of all evolution-inspired approaches, genetic algorithms (GAs) seem particularly suited to implementation using DNA. This is because genetic algorithms are generally based on manipulating populations of bit strings using both crossover and mutation operators (Chen et al 1999).
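The mechanics of a genetic algorithm acting on bit strings with crossover and mutation can be sketched in conventional software. This is an in silico illustration only, not a DNA protocol and not the method of any cited paper: the fitness function (counting 1s), the truncation selection scheme, and all parameter values are assumptions chosen to keep the example short.

```python
import random

def fitness(bits):
    """Illustrative fitness (an assumption): the number of 1s in the string."""
    return sum(bits)

def crossover(a, b, rng):
    """Single-point crossover of two equal-length bit strings."""
    point = rng.randrange(1, len(a))
    return a[:point] + b[point:]

def mutate(bits, rate, rng):
    """Flip each bit independently with probability `rate`."""
    return [1 - b if rng.random() < rate else b for b in bits]

def evolve(length=20, pop_size=30, generations=40, rate=0.02, seed=1):
    """Evolve a population of bit strings and return the fittest individual found."""
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(length)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        parents = pop[: pop_size // 2]  # truncation selection: keep the fitter half
        children = [
            mutate(crossover(rng.choice(parents), rng.choice(parents), rng), rate, rng)
            for _ in range(pop_size - len(parents))
        ]
        pop = parents + children  # parents survive unchanged, so the best never gets worse
    return max(pop, key=fitness)

best = evolve()
print(fitness(best), best)
```

Only a population of 30 strings is ever held at once, whereas exhaustive enumeration of 20-bit strings would need 2^20 candidates; this is the sense in which selection avoids building the full candidate pool.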

The combination of the massive parallelism and high storage density inherent in DNA computing with the direct search capability of GAs represents a major advantage of DNA-GA approaches. The GA is one of the possible ways to break the limit of the brute-force method in DNA computing (Yuan and Chen 2004). One gram of single-stranded DNA is approximately 1.8 × 10^21 nucleotides or about 10^22 bytes. Individuals and answers can be encoded in DNA molecules using binary representations. Larger populations can carry larger ranges of genetic diversity and hence can generate high-fitness chromosomes in fewer generations, thus effectively reducing the size of the search space. Furthermore, in vitro operations on DNA inherently involve errors. These are more tolerable in executing genetic algorithms than in executing deterministic algorithms. In a sense, errors may be regarded as a contributing factor to genetic diversity.

A DNA-based GA was proposed as an application of an evolution program searching for good encodings (Deaton et al 1997). Yoshikawa et al (1997) combined the DNA-encoding method with the pseudo-bacterial GA. Chen et al (1999) proposed the laboratory implementation of the DNA-GA for some simple problems such as the Max 1s, the royal road, and the cold war problems. Wood et al (1999) designed and implemented a DNA-based in vitro genetic algorithm for the Max 1s problem. Wood and Chen (1999) proposed and implemented a DNA strand design suited for the royal road problem using a genetic algorithm, where in vitro evolution started with a randomized population of DNA strands. A few years later, Rose et al (2002) proposed a DNA-based in vitro genetic algorithm for the HPP. Evolutionary and genetic DNA computing were also proposed to solve the maximum clique problem (Back et al 1999, Yuan and Chen 2004). Yuan and Chen (2004) designed a DNA-based GA for the maximal clique problem, which was capable of producing the correct solution within a few cycles with high probability.
Their simulation indicated that the time requirement of their approach was approximately a linear function of the number of vertices in the network.

Wood et al (2001) employed in vitro evolutionary DNA computing to learn game playing and find adaptive game-theoretic strategies. They applied their approach to the game of poker, constructing two single-stranded DNAs to represent the two possible plays. Stojanovic and Stefanovic (2003) designed a DNA computer named MAYA capable of playing tic-tac-toe. Ren et al (2003) proposed a new approach to the virus DNA-based evolutionary algorithm (VDNA-EA) to implement self-learning of a class of Takagi-Sugeno (TS) fuzzy controllers. The VDNA encoding method was used to encode the design parameters of the fuzzy controllers, which shortened the code length of the DNA chromosome. The frameshift decoding method was used to decode the DNA chromosome into the design parameters of the fuzzy controllers. These methods made the genetic operators capable of operating at the gene level within the VDNA-EA approach. Computer simulation demonstrated the effectiveness of this method in automatically designing a class of TS fuzzy controllers. Neural networks also represent other prospective candidates (Russo et al 1994, Farhat and Hernandez 1995) for making DNA computing more efficient. The design of artificial intelligence techniques in DNA computing should therefore help to break the barrier of the brute-force method and obtain a final solution from a very small initial data pool, avoiding the enumeration of all candidate solutions (Ezziane 2006).

5.2.2. Swarm intelligence. Apart from genetic algorithms and other evolutionary algorithms, which have promising potential for a variety of problems such as automatic system design for molecular nanotechnology (Hall 1997), another emerging technique is swarm intelligence, which is inspired by the collective intelligence of social animals such as birds, ants, fish and termites. These social animals require no leader. Their collective behaviours emerge from interactions among individuals, in a process known as self-organization. Each individual may not be intelligent, but together they perform complex collaborative behaviours. Typical uses of swarm intelligence are to assist the study of human social behaviour by observing other social animals and to solve various optimization problems (Bonabeau et al 1999, Eberhart et al 2001). There are three main types of swarm intelligence techniques: models of bird flocking, the ant colony optimization (ACO) algorithm, and the particle swarm optimization (PSO) algorithm.

Besides being a model of human social behaviour, the particle swarm (Kennedy and Eberhart 1995) is closely related to swarm intelligence. In the particle swarm, there is no central control: no one gives orders. Each particle is a simple agent acting upon local information. Yet the swarm as a whole is able to perform tasks whose degree of complexity is well beyond the capabilities of the individual. The particle swarm shows signs of self-organization. The interactions among the low-level components (particles) result in complex structures at the global level (swarm), making it possible for it to perform optimization of functions.
PSO was originally designed to simulate bird flocking in order to learn more about human social behaviour (Kennedy and Eberhart 1995). However, conventional particle swarm optimization relies on social interaction among particles through the exchange of detailed information on position and performance. In the physical world, this type of complex communication is not always possible. Recently, Kaewkamnerdpong and Bentley (2005) proposed a new swarm algorithm, called the perceptive particle swarm optimization (PPSO) algorithm. The PPSO algorithm extends the conventional PSO algorithm for applications in the physical world. This extension takes into consideration both the social interaction among particles and environmental interaction. The PPSO algorithm simulates the emerging collective intelligence of social insects more closely than the conventional PSO algorithm. It is designed to handle real-world physical control problems, including programming or controlling agents of nanotechnology, for example nanorobots or DNA computers.
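The conventional particle swarm described above, in which particles exchange exact position and performance information, can be sketched as follows. This is a minimal illustration of the widely used inertia-weight form of PSO and not of the PPSO algorithm; the objective function, parameter values, and names are assumptions chosen for the example.

```python
import random

def sphere(x):
    """Toy objective (an assumption): sum of squares, with minimum 0 at the origin."""
    return sum(v * v for v in x)

def pso(objective, dim=2, n_particles=15, iterations=100,
        inertia=0.7, c_personal=1.5, c_social=1.5, seed=0):
    """Minimize `objective`; returns the best position and value seen by the swarm."""
    rng = random.Random(seed)
    pos = [[rng.uniform(-5, 5) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    best_pos = [p[:] for p in pos]            # each particle's own best position
    best_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=best_val.__getitem__)
    swarm_pos, swarm_val = best_pos[g][:], best_val[g]  # best shared by the swarm
    for _ in range(iterations):
        for i in range(n_particles):
            for d in range(dim):
                # Velocity blends momentum, pull to own best, and pull to swarm best.
                vel[i][d] = (inertia * vel[i][d]
                             + c_personal * rng.random() * (best_pos[i][d] - pos[i][d])
                             + c_social * rng.random() * (swarm_pos[d] - pos[i][d]))
                pos[i][d] += vel[i][d]
            val = objective(pos[i])
            if val < best_val[i]:
                best_pos[i], best_val[i] = pos[i][:], val
                if val < swarm_val:
                    swarm_pos, swarm_val = pos[i][:], val
    return swarm_pos, swarm_val

position, value = pso(sphere)
print(position, value)
```

The shared `swarm_pos` is the "detailed information on position and performance" that every particle can read; it is exactly this global exchange that may be unavailable to physical agents.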

6. Conclusion
The main benefit of using DNA computers to solve complex problems is that different possible solutions are created all at once and in a parallel fashion. Humans and most electronic computers must attempt to solve the problem one process at a time. DNA itself provides the added benefits of being a cheap, energy-efficient resource. The increasing ability to design complex molecules and systems makes these models of computation increasingly of interest for nanotechnology and biological engineering, as well as for the fundamental understanding of biological processes. Important events in the field of DNA computing have opened up the possibility of exploiting the massive parallelism, high storage density, and nanostructures inherent in natural phenomena to solve computational problems.

There indeed remain tremendous scientific, engineering, and technological challenges to bring this paradigm to full fruition, and thus make DNA computing a competitive player in the landscape of practical computing (Garzon and Deaton 1999). The implementation of an intelligent system method such as a GA in DNA computing presents an attractive alternative for further evolutionary computation research by pushing the analogy into a fully fledged in vivo implementation. DNA computing is hence poised to enable feasible solutions of previously infeasible search problems by using newly available molecular biological technology (Garzon and Deaton 1999). DNA-based intelligent algorithms have potential advantages in many complex practical problems. The engineering and programming of biochemical circuits, in vivo and in vitro, could transform industries that use chemical and nanostructured materials. Information and algorithms appear to be central to biological organization and processes, from the storage and reproduction of genetic information to the control of developmental processes to the sophisticated computations performed by the nervous system.
Much as human technology uses electronic microprocessors to control electromechanical devices, biological organisms use biochemical circuits to control molecular and chemical events. Although the construction of biochemical circuits has been explored theoretically since the birth of molecular biology, practical experience with the capabilities and possible programming of biochemical algorithms is still in its infancy (Winfree 2003).

Bioelectronics is another sub-discipline, one that uses biological molecules such as bacteriorhodopsin in electronic or photonic devices (Gupta et al 2001). It seeks to exploit the growing technical ability to integrate biomolecules with electronics to develop a broad range of functional devices. An important research aspect is the development of the communication interface between the biological materials and electronic components. Bioelectronics research also seeks to use biomolecules to perform the electronic functions that semiconductor devices currently perform, thereby offering the potential to increase computing-microchip density sufficiently to continue Moore's law down to the nanometre level.

DNA computing has expanded the notion of what computation is. However, a practical mathematical problem that would justify the use of the massive parallelism achieved by DNA computations has not yet been identified. Therefore, we might have to wait some time for DNA to replace the silicon in our computers. Future DNA computing would provide exciting opportunities and open doors to new research problems in combinatorics, complexity theory and algorithms, intelligent manufacturing systems, complex molecular diagnostics and molecular process control (McCaskill 2000).

For the long term, one can speculate about the prospects for molecular computation. It seems likely that a single molecule of DNA can be used to encode the instantaneous description of a Turing machine, and that currently available protocols and enzymes could be used to induce successive sequence modifications, which would correspond to the execution of the machine. In the future, research in molecular biology may provide improved techniques for manipulating macromolecules. Research in chemistry may allow for the development of synthetic designer enzymes. One can imagine the eventual emergence of a general-purpose computer consisting of nothing more than a single macromolecule conjugated to a ribosome-like collection of enzymes that act on it (Manca 1999).

References
Adar R, Benenson Y, Linshiz G, Rosner A, Tishby N and Shapiro E 2004 Stochastic computing with biomolecular automata Proc. Natl Acad. Sci. USA 101 99605
Adleman L 1994 Molecular computation of solutions to combinatorial problems Science 266 10214
Adleman L M 1996 Statement, Cryptographer's Expert Panel, RSA Data Security Conf. (San Francisco, CA, Jan. 1996)
Alberti P and Mergny J L 2003 DNA duplex-quadruplex exchange as the basis for a nanomolecular machine Proc. Natl Acad. Sci. USA 100 156973
Back T, Kok J and Rozenberg G 1999 Cross-fertilization between evolutionary computation and DNA-based computing Proc. Congr. of Evolutionary Computation (Washington, DC: IEEE Computer Society Press) pp 9807
Beaver D 1994 Factoring: the DNA solution 4th Int. Conf. on the Theory and Applications of Cryptology (Wollongong, 1994) (Berlin: Springer) pp 41923
Benenson Y, Adar R, Paz-Elizur T, Livneh Z and Shapiro E 2003 DNA molecule provides a computing machine with both data and fuel Proc. Natl Acad. Sci. USA 100 21916
Benenson Y, Gil B, Ben-Dor U, Adar R and Shapiro E 2004 An autonomous molecular computer for logical control of gene expression Nature 429 4239
Benenson Y, Paz-Elizur T, Adar R, Keinan E, Livneh Z and Shapiro E 2001 Programmable and autonomous computing machine made of biomolecules Nature 414 4304
Benenson Y and Shapiro E 2004 Molecular computing machines Encyclopedia of Nanoscience and Nanotechnology ed J A Schwarz, C I Contescu and K Putyera (New York: Dekker) pp 204356
Bischoff G and Hoffmann S 2002 DNA-binding of drugs used in medicinal therapies Curr. Med. Chem. 9 32148
Bonabeau B, Dorigo M and Theraulaz G 1999 Swarm Intelligence: From Natural to Artificial Systems (Oxford: Oxford University Press)
Braich R S, Chelyapov N, Johnson C, Rothemund P W and Adleman L 2002 Solution of a 20-variable 3-SAT problem on a DNA computer Science 296 499502
Chen J, Antipov E, Lemieux B, Cedeno W and Wood D H 1999 DNA computing implementing genetic algorithms Proc. DIMACS Workshop on Evolution as Computation (Princeton, NJ, 1999) (Berlin: Springer) pp 3949

Chen J, Wang Y and Deaton R 2004 Large scale genomic monitoring or profiling using a DNA-based memory and microarrays 24th Army Science Conf. (Orlando, FL, 2004) (WWW document) SessionA/AP-06.pdf (accessed 27th May 2005)
Cho S B and Won H H 2003 Machine learning in DNA microarray analysis for cancer classification First Asia-Pacific Bioinformatics Conf. (Adelaide) CRPITV19Cho.pdf (accessed 20th October 2005)
Deaton R, Murphy R C, Rose J A, Garzon M, Franceschetti D R and Stevens S E Jr 1997 A DNA based implementation of an evolutionary search for good encodings for DNA computation IEEE Int. Conf. on Evolutionary Computation (Indianapolis, IL 1997) (Los Alamitos, CA: IEEE Computer Society Press) pp 26771
Eberhart R, Shi Y and Kennedy J 2001 Swarm Intelligence (San Mateo, CA: Morgan Kaufmann)
Ezziane Z 2006 Applications of artificial intelligence in bioinformatics: A review Expert Syst. Appl. 30 210
Farhat N H and Hernandez E D M 1995 Logistic networks with DNA-like encoding and interactions Int. Workshop on Artificial Neural Networks (Malaga, 1995) (Berlin: Springer) pp 21522
Faulhammer D, Cukras A R, Lipton R J and Landweber L F 2000 Molecular computation: RNA solutions to chess problems Proc. Natl Acad. Sci. USA 97 13859
Feng L, Park S H, Reif J H and Yan H 2003 A two-state DNA lattice switched by DNA nanoactuator Angew. Chem. Int. Edn 42 43426
Garzon M H and Deaton R J 1999 Biomolecular computing and programming IEEE Trans. Evolutionary Comput. 3 23650
Graham S 2003 New DNA computer functions sans fuel (WWW document) articleID=000A4F2E-781B-1E5A-A98A809EC5880105 (accessed 17th April 2005)
Guarnieri F and Bancroft C 1999 Use of a horizontal chain reaction for DNA-based addition DIMACS Series in Discrete Mathematics and Theoretical Computer Science vol 4, pp 10511
Guarnieri F, Fliss M and Bancroft C 1996 Making DNA add Science 273 2203
Gupta G, Mehra N and Chakraverty S 2001 DNA computing The Indian Programmer (WWW document) http:// computing.htm (accessed 4th May 2005)
Gupta V, Parthasarathy S and Zaki M J 1997 Arithmetic and logic operations with DNA Proc. 3rd DIMACS Workshop on DNA-based Computers (Philadelphia, USA) pp 21220
Hall J S 1997 Agoric/genetic methods in stochastic design 5th Foresight Conf. on Molecular Nanotechnology http://www.cs. josh/chsmith.html (accessed 16th October 2005)
Impagliazzo R, Paturi R and Zane F 1998 Which problems have strongly exponential complexity? Proc. 39th Annual Symp. on Foundations of Computer Science (Palo Alto, CA, 1998) (Washington, DC: IEEE Computer Society Press) pp 65363
Kaewkamnerdpong B and Bentley P J 2005 Perceptive particle swarm optimization Proc. Int. Conf. on Adaptive and Natural Computing Kaewkamnerdpong/bkpb-icannga05.pdf (accessed 17th October 2005)
Kennedy J and Eberhart R 1995 Particle swarm optimization Proc. IEEE Int. Conf. on Neural Networks pp 19428
Kung S Y and Mak M W 2005 A machine learning approach to DNA microarray biclustering analysis (WWW document) mwmak/papers/mlsp05.pdf (accessed 19th October 2005)
Langdon W B and Buxton B F 2004 Genetic programming for mining DNA chip data from cancer patients Genetic Programming and Evolvable Machines 5 (3) 2517


Leete T, Klein J, Salem J S and Rubin H 1997 Bit operations using a DNA template Proc. 3rd DIMACS Workshop on DNA-based Computers (Philadelphia) pp 159366
Li J and Tan W 2002 A single DNA molecule nanomotor Nano Lett. 2 3158
Li L, Weinberg C R, Darden T A and Pedersen L G 2001 Gene selection for sample classification based on gene expression data: Study of sensitivity to choice of parameters of the GA/KNN method Bioinformatics 17 113142
Lipton R J 1995 DNA solution of hard computational problem Science 268 5425
Liu D and Balasubramanian S 2003 A proton fuelled DNA nanomachine Angew. Chem. Int. Edn 42 57346
Liu Q, Wang L, Frutos A G, Condon A E, Corn R M and Smith L M 2000 DNA computing on surfaces Nature 403 1759
Manca V 1999 The logic of molecule manipulation systems (WWW document) (accessed 10th March 2005)
Mao C, Labean T H, Reif J H and Seeman N C 2000 Logical computation using algorithmic self-assembly of DNA triple-crossover molecules Nature 407 4936
Mao C, Sun W, Shen Z and Seeman N C 1999 A nanomechanical device based on the B-Z transition of DNA Nature 397 1446
McAdams H H and Arkin A 1997 Stochastic mechanisms in gene expression Proc. Natl Acad. Sci. USA 94 8149
McCaskill J 2000 Biomolecular computing (WWW document) News/enw43/mc caskill1.html (accessed 20th May 2005)
Niemeyer N C and Adler M 2002 Nanomechanical devices based on DNA Angew. Chem. Int. Edn 41 377983
Orlian M, Guarnieri F and Bancroft C 1998 Parallel primer extension horizontal chain reactions as a paradigm of parallel DNA-based computation DIMACS: Series in Discrete Mathematics and Theoretical Computer Science pp 142158
Ouyang Q, Kaplan P D, Liu S and Libchaber A 1997 DNA solution of the maximal clique problem Science 278 4469
Parker J 2003 Computing with DNA EMBO Rep. 4 710
Pomeroy S L et al 2002 Prediction of central nervous system embryonal tumour outcome based on gene expression Nature 415 43642
Reif J H 1998 Paradigms for biomolecular computation (WWW document) reif/paper/paradigm.pdf (accessed 21st October 2005)
Reif J H 2003 The design of autonomous DNA nanomechanical devices: Walking and rolling DNA Lecture Notes in Computer Science vol 2568, pp 2237
Reif J H, LaBean T H, Sahu S, Yan H and Yin P 2005 Design, simulation, and experimental demonstration of self-assembled DNA nanostructures and motors Lecture Notes in Computer Science vol 3566, pp 16680
Ren L, Ding Y, Ying H and Shao S 2003 Emergence of self-learning fuzzy systems by a new virus DNA-based evolutionary algorithm Int. J. Intell. Systems 18 33954
Rose J A, Hagiya M, Deaton R J and Suyama A 2002 DNA-based in vitro genetic program J. Biol. Phys. 28 4938
Rothemund P W, Papadakis N and Winfree E 2004 Algorithmic self-assembly of DNA Sierpinski triangles PLoS Biol. 2 (12)
Rubin H, Klein J and Leete T 1997 A biomolecular implementation of logically reversible computation with minimal energy dissipation Proc. 3rd DIMACS Workshop on DNA-based Computers (Philadelphia) pp 15966
Russo M F, Huff A C, Heckler C E and Evans A C 1994 An improved DNA encoding scheme for neural network modeling Int. Neural Network Society Annual Mtg (San Diego, CA, 1994) pp I/3549

Ryu W 2000 DNA computing: a primer (WWW document) http:// (accessed 27th February 2005)
Sakakibara Y and Suyama A 2000 Intelligent DNA chips: Logical operation of gene expression profiles on DNA computers Genome Informatics 11 33-42
Sakamoto K, Gouzu H, Komiya K, Kiga D, Yokoyama S, Yokomori T and Hagiya M 2000 Molecular computation by DNA hairpin formation Science 288 1223-6
Sakamoto K, Kiga D, Komiya K, Gouzu H, Yokoyama S, Ikeda S, Sugiyama H and Hagiya M 1999 State transitions by molecules Biosystems 52 81-91
Seelig G 2004 DNA hybridization catalysts and catalyst circuits 10th Int. Mtg on DNA Based Computers (Milan, 2004) (Berlin: Springer) pp 202-13
Seeman N C 2003 DNA in a material world Nature 421 427-31
Shapiro E, Adar R, Benenson K, Linshitz G, Regev A and Silverman W 2002 Molecules and computation (WWW document) day 2002/book/ehud shapiro.pdf (accessed October 22nd 2005)
Shapiro E, Adar R, Benenson Y, Linshiz G, Wasserstrom A, Frumkin D, Tuvi S, Gronau I, Gil G and Ben-Dor U 2004 Molecular computer: doctor in a test tube (WWW document) day/book/author/e shapiro.pdf (accessed 27th May 2005)
Sherman W B and Seeman N C 2004 A precisely controlled DNA biped walking device Nano Lett. 4 1203-7
Shin J and Pierce N 2004 A synthetic DNA walker for molecular transport J. Am. Chem. Soc. 12 10834-5
Simmel F C and Yurke B 2001 Using DNA to construct and power a nanoactuator Phys. Rev. E 63 041913
Simmel F C and Yurke B 2002 A DNA-based molecular device switchable between three distinct mechanical states Appl. Phys. Lett. 80 883-5
Stojanovic M N and Stefanovic D 2003 A deoxyribozyme-based molecular automaton Nat. Biotechnol. 21 1069-74
Sung H P, Yan H, Reif J H, LaBean T H and Finkelstein G 2004 Electronic nanostructures templated on self-assembled DNA scaffolds Nanotechnology 15 S525-7
Takahashi K, Yaegashi S, Asanuma H and Hagiya M 2005 Photo- and thermoregulation of DNA nanomachines DNA11, 11th Int. Mtg on DNA Computing (WWW document) http:// (accessed 15th May 2005) at press
Thaker E 2004 Biomedia (Minneapolis: University of Minnesota Press)
Turberfield A 2003 DNA as an engineering material Phys. Rev. Lett. 16 43-6
Turberfield A J and Mitchel J C 2003 DNA fuel for free-running nanomachines Phys. Rev. Lett. 90 118102
Winfree E 2003 DNA computing by self-assembly National Academy of Engineering: The Bridge 33 31-8
Winfree E, Liu F, Wenzler L A and Seeman N C 1998 Design and self assembly of two-dimensional DNA crystals Nature 394 539-44
Wood D H, Bi H, Kimbrough S O, Wu D and Chen J 2001 DNA starts to learn Poker Proc. 7th Int. Mtg on DNA-based Computers (Tampa, FL, 2001) (Berlin: Springer) pp 23-32
Wood D H and Chen J 1999 Physical separation of DNA according to royal road fitness IEEE Conf. on Evolutionary Computation (Washington, DC, 1999) (Los Alamitos, CA: IEEE Computer Society Press) pp 1016-25
Wood D H, Chen J, Antipov E, Cedeño W and Lemieux B 1999 A DNA implementation of the Max 1s problem Proc. Genetic and Evolutionary Computation Conf. (Orlando, FL 1999) (San Francisco: Morgan Kaufmann) pp 1835-42
Yan H, Zhang X, Shen Z and Seeman N C 2002 A robust DNA mechanical device controlled by hybridization topology Nature 415 62-5



Yin P, Turberfield A J, Sahu S and Reif J H 2005a Design of an autonomous DNA nanomechanical device capable of universal computation and universal translational motion Lecture Notes in Computer Science vol 3384, pp 426-44
Yin P, Turberfield A J, Sahu S and Reif J H 2005b Designs of autonomous unidirectional walking DNA devices Lecture Notes in Computer Science vol 3384, pp 410-25
Yoshikawa T, Furuhashi T and Uchikawa Y 1997 The effects of combination of DNA coding method with pseudo-bacterial GA Proc. Int. Conf. on Evolutionary Computation (Indianapolis, IN, 1997) (Piscataway, NJ: IEEE Computer Press) pp 285-90
Yuan L and Chen F 2004 Genetic algorithm in DNA computing: A solution to the maximal clique problem Chin. Sci. Bull. 49 967-71
Yurke B, Turberfield A J Jr, Mills A P, Simmel F C and Neumann J L 2000 A DNA-fuelled molecular machine made of DNA Nature 406 605-8