Researchers at Caltech have developed an artificial neural network made out of DNA that can solve a classic machine learning problem: correctly identifying handwritten numbers. The work is a significant step in demonstrating the capacity to program artificial intelligence into synthetic biomolecular circuits.
The work was done in the laboratory of Lulu Qian, assistant professor of bioengineering. A paper describing the research appears online on July 4 and in the July 19 print issue of the journal Nature.
"Though scientists have only just begun to explore creating artificial intelligence in molecular machines, its potential is already undeniable," says Qian. "Similar to how electronic computers and smart phones have made humans more capable than a hundred years ago, artificial molecular machines could make all things made of molecules, perhaps including even paint and bandages, more capable and more responsive to the environment in the hundred years to come."
Artificial neural networks are mathematical models inspired by the human brain. Despite being much simplified compared to their biological counterparts, artificial neural networks function like networks of neurons and are capable of processing complex information. The Qian laboratory's ultimate goal for this work is to program intelligent behaviors (the ability to compute, make choices, and more) with artificial neural networks made out of DNA.
"Humans each have over 80 billion neurons in the brain, with which they make highly sophisticated decisions. Smaller animals such as roundworms can make simpler decisions using just a few hundred neurons. In this work, we have designed and created biochemical circuits that function like a small network of neurons to classify molecular information substantially more complex than previously possible," says Qian.
To illustrate the capability of DNA-based neural networks, Qian laboratory graduate student Kevin Cherry chose a task that is a classic challenge for electronic artificial neural networks: recognizing handwriting.
Human handwriting can vary widely, and so when a person scrutinizes a scribbled sequence of numbers, the brain performs complex computational tasks in order to identify them. Because it can be difficult even for humans to recognize others' sloppy handwriting, identifying handwritten numbers is a common test for programming intelligence into artificial neural networks. These networks must be "taught" how to recognize numbers, account for variations in handwriting, then compare an unknown number to their so-called memories and decide the number's identity.
Key to creating biomolecular circuits out of DNA are the strict binding rules between molecules of DNA. A single-stranded DNA molecule is composed of smaller molecules called nucleotides (abbreviated A, T, C, and G) arranged in a string, or sequence. The nucleotides in a single-stranded DNA molecule can bond with those of another single strand to form double-stranded DNA, but the nucleotides bind only in very specific ways: an A nucleotide pairs with a T, and a C nucleotide pairs with a G.
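The pairing rules can be illustrated with a minimal Python sketch. The function names and the example sequences are hypothetical and chosen only for illustration; none of them comes from the paper. The sketch also relies on a standard fact not spelled out above: the two strands of a double helix run in opposite directions, so a strand's partner is its reverse complement.

```python
# Watson-Crick pairing rules: A pairs with T, and C pairs with G.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def reverse_complement(strand: str) -> str:
    """Return the sequence that pairs with `strand` along its full length.

    The partner strand runs in the opposite direction, so the bases are
    complemented and the order is reversed.
    """
    return "".join(COMPLEMENT[base] for base in reversed(strand))


def can_hybridize(strand_a: str, strand_b: str) -> bool:
    """True if the two strands are perfectly complementary (illustrative check only)."""
    return strand_b == reverse_complement(strand_a)


# Hypothetical example sequences, not taken from the paper.
print(reverse_complement("ATCG"))
print(can_hybridize("AAGC", "GCTT"))
```

Real strand design also has to account for partial matches, strand length, and temperature, which this check ignores.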
Taking advantage of these predictable binding rules, Qian and her colleagues can design short strands of DNA to undergo predictable chemical reactions in a test tube and thereby compute tasks, such as molecular pattern recognition. In 2011, Qian and her colleagues created the first artificial neural network made of DNA molecules that could recognize four simple patterns.
In the work described in the Nature paper, Cherry, who is the first author on the paper, demonstrated that a neural network made out of carefully designed DNA sequences could carry out prescribed chemical reactions to accurately identify "molecular handwriting." Unlike visual handwriting that varies in geometrical shape, each example of molecular handwriting does not actually take the shape of a number. Instead, each molecular number is made up of 20 unique DNA strands chosen from 100 molecules, each assigned to represent an individual pixel in any 10 by 10 pattern. These DNA strands are mixed together in a test tube.
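One way to picture this encoding is the following Python sketch, which turns a 10 by 10 grid of on/off pixels into an unordered set of strand labels. The labels ("X0" through "X99") and the function name are assumptions made for illustration, not the naming used in the paper.

```python
GRID_SIZE = 10  # a 10 by 10 pattern gives 100 possible pixel strands


def pattern_to_strands(pattern):
    """Map a 10x10 grid of 0/1 values to the set of strand labels for 'on' pixels.

    Each pixel position gets its own hypothetical label, "X0" to "X99".
    The result is a set: like strands mixed in a test tube, it keeps no
    record of where each pixel sat in the grid.
    """
    if len(pattern) != GRID_SIZE or any(len(row) != GRID_SIZE for row in pattern):
        raise ValueError("pattern must be 10 by 10")
    return {
        f"X{row * GRID_SIZE + col}"
        for row in range(GRID_SIZE)
        for col in range(GRID_SIZE)
        if pattern[row][col]
    }


# A made-up pattern with exactly 20 'on' pixels (the top two rows).
example = [[1] * GRID_SIZE if r < 2 else [0] * GRID_SIZE for r in range(GRID_SIZE)]
strands = pattern_to_strands(example)
print(len(strands))
```

The example pattern is not a real digit; it is only there to show that 20 "on" pixels yield 20 distinct strands out of the 100 available.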
"The lack of geometry is not uncommon in natural molecular signatures yet still requires sophisticated biological neural networks to identify them: for example, a mixture of unique odor molecules comprises a smell," says Qian.
Given a particular example of molecular handwriting, the DNA neural network can classify it into up to nine categories, each representing one of the nine possible handwritten digits from 1 to 9.
First, Cherry built a DNA neural network to distinguish between handwritten 6s and 7s. He tested 36 handwritten numbers and the test tube neural network correctly identified all of them. His system theoretically has the capability of classifying over 12,000 handwritten 6s and 7s—90 percent of those numbers taken from a database of handwritten numbers used widely for machine learning—into the two possibilities.
Crucial to this process was encoding a "winner take all" competitive strategy using DNA molecules, developed by Qian and Cherry. In this strategy, a particular type of DNA molecule dubbed the annihilator was used to select a winner when determining the identity of an unknown number.
"The annihilator forms a complex with one molecule from one competitor and one molecule from a different competitor and reacts to form inert, unreactive species," says Cherry. "The annihilator quickly eats up all of the competitor molecules until only a single competitor species remains. The winning competitor is then restored to a high concentration and produces a fluorescent signal indicating the network's decision."
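The competition Cherry describes can be sketched as a toy numerical model in Python. This is a simplified illustration under stated assumptions, not the reaction kinetics from the paper: every pair of competitor species is assumed to annihilate at the same rate, the concentrations are unitless, and the function name, rate constant, step size, and starting values are all hypothetical.

```python
def winner_take_all(concentrations, rate=1.0, dt=0.01, steps=10_000):
    """Toy pairwise-annihilation model of a winner-take-all competition.

    Assumption: each species is consumed in proportion to how often it
    meets any *other* species, mimicking an annihilator that removes one
    molecule from each of two different competitors at a time.
    """
    levels = list(concentrations)
    for _ in range(steps):
        total = sum(levels)
        # Simple Euler step: loss of species x is rate * x * (sum of the others).
        levels = [x - dt * rate * x * (total - x) for x in levels]

    # The species with the most material left is the winner.
    winner = max(range(len(levels)), key=lambda i: levels[i])

    # Restoration step: the winner is brought back to a high "on" level,
    # while the losers stay off.
    restored = [1.0 if i == winner else 0.0 for i in range(len(levels))]
    return winner, levels, restored


# Hypothetical starting concentrations for three competitors.
winner, remaining, signal = winner_take_all([0.6, 0.3, 0.1])
print(winner, signal)
```

In this model the competitor that starts out highest is the only one left with a meaningful concentration, which is the behavior the restoration step then amplifies into a readable signal.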
Next, Cherry built upon the principles of his first DNA neural network to develop a more complex one that could classify single-digit numbers 1 through 9. When given an unknown number, this "smart soup" would undergo a series of reactions and output two fluorescent signals: for example, green and yellow to represent a 5, or green and red to represent a 9.
Qian and Cherry plan to develop artificial neural networks that can learn, forming "memories" from examples added to the test tube. This way, Qian says, the same smart soup can be trained to perform different tasks.
"Common medical diagnostics detect the presence of a few biomolecules, for example cholesterol or blood glucose," says Cherry. "Using more sophisticated biomolecular circuits like ours, diagnostic testing could one day include hundreds of biomolecules, with the analysis and response conducted directly in the molecular environment."
The paper is titled "Scaling up molecular pattern recognition with DNA-based winner-take-all neural networks." Funding was provided by the National Science Foundation, the Burroughs Wellcome Fund, and the Shurl and Kay Curci Foundation.