New technology has been developed that converts digital binary files into a genetic alphabet, bringing DNA storage closer to reality.
Researchers based at the Los Alamos National Laboratory have created a new codec that minimizes the error rate when writing to molecular storage, while making it easier to troubleshoot potential problems.
“Our software, the Adaptive DNA Storage Codex (ADS Codex), translates data from what a computer understands into what biology understands,” said Latchesar Ionkov, who heads the project. “It’s like translating from English to Chinese, but more difficult.”
The Los Alamos team is part of the larger Molecular Information Storage (MIST) program. The immediate goal of the project is to develop DNA storage technologies capable of writing 1 TB and reading 10 TB in 24 hours, at a cost of less than $1,000.
Once all the issues are resolved, DNA storage could provide a way to store large amounts of data at low cost, which will be vital in the years to come as the amount of data produced continues to increase.
Compared to tape storage, which is used today for archival purposes, DNA is much denser, does not degrade as quickly, and does not require any maintenance.
“DNA offers a promising solution over tape, the most popular cold storage method, which is a technology dating back to 1951,” said Bradley Settlemyer, another researcher at Los Alamos.
“DNA storage could disrupt the way you think about archival storage because the data retention is so long and the data density so high. You can store all of YouTube in your fridge, rather than acres and acres of data centers.”
However, Settlemyer also warned of the various “formidable technological hurdles” that will need to be overcome before DNA storage can materialize, largely to do with the interoperability of different technologies.
The Los Alamos team is specifically focused on issues related to encoding and decoding information, as binary 0s and 1s are translated into the four-letter genetic alphabet (A, C, G, and T) and vice versa.
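To make the translation step concrete, here is a minimal sketch of the simplest possible mapping, in which every two bits become one letter. This is an illustration only, not the scheme ADS Codex uses; the table `BASES` and the function names `bytes_to_dna` and `dna_to_bytes` are hypothetical.

```python
# Hypothetical 2-bits-per-letter mapping (00=A, 01=C, 10=G, 11=T).
# An illustrative assumption, not the actual ADS Codex encoding.
BASES = "ACGT"


def bytes_to_dna(data: bytes) -> str:
    """Translate binary data into a string of A, C, G and T."""
    letters = []
    for byte in data:
        # Walk the byte two bits at a time, most significant pair first.
        for shift in (6, 4, 2, 0):
            letters.append(BASES[(byte >> shift) & 0b11])
    return "".join(letters)


def dna_to_bytes(seq: str) -> bytes:
    """Translate a string of A, C, G and T back into binary data."""
    if len(seq) % 4 != 0:
        raise ValueError("sequence length must be a multiple of 4")
    out = bytearray()
    for i in range(0, len(seq), 4):
        byte = 0
        for letter in seq[i:i + 4]:
            byte = (byte << 2) | BASES.index(letter)
        out.append(byte)
    return bytes(out)


print(bytes_to_dna(b"Hi"))  # CAGACGGC
```

With this mapping each byte becomes exactly four letters, so a single lost or extra letter shifts everything after it, which is why insertions and deletions are so damaging.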
ADS Codex is designed to combat the natural errors that occur when letters are accidentally added to or removed from the series that makes up a DNA sequence. When this data is converted back to binary, the codec checks for anomalies and, if one is detected, adds or removes letters from the string until the data can be verified.
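The detect-then-repair idea can be sketched in a few lines. The example below is a simplified assumption rather than the ADS Codex algorithm: it appends a CRC-32 checksum to the data, handles only a single added or missing letter, and searches by brute force. All function names are hypothetical.

```python
import zlib

# Hypothetical 2-bits-per-letter mapping, not the ADS Codex encoding.
BASES = "ACGT"


def bytes_to_dna(data: bytes) -> str:
    return "".join(BASES[(b >> s) & 0b11] for b in data for s in (6, 4, 2, 0))


def dna_to_bytes(seq: str) -> bytes:
    out = bytearray()
    for i in range(0, len(seq), 4):
        byte = 0
        for letter in seq[i:i + 4]:
            byte = (byte << 2) | BASES.index(letter)
        out.append(byte)
    return bytes(out)


def encode_with_check(payload: bytes) -> str:
    """Append a 4-byte CRC-32 to the payload, then translate to letters."""
    crc = zlib.crc32(payload).to_bytes(4, "big")
    return bytes_to_dna(payload + crc)


def verify(seq: str):
    """Return the payload if the checksum matches, otherwise None."""
    if len(seq) < 16 or len(seq) % 4 != 0:
        return None
    raw = dna_to_bytes(seq)
    payload, crc = raw[:-4], raw[-4:]
    if zlib.crc32(payload).to_bytes(4, "big") == crc:
        return payload
    return None


def decode_with_repair(seq: str):
    """Decode seq, trying single-letter removals and additions if needed."""
    payload = verify(seq)
    if payload is not None:
        return payload
    # Undo a possible accidental insertion: remove one letter at a time.
    for i in range(len(seq)):
        payload = verify(seq[:i] + seq[i + 1:])
        if payload is not None:
            return payload
    # Undo a possible accidental deletion: add one letter at a time.
    for i in range(len(seq) + 1):
        for letter in BASES:
            payload = verify(seq[:i] + letter + seq[i:])
            if payload is not None:
                return payload
    return None  # could not be verified under this simple model


seq = encode_with_check(b"hello DNA")
damaged = seq[:5] + seq[6:]  # one letter lost
print(decode_with_repair(damaged))
```

A real codec would need to handle multiple errors and substitutions and would use a far more efficient search; the sketch only shows how a checksum lets the decoder tell when a candidate repair has succeeded.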
ADS Codex version 1.0 is now finalized and will soon be used to assess the performance of systems built by other members of the MIST project.
Via Storage Newsletter