Amphithéâtre Marguerite de Navarre, Site Marcelin Berthelot
Open to all

Abstract

Storing digital data is becoming a challenge for humanity because storage devices have relatively short lifespans. Moreover, the exponential growth in the volume of digital data being generated means that new resources must constantly be built to archive it. Recent studies suggest DNA as a promising new storage medium, which could theoretically hold 215 petabytes in a single gram. Any digital information can be synthesized into DNA in vitro and stored in special tiny capsules that offer storage reliability of several hundred years. The stored DNA sequence can be retrieved at any time using special machines called "sequencers". The overall process remains challenging, because DNA synthesis is costly and sequencing is error-prone. However, studies have shown that following certain rules during encoding reduces the probability of sequencing errors. Consequently, encoding digital information is not trivial, and input data must be efficiently compressed before encoding to reduce the high cost of synthesis.

In this presentation, we will discuss the state of the art in DNA data storage for the efficient encoding of digital data in a quaternary code built from the four DNA bases: A (adenine), T (thymine), C (cytosine) and G (guanine). We will also present a promising new solution for encoding digital images in synthetic DNA, which we have been developing at the I3S laboratory over the past five years. This solution takes into account the constraints associated with storing data on DNA while optimizing the trade-off between compression quality and synthesis cost.
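
To make the idea of a quaternary code concrete, here is a minimal sketch in Python. It is not the encoder presented in the talk. It assumes the simplest possible mapping, two bits per base, and the function names are hypothetical. The abstract does not name the encoding rules, so the two checks shown (length of runs of identical bases and GC content) are assumptions, chosen because they are commonly cited constraints in the DNA storage literature.

```python
# Illustrative only: a naive 2-bits-per-base mapping, not the speaker's method.
BASES = "ACGT"  # assumed mapping: 00 -> A, 01 -> C, 10 -> G, 11 -> T


def bytes_to_dna(data: bytes) -> str:
    """Map each byte to four bases, most significant bit pair first."""
    return "".join(
        BASES[(byte >> shift) & 0b11]
        for byte in data
        for shift in (6, 4, 2, 0)
    )


def dna_to_bytes(strand: str) -> bytes:
    """Invert bytes_to_dna; the strand length must be a multiple of 4."""
    if len(strand) % 4 != 0:
        raise ValueError("strand length must be a multiple of 4")
    out = bytearray()
    for i in range(0, len(strand), 4):
        byte = 0
        for base in strand[i:i + 4]:
            byte = (byte << 2) | BASES.index(base)
        out.append(byte)
    return bytes(out)


def longest_homopolymer(strand: str) -> int:
    """Length of the longest run of one repeated base (assumed constraint)."""
    longest = run = 0
    previous = None
    for base in strand:
        run = run + 1 if base == previous else 1
        longest = max(longest, run)
        previous = base
    return longest


def gc_content(strand: str) -> float:
    """Fraction of bases that are G or C (assumed constraint)."""
    if not strand:
        return 0.0
    return sum(base in "GC" for base in strand) / len(strand)


strand = bytes_to_dna(b"Hi")
print(strand, longest_homopolymer(strand), gc_content(strand))
```

With this naive mapping, a single zero byte becomes AAAA, a run of four identical bases. A constrained code would avoid such patterns, which illustrates why encoding for DNA is not a direct transcription of bits.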

Speaker(s)

Marco Antonini

CNRS Research Director