Lecture

Information biology - a dialogue between informatics and biology


Walter Fontana presents his lecture in the series les courTs du Collège de France

The 2019-2020 Computer Sciences chair at the Collège de France aims to highlight computational biology. The term is often understood as "bioinformatics": the practice of computing in the service of organizing, searching and analyzing large datasets for predictive understanding. An extraordinary amount of work has been done in this field, and I think it has received enough attention that organizing yet another lecture course on it would be hard to justify (not to mention my lack of expertise). Instead, I wanted to reinterpret the meaning of "computational" by focusing on the representation of biomolecular systems (mainly made up of proteins) as a set of "if-then" rules whose pre- and post-conditions capture, at a higher level of abstraction, empirical results about interaction mechanisms.

For example: "if the RGS domain of Axin is bound to a SAMP domain of APC, and GSK is bound to Axin, and beta-catenin is bound to a twenty-amino-acid repeat domain of APC, then GSK phosphorylates beta-catenin." If this seems meaningless, that is because it is. It is an empirical fact devoid of meaning (other than what it asserts), because it is a single Lego brick without companions. However, when dozens or hundreds of facts of the same nature are combined, they begin to interlock dynamically, revealing system-level behavior. This modeling style treats a model as a program written in a domain-specific programming language. Models of this type can be built, debugged and analyzed like programs. The formal foundations of rule-based modeling lie in graph transformation, since the pre- and post-conditions of rules are expressed as graphs of special types.
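To make the idea of an "if-then" rule with pre- and post-conditions concrete, here is a minimal Python sketch of the example rule above. It is an illustration only, not the Kappa language and not the lecturer's formalism: the `Rule` class, the `bond` helper and the site name "binding" are hypothetical names introduced here, and the sketch checks for an exact set of bonds rather than performing the graph pattern matching that real rule-based modeling relies on.

```python
from dataclasses import dataclass


def bond(end_a, end_b):
    """An undirected bond between two (protein, site) endpoints."""
    return frozenset([end_a, end_b])


@dataclass(frozen=True)
class Rule:
    name: str
    precondition: frozenset   # bonds that must all be present
    postcondition: frozenset  # (protein, mark) facts added when the rule fires

    def applies(self, bonds):
        """True when every bond required by the precondition is present."""
        return self.precondition <= bonds

    def fire(self, bonds, marks):
        """Return the updated marks; unchanged if the precondition fails."""
        return marks | self.postcondition if self.applies(bonds) else marks


# The site name "binding" is a placeholder: the text does not name these sites.
PHOSPHORYLATION = Rule(
    name="GSK phosphorylates beta-catenin",
    precondition=frozenset({
        bond(("Axin", "RGS"), ("APC", "SAMP")),
        bond(("GSK", "binding"), ("Axin", "binding")),
        bond(("beta-catenin", "binding"), ("APC", "20aa-repeat")),
    }),
    postcondition=frozenset({("beta-catenin", "phosphorylated")}),
)

# A state containing the full complex, and one in which GSK is not bound.
full_complex = PHOSPHORYLATION.precondition
without_gsk = full_complex - {bond(("GSK", "binding"), ("Axin", "binding"))}

marks_full = PHOSPHORYLATION.fire(full_complex, frozenset())
marks_partial = PHOSPHORYLATION.fire(without_gsk, frozenset())
```

With all three bonds present the rule fires and records the phosphorylation; remove any one bond and the state is left unchanged. A single rule like this says little on its own, which is the point of the Lego-brick remark: behavior emerges only once many such rules act on a shared state.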

Thus, from this angle, the term "computational" acquires connotations of programming language theory. Two lecture units are therefore devoted to defining a rule-based language called "Kappa", with examples of its application. However, I feel that there must be a wider story that justifies the interest in this type of modeling. I have chosen as my thread the somewhat vague notion of "information biology", intended to draw attention to the fact that information is always physically represented in one way or another, and that information processing does not proceed by manipulating ethereal, infinitely pliable information, but by acting on its physical representation in a way constrained by that representation. Accordingly, the other lectures review topics ranging from genotype-phenotype mapping to the precision of replication and molecular recognition, and learning. This somewhat eclectic selection combines disparate topics in ways that may frustrate those accustomed to learning detailed techniques for tackling open-ended problems in well-founded subjects. Finally, the inaugural lecture is an attempt to probe the term "computation" in depth, focusing on its generative and "chemical" qualities.

Program