Complexity of Life is Growing by Orders of Magnitude

by Computer Scientist Dave D’Onofrio

In an article titled “Life Is Complicated” in Nature 464, April 2010, Erika Check Hayden explores the frustration many biologists are experiencing as they realize that the genome is proving more complex than they could have imagined. To illustrate the depth of the problem, the article states, “Instead, as sequencing and other new technologies spew forth data, the complexity of biology has seemed to grow by orders of magnitude.” Notice the term “orders of magnitude,” indicating a vast number of interactions and elements that they were not expecting. Given the discoveries of new, complex genetic interactions involving RNAs, signal transduction pathways, proteins, and the accompanying cascades of those interactions, this reaction is not surprising. Jennifer Doudna, a biochemist at the University of California, Berkeley, adds, “It seems like we’re climbing a mountain that keeps getting higher and higher.”

The article states that the crux of the regulation problem is “that a regulator gene codes for a regulator protein that controls transcription by binding to particular site(s) on DNA.” A vastly simplified description of what happens follows:
1) A request for a protein is generated by the cell.
2) That request is translated into the RNA operating language of the nucleus through signal transduction pathways.
3) The request for a particular protein is interpreted by the RNA operating system, which must generate the local machinery that locates, copies, and edits the appropriate gene using RNAs and existing protein elements.
4) The needed RNAs are first located and copied, forming a variety of small RNAs, siRNAs, etc. These work in conjunction with proteins to form the transcription factors and to perform a host of other functions.
5) The above mechanisms work upon the formatted database that I call the DNA hard drive to extract, copy, and check for proper information.
6) Regulation (as biologists call it) is more like the process of locally enabling those volumes and clusters of the genome where the requested gene resides. Next, the proper gene must be located, and the copy mechanism must be precisely identified and aligned to the initiation start site.
7) Once the needed copy of the gene is made (mRNA), it is edited by the spliceosome (a highly complex and specific process in itself), followed by the addition of the cap and the poly-A tail, transforming the mRNA into a biological bus analogous to the data bus of a computer system. The mRNA then travels through the nuclear pore complex towards its destination (the protein-building factory).
8) More than one copy of the same gene may be necessary (for example, when building the filament of the flagellum), so the system regulates repeated copying of the same gene.
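To make the computer analogy concrete, the copy, edit, and tag stages of steps 7 and 8 can be sketched as a toy pipeline in Python. This is an illustration only, not a biological simulation: the function names, the `CAP` placeholder string, the tail length, and the intron coordinates are all hypothetical choices made for readability, and real splicing, capping, and polyadenylation are far more involved.

```python
from __future__ import annotations

# Toy model of steps 7 and 8. All names and constants are illustrative assumptions.
CAP = "m7G-"        # placeholder text standing in for the 5' cap
POLY_A_LENGTH = 20  # hypothetical tail length, chosen only for readability


def transcribe(coding_strand: str) -> str:
    """Copy a DNA coding-strand sequence into pre-mRNA (T becomes U)."""
    return coding_strand.upper().replace("T", "U")


def splice(pre_mrna: str, introns: list[tuple[int, int]]) -> str:
    """Remove intron spans, given as half-open (start, end) index pairs."""
    exons = []
    position = 0
    for start, end in sorted(introns):
        exons.append(pre_mrna[position:start])  # keep the exon before this intron
        position = end                          # skip over the intron itself
    exons.append(pre_mrna[position:])           # keep the final exon
    return "".join(exons)


def cap_and_tail(mrna: str) -> str:
    """Attach the cap placeholder and a poly-A tail to a spliced transcript."""
    return CAP + mrna + "A" * POLY_A_LENGTH


def express(coding_strand: str, introns: list[tuple[int, int]], copies: int = 1) -> list[str]:
    """Run the whole pipeline and return the requested number of transcripts."""
    mature = cap_and_tail(splice(transcribe(coding_strand), introns))
    return [mature] * copies


if __name__ == "__main__":
    # A made-up 11-base gene with one made-up intron at positions 3 to 7.
    for transcript in express("ATGGTAAGCCC", [(3, 8)], copies=2):
        print(transcript)
```

The `copies` argument mirrors step 8, where the same gene is copied repeatedly; everything upstream of transcription (steps 1 through 6) is left out of the sketch.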

Basically, this is a multi-integrated process orchestrated by the cellular operating system, just as Windows or Linux orchestrates the processes on a computer. The article goes on to describe how “systems biology” can help to make sense of the added complexity. However, this hasn’t been the case: “So far, all these attempts have run up against the same roadblock: there is no way to gather all the relevant data about each interaction included in the model.” I believe that this is because they are following the wrong paradigm. Structural isolation of gene modules is not what one would expect from random, undirected processes of evolution. Instead, the creation of boundary conditions around gene modules (transposons, gene clusters) is what one would expect to see if an organized, rule-based process were applied to a generic database. In our experience, this kind of formatting is a product of design. We see such structures in the multiple drives and volumes of hard disks and in the subroutines and libraries of software. What appear to be regulatory system interactions may be the visible execution of the cellular operating system, or bio-BIOS, occurring in the nucleus.