A biophysical basis for the emergence of the genetic code in protocells

Harrison SA, Nunes Palmeira R, Halpern A, Lane N. Biochim Biophys Acta Bioenergetics 863: 148597


  • We provide a new framework for the origin of the genetic code in protocells growing by CO2 fixation.
  • Using a simple set of rules we are able to allocate the large majority of codon assignments.
  • These codon assignments are based on biophysical interactions in an expanding protometabolism.
  • Biophysical interactions between RNA strings and templated amino acids can drive protocell growth.
  • Autotrophic protocell growth gives a new context for the origin of information in biology.


The origin of the genetic code is an abiding mystery in biology. Hints of a ‘code within the codons’ suggest biophysical interactions, but these patterns have resisted interpretation. Here, we present a new framework, grounded in the autotrophic growth of protocells from CO2 and H2. Recent work suggests that the universal core of metabolism recapitulates a thermodynamically favoured protometabolism right up to nucleotide synthesis. Considering the genetic code in relation to an extended protometabolism allows us to predict most codon assignments. We show that the first letter of the codon corresponds to the distance from CO2 fixation, with amino acids encoded by the purines (G followed by A) being closest to CO2 fixation. These associations suggest a purine-rich early metabolism with a restricted pool of amino acids. The second position of the anticodon corresponds to the hydrophobicity of the amino acid encoded. We combine multiple measures of hydrophobicity to show that this correlation holds strongly for early amino acids but is weaker for later species. Finally, we demonstrate that redundancy at the third position is not randomly distributed around the code: non-redundant amino acids can be assigned based on size, specifically length. We attribute this to additional stereochemical interactions at the anticodon. These rules imply an iterative expansion of the genetic code over time with codon assignments depending on both distance from CO2 and biophysical interactions between nucleotide sequences and amino acids. In this way the earliest RNA polymers could produce non-random peptide sequences with selectable functions in autotrophic protocells.
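The positional patterns described above can be explored with a minimal sketch. It is not the authors' analysis: it assumes the standard genetic code (NCBI translation table 1) and uses the single Kyte-Doolittle hydropathy scale as a stand-in for the combined hydrophobicity measures mentioned in the abstract. It also groups by codon position, whereas the abstract refers to the anticodon, whose middle base is the complement of the codon's middle base. The helper names (`amino_acids_by_position`, `mean_hydropathy`) are hypothetical and introduced here only for illustration.

```python
from collections import Counter, defaultdict
from itertools import product

# Standard genetic code (NCBI translation table 1), written in RNA bases.
# Codons are enumerated with the third base varying fastest: UUU, UUC, UUA, UUG, UCU, ...
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Kyte-Doolittle hydropathy index (assumption: one scale used for illustration;
# the paper combines multiple hydrophobicity measures).
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def amino_acids_by_position(position):
    """Group the distinct amino acids by the base at a codon position (0, 1 or 2)."""
    groups = defaultdict(set)
    for codon, aa in CODON_TABLE.items():
        if aa != "*":  # skip stop codons
            groups[codon[position]].add(aa)
    return dict(groups)


def mean_hydropathy(amino_acids):
    """Unweighted mean Kyte-Doolittle hydropathy over a set of amino acids."""
    return sum(KD[aa] for aa in amino_acids) / len(amino_acids)


# First codon letter: which amino acids are encoded by G-initial or A-initial codons.
first = amino_acids_by_position(0)

# Second codon letter: mean hydropathy of the amino acids in each column of the code.
second = amino_acids_by_position(1)
second_means = {base: mean_hydropathy(aas) for base, aas in second.items()}

# Redundancy: number of codons per amino acid (stop codons excluded).
codon_counts = Counter(aa for aa in CODON_TABLE.values() if aa != "*")
non_redundant = sorted(aa for aa, n in codon_counts.items() if n == 1)

print("G-first amino acids:", sorted(first["G"]))
print("Mean hydropathy by second base:", second_means)
print("Single-codon amino acids:", non_redundant)
```

On this one scale, the amino acids with U at the second codon position (A at the middle anticodon position) form the most hydrophobic group and those with A at the second codon position the least, which is the kind of pattern the abstract attributes to the second position. The sketch says nothing about distance from CO2 fixation or amino acid length, which would need metabolic and structural data not given here.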