Figure: Normal DNA and the expanded DNA with the bases X and Y. © Synthorx

1. The known genetic alphabet, which consists of four bases, was expanded to six bases.

2. The new bases were successfully inherited by the offspring cells.

3. The cell was able to use these bases to build unnatural amino acids into proteins, producing new protein structures.


The information in DNA is stored as a code made up of four chemical bases: adenine (A), guanine (G), cytosine (C), and thymine (T). Human DNA consists of about 3 billion bases, and more than 99 percent of those bases are the same in all people. The order or sequence of these bases determines the information available for building and maintaining an organism, similar to the way in which letters of the alphabet appear in a certain order to form words and sentences.

In 2014, researchers at the Scripps Research Institute in California managed to extend this code of life. They inserted two additional artificial bases, X and Y, into the DNA of the gut bacterium Escherichia coli (Malyshev et al. 2014).
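One way to see what two extra letters add is simply to count. The short Python sketch below is illustrative arithmetic only, not something taken from the cited papers. It assumes every base is equally likely at each position, and the function names are made up for this example.

```python
import math

NATURAL_BASES = "ACGT"
EXPANDED_BASES = NATURAL_BASES + "XY"  # X and Y: placeholders for the artificial pair


def bits_per_base(alphabet):
    """Upper bound on information per position, assuming equally likely letters."""
    return math.log2(len(alphabet))


def possible_sequences(alphabet, length):
    """Number of distinct sequences of the given length over the alphabet."""
    return len(alphabet) ** length


for name, alphabet in [("natural", NATURAL_BASES), ("expanded", EXPANDED_BASES)]:
    print(name, round(bits_per_base(alphabet), 3), possible_sequences(alphabet, 10))
```

Under these assumptions, a four-letter alphabet carries at most 2 bits per position and a six-letter alphabet about 2.58 bits, so the number of possible sequences grows much faster with length.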

Now, a semisynthetic bacterium has translated genetic code containing the two new bases X and Y into a protein containing amino acids that it cannot normally use. With these experiments, the researchers expanded the code and became the first team to induce a cell to produce a protein from the expanded code.

At the beginning of 2017, Floyd Romesberg and his team reported a breakthrough experiment in which semisynthetic bacteria with six genetic letters in their genome were able to pass the artificial DNA bases on to their offspring, so that each cell division produced new "Frankenstein" microbes (Zhang et al. 2017).

Now the team has made the next breakthrough. They put the two artificial DNA bases into protein-coding genes of their bacteria and induced the cells to build in amino acids that do not occur in the proteins of any natural cell. Using these amino acids, the bacteria were able to produce a semi-artificial protein (Zhang et al. 2017).

To make a protein from a gene sequence, a cell must perform two steps. In the first step, called “transcription”, the base code of the DNA is copied onto a messenger RNA (mRNA). This carries the genetic building instructions to the protein factories of the cell, the ribosomes. The semi-artificial bacteria also performed this step: they copied the two artificial bases into RNA along with the natural ones.
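The copying step can be sketched in a few lines of Python. This is a simplified illustration, not the method of the cited papers: it assumes that X and Y pair only with each other, the template sequence is invented for this example, and the function name is hypothetical.

```python
# Pairing rules used when RNA is built against a DNA template strand.
# Assumption of this sketch: the artificial bases X and Y pair only with each other.
DNA_TO_RNA_COMPLEMENT = {
    "A": "U", "T": "A", "G": "C", "C": "G",
    "X": "Y", "Y": "X",
}


def transcribe(template_strand):
    """Build the RNA that is complementary to a DNA template strand.

    The template is written 3'->5', so the returned RNA reads 5'->3'.
    """
    return "".join(DNA_TO_RNA_COMPLEMENT[base] for base in template_strand)


# Invented template containing one artificial base.
rna = transcribe("TACTXGCCA")
print(rna)  # AUGAYCGGU
```

The point of the sketch is that the artificial bases are treated like any other letter: as long as a pairing rule exists for them, they are carried over into the RNA copy.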

In the second step, called “translation”, the genetic code is decoded. Each triplet of bases, called a codon, stands for a specific amino acid, and the sequence of these codons determines the structure of the protein. The extension of the DNA code in the manipulated bacterium created codons that do not exist in nature. Nevertheless, the ribosomes of the bacterium read these unnatural codons. They built in amino acids that do not belong to the set of 20 amino acids normally used for protein synthesis. The bacterium combined the new amino acids with the natural ones and formed a green fluorescent protein, which the researchers were able to see under the microscope (Zhang et al. 2017).
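The decoding step can be illustrated with a small Python sketch. It uses only a tiny excerpt of the standard codon table, and the unnatural codon "AYC" and the label "ncAA" (for a non-standard amino acid) are hypothetical placeholders, not the codons or amino acids used in the cited work.

```python
from itertools import product

# Small excerpt of the standard codon table (RNA codon -> amino acid).
STANDARD_CODONS = {"AUG": "Met", "GGU": "Gly", "UCU": "Ser", "UAA": "STOP"}

# Hypothetical extension: one unnatural codon assigned to a placeholder amino acid.
EXPANDED_CODONS = {**STANDARD_CODONS, "AYC": "ncAA"}


def count_codons(alphabet):
    """Number of possible three-letter codons over the given alphabet."""
    return len(list(product(alphabet, repeat=3)))


def translate(rna, table):
    """Read the RNA codon by codon until a stop codon is reached."""
    protein = []
    for i in range(0, len(rna) - 2, 3):
        amino_acid = table[rna[i:i + 3]]
        if amino_acid == "STOP":
            break
        protein.append(amino_acid)
    return protein


print(count_codons("ACGU"), count_codons("ACGUXY"))  # 64 216
print(translate("AUGAYCGGUUAA", EXPANDED_CODONS))    # ['Met', 'ncAA', 'Gly']
```

With four letters there are 64 possible codons, and with six letters there are 216. The additional codons are the ones that can, in principle, be assigned to amino acids outside the standard set.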


COPYRIGHT: This article is property of We Speak Science, a nonprofit institution co-founded by Dr. Detina Zalli (Harvard University) and Dr. Argita Zalli (Imperial College London). The article is written by Dardan Beqaj (M.Sc. Microbiology, Eberhard-Karls University of Tübingen).



Malyshev, Denis A., et al. "A semi-synthetic organism with an expanded genetic alphabet." Nature 509.7500 (2014): 385-388.

Zhang, Yorke, et al. "A semisynthetic organism engineered for the stable expansion of the genetic alphabet." Proceedings of the National Academy of Sciences (2017): 201616443.

Zhang, Yorke, et al. "A semi-synthetic organism that stores and retrieves increased genetic information." Nature 551.7682 (2017): 644.