Deoxyribonucleic acid (DNA) is a polymeric molecule (one composed of a chain of individual units) that encodes the genetic instructions used in the development and functioning of all known living organisms and many viruses.

In eukaryotic organisms like animals, plants and fungi, DNA is stored in the nucleus, enabling it to be protected from the harsh environment in the rest of the cell. 

The monomers – single units which are chained together – that make up DNA are called nucleotides. Each nucleotide is composed of a nucleobase, which can be cytosine (C), guanine (G), adenine (A) or thymine (T), together with a sugar molecule called deoxyribose and a phosphate group.

Chemical structure of DNA © 2013 Wikipedia

How DNA is read: RNA and transcription

The central dogma of molecular biology states that the flow of information starts with DNA, which is transcribed into RNA, which is in turn translated into a protein:

The three reading "levels" of genetic code © 2014 Wikipedia



Transcription: Turning DNA into RNA © 2013
RNA is implicated in various biological roles in coding, decoding, regulation, and expression of genes. Some RNA molecules play an active role within cells by catalyzing biological reactions, controlling gene expression, or sensing and communicating responses to cellular signals. 

One major type of RNA is messenger RNA (mRNA). Genes are transcribed into mRNA in the nucleus and mRNA then leaves the nucleus to be translated into a protein.

So why is a ‘middle man’ between DNA and protein required? There are several reasons, including protection of the underlying DNA and the fact that many mRNA copies can be active at the same time. If we return to the book analogy, when reading the instructions stored in DNA, rather than taking the whole book, cells make photocopies of the instruction they need and work from that. In this way, the original instructions can be kept safely in their protected position in the nucleus.

Once outside of the nucleus, mRNA is translated into proteins by complexes called ribosomes. These read the mRNA three bases (one codon) at a time and use it to chain together amino acids into proteins.
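As a rough illustration of what a ribosome does, the Python sketch below reads an mRNA string codon by codon and builds a chain of one-letter amino acid codes using the standard genetic code. The function name `translate` and the toy sequence are made up for this example, and the sketch assumes that translation simply starts at the first AUG and stops at the first stop codon, which ignores most of the real biology of initiation.

```python
from itertools import product

# Standard genetic code, written compactly: codons are enumerated with the
# bases in the order U, C, A, G and "*" marks a stop codon.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(mrna):
    """Translate an mRNA string into a string of one-letter amino acid codes.

    Simplifying assumption: translation starts at the first AUG and ends at
    the first in-frame stop codon (or when the sequence runs out).
    """
    start = mrna.find("AUG")
    if start == -1:
        return ""  # no start codon found
    protein = []
    for i in range(start, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":
            break  # stop codon: release the finished chain
        protein.append(amino_acid)
    return "".join(protein)


# Toy mRNA: two leading bases, then AUG (M), GCC (A), UAA (stop).
print(translate("GGAUGGCCUAAGG"))  # MA
```
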

The chemical structure of RNA is very similar to that of DNA, but differs in three main ways: 

  • Unlike double-stranded DNA, RNA is a single-stranded molecule in many of its biological roles and has a much shorter chain of nucleotides.  

  • While DNA contains deoxyribose, RNA contains ribose. This makes the molecule less stable. 

  • The complementary base to adenine is not thymine, as it is in DNA, but rather uracil, which is an unmethylated form of thymine. 
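The base-pairing difference can be made concrete with a short Python sketch of transcription. It is only a string-level model: the function names are invented for this example, and both DNA strands are assumed to be written 5'→3' using only the letters A, C, G and T.

```python
# Base pairing used when RNA is built against a DNA template:
# adenine in the DNA pairs with uracil (not thymine) in the RNA.
DNA_TO_RNA_COMPLEMENT = {"A": "U", "T": "A", "G": "C", "C": "G"}


def transcribe_coding_strand(coding_dna):
    """mRNA has the same sequence as the coding strand, with U in place of T."""
    return coding_dna.upper().replace("T", "U")


def transcribe_template_strand(template_dna):
    """Build the mRNA by pairing bases against the template strand.

    The template is given 5'->3', so it is reversed first in order to
    return the mRNA in the usual 5'->3' orientation.
    """
    return "".join(
        DNA_TO_RNA_COMPLEMENT[base] for base in reversed(template_dna.upper())
    )


# The two strands of the same toy gene give the same mRNA.
print(transcribe_coding_strand("ATGGCC"))    # AUGGCC
print(transcribe_template_strand("GGCCAT"))  # AUGGCC
```
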

Function of different DNA regions

Not all DNA has the same purpose. There are many ways to characterise the differences in function between regions, but one broad distinction is between coding sequences and non-coding DNA.

Coding sequences 

DNA is called 'coding' if it contributes to the mRNA sequence that will be read by the translation machinery to produce proteins. Coding sequences are found within genes in the form of exons. In some organisms, coding sequences make up only a small percentage of the genome (around 1–2% in humans). Although they represent only a small fraction of total genetic content, changes in coding sequences – by mutation or genetic engineering – often have a larger effect on cells as they can change the structure of a protein.

Non-coding DNA 

Non-coding DNA refers to sequences that do not contribute directly to protein formation via transcription and translation. These non-coding sequences can have a huge variety of roles: 

  • Regulatory elements: These are DNA sequences which help control the expression of specific genes. Regulatory elements include promoters, which lie just upstream of the coding sequences of genes, as well as enhancers, which can be much further away from the genes they control (sometimes as far as 1 Mb away). 

  • Introns: Coding sequences within genes are often not present in a single continuous stretch. Instead, different coding sequences that contribute to the same protein are split apart by sequences of non-coding DNA called introns. Through a process called splicing, the coding sequences (exons) of mRNA are stitched together and the introns separating them are removed. 

  • Non-coding RNA: Some non-coding DNA is transcribed but the subsequent RNA is not translated into a protein. These non-coding RNAs can have a diverse range of roles within the cells, including aiding with translation (ribosomal RNA and transfer RNA) as well as playing roles in regulating gene expression e.g. microRNA and siRNA.
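The splicing step described in the list above can be sketched in a few lines of Python. This is a toy model, not how the spliceosome recognises introns: the exon positions are assumed to be already known and are passed in as 0-based, half-open (start, end) intervals, and the function name and sequences are invented for the example.

```python
def splice(pre_mrna, exons):
    """Join the exons of a pre-mRNA and drop everything in between.

    `exons` is a list of (start, end) positions, 0-based and half-open,
    assumed to be known in advance and non-overlapping.
    """
    return "".join(pre_mrna[start:end] for start, end in sorted(exons))


# Toy transcript: exon 1, a 12-base intron, then exon 2.
exon_1 = "AUGGCC"
intron = "GUAAGUUUUCAG"
exon_2 = "GAAUAA"
pre_mrna = exon_1 + intron + exon_2

mature_mrna = splice(pre_mrna, [(0, 6), (18, 24)])
print(mature_mrna)  # AUGGCCGAAUAA
```
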

Here is a list of different DNA functions from the iGEM Registry of Standard Biological Parts, which aims at standardizing and modularizing DNA sequences. With this library, you can start constructing your own DNA segments to perform new functions in organisms such as yeast or E. coli.


  • Promoters: A promoter is a DNA sequence that tends to recruit transcriptional machinery and lead to transcription of the downstream DNA sequence.

  • Ribosome binding sites: A ribosome binding site (RBS) is an RNA sequence found in mRNA to which ribosomes can bind and initiate translation.

  • Protein domains: Protein domains are portions of proteins cloned in frame with other protein domains to make up a protein coding sequence. Some protein domains might change the protein's location, alter its degradation rate, target the protein for cleavage, or enable it to be readily purified.

  • Protein coding sequences: Protein coding sequences encode the amino acid sequence of a particular protein. Note that some protein coding sequences only encode a protein domain or half a protein. Others encode a full-length protein from start codon to stop codon. Coding sequences for gene expression reporters such as LacZ and GFP are also included here.

  • Translational units: Translational units are composed of a ribosome binding site and a protein coding sequence. They begin at the site of translational initiation, the RBS, and end at the site of translational termination, the stop codon.

  • Terminators: A terminator is an RNA sequence that usually occurs at the end of a gene or operon mRNA and causes transcription to stop.

  • DNA: DNA parts provide functionality to the DNA itself. DNA parts include cloning sites, scars, primer binding sites, spacers, recombination sites, conjugative transfer elements, transposons, origami, and aptamers.

  • Plasmid backbones: A plasmid is a circular, double-stranded DNA molecule, typically containing a few thousand base pairs, that replicates within the cell independently of the chromosomal DNA. A plasmid backbone is defined as the plasmid sequence beginning with the BioBrick suffix, including the replication origin and antibiotic resistance marker, and ending with the BioBrick prefix.

  • Plasmids: If you're looking for a plasmid or vector to propagate or assemble plasmid backbones, please see the set of plasmid backbones. There are a few parts in the Registry that are only available as circular plasmids, not as parts in a plasmid backbone. Note that these plasmids largely do not conform to the BioBrick standard.

  • Primers: A primer is a short single-stranded DNA sequence used as a starting point for PCR amplification or sequencing. Although primers are not actually available via the Registry distribution, commonly used primer sequences are included here.

  • Composite parts: Composite parts are combinations of two or more BioBrick parts.


© Text 2015 iGem
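To give a feel for what "modular" means here, the following Python sketch joins a promoter, a ribosome binding site, a protein coding sequence and a terminator into one composite part. Everything in it is hypothetical: the `Part` class, the `compose` function, the part names and the short sequences are invented for illustration and are not taken from the Registry. The sketch also simply concatenates sequences, whereas real BioBrick assembly leaves short scar sequences between parts.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Part:
    """A hypothetical, minimal stand-in for a standard biological part."""
    name: str
    kind: str      # e.g. "promoter", "rbs", "cds", "terminator"
    sequence: str  # DNA sequence, written 5'->3'


def compose(name, parts):
    """Join parts in order into a composite part (assembly scars ignored)."""
    return Part(name, "composite", "".join(part.sequence for part in parts))


# Placeholder parts with made-up sequences, NOT real Registry entries.
promoter = Part("toy_promoter", "promoter", "TTGACA")
rbs = Part("toy_rbs", "rbs", "AGGAGG")
cds = Part("toy_cds", "cds", "ATGGCCTAA")
terminator = Part("toy_terminator", "terminator", "GCGC")

# A translational unit is an RBS followed by a protein coding sequence.
translational_unit = compose("toy_translational_unit", [rbs, cds])
device = compose("toy_device", [promoter, translational_unit, terminator])
print(device.sequence)  # TTGACAAGGAGGATGGCCTAAGCGC
```
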


Complementary DNA - cDNA

The steps from DNA to cDNA © 2013

Complementary DNA (cDNA) is double-stranded DNA synthesized from a messenger RNA (mRNA) template in a reaction catalysed by the enzyme reverse transcriptase. As the RNA template has already been spliced, cDNA lacks introns (although it may still include untranslated regions of the mRNA) and carries the continuous sequence that codes for the protein of interest. 
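At the level of sequences, reverse transcription can be sketched as building the reverse complement of the mRNA, followed by a second strand that is complementary to the first. The Python below is a minimal model with invented function names; it says nothing about primers, enzymes or reaction conditions, and all strands are assumed to be written 5'→3'.

```python
# Base pairing for copying RNA into DNA, and DNA into DNA.
RNA_TO_DNA_COMPLEMENT = {"A": "T", "U": "A", "G": "C", "C": "G"}
DNA_COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def first_strand_cdna(mrna):
    """DNA strand complementary to the mRNA, returned 5'->3'."""
    return "".join(RNA_TO_DNA_COMPLEMENT[base] for base in reversed(mrna.upper()))


def second_strand_cdna(first_strand):
    """DNA strand complementary to the first cDNA strand, returned 5'->3'."""
    return "".join(DNA_COMPLEMENT[base] for base in reversed(first_strand.upper()))


mrna = "AUGGCCUAA"
first = first_strand_cdna(mrna)
second = second_strand_cdna(first)
print(first)   # TTAGGCCAT
print(second)  # ATGGCCTAA (the mRNA sequence with T in place of U)
```
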

cDNA is often used to clone eukaryotic genes into prokaryotes. When scientists want to express a specific protein in a cell that does not normally express that protein, they will transfer the cDNA that codes for the protein to the recipient cell. cDNA is also produced naturally by retroviruses (such as HIV-1, HIV-2, Simian Immunodeficiency Virus, etc.) and then integrated into the host's genome where it creates a provirus. 


© Text 2015 Wikipedia


About the authors

Jérôme Lutz from Berlin & Munich, Germany

I like to share the great things I discover daily while researching and working in the field of Synthetic Biology.

When I talk to people about it, they often refer to Science Fiction. However, when I send them links to this wiki and they read through those pages, they start understanding that this is real and it's happening right now.

Jake Curtis from London

I am a student at Cambridge University who has just finished a BA in Natural Sciences, focusing on Genetics in my third year. I am now studying for an MSc in Systems Biology.