Genetic Blueprints

Two strands of nucleotides, aligned side by side, are connected by the chemical bonding of matching bases: adenine pairs with thymine, and guanine pairs with cytosine. The bonds between the base pairs twist the nucleotide strands, creating the "double helix."
Figure 1: A DNA strand, showing the double helix and base pairs (Graphic by CSS, Inc.)

DNA is an amazingly simple chemical structure, yet it contains an entire library of information on how to make, maintain, and reproduce an organism, and it also keeps a record of clues to the organism's evolutionary history. The entire sequence of DNA in an organism is called its genome. A genome can be as small as the 9,750 bases of the human immunodeficiency virus (HIV; the cause of AIDS) or as large as the more than 3 billion bases in mice, humans, and frogs.

The genetic blueprint so carefully preserved in a genome is stored in the linear sequence of its molecules, referred to as nucleotides or bases. In DNA there are four bases: adenine (A), guanine (G), cytosine (C), and thymine (T). DNA consists of two strands of nucleotides laid side by side and connected by chemical pairing of complementary (matching) bases: adenine pairs with thymine, and guanine pairs with cytosine (Figure 1-B). The bonds between these molecules impose a twisting force (torsion) on the structure and cause it to wind slightly, much like a spiral staircase. This creates the familiar double helix shape of a DNA molecule (Figure 1-A).
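The pairing rule can be written down as a small lookup table. The following is a minimal Python sketch, not part of the original article: the function name complement and the example sequence are made up for illustration, and the sketch pairs bases position by position without modeling the direction in which each strand is read.

```python
# Pairing rules as stated in the text: A pairs with T, G pairs with C.
PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G"}


def complement(strand: str) -> str:
    """Return the base that pairs with each position of `strand`.

    Hypothetical helper for illustration; raises KeyError on any
    character that is not one of the four DNA bases.
    """
    return "".join(PAIRS[base] for base in strand.upper())


example = "ATGCCGTA"  # made-up sequence, not from a real genome
print(example)
print(complement(example))
```

Because each base has exactly one partner, taking the complement twice returns the original strand, which is why either strand is enough to reconstruct the other.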

DNA Sequences and Gene Expression

Table of DNA Sequences

DNA encodes the unique nature of different organisms by specifying the precise structure of each protein in a cell. Like DNA, proteins are made from a linear sequence of building blocks, in this case amino acids, and the exact sequence of amino acids is what determines the function of the protein. A DNA sequence is translated into a protein sequence by a code in which a triplet of bases (a codon) specifies a single amino acid; some codons specify the end of the protein. As a cell "reads" the DNA instructions, it builds a protein by adding successive amino acids one at a time, as defined by the codons. With 64 possible codons (4 DNA bases in each of the 3 positions of the codon, or 4^3) and only 20 standard amino acids, some amino acids can be specified by more than one codon.
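The codon arithmetic and the reading process can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the article's own material: the table below is the standard genetic code written in its common compact form (one-letter amino acid codes, with * marking a stop codon), the function name translate and the example sequence are made up, and the sketch reads the DNA sequence directly rather than modeling the RNA intermediate a cell actually uses.

```python
from itertools import product

# Standard genetic code in compact form: codons are enumerated with the
# bases in the order T, C, A, G, and each letter below is the one-letter
# code of the amino acid for that codon ("*" marks a stop codon).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Read `dna` three bases at a time until a stop codon or the end.

    Hypothetical helper for illustration; assumes the sequence starts
    at the first base of a codon and contains only A, C, G, and T.
    """
    protein = []
    for i in range(0, len(dna) - 2, 3):
        amino_acid = CODON_TABLE[dna[i:i + 3].upper()]
        if amino_acid == "*":  # stop codon: the protein ends here
            break
        protein.append(amino_acid)
    return "".join(protein)


print(len(CODON_TABLE))                  # number of possible codons
print(len(set(AMINO_ACIDS) - {"*"}))     # number of standard amino acids
print(translate("ATGGCCTGGTAA"))         # made-up example sequence
```

Counting the table entries shows the redundancy described above: 64 codons map onto 20 amino acids plus the stop signal, so most amino acids appear under more than one codon.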

Humans are built from an estimated 20,000-35,000 proteins, yet only a small percentage of the 3 billion bases in the human genome codes for these proteins. The discrete sections of DNA that encode proteins are referred to as "genes." A gene contains special codons that tell the cell where the gene, and hence the protein, starts and stops. Between these signals, the regions of the DNA that code for amino acids (the exons) are separated by non-coding DNA sequences of variable number and length (the intervening sequences, or introns).
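The relationship between exons and introns can be illustrated with a toy example in Python. Everything here is hypothetical: the gene string, the exon coordinates, and the function name splice_exons are invented for illustration, and in a real cell the introns are removed from an RNA copy of the gene rather than from the DNA itself.

```python
def splice_exons(gene: str, exons: list[tuple[int, int]]) -> str:
    """Join the exon regions of `gene`, skipping everything in between.

    `exons` is a list of (start, end) positions using Python slice
    conventions (start included, end excluded). Hypothetical helper.
    """
    return "".join(gene[start:end] for start, end in exons)


# Made-up gene: one exon, one intron, then a second exon.
exon_1 = "ATGGCC"
intron = "GTAAGT"
exon_2 = "TGGTAA"
gene = exon_1 + intron + exon_2

# Coordinates of the two exons within the made-up gene.
exon_positions = [(0, 6), (12, 18)]

coding_sequence = splice_exons(gene, exon_positions)
print(coding_sequence)
```

Only the joined exons carry the codons that specify amino acids; the intron bases in the middle contribute nothing to the protein sequence.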

Protein production is a highly regulated process. For example, a cell does not want to waste energy making the proteins needed for cell division if it is busy with other functions, such as secreting a hormone. This process of turning a gene on and off depending on the cell's need for a particular set of proteins is referred to as regulation of gene expression. Certain proteins, and even other regions of DNA, physically bind to the DNA sequence surrounding a gene to affect its expression. This interaction occurs at regions of the DNA with apt names, such as promoters or enhancers of gene expression.

Gene regulation is an essential part of life. Since every cell in an organism contains the same genetic blueprint, different cell types are created by turning on different genes at different times during development. In fact, it is differential gene expression that allows stem cells to become unique cell types. Gene regulation is also critical for cellular response to metabolic needs.

Quick Fact: RNA

Ribonucleic acid (RNA) is a single-stranded copy of DNA, in which the thymine (T) is replaced by uracil (U).
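In sequence terms, the substitution described here is a one-letter swap. The Python sketch below is only an illustration: the function name transcribe and the example sequence are made up, and the sketch treats the RNA as a letter-for-letter copy of one DNA strand, as the Quick Fact describes, without modeling how a cell produces it.

```python
def transcribe(dna: str) -> str:
    """Return the RNA spelling of a DNA sequence: every T becomes U.

    Hypothetical helper for illustration only.
    """
    return dna.upper().replace("T", "U")


print(transcribe("ATGGCCTGGTAA"))  # made-up example sequence
```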

The NBII Program is administered by the Biological Informatics Program of the U.S. Geological Survey