Brinkman Laboratory Homepage

Bacterial Genomics 101

(by Fiona Brinkman, with the first half of the information partially derived from the Oak Ridge Laboratories "Genetics 101" resource)

Bacterial Genomics is different in a number of ways from genomic studies of eukaryotes (eukaryotes are the group of organisms that includes plants and animals). There are a number of excellent resources describing the Human Genome Project and human genomics, but these resources are not completely applicable to bacterial genomics. Therefore, this summary has been written to aid understanding of this relatively new field, Bacterial Genomics.

What is Bacterial Genomics?

Generally, bacterial genomics is a research field that studies the genome of bacteria, or studies bacteria using approaches or technology derived from understanding the bacterial genome and its DNA sequence.

What the heck is a genome and DNA sequence?

The complete set of instructions for making any organism is called its genome. It contains the master blueprint for all cellular structures and biological processes for the lifetime of the organism. A copy of the genome is found in each bacterial cell, and it consists of two tightly coiled threads of deoxyribonucleic acid (DNA) organized into structures called chromosomes. To use an analogy, think of the genome as the "encyclopedia" of all instructions on how to make an organism. Chromosomes are equivalent to the different volumes of an encyclopedia. Many bacterial cells contain just one chromosome (their "encyclopedia" is only one book), although some have more than one. Humans have 24 distinct chromosomes (i.e. equivalent to a 24-volume encyclopedia, to use the analogy).

If unwound and tied together, the strands of DNA for the genome of a human would stretch more than 5 feet but would be only 50 trillionths of an inch wide. However, through tight packaging, the DNA containing a copy of the full genome is packed into a single cell (all life is made up of units called cells). Likewise, in bacteria all the DNA for the complete genome is packed into each tiny cell.

Each strand of DNA is a linear arrangement of similar repeating units, often termed bases (think of these as the "letters" in an encyclopedia). Four different bases are present in DNA: adenine (commonly referred to as just "A"), thymine (T), cytosine (C), and guanine (you guessed it, G). The particular order of the bases is called the DNA sequence. The sequence specifies the exact genetic instructions required to create a particular organism, including a particular bacterial species, with its own unique traits.

The two complementary strands of DNA are held together by weak bonds between the bases on each strand, forming base pairs (bp). Think of these complementary strands as "two copies" of the encyclopedia (so that if one copy is damaged, there is another copy to refer to). Genome size is usually stated as the total number of base pairs. The human genome contains roughly 3 billion bp. Bacterial genome sequences determined by the turn of the century (year 2000) range in size from approximately half a million bp to 8 million bp.
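For readers who like to see ideas as code, here is a minimal Python sketch of complementary strands and base pairs. It assumes the standard pairing rules (A pairs with T, C pairs with G), which are not spelled out in the text above, and it uses a made-up six-letter sequence purely for illustration. The function names are hypothetical.

```python
# Standard base-pairing rules (an assumption added here; the text above
# only says that bases on the two strands pair up).
PAIRS = {"A": "T", "T": "A", "C": "G", "G": "C"}


def complement_strand(strand):
    """Return the partner strand.

    The two strands run in opposite directions, so the partner is read
    backwards relative to the original strand.
    """
    return "".join(PAIRS[base] for base in reversed(strand))


def genome_size_bp(strand):
    """Every base on one strand has exactly one partner, so the size in
    base pairs is simply the length of one strand."""
    return len(strand)


example = "ATGGCA"  # made-up sequence, not from any real genome
partner = complement_strand(example)
print(example, "pairs with", partner)
print("Size:", genome_size_bp(example), "bp")

# The "second copy" idea: the original strand can be rebuilt from its partner.
print(complement_strand(partner) == example)
```

Because either strand can be rebuilt from the other, the two strands really do behave like two copies of the same information.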

Note that each time a bacterial cell divides into two daughter cells, its full genome is duplicated (both strands of the DNA). Therefore, each new bacterial cell produced contains a new copy of the genome sequence (that contains the two complementary strands). This genome sequence (encyclopedia), made up of base pairs (letters), is full of genes (which can be considered the "words" in an encyclopedia).

What are genes?

Each DNA molecule contains many genes. A gene is a specific sequence of base pairs (bases, or "letters") that contains the information required for constructing a protein (in most cases). The proteins encoded by the genes provide the structural components of the cell as well as enzymes for essential biochemical reactions. In other words, proteins are not just in the food we eat: they are the building blocks of life and are the main components of our cells. Genes are the "words" in our genome "encyclopedia" that form the instructions for constructing these proteins. The human genome is estimated to comprise approximately 30,000 genes, while different bacteria can have as few as approximately 500 genes, or as many as 6000 genes or more. Proteins are large, complex molecules made up of long chains of subunits called amino acids. Twenty different kinds of amino acids are usually found in proteins.

So how can we build proteins comprising up to 20 different amino acids, if the code within genes for building them only comprises 4 bases? Well, it turns out that the cell's protein-building machinery reads the genes' code three bases at a time, in "triplets". With 4 possible bases at each of 3 positions there are 4 x 4 x 4 = 64 possible triplets, which is more than enough to cover 20 amino acids. Each triplet codes for a particular amino acid. The genetic code to make a given protein is thus a series of triplets in a gene that specify which amino acids are required, and in what order, to make up that protein. A collection of triplets that encodes a protein is called a gene, and the collection of genes that make an organism is called its genome.
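The triplet idea can be tried out in a few lines of Python. This is only a sketch: the table below is a small excerpt of the standard genetic code (the full table has an entry for every possible triplet), the example gene is invented, and the names `CODON_EXCERPT` and `translate` are hypothetical.

```python
from itertools import product

BASES = "ACGT"

# Count every possible triplet that can be written with four bases.
all_triplets = ["".join(t) for t in product(BASES, repeat=3)]
print(len(all_triplets), "possible triplets")

# A small excerpt of the standard genetic code (assumption: only the
# triplets needed for the made-up example below are listed).
CODON_EXCERPT = {
    "ATG": "Met", "TTT": "Phe", "GGC": "Gly",
    "GCT": "Ala", "AAA": "Lys", "TGG": "Trp",
    "TAA": "STOP", "TAG": "STOP", "TGA": "STOP",
}


def translate(gene):
    """Read a gene three bases at a time and list the amino acids.

    Triplets missing from the excerpt are reported as '???'.
    Reading ends at the first STOP triplet.
    """
    amino_acids = []
    for i in range(0, len(gene) - 2, 3):
        triplet = gene[i:i + 3]
        amino_acid = CODON_EXCERPT.get(triplet, "???")
        if amino_acid == "STOP":
            break
        amino_acids.append(amino_acid)
    return amino_acids


made_up_gene = "ATGTTTGGCGCTAAATGGTAA"  # invented for illustration
print(translate(made_up_gene))
```

Real genes are far longer, typically hundreds to thousands of bases, but the reading rule is the same.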

Studying Genomes:

Since a bacterial genome contains all the information for building and running a cell, there has been considerable interest in determining the DNA sequence for the genomes of bacteria, particularly for those bacteria that cause disease. We hope that, by knowing the code for what makes bacteria tick, we may be better able to come up with means to:
- control them (for example, if they cause disease)
- maintain them (for example, if they are important for our environment)
- use them for beneficial purposes (for example, industrial spill cleanup or making products)

So, technology has been developed to determine the complete DNA sequence ("all the letters") of a genome, using machines that are called (aptly) DNA sequencers. For bacteria, genome sequences had been obtained for approximately 35 different bacteria by the turn of the millennium, with the first genome sequence for a bacterium determined in 1995. Many more bacterial genomes are being sequenced, and so databases have been formed to contain all the DNA sequence information (as long strings of A's, T's, C's and G's). In addition, early research determined which triplets of the DNA bases encode which amino acids, so that we can predict from a bacterial genome sequence what proteins it may encode. Both the DNA sequence of bases and the protein sequences of amino acids contain additional "strings" of sequence that specify particular functions and/or act as signals for the bacteria.
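Since a genome sequence is stored as a long string of letters, looking for a signal "string" inside it is, at its simplest, a text search. The Python sketch below shows the idea. The DNA sequence is invented, the function name `find_signal` is hypothetical, and the signal TATAAT is used only as an example of a short pattern (it resembles a signal found near the start of many bacterial genes). Real signals vary from copy to copy, so actual search tools are more forgiving than this exact-match version.

```python
def find_signal(sequence, signal):
    """Return every position (counting from 0) where the signal string
    occurs in the sequence, including overlapping occurrences."""
    positions = []
    start = sequence.find(signal)
    while start != -1:
        positions.append(start)
        # Continue searching one letter further along.
        start = sequence.find(signal, start + 1)
    return positions


made_up_dna = "GGCTATAATGCCATGTATAATCC"  # invented for illustration
print(find_signal(made_up_dna, "TATAAT"))
```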

So, what does Bacterial Genomics do?

Bacterial genomic approaches to studying life include determining the DNA sequence of a given bacterial genome, comparing the DNA sequence with other genome sequences, and deciphering what the genomes encode (i.e. predicting what makes up an organism from its DNA sequence). The whole gene complement of different organisms can be compared with one another, detecting, for example, genes specific to certain disease-causing bacteria that may be targets for drug therapy or for diagnostics. This field of research also attempts to study bacteria on a "whole-organism" scale. It takes a more holistic view regarding how the genes and proteins function and work together, or what genes or proteins are necessary for a certain function (such as causing disease). This whole-organism approach is quite different from the early approaches microbiology used to study genes and proteins: classically, single genes or proteins were studied on a (relatively) individual basis, with little understanding of how all the genes and proteins in a cell worked together. Genomics attempts to gather information regarding how the full complement of genes in an organism works together. It is thought that such an approach to studying bacteria, and in fact all life, will complement the classical, but still very necessary, studies that are focused on particular genes or proteins.
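Comparing the gene complements of two organisms can be pictured as comparing two lists of names. Here is a minimal Python sketch of that idea. The gene names are placeholders invented for this example, and the function name is hypothetical. In practice, deciding whether two genes from different bacteria are "the same" requires comparing their sequences, which this sketch skips entirely.

```python
def compare_gene_complements(genes_a, genes_b):
    """Split two gene collections into shared genes and genes found
    in only one of the two organisms."""
    set_a, set_b = set(genes_a), set(genes_b)
    return {
        "shared": set_a & set_b,
        "only_in_a": set_a - set_b,
        "only_in_b": set_b - set_a,
    }


# Placeholder gene names, not taken from any real genome.
disease_causing = ["geneA", "geneB", "geneC", "toxinX"]
harmless_relative = ["geneA", "geneB", "geneC", "geneD"]

result = compare_gene_complements(disease_causing, harmless_relative)
print("Shared:", sorted(result["shared"]))
print("Only in the disease-causing bacterium:", sorted(result["only_in_a"]))
print("Only in the harmless relative:", sorted(result["only_in_b"]))
```

Genes that appear only in the disease-causing bacterium are the kind of candidates that might be examined further as targets for drug therapy or diagnostics.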

What will Bacterial Genomics give us?

Bacterial genomics can give us a broader understanding of how a bacterium functions, a bacterium's origins, and what bacteria live in our world that we can't study by other means (i.e. through obtaining their DNA from the environment and studying it). Of medical interest, bacterial genomics is also anticipated to play a significant role in speeding up the development of better therapies and vaccines for controlling disease-causing bacteria. It will also be the cornerstone of anticipated DNA-based diagnostic tools that will hopefully enable doctors to make quicker, more accurate diagnoses of infectious disease.