BERKELEY, CA — In 90 years of
study, the diminutive fruit fly Drosophila melanogaster has yielded
many of the most fundamental discoveries in genetics -- beginning with
proof, in 1916, that genes are located on chromosomes. Only during
the last year has the fly's whole genome been sequenced, however, and its
13,601 individual genes enumerated.
The genome of D. melanogaster, the largest yet sequenced in
full, is described in the 24 March 2000 issue of Science magazine,
in a series of articles jointly authored by hundreds of scientists,
technicians, and students from 20 public and private institutions in five
countries.
The collaboration was led by Gerald Rubin of the University of
California at Berkeley and the Howard Hughes Medical Institute (HHMI), who
heads the Berkeley Drosophila Genome Project, and by J. Craig Venter of
Celera Genomics in Rockville, Maryland. The Berkeley Drosophila Genome
Project (BDGP) is supported by the Department of Energy, the National
Human Genome Research Institute, and HHMI, with the largest of its
facilities operated by the Life Sciences Division of the Department of
Energy's Lawrence Berkeley National Laboratory.
In 1998, when collaboration with Celera began, extensive but incomplete
maps of the location of specific DNA sequences on the fly chromosomes had
been constructed, and about 20 percent of the fly genome had already been
sequenced in detail -- mostly by the BDGP group at Berkeley Lab where,
with Rubin, Susan Celniker is co-director of the sequencing effort.
The purpose of the collaboration was to test whether a strategy known
as whole-genome shotgun sequencing could be used on organisms having many
thousands of genes encoded in millions of DNA base pairs; the strategy had
proven effective for small bacterial genomes.
"No one knew whether whole-genome shotgun sequencing would work
for the fly genome," says Roger Hoskins, leader of the BDGP physical
mapping project, "but we knew that if it did, it would be faster and
more efficient than traditional methods."
D. melanogaster has some 180 million bases in its genome,
arranged on four chromosomes; 80 percent of the genome is located on the
large chromosomes labeled 2 and 3. Hoskins and his colleagues set out to
produce a physical map of that part of chromosomes 2 and 3 that expresses
genes (about a third of the chromosomal material is highly condensed
and does not encode genes).
Although physical maps are not sequences -- a sequence identifies every
pair of bases along a given stretch of DNA -- a good map pins down the
location of unique short sequences that can be used to establish the
correct long-range order of copies of longer DNA sequences, and thus of
any genes they represent.
The 17,000 clones used by the Berkeley Lab BDGP group are actual
stretches of DNA replicated in Escherichia coli bacteria and known
as "bacterial artificial chromosomes" (BACs). Each BAC
accurately represents a discrete stretch of the genome, and the map marks
each BAC with at least one unique "sequence-tagged site" (STS)
-- ideally with two or more such sites.
Using probes tailored to each sequence-tagged site, an STS can be found
wherever it occurs in a random collection of clones; 1,923 of these
markers, spaced roughly every 50,000 bases, were used to build the BDGP's
final map. By matching these sites among overlapping clones, sets of
clones of different lengths can be lined up with one another and
eventually "tiled" along the entire length of each chromosome.
The result is called an STS content map.
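The overlap-matching step can be illustrated with a small sketch. It is not the BDGP's software; the clone names, marker names, and the function build_contigs are hypothetical, and it shows only the grouping idea: clones that share at least one STS overlap, and chains of overlapping clones form a contiguous set.

```python
from collections import defaultdict


def build_contigs(clone_markers):
    """Group clones into contigs; clones sharing an STS are taken to overlap.

    clone_markers maps a (hypothetical) clone name to the set of STS
    markers detected in that clone.
    """
    # Index each STS marker to the clones that contain it.
    marker_to_clones = defaultdict(set)
    for clone, markers in clone_markers.items():
        for marker in markers:
            marker_to_clones[marker].add(clone)

    seen = set()
    contigs = []
    for start in sorted(clone_markers):
        if start in seen:
            continue
        # Walk outward from this clone through shared markers.
        stack = [start]
        seen.add(start)
        contig = []
        while stack:
            clone = stack.pop()
            contig.append(clone)
            for marker in clone_markers[clone]:
                for other in marker_to_clones[marker]:
                    if other not in seen:
                        seen.add(other)
                        stack.append(other)
        contigs.append(sorted(contig))
    return contigs


# Made-up example data: three overlapping BACs and one isolated BAC.
clones = {
    "BAC_A": {"sts1", "sts2"},
    "BAC_B": {"sts2", "sts3"},
    "BAC_C": {"sts3", "sts4"},
    "BAC_D": {"sts7"},
}
print(build_contigs(clones))
```

A real STS content map also orders the markers and clones along the chromosome; the sketch stops at finding which clones belong together.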
When their map of chromosomes 2 and 3 was complete -- along with maps
of the much shorter chromosomes 4 and X produced by others -- the BDGP
researchers made a "rough draft" sequence of the genome with
shallow coverage (less than two clones deep), which served as a check
against Celera's whole-genome shotgun sequence and is being used to close
some of its 1,600 gaps.
The multi-author Science paper summarizing the genome-sequence
results describes the importance of the BDGP's methods and results:
"The BAC end-sequences and STS content map provided the most
informative long-range sequence-based information at the lowest
cost." Increasing the number of BAC end-sequences is the authors'
primary recommendation for future genome-sequencing projects.
D. melanogaster's importance extends far beyond its role as a trial run
for the mouse and human genomes, however. In a set of 289 human genes
implicated in diseases, 177 are closely similar to fruit fly genes,
including genes that play roles in cancers, in kidney, blood, and
neurological diseases, and in metabolic and immune-system disorders.
"The underlying biochemistry of fruit flies and humans is remarkably
similar," says Hoskins, "so fruit flies can provide clues to
understanding human diseases caused by defective genes."
"We can find human tumor-suppressing genes in flies easier than we
can in the mouse," says Susan Celniker, pointing out that experiments
can be done using fly genes that would be impractical (or unthinkable)
using human subjects. Especially useful is the identification of networks
of other genes that interact with known disease genes, and their
associated metabolic pathways. The implications for medicine are
immediate.
To this end the BDGP researchers are continuing to refine the D.
melanogaster sequence already produced. "We're going to push it
to high accuracy," says Hoskins.
The Human Genome Project aims for a resolution of one error in 10,000
base pairs -- roughly the number of errors that could arise from normal
human variation -- but the Drosophila workers intend to achieve an
accuracy of one error in 100,000, a goal partly made possible by the
limited variation among inbred laboratory flies.
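To put the two accuracy targets side by side, the arithmetic can be sketched as follows. The stretch of 100 million bases is a round figure assumed purely for illustration, not a number from the project, and the function name expected_errors is hypothetical.

```python
def expected_errors(total_bases, bases_per_error):
    """Expected number of errors at a rate of one error per bases_per_error."""
    return total_bases // bases_per_error


# Assumed round figure for illustration only.
illustrative_bases = 100_000_000

human_project_target = expected_errors(illustrative_bases, 10_000)
drosophila_target = expected_errors(illustrative_bases, 100_000)

print(human_project_target)  # errors at one in 10,000
print(drosophila_target)     # errors at one in 100,000
```

Under that assumption the tighter target leaves one tenth as many errors over the same stretch of sequence.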
Meanwhile the completed genome of D. melanogaster reported in
the 24 March 2000 issue of Science stands as a milestone in the
history of genetic research and a doorway to new methods. For
one thing, Celera is now attempting to apply the whole-genome shotgun
technique to the much larger human genome.
"Celera did a great job," says Hoskins, "and the project
worked better than anyone could have hoped. Now, the BDGP and the rest of
the community of 5,000 Drosophila researchers around the world can
begin projects to understand how the genome sequence controls the
biology."
The Berkeley Lab is a U.S. Department of Energy national laboratory
located in Berkeley, California. It conducts unclassified scientific
research and is managed by the University of California.