BERKELEY, CA -- In 90 years of
      study, the diminutive fruit fly Drosophila melanogaster has yielded
      many of the most fundamental discoveries in genetics -- beginning with
      proof, in 1916, that the genes are located on the chromosomes. Only during
      the last year has the fly's whole genome been sequenced, however, and its
      13,601 individual genes enumerated. 
        
      The genome of D. melanogaster, the largest yet sequenced in
      full, is described in the 24 March 2000 issue of Science magazine,
      in a series of articles jointly authored by hundreds of scientists,
      technicians, and students from 20 public and private institutions in five
      countries. The collaboration was led by Gerald Rubin of the University of
      California at Berkeley and the Howard Hughes Medical Institute (HHMI), who
      heads the Berkeley Drosophila Genome Project, and by J. Craig Venter of
      Celera Genomics in Rockville, Maryland. The Berkeley Drosophila Genome
      Project (BDGP) is supported by the Department of Energy, the National
      Human Genome Research Institute, and HHMI, with the largest of its
      facilities operated by the Life Sciences Division of the Department of
      Energy's Lawrence Berkeley National Laboratory.

      In 1998, when collaboration with Celera began, extensive but incomplete
      maps of the location of specific DNA sequences on the fly chromosomes had
      been constructed, and about 20 percent of the fly genome had already been
      sequenced in detail -- mostly by the BDGP group at Berkeley Lab where,
      with Rubin, Susan Celniker is co-director of the sequencing effort.

      The purpose of the collaboration was to test whether a strategy known
      as whole-genome shotgun sequencing could be used on organisms having many
      thousands of genes encoded in millions of DNA base pairs; the strategy had
      proven effective for small bacterial genomes.

      "No one knew whether whole-genome shotgun sequencing would work
      for the fly genome," says Roger Hoskins, leader of the BDGP physical
      mapping project, "but we knew that if it did, it would be faster and
      more efficient than traditional methods."

      D. melanogaster has some 250 million bases in its genome,
      arranged on five chromosomes; 80 percent of the genome is located on the
      large chromosomes labeled 2 and 3. Hoskins and his colleagues set out to
      produce a physical map of that part of chromosomes 2 and 3 that expresses
      genes (about 45 percent of the chromosomal material is highly condensed
      and does not encode genes).

      Although physical maps are not sequences -- a sequence identifies every
      pair of bases along a given stretch of DNA -- a good map pins down the
      location of unique short sequences that can be used to establish the
      correct long-range order of copies of longer DNA sequences, and thus of
      any genes they represent.

      The 17,000 clones used by the Berkeley Lab BDGP group are actual
      stretches of DNA replicated in Escherichia coli bacteria and known
      as "bacterial artificial chromosomes" (BACs). Each BAC
      accurately represents a discrete stretch of the genome, and the map marks
      each BAC with at least one unique "sequence-tagged site" (STS)
      -- ideally with two or more such sites.

      With probes tailored to each sequence-tagged site, an STS can be found
      wherever it occurs in a random collection of clones; 1,923 of these
      markers, spaced roughly every 50,000 bases, were used to build the BDGP's
      final map. By matching these sites among overlapping clones, sets of
      clones of different lengths can be lined up with one another and
      eventually "tiled" along the entire length of each chromosome.
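
      The tiling step described above can be illustrated with a small Python
      sketch. It is only a toy under stated assumptions, not the BDGP's actual
      software: the clone names and STS labels are invented, and it assumes the
      simplest case, in which each clone shares markers only with its immediate
      neighbours so that the overlaps form one unbranched chain.

```python
from itertools import combinations


def overlap_graph(clones):
    """Link every pair of clones that share at least one STS marker."""
    graph = {name: set() for name in clones}
    for a, b in combinations(clones, 2):
        if clones[a] & clones[b]:  # non-empty intersection = shared marker
            graph[a].add(b)
            graph[b].add(a)
    return graph


def tile(clones):
    """Order clones into one path by walking shared-marker links.

    Assumption (hypothetical simplification): the overlaps form a single
    unbranched chain, i.e. each clone overlaps only its direct neighbours.
    """
    graph = overlap_graph(clones)
    # The two clones with a single neighbour are the ends of the chain.
    ends = sorted(name for name, nbrs in graph.items() if len(nbrs) == 1)
    if len(ends) != 2:
        raise ValueError("clones do not form a single unbranched chain")

    path = [ends[0]]
    seen = {ends[0]}
    while True:
        unvisited = [n for n in graph[path[-1]] if n not in seen]
        if not unvisited:
            break
        path.append(unvisited[0])
        seen.add(unvisited[0])

    if len(path) != len(clones):
        raise ValueError("some clones could not be placed on the path")
    return path


# Invented example data: four BAC clones and the STS markers found on each.
clones = {
    "BAC_C": {"sts4", "sts5"},
    "BAC_A": {"sts1", "sts2"},
    "BAC_D": {"sts5", "sts6"},
    "BAC_B": {"sts2", "sts3", "sts4"},
}

print(tile(clones))  # ['BAC_A', 'BAC_B', 'BAC_C', 'BAC_D']
```

      The sketch recovers the order of the clones from marker content alone;
      which end is reported first is arbitrary (here it is chosen
      alphabetically). Real clone libraries overlap far more deeply, so this
      simple chain walk would not be sufficient for them.
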
      The result is called an STS content map.

      When their map of chromosomes 2 and 3 was complete -- along with maps
      of the much shorter chromosomes 4 and X produced by others -- the BDGP
      researchers made a "rough draft" sequence of the genome with
      shallow coverage (less than two clones deep), which served as a check
      against Celera's whole-genome shotgun sequence and is being used to close
      some of its 1,600 gaps.

      The multi-author Science paper summarizing the genome-sequence
      results describes the importance of the BDGP's methods and results:
      "The BAC end-sequences and STS content map provided the most
      informative long-range sequence-based information at the lowest
      cost." Increasing the number of BAC end-sequences is the authors'
      primary recommendation for future genome-sequencing projects.

      D. melanogaster's importance goes far beyond serving as a trial run
      for the mouse and human genomes, however. In a set of 289 human genes
      implicated in diseases, 177 are closely similar to fruit fly genes,
      including genes that play roles in cancers, in kidney, blood, and
      neurological diseases, and in metabolic and immune-system disorders.
      "The underlying biochemistry of fruit flies and humans is remarkably
      similar," says Hoskins, "so fruit flies can provide clues to
      understanding human diseases caused by defective genes."

      "We can find human tumor-suppressing genes in flies easier than we
      can in the mouse," says Susan Celniker, pointing out that experiments
      can be done using fly genes that would be impractical (or unthinkable)
      using human subjects. Especially useful is the identification of networks
      of other genes that interact with known disease genes, and their
      associated metabolic pathways. The implications for medicine are
      immediate.

      To this end the BDGP researchers are continuing to refine the D.
      melanogaster sequence already produced. "We're going to push it
      to high accuracy," says Hoskins.

      The Human Genome Project aims for an accuracy of one error in 10,000
      base pairs -- roughly the number of errors that could arise from normal
      human variation -- but the Drosophila workers intend to achieve an
      accuracy of one error in 100,000, a goal partly made possible by the
      limited variation among inbred laboratory flies.

      Meanwhile the completed genome of D. melanogaster reported in
      the 24 March 2000 issue of Science stands as a milestone in the
      history of genetic research and a doorway to new methods of progress. For
      one thing, Celera is now attempting to apply the whole-genome shotgunning
      technique to the much larger human genome.

      "Celera did a great job," says Hoskins, "and the project
      worked better than anyone could have hoped. Now, the BDGP and the rest of
      the community of 5,000 Drosophila researchers around the world can
      begin projects to understand how the genome sequence controls the
      biology."

      The Berkeley Lab is a U.S. Department of Energy national laboratory
      located in Berkeley, California. It conducts unclassified scientific
      research and is managed by the University of California.