        Science Beat: What major technical advances have allowed this
        new way of working?

        Hawkins: It's all been due to the Human Genome Project. We built
        these very large, industrial-scale processors because we were desperate 
        to sequence the human genome, but now we're sitting on the ability to 
        sequence hundreds of bases every second. And that pretty much doubles 
        every six months. We work closely with outside technology developers, 
        and the spin-off for us is that we get their new machines and new reagents 
        cheaper and faster.

        There are two big areas we're focusing on today. One is functional genomics;
        the other is the computing area.

        Rokhsar: Comparative genomics is like the Rosetta stone. You see
        these ancient texts, and you know they must mean something, but how do 
        you know what they mean?

        Science Beat: You mean by comparing those languages with languages
        you can understand?

        Rokhsar: Yes, but the difference between the Rosetta stone and
        the genomic situation is that these aren't dead languages. They're living 
        inside the bodies of these animals.

        Now that we can generate all this sequence, it puts a lot more pressure
        on us to assemble it. Our JAZZ assembler [a computer program] was developed 
        here over about a year. Its claim to fame is that it's the only assembler 
        in the public domain, and it takes into account aspects of the data
        that other assemblers simply ignore, for example the fact that we sequence 
        from both ends of a piece of DNA. Which is, again, a technology that JGI 
        has employed from the very beginning.

        Science Beat: And when you have that assembly, you have to figure
        out how it all functions.

        Richardson: Trying to figure out the functions of genes includes
        looking at things like alternative splicing. Remarkably, there's a single 
        gene of Drosophila that has 38,000 different splice variants! It's 
        a very large gene with many exons [coding sequences], and it directs neuronal 
        growth. By shuffling these exons around, and making different mRNA, which 
        leads to different proteins, it's able to direct those neurons to different 
        parts of the body. You can say that's one gene, but it's performing many 
        different functions.

        Functional genomics also includes looking at gene expression: what are
        the cues that turn genes on. And we're also getting into an area that 
        is now called proteomics, which is the next step after functional
        genomics: actually getting right down to the biochemical function
        of the proteins.

        One of the first things we're doing, because we're interested in regulatory
        translations, is looking at how expressed proteins bind to DNA, and getting 
        an idea of what genes they might affect.

        Science Beat: What are the next steps for these comparative-genome
        projects?

        Hawkins: The squirt is just the start. We have other genomes in
        mind that will help us compare with some of the networks that we've found 
        in the sea squirt. One of the next genomes that we're seriously considering 
        is the frog, Xenopus tropicalis.

        When you look at a sea squirt you don't think of human development, but
        when you look at a frog embryo, there's a lot of similarity between a 
        frog embryo and a human embryo. So if you begin to see something in the 
        sea squirt, and then begin to see it in the frog, and then someone else 
        starts to see it in the mouse, you can say, well this is probably going 
        to be present in the human.

        Richardson: In Ciona, we're trying to get an idea of which
        genes function in the patterning of an embryo and the development of the 
        notochord [a precursor of the spinal cord, found in some primitive animals
        and in vertebrate embryos]. And those will have homologs, or genes that 
        are similar, in every other organism that has a similar body plan. In 
        a more primitive organism the same gene may carry out the functions of, 
        say, two, three, or four genes in a more specialized organism. Because 
        oftentimes those genes may have been duplicated or even reduplicated. 
        
 Rokhsar: In computation, the next hurdle is allowing people to 
        make use of this comparative data, and all the other data. As people
        use those elements in their own experiments, they're going to bring information 
        back that's much more diverse than just the sequence of As, Cs, Ts, and 
        Gs [the nucleotide bases of DNA whose order determines genetic coding]. 
        You want to be able to interrogate the whole system and ask questions 
        at the meta-level, and not just be tied to finding a particular gene.
 
         
        Hawkins: With sequence production, before now, only the "bravest
        and fittest" were able to interpret this data. We want to make it 
        so even your grandmother can use it.

        As we look forward to producing other kinds of data, the burden is on
        us to produce tools to allow people to get the most out of the data sets. 