Recommendations
Drosophila remains a key model
organism for the development and application of genomics and proteomics.
Its unique advantages include the exquisite genetic approaches and resources
that have accumulated over almost a century. These are continually being
augmented and improved upon as genetic technology advances. Drosophila
is also host to a large and diverse community of researchers representing
virtually every area of investigation of biological function. Drosophila
is also a system that can be studied at relatively low cost and that has
biological complexity comparable to that of a mammal. Many organ systems
in mammals have well-conserved homologues in Drosophila, and Drosophila
research has already led the way in providing new insights into cancer,
neurodegenerative diseases, behavior, aging, complex multifactorial inheritance,
and development. In addition, the past years of investment in Drosophila
research and the completion of the genomic sequence, anticipated soon, will
catalyze still more outstanding research and insights into normal and
disease mechanisms.
The Drosophila community is
in broad agreement that the opportunities and challenges in Drosophila
genomics and genetics can be met successfully only with the targeted development
of shared genetic resources including the complete genomic sequence, a
set of full-length unigene cDNA sequences, libraries of transposon mutants
in all genes, adequate databases, stock centers, complete genomic expression
analyses, polymorphism databases, and related goals. We believe that the
successful development of these resources will benefit all biomedical
researchers and catalyze a rapid wave of discovery with significant applications
to human biology and disease.
In anticipation of this meeting
the Drosophila board came together and discussed the needs of the community
and drew up a document containing their thoughts on this topic. This "white
paper" was then distributed to the community through FlyBase and the Drosophila
electronic bulletin board, and comments were solicited. The resulting list of
priorities was then discussed at the non-mammalian model organisms workshop
and amended. The resource needs of the community have been divided into
two major sets. Those in the first set are so firmly interdependent that
it was difficult, if not impossible, to assign individual priorities to
each of the items; as a group, however, they should have the highest priority.
The second group consists of needs that were perceived as important but
were of a somewhat lower standing. The following list represents the needs
that would have the greatest impact on the community and would be the
most valuable and cost-effective for the research enterprise.
TOP LEVEL PRIORITIES
Finish the
Drosophila genome sequence by the end of 2000. The sequence should
also be well annotated. Achieving this crucial goal is consistent with
the Five Year Plan for the Genome Project and will require full funding
of the already approved grant for sequencing the Drosophila genome ($44,000,000
for the three-year period from 12/1/98 to 11/30/01). Finishing sooner
would greatly accelerate important biomedical research progress and
avoid lost opportunity costs.
Completion
of high quality sequences of full-length cDNA clones corresponding to
all genes in the genomic sequence (and their major alternative splice
forms) and the assembly of a complete "unigene set" of all major expressed
transcripts. The cDNAs should be made available in appropriate vectors
in anticipation of their use in proteomic analyses. This "rosetta stone"
will be crucial to fully comprehend the range of proteins encoded in
the Drosophila genome. This goal can likely be accomplished in 2 years
for $8,000,000.
Genetic resources.
This top priority comprises three interrelated goals, the expansion
of the stock centers and FlyBase being absolutely indispensable to the
entire Drosophila genome project.
Generation
of a collection of P-element or other transposable element insertions
in all genes. We estimate that 50,000 different lines will
be initially accumulated and could be generated using P-element or
other transposable elements that are also capable of being used to
generate controlled misexpression. The final collection would number
20,000 lines and would be deposited in the stock center.
This library of mutants in all genes will be an indispensable resource
to all workers in the community and can be generated for $2,000,000.
In addition to the mutant collection we propose the generation of
a set of well-characterized transgenic lines expressing GAL4 in a
large variety of different cell types, tissues, and developmental
stages. These lines will be crucial to a great many labs for targeted
misexpression, or limited restoration of expression, experiments with
appropriate transposable element mutants or other UAS type constructs.
It is imperative that these lines have well-documented expression
patterns and that these patterns and the stocks are generally available
to the research community. The cost of generation and characterization
of these lines would be $150,000/yr for 5 years.
The
collection of these new lines as well as the accumulation of novel
transgenic stocks from other sources will require a significant expansion
of the capacity of the stock centers. This goal will
require expansion of the physical space and personnel to care for
and send out the many genetic strains to the community. We envision
that a national capacity in the range of 30,000 different stocks is
a necessary minimum to accommodate the anticipated development of
mutants in all genes. This goal will cost approximately $1,000,000
per year beyond current expenditures.
The
database capacities available to the community must be significantly
expanded. The current torrent of sequences requires annotation,
linkage to the genetic maps and phenotypes, and links to databases
of diversity. Currently FlyBase is serving its community very well.
However, better accessibility to phenotypic information must be achieved.
We also need to forge better links to the databases of the other
communities and ensure that the data housed in FlyBase is available
to researchers in other systems. This objective can be accomplished
for $3,000,000 per year rising to $3,500,000 per year over a 5 year
period. It must be stressed, however, that continued support beyond
that period is absolutely necessary. This funding would support
the continuation of FlyBase but not its expansion into new areas such
as the incorporation of new data types and research into new computational
methods; such expansion would require an additional $500,000 per year.
SECOND LEVEL PRIORITIES
In addition to the
absolute top priorities listed above, the Drosophila community has reached
general agreement on a second set of desiderata to exploit genomic and
proteomic approaches to further our understanding of complex biological
processes and the genetic basis of human disease. These are so nearly
equal in importance that it would be arbitrary and inappropriate to
assign ranks or priorities to these opportunities. The specific structural
and functional resources comprising this second set of goals are as
follows.
Gene product
expression. This is a dual goal consisting of the:
Determination
of expression patterns of all genes and coding sequences.
These patterns would be determined at different developmental stages,
in different tissues, and under different environmental conditions.
This work will be facilitated by the availability of the full-length
unigene set of cDNAs cloned into standard expression vectors, setting the
stage for Drosophila proteomics, as well as through the application
of antibodies, epitope tags, or other appropriate reagents for intracellular
protein localization. This work could be done for approximately $2,500,000
over a period of several years.
Development
and application of high resolution, high sensitivity measures of protein
expression and covalent modification in normal and mutant organisms.
The availability of a complete Drosophila sequence coupled to rapid
improvements in mass spectrometry-based protein sequencing and separation
technologies will allow unparalleled insights into cellular and biochemical
changes that will occur in various mutants. This emerging technology
is poised to be applied to a model genome prior to attempting such
efforts on vertebrates. Depending upon rates of technology development,
this goal could be achieved for $1,000,000 to $2,000,000.
Determination
of the sequence of Drosophila virilis. The sequence
of this related species will be crucial for the interpretation of the
Drosophila melanogaster sequence and for helping to infer
function and identify those characteristics of the Drosophila genome
that are conserved and therefore likely to be important for function.
Once the D. melanogaster sequence is nearly complete, determining this
sequence can be done relatively economically, since the D. melanogaster
sequence can help identify high-priority regions of the D. virilis
genome for sequence determination and interpretation. This goal can
be achieved by starting with an investment of $2,000,000 per year to
sequence regions of greatest interest in BAC clones, rising to $4,000,000
per year as the effort picks up speed. Eventually all of the D. virilis
genome should be sequenced when it is cost-effective to do so, but the
Drosophila community is convinced that there is tremendous insight to
be gained by assigning high priority to the immediate sequencing of a few carefully
chosen, highly targeted regions.
Creation of
a standard set of cell culture models derived from various Drosophila
tissues and developmental stages. High-quality cell culture models in
which to conduct biochemical and cell biological analysis of mutants will
be a crucial resource for elucidating the function of the various genes.
Working to develop cell lines suitable for homologous recombination
(for example, analogs of embryonic stem cells for targeted mutagenesis)
and reintroduction into the organism is also an important component
of this goal. These goals will require both the establishment of new
permanent cell lines and the development of methods for readily preparing
primary cultures from Drosophila organs and tissues. These goals could
be achieved with an initial investment of $500,000 to $1,000,000 per
year.
ADDITIONAL CONSIDERATIONS
Drosophila research would
be greatly enhanced by the successful achievement of several other goals,
two of which are generally applicable to all genome projects and two
of which are specific to Drosophila. The two general goals are:
The development
of sequencing technology costing an order of magnitude less than that
currently practiced. This would make many highly meritorious genomic
sequencing projects feasible, such as the complete sequencing of D.
virilis and D. simulans and perhaps other species. It would
also enable a quantum jump in the application of genomic approaches
to evolutionary and population genetics.
The creation
of appropriate source(s) of molecular materials for acquisition by researchers
at a reasonable cost without restriction on their use.
The two Drosophila-specific
goals are:
The development
of a system of gene replacement by homologous recombination, which
would rival the importance of P-element germline transformation in the
genetic manipulation of Drosophila.
The development
of an efficient, cost-effective, high-throughput system for cryopreservation
of any stage(s) of development or cell type(s) from which the organisms
can be resuscitated, for the purpose of long-term storage of genetic
resources and relief of constantly increasing demands on the stock centers.