
feature

A vision for the future of genomics research
A blueprint for the genomic era.

Francis S. Collins, Eric D. Green, Alan E. Guttmacher and Mark S. Guyer on behalf of the US National Human Genome Research Institute*

*Endorsed by the National Advisory Council for Human Genome Research, whose members are Vickie Yates Brown, David R. Burgess, Wylie Burke, Ronald W. Davis, William M. Gelbart, Eric T. Juengst, Bronya J. Keats, Raju Kucherlapati, Richard P. Lifton, Kim J. Nickerson, Maynard V. Olson, Janet D. Rowley, Robert Tepper, Robert H. Waterston and Tadataka Yamada.

NATURE | VOL 422 | 24 APRIL 2003 | www.nature.com/nature © 2003 Nature Publishing Group

The completion of a high-quality, comprehensive sequence of the human genome, in this fiftieth anniversary year of the discovery of the double-helical structure of DNA, is a landmark event. The genomic era is now a reality.

In contemplating a vision for the future of genomics research, it is appropriate to consider the remarkable path that has brought us here. The rollfold (Figure 1) shows a timeline of landmark accomplishments in genetics and genomics, beginning with Gregor Mendel's discovery of the laws of heredity¹ and their rediscovery in the early days of the twentieth century. Recognition of DNA as the hereditary material², determination of its structure³, elucidation of the genetic code⁴, development of recombinant DNA technologies⁵,⁶, and establishment of increasingly automatable methods for DNA sequencing⁷⁻¹⁰ set the stage for the Human Genome Project (HGP) to begin in 1990 (see also www.nature.com/nature/DNA50). Thanks to the vision of the original planners, and the creativity and determination of a legion of talented scientists who decided to make this project their overarching focus, all of the initial objectives of the HGP have now been achieved at least two years ahead of expectation, and a revolution in biological research has begun.

The project's new research strategies and experimental technologies have generated a steady stream of ever-larger and more complex genomic data sets that have poured into public databases and have transformed the study of virtually all life processes. The genomic approach of technology development and large-scale generation of community resource data sets has introduced an important new dimension into biological and biomedical research. Interwoven advances in genetics, comparative genomics, high-throughput biochemistry and bioinformatics are providing biologists with a markedly improved repertoire of research tools that will allow the functioning of organisms in health and disease to be analysed and comprehended at an unprecedented level of molecular detail. Genome sequences, the bounded sets of information that guide biological development and function, lie at the heart of this revolution. In short, genomics has become a central and cohesive discipline of biomedical research.

The practical consequences of the emergence of this new field are widely apparent. Identification of the genes responsible for human mendelian diseases, once a herculean task requiring large research teams, many years of hard work, and an uncertain outcome, can now be routinely accomplished in a few weeks by a single graduate student with access to DNA samples and associated phenotypes, an Internet connection to the public genome databases, a thermal cycler and a DNA-sequencing machine. With the recent publication of a draft sequence of the mouse genome¹¹, identification of the mutations underlying a vast number of interesting mouse phenotypes has similarly been greatly simplified. Comparison of the human and mouse sequences shows that the proportion of the mammalian genome under evolutionary selection is more than twice that previously assumed.

Our ability to explore genome function is increasing in specificity as each subsequent genome is sequenced. Microarray technologies have catapulted many laboratories from studying the expression of one or two genes in a month to studying the expression of tens of thousands of genes in a single afternoon¹². Clinical opportunities for gene-based pre-symptomatic prediction of illness and adverse drug response are emerging at a rapid pace, and the therapeutic promise of genomics has ushered in an exciting phase of expansion and exploration in the commercial sector¹³. The investment of the HGP in studying the ethical, legal and social implications of these scientific advances has created a talented cohort of scholars in ethics, law, social science, clinical research, theology and public policy, and has already resulted in substantial increases in public awareness and the introduction of significant (but still incomplete) protections against misuses such as genetic discrimination (see www.genome.gov/PolicyEthics).

These accomplishments fulfil the expansive vision articulated in the 1988 report of the National Research Council, Mapping and Sequencing the Human Genome¹⁴. The successful completion of the HGP this year thus represents an opportunity to look forward and offer a blueprint for the future of genomics research over the next several years.

The vision presented here addresses a different world from that reflected in earlier plans published in 1990, 1993 and 1998 (refs 15–17). Those documents addressed the goals of the 1988 report, defining detailed paths towards the development of genome-analysis technologies, the physical and genetic mapping of genomes, and the sequencing of model organism genomes and, ultimately, the human genome. Now, with the effective completion of these goals, we offer a broader and still more ambitious vision, appropriate for the true dawning of the genomic era. The challenge is to capitalize on the immense potential of the HGP to improve human health and well-being.

The articulation of a new vision is an opportunity to explore transformative new approaches to achieve health benefits. Although genome-based analysis methods are rapidly permeating biomedical research, the challenge of establishing robust paths from genomic information to improved human health remains immense. Current efforts to meet this challenge are largely organized around the study of specific diseases, as exemplified by the missions of the disease-oriented institutes at the US National Institutes of Health (NIH, www.nih.gov) and numerous national and international governmental and charitable organizations that support medical research. The National Human Genome Research Institute (NHGRI), in budget terms a rather small (less than 2%) component of the NIH, will work closely with all these organizations in exploring and supporting these biomedical research capabilities. In addition, we envision a more direct role for both the extramural and intramural programmes of the NHGRI in bringing a genomic approach to the translation of genomic sequence information into health benefits.

The NHGRI brings two unique assets to this challenge. First, it has close ties to a scientific community whose direct role over the past 13 years in bringing about the genomic revolution provides great familiarity with its potential to transform biomedical research. Second, the NHGRI's long-standing mission, to investigate the broadest possible implications of genomics, allows unique flexibility to explore the whole spectrum of human health and disease from the fresh perspective of genome science. By engaging the energetic and interdisciplinary genomics-research community more directly in health-related research and by exploiting the NHGRI's ability to pursue opportunities across all areas of human biology, the institute seeks to participate directly in translating the promises of the HGP into improved human health.

To fully achieve this goal, the NHGRI must also continue in its vigorous support of another of its vital missions: the coupling of its scientific research programme with research into the social consequences of increased availability of new genetic technologies and information. Translating the success of the HGP into medical advances intensifies the need for proactive efforts to ensure that benefits are maximized and harms minimized in the many dimensions of human experience.

A reader's guide
The vision for genomics research detailed here is the outcome of almost two years of intense discussions with hundreds of scientists and members of the public, in more than a dozen workshops and numerous individual consultations (see www.genome.gov/About/Planning). The vision is formulated into three major themes (genomics to biology, genomics to health, and genomics to society) and six crosscutting elements.

We envisage the themes as three floors of a building, firmly resting on the foundation of the HGP (Figure 2). For each theme, we present a series of grand challenges, in the spirit of the proposals put forward for mathematics by David Hilbert at the turn of the twentieth century¹⁸. These grand challenges are intended to be bold, ambitious research targets for the scientific community. Some can be planned on specific timescales, others are less amenable to that level of precision. We list the grand challenges in an order that makes logical sense, not representing priority. The challenges are broad in sweep, not parochial; some can be led by the NHGRI alone, whereas others will be best pursued in partnership with other organizations. Below, we clarify areas in which the NHGRI intends to play a leading role.

Fig. 2 The future of genomics rests on the foundation of the Human Genome Project.

BOX 1 Resources
One of the key and distinctive objectives of the Human Genome Project (HGP) has been the generation of large, publicly available, comprehensive sets of reagents and data (scientific resources or infrastructure) that, along with other new, powerful technologies, comprise a toolkit for genomics-based research. Genomic maps and sequences are the most obvious examples. Others include databases of sequence variation, clone libraries and collections of anonymous cell lines. The continued generation of such resources is critical, in particular:
• Genome sequences of key mammals, vertebrates, chordates, and invertebrates
• Comprehensive reference sets of coding sequences from key species in various formats, for example, full-length cDNA sequences and corresponding clones, oligonucleotide primers, and microarrays
• Comprehensive collections of knockouts and knock-downs of all genes in selected animals to accelerate the development of models of disease
• Comprehensive reference sets of proteins from key species in various formats, for example in expression vectors, with affinity tags and spotted onto protein chips
• Comprehensive sets of protein affinity reagents
• Databases that integrate sequences with curated information and other large data sets, as well as tools for effective mining of the data
• Cohort populations for studies designed to identify genetic contributors to health and to assess the effect of individual gene variants on disease risk, including a healthy cohort
• Large libraries of small molecules, together with robotic methods to screen them and access to medicinal chemistry for follow-up, to provide investigators easy and affordable access to these tools
The six critically important crosscutting elements are relevant to all three thematic areas. They are: resources (Box 1); technology development (Box 2); computational biology (Box 3); training (Box 4); ethical, legal and social implications (ELSI, Box 5); and education (Box 6). We also stress the critical importance of early, unfettered access to genomic data in achieving maximum public benefit. Finally, we propose a series of 'quantum leaps', achievements that would lead to substantial advances in genomics research and its applications to medicine. Some of these may seem overly bold, but no laws of physics need to be violated to achieve them. Such leaps would have profound implications, just as the dreams of the mid-1980s about the complete sequence of the human genome have been realized in the accomplishments now being celebrated.

I Genomics to biology
Elucidating the structure and function of genomes

The broadly available genome sequences of human and a select set of additional organisms represent foundational information for biology and biomedicine. Embedded within this as-yet poorly understood code are the genetic instructions for the entire repertoire of cellular components, knowledge of which is needed to unravel the complexities of biological systems. Elucidating the structure of genomes and identifying the function of the myriad encoded elements will allow connections to be made between genomics and biology and will, in turn, accelerate the exploration of all realms of the biological sciences.

For this, new conceptual and technological approaches will be needed to:
• Develop a comprehensive and comprehensible catalogue of all of the components encoded in the human genome.
• Determine how the genome-encoded components function in an integrated manner to perform cellular and organismal functions.
• Understand how genomes change and take on new functional roles.

Grand Challenge I-1 Comprehensively identify the structural and functional components encoded in the human genome

Although DNA is relatively simple and well understood chemically, the human genome's structure is extraordinarily complex and its function is poorly understood. Only 1–2% of its bases encode proteins⁷, and the full complement of protein-coding sequences still remains to be established. A roughly equivalent amount of the non-coding portion of the genome is under active selection¹¹, suggesting that it is also functionally important, yet vanishingly little is known about it. It probably contains the bulk of the regulatory information controlling the expression of the approximately 30,000 protein-coding genes, and myriad other functional elements, such as non-protein-coding genes and the sequence determinants of chromosome dynamics. Even less is known about the function of the roughly half of the genome that consists of highly repetitive sequences or of the remaining non-coding, non-repetitive DNA.

The next phase of genomics is to catalogue, characterize and comprehend the entire set of functional elements encoded in the human and other genomes. Compiling this genome 'parts list' will be an immense challenge. Well-known classes of functional elements, such as protein-coding sequences, still cannot be accurately predicted from sequence information alone. Other types of known functional sequences, such as genetic regulatory elements, are even less well understood; undoubtedly new types remain to be defined, so we must be ready to investigate novel, perhaps unexpected, ways in which DNA sequence can confer function. Similarly, a better understanding of epigenetic changes (for example, methylation and chromatin remodelling) is needed to comprehend the full repertoire of ways in which DNA can encode information.

Comparison of genome sequences from evolutionarily diverse species has emerged as a powerful tool for identifying functionally important genomic elements. Initial analyses of available vertebrate genome sequences⁷,¹¹,¹⁹ have revealed many previously undiscovered protein-coding sequences. Mammal-to-mammal sequence comparisons have revealed large numbers of homologies in non-coding regions¹¹, few of which can be defined in functional terms. Further comparisons of sequences derived from multiple species, especially those occupying distinct evolutionary positions, will lead to significant refinements in our understanding of the functional importance of conserved sequences²⁰. Thus, the generation of additional genome sequences from several well-chosen species is crucial to the functional characterization of the human genome (Box 1). The generation of such large sequence data sets will benefit from further advances in sequencing technology that yield significant cost reductions (Box 2). The study of sequence variation within species will also be important in defining the functional nature of some sequences (see Grand Challenge I-3).

BOX 2 Technology development
The Human Genome Project was aided by several breakthrough technological developments, including Sanger DNA sequencing and its automation, DNA-based genetic markers, large-insert cloning systems and the polymerase chain reaction. During the project, these methods were scaled up and made more efficient by evolutionary advances, such as automation and miniaturization. New technologies, including capillary-based sequencing and methods for genotyping single-nucleotide polymorphisms, have recently been introduced, leading to further improvements in capacity for genomic analyses. Even newer approaches, such as nanotechnology and microfluidics, are being developed, and hold great promise, but further advances are still needed. Some examples are:
• Sequencing and genotyping technologies to reduce costs further and increase access to a wider range of investigators
• Identification and validation of functional elements that do not encode protein
• In vivo, real-time monitoring of gene expression and the localization, specificity, modification and activity/kinetics of gene products in all relevant cell types
• Modulation of expression of all gene products using, for example, large-scale mutagenesis, small-molecule inhibitors and knock-down approaches (such as RNA-mediated inhibition)
• Monitoring of the absolute abundance of any protein (including membrane proteins, proteins at low abundance and all modified forms) in any cell
• Improved imaging methods that allow non-invasive molecular phenotyping
• Correlating genetic variation to human health and disease using haplotype information or comprehensive variation information
• Laboratory-based phenotyping, including the use of protein affinity reagents, proteomic approaches and analysis of gene expression
• Linking molecular profiles to biology, particularly pathway biology to disease
Effective identification and analysis of functional genomic elements will require increasingly powerful computational capabilities, including new approaches for tackling ever-growing and increasingly complex data sets and a suitably robust computational infrastructure for housing, accessing and analysing those data sets (Box 3). In parallel, investigators will need to become increasingly adept in dealing with this treasure trove of new information (Box 4). As a better understanding of genome function is gained, refined computational tools for de novo prediction of the identity and behaviour of functional elements should emerge²¹.

Complementing the computational detection of functional elements will be the generation of additional experimental data by high-throughput methodologies. One example is the production of full-length complementary DNA (cDNA) sequences (see, for example, mgc.nci.nih.gov and www.fruitfly.org/EST/full.shtml). Major challenges inherent in programmes to discover genes are the experimental identification and validation of alternate splice forms and messenger RNAs expressed in a highly restricted fashion. Even more challenging is the experimental validation of functional elements that do not encode protein (for example, regulatory regions and non-coding RNA sequences). High-throughput approaches to identify them (Box 2) will be needed to generate the experimental data that will be necessary to develop, confirm and enhance computational methods for detecting functional elements in genomes.

Because current technologies cannot yet identify all functional elements, there is a need for a phased approach in which new methodologies are developed, tested on a pilot scale and finally applied to the entire human genome. Along these lines, the NHGRI recently launched the Encyclopedia of DNA Elements (ENCODE) Project (www.genome.gov/Pages/Research/ENCODE) to identify all the functional elements in the human genome. In a pilot project, systematic strategies for identifying all functionally important genomic elements will be developed and tested using a selected 1% of the human genome. Parallel projects involving well-studied model organisms, for example, yeast, nematode and fruitfly, are ongoing. The lessons learned will serve as the basis for implementing a broader programme for the entire human genome.

Grand Challenge I-2 Elucidate the organization of genetic networks and protein pathways and establish how they contribute to cellular and organismal phenotypes

Genes and gene products do not function independently, but participate in complex, interconnected pathways, networks and molecular systems that, taken together, give rise to the workings of cells, tissues, organs and organisms. Defining these systems and determining their properties and interactions is crucial to understanding how biological systems function. Yet these systems are far more complex than any problem that molecular biology, genetics or genomics has yet approached. On the basis of previous experience, one effective path will begin with the study of relatively simple model organisms²², such as bacteria and yeast, and then extend the early findings to more complex organisms, such as mouse and human. Alternatively, focusing on a few well-characterized systems in mammals will be a useful test of the approach (see, for example, www.signaling-gateway.org).

Understanding biological pathways, networks and molecular systems will require information from several levels. At the genetic level, the architecture of regulatory interactions will need to be identified in different cell types, requiring, among other things, methods for simultaneously monitoring the expression of all genes in a cell¹². At the gene-product level, similar techniques that allow in vivo, real-time measurement of protein expression, localization, modification and activity/kinetics will be needed (Box 2). It will be important to develop, refine and scale up techniques that modulate gene expression, such as conventional gene-knockout methods²³, newer knock-down approaches²⁴ and small-molecule inhibitors²⁵ to establish the temporal and cellular expression pattern of individual proteins and to determine the functions of those proteins. This is a key first step towards assigning all genes and their products to functional pathways.

The ability to monitor all proteins in a cell simultaneously would profoundly improve our ability to understand protein pathways and systems biology. A critical step towards gaining a complete understanding of systems biology will be to take an accurate census of the proteins present in particular cell types under different physiological conditions. This is becoming possible in some model systems, such as microorganisms²⁶. It will be a major challenge to catalogue proteins present in low abundance or in membranes. Determining the absolute abundance of each protein, including all modified forms, will be an important next step. A complete interaction map of the proteins in a cell, and their cellular locations, will serve as an atlas for the biological and medical explorations of cellular metabolism²⁷ (see www.nrcam.uchc.edu, for example). These and other related areas constitute the developing field of proteomics.

BOX 3 Computational biology
Computational methods have become intrinsic to modern biological research, and their importance can only increase as large-scale methods for data generation become more prominent, as the amount and complexity of the data increase, and as the questions being addressed become more sophisticated. All future biomedical research will integrate computational and experimental components. New computational capabilities will enable the generation of hypotheses and stimulate the development of experimental approaches to test them. The resulting experimental data will, in turn, be used to generate more refined models that will improve overall understanding and increase opportunities for application to disease. The areas of computational biology critical to the future of genomics research include:
• New approaches to solving problems, such as the identification of different features in a DNA sequence, the analysis of gene expression and regulation, the elucidation of protein structure and protein–protein interactions, the determination of the relationship between genotype and phenotype, and the identification of the patterns of genetic variation in populations and the processes that produced those patterns
• Reusable software modules to facilitate interoperability
• Methods to elucidate the effects of environmental (non-genetic) factors and of gene–environment interactions on health and disease
• New ontologies to describe different data types
• Improved database technologies to facilitate the integration and visualization of different data types, for example, information about pathways, protein structure, gene variation, chemical inhibition and clinical information/phenotypes
• Improved knowledge management systems and the standardization of data sets to allow the coalescence of knowledge across disciplines
Establishing a true understanding of how sphere of animal,plant and microbial species.
organized molecular pathways and networks A complete elucidation of genome function
give rise to normal and pathological cellular requires a parallel understanding of the
and organismal phenotypes will require sequence differences across species and the
more than large,experimentally derived data fundamental processes that have sculpted
sets. Once again, computational investiga- their genomes into the modern-day forms.
tion will be essential (Box 3), and there will The study of inter-species sequence com-
be a greatly increased need for the collection, parisons is important for identifying func-
storage and display of the data in robust tional elements in the genome (see Grand
databases. By modelling specific pathways Challenge I-1). Beyond this illuminating
and networks, predicting how they affect role, determining the sequence differences
phenotype, testing hypotheses derived from between species will provide insight into
these models and refining the models based the distinct anatomical, physiological and
on new experimental data, it should be developmental features of different organ-
possible to understand more completely the isms, will help to define the genetic basis for
difference between a bag of molecules and a speciation and will facilitate the characteri-
functioning biological system. zation of mutational processes. This last
point deserves particular attention, because
Grand Challenge I-3 Develop a detailed mutation both drives long-term evolution-
understanding of the heritable variation in ary change and is the underlying cause of
the human genome inherited disease. The recent finding that
Genetics seeks to correlate variation in DNA mutation rates vary widely across the mam-
sequence with phenotypic differences malian genome11 raises numerous questions
(traits). The greatest advances in human processes in normal and disease states. An about the molecular basis for these evolu-
genetics have been made for traits associated enhanced ability to incorporate information tionary changes.At present,our understand-
with variation in a single gene. But most about genetic variation into human genetic ing of DNA mutation and repair, including
phenotypes, including common diseases studies would usher in a new era for investigat- the important role of environmental factors,
and variable responses to pharmacological ing the genetic bases of human disease and is limited.
agents, have a more complex origin, involv- drug response (see Grand Challenge II-1). Genomics will provide the ability to sub-
ing the interplay between multiple genetic stantively advance insight into evolutionary
factors (genes and their products) and non- Grand Challenge I-4 Understand variation, which will, in turn, yield new
genetic factors (environmental influences). evolutionary variation across species and insights into the dynamic nature of genomes
Unravelling such complexity will require the mechanisms underlying it in a broader evolutionary framework.
both a complete description of the genetic The genome is a dynamic structure, continu-
variation in the human genome and the ally subjected to modification by the forces Grand Challenge I-5 Develop policy
development of analytical tools for using of evolution. The genomic variation seen options that facilitate the widespread use
that information to understand the genetic in humans represents only a small glimpse of genome information in both research
basis of disease. through the larger window of evolution, and clinical settings
Establishing a catalogue of all common where hundreds of millions of years of trial- Realization of the opportunities provided by
variants in the human population, including and-error efforts have created todays bio- genomics depends on effective access to the
single-nucleotide polymorphisms (SNPs),
small deletions and insertions, and other
structural differences, began in earnest BOX 4 Training
several years ago. Many SNPs have been Meeting the scientific, medical and their specific research efforts), at a collaborative
identified28, and most are publicly available social/ethical challenges now facing level (researchers will need to be able to
(www.ncbi.nlm.nih.gov/SNP). A public genomics will require scientists, participate effectively in interdisciplinary research
collaboration, the International HapMap clinicians and scholars with the skills collaborations that bring biology together with
Project (www.genome.gov/Pages/Research/ to understand biological systems and many other disciplines) and at the disciplinary
HapMap), was formed in 2002 to character- to use that information effectively for the benefit level (new disciplines will need to emerge at the
ize the patterns of linkage disequilibrium of humankind. Adequate training capacity will be interfaces between the traditional disciplines).
and haplotypes across the human genome required to address the following needs: Different perspectives Individuals from
and to identify subsets of SNPs that capture Computational skills As biomedical research minority or disadvantaged populations are
most of the information about these patterns is becoming increasingly data intensive, significantly under-represented as both
of genetic variation to enable large-scale computational capability is increasingly becoming researchers and participants in genomics
genetic association studies.To reach fruition, a critical skill. research. This regrettable circumstance deprives
such studies need more robust experimental Interdisciplinary skills Although a good start the field of the best and brightest from all
(Box 2) and computational (Box 3) methods has been made, expanded interactions will be backgrounds, narrows the field of questions
that use this new knowledge of human required between the sciences (biology, computer asked, can lessen sensitivity to cultural concerns
haplotype structure29. science, physics, mathematics, statistics, in implementing research protocols, and
A comprehensive understanding of genetic chemistry and engineering), between the basic compromises the overall effectiveness of the
variation, both in humans and in model and the clinical sciences, and between the life research. Genomics can learn from successful
organisms, would facilitate studies to establish sciences, the social sciences and the humanities. efforts in training individuals from under-
relationships between genotype and biologi- Such interactions will be needed at the individual represented populations in other areas of science
cal function. The study of particular variants level (scientists, clinicians and scholars will need and health (see, for example,
and how they affect the functioning of specific to be able to bring relevant issues, concerns and www.genome.gov/Pages/Grants/Policies/
proteins and protein pathways will yield capabilities from different disciplines to bear on ActionPlanGuide).
important new insights about physiological
NATURE | VOL 422 | 24 APRIL 2003 | www.nature.com/nature 2003 Nature Publishing Group 839
information (such as data about genes, gene based diagnostic methods for the pre-
variants, haplotypes, protein structures, diction of susceptibility to disease, the
small molecules and computational models) prediction of drug response, the early
by a wide range of potential users, including detection of illness and the accurate
researchers, commercial enterprises, health- molecular classification of disease.
care providers, patients and the public. Develop and deploy methods that
Researchers themselves need maximum catalyse the translation of genomic
access to the data as soon as possible (see information into therapeutic advances.
Data release, below). Use of the information
for the development of therapeutic and Grand Challenge II-1 Develop robust
other products necessarily entails considera- strategies for identifying the genetic
tion of the complex issues of intellectual contributions to disease and drug response
property (for example, patenting and licens- For common diseases, the interplay of multi-
ing) and commercialization. The intellectual ple genes and multiple non-genetic factors,
property practices, laws and regulations that not a single allele, usually dictates disease
affect genomics must adhere to the principle susceptibility and response to treatments.
of maximizing public benefit, but must also Deciphering the role of genes in human
be consistent with more general and longer- health and disease is a formidable problem
established intellectual property principles. for many reasons, including impediments
Further, because genome research is global, to defining biologically valid phenotypes,
international treaties, laws, regulations, challenges in identifying and quantifying
practices, belief systems and cultures also environmental exposures, technological
come into play. obstacles to generating sufficient and useful
Without commercialization, most diag- treatment of disease. The report by the US genotypic information, and the difficulties
nostic and therapeutic advances will not National Research Council that originally of studying humans. Yet this problem can be
reach the clinical setting, where they can envisioned the HGP was explicit in its expec- solved. Vigorous development of cross-
benefit patients. Thus, we need to develop tation that the human genome sequence cutting genomic tools to catalyse advances
policy options for data access and for patent- would lead to improvements in human in understanding the genetics of common
ing, licensing and other intellectual property health, and subsequent five-year plans disease and in pharmacogenomics is needed.
issues to facilitate the dissemination of reaffirmed this view15–17. But how this will Prominent among these will be a detailed
genomics data. happen has been less clearly articulated. haplotype map of the human genome
With the completion of the original goals (see Grand Challenge I-3) that can be used
II Genomics to health of the HGP, the time is right to develop for whole-genome association studies of
Translating genome-based knowledge and apply large-scale genomic strategies all diseases in all populations, as well as
into health benefits to empower improvements in human further advances in sequencing and
The sequencing of the human genome, health, while anticipating and avoiding genotyping technology to make such studies
along with other recent and expected potential harm. feasible (see Quantum leaps, below).
achievements in genomics, provides an Such strategies should enable the research More efficient strategies for detecting
unparalleled opportunity to advance our community to achieve the following: rare alleles involved in common disease
understanding of the role of genetic factors Identify genes and pathways with a are also needed, as the hypothesis that alleles
in human health and disease, to allow role in health and disease, and deter- that increase risk for common diseases are
more precise definition of the non-genetic mine how they interact with environ- themselves common30 will probably not be
factors involved, and to apply this insight mental factors. universally true. Computational and experi-
rapidly to the prevention, diagnosis and Develop, evaluate and apply genome- mental methods to detect gene–gene
and gene–environment interactions, as well
as methods allowing interfacing of a variety
BOX 5 Ethical, legal and social implications (ELSI) of relevant databases, are also required
Today's genomics research and for enhancing the research, rather than viewing (Box 3). By obtaining unbiased assessments
applications rest on more than a such issues as impediments of the relative disease risk that particular
decade of valuable investigation into The continued development of appropriate gene variants contribute, a large longitudinal
their ethical, legal and social and effective genomics research methods and population-based cohort study, with collec-
implications. As the application of policies that promote the highest levels of tion of extensive clinical information and
genomics to health increases along with its science and of protecting human subjects ongoing follow-up, would be profoundly
social impact, it becomes ever more important The establishment of crosscutting tools, valuable to the study of all common diseases
to expand on this work. There is an increasing analogous to the publicly accessible genomic (Box 1). Already, such projects as the
need for focused ELSI research that directly maps and sequence databases that have UK Biobank (www.ukbiobank.ac.uk),
informs policies and practices. One can accelerated other genomics research (examples the Marshfield Clinic's Personalized Medi-
envisage a flowering of translational ELSI of such tools might include searchable databases cine Research Project (www.mfldclin.edu/
research that builds on the knowledge of genomic legislation and policies from around pmrp) and the Estonian Genome Project
gained from prior and forthcoming basic the world, or studies of ELSI aspects of (www.geenivaramu.ee) seek to provide such
ELSI research, which would provide introducing clinical genetic tests) resources. But if the multiple population
knowledge for direct use by researchers, The evaluation of new genetic and genomic groups in the United States and elsewhere in
clinicians, policy-makers and the public. tests and technologies, and effective oversight the world are to benefit fully and fairly from
Examples include: of their implementation, to ensure that only such research (see Grand Challenge II-6), a
The development of models of genomics those with confirmed clinical validity are used for large population-based cohort study that
research that use attention to these ELSI issues patient care includes full representation of minority
populations is also needed.
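As a rough illustration of what an assessment of the relative disease risk contributed by a gene variant might involve, the following minimal Python sketch computes a relative risk and an approximate 95% confidence interval from cohort counts. The function name, the counts and the choice of the standard log-based interval are assumptions made here for illustration only; they are not taken from any of the cohort projects named above, and a real analysis would also have to address confounding, population structure and multiple testing.

```python
import math


def relative_risk(carriers_affected, carriers_total,
                  noncarriers_affected, noncarriers_total, z=1.96):
    """Relative risk of disease for variant carriers versus non-carriers.

    Returns (rr, lower, upper), where lower/upper bound an approximate
    confidence interval built on the log scale (z=1.96 gives about 95%).
    """
    risk_carriers = carriers_affected / carriers_total
    risk_noncarriers = noncarriers_affected / noncarriers_total
    rr = risk_carriers / risk_noncarriers

    # Standard error of log(RR) for two independent binomial proportions.
    se_log_rr = math.sqrt(
        1 / carriers_affected - 1 / carriers_total
        + 1 / noncarriers_affected - 1 / noncarriers_total
    )
    lower = math.exp(math.log(rr) - z * se_log_rr)
    upper = math.exp(math.log(rr) + z * se_log_rr)
    return rr, lower, upper


if __name__ == "__main__":
    # Hypothetical counts: 30 of 1,000 carriers and 20 of 1,000
    # non-carriers develop the disease during follow-up.
    rr, lower, upper = relative_risk(30, 1000, 20, 1000)
    print(f"RR = {rr:.2f}, approx. 95% CI = ({lower:.2f}, {upper:.2f})")
```

With counts this small the interval spans 1, which is one way of seeing why initial studies of modest size can misjudge the risk attached to a variant and why large cohorts with ongoing follow-up are valuable.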
Grand Challenge II-2 Develop strategies developing new tools to detect many diseases
to identify gene variants that contribute to earlier than is currently feasible. Such
good health and resistance to disease sentinel methods might include analysis of
Most human genetic research has tradition- gene expression in circulating leukocytes,
ally focused on identifying genes that pre- proteomic analysis of body fluids, and
dispose to illness. A relatively unexplored, advanced molecular analysis of tissue biop-
but important, area of research focuses on sies. An example would be the analysis of
the role of genetic factors in maintaining gene expression in peripheral blood leuko-
good health. Genomics will facilitate further cytes to predict drug response. A focused
understanding of this aspect of human biol- effort to use a genomic approach to charac-
ogy and allow the identification of gene terize serum proteins exhaustively in health
variants that are important for the mainten- and disease might also be highly rewarding.
ance of health, particularly in the presence of
known environmental risk factors. One use- Grand Challenge II-4 Use new
ful research resource would be a healthy understanding of genes and pathways to
cohort, a large epidemiologically robust develop powerful new therapeutic
group of individuals (Box 1) with unusually approaches to disease
good health, who could be compared with Pharmaceuticals on the market target fewer
cohorts of individuals with diseases and who than 500 human gene products34. Even
could also be intensively studied to reveal though not all of the 30,000 or so human
alleles protective for conditions such as dia- protein-coding genes7 will have products
betes, cancer, heart disease and Alzheimer's targetable for drug development, this sug-
disease. Another promising approach would gests that there is an enormous untapped
be rigorous examination of genetic variants tions, gene expression, protein expression pool of human gene-based targets for thera-
in individuals at high risk for specific dis- and protein modification should allow the peutic intervention. In addition, the new
eases who do not develop them, such as definition of a new molecular taxonomy of understanding of biological pathways pro-
sedentary, obese smokers without heart dis- illness, which would replace our present, vided by genomics (see Grand Challenge I-2)
ease or individuals with HNPCC mutations largely empirical, classification schemes and should contribute even more fundamentally
who do not develop colon cancer. advance both disease prevention and treat- to therapeutic design.
ment. The reclassification of neuromuscular The information needed to determine
Grand Challenge II-3 Develop genome- diseases32 and certain types of cancer33 pro- the therapeutic potential of a gene generally
based approaches to prediction of disease vides striking initial examples, but many overlaps heavily with the information that
susceptibility and drug response, early more such applications are possible. reveals its function. The success of imatinib
detection of illness, and molecular Such a molecular taxonomy would be the mesylate (Gleevec), an inhibitor of the BCR-
taxonomy of disease states basis for the development of better methods ABL tyrosine kinase, in treating chronic
The discovery of variants that affect risk for for the early detection of disease, which often myelogenous leukaemia relied on a detailed
disease could potentially be used in individ- allows more effective and less costly treat- molecular understanding of the disease's
ualized preventive medicine including ments. Genomics and other large-scale genetic cause35. This example offers promise
diet, exercise, lifestyle and pharmaceutical approaches to biology offer the potential for that therapies based on genomic informa-
intervention to maximize the likelihood
of staying well. For example, the discovery of
variants that correlate with successful out- BOX 6 Education
comes of drug therapy, or with unfortunate Marked health improvements from years away. For genomics-based health care to be
side effects, could potentially be rapidly integrating genomics into individual and maximally effective once it is widely feasible, and
translated into clinical practice. Turning this public health care depend on the for members of society to make the best decisions
vision into reality will require the following: effective education of health about the uses of genomics, we must take
(1) unbiased determination of the risk professionals and the public about the advantage now of this unique opportunity to
associated with a particular gene variant, interplay of genetic and environmental factors in increase understanding. Some examples are:
often overestimated in initial studies31; health and disease. Health professionals must be Health professionals vary, both individually and
(2) technological advances to reduce the cost knowledgeable about genomics to use the by discipline, in the amount and type of genomics
of genotyping (Box 2; see Quantum leaps, outcomes of genomics research effectively. The education that they require. So multiple models of
below); (3) research on whether this kind of public must be knowledgeable to make informed effective genomics-related education are needed.
personalized genomic information will decisions about participation in genomics research Print, web and video educational products that
actually alter health behaviours (see Grand and to incorporate the findings of such research the public can consume when actively seeking
Challenge II-5); (4) oversight of the imple- into their own health care. Both groups must be genomic information should be created and made
mentation of genetic tests to ensure that only knowledgeable to engage profitably in discussion easily available.
those with demonstrated clinical validity are and decision-making about the societal The media are crucial sources of information
applied outside of the research setting (Box implications of genomics. about genomics and its societal implications.
5); and (5) education of healthcare profes- Promising models for genomics and Initiatives to provide the media with greater
sionals and the public to be well-informed genetics education exist (see, for example, understanding of genomics are needed.
participants in this new form of preventive www.nchpeg.org), but they must be expanded and High-school students will be both the users of
medicine (Box 6). new models developed. We have entered a unique genomic information and the genomics researchers
The time is right for a focused effort to educable era regarding genomics; health of the future. Especially as they educate all sectors
understand, and potentially to reclassify, all professionals and the public are increasingly of society, high-school educators need information
human illnesses on the basis of detailed mol- interested in learning about genomics, but its and materials about genomics and its implications
ecular characterization. Systematic analyses widespread application to health is still several for society, to use in their classrooms.
of somatic mutations, epigenetic modifica-
tion will be particularly effective. Grand prevention or treatment plan; (3) the indi-
Challenge I-1 describes the functionation of vidual implements that plan; (4) this leads to
the genome, which will increasingly be the improved health; and (5) healthcare costs are
critical first step in the development of new reduced. Scrutiny of these assumptions is
therapeutics. But stimulating basic scientists needed, both to test them and to determine
to approach biomedical problems with a how each step could best be accomplished in
genomic attitude is not enough. A therapeu- different clinical settings.
tic mindset, lacking in much of academic Research is also required that critically
biomedical research and training, must be evaluates new genetic tests and interventions
explicitly encouraged, and tools developed in terms of parameters such as benefits,
and provided for its implementation. access and cost. Such research should be
A particularly promising example of the interdisciplinary and use the tools and
gene-based approach to therapeutics is the expertise of many fields, including
application of chemical genomics25. This genomics, health education, health behav-
strategy uses libraries of small molecules iour research, health outcomes research,
(natural compounds, aptamers or the healthcare delivery analysis, and healthcare
products of combinatorial chemistry) and economics. Some of these fields have histori-
high-throughput screening to advance cally paid little attention to genomics, but
understanding of biological pathways and high-quality research of this sort could pro-
to identify compounds that act as positive vide important guidance in clinical decision-
or negative regulators of individual gene making as the work of several disciplines
products, pathways or cellular phenotypes. has already been helpful in caring for people
Although the pharmaceutical industry with an increased risk of colon cancer as a
applies this approach widely as the first step ments for rare genetic diseases in the next result of mutations in FAP or HNPCC 37.
in drug development, few academic investi- decade. Further, the development of thera-
gators have access to this methodology or are peutic approaches to single-gene disorders Grand Challenge II-6 Develop genome-
familiar with its use. might provide valuable insights into apply- based tools that improve the health of all
Providing such access more broadly, ing genomics to reveal the biology of more Disparities in health status constitute a signif-
through one or more centralized facilities, common disorders and developing more icant global issue, but can genome-based
could lead to the discovery of a host of useful effective treatments for them (in the way approaches to health and disease help to
probes for biological pathways that would that, for example, the search for compounds reduce this problem? Social and other envi-
serve as new reagents for basic research that target the presenilins has led to general ronmental factors are major contributors to
and/or starting points for the development therapeutic strategies for late-onset health disparities; indeed, some would ques-
of new therapeutic agents (the hits from Alzheimer's disease36). tion whether heritable factors have any
such library screens will generally require significant role. But population differences in
medicinal chemistry modifications to yield Grand Challenge II-5 Investigate how allele frequencies for some disease-associated
therapeutically usable compounds). genetic risk information is conveyed in variants could be a contributing factor to cer-
Also needed are new, more powerful clinical settings, how that information tain disparities in health status, so incorporat-
technologies for generating deep molecular influences health strategies and ing this information into preventive and/or
libraries, especially ones tagged to allow the behaviours, and how these affect health public-health strategies would be beneficial.
ready determination of precise molecular outcomes and costs Research is needed to understand the rela-
targets. A centralized database of screening Understanding how genetic factors affect tionship between genomics and health dis-
results should lead to further important health is often cited as a major goal of parities by rigorously evaluating the diverse
biological insights. Generating molecular genomics, on the assumption that applying contributions of socioeconomic status,
probes for exploring the basic biology of such understanding in the clinical setting will culture, discrimination, health behaviours,
health and disease in academic laboratories improve health. But this assumption actually diet, environmental exposures and genetics.
would not supplant the major role of bio- rests on relatively few examples and data, and It is also important to explore applications
pharmaceutical companies in drug develop- more research is needed to provide sufficient of genomics in the improvement of health in
ment, but could contribute to the start of the guidance about how to use genomic informa- the developing world (www3.who.int/whosis/
drug development pipeline. The private tion optimally for improving individual or genomics/genomics_report.cfm), where both
sector would doubtless find many of these public health. human and non-human genomics will play
molecular probes of interest for further Theoretically, the steps by which genetic significant roles. If we take malaria as an exam-
exploration through optimization by medic- risk information would lead to improved ple, a better understanding of human genetic
inal chemistry, target validation, lead com- health are: (1) an individual obtains factors that influence susceptibility and
pound identification, toxicological studies genome-based information about his/her response to the disease, and to the drugs used
and, ultimately, clinical trials. own health risks; (2) the individual uses this to treat it, could have a significant global
Academic pursuit of this first step in drug information to develop an individualized impact. So too could a better understanding of
development could be particularly valuable the malarial parasite itself and of its mosquito

for the many rare mendelian diseases, in vector, which the recently reported genome
which often the gene defect is known but the sequences38,39 should provide. It will be neces-
small market size limits the private sector's sary to determine the appropriate roles of
motivation to shoulder the expense of governmental and non-governmental organi-
effective pharmaceutical development. Such zations, academic institutions, industry and
translational research in academic laborato-
individuals to ensure that genomics produces
ries, combined with incentives such as the clinical benefits for resource-poor nations,
US Orphan Drug Act, could profoundly and is used to produce robust local research
increase the availability of effective treat- expertise.
The non-coding part of the human genome is functionally important, yet little is known about it.
To ensure that genomics research benefits to review new predictive genetic tests prior
all, it will be critical to examine how to marketing (www4.od.nih.gov/oba/sacgt/
genomics-based health care is accessed and reports/oversight_report.pdf). That recom-
used. What are the barriers to equitable mendation has not yet been acted on; mean-
access, and how can they be removed? This is while, numerous websites offering unvalidat-
relevant not only in resource-poor nations, ed genetic tests directly to the public, often
but also in wealthier countries where seg- combined with the sale of nutraceuticals and
ments of society, such as indigenous popula- other products of highly questionable value,
tions, the uninsured, or rural and inner city are proliferating.
communities, have traditionally not received Many issues currently swirl around the
adequate health care. proper conduct of genetic research involving
human subjects, and further work is needed
III Genomics to society to achieve a satisfactory balance between the
Promoting the use of genomics to protection of research participants from
maximize benefits and minimize harms harm and the ability to conduct clinical
Genomics has been at the forefront of giving research that benefits society as a whole. Much
serious attention, through scholarly research effort has gone into developing appropriate
and policy discussions, to the impact of sci- guidelines for the use of stored tissue speci-
ence and technology on society. Although mens (www.georgetown.edu/research/nrcbl/
the major benefits to be realized from nbac/hbm.pdf), for community consultation
genomics are in the area of health, as when conducting genetic research with iden-
described above, genomics can also con- tifiable populations (www.nigms.nih.gov/
tribute to other aspects of society. Just as the news/reports/community_consultation.html
HGP and related developments have guide them to better health, but is deeply #exec), and for the consent of non-examined
spawned new areas of research in basic biol- concerned about potential misuses of that family members when conducting pedigree
ogy and in health, they have also created information (see www.publicagenda.org/ research (www.nih.gov/sigs/bioethics/nih_
opportunities for research on social issues, issues/pcc_detail.cfm?issue_type=medical_ third_party_rec.html), but confusion still
even to the extent of understanding more research&list=7). Topping the list of con- remains for many investigators and institu-
fully how we define ourselves and each other. cerns is the potential for discrimination in tional review boards.
In the next few years, society must not only health insurance and employment. A signifi-
continue to grapple with numerous questions cant amount of research on this issue has limited to the arenas of biology and of health,
raised by genomics, but must also formulate been done40, policy options have been and further research and development of pol-
and implement policies to address many of published41–43, and many US states have icy options is also needed for the many other
them. Unless research provides reliable data now passed anti-discrimination legislation applications of such information. The array
and rigorous approaches on which to base (see www.genome.gov/Pages/PolicyEthics/ of additional users is likely to include the life,
such decisions, those policies will be ill- Leg/StateIns and www.genome.gov/Pages/ disability and long-term care insurance
informed and could potentially compromise PolicyEthics/Leg/StateEmploy). The US industries, the legal system, the military, edu-
us all. To be successful, this research must Equal Employment Opportunity Commis- cational institutions and adoption agencies.
encompass both basic investigations that sion has ruled that the Americans with Dis- Although some of the research informing the
develop conceptual tools and shared vocabu- abilities Act should apply to discrimination medical uses of genomics will be useful in
laries, and more applied, translational based on predictive genetic information44, broader settings, dedicated research outside
projects that use these tools to explore and but the legal status of that construct remains the healthcare sphere is needed to explore the
define appropriate public-policy options that in some doubt. Although an executive order public values that apply to uses of genomics
incorporate diverse points of view. protects US government employees against other than for health care and their relation-
As it has in the past, such research will genetic discrimination, this does not apply to ship to specific contextual applications. For
continue to have important ramifications other workers. Thus, many observers have example, should genetic information on pre-
for all three major themes of the vision pre- concluded that effective federal legislation is disposition to hyperactivity be available in
sented here. We now address research that needed, and the US Congress is currently the future to school officials? Or should
focuses on society itself, more than on biolo- considering such a law. genetic information about behavioural traits
gy or health. Such efforts should enable the Making certain that genetic tests offered to be admissible in criminal or civil proceed-
research community to: the public have established clinical validity ings? Genomics also provides greater oppor-
Analyse the impact of genomics on and usefulness must be a priority for future tunity to understand ancestral origins of
concepts of race, ethnicity, kinship, research and policy making. In the United populations and individuals, which raises
individual and group identity, health, States, the Secretary's Advisory Committee on issues such as whether genetic information
disease and normality for traits and Genetic Testing extensively reviewed this area should be used for defining membership in a
behaviours. and concluded that further oversight is need- minority group.
Define policy options, and their poten- ed, asking the Food and Drug Administration Because uses of genomics outside the
tial consequences, for the use of gen- healthcare setting will involve a significantly

omic information and for the ethical broader community of stakeholders, both
boundaries around genomics research. research and policy development in this area
must involve individuals and organizations
Grand Challenge III-1 Develop policy besides those involved in the medical applica-
options for the uses of genomics in tions of genomics. But many of the same
medical and non-medical settings
perspectives essential to research and policy
Surveys have repeatedly shown that the development for the medical uses of genomics
public is highly interested in the concept are also essential. Both the potential users of
that personal genetic information might non-medical applications of genomics and the
It should be possible to understand the difference between a bag of molecules and a biological system.
public need education to understand better between diverse parties based on an accurate
the nature and limits of genomic information and detailed understanding of the relevant sci-
(Box 6) and to grasp the ethical, legal and social ence and ethical, legal and social factors will
implications of its uses outside health care promote the formulation and implementa-
(Box 5). tion of effective policies. For instance, in repro-
ductive genetic testing, it is crucial to include
Grand Challenge III-2 Understand the perspectives from the disability community.
relationships between genomics, race and Research should explore how different indi-
ethnicity, and the consequences of viduals, cultures and religious traditions view
uncovering these relationships the ethical boundaries for the uses of genomics:
Race is a largely non-biological concept con- for instance, which sets of values determine
founded by misunderstanding and a long attitudes towards the appropriateness of
history of prejudice. The relationship of applying genomics to such areas as reproduc-
genomics to the concepts of race and ethni- tive genetic testing, genetic enhancement and
city has to be considered within complex germline gene transfer.
historical and social contexts.
Most variation in the genome is shared Implementation: the NHGRI's role
between all populations, but certain alleles The vision for the future of genomics
are more frequent in some populations than presented here is broad and deep, and its real-
in others, largely as a result of history and ization will require the efforts of many. Con-
geography. Use of genetic data to define racial tinuation of the extensive collaboration
groups, or of racial categories to classify bio- between scientists and between funding
logical traits, is prone to misinterpretation. sources that characterized the HGP will be
To minimize such misinterpretation, the bio- it is particularly important to gather suffi- essential. Although the NHGRI intends to
logical and sociocultural factors that inter- cient scientifically valid information about participate in all the research areas discussed
relate genetics with constructs of race and genetic and environmental factors to pro- here, it will need to focus its efforts to use its
ethnicity need to be better understood and vide a sound understanding of the contribu- finite resources as effectively as possible. Thus,
communicated within the next few years. tions and interactions between genes and it will take a major role in some areas, actively
This will require research on how differ- environment in these complex phenotypes. collaborate in others, and have only a support-
ent individuals and cultures conceive of race, It is also important that there be robust ing role in yet others. The NHGRI's priorities
ethnicity, group identity and self-identity, research to investigate the implications, for and areas of emphasis will also evolve as mile-
and what role they believe genes or other both individuals and society, of uncovering stones are met and new opportunities arise.
biological factors have. It will also require a any genomic contributions that there may The approach that has characterized
critical examination of how the scientific be to traits and behaviours. The field of genomics and led to the success of the HGP
community understands and uses these con- genomics has a responsibility to consider the an initial focus on technology develop-
cepts in designing research and presenting social implications of research into the ment and feasibility studies, followed by
findings, and of how the media report these. genetic contributions to traits and behav- pilot efforts to learn how to apply new strat-
Also necessary is widespread education iours, perhaps an even greater responsibility egies and technologies efficiently on a larger
about the biological meaning and limita- than in other areas where there is less of a his- scale, and then implementation of full-scale
tions of research findings in this area (Box 6), tory of misunderstanding and stigmatiza- production efforts will continue to be at
and the formulation and adoption of tion.Decisions about research in this area are the heart of the NHGRIs priority-setting
public-policy options that protect against often best made with input from a diverse process. The following are areas of high
genomics-based discrimination or maltreat- group of individuals and organizations. interest, not listed in priority order.
ment (see Grand Challenge III-1).
Grand Challenge III-4 Assess how to Large-scale production of genomic data
Grand Challenge III-3 Understand the define the ethical boundaries for uses of sets The NHGRI will continue to support
consequences of uncovering the genomic genomics genomic sequencing, focusing on the
contributions to human traits and Genetics and genomics can contribute under- genomes of mammals, vertebrates, chord-
behaviours standing to many areas of biology, health and ates and invertebrates; other funders will
Genes influence not only health and disease, life. Some of these human applications are support the determination of additional
but also human traits and behaviours. Sci- controversial,with some members of the pub- genome sequences from microbes and
ence is only beginning to unravel the compli- lic questioning the propriety of their scientific plants.With current technology, the NHGRI
cated pathways that underlie such attributes exploration. Although freedom of scientific could support the determination of as
as handedness, cognition, diurnal rhythms inquiry has been a cardinal feature of human much as 4560 gigabases of genomic DNA
and various behavioural characteristics. Too progress, it is not unbounded. It is important sequence, or the equivalent of 1520 human
often, research in behavioural genetics, such for society to define the appropriate and inap- genomes, over the next five years. But as the
as that regarding sexual orientation or intel- propriate uses of genomics. Conversations cost of sequencing continues to decrease, the
ligence, has been poorly designed and its cost/benefit ratio of sequence generation
findings have been communicated in a way
that oversimplifies and overstates the role of
genetic factors.This has caused serious prob-
lems for those who have been stigmatized by
the suggestion that alleles associated with
what some people perceive as negativephys-
T he time is right to
develop and apply
large-scale genomic
will improve, so that the actual amount of
sequencing done will be greatly affected by
the development of improved sequencing
technology.
The decisions about which genomes to
sequence next will be based on the results of
iological or behavioural traits are more strategies to improve comparative analyses that reveal the ability of
frequent in certain populations. Given this genomic sequences from unexplored phylo-
history and the real potential for recurrence, human health. genetic positions to inform the interpreta-
844 2003 Nature Publishing Group NATURE | VOL 422 | 24 APRIL 2003 | www.nature.com/nature
feature
tion of the human sequence and to provide Molecular probes, including small mol-
other insights. Finally, the degree to which ecules and RNA-mediated interference, for
any new genomic sequence is completed exploring basic biology and disease. Explo-
finished, taken to an advanced draft stage or ration of the feasibility of expanding
lightly sampled will be determined by chemical genomics in the academic and
the use for which the sequence is generated. public sectors, particularly with regard to the
And,of course,the NHGRIs sequencing pro- establishment of one or more centralized
gramme will maintain close contact with, facilities, will be pursued by the NHGRI in
and take account of the plans and output partnership with others.
of, other sequencing programmes, as has
happened throughout the HGP. Databases Another type of community
A second data set ready for production- resource for the biological and biomedical
level effort is the human haplotype map research communities is represented by data-
(HapMap). This project, a collaboration bases (Box 3). But their support represents a
between the NHGRI, many other NIH insti- potentially significant problem. Funding
tutes, and four international partners, is agencies, reflecting the interest of the
scheduled for completion within three years. research community, tend to prefer to use
The outcome of the International HapMap their research funds to support the genera-
Project will significantly shape the future tion of new data, and the ongoing need for
direction of the NHGRIs research efforts in continued and increasing support for the
the area of genetic variation. data archives and robust access to them is
often given less attention. Both the scientific
Pilot-scale efforts The NHGRI has initiated community and the funding agencies must
the ENCODE Project to begin the develop- mination of all sequence-encoded functional recognize that investment in the creation
ment of the human genome parts list. The elements in genomes. and maintenance of effective databases is
first phase will address the application and Proteomics. In the short term, the NHGRI as important a component of research
improvement of existing technologies for expects to focus on the development of funding as data generation. The NHGRI has
the large-scale identification of coding appropriate, scalable technologies for the been a major source of support for several
sequences, transcription units and other comprehensive analysis of proteins and major genetics/genomics-oriented databas-
functional elements for which technology is protein machines in human health and in es, including the Mouse Genome Database
currently available. When the results of the both rare and complex diseases. (www.informatics.jax.org/mgihome/MGD/
ENCODE Project show evidence of efficacy Pathways and networks. As a complement aboutMGD.shtml), the Saccharomyces
and affordability at the pilot scale,considera- to the development of the genome parts list Genome Database (genome-www.stanford.
tion will be given to implementing the and increasingly effective approaches to pro- edu/Saccharomyces), FlyBase (flybase.bio.
appropriate technologies across the entire teome analysis,the NHGRI will encourage the indiana.edu), WormBase (www.wormbase.
human genome. development of new technologies that gener- org) and Online Mendelian Inheritance in
ate a synthetic view of genetic regulatory Man (www.ncbi.nlm.nih.gov/omim). The
Technology development Many areas of networks and interacting protein pathways. NHGRI will continue to be a leader in
critical importance to the realization of the Genetic contributions to health, disease exploring effective solutions to the issues of
genomics-based vision for biomedical and drug response. The NHGRI will place a integrating, displaying and providing access
research require new technological and high priority on creating and applying new to genomic information.
methodological developments before pilots crosscutting genomics tools, technologies
and then large-scale approaches can be and strategies needed to identify the genetic Ethical, legal and social research The
attempted. Recognizing that technology bases of medically relevant phenotypes. NHGRIs ELSI research activities will
development is an expensive and high-risk Research on the genetic contributions to rare increasingly focus on fundamental, widely
undertaking, the NHGRI is nevertheless and common diseases, and to drug response, relevant, societal issues. The community of
committed to supporting and fostering tech- will typically involve biological systems and scholars and researchers working in these
nology development in many of these crucial diseases of primary interest to other NIH social fields, as well as the scope of issues
areas,including the following. institutes and other funding organizations. being explored, need to be expanded. The
DNA sequencing. There is still great Accordingly, the NHGRI expects that its ELSI research community must include
opportunity to reduce the cost and increase involvement in this area of research will individuals from minority and other com-
the throughput of DNA sequencing, and to often be implemented through partnerships munities that may be disproportionately
make rapid, cheap sequencing available and collaborations. The NHGRI is particu- affected by the use or misuse of genetic infor-
more broadly. Radical reduction of sequen- larly interested in stimulating research mation. New mechanisms for promoting
cing costs would lead to very different approaches to the identification of gene dialogue and collaboration between the ELSI
approaches to biomedical research. variants that confer disease resistance and researchers and genomic and clinical
Genetic variation. Improved genotyping other manifestations of good health. researchers need to be developed; such
methods and better mathematical methods examples might include structural rewards
are necessary to make effective use of infor-
mation about the structure of variation in
the human genome for identifying the
genetic contributions to human diseases and
other complex traits.
The genome parts list. Beyond coding
S ociety must
formulate policies
to address many of the
for interdisciplinary research, intensive
summer courses or mini-fellowships for
cross-training, and the creation of centres of
excellence in ELSI studies to allow sustained
interdisciplinary collaboration.

sequences and transcriptional units, new questions raised by Longitudinal population cohort(s) This
computational and experimental approaches promising research resource will be so
are needed to allow the comprehensive deter- genomics. broadly applicable, and will require such
extensive funding that, although the NHGRI might have a supporting role in design and oversight, success will demand the involvement and support of many other funding sources.

Non-genetic factors in health and disease. A consequence of an improved definition of the genetic factors underlying human health and disease will be an improvement in the recognition and definition of the environmental and other non-genetic contributions to those traits. This is another area in which the NHGRI will be involved through the development of new strategies and by forming partnerships.

Use of genomic information to improve health care. The NHGRI will catalyse collaboration between the diverse scholarly disciplines whose joint efforts will be necessary for research on the best ways for patients and healthcare providers to make effective use of personalized genetic information in the improvement of health. The NHGRI will also strive to ensure that research in this area is informed by, and extends knowledge of, the societal implications of genomics.

Improving the health of all people. It will be important for the NHGRI to support research that explores how to ensure that genomic information is used, to the extent that such information is relevant, to reduce global health disparities. That will include a vigorous effort to increase the representation of minorities in the ranks of genomics researchers. But the full solution of the health disparities problem can only come about through a committed and sustained effort by governments, medical systems and society.

Policy development. The NHGRI will continue to help facilitate public-policy development in the area of genetic/genomic science. Effective policy development will require attention to those issues for which it could have the greatest impact on the policy agenda and could help to facilitate genomic science. The NHGRI will also focus on issues that would assist the public in benefiting from genomics, such as privacy of genetic information, access to genetics services, direct-to-consumer/providers marketing, patenting and licensing of genetic information, appropriate treatment of human participants in research, and standards, usefulness and quality in genetic testing.

Data release

An important lesson of the HGP has been the benefit of immediately releasing data from large-scale sequencing projects, as embodied in the Bermuda principles (www.gene.ucl.ac.uk/hugo/bermuda.htm). Some other large-scale data production projects have followed suit (such as those for full-length cDNAs and single-nucleotide polymorphisms), to the benefit of the scientific community. Scientific progress and public benefit will be maximized by early, open and continuing access to large data sets and by ensuring that excellent scientists are attracted to the task of producing more resources of this sort. For this system to continue to work, the producers of community-resource data sets have an obligation to make the results of their efforts rapidly available for free and unrestricted use by the scientific community, and resource users have an obligation to recognize and respect the important contribution made by the scientists who contribute their time and efforts to resource production.

Although these principles have been generally realized in the case of genomic DNA sequencing, they have not been for many other types of community-resource projects (structural biology coordinates or gene expression data, for example). The development of effective systems for achieving the rapid release of data without restrictions and for providing continued widespread access to materials and research tools should be an integral component of the planning and development of new community resources. The scientific community should also develop incentives to support the voluntary release of such data before publication by individual investigators, by appropriately rewarding and protecting the interests of scientists who wish to share their data with the community in such a generous manner.

Scientific progress will be maximized by early, open and continuing access to large data sets.

Quantum leaps

It is interesting to speculate about potential revolutionary technical developments that might enhance research and clinical applications in a fashion that would rewrite entire approaches to biomedicine. The advent of the polymerase chain reaction, large-insert cloning systems and methods for low-cost, high-throughput DNA sequencing are examples of such advances that have already occurred.

During the course of the NHGRI's planning discussions, other ideas were raised about analogous technological leaps that seem so far off as to be almost fictional but which, if they could be achieved, would revolutionize biomedical research and clinical practice.

The following is not intended to be an exhaustive list, but to provoke creative dreaming:
• the ability to determine a genotype at very low cost, allowing an association study in which 2,000 individuals could be screened with about 400,000 genetic markers for $10,000 or less;
• the ability to sequence DNA at costs that are lower by four to five orders of magnitude than the current cost, allowing a human genome to be sequenced for $1,000 or less;
• the ability to synthesize long DNA molecules at high accuracy for $0.01 per base, allowing the synthesis of gene-sized pieces of DNA of any sequence for between $10 and $10,000;
• the ability to determine the methylation status of all the DNA in a single cell; and
• the ability to monitor the state of all proteins in a single cell in a single experiment.

Conclusions

Preparing a vision for the future of genomics research has been both daunting and exhilarating. The willingness of hundreds of experts to volunteer their boldest and best ideas, to step outside their areas of self-interest and to engage in intense debates about opportunities and priorities, has added a richness and audacity to the outcome that was not fully anticipated when the planning process began. To the extent that this article captures the sense of excitement of the new discipline of genomics, it is to their credit. A complete list of the participants in this planning process can be found at www.genome.gov/About/Vision/Acknowledgements.

A final word is appropriate about the breadth of the vision articulated here. A choice had to be made between portraying a broad view of the future of genomics
research and focusing more narrowly on the specific role of the NHGRI. Recognizing that researchers and the public are more interested in the promise of the field than in the funding source responsible, we have focused here on the broad landscape of scientific opportunity. We have, however, identified the areas that are particularly appropriate for leadership by the NHGRI throughout this article. These are generally research areas that are not specific to a particular disease or organ system, but have broader biomedical and/or social implications. Yet even in those instances, the word 'partnership' appears numerous times, intentionally. We expect to have partnerships not only with other public funding sources, such as the other 26 NIH institutes and centres, but also with many other governmental agencies, private foundations and private-sector organizations. Indeed, public–private partnerships, such as the SNP Consortium, the Mouse Sequencing Consortium and the International HapMap Project, provide powerful new models for the generation of public data sets with immediate and far-reaching value. Thus, many of the most exciting opportunities in genomics research cross traditional boundaries of specific disease definitions, classically defined scientific disciplines, funding sources and public versus private enterprise. The new era will flourish best in an environment where such traditional boundaries become ever more porous.

Although the opportunities described here are thought to be highly achievable, the formal initiation of specific programmes will require more detailed analysis. The relative priorities of each component must be addressed in the light of limited resources to support research. The NHGRI plans to release a revised programme announcement and other grant solicitations later this year, providing more specific guidance to extramural researchers about plans for the implementation of this vision. Furthermore, in genomics research, we have learned to expect the unexpected. From past experience, it would be surprising (and rather disappointing) if biological, medical and social contexts did not change in unpredictable ways. That reality requires that this vision be revisited on a regular basis.

In conclusion, the successful completion this month of all of the original goals of the HGP emboldens the launch of a new phase for genomics research, to explore the remarkable landscape of opportunity that now opens up before us. Like Shakespeare, we are inclined to say, 'what's past is prologue' (The Tempest, Act II, Scene 1). If we, like bold architects, can design and build this unprecedented and noble structure, resting on the firm bedrock foundation of the HGP (Figure 2), then the true promise of genomics research for benefiting humankind can be realized.

'Make no little plans; they have no magic to stir men's blood and probably will themselves not be realized. Make big plans; aim high in hope and work, remembering that a noble, logical diagram once recorded will not die, but long after we are gone will be a living thing, asserting itself with ever-growing insistency' (attributed to Daniel Burnham, architect).

Francis S. Collins, Eric D. Green, Alan E. Guttmacher and Mark S. Guyer are at the National Human Genome Research Institute, National Institutes of Health, Bethesda, Maryland 20892, USA.

Acknowledgements

The formulation of this vision could not have happened without the thoughtful and dedicated contributions of a large number of people. The authors were greatly assisted by Kathy Hudson, Elke Jordan, Susan Vasquez, Kris Wetterstrand, Darryl Leja and Robert Nussbaum. A subcommittee of the National Advisory Council for Human Genome Research, including Wylie Burke, William Gelbart, Eric Juengst, Maynard Olson, Robert Tepper and David Valle, provided a critical sounding board for draft versions of this document. We also thank Aravinda Chakravarti, Ellen Wright Clayton, Raynard Kington, Eric Lander, Richard Lifton and Sharon Terry for serving as working-group chairs at the meeting in November 2002 that refined this document. Finally, we thank the hundreds of individuals who participated as workshop planners and/or participants during this 18-month process.

NATURE | VOL 422 | 24 APRIL 2003 | www.nature.com/nature 2003 Nature Publishing Group 847
