
Plant Genome Projects

Renate Schmidt, Max Planck Institute, Cologne, Germany


A genome project aims to discover all genes and their function in a particular species. Plant genome projects have focused on a few model organisms that are characterized by small genomes or their amenability to genetic studies.
Secondary article

Article Contents
• Introduction
• Arabidopsis and Rice
• Maize
• Comparative Genomics

Introduction
Plant genomes have been extensively studied at the cytological, genetic and molecular level, and large differences in chromosome number, genome size and ploidy level have been found in the plant kingdom. Detailed chromosome maps and whole-genome sequences are prerequisites for a detailed structural description of a genome. To assess gene function, expression studies and mutational analyses are required. Comparative approaches serve as a tool to transfer the knowledge and resources that have been assembled for model plants, especially Arabidopsis, rice and maize, to a wide variety of species.

Arabidopsis and Rice
Arabidopsis thaliana, a small dicotyledonous crucifer with a genome of approximately 125 Mbp, has one of the smallest known genomes in higher plants. The short life cycle, small stature and large number of progeny make it ideally suited for genetic and mutational analysis. The important crop plant rice, Oryza sativa, with 430 Mbp, contains one of the smallest genomes known for monocotyledonous plants. Due to their small genome sizes, Arabidopsis and rice have been chosen for detailed genome analyses.

Mutants have been identified in rice and Arabidopsis and many of them have been placed on genetic maps. Likewise, very extensive molecular marker maps have been assembled for the five Arabidopsis and the 12 rice chromosomes (Harushima et al., 1998). Restriction fragment length polymorphism (RFLP) markers constitute a particularly versatile molecular marker system. Genomic or complementary DNA (cDNA) clones are used to detect polymorphisms at restriction sites in the DNA of individuals in genomic blot hybridizations. Analysis of progeny derived from crosses of individual plants that are polymorphic at the DNA level with different markers results in the construction of genetic linkage maps.

For expressed sequence tag (EST) projects, thousands of partial sequences of randomly chosen cDNA clones are generated. These projects provide a catalogue of transcribed sequences for an organism in a cost-efficient manner. In such a collection, many genes will be represented multiple times. This is exploited to construct consensus sequences of ESTs that are longer than individual ESTs; in some cases even the full-length transcript of a gene can be reconstructed. Over 110 000 EST sequences have been generated for Arabidopsis thaliana and approximately 70 000 for rice. It has been estimated that these tags represent approximately 30–60% of all genes in these species. Particularly in rice, many EST sequences have been used as RFLP markers, thus enabling genes to be anchored on the genetic map.

In order to study a genome in detail, it is necessary to establish clone libraries covering the entire genome. To do this, high-molecular weight plant DNA is cloned into bacterial (BAC) or yeast artificial chromosome (YAC) vectors and the resulting artificial chromosomes carrying inserts of plant DNA spanning 100 kbp or more are maintained alongside the bacterial or yeast chromosomes. Unique coordinates are assigned to any particular clone in a library, ensuring that all mapping results obtained with these libraries can be directly compared.

Chromosome maps based on artificial chromosome clones can be generated by applying a map-based approach and a fingerprinting strategy. Using the map-based approach, molecular-mapped markers are used as probes to identify and anchor clones on the genetic map. Given a large number of markers and sufficiently redundant libraries with large DNA inserts, clones will be identified that span two or more markers. Those clones sharing the same markers can be assembled into a set of contiguous clones (contig). This strategy has been successfully used to generate YAC contigs spanning large areas of the Arabidopsis and rice genomes and maps covering entire chromosome arms have been assembled (Figure 1) (Schmidt et al., 1995).

For a fingerprinting strategy, all clones of a library are digested with appropriate restriction endonucleases and the resulting fragments separated on gels. The sizes of all fragments are estimated and recorded. A comparison of the fragment patterns for all different clones reveals overlapping clones. According to these results, the clones are arranged into contigs. This strategy has been successfully applied to generate large BAC contigs for the Arabidopsis and rice genomes (Marra et al., 1999). Anchoring of the resulting contigs on the genetic map is performed using molecular-mapped markers as probes to identify corresponding BAC clones (Figure 1).
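The fingerprinting strategy for contig construction can be illustrated with a small sketch. It is a deliberate simplification and not the procedure used in the cited mapping projects: clone names, fragment sizes, the size tolerance and the overlap threshold are all hypothetical, and real fingerprinting software scores overlaps statistically rather than with a fixed cut-off.

```python
from itertools import combinations


def shared_fraction(frags_a, frags_b, tolerance=5):
    """Fraction of the smaller fingerprint whose fragments match the other one.

    Two fragments match if their estimated sizes differ by at most
    `tolerance` (an assumed gel sizing error). Each fragment is used once.
    """
    remaining = sorted(frags_b)
    matches = 0
    for size in sorted(frags_a):
        for i, other in enumerate(remaining):
            if abs(size - other) <= tolerance:
                matches += 1
                del remaining[i]
                break
    return matches / min(len(frags_a), len(frags_b))


def build_contigs(fingerprints, min_shared=0.5, tolerance=5):
    """Group clones into contigs via chains of pairwise overlaps (union-find)."""
    parent = {clone: clone for clone in fingerprints}

    def find(clone):
        while parent[clone] != clone:
            parent[clone] = parent[parent[clone]]
            clone = parent[clone]
        return clone

    for a, b in combinations(fingerprints, 2):
        if shared_fraction(fingerprints[a], fingerprints[b], tolerance) >= min_shared:
            parent[find(a)] = find(b)

    contigs = {}
    for clone in fingerprints:
        contigs.setdefault(find(clone), []).append(clone)
    return sorted(sorted(members) for members in contigs.values())


# Hypothetical restriction fragment sizes (bp) for four BAC clones.
example_fingerprints = {
    "BAC1": [1200, 3400, 5600, 800],
    "BAC2": [3400, 5600, 800, 2100],
    "BAC3": [800, 2100, 4500, 9900],
    "BAC4": [15000, 6300, 250],
}

example_contigs = build_contigs(example_fingerprints)
print(example_contigs)
```

In this toy data set BAC1 and BAC3 share too few fragments to be joined directly, but both overlap BAC2, so all three fall into one contig while BAC4 remains a singleton.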

ENCYCLOPEDIA OF LIFE SCIENCES / © 2001 Macmillan Publishers Ltd, Nature Publishing Group / www.els.net

High-density BAC contig maps are currently providing the templates for large-scale sequencing of genomes (Figure 1). BAC clones sharing minimal overlaps are chosen for sequencing experiments. The identification of genes in the resulting genomic sequence must rely largely on predictions using suitable computer algorithms. Comparisons of genomic sequence with sequence databases, e.g. the EST databases, are equally important for annotation of gene sequences. Sequencing of the rice genome began in 1998 and is due to conclude in 2003. For Arabidopsis, more than 90% of the genomic sequence has been determined; only highly repetitive regions, such as centromeric and nucleolar organizing regions, have not been sequenced. Analysis of large contiguous segments of sequence has shown that a gene is found on average every 4–5 kbp in the Arabidopsis genome. Clusters of related genes are frequently observed. Approximately half of the predicted genes have sufficient similarity to assign a putative function to the encoded proteins. Retroelement-like sequences are rarely found interspersed with genes and the majority of repetitive sequences are found clustered in the centromeric regions of the chromosomes (Lin et al., 1999; Mayer et al., 1999).

The Arabidopsis and rice genome projects have resulted in the construction of densely populated genetic maps and detailed clone contig maps that are highly integrated (Figure 1). This facilitates gene isolation procedures using map-based approaches, as has been documented by the successful completion of positional cloning experiments. Now the emphasis is shifting towards the functional analysis of genes. Insertional mutagenesis systems (see below) as well as global transcript analysis via high-density arrays of oligonucleotides or cDNAs will play a crucial role.

Figure 1 Components of a genome project. On the left, a schematic representation of a molecular marker map (A) for a chromosome is shown. Molecular markers are depicted as horizontal lines. Yeast artificial chromosome (YAC) clones, shown as long vertical black lines, are anchored by molecular markers onto the genetic map. The marker content of all clones in a particular region of the chromosome is assessed to build large contigs (B). High-density bacterial artificial chromosome (BAC) contigs (C), displayed as short vertical black lines, are established by fingerprinting techniques. Molecular markers anchor the contigs on the genetic map. For a completely sequenced BAC (D), predicted genes are shown as open boxes in the right part of the figure for the Watson and Crick strands. A gene corresponding to one of the genes of the sequenced BAC has been used to anchor YAC clones and for genetic mapping experiments; it thus provides a direct link between the genetic map, the physical map and the genomic sequence, indicated by the dashed line.

Maize
The maize genome, with approximately 2500 Mbp, is much larger than the Arabidopsis or rice genomes. Furthermore, it is of polyploid origin, with most genes being present in duplicate. The number of genes has been estimated to be between 40 000 and 50 000. Retrotransposon-like sequences are frequently found interspersed with gene sequences (SanMiguel et al., 1996). The large proportion of these repetitive elements explains the large genome size.

A large genome size poses special problems in genome analysis studies: although most of the described techniques can be applied to large as well as small genomes, the labour and cost involved are far higher for large genomes. Hence, the complete genome sequence is not the immediate goal of the maize genome project; rather, large-scale EST projects are carried out to describe most of the maize genes in a cost-efficient manner. A similar situation exists for other cereals such as wheat and barley. In parallel, high-density genetic maps are being assembled for maize and clone contig construction is under way, with the aim of establishing highly integrated genetic and physical maps. Furthermore, information on the smaller rice and sorghum genomes is being exploited to further genome studies in maize (see below).

Because maize possesses a number of extremely well-characterized mobile genetic elements (transposons), it is highly amenable to gene function studies, as insertion mutants may readily be generated. Transposon mutagenesis (Figure 2a) has led to the discovery of many important genes in maize, and large collections of lines carrying transposon insertions in different positions of the genome have been generated. Using a reverse genetics approach, insertion mutants in virtually any gene of interest can now be identified (Figure 2b).


Figure 2 Insertion mutagenesis. (a) Gene inactivation upon insertion of a mobile element. The gene is displayed as an open box with a black rectangle corresponding to the mobile element. Upon insertion of the element, transcription of the gene indicated by an arrow can no longer proceed. If the nature of the element is known, the inactivated gene can be isolated. (b) The rationale of a reverse genetic approach. Only if a known element is inserted in the gene of interest, shown as an open box, can a DNA fragment be generated if primers specific for the gene of interest and the transposon, indicated by a black rectangle, are used for amplification of DNA sequences by polymerase chain reaction (PCR). Arrows correspond to primer sequences. The resulting PCR product is shown as a hatched bar.
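The rationale of the reverse genetic screen in Figure 2b can be mimicked with a toy model. The sketch below assumes that every line is described simply by the genomic positions of its insertions and that a PCR product forms only when an insertion lies within an amplifiable distance downstream of the gene-specific primer. Line names, coordinates and the 3000 bp limit are hypothetical, and element orientation and the offset of the element-specific primer are ignored.

```python
def pcr_product_length(gene_primer_pos, insertion_pos, max_amplicon=3000):
    """Return the expected product length, or None if no product can form.

    A product is assumed to form only if the insertion lies downstream of
    the gene-specific primer and no further away than `max_amplicon` bp.
    """
    distance = insertion_pos - gene_primer_pos
    if 0 < distance <= max_amplicon:
        return distance
    return None


def screen_lines(lines, gene_primer_pos, max_amplicon=3000):
    """Map each line that yields a PCR product to the length of that product."""
    hits = {}
    for name, insertions in lines.items():
        for pos in insertions:
            length = pcr_product_length(gene_primer_pos, pos, max_amplicon)
            if length is not None:
                hits[name] = length
                break
    return hits


# Hypothetical insertion positions (bp) in three mutagenized lines.
example_lines = {
    "line_A": [120000, 501200],
    "line_B": [498000],
    "line_C": [750000],
}

# Hypothetical position of the gene-specific primer.
example_hits = screen_lines(example_lines, gene_primer_pos=500000)
print(example_hits)
```

Only line_A carries an insertion close enough to the gene-specific primer, so it is the only line expected to give a product in this model.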

Since maize transposons are also active when introduced into other plants, they are widely exploited to generate insertion mutants in other species, such as Arabidopsis and rice. In these species, too, gene functions are being elucidated systematically using reverse genetics.

Comparative Genomics
Genomes from closely related plant species show considerable DNA homology and often have the same number of chromosomes. Comparative genetic mapping experiments have been carried out to address the question of whether the order of genes is also conserved between species. Genetic linkage maps based on molecular markers have been assembled for many different plant species.

Often, cDNA or gene sequences are used as RFLP markers. Their high conservation during evolution allows the use of RFLP markers not only in the species they are derived from but also in closely related species. If the same set of molecular markers is used for genetic mapping experiments in different species, the resulting genetic maps can then be compared. Such experiments have been carried out in several different plant families and extensive conservation of marker repertoire and order has been found. A colinear order of markers has been observed for segments of chromosomes or in some cases even entire chromosomes (Figure 3a). Almost complete genome colinearity has been observed in the Solanaceae family. Differences in marker organization on the 12 tomato and potato chromosomes can be explained by five chromosomal inversions (Tanksley et al., 1992). For the grass family (Poaceae), a high degree of genome conservation has been established even between species which diverged as long as 60 million years ago and which differ considerably in genome size. A close examination of data for the rice, maize and wheat genomes has led to the conclusion that the organization of all different chromosomes in the grass family can be described by a limited number of evolutionarily related chromosome segments. This concept allows multiple alignments of chromosome maps. Comparison of the genetic maps of Arabidopsis thaliana and Brassica has also revealed many colinear segments.

To obtain information about areas of the chromosomes lying between molecular markers it is necessary to clone and characterize these regions in detail. Using comparative physical mapping and sequencing experiments, the conservation of local gene order, orientation and spacing is addressed. So far, only a few studies have been carried out in the Poaceae and Brassicaceae families and more data are needed to draw firm conclusions about the degree of micro-colinearity between genomes.
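Comparing the order of shared markers on two genetic maps, as in comparative genetic mapping, can be sketched as follows. This is an illustrative toy and not the method of the cited studies: marker names are hypothetical, each marker is assumed to map to a single position in each species, and map distances are ignored so that only marker order is compared.

```python
def colinear_blocks(order_a, order_b):
    """Split the markers of map A into runs that are adjacent on map B.

    Each run is labelled "colinear" (same order on both maps), "inverted"
    (reversed order on map B) or "single" (an isolated marker).
    Only markers present on both maps are considered.
    """
    in_a = set(order_a)
    shared_b = [m for m in order_b if m in in_a]
    rank_b = {m: i for i, m in enumerate(shared_b)}
    shared_a = [m for m in order_a if m in rank_b]

    labels = {1: "colinear", -1: "inverted", 0: "single"}
    blocks = []
    current, step = [], 0
    for marker in shared_a:
        if current:
            delta = rank_b[marker] - rank_b[current[-1]]
            # Extend the run only if the marker is the direct neighbour on
            # map B and the direction of the run does not change.
            if delta in (1, -1) and step in (0, delta):
                current.append(marker)
                step = delta
                continue
            blocks.append((current, labels[step]))
        current, step = [marker], 0
    if current:
        blocks.append((current, labels[step]))
    return blocks


# Hypothetical marker orders on homologous chromosomes of species A and B.
map_a = ["m1", "m2", "m3", "m4", "m5", "m6", "m7"]
map_b = ["m1", "m2", "m5", "m4", "m3", "m6", "m7"]

example_blocks = colinear_blocks(map_a, map_b)
for markers, label in example_blocks:
    print(label, markers)
```

With these example maps the segment m3 to m5 is reported as inverted, flanked by two colinear segments, which corresponds to the kind of inversion event highlighted in Figure 3a.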
Through the analysis of small homologous chromosome segments in different species, the same genes are discovered. Furthermore, the order of genes is generally maintained, although their spacing can vary widely in different species. High sequence homologies are confined to exon sequences and repetitive elements are not conserved. Disruptions of the overall conservation of local gene order have also been found. Copy number changes of genes are observed, as well as insertions or deletions of gene sequences (Figure 3b) (Tikhonov et al., 1999).

Their small sizes have made the Arabidopsis and rice genomes the best-studied plant genomes. Extensive genome colinearity at the genetic and molecular level has been established for closely related plant species. Therefore, genome analysis studies in the Poaceae and Brassicaceae families may benefit from the transfer of information and resources that are assembled in the framework of the Arabidopsis and rice genome projects.
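Exceptions to microcolinearity of the kind shown in Figure 3b, namely copy number changes and genes missing from one species, can be listed with a few lines of code once the gene content of two homologous segments is known. The gene names below are hypothetical, and the sketch assumes that homologous genes have already been given the same name in both species.

```python
from collections import Counter


def copy_number_differences(genes_a, genes_b):
    """Return {gene: (copies in A, copies in B)} for genes whose counts differ."""
    count_a, count_b = Counter(genes_a), Counter(genes_b)
    differences = {}
    for gene in sorted(set(count_a) | set(count_b)):
        # A Counter returns 0 for genes that are absent from a segment.
        if count_a[gene] != count_b[gene]:
            differences[gene] = (count_a[gene], count_b[gene])
    return differences


# Hypothetical gene content of homologous segments in species A and B:
# g2 is duplicated in species A and g4 is not found in species A.
segment_a = ["g1", "g2", "g2", "g3", "g5"]
segment_b = ["g1", "g2", "g3", "g4", "g5"]

example_differences = copy_number_differences(segment_a, segment_b)
print(example_differences)
```

Genes with identical copy numbers in both segments are omitted, so the result lists only the departures from strict colinearity of gene content.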

Figure 3 Patterns of genome colinearity. (a) Comparative genetic mapping. Using the same set of molecular markers for genetic mapping experiments in different species (A and B) allows the alignment of chromosome maps. Molecular markers are depicted as horizontal bars and markers which have been mapped in both species to the chromosomes shown are connected by lines. The chromosome of species A shares colinear segments with two chromosomes of species B, indicating translocation events. An example of an inversion of a chromosomal segment is highlighted as a box. (b) Microcolinearity. A comparison of homologous genomic regions derived from two different species (A and B) at the sequence level is shown. Gene sequences, black and white boxes, are highly conserved as indicated by grey bars. In contrast, intergenic sequences do not show significant homologies. The gene marked by an asterisk is duplicated in species A, whereas the gene indicated by an arrow is not found in species A.

References
Harushima Y, Yano M, Shomura A et al. (1998) A high-density rice genetic linkage map with 2275 markers using a single F2 population. Genetics 148: 479–494.
Lin X, Kaul S, Rounsley S et al. (1999) Sequence and analysis of chromosome 2 of the plant Arabidopsis thaliana. Nature 402: 761–768.
Marra M, Kucaba T, Sekhon M et al. (1999) A map for sequence analysis of the Arabidopsis thaliana genome. Nature Genetics 22: 265–270.
Mayer K, Schüller C, Wambutt R et al. (1999) Sequence and analysis of chromosome 4 of the plant Arabidopsis thaliana. Nature 402: 769–777.
SanMiguel P, Tikhonov A, Jin Y-K et al. (1996) Nested retrotransposons in the intergenic regions of the maize genome. Science 274: 765–768.
Schmidt R, West J, Love K et al. (1995) Physical map and organization of Arabidopsis thaliana chromosome 4. Science 270: 480–483.
Tanksley SD, Ganal MW, Prince JP et al. (1992) High density molecular linkage maps of the tomato and potato genomes. Genetics 132: 1141–1160.
Tikhonov AP, SanMiguel PJ, Nakajima Y et al. (1999) Colinearity and its exceptions in orthologous adh regions of maize and sorghum. Proceedings of the National Academy of Sciences of the USA 96: 7409–7414.

Further Reading
Chang C and Meyerowitz EM (1991) Plant genome studies: restriction fragment length polymorphism and chromosome mapping information. Current Opinion in Genetics and Development 1: 112–118.
Dean C and Schmidt R (1995) Plant genomes: a current molecular description. Annual Review of Plant Physiology and Plant Molecular Biology 46: 395–418.
Gale MD and Devos KM (1998) Comparative genetics in the grasses. Proceedings of the National Academy of Sciences of the USA 95: 1971–1974.
Rounsley S, Lin X and Ketchum KA (1998) Large-scale sequencing of plant genomes. Current Opinion in Plant Biology 1: 136–141.
Sasaki T and Burr B (2000) International genome sequencing project: the effort to completely sequence the rice genome. Current Opinion in Plant Biology 3: 138–141.
Schmidt R (1998) The Arabidopsis thaliana genome: towards a complete physical map. In: Anderson M and Roberts JA (eds) Arabidopsis, Annual Plant Reviews, vol. I, pp. 1–30. Sheffield: Sheffield Academic Press.
Walbot V (2000) Saturation mutagenesis using maize transposons. Current Opinion in Plant Biology 3: 103–107.

Database information accessible via the World Wide Web:
Maize DB [http://www.agron.missouri.edu/]
Rice Genome Research Program [http://rgp.dna.arc.go.jp/]
The Arabidopsis information resource [http://www.arabidopsis.org/]
