
Cell, Vol. 100, 377–386, February 4, 2000, Copyright 2000 by Cell Press

The Complete Sequence of a Heterochromatic Island from a Higher Eukaryote

The Cold Spring Harbor Laboratory, Washington University Genome Sequencing Center, and PE Biosystems Arabidopsis Sequencing Consortium*

Summary

Heterochromatin, constitutively condensed chromosomal material, is widespread among eukaryotes but incompletely characterized at the nucleotide level. We have sequenced and analyzed 2.1 megabases (Mb) of Arabidopsis thaliana chromosome 4 that includes 0.5–0.7 Mb of isolated heterochromatin that resembles the chromosomal knobs described by Barbara McClintock in maize. This isolated region has a low density of expressed genes, low levels of recombination and a low incidence of genetrap insertion. Satellite repeats were absent, but tandem arrays of long repeats and many transposons were found. Methylation of these sequences was dependent on chromatin remodeling. Clustered repeats were associated with condensed chromosomal domains elsewhere. The complete sequence of a heterochromatic island provides an opportunity to study sequence determinants of chromosome condensation.

Introduction

The chromosomes of higher eukaryotes are characterized by the presence of condensed regions of deeply staining material, heterochromatin, that can be differentiated cytologically from the surrounding lightly staining chromosomal regions (Cavalli and Paro, 1998; Holmquist et al., 1998). In Drosophila and maize, heterochromatic regions have been shown to suppress recombination, to deter transposon insertion, and to influence the expression of genes found nearby (Cook and Karpen, 1994; Henikoff, 1998; Dimitri and Junakovic, 1999). In Drosophila, the composition of heterochromatin has been extensively investigated in minichromosomes derived from chromosome 4. Sample sequencing and FISH (fluorescent in situ hybridization) have revealed that much of the minichromosome comprised arrays of short satellite repeats, but pulsed-field gel analysis revealed islands of low complexity DNA (Le et al., 1995). Some of these islands contained intact retrotransposable elements, but none were found to be unique to centromeric regions (Sun et al., 1997). In maize, heterochromatin has been extensively investigated since chromosome staining procedures were sensitive enough to detect them (McClintock, 1929, 1930; McClintock et al., 1981). Heterochromatin is found in large pericentromeric domains up to 100 megabases (Mb) in size as well as at nucleolar organizers. Smaller islands of heterochromatin, known as knobs and chromomeres, are distributed along the chromosome arms, though they are often telomeric. Knobs occur at different positions in different strains of maize and are stably polymorphic, allowing their extensive use as cytogenetic landmarks (Richards and Dawe, 1998). Classically, the correlation between cytological and genetic crossover was established in maize using these landmarks (Creighton and McClintock, 1931). Both interstitial and terminal knobs can vary

* The Cold Spring Harbor Laboratory, Washington University Genome Sequencing Center, and PE Biosystems (CSHL/WUGSC/PEB†) Arabidopsis Sequencing Consortium is comprised of the following researchers: W. Richard McCombie,1 Melissa de la Bastide,1 Kristina Habermann,1 Laurence Parnell,1 Neilay Dedhia,1 Lidia Gnoj,1 Kristin Schutz,1 Emily Huang,1 Lori Spiegel,1 Cristy Yordan,2 Mundeep Sehkon,3 Jennifer Murray,3 Paul Sheet,3 Matt Cordes,3 Jane Threideh,3 Tamberlyn Stoneking,3 Joelle Kalicki,3 Tina Graves,3 Gwen Harmon,3 Jennifer Edwards,3 Phil Latreille,3 Laura Courtney,3 James Cloud,3 Amanda Abbott,3 Kelsi Scott,3 Doug Johnson,3 Pat Minx,3 Dan Bentley,3 Bob Fulton,3 Nancy Miller,3 Tracie Greco,3 Kim Kemp,3 Jason Kramer,3 Lucinda Fulton,3 Elaine Mardis,3 Mike Dante,3 Kym Pepin,3 LaDeana Hillier,3 Joanne Nelson,3 John Spieth,3 Joe Simorowski,2 Bruce May,2 Peter Ma,4 Ray Preston,1 Daniel Vil,1 Lei Hoon See,1 Monica Shekher,1 Anthony Matero,1 Ravi Shah,1 I’Kyori Swaby,1 Andrew O’Shaughnessy,1 Milka Rodriguez,1 Jane Hoffman,1 Sally Till,1 Susan Granat,1 Nadim Shohdy,1 Amy Hasegawa,1 Aliyah Hameed,1 Mohammad Lodhi,1,5 Arthur Johnson,1,6 Ellson Chen,4 Marco Marra,3 Richard K. Wilson,3 and Robert Martienssen2,‡

1 Lita Annenberg Hazen Genome Center
2 Cold Spring Harbor Plant Biology Group
Cold Spring Harbor Laboratory
Cold Spring Harbor, New York 11724
3 Genome Sequencing Center
Washington University School of Medicine
St. Louis, Missouri 63108
4 Applied Biosystems
Foster City, California 94494
5 Axys
La Jolla, California 92037
6 Department of Forestry
North Carolina State University
Raleigh, North Carolina 27695

† Future citations of this work should use the following form: (CSHL/WUGSC/PEB Arabidopsis Sequencing Consortium, 2000).
‡ To whom correspondence should be addressed (e-mail: martiens@cshl.org).

in size, suggesting that some property of the chromosomal location itself, such as the DNA sequence, might be required for chromosome condensation and might vary from one allele to another.

Sample sequencing and end labeling has revealed that heterochromatic knobs in maize are composed of 180 bp and 350 bp tandem repeat arrays that comprise 70% of the sequence, with retrotransposons making up the remaining 30% (Ananiev et al., 1998a, 1998b). It has therefore been proposed that either the repeats or the retrotransposons might be responsible for the heterochromatic properties of these regions. Supernumerary maize B chromosomes have also been extensively analyzed for their heterochromatic content (Alfenito and Birchler, 1993). B chromosome–specific high copy repeats were isolated by sample cloning and shown to be related to knob sequences. By taking advantage of their unusual disjunction at meiosis, it was possible to associate these sequences with euchromatic and heterochromatic regions of the minichromosome (Kaszas and Birchler, 1996). However, no more than a small percentage of the contiguous sequence of a heterochromatic region has yet been determined in any higher eukaryote. Furthermore, the complete sequence of these regions is unlikely to be determined in the near future. In part, this is because long regions of satellite repeats are refractory to complete sequence analysis due to the difficulty in assembling overlapping reads (Rubin, 1998; Bevan et al., 1999). Because these repeats are coextensive with the condensed chromatin, the sequence elements responsible for the formation of heterochromatin cannot be unambiguously assigned.

In the model plant Arabidopsis thaliana, heterochromatin is largely confined to the pericentromeric regions of each of the five chromosomes as well as to the nucleolar organizers at the ends of chromosomes 2 and 4. These regions have been shown to include hundreds of kilobases of tandemly repeated sequences (Copenhaver and Pikaard, 1996; Round et al., 1997). Sample sequencing and FISH have shown that pericentromeric repeats include 180 bp satellite repeats, as well as retrotransposons, most notably from the family known as Athila (Pelissier et al., 1996; Brandes et al., 1997; Heslop-Harrison et al., 1999). These sequences have also proved impossible to assemble into a single contiguous stretch (Lin et al., 1999; Mayer et al., 1999). However, the short arm of chromosome 4 has an isolated region of heterochromatin resembling a maize chromomere or knob (Fransz et al., 2000), and we report here its complete sequence as well as the 2.1 Mb region surrounding it. The knob has a very low gene density, and the genes are interspersed with numerous classes of repetitive sequence. These include retrotransposons and DNA transposons as well as a central domain of tandemly repeated 1950 bp elements. The repeats and transposons are heavily methylated, and their methylation is dependent on chromatin remodeling. We have used genetrap mutagenesis and recombinant inbred lines to show that the entire region is refractory to meiotic recombination as well as to detectable genetrap insertions. Unlike heterochromatic regions in other organisms, this region in Arabidopsis is relatively small and complex, allowing the complete sequence to be determined.

Figure 1. Physical Map of the Heterochromatic Knob on Arabidopsis Chromosome 4
Maps of the 2.1 Mb region from the assembly of YACs (blue) and BACs (green) are aligned with the genetic map derived from recombinant inbreds (left). The heterochromatic knob (right) is shown schematically. It lies within the YAC CIC8B1 (Fransz et al., 2000).

Results

Mapping and Sequencing
The ten chromosome arms of the Arabidopsis genome are almost completely represented by contiguous bacterial artificial chromosome (BAC) clones assembled in order by restriction enzyme fingerprinting and hybridization (Marra et al., 1999; Mozo et al., 1999). Fingerprinting allowed us to select an optimal set of these clones for sequencing the short arm of chromosome 4 (Figure 1), even though this chromosomal region is highly repetitive, which prohibited assembly by hybridization. Each of these BAC clones was then sequenced to a very high level of accuracy, and the resulting sequence was analyzed for gene and repeat content. The sequenced region was placed on the cytogenetic map using five sequenced genetic markers and a physical map constructed from yeast artificial chromosomes (YAC) (Schmidt et al., 1995). In an accompanying paper, Fransz and coworkers describe the use of these YACs as probes for FISH, allowing genetic and cytological regions to be aligned directly (Fransz et al., 2000 [this issue of Cell]). A diagram of the short arm, showing the clones selected for sequencing and salient markers, is shown at http://www.cshl.org/protarab/chrom4map.htm. The region described here is between BACs F2N1 and T1J1 with the knob lying in the center of YAC CIC8B1 (Figure 1).
Arabidopsis Heterochromatin Sequence Analysis
379

Gene Distribution
Three hundred forty-six genes, seven tRNA genes, 88 intact and defective transposons, and 150 other repeats were distributed along the length of the 2.1 Mb region. One hundred eighty-five of the genes could be categorized in functional classes (data not shown) and included some genes that were previously thought unique to animals, such as an expressed homolog of the NMDA ligand-gated ion-channel receptor, found in neurons (N. Kaplan et al., unpublished data, GenBank entry AC002330; Lam et al., 1998), and a homolog of the zinc finger apoptosis inhibitor protein IAP-5. However, these classes were not significantly different from those observed in the chromosome as a whole (Mayer et al., 1999). Although functional classes did not differ significantly between euchromatic and heterochromatic regions (data not shown), considerable differences in gene density were observed (Figure 2). In the euchromatic regions, transposons and other repeats comprised less than 10% of the genes (Figures 2A and 2C). In the heterochromatic region, repeats outnumbered genes by ten to one. In the most condensed portion of the chromosome, there is a stretch of 183,155 bp (from T5L23.27 to T24M8.10) that has no predicted genes and only four hypothetical genes (Figure 3). Unlike predicted genes, hypothetical genes do not match any known gene or transcribed sequence and were only found by gene modeling (see Experimental Procedures). Eight DNA transposons and 21 retrotransposons were found in this stretch, along with a large number of other repeats (Figure 3).

The frequency of expressed genes, as estimated by matches to EST (expressed sequence tag) and cDNA clones, was also significantly biased (Figure 2A). In the distal portion of the arm, the number of genes that matched ESTs (37%) was similar to that reported for the whole chromosome (Mayer et al., 1999). In the proximal half of the arm, only 13% of the genes matched known transcribed sequences, and none of the 24 genes in the central 450 kb of the knob corresponded to ESTs (Figure 3). This bias was not evident in the distribution of hypothetical genes (Figure 2A).

Figure 2. Distribution of Genes, Transposons, and Repeats
The sequence of the 2080 kb region shown in Figure 1 was divided into 250 kb intervals, and the total number of genes and repeats was calculated for each interval. (A) Transcribed (green) and hypothetical (yellow) genes; (B) dispersed (pink) and tandem (purple) repeats; (C) DNA transposons (light blue), Athila (green), and other retrotransposons (dark blue). (D) The heterochromatic knob (green) is shown along with the locations of genetrap and enhancer trap elements (triangles).

Genetrap Integration and Expression
We examined the number of gene traps and enhancer traps that had integrated in this region from a collection of approximately 2000 random integrations. These engineered genetraps were based on the Ac/Ds transposons from maize that had been introduced artificially into Arabidopsis (Springer et al., 1995; Sundaresan et al., 1995). Random integrations were mapped by amplifying and sequencing DNA flanking the insertion site (see Experimental Procedures). BLAST analysis indicated that 33 of the 2000 genetraps had integrated within the 2.1 Mb region, consistent with expectations for a 130 Mb genome. However, none had integrated in heterochromatic sequences, and all but one had integrated in the distal euchromatin (Figure 2D). Seventeen of these 33 euchromatic genetraps landed within genes, and ten resulted in expression of the reporter gene in 7-day-old seedlings. Two of the 33 insertions were within retrotransposons. A similar distribution was found in the genome as a whole for the other 2000 genetraps examined (R. M. et al., unpublished data).

Analysis of Repeats
Repeats and endogenous transposons were detected by both sequence comparison and compositional analysis (see Experimental Procedures). Sequence comparison detected reading frames with homology to plant and animal transposase. This analysis revealed a number of transposons not previously described in Arabidopsis, including those of the Robertson’s Mutator (Bennetzen, 1996) and Mariner (Plasterk et al., 1999) classes (see GenBank annotations). Inverted repeats were frequently found at the ends of these elements, and transposons were classified as defective if they had suffered rearrangements in the coding region (data not shown). All 15 copies of the Athila retrotransposon were found in the proximal portion of the chromosome arm (Figure 2C) along with other classes of repeat sequence (such as

Figure 3. Detailed Analysis of the Knob Region
The diagram shows a 0.52 Mb region within
the interval that includes the heterochromatic
knob (T5L23 to T27D20, Figure 1). Hypotheti-
cal (black) and predicted (red) genes are
shown as thin arrows, tandem repeats as
block arrows, and transposons as triangles.
The only two genes corresponding to ESTs in
this region are highlighted as thick red arrows
(T5L23.27 and T27D20.10). The 183,000 bp
region that corresponds to the heterochro-
matic region and has no expressed or pre-
dicted genes is bracketed.
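The statement in "Genetrap Integration and Expression" that 33 of roughly 2000 insertions in the 2.1 Mb region is "consistent with expectations for a 130 Mb genome" can be checked with a short calculation. The sketch below assumes, purely for illustration, that insertions land uniformly at random across the genome (an assumption not made in the text), and uses 0.5 Mb, the lower bound given in the Summary, as the size of the heterochromatin. The function and variable names are hypothetical.

```python
# Illustrative arithmetic only: assumes insertions are uniform across the
# genome, which is a simplification and not a claim made in the paper.
GENOME_MB = 130.0    # genome size quoted in the text
REGION_MB = 2.1      # sequenced region quoted in the text
KNOB_MB = 0.5        # lower bound of the heterochromatin size (Summary)
N_INSERTIONS = 2000  # approximate size of the genetrap collection
OBSERVED = 33        # insertions mapped to the 2.1 Mb region


def expected_insertions(n, target_mb, genome_mb):
    """Mean number of hits in a target of target_mb under uniform insertion."""
    return n * target_mb / genome_mb


def prob_no_insertions(n, target_mb, genome_mb):
    """Probability that none of n independent uniform insertions hit the target."""
    return (1.0 - target_mb / genome_mb) ** n


expected_region = expected_insertions(N_INSERTIONS, REGION_MB, GENOME_MB)
expected_knob = expected_insertions(N_INSERTIONS, KNOB_MB, GENOME_MB)
p_zero = prob_no_insertions(N_INSERTIONS, KNOB_MB, GENOME_MB)

print(f"expected in region: {expected_region:.1f} (observed {OBSERVED})")
print(f"expected in knob:   {expected_knob:.1f} (observed 0)")
print(f"P(no knob insertions | uniform): {p_zero:.1e}")
```

Under this simplified model the regional count is close to the observed 33, while the complete absence of insertions in the heterochromatin would be unlikely by chance, which is in line with the paper's description of the knob as refractory to detectable genetrap insertion.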

mi167) previously thought to be restricted to centromeric regions (Pelissier et al., 1996; Thompson et al., 1996). None of these elements were found in the distal euchromatic portion. Other retrotransposons as well as DNA transposons were clustered in the heterochromatin (Figure 3) so that it more closely resembled the maize genome than it did Arabidopsis (San Miguel et al., 1996). However, these transposons were also found in the distal portion, although they were relatively rare. For example, out of ten Mu transposons located on this chromosome arm, only one lay in the euchromatic region, and it was inserted in the most condensed portion (see below). Retrotransposons were also slightly enriched in this distal region (Figure 2C).

Compositional analysis of repeats was performed using PrintRepeats and DOTPLOT (Figure 4). Dispersed repeats were distributed throughout the region and included putative transposons as well as gene families. Similar repeats found close to each other may reflect the propensity of transposons to integrate close to the site from which they came as well as local gene duplication. Several clusters of related genes were observed, including one that contained five calcium-dependent protein kinases and another with three NAM (no apical meristem) transcription factors. These clusters were often interrupted by transposons or dispersed repeats, which may have been involved in the duplication events. Two clusters of genes encoding CHP-rich zinc finger proteins were found in the distal region (Figure 4B). Thirty-three genes in the Arabidopsis sequence database encode similar proteins, some of which are highly conserved at the nucleotide as well as the amino acid level and lack introns, suggesting recent duplication or transposition. Interestingly, this region of the chromosome (corresponding to CIC9F6 in Figure 1) is twice as condensed as the surrounding regions during meiosis (region IIb in Fransz et al., 2000).

Tandem repeats were nonrandomly distributed relative to dispersed repeats (Figure 2B). In particular, 22.5 tandemly arranged copies of a 1950 bp repeat were located in the heterochromatic knob (Figure 3). Each 1950 bp element included a terminal 31 bp direct repeat and varied by 2%–8% in nucleotide sequence from other members of the array (Figure 4C). Gene modeling programs did not predict any exon structure in the knob repeat, but BLASTN searches identified one sequence on chromosome 5 and two sequences on chromosome 2 with significant homology (BLASTN score >500, bases 576 to 1400). The two related sequences on chromosome 2 were annotated as genes, located approximately 400 kb apart (T19K21.3 and F17L24.8). BLAST searches revealed that T19K21.3 appeared to be integrated within a transposase gene from the Spm family (data not shown; Lin et al., 1999). Taken together, we conclude that the knob repeat may be related to a defective transposable element. We next performed searches using only the nucleotide sequence from the most conserved portion of the repeat (nucleotides 1747–1904), which shared 40% identity at the amino acid level with the other three genes (Figure 4C). This 160 bp region matched six additional BAC clones from the pericentromeric regions of chromosomes 2 and 4. Of special note, 14 copies of the sequence were included in a tandem array of 1 kb degenerate repeats on T6L9, a BAC from the pericentromeric heterochromatic region of the long arm of chromosome 4 (Figure 4C). A dotplot analysis of the two BACs is shown in Figure 4D.

DNA Methylation
We tested the methylation status of the knob-specific repeats in cv. Columbia and cv. Landsberg by Southern analysis (Figure 5). We found that the array of 1950 bp knob repeats was present in both strains and was heavily methylated at SalI and PvuII sites, which cut once in each repeat (Figure 4C). Levels of methylation at SalI sites in the repeat array were equivalent in Columbia and Landsberg, but PvuII was found to digest Columbia but not Landsberg repeat arrays (Figure 5A). Athila elements from the knob and throughout the genome were also methylated at EcoRII, SalI, and HpaII sites. Repeat methylation was examined in ddm1 (decrease in DNA

Figure 4. Analysis of Repeated Sequences


(A) PrintRepeats output for the entire 2.1 Mb
region. Repeated sequences are joined by
arches. The knob region and the distal con-
densed region are shown schematically.
(B) PrintRepeats output for the condensed dis-
tal euchromatic region around BAC T15B16
(Figure 1). Arches join individual members of
gene families described in the text.
(C) Structure of the 1950 bp knob repeat,
aligned with the 1 kb degenerate repeat found
on the long arm BAC, T6L9.
(D) DOTPLOT comparison of the BACs T5H22
(short arm) and T6L6 (long arm) showing the
tandem arrays found on each BAC as well as
a comparison between them.
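To illustrate why tandem arrays show up as sets of parallel diagonals in a dot plot such as Figure 4D, the following is a minimal word-match dot plot in Python. It is a generic sketch and not the DOTPLOT program used in the paper. The toy sequence, the word size, and the function names are all hypothetical choices made for the example.

```python
from collections import Counter


def dotplot(seq_a, seq_b, word=5):
    """Return the set of (i, j) where seq_a[i:i+word] equals seq_b[j:j+word]."""
    # Index every word of seq_b by its start positions.
    index = {}
    for j in range(len(seq_b) - word + 1):
        index.setdefault(seq_b[j:j + word], []).append(j)
    # Look up every word of seq_a in that index.
    hits = set()
    for i in range(len(seq_a) - word + 1):
        for j in index.get(seq_a[i:i + word], []):
            hits.add((i, j))
    return hits


def diagonal_counts(hits):
    """Count dots on each diagonal, keyed by the offset j - i."""
    return Counter(j - i for i, j in hits)


# Toy example: a 10 bp unit repeated three times in tandem (hypothetical data).
unit = "ACGTTGCAAG"
array = unit * 3
diagonals = diagonal_counts(dotplot(array, array, word=5))

for offset in sorted(diagonals):
    print(f"offset {offset:+d}: {diagonals[offset]} dots")
```

In a self-comparison of a tandem array, dots fall on the main diagonal and on diagonals whose offsets are multiples of the repeat length. For real BAC-scale sequences a longer word and a mismatch-tolerant comparison would be needed, since the knob repeats are reported to differ by 2%–8% from one another.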

methylation) mutants, which have unmethylated centromeric satellite and nucleolar organizer DNA (Vongs et al., 1993). Single copy DNA retains its methylation status in these mutants, at least for the first few generations (Kakutani et al., 1996, 1999). Both Athila retrotransposons and the 1950 bp repeats were found to be extensively, if not fully, demethylated in F2 ddm1 mutants (Figures 5A and 5C). Interestingly, enzymes that cut once within the 1950 bp repeat gave a ladder of monomers, dimers, and trimers consistent with the repeats being found in tandem arrays in both Landsberg and Columbia.

Recombination
Recombination across the region could be measured using sequence-tagged sites from the recombinant inbred map (Schmidt et al., 1995). Figure 1 shows that the level of recombination observed across the region varied dramatically. In the euchromatic portion, each cM spanned 70 to 150 kb, compared to 950 kb in the heterochromatic portion. Interestingly, the highest levels of recombination were in the region of the chromosome identified by Fransz et al. (2000) as being the least condensed during meiosis (YAC CIC4A7, Figure 1). This relationship was examined further by plotting the cumulative number of genes and transposons as a function of genetic distance along the chromosome arm (Figure 6). This analysis revealed that physical distance, the total number of genes, and especially the number of transposons were unrelated to genetic distance. Only the cumulative number of expressed genes varied linearly with genetic distance, indicating a correlation between gene expression and recombination.

Discussion

Heterochromatin Structure
In both Drosophila and mammalian cells, two classes of heterochromatin, α and β, have been defined by the extent of condensation and other cytological criteria

Figure 5. Repeated Sequences in the Heterochromatic Knob Are Methylated
Southern hybridization analysis of Arabi-
dopsis DNA digested with various enzymes
and probed with the 1950 bp tandem repeat
(A) and the Athila retrotransposon (B and C).
DNA samples were as follows: (A) lane 1, Co-
lumbia digested with PvuII; lane 2, Landsberg
digested with PvuII; lane 3, Columbia di-
gested with SalI; lane 4, Landsberg digested
with SalI; lane 5, Columbia ddm1 digested
with SalI; lane 6, Landsberg ddm1 di-
gested with SalI. (B) Columbia DNA digested
with EcoRII (lane 1), BstNI (lane 2), HpaII (lane
3), and MspI (lane 4). (C) DNA samples di-
gested with EcoRII from Columbia (lane 1),
ddm1 homozygous plants (lane 2), ddm1/+
heterozygous plants (lane 3), and ddm1 ho-
mozygous plants after two further genera-
tions of inbreeding (lanes 5 and 6). Arrows
indicate fragments corresponding to fully di-
gested repeats.
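The ladder of monomers, dimers, and trimers described in the text follows directly from a tandem array in which the enzyme has one site per 1950 bp repeat and some sites are blocked by methylation. The sketch below is a simplified model under stated assumptions: it considers only fragments internal to the array, ignores flanking sequence, and treats each site as simply cut or blocked. The function name and the example cut patterns are hypothetical.

```python
REPEAT_BP = 1950  # repeat unit length reported in the text


def internal_fragments(site_is_cut, repeat_bp=REPEAT_BP):
    """Lengths of fragments lying between consecutive cut sites in the array.

    site_is_cut[k] is True when the site in repeat k is cut (unmethylated)
    and False when it is blocked (methylated). Flanking DNA is ignored.
    """
    cut_positions = [k * repeat_bp for k, cut in enumerate(site_is_cut) if cut]
    return [b - a for a, b in zip(cut_positions, cut_positions[1:])]


# Every site cut: only monomer-length fragments are released.
fully_cut = internal_fragments([True] * 5)

# Some sites blocked: dimer, monomer, and trimer fragments appear.
partially_cut = internal_fragments([True, False, True, True, False, False, True])

# No site cut: no internal fragments, the array stays intact.
uncut = internal_fragments([False] * 5)

print(fully_cut)
print(partially_cut)
print(uncut)
```

In this model, complete digestion corresponds to the fully digested fragments marked by arrows in Figure 5, and a ladder of multiples of 1950 bp is the signature of partial blocking within a tandem array.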

(reviewed by Wallrath, 1998). At the sequence level, α-heterochromatin has extensive arrays of tandem repeats occasionally interrupted with transposons. In contrast, β-heterochromatin has scrambled arrays of transposons with little evidence of satellite arrays (Holmquist et al., 1998). In this respect, the knob on chromosome 4 resembles β-heterochromatin in that there are no tandem arrays of short repeats. In maize, sample sequencing has revealed that heterochromatic knobs are composed of 180 bp and 350 bp tandem repeat arrays that comprise 70% of the sequence, with retrotransposons making up the remaining 30% (Ananiev et al., 1998b). This is a lower density of retrotransposons than is found in the maize genome as a whole. In the heterochromatic knob on chromosome 4 of Arabidopsis, retrotransposons also make up 30% of the region, but less than 5% of the genome as a whole. One class of retroelements (the 11 kb Athila elements) is found only in heterochromatic regions (Pelissier et al., 1996) and is highly represented in the Arabidopsis knob (Figure 3), suggesting they are specifically targeted to heterochromatin and may contribute to its structure. Other classes of retrotransposons are found throughout the region between the centromere and the knob, suggesting they do not preferentially target heterochromatin, but rather may be retained in regions with suppressed recombination (see below).

Unlike in maize, neither the Arabidopsis 180 bp centromeric satellite repeat nor any other known satellite was found in the heterochromatic knob. Instead, tandem repeats of a 1950 bp sequence were observed. A related tandem repeat was found in the pericentromeric heterochromatin on the other side of the centromere. This suggests that tandem repeats, rather than satellite arrays, might be responsible for local condensation via a mechanism similar to that reported in Drosophila and S. pombe (Dorer and Henikoff, 1994; Grewal and Klar, 1997). In these studies, the ability of tandem repeats to induce the formation of heterochromatin was demonstrated by gene silencing and nuclease sensitivity as well as association with the chromocentre, which is an important property of functional heterochromatin in Drosophila (Csink and Henikoff, 1996).

Figure 6. Recombination Varies Linearly with Gene Expression
The cumulative number of genes, expressed genes, transposons, and physical distance in 10s of kb is plotted against recombination distance throughout the region.

Heterochromatin Dynamics
Tandem repeats can form by unequal exchange (Hennig, 1999). In the case of Arabidopsis rDNA, such exchange can be detected by examining repeat structure in recombinant inbred populations (Copenhaver and Pikaard, 1996). However, in the centromere (Round et al., 1997), as well as in the knob region (see below), meiotic recombination is substantially eliminated, suggesting that meiotic crossover cannot be responsible for the amplification of repeated arrays. The presence of transposons in Drosophila heterochromatin has led to the suggestion that satellite repeats might be amplified in part by unequal exchange following transposon excision (Thompson-Stewart et al., 1994). Alternatively, the integration of transposons multiple times at the same site, or within one another, could also result in tandem or more complex arrays (Figure 3). The knob repeat on chromosome 4 resembles a defective transposable element, consistent with this idea. Repeat arrays could acquire condensation if the repeats carry chromatin binding sites that cooperate to induce heterochromatinization (Fanti et al., 1998). Such chromatin-binding sites may help to target further transposon integrations (Pirrotta and Rastelli, 1994). In this way, knobs could form very quickly in plant genomes during evolutionary bursts of transposon activity. According to this view, heterochromatin formation is simply a function of transposon density that results in novel genetic properties such as silencing and reduced levels of recombination. An alternative view, that knobs are themselves megatransposons (Ananiev et al., 1998a), cannot be excluded but is less likely in this case given the complex composition of the knob and the paucity of other knobs in the Arabidopsis genome.

Fransz et al. (2000) report that Arabidopsis thaliana cv. Landsberg plants do not have a heterochromatic knob at this position on chromosome 4, indicating that knobs are polymorphic in Arabidopsis, as they are in maize (McClintock et al., 1981). However, Southern hybridization indicates that the knob repeat array is present in the Landsberg genome (Figure 5) as are sequences between the knob and the centromere (Copenhaver et al., 1999; Fransz et al., 2000). One possibility is that the knob has been fused to the centromere in Landsberg, a possibility suggested by inversion of at least part of the intervening region observed by FISH (Fransz et al., 2000). Under conditions of “genome stress,” knobs have a tendency to spontaneously fuse with telomeres and centromeres as well as other knobs (McClintock, 1951). DNA transposons in heterochromatic regions, such as the elements reported here, could account for these fusions if they were activated under similar conditions (McClintock, 1978). This is because abortive transpositions can result in fusion of donor and target sites and inversion of the intervening sequences (Hudson et al., 1990; English et al., 1993; Weil and Wessler, 1993).

The Genetic Properties of Heterochromatin
The heterochromatic knob on Arabidopsis chromosome 4 is composed largely of transposons and retrotransposons, many of which are also found near centromeres (Thompson et al., 1996). Further, the knob maps to an interval thought to contain the centromere on chromosome 4 (Copenhaver et al., 1998). However, it has been

shown recently that the knob and the centromere are distinct (Copenhaver et al., 1999). While the function of heterochromatin remains a mystery, we have taken advantage of the complete sequence to characterize the genetic properties of the region.

Many fewer genes matched ESTs in the heterochromatic region than in the euchromatic region (Figure 2A). Even more dramatically, the number of ESTs that matched heterochromatic genes (four in 645,000 bp) was 40-fold less than in the distal euchromatin (260 in 1,102,089 bp) and 20-fold less than in the proximal euchromatin (35 in 305,447 bp). This quantitative measure strongly suggests that gene expression is severely reduced in the heterochromatic knob (Adams et al., 1991; McCombie et al., 1992a). Furthermore, genetraps were not found in heterochromatic sequences. This was not due to a propensity to avoid retrotransposons, as integrations in dispersed retrotransposons elsewhere in the genome were at normal levels. One possibility is that the selectable marker gene carried by the genetrap was silenced when integrated into heterochromatin so that these integrations could not be detected (Sundaresan et al., 1995). Another possibility is that genetrap transposons preferentially integrate within expressed genes that are rare or absent in heterochromatin. A similar relationship has been observed in Drosophila (Berg and Spradling, 1991). It may be possible to target genetraps to integrate within heterochromatin by including heterochromatic sequences in the transposon construct (Kassis et al., 1992; Fauvarque and Dura, 1993; Tailleburg et al., 1999), and we are currently exploring this possibility.

The frequency of recombination varied greatly with physical distance (Figure 1), but it varied almost linearly with the density of expressed genes: EST matches occurred eight times per cM on average throughout the region (Figure 6). This observation is consistent with the notion that recombination might be limited to euchromatic portions of the genome that encode expressed genes. Such a relationship has been predicted on the basis of genome size and genetic distance (Thuriaux, 1977; Bennett, 1998) and has received some support from studies of intragenic recombination in maize (Civardi et al., 1994; Xu et al., 1995; Timmermans et al., 1996; Schnable et al., 1998).

The paucity of expressed genes was also correlated with the highly repeated and methylated nature of the heterochromatic region. DNA methylation was dependent on the DDM1 gene, which is required for methylation of heterochromatic regions in Arabidopsis (Vongs

their control (McClintock, 1965; Martienssen, 1998b). In at least one case, the genes affected by DDM1 undergo a form of paramutation (Jeddeloh et al., 1998), an epigenetic change in gene expression linked to transposons as well as repeats (Martienssen, 1996). It is possible that paramutation, and related imprinting mechanisms (Martienssen, 1998a), might preferentially affect repeated regions of the genome that are relatively condensed and whose methylation depends on chromatin remodeling.

Chromatin Condensation and Repeats
In the accompanying paper, Fransz et al. (2000) describe the changes in chromosomal condensation that accompany meiosis in the short arm of chromosome 4. While condensation is most severe in the conspicuous knob, it is also observed in a distal portion of the arm that can therefore be considered “cryptic” heterochromatin (Figure 4B). This region encompasses three families of clustered genes interspersed with transposons and retrotransposons and is twice as condensed as the surrounding euchromatin (Fransz et al., 2000). Interestingly, the clustered genes are not found in EST databases, raising the possibility that they are downregulated in the same way as the genes found in the knob. McClintock (1951) predicted that the study of heterochromatin would lead to the understanding of euchromatic regulation of gene expression. Given the role that transposons (McClintock’s “controlling elements”) and repeats play in heterochromatin and in epigenetic gene regulation, this prediction may yet be borne out in such regions.

Unlike in maize, or any other higher eukaryote, the complete sequence of the Arabidopsis knob, including the boundaries of the heterochromatic region, has been determined. The complete and highly accurate sequence of this region of the genome should greatly aid efforts to determine the mechanism behind chromatin condensation, recombination suppression, and gene silencing responsible for this ancient property of plant genomes. Although there are many sequence-based strategies for the analysis of complex genomes (Adams et al., 1991; McCombie et al., 1992a; Waterston et al., 1992), only complete sequencing linked to genetic and cytological maps can provide an understanding of higher order chromosome organization and function (McCombie et al., 1992a; Sulston et al., 1992; Wilson et al., 1994; Bevan et al., 1998; Ashburner et al., 1999). Analysis of the first complete genome of an animal, C. elegans (The C. elegans Sequencing Consortium, 1998),
et al., 1993). DDM1 is also known to influence gene has already led to new insights into genome structure
expression, and in each case the affected genes are and evolution. As Arabidopsis genome sequencing
repeated as well as methylated (Jeddeloh et al., 1998; nears completion, we can expect a dramatic increase
Paszkowski and Mittelsten Scheid, 1998). DDM1 en- in our understanding of higher order chromosome struc-
codes a SNF2 chromatin remodeling enzyme (Jeddeloh ture and function in flowering plants.
et al., 1999) and might influence gene expression by
altering the interrelationship of repeated sequences, Experimental Procedures
allowing access to methyltransferase and potentially to
transcriptional repressors, such as methyl-binding pro- Sequencing Methods
BAC clones F2N1, F3D13, F11O4, T15B16, T7B11, T10M13, T2H3,
teins (Wolffe, 1998; Martienssen and Henikoff, 1999;
T14P8, T10P11, T5J8, T4I9, F4C21, F9H3, T5L23, T5H22, T7M24,
Wolffe and Hayes, 1999). The influence of ddm1 on T25H8, T24M8, T24H24, T27D20, T19B17, T26N6, F4H6, T19J18,
methylation of transposons might also lead to changes T4B21, and T1J1 were individually cultured for large-scale prepara-
in gene expression. This is because integration of trans- tion, and the DNA was isolated via alkaline lysis, phenol/chloroform
posons in the vicinity of genes can bring them under extraction, and isopropanol precipitation. The purified DNA was
then sheared by nebulization with 10 psi of nitrogen gas. Fragments of 1.5–3 kb were size selected on agarose gels, repaired with mung bean nuclease, and ligated into M13. Cells were transformed by electroporation and plaques picked from a lawn of JM101. Single-stranded M13 phage DNA was isolated from overnight cultures in 96-well format via the Thermomax procedure (Mardis, 1994). The purified DNA was then cycle sequenced using a combination of dye primer and dye terminator chemistries and subsequently separated and detected by electrophoresis through acrylamide gels run on ABI 377 sequencers (McCombie et al., 1992b). Data were tracked and processed and raw data quality monitored using a web-based lab management data system, Kaleidaseq (Dedhia and McCombie, 1998). BAC clones were sequenced to approximately 8-fold coverage. The random fragments were basecalled using PHRED (Phil Green) and assembled into contiguous pieces using the assembly algorithm PHRAP (Phil Green). Assemblies were converted (using software developed by Gabor Marth at the WUGSC) into the editing programs Gap4 (Roger Staden) or CONSED (David Gordon) for manual inspection and finalization. Directed reads were chosen to close sequence gaps and achieve contiguity as well as resolve ambiguities within the sequence. Vector junctions were also delineated. The sequence was finished to an accuracy of less than one error per 100 kb by ensuring that each base was covered by at least two independent subclones representing both strands of DNA or sequenced with both dye primer and dye terminator chemistries on multiple clones representing the same strand. The final assemblies were confirmed by comparison of the assemblies’ predicted restriction fragments to actual restriction digests of the BAC clone, including the fingerprints from the mapping group.

Gene Modeling

Gene modeling was accomplished through an integration of gene prediction algorithms and similarity searching. An extensive analysis of the performance of Arabidopsis exon and gene prediction algorithms (L. P. et al., unpublished data) indicated that GenScan, MZEF, and NetPlantGene perform best at delineating the exon-intron structure of a gene (Hebsgaard et al., 1996; Burge and Karlin, 1998; Zhang, 1998). GRAIL is best at identifying terminal exons. A consensus gene model is then built, guided by the results of similarity searching. BLASTN identifies EST matches and BLASTX marks sequence with the potential to encode a protein similar to entries in a protein database. (EST matches above 96% identity, in which most mismatches were unreadable bases, were scored.) Intergenic segments and large (typically greater than 800 bp) introns are routinely searched for the occurrence of repeats. DNA repeats (LINEs, MITEs, solo LTRs, etc.) with the exception of simple sequence repeats (SSRs) are >90% identical and over 100 bp in length; SSRs must extend for >25 bp. Repeats with the potential to encode transposon proteins are identified by similarity to transposase, reverse transcriptases, and polyproteins. PrintRepeats analysis (Parsons, 1995) was performed at a cutoff of 150. Dotplots were performed on line at the Virtual Genome Centre at the University of Minnesota (http://alces.med.umn.edu/rawdot.html) or by using the Wisconsin GCG group sequence analysis package.

DNA Methylation Analysis

Arabidopsis genomic DNA was digested with restriction enzymes and subjected to DNA gel blot analysis as described previously (Springer et al., 1995). DNA from ddm1 mutants was from the F2 generation. Landsberg ddm1 heterozygotes were genotyped by progeny test, and were only self-pollinated after six generations of backcrossing (L. Das and R. M., unpublished data) so that demethylation was a result of loss of ddm1 function rather than inheritance of unmethylated DNA (Vongs et al., 1993; Kakutani et al., 1999). DNA gel blots were hybridized and washed at high stringency with probes corresponding to bases 570 to 1340 of the 1950 bp repeat and to bases 1123 to 1847 of the adjacent defective Athila element T5H22.4. Probes were amplified from genomic DNA using the primer pairs 5′-ACGACGAGAGGCTCCATCTA-3′/5′-CTGTTCCTCGCAAGTTCCTC-3′ and 5′-CCACACAGGTGGATCAACAG-3′/5′-TGATTGATGAGGACATCGGA-3′, respectively. Probes were gel purified and sequenced to verify their origin.

Gene Trap and Enhancer Trap Transposon Mutagenesis

Two thousand independent Arabidopsis plants (cv. Landsberg) were isolated so that each contained one or occasionally more than one insertion of the DsG gene trap or DsE enhancer trap transposable elements described by Sundaresan et al. (1995) and Springer et al. (1995). Genomic DNA was prepared from each plant and the insertion sites were amplified using TAIL PCR and sequenced essentially as described by Tsugeki et al. (1996). BLASTN searches were performed against the Arabidopsis genome sequence in GenBank using an automated annotation routine (B. M. and R. M., unpublished data). Unique matches of greater than 96% identity among readable bases were taken to indicate insertion of the element in the genomic region indicated by the GenBank entry. Enhancer (ET) and gene (GT) traps found in the region were: ET1967, GT5365, GT6223, GT4985, ET587, GT5961, GT6437, ET5292, GT6437, GT5927, GT5912, GT6230, ET1949, GT5222, GT331, ET120, GT4985, ET446, GT148, ET5413, GT150, GT6149, ET37, ET5862, ET5867, ET5864, ET134, GT4985, ET5506, GT6702, GT5962, ET238, ET5490, and GT2722.

Acknowledgments

We thank Eric Richards for comments on the manuscript and for ddm1 seed. We also thank Bruce May, John Healy, and Andy Reiner for help with the genetrap sequence analysis, and Juana-Marie Arroyo and Rulan Shen for sharing unpublished genetrap data. We acknowledge the support of the United States Department of Agriculture NRI (95-37300-1578), the National Science Foundation (DBI-9813578), the Robertson Research Fund, Westvaco Corporation, and the generous support of David L. Luke III.

Received October 26, 1999; revised January 10, 2000.

References

Adams, M.D., Kelley, J.M., Gocayne, J.D., Dubnick, M., Polymeropoulos, M.H., Xiao, H., Merril, C.R., Wu, A., Olde, B., Moreno, R.F., et al. (1991). Complementary DNA sequencing: expressed sequence tags and the human genome project. Science 252, 1651–1656.
Alfenito, M.R., and Birchler, J.A. (1993). Molecular characterization of a maize B chromosome centric sequence. Genetics 135, 589–597.
Ananiev, E.V., Phillips, R.L., and Rines, H.W. (1998a). Complex structure of knob DNA on maize chromosome 9. Retrotransposon invasion into heterochromatin. Genetics 149, 2025–2037.
Ananiev, E.V., Phillips, R.L., and Rines, H.W. (1998b). A knob-associated tandem repeat in maize capable of forming fold-back DNA segments: are chromosome knobs megatransposons? Proc. Natl. Acad. Sci. USA 95, 10785–10790.
Ashburner, M., Misra, S., Roote, J., Lewis, S.E., Blazej, R., Davis, T., Doyle, C., Galle, R., George, R., Harris, N., et al. (1999). An exploration of the sequence of a 2.9 Mb region of the genome of Drosophila melanogaster. The Adh region. Genetics 153, 179–219.
Bennett, M.D. (1998). Plant genome values: how much do we know? Proc. Natl. Acad. Sci. USA 95, 2011–2016.
Bennetzen, J.L. (1996). The mutator transposable element system of maize. Curr. Top. Microbiol. Immunol. 204, 195–229.
Berg, C.A., and Spradling, A.C. (1991). Studies on the rate and site-specificity of P element transposition. Genetics 127, 515–524.
Bevan, M., Bancroft, I., Bent, E., Love, K., Goodman, H., Dean, C., Bergkamp, R., Dirkse, W., Van Staveren, M., Stiekema, W., et al. (1998). Analysis of 1.9 Mb of contiguous sequence from chromosome 4 of Arabidopsis thaliana. Nature 391, 485–488.
Bevan, M., Bancroft, I., Mewes, H.W., Martienssen, R., and McCombie, R. (1999). Clearing a path through the jungle: progress in Arabidopsis genomics. Bioessays 21, 110–120.
Brandes, A., Thompson, H., Dean, C., and Heslop-Harrison, J.S. (1997). Multiple repetitive DNA sequences in the paracentromeric regions of Arabidopsis thaliana L. Chromosome Res. 5, 238–246.
Burge, C.B., and Karlin, S. (1998). Finding the genes in genomic DNA. Curr. Opin. Struct. Biol. 8, 346–354.
C. elegans Sequencing Consortium (1998). Genome sequence of
the nematode C. elegans: a platform for investigating biology. Science 282, 2012–2018.
Cavalli, G., and Paro, R. (1998). Chromo-domain proteins: linking chromatin structure to epigenetic regulation. Curr. Opin. Cell. Biol. 10, 354–360.
Civardi, L., Xia, Y., Edwards, K.J., Schnable, P.S., and Nikolau, B.J. (1994). The relationship between genetic and physical distances in the cloned a1-sh2 interval of the Zea mays L. genome. Proc. Natl. Acad. Sci. USA 91, 8268–8272.
Cook, K.R., and Karpen, G.H. (1994). A rosy future for heterochromatin. Proc. Natl. Acad. Sci. USA 91, 5219–5221.
Copenhaver, G.P., and Pikaard, C.S. (1996). Two-dimensional RFLP analyses reveal megabase-sized clusters of rRNA gene variants in Arabidopsis thaliana, suggesting local spreading of variants as the mode for gene homogenization during concerted evolution. Plant J. 9, 273–282.
Copenhaver, G.P., Browne, W.E., and Preuss, D. (1998). Assaying genome-wide recombination and centromere functions with Arabidopsis tetrads. Proc. Natl. Acad. Sci. USA 95, 247–252.
Copenhaver, G.P., Nickel, K., Kuromori, T., Benito, M., Kaul, S., Lin, X., Bevan, M., George Murphy, G., Harris, B., McCombie, W.R., et al. (1999). Genetic definition and sequence analysis of Arabidopsis centromeres. Science 286, 2468–2474.
Creighton, H., and McClintock, B. (1931). A correlation of cytological and genetical crossing over in Zea mays. Proc. Natl. Acad. Sci. USA 17, 492–497.
Csink, A.K., and Henikoff, S. (1996). Genetic modification of heterochromatic association and nuclear organization in Drosophila. Nature 381, 529–531.
Dedhia, N.N., and McCombie, W.R. (1998). Kaleidaseq: a web-based tool to monitor data flow in a high throughput sequencing facility. Genome Res. 8, 313–318.
Dimitri, P., and Junakovic, N. (1999). Revising the selfish DNA hypothesis: new evidence on accumulation of transposable elements in heterochromatin. Trends Genet. 15, 123–124.
Dorer, D.R., and Henikoff, S. (1994). Expansions of transgene repeats cause heterochromatin formation and gene silencing in Drosophila. Cell 77, 993–1002.
English, J., Harrison, K., and Jones, J.D. (1993). A genetic analysis of DNA sequence requirements for Dissociation state I activity in tobacco. Plant Cell 5, 501–514.
Fanti, L., Dorer, D.R., Berloco, M., Henikoff, S., and Pimpinelli, S. (1998). Heterochromatin protein 1 binds transgene arrays. Chromosoma 107, 286–292.
Fauvarque, M.O., and Dura, J.M. (1993). Polyhomeotic regulatory sequences induce developmental regulator-dependent variegation and targeted P-element insertions in Drosophila. Genes Dev. 7, 1508–1520.
Fransz, P., Armstrong, S., de Jong, J.H., Parnell, L.D., Van Drunen, C., Dean, C., Zabel, P., Bisseling, T., and Jones, G.H. (2000). Integrated cytogenetic map of chromosome arm 4S of A. thaliana: structural organization of heterochromatic knob and centromeric heterochromatin. Cell 100, this issue, 367–376.
Grewal, S.I., and Klar, A.J. (1997). A recombinationally repressed region between mat2 and mat3 loci shares homology to centromeric repeats and regulates directionality of mating-type switching in fission yeast. Genetics 146, 1221–1238.
Hebsgaard, S.M., Korning, P.G., Tolstrup, N., Engelbrecht, J., Rouze, P., and Brunak, S. (1996). Splice site prediction in Arabidopsis thaliana pre-mRNA by combining local and global sequence information. Nucleic Acids Res. 24, 3439–3452.
Henikoff, S. (1998). Conspiracy of silence among repeated transgenes. Bioessays 20, 532–535.
Hennig, W. (1999). Heterochromatin. Chromosoma 108, 1–9.
Heslop-Harrison, J.S., Murata, M., Ogura, Y., Schwarzacher, T., and Motoyoshi, F. (1999). Polymorphisms and genomic organization of repetitive DNA from centromeric regions of Arabidopsis chromosomes. Plant Cell 11, 31–42.
Holmquist, G.P., Kapitonov, V.V., and Jurka, J. (1998). Mobile genetic elements, chiasmata, and the unique organization of beta-heterochromatin. Cytogenet. Cell Genet. 80, 113–116.
Hudson, A.D., Carpenter, R., and Coen, E.S. (1990). Phenotypic effects of short-range and aberrant transposition in Antirrhinum majus. Plant Mol. Biol. 14, 835–844.
Jeddeloh, J.A., Bender, J., and Richards, E.J. (1998). The DNA methylation locus DDM1 is required for maintenance of gene silencing in Arabidopsis. Genes Dev. 12, 1714–1725.
Jeddeloh, J.A., Stokes, T.L., and Richards, E.J. (1999). Maintenance of genomic methylation requires a SWI2/SNF2-like protein. Nat. Genet. 22, 94–97.
Kakutani, T., Jeddeloh, J.A., Flowers, S.K., Munakata, K., and Richards, E.J. (1996). Developmental abnormalities and epimutations associated with DNA hypomethylation mutations. Proc. Natl. Acad. Sci. USA 93, 12406–12411.
Kakutani, T., Munakata, K., Richards, E.J., and Hirochika, H. (1999). Meiotically and mitotically stable inheritance of DNA hypomethylation induced by ddm1 mutation of Arabidopsis thaliana. Genetics 151, 831–838.
Kassis, J.A., Noll, E., VanSickle, E.P., Odenwald, W.F., and Perrimon, N. (1992). Altering the insertional specificity of a Drosophila transposable element. Proc. Natl. Acad. Sci. USA 89, 1919–1923.
Kaszas, E., and Birchler, J.A. (1996). Misdivision analysis of centromere structure in maize. EMBO J. 15, 5246–5255.
Lam, H.M., Chiu, J., Hsieh, M.H., Meisel, L., Oliveira, I.C., Shin, M., and Coruzzi, G. (1998). Glutamate-receptor genes in plants. Nature 396, 125–126.
Le, M.H., Duricka, D., and Karpen, G.H. (1995). Islands of complex DNA are widespread in Drosophila centric heterochromatin. Genetics 141, 283–303.
Lin, X., Kaul, S., Rounsley, S., Shea, T.P., Benito, M.I., Town, C.D., Fujii, C.Y., Mason, T., Bowman, C.L., Barnstead, M., et al. (1999). Sequence and analysis of chromosome 2 of the plant Arabidopsis thaliana. Nature 402, 761–768.
Mayer, K., et al. The European Union Arabidopsis Genome Sequencing Consortium and the Cold Spring Harbor, Washington University in St. Louis and PE Biosystems Arabidopsis Sequencing Consortium (1999). Sequence and analysis of chromosome 4 of the plant Arabidopsis thaliana. Nature 402, 769–777.
Mardis, E. (1994). High-throughput detergent extraction of M13 subclones for fluorescent DNA sequencing. Nucleic Acids Res. 22, 2173–2175.
Marra, M., Kucaba, T., Sekhon, M., Hillier, L., Martienssen, R., Chinwalla, A., Crockett, J., Fedele, J., Grover, H., Gund, C., et al. (1999). A map for sequence analysis of the Arabidopsis thaliana genome. Nat. Genet. 22, 265–270.
Martienssen, R. (1996). Epigenetic phenomena: paramutation and gene silencing in plants. Curr. Biol. 6, 810–813.
Martienssen, R. (1998a). Chromosomal imprinting in plants. Curr. Opin. Genet. Dev. 8, 240–244.
Martienssen, R. (1998b). Transposons, DNA methylation and gene control. Trends Genet. 14, 263–264.
Martienssen, R., and Henikoff, S. (1999). The House and Garden guide to chromatin remodelling. Nat. Genet. 22, 6–7.
McClintock, B. (1929). Chromosome morphology in Zea mays. Science 69, 629.
McClintock, B. (1930). A cytological demonstration of the location of an interchange between two non-homologous chromosomes of Zea mays. Proc. Natl. Acad. Sci. USA 16, 791–796.
McClintock, B. (1951). Chromosome organization and genic expression. Cold Spring Harb. Symp. Quant. Biol. 16, 13–47.
McClintock, B. (1965). The control of gene action in maize. Brookhaven Symp. Biol. 18, 162–184.
McClintock, B. (1978). Mechanisms that rapidly reorganize the genome. Stadler Genet. Symp. 10, 25–48.
McClintock, B., Angel, T., Kato, Y., and Blumenschein, A. (1981). Chromosomal Constitution of Races of Maize (Chapingo, Mexico: Colegio de Postgraduados).
McCombie, W.R., Adams, M.D., Kelley, J.M., FitzGerald, M.G., Utterback, T.R., Khan, M., Dubnick, M., Kerlavage, A.R., Venter, J.C., and Fields, C. (1992a). Caenorhabditis elegans expressed sequence tags identify gene families and potential disease gene homologues. Nat. Genet. 1, 124–131.
McCombie, W.R., Heiner, C., Kelley, J.M., Fitzgerald, M.G., and Gocayne, J.D. (1992b). Rapid and reliable fluorescent cycle sequencing of double-stranded templates. DNA Seq. 2, 289–296.
Mozo, T., Dewar, K., Dunn, P., Ecker, J.R., Fischer, S., Kloska, S., Lehrach, H., Marra, M., Martienssen, R., Meier-Ewert, S., and Altmann, T. (1999). A complete BAC-based physical map of the Arabidopsis thaliana genome. Nat. Genet. 22, 271–275.
Parsons, J.D. (1995). Miropeats: graphical DNA sequence comparisons. Comput. Appl. Biosci. 11, 615–619.
Paszkowski, J., and Mittelsten Scheid, O. (1998). Plant genes: the genetics of epigenetics. Curr. Biol. 8, R206–R208.
Pelissier, T., Tutois, S., Tourmente, S., Deragon, J.M., and Picard, G. (1996). DNA regions flanking the major Arabidopsis thaliana satellite are principally enriched in Athila retroelement sequences. Genetica 97, 141–151.
Pirrotta, V., and Rastelli, L. (1994). White gene expression, repressive chromatin domains and homeotic gene regulation in Drosophila. Bioessays 16, 549–556.
Plasterk, R.H., Izsvak, Z., and Ivics, Z. (1999). Resident aliens: the Tc1/mariner superfamily of transposable elements. Trends Genet. 15, 326–332.
Richards, E.J., and Dawe, R.K. (1998). Plant centromeres: structure and control. Curr. Opin. Plant Biol. 1, 130–135.
Round, E.K., Flowers, S.K., and Richards, E.J. (1997). Arabidopsis thaliana centromere regions: genetic map positions and repetitive DNA structure. Genome Res. 7, 1045–1053.
Rubin, G.M. (1998). The Drosophila genome project: a progress report. Trends Genet. 14, 340–343.
San Miguel, P., Tikhonov, A., Jin, Y.K., Motchoulskaia, N., Zakharov, D., Melake-Berhan, A., Springer, P.S., Edwards, K.J., Lee, M., et al. (1996). Nested retrotransposons in the intergenic regions of the maize genome. Science 274, 765–768.
Schmidt, R., West, J., Love, K., Lenehan, Z., Lister, C., Thompson, H., Bouchez, D., and Dean, C. (1995). Physical map and organization of Arabidopsis thaliana chromosome 4. Science 270, 480–483.
Schnable, P.S., Hsia, A.P., and Nikolau, B.J. (1998). Genetic recombination in plants. Curr. Opin. Plant Biol. 1, 123–129.
Springer, P.S., McCombie, W.R., Sundaresan, V., and Martienssen, R.A. (1995). Gene trap tagging of PROLIFERA, an essential MCM2-3-5-like gene in Arabidopsis. Science 268, 877–880.
Sulston, J., Du, Z., Thomas, K., Wilson, R., Hillier, L., Staden, R., Halloran, N., Green, P., Thierry-Mieg, J., Qiu, L., et al. (1992). The C. elegans genome sequencing project: a beginning. Nature 356, 37–41.
Sun, X., Wahlstrom, J., and Karpen, G. (1997). Molecular structure of a functional Drosophila centromere. Cell 91, 1007–1019.
Sundaresan, V., Springer, P., Volpe, T., Haward, S., Jones, J.D., Dean, C., Ma, H., and Martienssen, R. (1995). Patterns of gene action in plant development revealed by enhancer trap and gene trap transposable elements. Genes Dev. 9, 1797–1810.
Taillebourg, E., and Dura, J.-M. (1999). A novel mechanism for P element homing in Drosophila. Proc. Natl. Acad. Sci. USA 96, 6956–6961.
Thompson, H., Schmidt, R., Brandes, A., Heslop-Harrison, J.S., and Dean, C. (1996). A novel repetitive sequence associated with the centromeric regions of Arabidopsis thaliana chromosomes. Mol. Gen. Genet. 253, 247–252.
Thompson-Stewart, D., Karpen, G.H., and Spradling, A.C. (1994). A transposable element can drive the concerted evolution of tandemly repetitious DNA. Proc. Natl. Acad. Sci. USA 91, 9042–9046.
Thuriaux, P. (1977). Is recombination confined to structural genes on the eukaryotic genome? Nature 268, 460–462.
Timmermans, M.C., Das, O.P., and Messing, J. (1996). Characterization of a meiotic crossover in maize identified by a restriction fragment length polymorphism-based method. Genetics 143, 1771–1783.
Tsugeki, R., Kochieva, E.Z., and Fedoroff, N.V. (1996). A transposon insertion in the Arabidopsis SSR16 gene causes an embryo-defective lethal mutation. Plant J. 10, 479–489.
Vongs, A., Kakutani, T., Martienssen, R.A., and Richards, E.J. (1993). Arabidopsis thaliana DNA methylation mutants. Science 260, 1926–1928.
Wallrath, L.L. (1998). Unfolding the mysteries of heterochromatin. Curr. Opin. Genet. Dev. 8, 147–153.
Waterston, R., Martin, C., Craxton, M., Huynh, C., Coulson, A., Hillier, L., Durbin, R., Green, P., Shownkeen, R., Halloran, N., et al. (1992). A survey of expressed genes in Caenorhabditis elegans. Nat. Genet. 1, 114–123.
Weil, C.F., and Wessler, S.R. (1993). Molecular evidence that chromosome breakage by Ds elements is caused by aberrant transposition. Plant Cell 5, 515–522.
Wilson, R., Ainscough, R., Anderson, K., Baynes, C., Berks, M., Bonfield, J., Burton, J., Connell, M., Copsey, T., et al. (1994). 2.2 Mb of contiguous nucleotide sequence from chromosome III of C. elegans. Nature 368, 32–38.
Wolffe, A.P. (1998). When more is less. Nat. Genet. 18, 5–6.
Wolffe, A.P., and Hayes, J.J. (1999). Chromatin disruption and modification. Nucleic Acids Res. 27, 711–720.
Xu, X., Hsia, A.P., Zhang, L., Nikolau, B.J., and Schnable, P.S. (1995). Meiotic recombination break points resolve at high rates at the 5′ end of a maize coding sequence. Plant Cell 7, 2151–2161.
Zhang, M. (1998). Identification of protein-coding regions in Arabidopsis thaliana genome based on quadratic discriminant analysis. Plant Mol. Biol. 37, 803–806.

GenBank Accession Numbers

The GenBank accession numbers for the sequences reported in this paper are as follows: AF007269 (F2N1), AF069300 (F3D13), AF096370 (F11O4), AF104919 (T15B16), AC007138 (T7B11), AF001308 (T10M13), AF075597 (T2H3), AF069298 (T14P8), AC002330 (T10P11), AC004044 (T5J8), AF069442 (T4I9), AC005275 (F4C21), AF071527 (F9H3), AC005142 (T5L23), AF096372 (T5H22), AF077408 (T7M24), AF128394 (T25H8), AF077409 (T24M8), AF075598 (T24H24), AF076274 (T27D20), AF069441 (T19B17), AF076243 (T26N6), AF074021 (F4H6), AF149414 (T19J18), AF118223 (T4B21), and AF128393 (T1J1).
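
The accession list under GenBank Accession Numbers can be cross-checked against the BAC clones named under Sequencing Methods. The following Python sketch is illustrative only: the clone names and accession numbers are copied from those two sections, while the regular expression, the function name parse_accessions, and the variable names are assumptions introduced here and are not part of the original analysis.

```python
import re

# Accession/clone pairs copied from the GenBank Accession Numbers section.
ACCESSION_TEXT = """
AF007269 (F2N1), AF069300 (F3D13), AF096370 (F11O4), AF104919 (T15B16),
AC007138 (T7B11), AF001308 (T10M13), AF075597 (T2H3), AF069298 (T14P8),
AC002330 (T10P11), AC004044 (T5J8), AF069442 (T4I9), AC005275 (F4C21),
AF071527 (F9H3), AC005142 (T5L23), AF096372 (T5H22), AF077408 (T7M24),
AF128394 (T25H8), AF077409 (T24M8), AF075598 (T24H24), AF076274 (T27D20),
AF069441 (T19B17), AF076243 (T26N6), AF074021 (F4H6), AF149414 (T19J18),
AF118223 (T4B21), and AF128393 (T1J1).
"""

# BAC clones named in the Sequencing Methods section.
SEQUENCED_CLONES = (
    "F2N1 F3D13 F11O4 T15B16 T7B11 T10M13 T2H3 T14P8 T10P11 T5J8 T4I9 "
    "F4C21 F9H3 T5L23 T5H22 T7M24 T25H8 T24M8 T24H24 T27D20 T19B17 T26N6 "
    "F4H6 T19J18 T4B21 T1J1"
).split()

# Assumed pattern: two letters plus six digits, then a clone name in parentheses.
PAIR_PATTERN = re.compile(r"\b([A-Z]{2}\d{6})\s*\(([A-Z]\d+[A-Z]\d+)\)")


def parse_accessions(text):
    """Return a {clone_name: accession} dict from 'ACCESSION (CLONE)' pairs."""
    return {clone: accession for accession, clone in PAIR_PATTERN.findall(text)}


clone_to_accession = parse_accessions(ACCESSION_TEXT)

# Clones that were sequenced but have no accession in the list.
missing = [c for c in SEQUENCED_CLONES if c not in clone_to_accession]

print(len(clone_to_accession), "clones with accessions")
print("clones without an accession:", missing)
```

With the lists as transcribed here, all 26 sequenced BAC clones have a matching accession number, so the second list is empty.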
