
© 2001 Wiley-Liss, Inc.

Cytometry 47:1–7 (2002)

DOI 10.1002/cyto.10030

Lack of Correlation Between AT Frequency and Genome Size in Higher Plants and the Effect of Nonrandomness of Base Sequences on Dye Binding
Martin Barow and Armin Meister
Institut für Pflanzengenetik und Kulturpflanzenforschung (IPK), Department of Cytogenetics, Gatersleben, Germany
Received 11 December 2000; Revision Received 3 September 2001; Accepted 22 September 2001

Background: Different plant species vary as to the ratio of nucleotide base pairs of genomic DNA. A correlation between genome size and base pair ratio has been claimed. Base composition can be analyzed by base-specific dyes.
Methods: Genome size is determined by flow cytometry of suspensions of nuclei stained by the base-independent dye, PI. For estimation of the AT frequency, the AT-specific dyes 4′,6-diamidino-2-phenylindole, dihydrochloride (DAPI) and Hoechst 33342 (HO) were used. We define a dye factor (DF) as the ratio of the two estimates (peak ratios) of nuclear fluorescence intensities of sample relative to reference plant nuclei using a given dye and an intercalating fluorochrome.
Results: No significant correlation between genome size and the DF for DAPI was found when 54 plant species were investigated. However, similarities within and differences among the plant families were shown. The comparison of DAPI and HO DFs gave no consistent differences as would be predicted from the model of different binding site length of dyes. This result may be explained by the nonrandom distribution of base pairs.
Conclusions: There is no general correlation between genome size and AT/GC ratio in higher plants. Similar AT/GC ratios within a plant family result from the general similarity of the DNA sequences within a family. The fluorescence of base-specific dyes is influenced by the nonrandom distribution of bases in the DNA molecule. Cytometry 47:1–7, 2002. © 2001 Wiley-Liss, Inc.

Key terms: AT/GC ratio; DNA content; flow cytometry; base-specific dyes; dye binding; Hoechst; PI; DAPI; angiosperms

Different species do not vary as to their genomic DNA content only. The ratio of the complementary nucleic bases adenine and thymine (AT) and guanine and cytosine (GC) also differs (1). In animals, there is a tendency of larger genomes to have a higher GC frequency (2,3). A similar relation seems to exist in higher plants (2), but this assumption is based on the data of only six species, compiled from different sources (4,5).
According to Vinogradov (3), the positive correlation of GC frequency and genome size may be explained by the greater physical and chemical stability of large genomes containing a relatively high GC frequency. It is possible to check the claimed correlation between genome size and AT frequency by means of flow cytometry after staining with intercalating and base-specific fluorochromes, respectively. Several papers report on base pair ratios of some species using this method (6–8). However, to our knowledge, no comprehensive investigation of base pair ratios has been done for plant species in relation to their DNA content and phylogenetic position.
Beside this general aim, the relation between AT-specific fluorescence and the AT frequency of the genome is investigated. Dolezel et al. (6) showed that base-specific fluorescence is not proportional to the base content, but rather follows a curvilinear relation. Provided that the bases in the genome are distributed randomly, a formula is given for the relation between base frequency and fluorescence intensity (8,9). It is based on the condition that only a certain number, n, of consecutive bases of the same type (AT or GC) is able to bind a dye molecule. For Hoechst 33342 (HO), this number has been found to be n = 5 (8,9), whereas a range of n = 3–4 is proposed for 4′,6-diamidino-2-phenylindole, dihydrochloride (DAPI; 10). Using this formula, a comparison between DAPI and HO should give an unambiguous value of n for DAPI.
The results do not support this idea. A possible reason is the deviation of base distribution from randomness. Therefore, by evaluating known base sequences from several organisms, the effect of nonrandom base distribution on fluorescence intensity is determined and discussed.

Grant sponsors: Deutsche Forschungsgemeinschaft; Grant number: ME 1083/2-1; Grant sponsor: Fond der Chemischen Industrie.
*Correspondence to: Armin Meister, Institut für Pflanzengenetik und Kulturpflanzenforschung (IPK), D-06466 Gatersleben, Germany.
E-mail: meister@ipk-gatersleben.de

MATERIALS AND METHODS
Plant Material
Fifty-four species of 17 families have been studied (Table 1), 6 of them for the first time with respect to DNA content. The plants were cultivated in pots on garden mulch in a greenhouse under standard conditions. In some cases, especially for gymnosperms, Fagaceae and Rosaceae, material was taken from plants growing outside. Young leaves were used for analysis. For gymnosperms, Fagus sylvatica, and the Rosaceae Physocarpus opulifolius and Cydonia oblonga, flower and leaf buds were analyzed instead.
Some of the standards proposed by Dolezel et al. (11) were used as references: Allium cepa, Glycine max, Pisum sativum, Raphanus sativus, Secale cereale, and Vicia faba. The calculation of reference values is described in the Results and Discussion section.

Preparation of Nuclear Suspensions
The analysis is based on the mean of four measurements of material from different individuals. For preparation of suspensions of nuclei, 30–100 mg tissue from each species was chopped with a razor blade together with material from a reference plant in 1 ml ice-cold staining buffer in a Petri dish according to Galbraith et al. (4) and filtered through a 35-µm mesh (Falcon 12 × 75 mm tube with a 35-µm strainer cap).
The Galbraith buffer (4) was supplemented either with 50 µg/ml propidium iodide (PI; Molecular Probes, Eugene, OR) and 50 µg/ml DNase-free RNase (Boehringer Ingelheim Bioproducts Partnership, Heidelberg, Germany), with 1 µg/ml DAPI (Molecular Probes), or with 1 µg/ml HO (Molecular Probes). In some cases, 0.5–1% Triton (Sigma, Steinheim, Germany) for reducing adhesion of cellular debris, 5% polyvinylpyrrolidone 25 (PVP; Serva, Heidelberg/New York) for binding phenolic compounds, and/or 50–100 mM potassium metabisulfite (Riedel-de Haën, Seelze, Germany) as an antioxidant were added in order to obtain acceptable coefficient of variation (CV) values. Different buffers result only in minor variation of peak ratios in experiments with internal standardization (11).
The analysis was done with a FACStarPLUS flow cytometer (Becton Dickinson, San Jose, CA) equipped with two argon lasers INNOVA 90-5 (Coherent, Palo Alto, CA) using the analysis program CellQuest. PI fluorescence was excited with 500 mW at 514 nm and measured in the FL1 channel using a 630-nm band pass filter. DAPI and HO fluorescence was excited with 200 mW in the UV range and measured in the FL1 channel using a 450-nm band pass filter. To reduce counts from fluorescent debris, gates were set in the FL1/SSC dot plot. Usually 10,000 nuclei within the gate were measured.

Comparison of DAPI and HO
Seven combinations of species were measured after staining with DAPI and HO. The corresponding measurements were carried out on the same day to avoid a change of instrument settings, which might possibly influence the results.

Statistical Analysis
The statistical analyses were performed using Statistica for Macintosh (Statsoft, Tulsa, OK).

RESULTS AND DISCUSSION
Estimation of Genome Size and AT Frequency
For the base-independent intercalating dye PI, the fluorescence intensity $F_{PI}$ is proportional to the genome size 2C with a proportionality constant $k_1$:

$$F_{PI,sample} = k_1 \cdot 2C_{sample} \tag{1}$$

The genome size of an unknown sample can be determined by comparing with a reference plant (11) as

$$2C_{sample} = 2C_{reference} \cdot R_{PI,sample} \tag{2}$$

where $R_{PI,sample}$ is the ratio of sample and reference fluorescence intensity with dye PI (the peak ratio in the flow cytometric histogram):

$$R_{PI,sample} = F_{PI,sample} / F_{PI,reference} \tag{3}$$

For a base-specific dye such as AT-specific DAPI with a frequency AT of AT bases of the genome, one would expect proportionality to the total number of the corresponding bases, i.e., an equation similar to eq. (1), but with AT · 2C instead of 2C. However, this seems not to be the case (6). Rather, a curvilinear relation exists in the form

$$F_{DAPI,sample} = k_2 \cdot f(AT_{sample}) \cdot 2C_{sample} \tag{4}$$

where f(AT) represents the relation between the frequency of AT bases and the AT-specific fluorescence and $k_2$ is a proportionality constant. f(AT) can be considered the binding probability of a dye molecule to the DNA molecule with a given AT frequency. The ratio $R_{DAPI,sample}$ between fluorescence of sample and reference is then analogous to eq. (3)

$$R_{DAPI,sample} = \frac{f(AT_{sample}) \cdot 2C_{sample}}{f(AT_{reference}) \cdot 2C_{reference}} \tag{5}$$

In order to obtain a parameter that characterizes the AT-specific fluorescence independently of the genome size, Zonneveld and Van Iren (12) used the PI peak ratio relative to the DAPI peak ratio as an experimental parameter correlated with base composition. Similarly, we define a dye factor (DF) for a certain base-specific dye (in this case, DAPI) in the following way:

$$DF_{DAPI,sample} = R_{DAPI,sample} / R_{PI,sample} = f(AT_{sample}) / f(AT_{reference}) \tag{6}$$
Table 1
DNA Content, DAPI Factor, and AT Frequency of the Investigated Species*

Family | Code | Species | Reference species^b | 2C DNA content (pg) | DAPI factor | AT frequency (%)
Ginkgoaceae | A | Ginkgo biloba | S.c. | 21.6 | 1.20 | 65.3
Pinaceae | B | Abies concolor^a | V.f. | 36.1 | 0.97 | 60.9
Pinaceae | B | Larix decidua | S.c. | 25.7 | 0.96 | 60.7
Pinaceae | B | Picea abies | V.f. | 38.6 | 0.93 | 60.1
Pinaceae | B | Pinus sylvestris | V.f. | 44.2 | 0.97 | 60.9
Ranunculaceae | C | Anemone ranunculoides^a | V.f. | 36.8 | 0.91 | 59.7
Ranunculaceae | C | Anemone sylvestris | V.f. | 17.0 | 0.93 | 60.1
Ranunculaceae | C | Aquilegia vulgaris^a | R.s. | 1.00 | 0.94 | 60.3
Chenopodiaceae | D | Atriplex rosea | R.s. | 2.11 | 1.00 | 61.5
Chenopodiaceae | D | Beta vulgaris | R.s. | 1.84 | 1.06 | 62.7
Chenopodiaceae | D | Spinacia oleracea | R.s. | 2.68 | 0.99 | 61.3
Urticaceae | E | Urtica dioica | P.s. | 2.33 | 1.01 | 61.7
Urticaceae | E | Urtica urens | G.m. | 1.07 | 1.08 | 63.1
Fagaceae | F | Castanea sativa^a | R.s. | 1.95 | 1.06 | 62.7
Fagaceae | F | Fagus sylvatica | G.m. | 1.30 | 1.09 | 63.3
Fagaceae | F | Quercus robur | G.m. | 2.18 | 1.06 | 62.7
Rosaceae | G | Cydonia oblonga | G.m. | 1.98 | 0.89 | 59.2
Rosaceae | G | Duchesnea indica | G.m. | 4.18 | 0.96 | 60.7
Rosaceae | G | Physocarpus opulifolius | R.s. | 0.710 | 0.98 | 61.1
Fabaceae | H | Glycine max | Std. | 2.73 | 1.11 | 63.6
Fabaceae | H | Lathyrus articulatus | P.s. | 11.5 | 1.05 | 62.5
Fabaceae | H | Phaseolus vulgaris | G.m. | 1.57 | 1.05 | 62.5
Fabaceae | H | Pisum sativum | Std. | 9.07 | 1.00 | 61.5
Fabaceae | H | Trifolium pratense | G.m. | 1.06 | 1.17 | 64.8
Fabaceae | H | Vicia faba | Std. | 26.2 | 1.02 | 61.9
Brassicaceae | I | Alliaria petiolata | R.s. | 2.69 | 0.97 | 60.9
Brassicaceae | I | Arabidopsis thaliana | R.s. | 0.424 | 0.97 | 60.9
Brassicaceae | I | Brassica napus | R.s. | 2.94 | 0.98 | 61.1
Brassicaceae | I | Raphanus sativus | Std. | 1.38 | 0.97 | 60.9
Brassicaceae | I | Sinapis arvensis | G.m. | 1.35 | 0.96 | 60.7
Cucurbitaceae | J | Cucumis sativus | R.s. | 1.02 | 1.06 | 62.7
Cucurbitaceae | J | Cucurbita moschata | R.s. | 0.968 | 0.99 | 61.3
Cucurbitaceae | J | Cucurbita pepo | G.m. | 1.17 | 0.99 | 61.3
Cucurbitaceae | J | Momordica charantia | R.s. | 1.42 | 1.10 | 63.5
Solanaceae | K | Capsicum frutescens | P.s. | 7.34 | 1.13 | 64.0
Solanaceae | K | Lycopersicon pimpinellifolium | G.m. | 2.28 | 1.07 | 62.9
Solanaceae | K | Nicotiana tabacum | V.f. | 9.77 | 1.02 | 61.9
Lamiaceae | L | Hyssopus officinalis | R.s. | 1.12 | 1.05 | 62.5
Lamiaceae | L | Stachys grandiflora^a | P.s. | 12.4 | 0.94 | 60.3
Lamiaceae | L | Teucrium scorodonia | R.s. | 2.85 | 0.95 | 60.5
Asteraceae | M | Chrysanthemum multicolor | V.f. | 32.5 | 1.10 | 63.5
Asteraceae | M | Haplopappus gracilis | G.m. | 2.38 | 1.01 | 61.7
Asteraceae | M | Lactuca sativa | G.m. | 6.59 | 0.94 | 60.3
Asparagaceae | N | Asparagus officinalis | R.s. | 3.63 | 0.90 | 59.4
Alliaceae | O | Allium cepa | Std. | 33.7 | 1.20 | 65.3
Alliaceae | O | Allium ledebourianum | P.s. | 17.4 | 1.11 | 63.6
Alliaceae | O | Allium porrum | A.c. | 65.5 | 1.12 | 63.8
Alliaceae | O | Allium ursinum | A.c. | 62.1 | 1.17 | 64.8
Liliaceae | P | Fritillaria uva-vulpis^a | A.p./A.c. | 165 | 1.15 | 64.4
Poaceae | Q | Hordeum vulgare | S.c. | 10.3 | 0.72 | 55.3
Poaceae | Q | Oryza sativa | R.s. | 1.17 | 0.79 | 57.0
Poaceae | Q | Secale cereale | Std. | 16.0 | 0.72 | 55.4
Poaceae | Q | Triticum aestivum | V.f. | 32.8 | 0.67 | 54.1
Poaceae | Q | Zea mays | G.m. | 5.90 | 0.62 | 52.8

*The genome size was calculated with Pisum sativum 'Viktoria, Kifejto Borso' as the primary standard using the intercalating dye PI. The AT frequency was calculated from the DAPI factor with eqs. (6) and (7). For the primary standard Pisum sativum, a value of 2C = 9.07 pg and an AT frequency of 61.5% were used (see Genome Size and AT Frequency of the Standards). The 2C values and AT frequencies of the secondary standards were calculated on the basis of the primary standard and the results obtained from Dolezel et al. (11), Tables 4 and 7 and laboratory L1. The family-attached code is used for the identification of the families in Figure 1.
^a Species tested for the first time for DNA content.
^b The species used as secondary standards beside the primary standard Pisum sativum (P.s.): A.c., Allium cepa; A.p., Allium porrum; G.m., Glycine max; R.s., Raphanus sativus; S.c., Secale cereale; V.f., Vicia faba; Std., standard itself.
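The correlation analysis applied to the genome sizes and DAPI factors of Table 1 can be illustrated with a short sketch. The paper performed its statistics in Statistica, so the code below is only an assumed reading of the pooled within-groups correlation: both variables are centered on their family means before an ordinary Pearson coefficient is computed. The function names are hypothetical, and only a few rows of Table 1 are used for illustration.

```python
from collections import defaultdict
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def pooled_within_group_r(groups, x, y):
    """Correlation of x and y after subtracting each group's mean from both.

    Assumption: centering both variables per family; the paper only states
    that the family mean of the DAPI factor is subtracted.
    """
    members = defaultdict(list)
    for i, g in enumerate(groups):
        members[g].append(i)
    xc, yc = [0.0] * len(x), [0.0] * len(y)
    for idx in members.values():
        mx = sum(x[i] for i in idx) / len(idx)
        my = sum(y[i] for i in idx) / len(idx)
        for i in idx:
            xc[i], yc[i] = x[i] - mx, y[i] - my
    return pearson_r(xc, yc)


# Pinaceae (B) and Poaceae (Q) rows of Table 1: family code, 2C (pg), DAPI factor.
rows = [("B", 36.1, 0.97), ("B", 25.7, 0.96), ("B", 38.6, 0.93), ("B", 44.2, 0.97),
        ("Q", 10.3, 0.72), ("Q", 1.17, 0.79), ("Q", 16.0, 0.72), ("Q", 32.8, 0.67),
        ("Q", 5.90, 0.62)]
codes, size, dapi = zip(*rows)
print("overall r:", pearson_r(size, dapi))
print("within-family r:", pooled_within_group_r(codes, size, dapi))
```

Comparing the two coefficients shows how much of an apparent overall trend is due to differences between families rather than to a trend within them.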

which depends only on the AT-specific fluorescence and is independent of the genome size, as can be seen from the equation.
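As a worked illustration of eqs. (2), (3), and (6), the following sketch computes a dye factor from two peak ratios. The function names are hypothetical; the example values are the peak ratios of human leukocytes relative to P. sativum quoted in the section on the standards.

```python
def peak_ratio(f_sample, f_reference):
    """Eq. (3): ratio of sample to reference fluorescence intensity."""
    return f_sample / f_reference


def genome_size(c2_reference, r_pi):
    """Eq. (2): 2C value of the sample from the reference 2C and the PI peak ratio."""
    return c2_reference * r_pi


def dye_factor(r_dye, r_pi):
    """Eq. (6): peak ratio with the base-specific dye divided by the PI peak ratio."""
    return r_dye / r_pi


# Human leukocytes relative to P. sativum (peak ratios as quoted in the text).
r_pi = 1 / 1.295     # PI peak ratio
r_dapi = 1 / 1.432   # DAPI peak ratio
print(round(dye_factor(r_dapi, r_pi), 3))  # 0.904
```

Because the genome size enters both peak ratios in the same way, it cancels in the quotient, which is why the dye factor reflects only the AT-specific fluorescence.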
Because the actual value of $DF_{DAPI,sample}$ depends on the reference used, we define P. sativum as a primary standard that results in $DF_{DAPI,Pisum} = 1$ because the sample agrees with the reference. This value corresponds to an AT frequency of 61.5% (see next section).
According to references (8) and (9), the relation f(AT) has the form

$$f(AT) = \frac{(1 - AT)\,AT^{n}}{1 - AT^{n}} \tag{7}$$

It is based on the following assumptions: (1) a certain number, n, of consecutive AT bases is necessary for binding one dye molecule and (2) the AT bases are distributed randomly within the DNA molecule. If n and $AT_{reference}$ are known and $DF_{DAPI,sample}$ is measured, the AT frequency of the sample can be calculated in the following way. According to eq. (6), $f(AT_{sample})$ results in

$$f(AT_{sample}) = DF_{DAPI,sample} \cdot f(AT_{reference}) \tag{8}$$
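Eq. (7) has no closed-form inverse, so the AT frequency has to be found numerically from eq. (8). The sketch below is a minimal Python version of such an inversion by regula falsi; the original calculation was done in a Visual Basic program that is not reproduced in the paper, so the function names, the bracket [0.01, 0.99], and the stopping rule are assumptions made for illustration.

```python
def f_at(at, n):
    """Eq. (7): binding probability for AT frequency `at` and binding site length n."""
    return (1.0 - at) * at ** n / (1.0 - at ** n)


def at_from_dye_factor(df, at_reference, n, lo=0.01, hi=0.99, tol=1e-12, max_iter=500):
    """Solve f(AT) = DF * f(AT_reference), eq. (8), for AT by regula falsi."""
    target = df * f_at(at_reference, n)
    g_lo = f_at(lo, n) - target
    g_hi = f_at(hi, n) - target
    if g_lo * g_hi > 0:
        raise ValueError("target lies outside the bracket [lo, hi]")
    x = lo
    for _ in range(max_iter):
        # Intersection of the secant through the bracket ends with zero.
        x_new = hi - g_hi * (hi - lo) / (g_hi - g_lo)
        g_new = f_at(x_new, n) - target
        if abs(x_new - x) < tol or g_new == 0.0:
            return x_new
        # Keep the sub-interval that still contains the sign change.
        if g_lo * g_new < 0:
            hi, g_hi = x_new, g_new
        else:
            lo, g_lo = x_new, g_new
        x = x_new
    return x


# DAPI factor 1.20 relative to a reference with AT = 61.5 % and n = 4
# (Ginkgo biloba in Table 1, listed there with 65.3 %).
print(round(at_from_dye_factor(1.20, 0.615, 4), 3))
```

The bracket works because f(AT) increases monotonically with AT between 0 and 1, so a single sign change exists whenever the target value is attainable.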


By means of a numeric approximation method (regula falsi, carried out by a Visual Basic program), the AT frequency can be calculated from $f(AT_{sample})$ using eq. (7).
For the sake of simplicity, the above formalism was established for DAPI. However, it is valid for any other dye, including GC-specific dyes, analogously. Most authors used these equations to calculate the AT/GC ratio from their flow cytometric results. However, for various reasons, eq. (7) may give incorrect results: (1) the AT bases may be distributed nonrandomly and (2) the value of n is not known exactly [e.g., 3 or 4 for DAPI according to Portugal and Waring (10)].
The DF for DAPI, $DF_{DAPI}$, may be a better parameter for characterizing the AT frequency. It depends only on the AT frequency itself and does not require any assumption as to the fluorescence dependence on AT.

Genome Size and AT Frequency of the Standards
P. sativum 'Viktoria, Kifejto Borso' was used as the primary standard. The values for the genome size and AT frequency are based on reference (6). For the genome size, a value of 2C = 9.07 pg is used. The 2C values of the other references were calculated on the basis of this primary standard and the results obtained by Dolezel et al. (11), Table 4, laboratory L1.
The AT frequency of P. sativum was calculated from eqs. (6) and (7) using a value of 59.5% for human leukocytes (6,13) and an assumed binding site length for DAPI of n = 4. According to Table 1 in Dolezel et al. (6), the PI peak ratio of Homo sapiens leukocytes relative to the primary standard P. sativum is $R_{PI,leuco}$ = 1/1.295 = 0.772. Analogously, the DAPI peak ratio is $R_{DAPI,leuco}$ = 1/1.432 = 0.698. Therefore, the DAPI DF of human leukocytes relative to P. sativum is $DF_{DAPI,leuco}$ = 0.698/0.772 = 0.904. By combining eqs. (6) and (7), the following equation results:

$$DF_{DAPI,leuco} = \frac{(1 - AT_{leuco})\,AT_{leuco}^{n} / (1 - AT_{leuco}^{n})}{(1 - AT_{Pisum})\,AT_{Pisum}^{n} / (1 - AT_{Pisum}^{n})} \tag{9}$$

With the corresponding values for $DF_{DAPI,leuco}$, $AT_{leuco}$, and n, it results in:

$$0.904 = \frac{(1 - 0.595) \cdot 0.595^{4} / (1 - 0.595^{4})}{(1 - AT_{Pisum})\,AT_{Pisum}^{4} / (1 - AT_{Pisum}^{4})} \tag{10}$$

From this equation, $AT_{Pisum}$ is calculated by a numerical method as described above as 0.615 or 61.5%. Based on this primary standard, the AT frequency of the secondary standards was calculated by eqs. (6) and (7) in a similar way.

FIG. 1. DAPI factor (correlated with the AT frequency) in relation to the genome size. The letters indicate the family according to Table 1. No correlation between genome size and DAPI factor is found, neither in the total amount of data nor within the families (same letter). However, significant differences exist among different families (see Table 2).

Correlation Between Genome Size and AT Frequency
The investigated species, their genome sizes, DAPI factors, and AT frequencies (calculated from the DAPI factor on the basis of eq. (7) with $n_{DAPI}$ = 4) are listed in Table 1. The relationship between genome size and DAPI factor is shown in Figure 1. No trend is discernible. Correspondingly, the coefficient of correlation, r = 0.19 (P = 0.16), is clearly below the significance level. It even has an opposite sign when compared with the results of Vinogradov (2). Because enlargement of a genome should be connected with an increased GC frequency, the DAPI factor should be negatively correlated with the genome
size. Hence, the positive correlation between genome size and GC frequency with r = 0.90 according to Vinogradov could not be verified by our analysis of 54 plant species.
Using the AT values calculated on the basis of eq. (7) instead of DAPI factors gives rather similar results (r = 0.18). This is not surprising, because the AT range in spermatophyta is rather narrow and the DAPI factor and AT frequency are highly correlated within this range. Figure 1 also shows that within the families no trends are detectable although species characterized by a wide range of genome sizes have been included. This is confirmed by the calculation of the pooled within-groups correlations that determines the correlations after subtracting the mean DAPI factor for each family, so eliminating the family effect. It results also in a very small, nonsignificant coefficient of correlation (r = 0.09).
Therefore, the high correlation that Vinogradov (2) obtained using data from literature is rather astonishing: only six species were included; for three species, only a range of possible DNA contents was listed (4); for two of the three species (Zea mays and Lycopersicon esculentum) the DNA content varied more than twofold. If the latest available values of genome size for these six species are used instead (website http://www.rbgkew.org.uk/cval/homepage.html), the coefficient of correlation will be diminished to r = 0.549 (P = 0.26). These facts might be explained by a random constellation of data, which led to an apparent significance within the limits of error probability.

Differences of AT Frequency Among Plant Families
Figure 1 shows obvious differences among the families. For example, all analyzed Poaceae (letter Q) have very low DAPI factors and, thus, a relatively small AT frequency, whereas the corresponding values are clearly above the average for Alliaceae (letter O). This finding has been confirmed by analysis of variance with subsequent Newman-Keuls test.
In Table 2, the mean values and SDs of the DAPI factor for each family are listed with an indication of significance to characterize the differences among the families. Despite the small number of species studied within each family, a large number of significant differences is found. The existence of such differences is not surprising. Because of the phylogenetic relations and the corresponding similarities of the DNA sequences, similar AT frequencies and, therefore, similar DAPI factors within a family and corresponding differences between the families were to be expected. If the evolutionary increase of a genome is caused mainly by amplification of sequences already present in the genome and not by the addition to a large extent of sequences of deviating AT ratios, this could provide an explanation for the observed phenomenon.

Table 2
Differences Between the DAPI DFs of the Investigated Families*

Family | Average DAPI factor ± SD | Significance code^b
Poaceae | 0.70 ± 0.06 | a
Asparagaceae | 0.90 ± 0^a | b
Ranunculaceae | 0.93 ± 0.02 | bc
Rosaceae | 0.94 ± 0.05 | bc
Pinaceae | 0.96 ± 0.02 | bc
Brassicaceae | 0.97 ± 0.01 | bc
Lamiaceae | 0.98 ± 0.06 | bc
Chenopodiaceae | 1.02 ± 0.04 | bcd
Asteraceae | 1.02 ± 0.08 | bcd
Cucurbitaceae | 1.04 ± 0.05 | bcd
Urticaceae | 1.05 ± 0.05 | bcd
Solanaceae | 1.07 ± 0.06 | cd
Fagaceae | 1.07 ± 0.02 | cde
Fabaceae | 1.07 ± 0.06 | cde
Liliaceae | 1.15 ± 0^a | de
Alliaceae | 1.15 ± 0.04 | de
Ginkgoaceae | 1.20 ± 0^a | e

*Significant differences appear among several families. Note that Poaceae with the lowest DF (lowest AT frequency) significantly differ from all other families.
^a Only one species was analyzed.
^b Same letters indicate a lack of significant differences among the corresponding families. Only families that do not have the same letter are significantly different.

Comparison Among DAPI and HO
From the literature, no unequivocal value for the binding site length n of DAPI is available. It is assumed

Table 3
Comparison of HO and DAPI DFs (Means of Four Independent Measurements ± SD)

Species combination | DF (HO) | DF (DAPI) | Difference | n_DAPI^a
1. Glycine max/Lactuca sativa | 1.307 (0.013) | 1.160 (0.007) | 0.147* | 2.77 (0.15)
2. Pisum sativum/Secale cereale | 1.463 (0.016) | 1.416 (0.008) | 0.047* | 4.57 (0.15)
3. Allium ledebourianum/Secale cereale | 1.770 (0.026) | 1.557 (0.022) | 0.213* | 3.88 (0.16)
4. Raphanus sativus/Oryza sativa | 1.278 (0.005) | 1.249 (0.009) | 0.029* | 4.53 (0.16)
5. Allium cepa/Vicia faba | 1.236 (0.011) | 1.235 (0.004) | 0.001 | 4.98 (0.22)
6. Glycine max/Raphanus sativus | 1.297 (0.020) | 1.190 (0.015) | 0.107* | 3.34 (0.31)
7. Vicia faba/Secale cereale | 1.378 (0.061) | 1.420 (0.010) | −0.042 | 5.47 (0.76)

Difference = DF_HO − DF_DAPI; SD values are given in parentheses.
*Significant at P < 0.05 according to the Mann-Whitney test.
^a n_DAPI is calculated using eq. (11) (in parentheses: SD according to the law of error propagation). Contrary to the expectation, it varies over a wide range.
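The n_DAPI column of Table 3 can be reproduced from the two dye factors with eq. (11), n_DAPI = n_HO · ln(DF_DAPI) / ln(DF_HO). The sketch below also propagates the SDs; the paper only names "the law of error propagation", so the first-order formula with independent errors used here is an assumption, and the function names are hypothetical.

```python
from math import log, sqrt


def n_dapi(df_dapi, df_ho, n_ho=5.0):
    """Eq. (11): binding site length of DAPI from the dye factors of DAPI and HO."""
    return n_ho * log(df_dapi) / log(df_ho)


def n_dapi_sd(df_dapi, sd_dapi, df_ho, sd_ho, n_ho=5.0):
    """First-order error propagation for eq. (11), assuming independent errors."""
    d_dapi = n_ho / (df_dapi * log(df_ho))                    # partial derivative w.r.t. DF_DAPI
    d_ho = -n_ho * log(df_dapi) / (df_ho * log(df_ho) ** 2)   # partial derivative w.r.t. DF_HO
    return sqrt((d_dapi * sd_dapi) ** 2 + (d_ho * sd_ho) ** 2)


# Row 1 of Table 3: Glycine max / Lactuca sativa
print(round(n_dapi(1.160, 1.307), 2))                    # 2.77
print(round(n_dapi_sd(1.160, 0.007, 1.307, 0.013), 2))   # 0.15
```

Both partial derivatives contain ln(DF_HO) in the denominator, so the estimate becomes very imprecise when the dye factors approach 1.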

Table 4
Relative AT-Specific Fluorescence (DF)*

Organism | Genome size (Mbp) | AT (%) | n | DF for the given sequence | DF for random distribution | Difference relative to random distribution (%)
Arabidopsis thaliana, chromosome 2 | 19.6 | 64.1 | 4 | 1.081 | 1.136 | −4.8
 |  |  | 5 | 1.104 | 1.174 | −6.0
Arabidopsis thaliana, chromosome 4 | 17.5 | 64.0 | 4 | 0.848 | 1.128 | −24.8
 |  |  | 5 | 0.792 | 1.163 | −31.9
Archaeoglobus fulgidus | 2.2 | 51.4 | 4 | 0.545 | 0.568 | −4.1
 |  |  | 5 | 0.474 | 0.490 | −3.3
Borrelia burgdorferi | 1.5 | 71.8 | 4 | 1.657 | 1.591 | +4.2
 |  |  | 5 | 1.923 | 1.793 | +7.2
Caenorhabditis elegans, chromosome 1 | 13.0 | 64.1 | 4 | 1.210 | 1.136 | +6.5
 |  |  | 5 | 1.330 | 1.174 | +13.3
Caenorhabditis elegans, chromosome 2 | 14.9 | 63.8 | 4 | 1.179 | 1.120 | +5.2
 |  |  | 5 | 1.287 | 1.152 | +11.7
Caenorhabditis elegans, chromosome 3 | 13.0 | 64.3 | 4 | 1.229 | 1.144 | +7.5
 |  |  | 5 | 1.362 | 1.185 | +15.0
Caenorhabditis elegans, chromosome 4 | 16.8 | 65.3 | 4 | 1.253 | 1.198 | +4.5
 |  |  | 5 | 1.384 | 1.255 | +10.3
Caenorhabditis elegans, chromosome 5 | 21.2 | 64.6 | 4 | 1.210 | 1.159 | +4.4
 |  |  | 5 | 1.314 | 1.206 | +8.9
Caenorhabditis elegans, chromosome X | 17.4 | 64.8 | 4 | 1.202 | 1.175 | +2.3
 |  |  | 5 | 1.292 | 1.222 | +5.7
Chlamydia pneumoniae | 1.2 | 59.4 | 4 | 0.864 | 0.899 | −3.9
 |  |  | 5 | 0.829 | 0.872 | −4.9
Chlamydia trachomatis | 1.1 | 59.7 | 4 | 0.910 | 0.914 | −0.4
 |  |  | 5 | 0.867 | 0.894 | −3.0
Deinococcus radiodurans | 3.3 | 33.4 | 4 | 0.144 | 0.132 | +8.8
 |  |  | 5 | 0.102 | 0.075 | +35.7
Escherichia coli | 4.6 | 49.2 | 4 | 0.599 | 0.494 | +21.3
 |  |  | 5 | 0.560 | 0.404 | +38.7
Haemophilus influenzae | 1.8 | 61.8 | 4 | 1.132 | 1.019 | +11.1
 |  |  | 5 | 1.212 | 1.023 | +18.4
Helicobacter pylori | 1.7 | 61.1 | 4 | 1.144 | 0.980 | +16.7
 |  |  | 5 | 1.217 | 0.975 | +24.9
Homo sapiens, chromosome 21 | 33.8 | 59.1 | 4 | 0.899 | 0.883 | +1.8
 |  |  | 5 | 0.926 | 0.856 | +8.2
Homo sapiens, chromosome 22 | 33.6 | 52.2 | 4 | 0.611 | 0.595 | +2.6
 |  |  | 5 | 0.598 | 0.517 | +15.6
Methanococcus jannaschii | 1.7 | 68.7 | 4 | 1.431 | 1.396 | +2.5
 |  |  | 5 | 1.616 | 1.524 | +6.0
Mycobacterium tuberculosis | 4.4 | 34.4 | 4 | 0.121 | 0.144 | −16.2
 |  |  | 5 | 0.059 | 0.086 | −31.3
Mycoplasma genitalium | 0.6 | 68.3 | 4 | 1.326 | 1.373 | −3.4
 |  |  | 5 | 1.416 | 1.492 | −5.1
Neisseria meningitidis | 2.3 | 48.5 | 4 | 0.630 | 0.467 | +35.0
 |  |  | 5 | 0.576 | 0.382 | +50.7
Pseudomonas aeruginosa | 6.3 | 33.4 | 4 | 0.097 | 0.132 | −26.5
 |  |  | 5 | 0.054 | 0.075 | −28.6
Synechocystis sp. | 3.6 | 52.3 | 4 | 0.731 | 0.599 | +22.1
 |  |  | 5 | 0.732 | 0.522 | +40.2
Thermotoga maritima | 1.9 | 53.8 | 4 | 0.513 | 0.653 | −21.4
 |  |  | 5 | 0.425 | 0.587 | −27.5
Treponema pallidum | 1.1 | 47.2 | 4 | 0.451 | 0.428 | +5.5
 |  |  | 5 | 0.388 | 0.339 | +14.3
Vibrio cholerae | 4.0 | 52.5 | 4 | 0.622 | 0.607 | +2.6
 |  |  | 5 | 0.560 | 0.533 | +5.1

*Values are given for binding site lengths n = 4 (first line) and n = 5 (second line) for each organism, calculated from the base sequence and compared with a random base distribution of the same AT frequency. A DF of 1 corresponds to an AT frequency of 61.5% with random distribution of AT base pairs (standard Pisum sativum). The base sequence of Pseudomonas aeruginosa was downloaded from the website of the Pseudomonas Genome Project (http://www.pseudomonas.com/), all remaining sequences from the website of The Institute of Genomic Research (TIGR; http://www.tigr.org/).
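The two DF columns of Table 4 compare a count made on the real sequence with the expectation from eq. (7). The authors' Visual Basic program is not shown, so the following Python sketch rests on an assumption: a run of L successive A/T bases is taken to offer floor(L/n) non-overlapping binding sites, a counting rule chosen because its expectation for a random sequence reduces to eq. (7). All function names are hypothetical.

```python
import random


def f_at(at, n):
    """Eq. (7): expected binding sites per base pair for a random sequence."""
    return (1.0 - at) * at ** n / (1.0 - at ** n)


def site_density(sequence, n):
    """Non-overlapping groups of n successive A/T bases per base of the sequence."""
    sites, run = 0, 0
    for base in sequence.upper():
        if base in "AT":
            run += 1
            if run == n:      # one complete group found; start counting the next one
                sites += 1
                run = 0
        else:
            run = 0           # a G or C interrupts the run
    return sites / len(sequence)


def relative_df(sequence, n, at_standard=0.615):
    """Density scaled so that a random sequence with the standard AT frequency gives DF = 1."""
    return site_density(sequence, n) / f_at(at_standard, n)


# Check against a pseudorandom sequence, as the paper did for its own program.
rng = random.Random(1)
seq = "".join(rng.choice("AT") if rng.random() < 0.6 else rng.choice("GC")
              for _ in range(300_000))
at = sum(base in "AT" for base in seq) / len(seq)
print(site_density(seq, 4), f_at(at, 4))   # the two values should be close
```

For a real genome, the ratio of `relative_df(sequence, n)` to `f_at(at, n) / f_at(0.615, n)` minus one would correspond to the last column of Table 4 under the stated assumption.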

to be within the range of 3–4, whereas for HO n = 5 is found (10). According to eqs. (6) and (7), the DF for both dyes should be different. Moreover, it should be possible to calculate the unknown binding site length $n_{DAPI}$ from the known value $n_{HO}$ by the simplified relation

$$n_{DAPI} = n_{HO} \cdot \ln DF_{DAPI} / \ln DF_{HO} \tag{11}$$

which follows from eq. (10) in Godelle et al. (9).
In order to obtain reliable estimates from this equation, the DF values should differ clearly from the value of 1; otherwise, extremely imprecise values for $n_{DAPI}$ may result. This can be reached in the following way. Pairs of species were selected with DF values differing by at least 10%. The DF values were not calculated relative to the primary standard P. sativum but relative to each other. The order of species in the ratio is selected in such a way that DF > 1. Under this condition, the DF values are expected to be higher for larger binding length n. This means that the DF values should always be greater for HO compared with DAPI.
According to these conditions, seven pairs of species were selected, the results for which are shown in Table 3. With one exception (V. faba/S. cereale), all calculated values for $n_{DAPI}$ are less than $n_{HO}$ = 5. In the case of the pair V. faba/S. cereale, the missing significance of difference and the high SD of $n_{DAPI}$ indicate that the values for this combination do not really deviate from those for HO. Nevertheless, the results of comparison of HO and DAPI data are inconsistent. Two data pairs (V. faba/S. cereale and A. cepa/V. faba) give an $n_{DAPI}$ value practically identical with $n_{HO}$. The others vary between $n_{DAPI}$ = 2.77 and 4.57. The mean of all calculated $n_{DAPI}$ is 4.22, so the nearest integer value is 4.
What may be the reasons for the inconsistency of the computed $n_{DAPI}$ values? If the assumption of a certain number of consecutive base pairs of the same kind (AT in this case) required for binding a dye molecule is correct, the deviation may be explained by a nonrandom distribution of base pairs within the genome. The assumption of random distribution of AT and GC base pairs as the condition for eq. (7) may not always be fulfilled. This will be the case especially for regions of repetitive sequences. The extent to which such a nonrandomness of sequences may cause the observed deviations can be investigated by analyzing known sequences.

Effect of Nonrandomness of DNA Sequences on Dye Binding
We have investigated 27 known sequences published on the internet (Table 4). The number of successive AT base pairs necessary to bind one dye molecule is assumed to be n = 4 (corresponding to the most probable integer value for DAPI as mentioned above) and n = 5 (corresponding to HO). The relative fluorescence intensities were standardized in such a way that a DF of 1 corresponds to an AT frequency of 61.5% (the calculated AT frequency of P. sativum) as used in Tables 1 and 2. The relative fluorescence for a random distribution of AT base pairs was calculated from the AT frequency using eq. (7) and the relative fluorescence of the real distributions was calculated by scanning the sequences for groups of n = 4 and n = 5 successive AT bases with the aid of a Visual Basic computer program. The results are shown in Table 4.
Surprisingly, high differences in both directions result from the calculations, up to extrema of +50.7% for Neisseria meningitidis and −31.9% for chromosome 4 of Arabidopsis thaliana (both with n = 5) relative to the expected random distribution. Moreover, chromosomes 2 and 4 of the sole representative of the plant kingdom, A. thaliana, although having nearly the same AT frequency (64.1 and 64.0%), differ clearly from each other (−4.8/−6.0% and −24.8/−31.9%, respectively, relative to the random value).
In order to be sure that the calculation made by the computer program is correct, it was tested by analyzing pseudorandom sequences, which gave very good agreement with the results expected from eq. (7). These calculations support the idea of nonrandomness of base distribution as the reason for deviation from the expected (random) values. Nevertheless, the correlation coefficients r = 0.975 for n = 4 and r = 0.951 for n = 5 between random and real distribution are highly significant. This means that, in general, a good approximation of AT content can be computed on the basis of the DAPI factor, but that important deviations are possible in some cases. This may explain the inconsistency between DAPI and HO values, which are based on different binding site lengths, but not the lack of correlation between AT content and genome size. Therefore, an existent correlation might become diminished, but not extinguished.

ACKNOWLEDGMENTS
We thank Ingo Schubert, Jaroslav Dolezel and Paul Fransz for helpful discussions, Barbara Hildebrandt for technical support, Katrin Menzel and her staff for the qualified cultivation of the plants, and the team of the IPK germplasm collection for providing seed material.

LITERATURE CITED
1. Bonner J. The nucleus. In: Bonner J, Varner JE, editors. Plant biochemistry, 3rd edition. New York: Academic Press; 1976. p 37–64.
2. Vinogradov AE. Measurement by flow cytometry of genomic AT/GC ratio and genome size. Cytometry 1994;16:34–40.
3. Vinogradov AE. Genome size and GC-percent in vertebrates as determined by flow cytometry: the triangular relationship. Cytometry 1998;31:100–109.
4. Galbraith DW, Harkins KR, Maddox JM, Ayres NM, Sharma DP, Firoozabady E. Rapid flow cytometric analysis of the cell cycle in intact plant tissues. Science 1983;220:1049–1051.
5. Bennett MD, Smith JB. Nuclear DNA amounts in angiosperms. Philos Trans R Soc Lond [Biol] 1976;274:227–274.
6. Dolezel J, Sgorbati S, Lucretti S. Comparison of three DNA fluorochromes for flow cytometric estimation of nuclear DNA content in plants. Physiol Plantarum 1992;85:625–631.
7. Marie D, Brown SC. A cytometric exercise in plant DNA histograms with 2C values for 70 species. Biol Cell 1993;78:41–51.
8. Langlois RG, Carrano AV, Gray JW, Van Dilla MA. Cytochemical studies of metaphase chromosomes by flow cytometry. Chromosoma 1980;77:229–251.
9. Godelle B, Cartier D, Marie D, Brown SC, Siljak-Yakovlev S. Heterochromatin study demonstrating the non-linearity of fluorimetry useful for calculating genomic base composition. Cytometry 1993;14:618–626.
10. Portugal J, Waring MJ. Assignment of DNA binding sites for 4′,6-diamidino-2-phenylindol and bisbenzimide (Hoechst 33258). A comparative footprinting study. Biochim Biophys Acta 1988;949:158–168.
11. Dolezel J, Greilhuber J, Lucretti S, Meister A, Lysak MA, Nardi L, Obermayer R. Plant genome size estimation by flow cytometry: inter-laboratory comparison. Ann Bot 1998;82 (supplement A):17–26.
12. Zonneveld BJM, Van Iren F. Flow cytometric analysis of DNA content in Hosta reveals ploidy chimera. Euphytica 2000;111:105–115.
13. Shapiro HS. Distribution of purines and pyrimidines in deoxyribonucleic acids. In: Fasman GD, editor. Handbook of biochemistry and molecular biology, volume 2. Cleveland: CRC Press; 1976. p 241–281.
