Abstract
Isoform GFAPε of the human cytoskeletal protein GFAP carries, as the result of alternative splicing of exon 7a of GFAP, a novel
42-amino-acid-long C-terminal region with binding capacity for the presenilin proteins. Here we show that exon 7a is present in a variety
of mammals but absent from GFAP of chicken and fish. Comparison of the mouse and human GFAP exons showed an increased rate of
nonsynonymous nucleotide substitutions in exon 7a compared to the other exons. This resulted in 10 nonconservative and 2 conservative
amino acid substitutions and suggests that exon 7a has evolved under different functional constraints. Exons 7a of humans and higher
primates are 100% identical apart from alanine codon 426, which is conserved in only 9% of the human alleles, while 21 and 70% of the
alleles, respectively, have a valine or a threonine codon at that position. Threonine represents a potential phosphorylation site, and positive
selection of that effect could explain the high allele frequency.
© 2003 Elsevier Science (USA). All rights reserved.
☆ Sequence data from this article have been deposited with the EMBL/GenBank Data Libraries under Accession Nos. AY142187–AY142200.
* Corresponding author. Fax: +45-86123173.
E-mail address: alj@humgen.au.dk (A.L. Jørgensen).
0888-7543/03/$ – see front matter © 2003 Elsevier Science (USA). All rights reserved.
doi:10.1016/S0888-7543(03)00106-X
R. Singh et al. / Genomics 82 (2003) 185–193

Glial fibrillary acidic protein (GFAP) is the principal intermediate filament (IF) protein of the mature astrocytes of the central nervous system. It belongs to type 3 of the IF protein family and has a characteristic monomeric structure composed of a highly conserved central α-helical rod domain flanked by nonhelical head and tail domains. The monomers form homodimers and homotetramers or heterotetramers with other IF proteins. Further multimerization produces the intermediate fibers of the cytoskeleton. Thus, GFAP provides structural stability to the astrocyte and may take part in modulating its shape and motility. Regulatory elements directing astrocyte-specific transcription have been identified, and synthesis of GFAP is rapidly upregulated in activated astrocytes. The cell-limited expression of GFAP is the basis for the routine and widespread use of the protein as an antigen marker specific for the astrocyte [1–5].

The human GFAP is a 432-amino-acid-long polypeptide of 55 kDa encoded by the nine exons of GFAP, which extend over 10 kb on chromosome 17q21 [6–8]. GFAP is phylogenetically old. Compared with mouse Gfap [9] the nucleotide sequence and exon/intron organization of the human gene are highly conserved and the polypeptide shows more than 90% homology to the mouse and pig GFAP and about 85% homology to GFAP of the goldfish [6,8,10]. Accordingly, antimammalian GFAP antibodies have been used successfully in comparative immunohistochemical studies of astrocytes in brains from bird, reptile, and fish [11–15].

Fig. 1. Alternative splicing of human GFAP. (A) Exon/intron organization of the 3′ end of the gene and the corresponding two mRNA splice forms GFAPα and GFAPε. Note polyadenylation signal pAε in exon 7a. (B) Amino acid sequences of the tail domain of GFAPα and GFAPε. Sequences were obtained from Nielsen et al. [16].

We have previously characterized a novel human GFAP isoform, designated GFAPε [16]. This isoform results from alternative splicing of a novel exon embedded in intron 7 and the use of a new polyadenylation signal present in this exon, termed exon 7a. Hereby, the exons 8 and 9-encoded tail region of the classical isoform GFAPα is replaced by a new tail region encoded by exon 7a. The generated isoform
GFAPε has protein binding capacity for the presenilin proteins in vitro [16]. In the present study we show that exon 7a is present also in GFAP of higher primates, the pig, and the mouse, but absent from GFAP of chicken, zebrafish, and goldfish. Interspecies comparison showed that the coding region of exon 7a has been under evolutionary constraints different from those on the other exons of the gene and we discovered a high-frequency polymorphism in this exon among humans. We will argue that exon 7a is mammalian specific and propose that it may confer new and advantageous functions to the GFAPε isoform.

Results

Species comparison of the nucleotide sequences of exon 7a

The head and especially the highly conserved rod domains of the IF proteins secure proper dimer and tetramer formation and higher order polymerization, while the less conserved tail domains of the IF proteins are available for interaction with other cytosolic proteins [17]. Fig. 1A shows the exon/intron organization of the 3′ end of human GFAP, and the two mRNA splice forms GFAPα and GFAPε are indicated. Exon 7a contains a functional polyadenylation site and GFAPε is created by splicing of exon 7a directly onto exon 7 [16]. This results in a tail domain of the isoform GFAPε whose amino acid sequence is different from and one amino acid shorter than the tail domain of GFAPα (Fig. 1B).

To study whether the nucleotide sequence of exon 7a has been conserved during evolution we obtained genomic DNA from nonhuman primates, including pygmy chimpanzee (Pan paniscus), common chimpanzee (Pan troglodytes), gorilla (Gorilla gorilla), orangutan (Pongo pygmaeus), and baboon (Papio), and from the domestic pig (Sus scrofa domesticus), the mouse (Mus musculus), the rat (Rattus norvegicus), the chicken (Gallus gallus domesticus), the goldfish (Carassius auratus), and the zebrafish (Danio rerio) and used these DNAs to identify and to sequence the coding region and some of the 3′ UTR of exon 7a of GFAP. The primers used to PCR amplify and sequence exon 7a are described in Table 4 and under Materials and methods.

We were able to identify exon 7a only in the mammalian species. With respect to the nonmammalian species we amplified and sequenced the entire intron 7 of GFAP. Intron 7 is about 2.3 kb long in the human and the mouse gene, but only 88 and 82 bp in goldfish and zebrafish, respectively, and 675 bp in chicken (Fig. 2). The nonmammalian intron 7 sequences contained no indications of the presence of exon 7a or other alternative splicing and polyadenylation signals (for specific intron 7 sequence information the accession numbers for zebrafish, goldfish, and chicken are given under Materials and methods).

Fig. 2. Species comparison of intron 7 of GFAP. Exon 7a is present only in intron 7 of the mammalian species and is flanked by direct repeats (arrows) in the mouse gene. Numbers refer to lengths in base pairs. UTR, 3′ untranslated region of exon 7a, i.e., from stop codon to polyadenylation signal pAε. Mouse and rat intron 7 sequences were obtained from Refs. [9] and [18]. Accession numbers for determined sequences are given under Materials and methods.

In Fig. 3A are shown the nucleotide sequences of the coding regions of exons 7a, identified in the species listed. The human sequence represents 12 unrelated individuals having identical sequences apart from a polymorphism at codon 426 of which the most frequent codon is shown. The sequence of the common chimpanzee represents four unrelated individuals whose exon 7a sequences were 100%
identical, while the other sequences represent one individual from each species.

The human exon 7a nucleotide sequence is 100% identical to the exon 7a sequences in the three most closely related higher primates (pygmy chimpanzee, common chimpanzee, gorilla) except for codon 426. This codon encodes alanine in all the nonhuman species listed: in the nonhuman higher primates the alanine codon is GCG, in the baboon it reads GCA, and in the pig and the mouse it reads GCC. Alanine at position 426 of the polypeptide, therefore, appears to be conserved. In humans, codon 426 can be either a threonine codon, ACG, shown in Fig. 3A, or a valine codon, GTG, or the ancestral alanine codon GCG. The threonine codon results from a G to A transition at the first position of the GCG alanine codon and represents a nonconservative amino acid substitution, while a C to T transition at the second position creates the valine codon and represents a conservative amino acid substitution. The tyrosine codon TAT at position 406 is found only in humans, the chimpanzee, and the gorilla and most likely results from a C to T transition at the first position of the histidine codon CAT present in the orangutan, the baboon, and the pig. The mouse has a proline codon, CCG, at position 406.

In addition to the species-specific A in the third position of the alanine codon 426, the baboon sequence contains the proline codon CCA at position 428, shared only by the mouse, while the other higher primates have the proline codon CCG at this position. Thus, the pattern of sequence deviations in exon 7a among the primates, including humans, is consistent with their evolutionary relatedness.

The pig sequence has accumulated only one nucleotide change not shared by the other species, namely the neutral T of the glycine codon GGT at position 400. The corresponding glycine codon in the mouse reads GGC, while humans and the nonhuman primates have the asparagine codon AAT at that position. All other deviations of the pig sequence from the human and the nonhuman primate sequences are shared by the mouse: the glutamic acid codon GAA at position 397, the glutamine codon CAA at position 413, the alanine codon GCC at position 426, and the leucine codon CTC at position 430.

Five codons of the mouse sequence encode amino acids not shared by any of the other species at these positions: glutamine codon CAA at position 401, proline codon CCT at position 406, valine codon GTC at position 415, glutamic acid codon GAA at position 423, and proline codon CCT at position 431. But the mouse sequence contains no neutral nucleotide deviation from the human sequence that is not shared by, at least, the rat.

The rat sequence is unique, having experienced an insertion of the dinucleotide GC between codons 420 and 421 (Fig. 3A). The resulting shift in reading frame has changed the specificity of codons 421, 422, and 423 and created a stop codon, TAA, from the TA of codon 423 and the first A of codon 424. The tail region of the rat GFAPε, therefore, not only is truncated but also contains four amino acids at the very C-terminus that are not found in any of the other species.
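The single-nucleotide changes reported for codons 426 and 406 can be checked against the genetic code with a few lines of Python. This is a minimal sketch that assumes the standard nuclear genetic code; the names `CODON_TABLE` and `point_mutants` are illustrative helpers, not part of the study.

```python
from itertools import product

# Standard genetic code with codons enumerated in T, C, A, G order
# (assumption: standard nuclear code applies to GFAP).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def point_mutants(codon):
    """Yield (position, mutant_codon, amino_acid) for every single-base change."""
    for pos, old in enumerate(codon):
        for new in BASES:
            if new != old:
                mutant = codon[:pos] + new + codon[pos + 1:]
                yield pos + 1, mutant, CODON_TABLE[mutant]


ancestral = "GCG"  # alanine codon 426 in the nonhuman higher primates
# Keep only the two variants reported in humans: ACG and GTG.
human_variants = {
    mutant: (pos, aa)
    for pos, mutant, aa in point_mutants(ancestral)
    if mutant in ("ACG", "GTG")
}

print("ancestral", ancestral, "->", CODON_TABLE[ancestral])
for mutant, (pos, aa) in human_variants.items():
    print(f"{mutant}: change at codon position {pos} -> {aa}")
```

The same table confirms that CAT (histidine) and TAT (tyrosine) at position 406 differ only at the first codon position.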
Fig. 3. Species comparison of the coding region of exon 7a. (A) Nucleotide sequences relative to the human sequence from codon 391 to stop codon TAG
at position 432, indicated by an asterisk. Codon 426, which is polymorphic in the human population, is marked by a dot. Note the GC insertion in the rat
sequence between codons 420 and 421. (B) Amino acid sequences derived from the nucleotide sequences in (A). Alanine at position 426, marked by a dot,
is conserved among the nonhuman species. In humans, this position is most frequently occupied by threonine, less frequently by valine, and only rarely by
the ancestral alanine. Note the truncated rat sequence due to the GC insertion indicated in (A). Asterisk corresponds to stop codon in (A). Abbreviations: C.
and P. chimpanzee, common and pygmy chimpanzee. Accession numbers for determined sequences are given under Materials and methods.
The amino acid sequences encoded by exon 7a of the different species are aligned in Fig. 3B. Threonine at position 426 represents the most frequent of the 3 amino acid variants (threonine, valine, alanine) of the human-specific polymorphism at that position. Otherwise, the amino acid sequences are identical among the higher primates except at position 406, where the orangutan, instead of tyrosine, shares histidine with the baboon and the pig. The amino acid sequences diverged 30% between humans and the mouse, i.e., amino acid substitutions at 12 of 41 positions. Ten of these changes are nonconservative; only the changes of glutamic acid to aspartic acid at position 397 and valine to isoleucine at position 415 are conservative. By contrast, the only amino acid substitution that has occurred in the corresponding 42-amino-acid-long tail region of the isoform GFAPα is a conservative aspartic acid to glutamic acid substitution at position 423 [16].

Table 1
Characteristics of the nucleotide changes in human and mouse GFAP

Exon  Species  Amino acids  Syn. subst.  Nonsyn. subst.  CpG  Nonsyn. sites  Syn. sites
               (n)          (n)          (n)             (n)  (n)            (n)
1     Human    154a         48           25              26
      Mouse    153                                       27
2     Human    20           4            0               1
      Mouse    20                                        1
3     Human    32           7            5               4
      Mouse    32                                        2
4     Human    54           17           9               9
      Mouse    54                                        8
5     Human    51           17           2               15
      Mouse    51                                        16
6     Human    65           23           3               14
      Mouse    65                                        12
7     Human    14           5            0               2
      Mouse    14                                        1
7a    Human    41           5            15              7    92 5/6         30 1/6
      Mouse    41                                        0
8     Human    29           5            0               3    66             21
      Mouse    29                                        3
9     Human    13           2            1               0    31 2/3         7 1/3
      Mouse    14b                                       2

Note. Abbreviations: Syn. and Nonsyn. subst., synonymous and nonsynonymous substitutions.
a Human exon 1 carries a duplication of alanine codon 9.
b The last valine codon is duplicated in the mouse gene.

The coding region of exon 7a has accumulated a unique pattern of nucleotide changes

We conducted a sequence comparison between all 10 exons of human and mouse GFAP. The numbers listed in Table 1 show that synonymous substitutions are more frequent than nonsynonymous ones in all exons except exon 7a, for which the pattern is the opposite, with 15 nonsynonymous and 5 synonymous substitutions. Exons 8 and 9 together contain only 1 nonsynonymous and 7 synonymous substitutions. In Table 1 are also listed the numbers of nonsynonymous and synonymous sites in exon 7a, exon 8, and exon 9. Synonymous and nonsynonymous sites are counted as follows: If the number of possible synonymous changes at a particular position in a codon is i, then this site is counted as i/3 synonymous and (3 − i)/3 nonsynonymous. The numbers of synonymous and nonsynonymous
sites are counted in both the human and the mouse sequence and the average is calculated. From these numbers we calculated the frequency of nonsynonymous substitutions per nonsynonymous site (KA) and the frequency of synonymous substitutions per synonymous site (KS) and their ratios (Table 2). More synonymous than nonsynonymous nucleotide substitutions are expected to accumulate, over time, in a coding sequence and the tighter a functional constraint is, the fewer nonsynonymous substitutions are allowed. Comparisons between human and mouse genes have identified the KA/KS ratios to be <1, with an average of 0.2 [19,20]; in genes encoding highly conserved amino acid sequences KS may exceed KA by more than 25 times [21]. Accordingly, we found that KS exceeds KA by some 30 times in the tail region of GFAPα, encoded by the two exons 8 and 9 (KA/KS = 0.0344). In exon 7a, the nonsynonymous substitution rate is 20 times higher than in the combined exons 8 and 9 (0.1819 vs 0.0103) and the synonymous substitution rate is lower (0.1873 vs 0.2997). Thus, the KA/KS ratio of exon 7a is 0.9716, which is close to the theoretical ratio of 1 expected for a sequence under no functional constraint. A KA/KS ratio >1 is normally regarded as a sign of positive selection since nonsynonymous substitutions are far more likely than synonymous substitutions to improve the function of a protein [19,21].

Table 1 contains the numbers of CpG dinucleotides present in the exons of the human and mouse GFAP. Seven CpGs are present in exon 7a of the human gene but none in exon 7a of the mouse gene. This discrepancy is unique to exon 7a, as the numbers of CpGs in all the other exons of human and mouse GFAP proved to be similar. We also counted the numbers of CpGs in the intronic sequences, presumably under no functional constraint, between exon 7 and exon 7a and found no difference between the human and the mouse sequences (data not shown). Because of spontaneous deamination of the methylated C-residue of CpG dinucleotides, these dinucleotides tend to change to TpG or CpA, especially for CpGs present in a sequence that is no longer subject to any functional constraint. To this end it is interesting that the seven CpG dinucleotides present in the human sequence do occur as TpG or CpA in the mouse sequence, suggesting that the human sequence is under different functional constraints.

Table 2
Exon 7a has a distinct nucleotide substitution profile
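The site-counting rule and the KA and KS values quoted in the text can be sketched in Python as follows. Two points are assumptions rather than statements from this section: a Jukes-Cantor correction is applied to the proportion of differences (with it, the Table 1 counts give values that agree with the quoted KA, KS, and KA/KS figures to within rounding), and a change to a stop codon is counted as nonsynonymous. The site totals 92 5/6 and 30 1/6 for exon 7a are taken as nonsynonymous and synonymous sites, respectively. Function names are illustrative.

```python
import math
from fractions import Fraction
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order (assumption).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def synonymous_sites(codon):
    """Synonymous sites of one codon: a position with i synonymous changes counts i/3.

    The nonsynonymous sites of the codon are 3 minus the returned value.
    Changes to a stop codon are treated as nonsynonymous (assumption).
    """
    total = Fraction(0)
    for pos, old in enumerate(codon):
        i = sum(
            CODON_TABLE[codon[:pos] + new + codon[pos + 1:]] == CODON_TABLE[codon]
            for new in BASES
            if new != old
        )
        total += Fraction(i, 3)
    return total


def jukes_cantor(substitutions, sites):
    """Substitutions per site with a Jukes-Cantor correction (assumed, see lead-in)."""
    p = float(substitutions) / float(sites)
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


# Exon 7a, counts and site totals from Table 1.
ka_7a = jukes_cantor(15, Fraction(557, 6))   # 15 nonsyn. subst., 92 5/6 sites
ks_7a = jukes_cantor(5, Fraction(181, 6))    # 5 syn. subst., 30 1/6 sites

# Exons 8 and 9 combined (tail region of GFAP-alpha).
ka_tail = jukes_cantor(0 + 1, 66 + Fraction(95, 3))   # 66 + 31 2/3 sites
ks_tail = jukes_cantor(5 + 2, 21 + Fraction(22, 3))   # 21 + 7 1/3 sites

print(f"exon 7a:   KA={ka_7a:.4f} KS={ks_7a:.4f} KA/KS={ka_7a / ks_7a:.4f}")
print(f"exons 8+9: KA={ka_tail:.4f} KS={ks_tail:.4f} KA/KS={ka_tail / ks_tail:.4f}")
print("synonymous sites of GCG:", synonymous_sites("GCG"))
```

Exact fractions are used for the site totals so that values such as 92 5/6 are not rounded before the division.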
Table 4
Primer description
Note. The last 2 nucleotides of primers 14 and 15 are specific for threonine and valine, respectively, at codon 426.
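The allele-specific primers mentioned in the note imply a simple decision rule for the paired PCRs described under Materials and methods, where the forward primers are named CHK1 (ACG, threonine) and CHK2 (GTG, valine). The sketch below is only an illustration of that rule: it assumes each reaction is scored as product present or absent, and the function name and returned labels are hypothetical.

```python
def call_non_alanine_alleles(chk1_product, chk2_product):
    """Interpret the two allele-specific PCRs for codon 426.

    chk1_product: True if a product was obtained with forward primer CHK1 (ACG).
    chk2_product: True if a product was obtained with forward primer CHK2 (GTG).
    Returns the set of non-alanine alleles detected in the sample.
    """
    alleles = set()
    if chk1_product:
        alleles.add("ACG (threonine)")
    if chk2_product:
        alleles.add("GTG (valine)")
    return alleles


# Product with CHK1 only, with CHK2 only, and with both primers.
for result in [(True, False), (False, True), (True, True)]:
    print(result, "->", sorted(call_non_alanine_alleles(*result)))
```

An empty result would mean that neither reaction gave a product; how such a sample should be handled is not covered here.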
sequencing of both strands was done by following the protocol of the DYEnamic ET Terminator Cycle Sequencing Kit (Amersham Pharmacia Biotech, Inc.).

Assay for codon 426 polymorphism

DNA samples collected from 64 unrelated healthy adults of Danish extraction were PCR amplified using primers SFP2 and S2R and the protocol described above. The primers define a 342-bp-long fragment that contains the coding sequence of exon 7a and adjacent 3′ UTR sequences (Fig. 4A). The ancestral alanine codon GCG at position 426 and the first C of the proline codon CCG at position 427 together form the HhaI recognition site 5′GCGC3′ (P1 in Fig. 4A). Another HhaI recognition site is located 41 bp farther downstream in the 3′ UTR (P2 in Fig. 4A). Both HhaI sites are polymorphic, and cutting at P1 is in linkage disequilibrium with absence of cutting at P2 and vice versa. Cutting at P1 results in two fragments of 179 and 163 bp and cutting at P2 produces two fragments of 220 and 122 bp (Fig. 4B). With a combination of PCR amplification and HhaI digestion it is possible to detect homozygosity and heterozygosity for the presence or absence of the ancestral alanine codon at position 426. One-fifth of the PCR product was cut by HhaI under conditions recommended by the supplier (New England BioLabs, Inc.) and the restriction fragments were visualized as bands by electrophoresis in an ethidium bromide-stained 2% agarose gel. Among the 64 samples we never observed a banding pattern consistent with HhaI cutting at P1 on both alleles, i.e., homozygosity for the ancestral alanine codon GCG. Samples that showed a heterozygous banding pattern (Fig. 4B, lane 3) had either ACG or GTG on the other allele and were genotyped by sequencing. Absence of the GCG alanine codon on both alleles produces the HhaI banding pattern shown in Fig. 4B, lane 5. Lack of HhaI cutting at P1 is due to either a G to A substitution at position 1 or a C to T substitution at position 2 of the GCG alanine codon and hence either an ACG threonine or a GTG valine codon at position 426. To distinguish between these two possibilities we employed a PCR assay using S2R as reverse primer in combination with each of two new forward primers, CHK1 and CHK2 (Fig. 4A and Table 4), in which the last 2 nucleotides at the 3′ end have specificity for either the ACG allele (CHK1) or the GTG allele (CHK2). Each sample was tested in two corresponding PCRs, performed essentially as mentioned above. Production of a PCR fragment of 184 bp using CHK1 as forward primer and absence of a PCR product using CHK2 as forward primer indicated the presence of the ACG (threonine) allele; the opposite result indicated the presence of the GTG (valine) allele, while production of a PCR product with each of the forward primers would indicate the presence of both the ACG and the GTG allele in the sample tested (Figs. 4A and 4C).

Accession numbers

The DNA sequences determined have the following accession numbers: human exon 7a GTG polymorphism (AY142187), human exon 7a GCG polymorphism (AY142188), human exon 7a ACG polymorphism (AY142191), baboon exon 7a (AY142190), common chimpanzee exon 7a (AY142192), pygmy chimpanzee exon 7a (AY142189), gorilla exon 7a (AY142193), orangutan exon
7a (AY142196), pig exon 7a (AY142199), rat exon 7a (AY142198), mouse exon 7a (AY142200), chicken intron 7 (AY142197), goldfish intron 7 (AY142194), zebrafish intron 7 (AY142195).

Acknowledgments

The Danish Medical Research Council (Ældreforskning II Grant 9502112) supported this work. We thank Samir Deeb (University of Washington, Seattle, WA, USA) for the primate samples. The study was done in accordance with the guidelines of the Aarhus County Research Ethical Committee.

References

[1] E. Fuchs, K. Weber, Intermediate filaments: structure, dynamics, functions, and disease, Annu. Rev. Biochem. 63 (1994) 345–382.
[2] L.F. Eng, R.S. Ghirnikar, Y.L. Lee, Glial fibrillary acidic protein: GFAP-thirty-one years (1969–2000), Neurochem. Res. 25 (2000) 1439–1451.
[3] F. Besnard, et al., Multiple interacting sites regulate astrocyte-specific transcription of the human gene for glial fibrillary acidic protein, J. Biol. Chem. 266 (1991) 18877–18883.
[4] R. Kaneko, N. Sueoka, Tissue-specific versus cell type-specific expression of the glial fibrillary acidic protein, Proc. Natl. Acad. Sci. USA 90 (1993) 4698–4702.
[5] R. Kaneko, N. Hagiwara, K. Leader, N. Sueoka, Glial-specific cAMP response of the glial fibrillary acidic protein gene in the RT4 cell lines, Proc. Natl. Acad. Sci. USA 91 (1994) 4529–4533.
[6] S.A. Reeves, L.J. Helman, A. Allison, M.A. Israel, Molecular cloning and primary structure of human glial fibrillary acidic protein, Proc. Natl. Acad. Sci. USA 86 (1989) 5178–5182.
[7] E. Bongcam-Rudloff, et al., Human glial fibrillary acidic protein: complementary DNA cloning, chromosome localization, and messenger RNA expression in human glioma cell lines of various phenotypes, Cancer Res. 51 (1991) 1553–1560.
[8] A. Isaacs, M. Baker, F. Wavrant-De Vrieze, M. Hutton, Determination of the gene structure of human GFAP and absence of coding region mutations associated with frontotemporal dementia with parkinsonism linked to chromosome 17, Genomics 51 (1998) 152–154.
[9] J.M. Balcarek, N.J. Cowan, Structure of the mouse glial fibrillary acidic protein gene: implications for the evolution of the intermediate filament multigene family, Nucleic Acids Res. 13 (1985) 5527–5543.
[10] I. Cohen, M. Schwartz, cDNA clones from fish optic nerve, Comp. Biochem. Physiol. 104B (1993) 439–447.
[11] M. Kálmán, A.D. Székely, A. Csillag, Distribution of glial fibrillary acidic protein-immunopositive structures in the brain of the domestic chicken (Gallus domesticus), J. Comp. Neurol. 330 (1993) 221–237.
[12] M. Kálmán, M.B. Pritz, Glial fibrillary acidic protein-immunopositive structures in the brain of a crocodilian, Caiman crocodilus, and its bearing on the evolution of astroglia, J. Comp. Neurol. 431 (2001) 460–480.
[13] R.C. Marcus, S.S. Easter, Expression of glial fibrillary acidic protein and its relation to tract formation in embryonic zebrafish (Danio rerio), J. Comp. Neurol. 359 (1995) 365–381.
[14] M. Kálmán, Astroglial architecture of the carp (Cyprinus carpio) brain as revealed by immunohistochemical staining against glial fibrillary acidic protein (GFAP), Anat. Embryol. 198 (1998) 409–433.
[15] M. Kálmán, R.M. Gould, GFAP-immunopositive structures in spiny dogfish, Squalus acanthias, and little skate, Raia erinacea, brains: differences have evolutionary implications, Anat. Embryol. 204 (2001) 59–80.
[16] A.L. Nielsen, et al., A new spliceform of glial fibrillary acidic protein, GFAPε, interacts with the presenilin proteins, J. Biol. Chem. 277 (2002) 29983–29991.
[17] E. Fuchs, D.W. Cleveland, A structural scaffolding of intermediate filaments in health and disease, Science 279 (1998) 514–519.
[18] D.F. Condorelli, et al., Structural features of the rat GFAP gene and identification of a novel alternative transcript, J. Neurosci. Res. 56 (1999) 219–228.
[19] D. Graur, W.-H. Li, Fundamentals of Molecular Evolution, 2nd edition, Sinauer, Sunderland, MA, 2000.
[20] W. Makalowski, M.S. Boguski, Evolutionary parameters of the transcribed mammalian genome: an analysis of 2,820 orthologous rodent and human sequences, Proc. Natl. Acad. Sci. USA 95 (1998) 9407–9412.
[21] D.A. Liberles, D.R. Schreiber, S. Govindarajan, S.G. Chamberlin, S.A. Benner, The adaptive evolution database (TAED), Genome Biol. 2 (2001) 1–6.
[22] A.D. Polydorides, H.J. Okano, Y.Y.L. Yang, G. Stefani, R.B. Darnell, A brain-enriched polypyrimidine tract-binding protein antagonizes the ability of nova to regulate neuron-specific alternative splicing, Proc. Natl. Acad. Sci. USA 97 (2000) 6350–6355.
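The fragment sizes quoted for the HhaI assay under Materials and methods follow directly from the two cut positions in the 342-bp amplicon. The sketch below is a hypothetical helper, not the authors' procedure. It assumes complete digestion, that each allele is cut at exactly one of the two sites (P1 or P2), as the reported linkage disequilibrium suggests, and that P1 lies 179 bp from one end of the amplicon with P2 a further 41 bp downstream.

```python
AMPLICON_BP = 342
# Cut positions measured from the same amplicon end (assumption, see lead-in).
CUT_OFFSET = {"P1": 179, "P2": 179 + 41}


def fragments(site):
    """Fragment sizes (bp) produced by a single cut at the named HhaI site."""
    cut = CUT_OFFSET[site]
    return sorted((cut, AMPLICON_BP - cut), reverse=True)


def banding_pattern(allele_a_site, allele_b_site):
    """Distinct band sizes expected on the gel for a diploid sample."""
    bands = set(fragments(allele_a_site)) | set(fragments(allele_b_site))
    return sorted(bands, reverse=True)


print("cut at P1:", fragments("P1"))
print("cut at P2:", fragments("P2"))
print("P1/P2 heterozygote:", banding_pattern("P1", "P2"))
print("P2/P2 (no GCG allele):", banding_pattern("P2", "P2"))
```

Under these assumptions a sample carrying one ancestral GCG allele shows four bands, while a sample without GCG on either allele shows only the two P2 fragments.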