
126 Opinion TRENDS in Biochemical Sciences Vol.27 No.3 March 2002


The MUC family: an obituary

Jan Dekker, John W.A. Rossen, Hans A. Büller and Alexandra W.C. Einerhand

Mucins are glycoproteins that are common on the surfaces of many epithelial cells; they are deemed to mediate many interactions between these cells and their milieu. Several of these mucins form the mucus layer that is found in many hollow organs. The biophysical properties of mucins are related to their extensive O-linked glycosylation rather than directly to their polypeptide sequences. Despite the frequent absence of sequence homology, many human genes encoding mucins have been named MUC followed by a number, unjustly suggesting the existence of one large gene family. In this article, it is suggested that the mucin genes be renamed according to their sequence homologies.

Jan Dekker*, John W.A. Rossen, Hans A. Büller, Alexandra W.C. Einerhand
Laboratory of Paediatrics, Erasmus University and Sophia Children’s Hospital, Dr Molewaterplein 50, 3015GE Rotterdam, The Netherlands.
*e-mail: dekker@kgk.fgg.eur.nl

Mucins are large, abundant, filamentous glycoproteins that are present at the interface between many epithelia and their extracellular environments. This interface is often the lumen of a hollow organ within the body such as the gastrointestinal tract, lungs and urogenital tract. Several of these mucins are known to form mucus layers, whereas others form the glycocalyx on the intestinal enterocytes. As for their functions, these bulky and abundant glycoproteins are strategically positioned to mediate the interactions between epithelial cells and their milieu. They are considered to act as powerful two-edged swords, keeping unwanted substances and organisms at an arm’s length while, at the same time, allowing specific interactions to be mediated through their highly elaborate structures. Their strategic position places the mucins at centre stage in many disease processes in which the interactions of epithelial cells and their surroundings have gone astray, as in inflammatory and infectious diseases, cancer and metastasis. Despite rapidly accumulating circumstantial evidence, most of the MUC-type mucins have no unequivocally defined function that can be used as a basis for terminology.

Biochemically, these molecules have always been difficult to study because they are very large and much of their structure is concealed by complex O-glycosylation. The study of the structures and functions of mucins is a growing field that has received a recent boost from developments in molecular biology. Each epithelial mucin gene discovered was enthusiastically welcomed as a new member of a gene family named MUC, established in 1990. After the initial excitement over these developments among the workers in the mucin field, the realization grew that particular members of the MUC-type mucin family were, in fact, very dissimilar. Currently, 14 mucin-type glycoproteins have been assigned to the MUC gene family, as approved by the Human Genome Organization Gene Nomenclature Committee (HUGO/GNC; http://www.hugo-international.org/hugo/). We aim here to demonstrate that the existence of one MUC gene family, encompassing all known human mucins, is not justified by the existing sequence data, and we would like to provoke a discussion on the definition and nomenclature of mucins.

Box 1 gives a comprehensive overview of the terminology that is in use in the study of mucins. Originally, the term ‘mucins’ was used for glycoproteins found in mucus that had been secreted by epithelia. Later, these epithelia were found to

http://tibs.trends.com 0968-0004/02/$ – see front matter © 2002 Elsevier Science Ltd. All rights reserved. PII: S0968-0004(01)02052-7

Box 1. What are mucins? A matter of words

Mucin
The late W. Pigman was probably the first to define mucins as having exclusively O-linked glycans, and to recognize that the O-glycosylated peptide sequences were often tandemly repeated [a]. However, we now know that additional N-glycosylation is a feature common to most mucins [b]. Consensus tells us that mucin molecules consist of at least 50% O-glycans by weight, which are concentrated in particular regions of the polypeptide referred to as ‘mucin domains’ or ‘PTS regions’.

Mucin domains/PTS regions
Mucin domains are the (putatively) O-glycosylated sequences that are often found within (deduced) sequences encoding mucins. These sequences are also referred to as PTS regions, indicating regions within the polypeptide that are enriched in proline, threonine and/or serine residues. Threonine and serine are usually both present in PTS regions and, of all known human PTS regions, only that of MUC2 does not contain any serines. These large regions, up to 6000 amino acids in length, are usually devoid of cysteine residues and composed of short (8–169 amino acids) tandemly repeated peptides. PTS regions are found in all MUC-type mucins but also in many other glycoproteins, although the PTS regions in mucins are exceptionally long and usually comprise more than half the polypeptide.

Variable number of tandem repeats
Variable number of tandem repeats refers to the repetitive nature of most of the PTS regions. However, it has often not been proved that these sequences are actually ‘variable’ within each MUC. As for ‘tandem repeats’, many cysteine-rich domains in mucins are also repeated. For both reasons, this term should not be used generically. ‘PTS region’ circumvents these definition problems.

Non-PTS regions
Non-PTS regions are the N- and C-terminal parts of the mucin polypeptides beyond the PTS regions. They usually contain all the cysteine residues of the mucins. Non-PTS regions can be called ‘unique’ because they are not excessively repeated, unlike PTS regions. However, the term ‘unique’ can lead to confusion because particular cysteine-rich domains, such as the von-Willebrand-factor-like D domains in the 11p15 mucins, are repeated within the non-PTS regions.

Cysteine-rich domains
Cysteine-rich domains are identifiable cysteine-rich peptide motifs that occur in the non-PTS regions of mucins, such as the epidermal-growth-factor-like domains in membrane-bound mucins, the VWF-C and -D domains, and the C-terminal domains found in 11p15 mucins.

MUC-type mucin
A MUC-type mucin is one of the 14 mucins that have been incorporated in the MUC-plus-number nomenclature, officially assigned by HUGO/GNC, that has been in use since 1990.

11p15 mucins
The 11p15 mucins are the most clearly described and related family known among the assigned MUCs. This family comprises MUC2, MUC5AC, MUC5B and MUC6, all of which are found at chromosomal locus 11p15.5 [c]. These secretory mucus-gel-forming mucins show significant homology in their non-PTS regions. Because locus 7q22 also seems to harbour a cluster of related mucin genes, the term ‘7q22 mucins’ has also been used.

Mucoprotein
Mucoprotein is an obsolete term for mucin that also encompasses, for example, ‘peptidoglycan’, ‘intrinsic factor’, ‘orosomucoid’ and ‘cell wall skeleton’. However, this term is still in use by the National Library of Medicine (Bethesda, MD, USA), which lists mucins under Mucoproteins as a Medical Subject Heading (MeSH).

Mucus glycoprotein
The term mucus glycoprotein only reflects the occurrence of mucins in mucus. In fact, several molecules described as mucins do not seem to be part of mucus, so ‘mucus glycoprotein’ should be avoided as a generic term. ‘Mucin glycoprotein’ is also used but is a definite misnomer. If we comply with the biochemical definition, adding ‘glycoprotein’ is unnecessary; mucin-type glycoprotein is the better phrase.

References
a Pigman, W. et al. (1973) The occurrence of repetitive glycopeptide sequences in bovine submaxillary glycoprotein. Eur. J. Biochem. 32, 148–154
b Strous, G.J. and Dekker, J. (1992) Mucin-type glycoproteins. Crit. Rev. Biochem. Mol. Biol. 27, 57–92
c Desseyn, J.L. et al. (1998) Evolutionary history of the 11p15 human mucin gene family. J. Mol. Evol. 46, 102–106
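
The working definition of a PTS region in Box 1 (enriched in proline, threonine and/or serine, usually without cysteine) can be checked on a peptide with a few lines of Python. This is only an illustrative sketch, not a method used by the authors. The function name `pts_fraction` is hypothetical. The two peptides are the repeat units printed in the key of Fig. 1; labelling the serine-free unit as MUC2 and the other as MUC1 is an inference from the statement in Box 1 that only the MUC2 PTS region lacks serine.

```python
from collections import Counter

# Repeat units as printed in the key of Fig. 1. The gene labels are an
# inference (assumption): the serine-free unit is taken to be MUC2.
REPEAT_UNITS = {
    "MUC2": "PTTTPITTTTTVTPTPTPTGTQT",
    "MUC1": "GSTAPPAHGVTSAPDTRPAP",
}


def pts_fraction(peptide: str) -> float:
    """Return the fraction of residues that are proline, threonine or serine."""
    if not peptide:
        raise ValueError("peptide must not be empty")
    counts = Counter(peptide.upper())
    return (counts["P"] + counts["T"] + counts["S"]) / len(peptide)


for gene, unit in REPEAT_UNITS.items():
    # Report length, PTS fraction, and whether serine or cysteine occurs.
    print(
        f"{gene}: length={len(unit)}, "
        f"PTS fraction={pts_fraction(unit):.2f}, "
        f"contains S={'S' in unit}, contains C={'C' in unit}"
    )
```

Both units are proline/threonine/serine-rich and cysteine-free, yet they differ in length and sequence, which is the point the box makes about composition being shared while sequence is not.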

produce transmembrane glycoproteins that were also described as mucins. The term mucin was thus used for both secretory and membrane-bound epithelial glycoproteins, which were included in the MUC family on the basis of sequencing data [1]. As discussed previously [2], there are many other glycoproteins, such as the human endothelial and leukocyte glycoproteins, that were also described as mucins. However, these are historically viewed as a separate group of glycoproteins with separate functions, and were consequently not included in the MUC nomenclature. For clarity, we see no reason for any additions to the MUC-type mucin family. Instead, we seek to end the forced cohabitation of particular MUC family members.

Our primary concern is that the tendency to incorporate more and more mucins into this one MUC gene ‘family’ will not aid scientific understanding. Furthermore, expansion of this family is against the formal policy of HUGO/GNC, which strives towards a logical and systematic nomenclature of related genes. With hindsight, the confusion began with the naming of MUC1 and MUC2, both coined in 1990 [3,4]. Although both of these mucins are widely recognized


as mucins (by the biochemical definition), they are very dissimilar molecules (Fig. 1). Later additions to the MUC gene family can be roughly divided into two opposite camps, resembling either MUC1 or MUC2, although still other MUC genes seem to have totally unrelated sequences. In trying to illuminate these matters, we cannot escape scrutinizing the definition of a mucin.

[Figure 1: schematic of MUC2 and of MUC1 at the plasma membrane. Key: unique sequence MUC1; unique sequence MUC2; PTS-domain: PTTTPITTTTTVTPTPTPTGTQT; PTS-domain: GSTAPPAHGVTSAPDTRPAP; cysteine-rich domain; VWF-D-like domain; VWF-C-like domain; C-terminal domain; SEA-domain; transmembrane domain; N-glycan; O-glycan; plasma membrane; proteolytic processing site.]

Fig. 1. How do we recognize a mucin? Two well-characterized mucins, MUC1 and MUC2, showing the superficial structural similarities of these glycoproteins, and also showing that these molecules are, in fact, very different (indicated by the different colouring schemes). MUC1 has become the epitome of the membrane-bound mucins; MUC2 represents the secretory mucins that form mucus layers.

Defining mucins: family values
One international authority (IUPAC–IUBMB) defines glycoproteins and proteoglycans, but does not mention mucins [5]. There are two approaches to the definition of mucins but both are unsatisfactory when it comes to defining the relationships of the mucin-encoding genes. The first approach relates to the one characteristic of mucins that has stuck through the years: mucins contain many relatively short O-linked glycans (2–20 monosaccharides per chain), which usually form >50% of the molecular weight of the mucin molecule [6]. The problem here is that a (glyco-)protein is being defined not by polypeptide sequence or function but by its post-translational modifications. Using this criterion to define mucins would be similar to conflating all lipoproteins based on their modification with lipid moieties and calling the encoding genes ‘LIP-number’. Thus, post-translational modifications seem to be a brittle basis for gene nomenclature.

The second approach tries to define amino acid sequences that are common to all mucins. Unfortunately, there are no such sequences. However, mucins are very densely O-glycosylated and, consequently, contain sequences that are rich in proline, threonine and/or serine (PTS regions; Fig. 1), which are acceptors for the O-glycans during biosynthesis. PTS regions usually comprise relatively short, tandemly repeated sequences. The peptide sequences of PTS regions have a limited amino acid composition and, consequently, a limited codon usage at DNA and mRNA levels; therefore, these sequences are often superficially alike. It seems probable that the PTS regions of the various mucins have emerged through convergent rather than divergent evolution (Box 2). Each human MUC-type mucin has a distinctive repetitive PTS-rich sequence, with respect to both amino acid sequence and length of the repeated unit. Also, the number of PTS regions can vary with the mucin; for example, in MUC1, the PTS region is uninterrupted, whereas MUC2 contains two distinct PTS regions (Fig. 1). Strikingly, the known MUC orthologues in other mammals do not usually show conservation of either sequence or length of the repeated amino acid sequence (Box 2). If there are any significant sequence homologies among the MUC-type mucins,


Box 2. Mucins: a successful convergence

Biochemically, mucins share many common features, such as their huge size, filamentous structure, many O-linked glycans and an amino acid composition rich in threonine and serine. However, some mucins currently included in the MUC family show no sequence homology. Apparently, mucins were developed independently several times through convergent evolution.

The addition of the O-linked glycans to the polypeptides of mucins is used to maintain an extended conformation to create a long, filamentous structure. O-Glycosylation begins by the addition of single N-acetyl-galactosamine (GalNAc) residues, which are relatively small and can be fitted on adjacent threonine and/or serine residues. N-Glycosylation cannot serve such a function because this involves the addition of very bulky oligosaccharides. In practice, N-glycosylation is always spaced along polypeptides, whereas O-glycans can be packed very tightly.

If there was selection pressure during evolution to develop hydrophilic, elongated structures, then some proteins will have developed dense O-glycosylation. Any peptide that must act as a ligand for O-GalNAc transferases (the enzymes that add the first GalNAc to a serine or threonine residue) will meet similar constraints and will consequently contain: (1) many threonines and/or serines as targets for O-GalNAc addition; (2) many proline and small neutral amino acids to enable an elongated, flexible, random coil peptide; (3) no cysteines, because these would coil up the polypeptides by forming disulfide bonds; and (4) very few bulky hydrophobic amino acids, because these would clot up the polypeptide. As a result of these special constrictions, the SMART software invariably marks the PTS regions of mucins as regions of ‘low complexity’ [a]. Once a polypeptide had developed a successful ligand for O-glycosylation (i.e. a small PTS region), and while there was still selection pressure to develop even longer molecules, natural selection apparently favoured duplication of this successful peptide sequence within the polypeptide, resulting in the repeated peptide sequence of the PTS region.

The PTS region apparently evolved independently many times during evolution. Also, there appears to have been little selective pressure to maintain the exact PTS region. Comparison of known MUC-type mucins between humans and rodents reveals that the non-PTS regions are particularly conserved whereas the PTS regions are not [b–k]. Apparently, the constraints as sketched above rule the overall amino acid composition of the PTS region but lay no burden on the exact sequence as long as it functions as an O-glycan acceptor. Naturally, evolution found more ways to form extended structures from polypeptide chains, such as the coiled coil structures found in myosin and the triple-stranded helical structure found in collagen. Furthermore, these types of extended polypeptide structures have a typical repeated peptide structure.

References
a Schultz, J. et al. (2000) SMART: a web-based tool for the study of genetically mobile domains. Nucleic Acids Res. 28, 231–234
b van Klinken, B.J. et al. (1999) Gastrointestinal expression and partial cDNA cloning of murine Muc2. Am. J. Physiol. 276, G115–G124
c Shekels, L.L. et al. (1995) Mouse gastric mucin: cloning and chromosomal localisation. Biochem. J. 311, 775–785
d Vos, H.L. et al. (1991) The mouse episialin (Muc1) gene and its promoter: rapid evolution of the repetitive domain in the protein. Biochem. Biophys. Res. Commun. 181, 121–130
e Spicer, A.P. et al. (1991) Molecular cloning and analysis of the mouse homologue of the tumor-associated mucin, MUC1, reveals conservation of potential O-glycosylation sites, transmembrane, and cytoplasmic domains and a loss of minisatellite-like polymorphism. J. Biol. Chem. 66, 5099–5159
f Shekels, L.L. et al. (1998) Cloning and characterisation of mouse intestinal MUC3 mucin: 3′ sequence contains epidermal-growth-factor-like domains. Biochem. J. 330, 1301–1308
g Inatomi, T. et al. (1997) Cloning of rat Muc5AC mucin gene: comparison of its structure and tissue distribution to that of human and mouse homologues. Biochem. Biophys. Res. Commun. 236, 789–797
h Wu, K. et al. (1994) Molecular cloning and sequencing of the mucin subunit of a heterodimeric, bifunctional cell surface glycoprotein complex of ascites rat mammary adenocarcinoma cells. J. Biol. Chem. 269, 11950–11955
i Gum, J.R. et al. (1991) Molecular cloning of rat intestinal mucin. Lack of conservation between mammalian species. J. Biol. Chem. 266, 22733–22738
j Ohmori, H. et al. (1994) Molecular cloning of the amino-terminal region of a rat MUC 2 mucin gene homologue. Evidence for expression in both intestine and airway. J. Biol. Chem. 269, 17833–17840
k Khatri, I.A. et al. (1997) The carboxyl-terminal sequence of rat intestinal mucin RMuc3 contains a putative transmembrane region and two EGF-like motifs. Biochim. Biophys. Acta 1326, 7–11
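
The ‘low complexity’ label that Box 2 attributes to PTS regions can be made tangible with a simple composition measure. The sketch below computes the Shannon entropy of residue frequencies; it is an illustration only and is not the algorithm that SMART uses. The function name `shannon_entropy` and the control peptide (each of the 20 standard amino acids used equally often) are assumptions introduced here; the repeat unit is the serine-containing one printed in the key of Fig. 1.

```python
import math
from collections import Counter


def shannon_entropy(sequence: str) -> float:
    """Shannon entropy, in bits per residue, of the residue frequencies."""
    if not sequence:
        raise ValueError("sequence must not be empty")
    counts = Counter(sequence.upper())
    total = len(sequence)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


# Serine-containing repeat unit from the key of Fig. 1, repeated in tandem
# ten times to mimic a short PTS region (the repeat count is arbitrary).
REPEAT_UNIT = "GSTAPPAHGVTSAPDTRPAP"
PTS_LIKE = REPEAT_UNIT * 10

# Hypothetical control: all 20 standard amino acids used equally often.
CONTROL = "ACDEFGHIKLMNPQRSTVWY" * 10

print(f"PTS-like array: {shannon_entropy(PTS_LIKE):.2f} bits per residue")
print(f"Uniform control: {shannon_entropy(CONTROL):.2f} bits per residue")
```

The restricted residue usage described in constraints (1) to (4) shows up as a lower entropy than the uniform control. Note that this measure ignores residue order, so it captures the limited composition but not the tandem repetition itself.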

these are found within the non-PTS regions at the N- and C-termini of the polypeptides. Thus, unravelling any kinship among the MUC-type mucins comes down to analysing the homologies of their non-PTS regions.

All in the family?
There is good reason to believe that selection pressure to preserve the polypeptide sequences of mucins was mainly on the non-PTS regions (Box 2). As sequence data on the PTS regions are therefore not informative, we are left with the N and C termini of the mucins to establish their consanguinity. The most easily identifiable relationships are found among the four MUC-type mucins located within the 11p15 locus: MUC2, MUC5AC, MUC5B and MUC6 [3,7–18]. These mucins have von-Willebrand-factor (VWF) domains (C and D types) and a conserved C-terminal domain (Fig. 2). None of these domains is specific for mucins because VWF-C, VWF-D and C-terminal domains are found in 78, 44 and 40 other human proteins, respectively [19]. By contrast, the cysteine-rich domains of MUC2, MUC5AC and MUC5B, with a consensus sequence as previously identified [20], are so far unique to these mucins [19]. The 11p15 mucins share considerable overall homology in their non-PTS regions (21–33%) and have probably evolved through gene duplication of one ancestral gene [21]. Therefore, these 11p15 mucins do, after all, form a closely knit family. A great deal of evidence indicates that these mucins are responsible for the formation of the mucus layers in the body [1,2].

Several membrane-bound mucins are probably related to each other, especially those localized to chromosomal locus 7q22: MUC3A, MUC3B and MUC12 [22–27]. These mucins characteristically have a transmembrane domain, a sea-urchin-sperm-protein–enterokinase–agrin (SEA) domain and one or


two epidermal-growth-factor (EGF)-like domains. Although these features distinguish these mucins from the 11p15 mucins, neither the SEA nor the EGF domains are specific to mucins; both occur in many other non-mucin human proteins (28 and 182, respectively). MUC3 was one of the first MUC proteins found, in 1990 [4], but it has recently been discovered that there are, in fact, two closely related and adjacent genes (MUC3A and MUC3B) with 98% homology [26]. The membrane-bound MUC12 is the most similar to MUC3A and MUC3B (29% homology within the C-terminal non-PTS region). MUC12 and the recently discovered MUC11 were localized to chromosome region 7q22 [27], so MUC12, MUC11, MUC3A and MUC3B might share a common ancestral gene. However, only PTS sequences of MUC11 are known, and it is still possible that this cDNA fragment could prove to be continuous with another mucin cDNA fragment (e.g. with the C-terminal domain assigned to MUC12).

The membrane-bound MUC13, whose gene is at locus 3q13.3 [28], also shares some homology with the 7q22 mucins, in particular in the SEA, EGF and transmembrane domains (Fig. 2). MUC4 is an extremely large membrane-bound mucin that lies on the same chromosomal arm as MUC13 (3q29) [29–32]. MUC4 seems to be a crossbreed, sharing limited homology with the 7q22 mucins (one transmembrane and three EGF domains) and with the 11p15 mucins (VWF-D domain; Fig. 2) [19]. MUC4 also has a nidogen domain, which has not been identified in any other MUC protein but is found in 15 human non-mucin polypeptides [19]. In addition, MUC4 is the only MUC family member to contain the newly discovered ‘adhesion-associated domain in MUC4 and other proteins’ (AMOP domain; Fig. 2) [33]. The membrane-bound MUC1 is also related to the 7q22 mucins as it contains a transmembrane and a SEA domain (Fig. 2) [34–36]. The homology between MUC1, MUC3A, MUC3B, MUC4, MUC12 and MUC13 can be taken as sufficient evidence for a second mucin family.

[Figure 2: domain maps of the 14 MUC-type mucins, listed top to bottom with their chromosomal loci: MUC6 (11p15); MUC2 (11p15); MUC5B (11p15); MUC5AC (11p15); MUC4 (3q29); MUC12 (7q22); MUC13 (3q13); MUC3A (7q22); MUC3B (7q22); MUC1 (1q21); MUC16 (17q21); MUC7 (4q13-q21); MUC8 (12q24); MUC11 (7q22). An annotation in the figure reads ‘Varying from 2334 to 6334 aa’; the row it belongs to cannot be recovered. Key: signal sequence; VWF-D-like domain; PTS-region; VWF-C-like domain; C-terminal domain; transmembrane domain; EGF-like domain; nidogen domain; cysteine-rich domain; SEA-domain; AMOP-domain. Scale bar: 500 aa.]

Fig. 2. A MUC family album. The relationships between the deduced polypeptide sequences of the mucins presently assigned to the MUC gene family by HUGO/GNC (http://www.hugo-international.org/hugo/). The chromosomal location of each mucin gene is also indicated. All mucin sequences were aligned with their C terminus to the right, and any known peptide domains present were detected in the sequences using the SMART software [19]. In addition, MUC4 was found to contain the very recently discovered AMOP domain [33]. Each type of peptide domain is depicted in a separate colour, indicated in the key. The PTS regions in the sequences were invariably recognized by the SMART software as structures with ‘low complexity’ and are depicted in blue. MUC16, MUC7 and MUC8 show no homology to any other known mucin. MUC11 cannot be aligned because it is so far only characterized by its PTS region; its gene has the same chromosomal location as MUC3A, MUC3B and MUC12, and the possibility cannot be excluded that the MUC11 cDNA sequence is, in fact, continuous with the MUC12 cDNA, of which only the C terminus has been identified. The representations are based on the full-length cDNA sequences. Any splice variants (e.g. MUC1, MUC3A, MUC3B and MUC4) are not accounted for, and allelic variation (known to exist in at least some MUCs as variations in the number of tandemly repeated PTS sequences) was also not taken into account. Dashed lines indicate unknown sequences. The brackets around the N terminus of MUC6 indicate that this sequence is not publicly available but has been reported [18]. The scale of the deduced polypeptides is indicated by the bar representing 500 amino acids (aa).

Acknowledgements
Our work on mucins has been made possible by grants from the Netherlands Digestive Diseases Foundation (Nieuwegein), the Irene Foundation (Arnhem), ASTRA/Zeneca (Zoetermeer), the Netherlands Foundation for Scientific Research (the Hague), the Sophia Foundation for Scientific Research (Rotterdam), the Jan Dekker/Ludgardine Bouman Foundation (Amsterdam) and the Gastrostart Foundation (Haarlem). We apologize to the colleagues whose primary work could not be cited because of space limitations.

Putting this aside, some strange characters remain. The sketchy data around MUC8 hamper definite conclusions but, from the available sequence data, there is no homology between this protein and any other MUC-type mucin [37]. MUC7 is an unusually small secretory mucin (Fig. 2), sharing no homology with other MUC proteins [38,39]. MUC16 is the most recent addition to the MUC family (Fig. 2). It was characterized from a partial cDNA sequence encoding a membrane-bound mucin that has long been known as the tumour marker CA125 [40]. MUC16 does not appear to be related to the other MUC-type mucins (Fig. 2). Whereas MUC16 might be a candidate for the transmembrane mucin family, the unique sequences of MUC7 and MUC8 suggest that


these mucins are true orphans. Meanwhile, HUGO has reserved the names MUC15 and MUC17 for further (still anonymous) additions to the MUC family. Bearing the above considerations in mind, we are holding our breath about the nature of these newly assigned MUC proteins.

Conclusions: families and orphans
Most of the MUC-type mucins have no unequivocally defined function on which to base a terminology. If we wish to define mucin relationships, it must be by primary structure instead of function. Yet, there is no unifying sequence homology for the MUC-type mucins. Based on sequence homology, two families of mucins can be distinguished: (1) the mucin genes at locus 11p15, which probably encode mucus-forming mucins; and (2) the mucin genes at loci 7q22, 3q and 1q21, presumably encoding membrane-bound mucins. Currently, several orphan MUCs remain: MUC7 and MUC8, and probably also MUC16, which have no identifiable homology to any other human protein. Because of the dissimilarity of these molecules, we would like to see an adaptation of the mucin nomenclature to distinguish at least two separate families.

Mucins might have evolved through convergent evolution, and this less usual form of evolution could satisfactorily explain why particular mucins share biochemical and cell biological similarities but have no apparent common evolutionary descent. This insight will hopefully soften the pain of dividing up the MUC family.
References
1 Strous, G.J. and Dekker, J. (1992) Mucin-type glycoproteins. Crit. Rev. Biochem. Mol. Biol. 27, 57–92
2 Van Klinken, B.J. et al. (1995) Mucin gene structure and expression: protection vs. adhesion. Am. J. Physiol. 269, G613–G627
3 Griffiths, B. et al. (1990) Assignment of the polymorphic intestinal mucin gene (MUC2) to chromosome 11p15. Ann. Hum. Genet. 4, 277–285
4 Taylor-Papadimitriou, J. (1991) Report on the first international workshop on carcinoma-associated mucins. Int. J. Cancer 49, 1–5
5 Sharon, N. (1987) IUPAC–IUB joint commission on biochemical nomenclature (JCBN). Nomenclature of glycoproteins, glycopeptides and peptidoglycans. Recommendations 1985. J. Biol. Chem. 262, 13–18 (http://www.chem.qmw.ac.uk/iupac/misc/glycp.html)
6 Van Klinken, B.J. et al. (1998) Strategic biochemical analysis of mucins. Anal. Biochem. 265, 103–116
7 Gum, J.R. et al. (1989) Molecular cloning of human intestinal mucin cDNAs. Sequence analysis and evidence for genetic polymorphism. J. Biol. Chem. 264, 6480–6487
8 Gum, J.R., Jr et al. (1994) Molecular cloning of human intestinal mucin (MUC2) cDNA. Identification of the amino terminus and overall sequence similarity to prepro-von Willebrand factor. J. Biol. Chem. 269, 2440–2446
9 Meezaman, D. et al. (1994) Cloning and analysis of cDNA encoding a major airway glycoprotein, human tracheobronchial mucin (MUC5). J. Biol. Chem. 269, 12932–12939
10 Klomp, L.W. et al. (1995) Cloning and analysis of human gastric mucin cDNA reveals two types of conserved cysteine-rich domains. Biochem. J. 308, 831–838
11 van de Bovenkamp, J.H. et al. (1998) Molecular cloning of human gastric mucin MUC5AC reveals conserved cysteine-rich D-domains and a putative leucine zipper motif. Biochem. Biophys. Res. Commun. 245, 853–859
12 Escande, F. et al. (2001) Human mucin gene MUC5AC: organisation of its 5′-region and central repetitive region. Biochem. J. 358, 763–772
13 Desseyn, J.L. et al. (1997) Human mucin gene MUC5B, the 10.7-kb large central exon encodes various alternate subdomains resulting in a super-repeat. Structural evidence for a 11p15.5 gene family. J. Biol. Chem. 272, 3168–3178
14 Offner, G.D. et al. (1998) The amino-terminal sequence of MUC5B contains conserved multifunctional D domains: implications for tissue-specific mucin functions. Biochem. Biophys. Res. Commun. 251, 350–355
15 Desseyn, J.L. (1998) Genomic organisation of the human mucin gene MUC5B. cDNA and genomic sequences upstream of the large central exon. J. Biol. Chem. 273, 30157–30164
16 Toribara, N.W. et al. (1993) Human gastric mucin. Identification of a unique species by expression cloning. J. Biol. Chem. 268, 5879–5885
17 Toribara, N.W. et al. (1997) The carboxyl-terminal sequence of the human secretory mucin, MUC6. Analysis of the primary amino acid sequence. J. Biol. Chem. 272, 16398–16403
18 Toribara, N.W. et al. (1997) The molecular structure of MUC6 human gastric mucin and analysis of its features. Gastroenterology 112, A314
19 Schultz, J. et al. (2000) SMART: a web-based tool for the study of genetically mobile domains. Nucleic Acids Res. 28, 231–234
20 Perez-Vilar, J. and Hill, R.L. (1999) The structure and assembly of secreted mucins. J. Biol. Chem. 274, 31751–31754
21 Desseyn, J.L. et al. (1998) Evolutionary history of the 11p15 human mucin gene family. J. Mol. Evol. 46, 102–106
22 Fox, M.F. et al. (1992) Regional localisation of the intestinal mucin gene MUC3 to chromosome 7q22. Ann. Hum. Genet. 56, 281–287
23 Gum, J.R., Jr et al. (1997) MUC3 human intestinal mucin. Analysis of gene structure, the carboxyl terminus, and a novel upstream repetitive region. J. Biol. Chem. 272, 26678–26686
24 Van Klinken, B.J. et al. (1997) Molecular cloning of human MUC3 cDNA reveals a novel 59 amino acid tandem repeat region. Biochem. Biophys. Res. Commun. 238, 143–148
25 Williams, S.J. et al. (1999) The MUC3 gene encodes a transmembrane mucin and is alternatively spliced. Biochem. Biophys. Res. Commun. 261, 83–89
26 Pratt, W.S. et al. (2000) Multiple transcripts of MUC3: evidence for two genes, MUC3A and MUC3B. Biochem. Biophys. Res. Commun. 275, 916–923
27 Williams, S.J. et al. (1999) Two novel mucin genes down-regulated in colorectal cancer identified by differential display. Cancer Res. 59, 4083–4089
28 Williams, S.J. et al. (2001) Muc13, a novel human cell surface mucin expressed by epithelial and hemopoietic cells. J. Biol. Chem. 276, 18327–18336
29 Porchet, N. et al. (1991) Molecular cloning and chromosomal localisation of a novel human tracheo-bronchial mucin cDNA containing tandemly repeated sequences of 48 base pairs. Biochem. Biophys. Res. Commun. 175, 414–422
30 Gross, M.S. et al. (1992) Mucin 4 (MUC4) gene: regional assignment (3q29) and RFLP analysis. Ann. Genet. 35, 21–26
31 Nollet, S. et al. (1998) Human mucin gene MUC4: organisation of its 5′-region and polymorphism of its central tandem repeat array. Biochem. J. 332, 739–748
32 Moniaux, N. et al. (1999) Complete sequence of the human mucin MUC4: a putative cell membrane-associated mucin. Biochem. J. 338, 325–333
33 Ciccarelli, F.D. et al. (2002) AMOP, a protein module alternatively spliced in cancer cells. Trends Biochem. Sci. 27, 113–115
34 Gendler, S. et al. (1988) A highly immunogenic region of a human polymorphic epithelial mucin expressed by carcinomas is made up of tandem repeats. J. Biol. Chem. 263, 12820–12823
35 Middleton-Price, H. et al. (1988) Close linkage of PUM and SPTA within chromosome band 1q21. Ann. Hum. Genet. 52, 273–278
36 Ligtenberg, M.J. et al. (1991) A single nucleotide polymorphism in an exon dictates allele dependent differential splicing of episialin mRNA. Nucleic Acids Res. 9, 297–301
37 Shankar, V. et al. (1997) Chromosomal localisation of a human mucin gene (MUC8) and cloning of the cDNA corresponding to the carboxyl terminus. Am. J. Respir. Cell Mol. Biol. 16, 232–241
38 Bobek, L.A. et al. (1993) Molecular cloning, sequence, and specificity of expression of the gene encoding the low molecular weight human salivary mucin (MUC7). J. Biol. Chem. 268, 20563–20569
39 Bobek, L.A. et al. (1996) Structure and chromosomal localisation of the human salivary mucin gene, MUC7. Genomics 31, 277–282
40 Yin, B.W. and Lloyd, K.O. (2001) Molecular cloning of the CA125 ovarian cancer antigen. Identification as a new mucin, MUC16. J. Biol. Chem. 276, 27371–27375
