RESEARCH ARTICLE
Open Access
Abstract
Background: Plant ALDH10 enzymes are aminoaldehyde dehydrogenases (AMADHs) that oxidize different ω-amino
or trimethylammonium aldehydes, but only some of them have betaine aldehyde dehydrogenase (BADH) activity
and produce the osmoprotectant glycine betaine (GB). The latter enzymes possess alanine or cysteine at position
441 (numbering of the spinach enzyme, SoBADH), while those ALDH10s that cannot oxidize betaine aldehyde (BAL)
have isoleucine at this position. Only the plants that contain A441- or C441-type ALDH10 isoenzymes accumulate
GB in response to osmotic stress. In this work we explored the evolutionary history of the acquisition of BAL
specificity by plant ALDH10s.
Results: We performed extensive phylogenetic analyses and constructed and characterized, kinetically and
structurally, four SoBADH variants that simulate the parsimonious intermediates in the evolutionary pathway from
I441-type to A441- or C441-type enzymes. All mutants folded correctly, had average thermal stabilities and showed similar
activity with aminopropionaldehyde, but whereas A441S and A441T exhibited significant activity with BAL, A441V
and A441F did not. The kinetics of the mutants were consistent with their predicted structural features obtained by
modeling, and confirmed the importance of position 441 for BAL specificity. The acquisition of BADH activity could
have happened through any of these intermediates without detriment to the original function or protein stability.
Phylogenetic studies showed that this event occurred independently several times during angiosperm evolution
when an ALDH10 gene duplicate changed the critical Ile residue for Ala or Cys in two consecutive single mutations.
ALDH10 isoenzymes frequently group in two clades within a plant family: one includes peroxisomal I441-type, the
other peroxisomal and non-peroxisomal I441-, A441- or C441-type. Interestingly, high-GB-accumulator plants have
non-peroxisomal A441- or C441-type isoenzymes, while low-GB accumulators have the peroxisomal C441-type,
suggesting some limitations in peroxisomal GB synthesis.
Conclusion: Our findings shed light on the evolution of the synthesis of GB in plants, a metabolic trait of great
ecological and physiological relevance for their tolerance to drought, hypersaline soils and cold. Together, our
results are consistent with smooth evolutionary pathways for the acquisition of the BADH function from ancestral
I441-type AMADHs, thus explaining the relatively high occurrence of this event.
Keywords: Osmoprotection, Osmotic stress, Aminoaldehyde dehydrogenase, Enzyme kinetics, Substrate specificity,
Enzyme subcellular location, Protein stability, Protein structure, Protein evolution
* Correspondence: clares@unam.mx
1 Departamento de Bioquímica, Facultad de Química, Universidad Nacional
Autónoma de México, México D.F., México
Full list of author information is available at the end of the article
© 2014 Muñoz-Clares et al.; licensee BioMed Central Ltd. This is an Open Access article distributed under the terms of the
Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use,
distribution, and reproduction in any medium, provided the original work is properly credited. The Creative Commons Public
Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this
article, unless otherwise stated.
Background
The synthesis of the osmoprotectant glycine betaine
(GB) is a metabolic trait of great adaptive importance
that allows the plants possessing it to contend with osmotic stress caused by drought, salinity or low temperatures. Since these adverse environmental conditions are
the major limitations of agricultural production, engineering the synthesis of GB in crops that naturally lack
this ability has been, and still is, a biotechnological goal
for improving their tolerance to osmotic stress (reviewed
in [1]). Also, it is becoming increasingly appreciated that
the GB content of an edible plant is valuable for human
and animal nutrition [2].
In plants, GB is formed by the NAD+-dependent oxidation
of betaine aldehyde (BAL) catalyzed by betaine aldehyde dehydrogenases (BADHs). Within the aldehyde dehydrogenase
(ALDH) superfamily, plant BADHs belong to the family 10
[3], whose members are ω-aminoaldehyde dehydrogenases
(AMADHs) that in vitro can oxidize small aldehydes possessing an ω-primary amine group, such as 3-aminopropionaldehyde (APAL) and 4-aminobutyraldehyde (ABAL) [4-12], a
trimethylammonium group, such as 4-trimethylaminobutyraldehyde (TMABAL) [9,12], or a dimethylsulfonium group,
such as 3-dimethylsulfoniopropionaldehyde [4,5]. In vivo, depending on the substrate used, these enzymes may participate
in diverse biochemical processes, which range from the
catabolism of polyamines to the synthesis of several osmoprotectants (alanine betaine, 4-aminobutyrate, carnitine or
3-dimethylsulfoniopropionate) in addition to GB. Although
the biochemically characterized plant ALDH10s oxidize all
the above-mentioned aldehydes, only some of them efficiently use BAL as substrate [9,13-18] and therefore can participate in the synthesis of GB. The difference in BAL
specificity among the plant ALDH10s was puzzling given
the high structural similarity between BAL and the other
ω-aminoaldehydes, as well as between the plant ALDH10
enzymes. Recently, by means of X-ray crystallography, in
silico model building, site-directed mutagenesis, and kinetic
studies of the ALDH10 enzyme from spinach (SoBADH), we
found that a single amino acid residue, at position 441, is critical for an ALDH10 enzyme to accept or reject
BAL as a substrate [19]. This residue, located in the second
sphere of interaction with the substrate behind the indole
group of the tryptophan equivalent to W456 in SoBADH, determines the size of the pocket formed by the Trp and Tyr
residues equivalent to Y160 and W456 (SoBADH numbering) where the bulky trimethylammonium group of BAL
binds. If this residue is an Ile it pushes the Trp against the
Tyr, thus hindering the binding of BAL, whereas if it is an
Ala or a Cys the Trp adopts a conformation that leaves
enough room for productive BAL binding [19]. This conclusion was drawn by Díaz-Sánchez et al. [19] by comparing the
crystal structures of the SoBADH (PDB code 4A0M) with
those of the two pea AMADH enzymes, which do not have
strong phylogenetic evidence that confirms that peroxisomal I441-type isoenzymes correspond to the ALDH10
ancestral form and that independent duplication events
occurred in monocot and eudicot plants. Indeed, the
change to A441-type isoenzymes was the commonest in
eudicots, whereas the change to C441-type isoenzymes
was in monocots.
Results
Phylogenetic analysis of the ALDH10 enzymes
The phylogenetic analysis described above strongly supports the idea that A441- and C441-type isoenzymes evolved from
the I441-type ones. Ile can be coded by three different
triplets, ATT, ATC and ATA; of these, the most frequently found in monocot and eudicot ALDH10 genes
is ATT and the least frequent is ATA, which indeed was not
found in monocots. Alanine is coded by four triplets, GCT, GCC,
GCA, and GCG, of which GCT is the most used in eudicot ALDH10 genes, and cysteine is coded by two, TGT
and TGC, and both are present in monocot ALDH10
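The codon reasoning above (Ile codons changing to Ala or Cys codons in two consecutive single mutations) can be checked by enumeration. The following is a minimal sketch, assuming the standard genetic code; the function names are illustrative and not taken from the article. It lists the amino acids encoded by codons that lie one nucleotide substitution away from both an Ile codon and an Ala (or Cys) codon.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G (third base varies fastest).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def codons_for(amino_acid):
    """Return all codons that encode the given one-letter amino acid."""
    return [codon for codon, aa in CODON_TABLE.items() if aa == amino_acid]


def hamming(a, b):
    """Number of positions at which two codons differ."""
    return sum(x != y for x, y in zip(a, b))


def single_step_neighbors(codon):
    """Yield every codon reachable by exactly one nucleotide substitution."""
    for pos, base in product(range(3), BASES):
        if codon[pos] != base:
            yield codon[:pos] + base + codon[pos + 1:]


def two_step_intermediates(start_aa, end_aa):
    """Amino acids encoded by the intermediate codon on any two-substitution
    path from a codon for start_aa to a codon for end_aa."""
    found = set()
    for start in codons_for(start_aa):
        for end in codons_for(end_aa):
            if hamming(start, end) != 2:
                continue  # only start/end pairs exactly two substitutions apart
            for mid in single_step_neighbors(start):
                if hamming(mid, end) == 1:
                    found.add(CODON_TABLE[mid])
    return found


ile_to_ala = two_step_intermediates("I", "A")
ile_to_cys = two_step_intermediates("I", "C")
print(sorted(ile_to_ala), sorted(ile_to_cys))
```

Under these assumptions the intermediates are Val and Thr on the way to Ala, and Phe and Ser on the way to Cys, which corresponds to the four SoBADH variants (A441V, A441T, A441F, A441S) characterized in this work.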
Figure 3 Effects of mutation of A441 on the steady-state kinetic parameters of SoBADH. Wild-type and mutant SoBADH enzymes were
assayed at pH 8.0 and 30 °C with BAL as variable substrate at fixed 0.2 mM NAD+. Other conditions are given in the Methods section. The kinetic
parameter values were calculated from the best fit of initial velocity data to the Michaelis-Menten equation by non-linear regression. Each saturation
curve was determined at least in duplicate using enzymes from two different purification batches. Bars indicate standard deviations. In the inset the
kcat/Km values of A441V, A441F and A441I are plotted using a scale smaller than that of the main figure.
the structure and/or thermal stability of the mutant enzymes. The amino acid substitutions made at position
441 were well tolerated, and the levels of expression of
soluble mutant proteins were similar to that of the wild-type enzyme (results not shown). The mutations did not
affect either the native dimeric state of the enzymes, as
judged by gel filtration experiments (results not shown),
or the protein secondary structure, as judged by their
almost identical far-UV CD spectra (Figure 4A). Changes
in the protein tertiary structure could be detected in
their CD spectra in the near-UV range (Figure 4B), where
the signals originate from aromatic residues. The near-UV CD spectrum of wild-type SoBADH shows well-
Table 1 Steady-state kinetic parameters of wild-type and mutant SoBADH enzymes in the oxidation of BAL
Variable substrate   Enzyme      kcat (s⁻¹)     Km (µM)       kcat/Km (mM⁻¹ s⁻¹)
BAL                  Wild type   3.36 ± 0.13    98 ± 15       35 ± 4
BAL                  A441C       2.99 ± 0.19    90 ± 6        33 ± 2
BAL                  A441S       3.29 ± 0.12    119 ± 8       28 ± 3
BAL                  A441T       2.64 ± 0.16    180 ± 6       15 ± 1
BAL                  A441V       0.54 ± 0.05    512 ± 79      1.1 ± 0.3
BAL                  A441F       0.29 ± 0.02    605 ± 26      0.48 ± 0.05
BAL                  A441I       0.74 ± 0.05    1791 ± 115    0.41 ± 0.05
NAD+                 Wild type   4.25 ± 0.16    22 ± 2        195 ± 8
NAD+                 A441C       2.20 ± 0.09    14 ± 1        179 ± 17
NAD+                 A441S       3.27 ± 0.18    29 ± 3        114 ± 20
NAD+                 A441T       2.39 ± 0.00    24 ± 4        100 ± 15
NAD+                 A441V       0.51 ± 0.00    6.4 ± 0.5     80 ± 5
NAD+                 A441F       0.39 ± 0.02    18 ± 1        22 ± 0
NAD+                 A441I       0.68 ± 0.03    2.8 ± 0.0     243 ± 11
Initial velocities were obtained at 30 °C in 50 mM HEPES-KOH buffer, pH 8.0, containing 0.1 mM EDTA. In the experiments with variable BAL, the fixed concentration
of NAD+ was 0.2 mM, and in the experiments with variable NAD+ the fixed BAL concentrations were at least 10 times their appKm values estimated for each enzyme at
fixed 0.2 mM NAD+. The apparent kinetic parameters were estimated by non-linear regression fit of the experimental data to the Michaelis-Menten equation. The values
given in the Table are the mean ± standard deviation of the kinetic parameters estimated in two duplicate saturation experiments performed with enzymes from two
different purification batches. Values for kcat are expressed per subunit.
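As a worked check on the units in Table 1, the catalytic efficiency kcat/Km can be recomputed from the listed mean kcat (s⁻¹) and Km (µM) values for BAL. This is a minimal sketch with illustrative names. The recomputed ratios need not match the tabulated kcat/Km exactly, because the table reports the mean of values estimated in separate experiments rather than the ratio of the means.

```python
# Mean kcat (s^-1) and Km (µM) for BAL as variable substrate, taken from Table 1.
TABLE1_BAL = {
    "Wild type": (3.36, 98.0),
    "A441C": (2.99, 90.0),
    "A441S": (3.29, 119.0),
    "A441T": (2.64, 180.0),
    "A441V": (0.54, 512.0),
    "A441F": (0.29, 605.0),
    "A441I": (0.74, 1791.0),
}


def catalytic_efficiency(kcat_per_s, km_micromolar):
    """Return kcat/Km in mM^-1 s^-1 from kcat in s^-1 and Km in µM."""
    km_millimolar = km_micromolar / 1000.0
    return kcat_per_s / km_millimolar


efficiencies = {
    enzyme: catalytic_efficiency(kcat, km)
    for enzyme, (kcat, km) in TABLE1_BAL.items()
}

# Fold decrease in efficiency with BAL when Ala441 is replaced by Ile.
fold_decrease_a441i = efficiencies["Wild type"] / efficiencies["A441I"]

for enzyme, value in efficiencies.items():
    print(f"{enzyme:10s} {value:8.2f} mM^-1 s^-1")
```

With these inputs the wild-type ratio is about 34 mM⁻¹ s⁻¹ and the A441I ratio about 0.41 mM⁻¹ s⁻¹, in line with the tabulated 35 ± 4 and 0.41 ± 0.05.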
Table 2 Steady-state kinetic parameters of wild-type and mutant SoBADH enzymes in the oxidation of APAL
Variable substrate   Enzyme      kcat (s⁻¹)     Km (µM)        kcat/Km (mM⁻¹ s⁻¹)
APAL                 Wild type   0.99 ± 0.04    3.9 ± 0.1      256 ± 5
APAL                 A441C       0.67 ± 0.02    0.72 ± 0.16    931 ± 188
APAL                 A441S       1.12 ± 0.00    2.0 ± 0.5      550 ± 142
APAL                 A441T       1.50 ± 0.12    4.6 ± 0.5      326 ± 13
APAL                 A441V       0.52 ± 0.10    1.1 ± 0.3      473 ± 216
APAL                 A441F       0.34 ± 0.04    3.7 ± 0.0      92 ± 11
APAL                 A441I       1.85 ± 0.06    4.8 ± 1.3      375 ± 80
NAD+                 Wild type   0.99 ± 0.04    4.0 ± 0.1      250 ± 7
NAD+                 A441C       0.89 ± 0.00    2.6 ± 0.5      342 ± 76
NAD+                 A441S       0.88 ± 0.02    2.8 ± 0.0      314 ± 13
NAD+                 A441T       0.76 ± 0.02    4.0 ± 0.4      190 ± 16
NAD+                 A441V       0.54 ± 0.06    1.7 ± 0.2      318 ± 2
NAD+                 A441F       0.43 ± 0.01    5.9 ± 1.4      73 ± 17
NAD+                 A441I       2.11 ± 0.01    5.5 ± 0.1      382 ± 9
Initial velocities were obtained at 30 °C in 50 mM HEPES-KOH buffer, pH 8.0, containing 0.1 mM EDTA. In the experiments with variable APAL, the fixed concentration of NAD+ was 0.2 mM, and in the experiments with variable NAD+ the fixed APAL concentrations were at least 10 times their appKm values estimated for each
enzyme at fixed 0.2 mM NAD+. The apparent kinetic parameters were estimated by non-linear regression fit of the experimental data to the Michaelis-Menten equation
(saturation by NAD+ at fixed APAL) or to Equation 1 given in the main text (saturation by APAL at fixed NAD+). The values given in the Table are the mean ± standard
deviation of the kinetic parameters estimated in two duplicate saturation experiments performed with enzymes from two different purification batches. Values for kcat
are expressed per enzyme subunit. Substrate inhibition constants for APAL are not given because they could not be accurately estimated in the concentration range
used in these experiments, but the observed degree of inhibition by high APAL concentrations was roughly the same in all the enzymes.
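One way to read Tables 1 and 2 together is to compare, for each enzyme, the tabulated kcat/Km with BAL against that with APAL. The sketch below is only an illustration: the ratio it computes is not a parameter reported in this work, the names are hypothetical, and the inputs are the tabulated means without their standard deviations.

```python
# Tabulated mean kcat/Km values (mM^-1 s^-1) with BAL (Table 1) and APAL (Table 2)
# as the variable substrate.
KCAT_KM_BAL = {
    "Wild type": 35.0, "A441C": 33.0, "A441S": 28.0, "A441T": 15.0,
    "A441V": 1.1, "A441F": 0.48, "A441I": 0.41,
}
KCAT_KM_APAL = {
    "Wild type": 256.0, "A441C": 931.0, "A441S": 550.0, "A441T": 326.0,
    "A441V": 473.0, "A441F": 92.0, "A441I": 375.0,
}

# Ratio of efficiencies: values closer to 1 mean BAL is used almost as well as APAL.
bal_vs_apal = {
    enzyme: KCAT_KM_BAL[enzyme] / KCAT_KM_APAL[enzyme]
    for enzyme in KCAT_KM_BAL
}

for enzyme, ratio in sorted(bal_vs_apal.items(), key=lambda item: -item[1]):
    print(f"{enzyme:10s} BAL/APAL efficiency ratio = {ratio:.4f}")
```

On these numbers the ratio for the wild type is more than 100 times that for A441I, while the efficiency with APAL stays within the same order of magnitude for most variants.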
Discussion
Importance of size, polarity and conformation of the side
chain at position 441 of SoBADH for the kinetics and
stability
All data in this work agree with our previous proposal that
only one amino acid residue at position 441 is critical for
ALDH10 enzymes to accept or reject BAL as substrate
[19]. I441-type isoenzymes possess low or very low activity
with BAL whereas A441- and C441-type isoenzymes exhibit high activity with BAL [19]. The SoBADH mutants
that have a residue of similar size to Ala or Cys, such as A441S
or A441T, exhibit a high activity with BAL, but the mutants with a bulky nonpolar residue, such as A441V or A441I,
have a very low activity with BAL (Figure 3). The exquisite
sensitivity of SoBADH affinity for BAL to the size of the
side chain of the residue at position 441 is reflected in the
to accommodate several mutations, some of them giving rise to a new function: the BADH activity and
therefore, the capacity to participate in the synthesis of
the osmoprotectant GB. The higher thermal stability of
the A441V and A441I mutants relative to that of the
wild-type and A441C, A441S, and A441T is consistent
with the extensive interactions made by the side chains
of Val and Ile, as indicated by our models and the
known crystal structures of ALDH10 enzymes that contain Ile at this position, and with the hydrophobicity of
these two residues. In the case of the A441F mutant the
strain exerted on the protein by the bulky benzyl ring
of Phe probably causes a decrease in the stability of this
enzyme when compared with that of the A441I or
A441V mutants.
Evolution of plant ALDH10 enzymes
single mutations of the duplicates. In this way, the A441- or C441-type isoenzymes acquired an extra BADH activity,
necessary for GB synthesis, while the original peroxisomal
I441-type isoenzymes remained as AMADHs devoid of
BADH activity and most likely retained their metabolic
role. Although the gain of the BADH activity does not
imply the loss of the other AMADH activities when
assayed in vitro ([19]; this work), the above observations indicate that the physiological functions of these
isoenzymes are not interchangeable. Indeed, in many plant
species the duplication of the ALDH10 gene did not result
in functional redundancy, since the duplicated genes gave rise
not only to enzymes with BADH activity but also to
non-peroxisomal I441-type AMADH isoenzymes, as in
the Solanaceae, Malvaceae, Rosaceae and Brassicaceae
families, whose different cellular location probably allows
them to perform a different physiological function from
the peroxisomal ones. This is one possible reason for the
permanence of the duplicate genes that encode non-peroxisomal I441-type isoenzymes.
Our phylogenetic analysis also shows that all known
plant genomes contain at least one ALDH10 gene coding
for a peroxisomal I441-type isoenzyme, which is the one
present even in those plants with a sequenced genome
that possess only one ALDH10 gene. This underscores the
importance of the biochemical processes in which these
peroxisomal I441-type isoenzymes participate, and indicates that, as mentioned above, in vivo they cannot be replaced by the ALDH10 isoenzymes that have gained the
extra BADH activity, i.e. the A441- or C441-type isoenzymes. Together, our findings strongly support that the
activity of the peroxisomal I441-type isoenzymes is essential for the plant, which is not unexpected considering that
the AMADH activity of these enzymes is involved in the
catabolism of polyamines taking place in the peroxisome
[39]. As mentioned above, the cytosolic I441-type isoenzyme may be involved in other physiological processes, for
instance the synthesis of several osmoprotectants, such as
4-aminobutyrate, β-alanine, or carnitine by oxidizing
ABAL, APAL or TMABAL. The different intracellular locations of the ALDH10 isoenzymes, predicted and in some
cases experimentally proven, suggest that there are two
kinds of the I441-type, peroxisomal (the commonest)
and cytosolic, both devoid of significant BADH activity,
and possibly two kinds each of the A441- and C441-type:
chloroplastic and cytosolic in the first case, and peroxisomal and cytosolic in the second, all of them having
BADH activity.
Regarding the intracellular location, the Amaranthaceae
A441-type isoenzymes most likely are chloroplastic, as
those from spinach [40] and sugarbeet [28], although they
lack a typical chloroplast-targeting signal [14]. Therefore,
in the Amaranthaceae the synthesis of GB most likely
takes place in the chloroplasts, as experimentally proven
for spinach [13]. This is in accordance with the chloroplastic location of the CMO enzyme in this plant [41] and
with the fact that the CMO activity requires the electrons
provided by reduced ferredoxin [17] and plant ferredoxins
are plastidic proteins [42]. But in the Poaceae plants that
have the C441-type isoenzyme, GB synthesis may take
place in the cytosol (since some of these isoenzymes are
non-peroxisomal and do not have a clear chloroplast-targeting signal) or in peroxisomes, since other C441-type
isoenzymes have the peroxisomal signal. Indeed, it has
been suggested that in barley the synthesis of GB may take
place in the peroxisome, given the finding of a peroxisomal CMO in this plant [43]. But this proposal is at odds
with the proven non-peroxisomal location of the C441-type ALDH10 isoenzyme of barley [9], and with the fact
that peroxisomal plant ferredoxins have not been
reported so far. Regardless of the peroxisomal or non-peroxisomal CMO location, it seems probable that the
need to transport either choline
or BAL into the peroxisomes limits the synthesis of GB in those plants with
C441-type peroxisomal isoenzymes in comparison with
those in which this isoenzyme is non-peroxisomal. This is
in full agreement with the finding that the level of GB accumulation observed in cereal plants that have a predicted
peroxisomal C441-type isoenzyme (maize, sorghum and
foxtail millet) is much lower than that in plants that have
the non-peroxisomal C441-type, such as wheat and barley
[44-46]. It could be predicted that other Poaceae species
which have a moderate ability to accumulate GB, such
as Pennisetum [44] and Panicum [46], have peroxisomal
C441-type isoenzymes, while those that are high GB accumulators, such as Secale [44,46], have the non-peroxisomal
C441-type isoenzyme.
Since the ability to synthesize the osmoprotectant GB
protects the plant against the most frequent environmental stresses such as salinity, drought, and low temperatures, as well as indirectly against other stresses that
usually accompany them, such as oxidative stress
and high temperatures (reviewed in [47,48]), it is clear that
the presence of A441- or C441-type isoenzymes provides
a strong adaptive advantage, not only to halophytes,
which grow in habitats where saline soils are prevalent,
but also to mesophytes, which may experience sporadic
episodes of water deficit or freezing temperatures. This,
together with the relative ease of the evolutionary
process, explains why the A441- and C441-type isoenzymes evolved independently several times through the
evolution of flowering plants. However, the A441- or
C441-type isoenzymes exhibit a limited phyletic distribution (Figures 1B and 1C), which agrees with the finding
that GB accumulation is also restricted to some eudicot
and monocot families [19]. This finding further supports
the proposal that a significant BADH activity is essential
for GB accumulation [19]. Indeed, a significant BADH
Conclusions
The phylogenetic, biochemical, and structural evidence
presented here supports relatively smooth evolutionary
Methods
The plasmid pET28-SoBADH, containing the full sequence of the spinach BADH gene and an N-terminal
His-tag [19], was used as template for site-directed mutagenesis, which was performed via polymerase chain
reaction (PCR) using the Quick Change XL-II Site Directed Mutagenesis system (Agilent) and the following
mutagenic primers:
A441V, GAAGGCTCTAGAAGTTGGAGTTGTTTGGGTTAATTGCTCAC (forward) and TTGTGAGCAATTAACCCAAACAACTCCAACTTCTAGAGCC (reverse);
A441S, GAAGGCTCTAGAAGTTGGAACTGTTTGGGTTAATTGCTCAC (forward) and TTGTGAGCAATTAACCCAAACAGTTCCAACTTCTAGAGCC (reverse);
A441T, GAAGGCTCTAGAAGTTGGAACCGTTTGGGTTAATTGCTCAC (forward) and TTGTGAGCAATTAACCCAAACGGTTCCAACTTCTAGAGCC (reverse);
A441F, GAAGGCTCTAGAAGTTGGATTTGTTTGGGTTAATTGCTCAC (forward) and TTGTGAGCAATTAACCCAAACAAATCCAACTTCTAGAGCC (reverse).
The non-complementary mutagenic codons are in
italics. Mutagenesis was confirmed by DNA sequencing.
The expression and purification of the recombinant proteins were carried out as reported [19]. Protein concentrations were determined spectrophotometrically, using the
molar absorptivity at 280 nm of 86,400 M⁻¹ cm⁻¹ deduced
from the amino acid sequence by the method of Gill and
von Hippel [50].
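The spectrophotometric determination described above follows the Beer-Lambert law, c = A280 / (ε · l). A minimal sketch is given below, using the molar absorptivity stated in the text; the absorbance reading and the 1-cm path length in the example are hypothetical values chosen for illustration.

```python
# Molar absorptivity of SoBADH at 280 nm, as stated in the Methods (M^-1 cm^-1).
EPSILON_280 = 86_400.0


def protein_concentration_molar(a280, path_length_cm=1.0, epsilon=EPSILON_280):
    """Beer-Lambert law: concentration (M) = absorbance / (epsilon * path length)."""
    if path_length_cm <= 0 or epsilon <= 0:
        raise ValueError("path length and molar absorptivity must be positive")
    return a280 / (epsilon * path_length_cm)


# Hypothetical reading: A280 = 0.432 in a 1-cm cuvette.
concentration_molar = protein_concentration_molar(0.432)
concentration_micromolar = concentration_molar * 1e6
print(f"{concentration_micromolar:.2f} µM")
```

Converting this molar value to mg/mL would additionally require the molecular mass of the protein, which is not given in this section.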
Activity assay and kinetic characterization of the wild-type and mutant SoBADH enzymes
used for the APAL saturation data, which exhibited substrate inhibition:
$$\frac{v}{[E]} = \frac{k_{cat}\,[S]}{K_m + [S]\left(1 + \dfrac{[S]}{K_{IS}}\right)} \qquad (1)$$
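The substrate-inhibition rate law used for the APAL saturation data, v/[E] = kcat[S] / (Km + [S](1 + [S]/K_IS)), can be written as a short function. This is a minimal sketch: the function names are illustrative, and the K_IS value in the example is hypothetical, since the inhibition constants for APAL are not reported. Setting the derivative with respect to [S] to zero shows that the rate peaks at [S] = sqrt(Km · K_IS).

```python
import math


def rate_per_enzyme(s, kcat, km, k_is):
    """v/[E] for a single substrate with substrate inhibition.

    s, km and k_is must share the same concentration units.
    """
    return kcat * s / (km + s * (1.0 + s / k_is))


def optimum_substrate(km, k_is):
    """Substrate concentration at which v/[E] is maximal: sqrt(Km * K_IS)."""
    return math.sqrt(km * k_is)


# Illustrative parameters only: kcat in s^-1, Km and K_IS in µM (K_IS is hypothetical).
KCAT, KM, K_IS = 1.0, 4.0, 400.0

s_opt = optimum_substrate(KM, K_IS)            # 40 µM for these values
v_opt = rate_per_enzyme(s_opt, KCAT, KM, K_IS)

for s in (1.0, 4.0, s_opt, 400.0, 4000.0):
    print(f"[S] = {s:7.1f} µM  v/[E] = {rate_per_enzyme(s, KCAT, KM, K_IS):.3f} s^-1")
```

When K_IS is very large relative to [S], the expression reduces to the Michaelis-Menten equation, which is the form used for the substrates that showed no inhibition.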
CD signals were recorded with a Jasco J-715 spectropolarimeter (Jasco Inc., Easton, MD) equipped with a Peltier-type temperature control system (Model PTC-423S) and
calibrated with d-10-(+)-camphorsulfonic acid. Near-UV
(250–320 nm) and far-UV (200–250 nm) CD spectra were
recorded for samples of 0.25 or 1.0 mg/mL protein concentration, respectively, placed in quartz cuvettes of
1.0-cm and 0.1-cm path length, respectively. Data were
collected at 0.5 nm (near-UV) or 1.0 nm (far-UV) intervals, with a bandwidth of 1.0 nm and at a scan rate of 20 nm/
min. Spectra were averaged over 5 scans and the average
spectrum of a reference sample without protein was subtracted. The observed ellipticities were converted to mean
residue ellipticities [θ] on the basis of a mean molecular
mass per residue of 109.1. Thermally induced protein denaturation was monitored by following the changes in ellipticity at 222 nm while increasing the temperature from 20
to 90 °C at a constant rate of 1.5 °C/min. Apparent Tm values
were determined by non-linear regression fit of the data to a single Boltzmann sigmoidal function. ORIGIN software (OriginLab Corp.) was used for
data analysis and display.
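Two calculations in this paragraph lend themselves to a short illustration: the conversion of observed ellipticity to mean residue ellipticity, and the estimation of an apparent Tm from a Boltzmann sigmoid. The sketch below is not the ORIGIN procedure used by the authors. It assumes the common conversion [θ] = θ(mdeg) · MRW / (10 · l · c), with l in cm and c in mg/mL, uses the four-parameter Boltzmann form y = A2 + (A1 − A2) / (1 + exp((T − Tm)/dT)), and replaces the regression with a coarse least-squares grid search on synthetic, noise-free data. All numerical inputs other than the mean residue mass of 109.1 are hypothetical.

```python
import math

MEAN_RESIDUE_MASS = 109.1  # mean molecular mass per residue stated in the Methods


def mean_residue_ellipticity(theta_mdeg, conc_mg_ml, path_cm, mrw=MEAN_RESIDUE_MASS):
    """Convert observed ellipticity (mdeg) to [theta] in deg cm^2 dmol^-1."""
    return theta_mdeg * mrw / (10.0 * path_cm * conc_mg_ml)


def boltzmann(temp, low, high, tm, slope):
    """Boltzmann sigmoid: equals `low` far below Tm and `high` far above it."""
    return high + (low - high) / (1.0 + math.exp((temp - tm) / slope))


def fit_tm(temps, signal):
    """Coarse least-squares grid search for Tm and slope.

    Simplifying assumption: the plateaus are fixed to the first and last points.
    """
    low, high = signal[0], signal[-1]
    best = None
    for tm_tenths in range(400, 701, 5):        # Tm from 40.0 to 70.0 in 0.5 steps
        for slope_tenths in range(5, 51, 5):    # slope from 0.5 to 5.0 in 0.5 steps
            tm, slope = tm_tenths / 10.0, slope_tenths / 10.0
            sse = sum(
                (boltzmann(t, low, high, tm, slope) - y) ** 2
                for t, y in zip(temps, signal)
            )
            if best is None or sse < best[0]:
                best = (sse, tm, slope)
    return best[1], best[2]


# Far-UV conditions from the text: 1.0 mg/mL protein in a 0.1-cm cuvette.
# The observed ellipticity of -20 mdeg is a hypothetical reading.
theta_mre = mean_residue_ellipticity(-20.0, conc_mg_ml=1.0, path_cm=0.1)

# Synthetic melting curve from 20 to 90 °C with a hypothetical Tm of 55 °C.
temps = [20.0 + 0.5 * i for i in range(141)]
signal = [boltzmann(t, -12000.0, -3000.0, 55.0, 2.0) for t in temps]
fitted_tm, fitted_slope = fit_tm(temps, signal)
print(theta_mre, fitted_tm, fitted_slope)
```

With real data, a proper non-linear least-squares routine that also fits the plateaus would be preferable to the fixed-plateau grid search used here.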
In silico mutagenesis and modeling
ALDH10 amino acid and nucleotide sequences were retrieved by Blast searches at the NCBI site [53] (http://blast.ncbi.nlm.nih.gov/Blast.cgi) or Phytozome v9.1 database ([54];
Additional files
Additional file 1: Table S1. Aldehyde dehydrogenases identified as
members of the ALDH10 family.
Additional file 2: Figure S1. Interactions of the residue at position
equivalent to A441 of SoBADH. Table S2. Distances of the side-chain
atoms of the residue at position 441 to their closest neighbors.
Abbreviations
ALDH: Aldehyde dehydrogenase; ABAL: 4-Aminobutyraldehyde;
AMADH: Aminoaldehyde dehydrogenase; APAL: 3-Aminopropionaldehyde;
BADH: Betaine aldehyde dehydrogenase; BAL: Betaine aldehyde;
CMO: Choline monooxygenase; GB: Glycine betaine; PDB: Protein data bank;
PsAMADH: Pea AMADH; SlAMADH: Tomato AMADH; SoBADH: Spinach BADH;
TMABAL: 4-Trimethylaminobutyraldehyde; WT: Wild type; ZmAMADH: Maize
AMADH.
Competing interests
All the authors declare that they have no competing interests.
Authors' contributions
RAMC conceived, designed and coordinated the study, analyzed kinetic and
structural data, and wrote the manuscript. HRR and AJS carried out the
phylogenetic analysis, interpreted and wrote the results of this analysis, and
contributed to the general discussion and writing of the manuscript. CMJ
constructed and purified the mutant proteins and performed the kinetic
experiments. GGR carried out the CD and thermostability experiments and
analyzed these data. LGS made the in silico mutants, constructed their
models and helped with the making of the figures. All authors read and
approved the final manuscript.
Acknowledgements
This work was supported by UNAM-PAPIIT grants IN217814 to RAMC and
IN216513 to HRR. We thank Dr. LP Martínez-Castilla (Faculty of Chemistry,
UNAM) for helpful discussions and Dr. I. Chávez-Béjar (Faculty of Chemistry,
UNAM) for her help with the sequencing of the recombinant mutant enzymes.
Author details
1Departamento de Bioquímica, Facultad de Química, Universidad Nacional
Autónoma de México, México D.F., México. 2Departamento de Bioquímica,
Facultad de Medicina, Universidad Nacional Autónoma de México, México D.F., México.
41. Brouquisse R, Weigel P, Rhodes D, Yocum CF, Hanson AD: Evidence for a ferredoxin-dependent choline monooxygenase from spinach chloroplast stroma. Plant Physiol 1989, 90(1):322–329.
42. Hanke G, Mulo P: Plant type ferredoxins and ferredoxin-dependent metabolism. Plant Cell Environ 2013, 36(6):1071–1084.
43. Mitsuya S, Kuwahara J, Ozaki K, Saeki E, Fujiwara T, Takabe T: Isolation and characterization of a novel peroxisomal choline monooxygenase in barley. Planta 2011, 234(6):1215–1226.
44. Hitz WD, Hanson AD: Determination of glycine betaine by pyrolysis-gas chromatography in cereals and grasses. Phytochemistry 1980, 19(11):2371–2374.
45. Lerma C, Rich PJ, Ju GC, Yang W-J, Hanson AD, Rhodes D: Betaine deficiency in maize: complementation tests and metabolic basis. Plant Physiol 1991, 95(4):1113–1119.
46. Ishitani M, Arakawa K, Mizuno N, Kishitani S, Takabe T: Betaine aldehyde dehydrogenase in the gramineae: levels in leaves of both betaine-accumulating and nonaccumulating cereal plants. Plant Cell Physiol 1993, 34(3):493–495.
47. Chen THH, Murata N: Glycinebetaine: an effective protectant against abiotic stress in plants. Trends Plant Sci 2008, 13(9):499–505.
48. Ahmad R, Lim CJ, Kwon S-Y: Glycine betaine: a versatile compound with great potential for gene pyramiding to improve crop plant performance against environmental stresses. Plant Biotechnol Rep 2013, 7:49–57.
49. Flores HE, Filner P: Polyamine catabolism in higher plants: characterization of pyrroline dehydrogenase. Plant Growth Reg 1985, 3:277–291.
50. Gill SC, von Hippel PH: Calculation of protein extinction coefficients from amino acid sequence data. Anal Biochem 1989, 182(2):319–326.
51. Emsley P, Cowtan K: Coot: model-building tools for molecular graphics. Acta Crystallogr D Biol Crystallogr 2004, 60:2126–2132.
52. Pettersen EF, Goddard TD, Huang CC, Couch GS, Greenblatt DM, Meng EC, Ferrin TE: UCSF Chimera - a visualization system for exploratory research and analysis. Comput Chem 2004, 13:1605–1612.
53. Benson DA, Clark K, Karsch-Mizrachi I, Lipman DJ, Ostell J, Sayers EW: GenBank. Nucl Acids Res 2014, 42:D32–D37.
54. Goodstein DM, Shu S, Howson R, Neupane R, Hayes RD, Fazo J, Mitros T, Dirks W, Hellsten U, Putnam N, Rokhsar DS: Phytozome: a comparative platform for green plant genomics. Nucleic Acids Res 2012, 40:D1178–D1186.
55. Larkin MA, Blackshields G, Brown NP, Chenna R, McGettigan PA, McWilliam H, Valentin F, Wallace IM, Wilm A, Lopez R, Thompson JD, Gibson TJ, Higgins DG: Clustal W and Clustal X version 2.0. Bioinformatics 2007, 23(21):2947–2948.
56. Madej T, Lanczycki CJ, Zhang D, Thiessen PA, Geer RC, Marchler-Bauer A, Bryant SH: MMDB and VAST+: tracking structural similarities between macromolecular complexes. Nucleic Acids Res 2013, 42:D297–D303.
57. Rose PW, Bi C, Bluhm WF, Christie CH, Dimitropoulos D, Dutta S, Green RK, Goodsell DS, Prlic A, Quesada M, Quinn GB, Ramos AG, Westbrook JD, Young J, Zardecki C, Berman HM, Bourne PE: The RCSB protein data bank: new resources for research and education. Nucleic Acids Res 2013, 41:D475–D482.
58. Hall TA: BioEdit: a user-friendly biological sequence alignment editor and analysis program for Windows 95/98/NT. Nucl Acids Symp Ser 1999, 41:95–98.
59. Julián-Sánchez A, Riveros-Rosas H, Martínez-Castilla LP, Velasco-García R, Muñoz-Clares RA: Phylogenetic and structural relationships of the betaine aldehyde dehydrogenases. In Enzymology and Molecular Biology of Carbonyl Metabolism. 13th edition. Edited by Weiner H, Plapp B, Lindahl R, Maser E. West Lafayette, IN: Purdue University Press; 2007:64–76.
60. Tamura K, Peterson D, Peterson N, Stecher G, Nei M, Kumar S: MEGA5: molecular evolutionary genetics analysis using maximum likelihood, evolutionary distance, and maximum parsimony methods. Mol Biol Evol 2011, 28(10):2731–2739.
61. Whelan S, Goldman N: A general empirical model of protein evolution derived from multiple protein families using a maximum-likelihood approach. Mol Biol Evol 2001, 18(5):691–699.
62. Posada D, Buckley TR: Model selection and model averaging in phylogenetics: advantages of Akaike information criterion and Bayesian approaches over likelihood ratio tests. Syst Biol 2004, 53(5):793–808.
doi:10.1186/1471-2229-14-149
Cite this article as: Muñoz-Clares et al.: Exploring the evolutionary route
of the acquisition of betaine aldehyde dehydrogenase activity by plant
ALDH10 enzymes: implications for the synthesis of the osmoprotectant
glycine betaine. BMC Plant Biology 2014 14:149.
Organism
Lineage
Genome assembly
Number of genes
Embryophyta
Physcomitrella patens subsp. patens (Moss)
Streptophyta; Embryophyta; Bryophyta; Bryophytina; Bryopsida;
Funariidae; Funariales; Funariaceae
Phytozome: v3.0 assembly (preliminary)
27 chromosomes
26610 loci containing protein-coding transcripts
Selaginella moellendorffii (Spikemoss)
Streptophyta; Embryophyta; Tracheophyta; Lycopodiophyta;
Isoetopsida; Selaginellales; Selaginellaceae
Phytozome: v1.0 Dec 20, 2007 FilteredModels3 annotation
27 chromosomes
22 285 protein-coding transcripts
ALDH10
Protein accession number
Gene locus
Amino acid sequence length
XP_001756623
A9RQZ4
PHYPADRAFT_204867
507 aa
XP_002974519
D8RTU8
174224
SELMODRAFT_174224
503 aa
XP_002990779
D8T5B3
SELMODRAFT_272160
503 aa
(redundant copy because the
assembled genome includes two
haplotypes)
Gymnosperm
Picea sitchensis (Sitka spruce) (Pinus sitchensis)
Tracheophyta; Spermatophyta; Coniferopsida; Coniferales; Pinaceae
Angiosperm
Liliopsida (monocots)
Aegilops tauschii (Tausch's goatgrass)
Tracheophyta; Spermatophyta; Magnoliophyta; Liliopsida; Poales;
Poaceae; BEP clade; Pooideae; Triticeae
Aeluropus lagopoides
Liliopsida; Poales; Poaceae; PACMAD clade; Chloridoideae;
Cynodonteae; Aeluropinae
Agropyron cristatum
(Crested wheatgrass) (Bromus cristatus)
Liliopsida; Poales; Poaceae; BEP clade; Pooideae; Triticeae
Brachypodium distachyon
Liliopsida; Poales; Poaceae; BEP clade; Pooideae; Brachypodieae
BioProject PRNJ74771
v1.0 8x assembly
5 chromosomes
26,552 loci containing protein-coding transcripts
ABK24463
A9NV02
(ABK24261)
503 aa
EMT10403
M8C5I2
F775_31015
503 aa
AEV53927
G9JNB0
334 aa (fragment)
ACZ67850
D2K6H6
394 aa (fragment)
XP_003579919
LOC100826926
Chromosome 5
506 aa
XP_003574495
LOC100845613
Chromosome 3
501 aa
Organism
Lineage
Genome assembly
Number of genes
Hordeum brevisubulatum
Liliopsida; Poales; Poaceae; BEP clade; Pooideae; Triticeae
Hordeum vulgare subsp. vulgare (domesticated barley)
Tracheophyta; Spermatophyta; Magnoliophyta; Liliopsida; Poales;
Poaceae; BEP clade; Pooideae; Triticeae /cultivar="Haruna-nijyo
Hordeum vulgare
Liliopsida; Poales; Poaceae; BEP clade; Pooideae; Triticeae /cultivar:
116 Jeonju Native Korca
Leymus chinensis
Liliopsida; Poales; Poaceae; BEP clade; Pooideae; Triticeae
ALDH10
Protein accession number
Gene locus
Amino acid sequence length
AAS66641
505 aa
BAB62846
BBD2
503 aa
BAB62847
BBD1
506 aa
Q40024
LOC548296
505 aa
(BAB62847 ortholog)
BAD86758
LcBADH2
502 aa
BAM10432
BAD86757
LcBADH1
506 aa
AFA36547
288 aa (fragment)
ABG34273
Q153G6
BADH
500 aa
XP_482470
NP_001061833
BADH2
Os08g0424500
(Phytozome: LOC_Os08g32870)
Chromosome 8
503 aa
O24174
NP_001053016
Os04g0464200
BADH1
(Phytozome: LOC_Os04g39020)
Chromosome 4
505 aa
Organism
Lineage
Genome assembly
Number of genes
Oryza sativa Indica Group (long-grained rice)
Tracheophyta; Spermatophyta; Magnoliophyta; Liliopsida; Poales;
Poaceae; BEP clade; Ehrhartoideae; Oryzeae
BioProject PRJN361
39285 genes (12 chromosomes)
37358 proteins
Pandanus amaryllifolius
Liliopsida; Pandanales; Pandanaceae
CM000133
Chromosome 8
503 aa
(unreported gene)
EEC77423
B8AV58
OsI_16212
Chromosome 4
505 aa
Almost identical to:
ABB83473
505 aa
AFD62259
H9BS79
BADH2
387 aa (fragment)
Si009902m
Si009902m.g
505 aa
Si013592m
Si013592m.g
505 aa
AAC49268_NC012875
SORBIDRAFT_06g019200
Chromosome 6
506 aa
XP_002444357
SORBIDRAFT_07g020650
Chromosome 7
505 aa
AAL05264
CBK51339
503 aa
EMS68665
M8ATW1
TRIUR3_04819
417 aa
EMS48376
M7YN64
TRIUR3_05640
544 aa
Zea mays
Liliopsida; Poales; Poaceae; PACMAD clade; Panicoideae;
Andropogoneae
BioProject B73RefGen_V2
10 chromosomes
39 454 genes
AAT70230
NP_001105781
LOC606443
AMADH1B
505 aa
NP_001157807
(PDB: 4I8P)
C0P9J6
gpm154
AMADH1A
LOC100302679
505 aa
Zoysia tenuifolia
Liliopsida; Poales; Poaceae; PACMAD clade; Chloridoideae;
Zoysieae; Zoysiinae
NP_001157804
AMADH2
506 aa
BAD34955
clone="18
BAD34954
BAD34947
504 aa
BAD34957
clone="ZBD1"
BAD34952
BAD34953
507 aa
Eudicotyledons
Amaranthus hypochondriacus (grain amaranth)
eudicotyledons; core eudicotyledons; Caryophyllales; Amaranthaceae
Ammopiptanthus mongolicus
eudicotyledons; core eudicotyledons; rosids; fabids; Fabales;
Fabaceae; Papilionoideae; Thermopsideae
O04895
AAB58165
BADH4
Chloroplastic
501 aa
AAB70010
O22467
BADH17
500 aa
ABC86862
Q2I693
241 aa (fragment)
Aquilegia coerulea Goldsmith (Rocky mountain columbine;
Colorado blue columbine)
Streptophyta; Embryophyta; Tracheophyta; Spermatophyta;
Magnoliophyta; eudicotyledons; Ranunculales; Ranunculaceae
Phytozome: initial 8X unmapped genome assembly and the version
1.1 annotation
24,823 loci containing protein-coding transcripts
Arabidopsis lyrata subsp. lyrata (Lyre-leaved rock-cress)
eudicotyledons; core eudicotyledons; rosids; malvids; Brassicales;
Brassicaceae
Phytozome: includes JGI release v1.0
32670 protein-coding transcripts
Atriplex centralasiatica
eudicotyledons; core eudicotyledons; Caryophyllales; Amaranthaceae
Atriplex hortensis
(Mountain spinach)
eudicotyledons; core eudicotyledons; Caryophyllales; Amaranthaceae
Aquca_002_00337
503 aa
XP_002877599
D7LRJ5
ARALYDRAFT_906054
937567
ALDH10A9
503 aa
XP_002889000
D7KSI7
ARALYDRAFT_895359
926872
ALDH10A8
501 aa
AAL34161
NP_190400
ALDH10A9
AT3G48170
Chromosome 3
Peroxisome
503 aa
Q9S795
NP_565094
NP_001185399
ALDH10A8
AT1G74920
ALDH10A8
Chromosome 1
501 aa
AAM19157
Q8L8I3
BADH
500 aa
ABF72123
Q19QV8
(P42575; same gene)
500 aa
Atriplex micrantha
eudicotyledons; core eudicotyledons; Caryophyllales; Amaranthaceae
Atriplex prostrata
(Spear-leaved orache) (Atriplex triangularis)
eudicotyledons; core eudicotyledons; Caryophyllales; Amaranthaceae
Atriplex tatarica
eudicotyledons; core eudicotyledons; Caryophyllales; Amaranthaceae
ABM97658
A2TJX7
BADH
500 aa
AAM08913
Q8RX99
BADH1
(AAP13999)
500 aa
AAM08914
Q8RX98
BADH2
424 aa (fragment)
ABQ18317
A5HMM5
BADH
500 aa
BAB18544
Q9FRY2
Clone 13
502 aa
BAB18543
Q9FRY3
Clone 2
503 aa
P28237
CAA41377
CAA41376
Chloroplastic
500 aa
(not reported yet at NCBI)
Locus 1572
500 aa
AAQ55493
503 aa
Bra003781
Bra003781
501 aa
Bra019528
Bra019528
503 aa
Camellia sinensis (tea)
eudicotyledons; core eudicotyledons; asterids; Ericales; Theaceae
Capsella rubella
Tracheophyta; Spermatophyta; Magnoliophyta; eudicotyledons; core
eudicotyledons; rosids; malvids; Brassicales; Brassicaceae;
Camelineae
Phytozome: initial 22x genome assembly and a preliminary
annotation
26,521 total loci containing protein-coding transcripts
Carthamus tinctorius (safflower)
eudicotyledons; core eudicotyledons; asterids; campanulids;
Asterales; Asteraceae; Carduoideae; Cardueae; Centaureinae
Chorispora bungeana
eudicotyledons; core eudicotyledons; rosids; malvids; Brassicales;
Brassicaceae; Chorisporeae
Chrysanthemum lavandulifolium (Daisy) (Dendranthema
lavandulifolium)
eudicotyledons; core eudicotyledons; asterids; campanulids;
Asterales; Asteraceae; Asteroideae; Anthemideae; Artemisiinae
AFP19449
I7F760
505 aa
EOA23841
CARUB_v10017058mg
503 aa
EOA35065
CARUB_v10020176mg
501 aa
ADW80905
G8D1H1
510 aa
AAV67891
Q5Q033
502 aa
AAY33871
Q4U5F2
DlBADH1
503 aa
AAY33872
Q4U5F1
DlBADH2
506 aa
XP_004508822
LOC101506136
Chromosome Ca7
503 aa
XP_004501961
LOC101507930
Chromosome Ca5
503 aa
Ciclev10019800m
Ciclev10019800m.g
505 aa
Ciclev10019822m
Ciclev10019800m.g
501 aa
(same locus, alternative transcript)
Citrus sinensis (sweet orange)
Tracheophyta; Spermatophyta; Magnoliophyta; eudicotyledons; core
eudicotyledons; rosids; malvids; Sapindales; Rutaceae
Phytozome: (v.1) of the assembly is 319 Mb spread over 12,574
scaffolds
25,376 protein-coding loci
Corylus heterophylla
eudicotyledons; core eudicotyledons; rosids; fabids; Fagales;
Betulaceae
Cucumis melo (muskmelon)
eudicotyledons; core eudicotyledons; rosids; fabids; Cucurbitales;
Cucurbitaceae; Benincaseae
Cucumis sativus (cucumber)
eudicotyledons; core eudicotyledons; rosids; fabids; Cucurbitales;
Cucurbitaceae; Benincaseae
Phytozome: Roche/JGI v1 annotation
21491 loci containing protein-coding transcripts
Fragaria vesca subsp. vesca (woodland strawberry)
Tracheophyta; Spermatophyta; Magnoliophyta; eudicotyledons; core
eudicotyledons; rosids; fabids; Rosales; Rosaceae; Rosoideae;
Potentilleae; Fragariinae
Located in scaffold_0016_14
(non-reported gene)
ADW80331
E9NRZ9
503 aa
AEK81574
G1EH68
503 aa
XP_004138132
LOC101213657
Cucsa.197230
503 aa
XP_004299590
mrna09492.1-v1.0-hybrid
LOC101300913
gene09492-v1.0-hybrid
505 aa
Glycine max (Soybean) (Glycine hispida)
eudicotyledons; core eudicotyledons; rosids; fabids; Fabales;
Fabaceae; Papilionoideae; Phaseoleae
50202 genes (20 chromosomes)
44642 proteins
Phytozome: Soybean Glyma1.0 annotation; 20 chromosomes, with a
small additional amount of mostly repetitive sequence in unmapped
scaffolds.
54,175 protein-coding loci and 73,320 transcripts have been
predicted
Halostachys caspica
eudicotyledons; core eudicotyledons; Caryophyllales; Amaranthaceae
Haloxylon ammodendron
eudicotyledons; core eudicotyledons; Caryophyllales; Amaranthaceae
Haloxylon persicum
eudicotyledons; core eudicotyledons; Caryophyllales; Amaranthaceae
NP_001234990
BADH1
B0M1A6
Glyma06g19820
Chromosome 6
503 aa
NP_001238427
B0M1A5
ADN03184
BADH2
Glyma05g01770
Chromosome 5
503 aa
XP_003550754
LOC100795267
Glyma17g10120
Chromosome 17
315 aa (pseudogene)
AAR23816
503 aa
Gorai.001G071200.2
Gorai.001G071200
503 aa
Gorai.007G047400.1
Gorai.007G047400
502 aa
AFB74193
H6VX92
BADH
500 aa
ABO45931
A4LAP2
500 aa
ACS96437
D0E0H5
BADH
500 aa
AEW31327
G9I1P9
BADH
500 aa
Helianthus annuus (common sunflower)
eudicotyledons; core eudicotyledons; asterids; campanulids;
Asterales; Asteraceae; Asteroideae; Heliantheae alliance; Heliantheae
Jatropha curcas
eudicotyledons; core eudicotyledons; rosids; fabids; Malpighiales;
Euphorbiaceae; Crotonoideae; Jatropheae
Kalidium foliatum
eudicotyledons; core eudicotyledons; Caryophyllales; Amaranthaceae
Ligusticum sinense
eudicotyledons; core eudicotyledons; asterids; campanulids; Apiales;
Apiaceae; Apioideae; apioid superclade; Selineae
Lycium barbarum (Matrimony vine)
eudicotyledons; core eudicotyledons; asterids; lamiids; Solanales;
Solanaceae; Solanoideae; Lycieae
Medicago sativa (alfalfa)
eudicotyledons; core eudicotyledons; rosids; fabids; Fabales;
Fabaceae; Papilionoideae; Trifolieae
Medicago truncatula (barrel medic)
eudicotyledons; core eudicotyledons; rosids; fabids; Fabales;
Fabaceae; Papilionoideae; Trifolieae
45000 genes (8 chromosomes)
46092 proteins
ACU65243
C8CBI9
BADH
503 aa
ABO69575
B2BBY6
BADH
503 aa
(AFY98894, 521 aa; incorrect insertion)
ABI95806
Q06AI9
500 aa
ADL61811
H2KKR8
508 aa
ACQ99195
D2DEK8
503 aa
AFS33786
J9XXT4
BADH
505 aa
ABE82378
XP_003608928
G7JNS2
(AFK34878)
(AFK43263)
MTR_4g106510
Chromosome 4
503 aa
(ACJ85836; fragment)
AFK44601
(NC_016409)
Chromosome 3
503 aa
mgv1a004929m
mgv1a004929m.g
504 aa
AGG82698
M4NDU5
501 aa
Panax ginseng (Korean ginseng)
eudicotyledons; core eudicotyledons; asterids; campanulids; Apiales;
Araliaceae
Phaseolus vulgaris (Kidney bean; French bean)
eudicotyledons; core eudicotyledons; rosids; fabids; Fabales;
Fabaceae; Papilionoideae; Phaseoleae
Phytozome: release of V1.0, the first chromosome scale version
11 chromosomes
AAQ76705
Q6JSK3
BADH1
503 aa
Phvul.009G182300.1
Phvul.009G182300
Chromosome 9
503 aa
Phvul.003G196700.1
Phvul.003G196700
Chromosome 3
503 aa
CAC48393
Q93YB2
(PDB: 3IWJ)
AMADH2
503 aa
CAC48392
(PDB: 3IWK)
AMADH1
503 aa
AFA53117
H6V966
BADH2
503 aa
AFA53116
H6V965
BADH1
503 aa
XP_002322147
(B9IEL4)
Potri.015G070600
(POPTRDRAFT_666405)
Chromosome LGXV
503 aa
XP_002318630
(B9I351)
Potri.012G075600
(POPTRDRAFT_661953)
Chromosome LGXII
503 aa
Prunus persica (peach)
Tracheophyta; Spermatophyta; Magnoliophyta; eudicotyledons; core
eudicotyledons; rosids; fabids; Rosales; Rosaceae; Maloideae;
Amygdaleae
Solanum torvum
eudicotyledons; core eudicotyledons; asterids; lamiids; Solanales;
Solanaceae; Solanoideae; Solaneae
Solanum tuberosum (Potato) (SOLTU)
eudicotyledons; core eudicotyledons; asterids; lamiids; Solanales;
Solanaceae; Solanoideae; Solaneae
Phytozome: Solanum tuberosum Group Phureja DM1-3 516R44
(CIP801092) Genome Annotation v3.4 mapped to pseudomolecule
sequence
12 chromosomes
35,119 loci containing protein-coding transcripts
EMJ18033
M5X943
PRUPE_ppa022568mg
503 aa
EMJ11127
M5WBJ6
PRUPE_ppa004563mg
503 aa
AER10508
G8EWJ4
503 aa
XP_002511463
B9R8Y8
RCOM_1512620
30147.t000284
503 aa
AEK98521
G1FG21
500 aa
AAX73303
Q56R04
(PDB: 4I8Q; 4I9B)
Chromosome 6
AMADH1
Solyc06g071290.2
504 aa
NP_001234235
B6ECN9
Chromosome 3
AMADH2
Solyc03g113800.2
505 aa
AFA37976
H6V8T9
BADH
505 aa
PGSC0003DMP400055759
PGSC0003DMT400083025
PGSC0003DMG400033028
504 aa
PGSC0003DMP400042549
PGSC0003DMT400063205
PGSC0003DMG400024582
505 aa
Spinacia oleracea (spinach)
Tracheophyta; Spermatophyta; Magnoliophyta; eudicotyledons; core
eudicotyledons; Caryophyllales; Amaranthaceae; Chenopodioideae;
Anserineae
Suaeda liaotungensis
eudicotyledons; core eudicotyledons; Caryophyllales; Amaranthaceae
P17202
(PDB: 4A0M)
chloroplastic
497 aa
AAL33906
Q8W5A1
BADH
501 aa
AFW04226
K7T3P2
501 aa
ABG23669
Q155V4
BADH
501 aa
EOY20585
Domain 1
TCM_011968
Chromosome 3
503 aa
EOY20585
Domain 2
TCM_011968
Chromosome 3
511 aa
Comprises two adjacent ALDH10 genes
Vitis vinifera (wine grape)
eudicotyledons; core eudicotyledons; rosids; Vitales; Vitaceae
NCBI BioProject: PRJNA34679
19 chromosomes
24,508 genes
Phytozome: 12X March 2010 release of the draft genome and
annotation of Vitis vinifera by the French-Italian Public Consortium
for Grapevine Genome Characterization
19 chromosomes
26346 loci containing protein-coding transcripts
XP_003634296
LOC100251899
Chromosome 17
497 aa
(CBI15092) fragment
XP_002283690
D7SHY3
GSVIVT01007829001
LOC100246770
GSVIVT01007829001
Chromosome 17
503 aa
XP_002281984
GSVIVG01032588001
LOC100250859
GSVIVG01032588001
Chromosome 14
499 aa
Fungi
Ascomycota
Schizosaccharomyces pombe (fission yeast)
Fungi; Dikarya; Ascomycota; Taphrinomycotina;
Schizosaccharomycetes; Schizosaccharomycetales;
Schizosaccharomycetaceae
Bioproject: PRJNA127, PRJNA13836, PRJNA20755
3 chromosomes
5133 protein coding genes
Protist
Alveolata
Perkinsus marinus ATCC 50983
Eukaryota; Alveolata; Perkinsea; Perkinsida; Perkinsidae
Bioproject: PRJNA46451, PRJNA12737
23654 protein coding genes
Cryptophyta
Guillardia theta CCMP2712
Eukaryota; Cryptophyta; Pyrenomonadales; Geminigeraceae
Bioproject: PRJNA223305, PRJNA53577
24822 protein coding genes
Stramenopiles
Ectocarpus siliculosus (brown alga)
Eukaryota; Stramenopiles; PX clade; Phaeophyceae; Ectocarpales;
Ectocarpaceae
Bioproject: PRJEA42625
4005 protein coding genes
O59808
NP_588102
SPCC550.10
Chromosome III
500 aa
XP_002784436
Pmar_PMAR003695
394 aa (fragment)
incomplete on both ends
EKX35999
GUITHDRAFT_117911
542 aa
CBN79171
Esi_0010_0069
517 aa
Phytophthora sojae
Eukaryota; Stramenopiles; Oomycetes; Peronosporales
Bioproject: PRJNA17989
26489 protein coding genes
EGZ28599
G4YMH3
PHYSODRAFT_248120
473 aa
EGZ28472
G4YIQ2
PHYSODRAFT_284276
418 aa
Bacteria
Alphaproteobacteria
Rhizobium leguminosarum bv. trifolii WSM2297
Bacteria; Proteobacteria; Alphaproteobacteria; Rhizobiales;
Rhizobiaceae; Rhizobium/Agrobacterium group
Betaproteobacteria
Burkholderia ambifaria MC40-6
Betaproteobacteria; Burkholderiales; Burkholderiaceae; Burkholderia
6784 genes
WP_003575619
EJC83393
ZP_18315655
Rleg4DRAFT_5155
496 aa
EJT02404
ZP_10838419
J6DKY5
RCCGE510_27926
plasmid="pRspCCGE510d"
496 aa
ZP_01555907
YP_001809915
BamMC406_3226
Chromosome 2
493 aa
ZP_01568328
YP_001585229
Bmul_5267
Chromosome 2
493 aa
ZP_01765086
BURPS305_6395
482 aa
YP_366413
Bcep18194_C6720
Chromosome 3
490 aa
YP_266864
CPS_0096
491 aa
Pseudomonas fluorescens Pf0-1
Gammaproteobacteria; Pseudomonadales; Pseudomonadaceae;
Pseudomonas
5833 genes (chromosome)
Pseudomonas protegens Pf-5 (PSEF5)
Gammaproteobacteria; Pseudomonadales; Pseudomonadaceae;
Pseudomonas
6233 genes (chromosome)
YP_350198
Pfl01_4470
483 aa
YP_261892
PFL_4811
482 aa
YP_260018
PFL_2912
476 aa
ZP_01639992
YP_001751324
PputW619_4475
476 aa
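[Figure: the residue at position 441 (SoBADH numbering) and its neighboring tryptophan residues. Panels: SoBADH (A441, W443, W456); ZmAMADH1a (C446, W448, W461); PsAMADH2 (I444, W446, W459); SlAMADH1 (I445, W447, W460); and panels labeled S441, C441, T441, V441, F441 and I441, each shown with W443 and W456.]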
Table S2. Distances of the side-chain atoms of the residue at position 441 to their closest neighbors
Enzyme (residue)
Atoms | Distance (Å)
SoBADH (A441)
CB A441-CZ3 W456 | 3.71
ZmAMADH1 (C446)
CB C446-CZ3 W461 | 3.90
SG C446-CZ3 W461 | 3.94
SG C446-CZ2 W448 | 3.62
SG C446-CE2 W448 | 3.52
SG C446-NE1 W448 | 3.76
PsAMADH2b (I444)
3.90
3.39
3.54
3.85
3.84
3.55
3.74
3.59
3.81
SlAMADH (I445)
3.81
3.51
3.56
3.74
3.76
3.62
3.81
3.54
3.67
CB C441-CZ3 W456 | 3.67
SG C441-CZ3 W456 | 3.94
SG C441-CE2 W443 | 3.74
CB S441-CZ3 W456 | 3.67
OG S441-CZ3 W456 | 3.56
OG S441-CE3 W456 | 3.50
CB T441-CZ3 W456
CG2 T441-CZ3 W456
OG1 T441-CZ3 W456
OG1 T441-CE3 W456
3.69
3.72
3.34
3.30
3.73
3.70
CG F441-CZ3 W456
CG F441-CE3 W456
CD1 F441-CE3 W456
CE1 F441-CE3 W456
CZ F441-CE3 W456
CZ F441-CD2 W456
CZ F441-CG W456
CD2 F441-CZ3 W456
CD2 F441-CE3 W456
CE2 F441-CE3 W456
CE2 F441-CD2 W456
CE2 F441-CG W456
CZ F441-CE3 W456
CZ F441-CD2 W456
CZ F441-CG W456
3.60
3.56
3.70
3.67
3.48
3.70
3.74
3.59
3.36
3.32
3.40
3.71
3.48
3.70
3.74
3.41
3.86
3.17
3.41
3.85
The distances given are the average of those observed in the two monomers of each crystal structure or model. The cutoff was set at 4.0 Å. b The crystal
structure of the PsAMADH2 enzyme is very similar in this region to that of PsAMADH1, which is not included here for this reason. The PDB codes of the crystal
structures are: SoBADH, 4A0M; ZmAMADH1a, 4I8P; PsAMADH2, 3IWJ; SlAMADH1, 4I9B.
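The footnote describes two steps: keep only atom pairs closer than 4.0 Å, then average each distance over the two monomers. A minimal Python sketch of that bookkeeping is given here for illustration. It is not the authors' procedure or software. The function names, the choice to keep only pairs found in both monomers, the inclusive cutoff and the coordinates are all assumptions made for the example; the coordinates are invented and do not come from any PDB entry.

```python
import math

CUTOFF = 4.0  # Å; the cutoff stated in the table footnote


def close_contacts(side_chain, neighbors, cutoff=CUTOFF):
    """Map (side-chain atom, neighbor atom) to distance for pairs within the cutoff."""
    contacts = {}
    for name_a, xyz_a in side_chain.items():
        for name_b, xyz_b in neighbors.items():
            d = math.dist(xyz_a, xyz_b)  # Euclidean distance (Python 3.8+)
            if d <= cutoff:  # assumption: the cutoff is inclusive
                contacts[(name_a, name_b)] = d
    return contacts


def average_over_monomers(contacts_a, contacts_b):
    """Average each contact over two monomers.

    Assumption: only pairs within the cutoff in both monomers are kept.
    """
    shared = sorted(contacts_a.keys() & contacts_b.keys())
    return {
        pair: round((contacts_a[pair] + contacts_b[pair]) / 2, 2)
        for pair in shared
    }


# Hypothetical coordinates (Å), for illustration only.
monomer_a = close_contacts(
    {"CB A441": (0.0, 0.0, 0.0)},
    {"CZ3 W456": (3.70, 0.0, 0.0), "CE2 W443": (5.0, 0.0, 0.0)},
)
monomer_b = close_contacts(
    {"CB A441": (0.0, 0.0, 0.0)},
    {"CZ3 W456": (0.0, 3.72, 0.0), "CE2 W443": (0.0, 0.0, 6.0)},
)
averaged = average_over_monomers(monomer_a, monomer_b)
print(averaged)
```

In this toy input only the CB to CZ3 W456 pair falls within the cutoff in both monomers, so it is the only pair reported. With real structures the coordinates would be read from the PDB files listed in the footnote.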