Contributed by Andrew H. Knoll, July 1, 2011 (sent for review February 9, 2011)
Although macroscopic plants, animals, and fungi are the most familiar eukaryotes, the bulk of eukaryotic diversity is microbial. Elucidating the timing of diversification among the more than 70 lineages is key to understanding the evolution of eukaryotes. Here, we use taxon-rich multigene data combined with diverse fossils and a relaxed molecular clock framework to estimate the timing of the last common ancestor of extant eukaryotes and the divergence of major clades. Overall, these analyses suggest that the last common ancestor lived between 1866 and 1679 Ma, consistent with the earliest microfossils interpreted with confidence as eukaryotic. During this interval, the Earth’s surface differed markedly from today; for example, the oceans were incompletely ventilated, with ferruginous and, after about 1800 Ma, sulfidic water masses commonly lying beneath moderately oxygenated surface waters. Our time estimates also indicate that the major clades of eukaryotes diverged before 1000 Ma, with most or all probably diverging before 1200 Ma. Fossils, however, suggest that diversity within major extant clades expanded later, beginning about 800 Ma, when the oceans began their transition to a more modern chemical state. In combination, paleontological and molecular approaches indicate that long stems preceded diversification in the major eukaryotic lineages.

microbial eukaryotes | Proterozoic oceans | taxon sampling | origin of eukaryotes

Author contributions: L.W.P. and L.A.K. designed research; L.W.P. and D.J.G.L. performed research; L.W.P., D.J.G.L., A.H.K., and L.A.K. analyzed data; and L.W.P., D.J.G.L., A.H.K., and L.A.K. wrote the paper.
The authors declare no conflict of interest.
Freely available online through the PNAS open access option.
1 To whom correspondence may be addressed. E-mail: aknoll@oeb.harvard.edu or lkatz@smith.edu
2 Present address: Department of Chemistry and Biochemistry, University of Colorado, Boulder, CO 80309.
This article contains supporting information online at www.pnas.org/lookup/suppl/doi:10.1073/pnas.1110633108/-/DCSupplemental.

The antiquity of eukaryotes and the tempo of early eukaryotic diversification remain open questions in evolutionary biology. Proposed dates for the origin of the domain based on the fossil record and molecular clock analyses differ by up to 2 billion years (1). Microfossils attributed to eukaryotes occur at about 1800 Ma (2) and putative biomarkers of early eukaryotes have been found in 2700 Ma rocks (3). Such geological interpretations contrast with both molecular clock studies that place the origin of eukaryotes at 1250–850 Ma (4, 5), and a controversial hypothesis that rejects the eukaryotic interpretation of all older fossils and places eukaryogenesis at 850 Ma (6, 7).

Paleontologists generally agree that an unambiguous record of eukaryotic microfossils extends back to ∼1800 Ma (2, 8, 9). Microfossils of this age are assigned to eukaryotes because they combine informative characters that include complex morphology (e.g., the presence of processes and evidence for real-time modification of vegetative morphology), complex wall ultrastructure, and specific inferred behaviors (2, 9, 10). Despite being interpreted as eukaryotic, the taxonomic affinities of these fossils remain unclear (2). Eukaryotic fossils that can be assigned to extant taxonomic groups begin to appear ∼1200 Ma (11) and become more widespread, abundant, and diverse in rocks ∼800 Ma and younger (2, 12, 13).

Molecular estimation of divergence times has improved dramatically in recent years due to the development of methods that incorporate uncertainty from sources that include phylogenetic reconstruction, fossil calibrations, and heterogeneous rates of molecular evolution (1, 14, 15). Relaxed clock approaches account for heterogeneity in evolutionary rates across branches and enable the use of complex models of sequence evolution (reviewed in refs. 16 and 17), although debate continues as to the best method for relaxing the clock (18–20). The process of calibrating molecular clocks has also been greatly improved with both the recognition that single calibration points are insufficient (21, 22), and the availability of methods that incorporate uncertainty from the fossil record by specifying calibrations as time distributions rather than points (15, 16). Additional limitations in previous molecular clock studies of eukaryotes stem from the tradeoff between analyses of many taxa and calibration points but only a single gene (4), and analyses of many genes but a small number of taxa and calibrations (5, 23).

Molecular clock estimates rely on robust phylogenies. Reconstructions of relationships among eukaryotes have begun to stabilize in recent years with the increasing availability of multigene data from diverse lineages (24–26). The majority of the >70 lineages of eukaryotes fall within four major groups: Opisthokonta; Excavata; Amoebozoa; and Stramenopiles, Alveolates, and Rhizaria (SAR) (25, 26), while the placement of some photosynthetic lineages remains controversial (25, 27, 28). Greater data availability also yields more accurate estimates of divergence times because more nodes are available for calibration (29).

The availability of taxon- and gene-rich datasets coupled with flexible molecular clock methods makes this an ideal time to revisit the timing of early eukaryotic evolution. Here, broadly sampled multigene trees are used to estimate dates, with rate heterogeneity across the tree and among genes incorporated into the model. We use 23 calibration points derived from diverse fossils of Proterozoic and Phanerozoic age specified as prior distributions (Table 1). The Proterozoic fossil record is sparse (2, 8, 9), and the taxonomic assignment of some Proterozoic fossils has been called into question by a minority of researchers (6). In the spirit of testing these ideas, we assess the impact of including calibration constraints derived from Phanerozoic fossils alone and Phanerozoic plus Proterozoic fossils. We also assess divergence dates across analyses that varied in the position of the root, and the number of taxa included, as well as across different software platforms and models.

Notes to Table 1:
*Eon: Phan, Phanerozoic; Protero, Proterozoic. Proterozoic calibrations are excluded from Phan analyses.
†Calibration constraints are specified for BEAST using a gamma distribution with a minimum date in Ma based on the fossil record; parameters as indicated: min, minimum divergence date; dist, gamma prior distribution (shape, scale). See Table S3 for details of PhyloBayes calibrations.
‡In the All 720 analysis (c), the minimum age constraint for the red algae node is set to 720 Ma.

Results
Taxon-rich analyses of multiple genes reveal a stability in divergence dates across the eukaryotic tree of life that is robust to changing taxon inclusion, position of the root, molecular clock model, and choice of calibration points (Phanerozoic only or both Phanerozoic and Proterozoic fossils). Collectively, these analyses yield a mean age for the root of extant eukaryotes of 1866–1679 Ma in analyses including both Proterozoic and Phanerozoic calibrations (“All” analyses; Fig. 1A and Table S1). Varying the position of the root had little impact on divergence
dates, especially for the estimated date of the root itself, which generally changed by <100 million years (myr; Fig. 1A). PhyloBayes estimates generally showed more uncertainty than those from BEAST analyses, but around similar means. Similarly, estimates were robust to changing models (uncorrelated or autocorrelated) and to the inclusion of only Phanerozoic (Phan) or all calibrations (All) with one exception: under the autocorrelated Cox–Ingersoll–Ross (CIR) model, estimates are much more recent in Phan analyses (1038 Ma and 1180 Ma; Fig. 1A).

Impact of Calibration Constraints on Estimates of the Origin of Extant Eukaryotes. We assessed the impact of including Proterozoic fossils, which are considered controversial by some (6, 7), by analyzing datasets without these seven calibration constraints (Phan analyses). In BEAST analyses, the exclusion of Proterozoic fossils shifted estimated divergence times toward the present, but not dramatically so: estimates for the mean age of the root of extant eukaryotes fall between 1506 and 1471 Ma in Phan analyses [95% highest-probability density (HPD) range 1643–1347 Ma; Fig. 1A, Figs. S1, S5, and S7, analyses b, f, and h] compared with 1837–1717 Ma (95% HPD range 1954–1601 Ma; Figs. 1A and 2 and Figs. S4 and S6; analyses a, e, and g) when Proterozoic fossils were included (All analyses). Similar dates were recovered in Phan and All PhyloBayes analyses when the uncorrelated gamma (UGAM) model of the molecular clock was assumed (Fig. 1A, analyses i–l).

Of the seven Proterozoic calibration points used in our analyses, only the Bangiomorpha point is controversial in terms of either systematic attribution or age. The Bangiomorpha calibration constraint is more than 400 myr older than our other Proterozoic constraints (Table 1). To determine whether this calibration point drives results in analyses with All calibrations, we assessed the age of the root with a much more conservative estimate for the age of this red alga (All 720; Fig. 1, analysis c). A number of factors place the age of Bangiomorpha ∼1200 Ma (SI Text); however, given the importance of the fossil we assigned an age of 720 Ma to this constraint, representing the absolute younger bound of the Hunting Formation, Canada, in which it is found (SI Text) (11). In BEAST, placing the Bangiomorpha constraint at 720 Ma shifted the estimated age of the root by only 95 myr toward the present (Fig. 1A and Fig. S3, analysis c).

The autocorrelated CIR model combined with the low number of substitutions on deep branches of the eukaryotic tree appears more sensitive to the distribution of calibration dates included in these analyses. Under the CIR autocorrelated model, a consistent age was estimated with All calibrations included (1798–1691 Ma; Fig. 1A, analyses m and o), although confidence intervals are greater in PhyloBayes analyses in general (Fig. 1A, analyses i–p). However, excluding Proterozoic calibration points did cause estimated ages to shift more than 600 myr younger under the CIR model (1180–1038 Ma; Fig. 1A, analyses n and p), pushing the estimated age for the root of extant eukaryotes younger than the widely accepted date for the Bangiomorpha fossils. Similarly, the CIR analyses in PhyloBayes were sensitive to the age of the Bangiomorpha constraint, shifting more than 500 myr younger to 1296 Ma and 1167 Ma in analyses with All calibration points rooted with Opisthokonta and “Unikonta,” respectively (Dataset S1). The necessity of using PhyloBayes to explore the differences between autocorrelated and uncorrelated models introduces confounding factors, as PhyloBayes requires both uniform distributions around calibration points and a fixed tree topology. Given that calibration points are likely best represented by more informative distributions, and that the topology of the tree is not fully known, we focus the rest of our discussions on the results from BEAST, although data from all PhyloBayes analyses are available in Fig. 1A and Dataset S1.

Origin of Major Clades. In most analyses, the major clades of extant eukaryotes diverged before 1200 Ma, with SAR, Excavata, and Amoebozoa arising within a similar time frame, as evidenced by overlapping 95% HPD ranges (Figs. 1 and 2, Figs. S1–S7, and Dataset S1). The 95% HPD intervals are wider for clades with few
Parfrey et al. PNAS | August 16, 2011 | vol. 108 | no. 33 | 13625
calibration points, such as Excavata and Amoebozoa (Fig. 1B). Estimates for the last common ancestor of extant Opisthokonta are younger than the other clades, at 1389–1240 Ma in analyses with All calibration constraints.

Exclusion of Proterozoic calibration constraints (Phan analyses) shifted age estimates for the origins of major extant eukaryotic clades younger by 200–300 myr (Fig. 1B). Differences in divergence times are relatively small for nested clades; e.g., the 95% HPD for Alveolata shifts from 1445–1236 Ma in analysis a (Fig. 2) to 1206–1020 Ma with only Phanerozoic calibration points (analysis b; Fig. S1). Not surprisingly, the differing calibration schemes had their most dramatic impact on the estimated age of the red algae, which changes from 1285–1180 Ma 95% HPD (Fig. 2) to 959–625 Ma 95% HPD when Proterozoic calibration points, including the constraint on red algae at 1174 Ma in accordance with the widely cited age for Bangiomorpha, are excluded (Fig. S1). Estimated ages of major clades were also much younger in analyses using the CIR model with Phan calibrations (analyses n and p; Dataset S1).

The topology of the eukaryotic tree produced through coestimation of phylogeny and divergence times in BEAST is broadly consistent with other analyses (SI Text) (25, 26). Hence, the BEAST topology was also used for the PhyloBayes analyses, which require a fixed topology. Though the relationships among the photosynthetic eukaryotes remain uncertain (25), our analyses suggest that many photosynthetic clades, such as red and green algae, diverged within a similar time frame (Fig. 2). These results imply an early acquisition of photosynthesis in eukaryotes, in accordance with both previous molecular clock estimates (30) and the ∼1200 Ma age assigned to the red algal fossil Bangiomorpha (11).

[Fig. 1 plot not recovered from the source. Panel A headers: BEAST, PhyloBayes; clock-model labels: uncorrelated, autocorrelated; x-axis rows: Root (op, est, un) and Calibration (All, Ph, 720); y-axis ticks 800–2400. Panel B: y-axis ticks 800–2400.]

Fig. 1. Summary of mean divergence dates for the most recent common ancestor of major clades of extant eukaryotes. Letters are at the mean divergence time and denote analyses, as detailed in Table S1. Error bars represent 95% HPD for BEAST analyses (a–h) and the 95% confidence interval for PhyloBayes (analyses i–p). (A) Estimated age of the root of extant eukaryotes across analyses. Root position: Opis, root constrained to Opisthokonta; Uni, root constrained to “Unikonta”; Estim, root estimated by BEAST. Calibration: All, all Phanerozoic and Proterozoic CCs; Phan, Phanerozoic CCs only; 720, All CCs with the minimum age of red algae set to 720 Ma. d = 91 taxa. (B) Estimated ages of major clades from BEAST analyses.

[Fig. 2 tree not recovered from the source. Tip labels, top to bottom, grouped under the clade labels shown in the figure:]
SAR, Alveolates: Heterocapsa rotundata, Alexandrium tamarense, Crypthecodinium cohnii, Karenia brevis, Oxyrrhis marina, Perkinsus marinus, Theileria parva, Plasmodium berghei, Toxoplasma gondii, Eimeria tenella, Stylonychia lemnae, Sterkiella histriomuscorum, Nyctotherus ovalis, Paramecium tetraurelia, Tetrahymena thermophila, Chilodonella uncinata
SAR, Rhizaria: Reticulomyxa filosa, Ovammina opaca, Plasmodiophora brassicae, Bigelowiella natans, Gromia, Corallomyxa tenera, Heteromita globosa
SAR, Stramenopiles: Thalassiosira pseudonana, Phaeodactylum tricornutum, Aureococcus anophagefferens, Heterosigma akashiwo, Ectocarpus siliculosus, Apodachlya brachynema, Phytophthora infestans
Haptophytes: Isochrysis galbana, Emiliania huxleyi, Prymnesium parvum, Pavlova lutheri
Green algae: Oryza sativa, Arabidopsis thaliana, Welwitschia mirabilis, Ginkgo biloba, Physcomitrella patens, Mesostigma viride, Volvox carteri, Chlamydomonas reinhardtii, Dunaliella salina, Acetabularia acetabulum, Micromonas pusilla, Ostreococcus tauri
Cryptomonads: Goniomonas, Guillardia theta, Leucocryptos marina
Red algae: Gracilaria changii, Chondrus crispus, Porphyra yezoensis, Cyanidioschyzon merolae
Glaucocystophytes: Glaucocystis nostochinearum, Cyanophora paradoxa
Excavata: Trypanosoma brucei, Leishmania major, Bodo saltans, Diplonema papillatum, Euglena longa, Euglena gracilis, Entosiphon sulcatum, Jakoba libera, Reclinomonas americana, Seculamonas ecuadoriensis, Naegleria gruberi, Sawyeria marylandensis, Trichomonas vaginalis, Giardia duodenalis, Spironucleus barkhanus, Carpediemonas membranifera, Monocercomonoides sp., Streblomastix strix, Trimastix pyriformis, Malawimonas californiana, Malawimonas jakobiformis
Amoebozoa: Acanthamoeba castellanii, Hartmannella vermiformis, Arcella hemisphaerica, Rhizamoeba sp., Entamoeba histolytica, Mastigamoeba balamuthi, Dictyostelium discoideum, Physarum polycephalum
Opisthokonta: Capitella capitata, Aplysia californica, Schistosoma mansoni, Apis mellifera, Drosophila melanogaster, Caenorhabditis elegans, Gallus gallus, Homo sapiens, Branchiostoma floridae, Mnemiopsis leidyi, Oscarella carmela, Aphrocallistes vastus, Nematostella vectensis, Monosiga brevicollis, Amoebidium parasiticum, Sphaeroforma arctica, Capsaspora owczarzaki, Candida albicans, Saccharomyces cerevisiae, Schizosaccharomyces pombe, Phanerochaete chrysosporium, Ustilago maydis, Glomus intraradices, Allomyces macrogynus, Spizellomyces punctatus

Fig. 2. Time-calibrated tree of extant eukaryotes using All calibration points, 109 taxa, and root constrained to Opisthokonta. Nodes are at mean divergence times and gray bars represent 95% HPD of node age. (Upper) Geological time scale; (Lower) Absolute time scale in Ma. Thick vertical bars demarcate eras and thin vertical lines denote periods, with dates derived from the 2009 International Stratigraphic Chart. Node calibrated with Phanerozoic fossils (•); node calibrated with Proterozoic fossils (◯). Estimated ages of calibrated nodes differ from calibration constraints (Table 1) because they have been modified by relaxed clock analysis of sequence data.

Discussion
The molecular clock analyses presented here suggest that the last common ancestor of extant eukaryotes lived between 1866 and 1679 Ma when both Phanerozoic and Proterozoic fossils are considered. We favor these more-inclusive analyses as they should reveal a more accurate picture of eukaryotic diversification, especially because the chosen fossils are widely accepted by paleontologists, and calibration constraints were assigned in a conservative manner that accounts for age uncertainties. Estimated ages are younger when we remove Proterozoic calibration constraints, though not dramatically so, with the notable exception of the autocorrelated model CIR as implemented in PhyloBayes with only Phanerozoic calibrations. Thus, our results tend to place the last common ancestor of extant eukaryotes deep within the Proterozoic Eon.

Our estimates for the timing of the origin of extant eukaryotes are in line with fossil evidence (2, 13), but reject the hypothesis that eukaryotes originated only 850 Ma (6, 7). Fossils provide minimum dates, leaving open the possibility that clades evolved much earlier than their first fossil appearance (2, 31). Thus, it is not surprising that divergence times for many eukaryotic clades are older than their first unambiguous fossil occurrence (Table 2). The paleontological literature contains some references to eukaryotic fossils older than our estimate of the last common ancestor. In some cases, these paleontological reports are incorrect or ambiguous. For example, large carbonaceous fossils assigned to the genus Grypania were originally reported to be older than our molecular clock estimate (32), but more recent radiometric dates indicate an age of 1874 ± 9 Ma (33), consistent with the clock analyses presented here. Older still are the 50- to 300-μm spheroidal microfossils described from ∼3200 Ma rocks by Javaux et al. (34), and proposed as possible eukaryotes by Buick (35), and sterane biomarkers from 2700 Ma shales (3). Whether these materials record Archean eukaryotes remains a subject of debate (34, 36). Our molecular clock estimates suggest that if these fossils do represent eukaryotes, they record stem lineages (early representatives of eukaryotic groups that went extinct) that were present before the emergence of extant eukaryotic clades.

The major lineages of extant eukaryotes (Opisthokonta, SAR, Excavata, and Amoebozoa) are projected to have diverged from one another by the Mesoproterozoic era (1600–1000 Ma), relatively early in the history of the domain (Fig. 1 and Table 2). This, in turn, suggests that these lineages were present for hundreds of millions of years before the observed increase in the abundance and diversity of eukaryotic microfossils beginning ∼800 Ma (2, 37–40). Our molecular clock estimates indicate that stem groups were present well before recognizable members of crown lineages (monophyletic groups consisting of living representatives and their ancestors) diversified. A similar pattern of long stems preceding diversification is seen in animals and plants and may be a consistent pattern in evolution (38).

Fossils and our molecular clock analyses agree that eukaryotes originated and diversified during a time when oceans differed substantially from the modern seas. Increasingly, geochemical data indicate that for much of the Proterozoic eon, mildly oxic surface waters lay above an oxygen-minimum zone that was persistently anoxic and commonly sulfidic (41, 42). Such conditions are compatible with scenarios for eukaryogenesis that rely on anaerobic methanogens in symbiotic partnership with facultatively aerobic proteobacteria or sulfate reducers (see references in ref. 43), because facultatively anaerobic mitochondria may have enabled early eukaryotes to live in the sulfidic Proterozoic oceans (44). Because sulfide interferes with the function of mitochondria in aerobically respiring eukaryotes, the radiation of diverse species within eukaryotic clades may have become possible only when sulfidic subsurface waters began to wane about 800 Ma (45). Alternatively, early eukaryotic evolution may have occurred in coastal environments sheltered from the impact of sulfidic waters or in freshwater systems, which are both poorly sampled by the geologic record and not impacted by sulfidic oceanic water masses (46). Consistent with this view, moderately diverse assemblages of fossil eukaryotes occur in well-ventilated lake deposits of the 1200 to 900 Ma Torridonian succession, Scotland (47, 48), and in coastal marine deposits of the ∼1500 to 1400 Ma Roper Group, Australia (49).

Within Proterozoic oceans, low concentrations of biologically available nitrogen may also have inhibited the diversification of photosynthetic eukaryotes (50). Many cyanobacteria and other photosynthetic bacteria are capable of nitrogen fixation, ameliorating the impact of nitrate and ammonia limitation on primary production. Eukaryotes, however, have no such capacity; thus, it may not be a coincidence that biomarkers indicating an expanding importance of algae in marine primary production occur in conjunction with geochemical data recording the spread of oxygen through later Neoproterozoic oceans (51). In our analyses, the clade that contains extant photosynthetic taxa, including green algae plus land plants and red algae, arose between 1670 and 1428 Ma, but diversification within these lineages occurred later in the Neoproterozoic and may correspond to a changing redox profile in the oceans (Fig. 2).
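The 95% HPD ranges quoted for node ages in the Results and in Fig. 2 are, in general terms, the narrowest intervals that contain 95% of the posterior samples for a node. The following is a minimal sketch of that calculation, not the routine used by BEAST; the function name `hpd_interval` and the simulated ages in the usage example are hypothetical and serve only as illustration.

```python
import random


def hpd_interval(samples, mass=0.95):
    """Return (low, high) of the narrowest window holding `mass` of the samples.

    Illustrative helper (hypothetical name); assumes a unimodal posterior.
    """
    xs = sorted(samples)
    n = len(xs)
    if n == 0:
        raise ValueError("samples must not be empty")
    if not 0.0 < mass <= 1.0:
        raise ValueError("mass must be in (0, 1]")
    # Number of samples that the interval has to cover.
    k = max(1, min(n, round(mass * n)))
    best = (xs[0], xs[k - 1])
    # Slide a window of k consecutive sorted samples and keep the narrowest.
    for i in range(1, n - k + 1):
        cand = (xs[i], xs[i + k - 1])
        if cand[1] - cand[0] < best[1] - best[0]:
            best = cand
    return best


if __name__ == "__main__":
    # Simulated, right-skewed node ages in Ma (made-up values, not study data).
    rng = random.Random(0)
    ages = [1700.0 + rng.gammavariate(2.0, 40.0) for _ in range(5000)]
    low, high = hpd_interval(ages)
    print(f"95% HPD: {high:.0f}-{low:.0f} Ma")
```

For a skewed posterior such an interval is shorter than the equal-tailed 95% interval, which is one reason HPD ranges are commonly reported for node ages.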
Table 2. Comparison of major node ages to fossil dates

Major clade          Estimated age, Ma    Oldest fossil, Ma    Ref.
Eukaryotes           *                    1800                 (2)
Extant eukaryotes    1679–1866            1200                 (11)
Amoebozoa            1384–1624            800                  (12)
Excavata             1510–1699            450                  (64)
Opisthokonta         1240–1481            632                  (71)
Rhizaria             1017–1256            550                  (65)
SAR                  1365–1577            736                  (74)

Estimated age is range of mean dates from All analyses.
*The age of the root of all eukaryotes is not estimated because molecular clock studies can only inform the timing of extant clades.

Discrepancy Between These and Previous Molecular Clock Studies. Previous molecular clock studies yielded vastly different dates for the root of extant eukaryotes, ranging from 3970 to 1100 Ma (1). In a recent analysis of small subunit ribosomal DNA (SSU-rDNA) from 83 broadly sampled eukaryotes, Berney and Pawlowski (4) placed the origin of eukaryotes at 1100 Ma, a conclusion that was robust to changing the position of the root. They had numerous Phanerozoic calibration constraints specified as either minimum or maximum divergence dates (4), but they found that including Proterozoic calibration points, such as Bangiomorpha at 1200 Ma, shifted their estimates of the origin and diversification of eukaryotes by 1000–2500 myr. The age discrepancy observed by Berney and Pawlowski (4), when Proterozoic calibration constraints are included, contrasts sharply with the relative stability of dates seen in our analyses (Fig. 1A). We hypothesize that the increased gene and taxon sampling, as well as the use of flexible prior distributions of calibration points as implemented in BEAST, are major factors contributing to the stability of molecular clock estimation in our analyses.

Conclusion
Our molecular clock analyses yield a timeline of eukaryotic evolution that is congruent with the paleontological record and robust to varying analytical conditions. According to our analyses, crown (extant) groups of eukaryotes arose in the Paleoproterozoic era (2500–1600 Ma) and began to diversify soon thereafter, suggesting that early eukaryotic evolution was influenced by anoxic and sulfidic water masses in contemporaneous oceans. The stability in our analysis across a range of variables is a welcome departure from the large age discrepancies reported in earlier molecular analyses, reflecting improved paleontological interpretation, advancements in molecular methods, and the rapidly growing body of molecular data from diverse eukaryotes.

Materials and Methods
Alignments. Alignments are derived from the 15 protein-coding genes analyzed in Parfrey et al. (dataset 15:10 of ref. 25). Using this 88-taxon dataset as a starting point, taxa were added to capture additional lineages, particularly those with fossil data available (Table S2). Rapidly evolving taxa (e.g., Encephalitozoon cuniculi) and orphans (e.g., Breviata anathema) were removed to minimize rate heterogeneity for the clock analysis. The resulting 109-taxon data matrix includes 5,696 characters, with each taxon having between three and 15 of the target genes (36% missing character data; Table S2; analyses a–c and e–p). A 91-taxon alignment was created by removing additional taxa with either long branches or high levels of missing data to ensure that our results were not driven by these potential sources of artifact (analysis d).

Molecular Dating Analyses. Dating analyses were predominantly performed in BEAST v1.5.4 (52), and we also assessed results obtained in PhyloBayes 3.2f (53) (see SI Text for analysis details). BEAST offers a number of desirable features, including flexible specification of prior distributions that enables the uncertainty of the fossil record to be realistically modeled, as well as the ability to coestimate divergence times with topology (15). We compared divergence dates for eukaryotes obtained from different models to assess whether our conclusions were driven by the choice of a particular model (SI Text, Fig. 1 and Table S1).

Calibration Constraints. Calibration constraints were specified with prior distributions to incorporate errors arising from age dating, stratigraphy, and clade assignment (Table 1). The impact of Proterozoic fossils was assessed by analyzing the data with only the 16 Phanerozoic calibration constraints (Phan analyses b, f, h, j, l, n, and p) or with Phanerozoic and Proterozoic calibration constraints (All analyses a, c–e, g, i, k, m, and o). Calibration constraints were specified with prior distributions in BEAST using BEAUti v1.5.4 (52) and were derived from a conservative reading of the fossil record (i.e., we err toward younger rather than older ages; SI Text). Distributions were specified with long tails unless the fossil record provided minimum-divergence information. Calibration constraints used for PhyloBayes had to be specified as a uniform distribution (Table S3).

Assessing Impact of the Root on the Inferred Age of Eukaryotes. Molecular clock analyses require a rooted tree. However, the position of the eukaryotic root remains an open question; therefore, we compared age estimates from molecular clock analyses with multiple positions for the root of extant eukaryotes. First, the root was constrained to the branch leading to the Opisthokonta or to Opisthokonta + Amoebozoa (“Unikonta”) in accordance with current hypotheses (see SI Text for discussion of the position of the eukaryotic root). In BEAST, the root was specified by constraining a monophyletic ingroup. PhyloBayes requires the tree topology to be fixed, and we used the tree in Fig. 2 rooted on either Opisthokonta or “Unikonta”. Finally, for the third condition, the root was estimated by the molecular clock criterion, as implemented in BEAST (SI Text), which yielded variable estimates of the location of the root.

ACKNOWLEDGMENTS. We thank Ben Normark, Rob Dorit, and Sam Bowser for useful discussions, and Jeff Thorne and Bengt Sennblad for helpful discussions about molecular clock models. This manuscript has been improved following the comments of Emmanuelle Javaux, Andrew Roger, and Heroen Verbruggen. We thank Jessica Grant and Tony Caldanaro for technical help. This research was supported by the National Aeronautics and Space Administration Astrobiology Institute (A.H.K.) and by National Science Foundation Assembling the Tree of Life Grant 043115 and National Science Foundation Systematics Grant 0919152 (to L.A.K.). D.J.G.L. is supported by Conselho Nacional de Desenvolvimento Científico e Tecnológico-Brazil Doutorado no Exterior Fellowship 200853/2007-4.
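The BEAST calibration constraints described under Calibration Constraints and in the footnotes to Table 1 take the form of a gamma distribution offset by a hard minimum age. The sketch below illustrates the shape of such a prior by sampling from it; it is an assumption-laden illustration rather than the configuration used in the study. The 720 Ma minimum is the value stated for the All 720 analysis, whereas the shape and scale values, the function names, and the seed are hypothetical.

```python
import random


def sample_offset_gamma(min_age_ma, shape, scale, n, seed=0):
    """Draw n node ages (Ma) from a gamma prior shifted by a hard minimum age.

    Every draw is older than min_age_ma; the long right tail allows the
    node to be considerably older than its first fossil appearance.
    """
    rng = random.Random(seed)
    # random.gammavariate(alpha, beta) uses alpha as shape and beta as scale.
    return [min_age_ma + rng.gammavariate(shape, scale) for _ in range(n)]


def offset_gamma_mean(min_age_ma, shape, scale):
    """Analytical mean of the offset gamma prior: minimum + shape * scale."""
    return min_age_ma + shape * scale


if __name__ == "__main__":
    MIN_AGE_MA = 720.0   # minimum age of the red algae node in the All 720 analysis
    SHAPE = 2.0          # hypothetical shape, chosen only for illustration
    SCALE = 50.0         # hypothetical scale (myr), chosen only for illustration
    draws = sample_offset_gamma(MIN_AGE_MA, SHAPE, SCALE, n=10000, seed=1)
    print(f"youngest draw: {min(draws):.1f} Ma")
    print(f"sample mean:   {sum(draws) / len(draws):.1f} Ma")
    print(f"prior mean:    {offset_gamma_mean(MIN_AGE_MA, SHAPE, SCALE):.1f} Ma")
```

Under this form the fossil acts as a minimum bound while the shape and scale control how much prior weight is placed on ages far older than the fossil, which is the sense in which a calibration can be specified conservatively.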
1. Roger AJ, Hug LA (2006) The origin and diversification of eukaryotes: Problems with molecular phylogenetics and molecular clock estimation. Philos Trans R Soc Lond B Biol Sci 361:1039–1054.
2. Knoll AH, Javaux EJ, Hewitt D, Cohen P (2006) Eukaryotic organisms in Proterozoic oceans. Philos Trans R Soc Lond B Biol Sci 361:1023–1038.
3. Brocks JJ, Logan GA, Buick R, Summons RE (1999) Archean molecular fossils and the early rise of eukaryotes. Science 285:1033–1036.
4. Berney C, Pawlowski J (2006) A molecular time-scale for eukaryote evolution recalibrated with the continuous microfossil record. Proc Roy Soc Lond B 273:1867–1872.
5. Douzery EJP, Snell EA, Bapteste E, Delsuc F, Philippe H (2004) The timing of eukaryotic evolution: Does a relaxed molecular clock reconcile proteins and fossils? Proc Natl Acad Sci USA 101:15386–15391.
6. Cavalier-Smith T (2002) The phagotrophic origin of eukaryotes and phylogenetic classification of Protozoa. Int J Syst Evol Microbiol 52:297–354.
7. Cavalier-Smith T (2010) Deep phylogeny, ancestral groups and the four ages of life. Philos Trans R Soc Lond B Biol Sci 365:111–132.
8. Porter SM (2004) The fossil record of early eukaryotic diversification. Paleontol Soc Papers 10:35–50.
9. Javaux EJ, Knoll AH, Walter M (2003) Recognizing and interpreting the fossils of early eukaryotes. Orig Life Evol Biosph 33:75–94.
10. Javaux EJ, Knoll AH, Walter MR (2004) TEM evidence for eukaryotic diversity in mid-Proterozoic oceans. Geobiology 2:121–132.
11. Butterfield NJ (2000) Bangiomorpha pubescens n. gen., n. sp.: Implications for the evolution of sex, multicellularity, and the Mesoproterozoic/Neoproterozoic radiation of eukaryotes. Paleobiol 26:386–404.
12. Porter SM, Meisterfeld R, Knoll AH (2003) Vase-shaped microfossils from the Neoproterozoic Chuar Group, Grand Canyon: A classification guided by modern testate amoebae. J Paleontol 77:409–429.
13. Javaux EJ (2007) The early eukaryotic fossil record. Adv Exp Med Biol 607:1–19.
14. Welch JJ, Bromham L (2005) Molecular dating when rates vary. Trends Ecol Evol 20:320–327.
15. Drummond AJ, Ho SYW, Phillips MJ, Rambaut A (2006) Relaxed phylogenetics and dating with confidence. PLoS Biol 4:e88.
16. Ho SYW, Phillips MJ (2009) Accounting for calibration uncertainty in phylogenetic estimation of evolutionary divergence times. Syst Biol 58:367–380.
41. Canfield DE (1998) A new model for Proterozoic ocean chemistry. Nature 396:450–453.
42. Johnston DT, Wolfe-Simon F, Pearson A, Knoll AH (2009) Anoxygenic photosynthesis modulated Proterozoic oxygen and sustained Earth’s middle age. Proc Natl Acad Sci USA 106:16925–16929.
43. Embley TM, Martin W (2006) Eukaryotic evolution, changes and challenges. Nature 440:623–630.
44. Mentel M, Martin W (2008) Energy metabolism among eukaryotic anaerobes in light of Proterozoic ocean chemistry. Philos Trans R Soc Lond B Biol Sci 363:2717–2729.
45. Johnston DT, et al. (2010) An emerging picture of Neoproterozoic ocean chemistry: Insights from the Chuar Group, Grand Canyon, USA. Earth Planet Sci Lett 290:64–73.
71. Cohen PA, Knoll AH, Kodner RB (2009) Large spinose microfossils in Ediacaran rocks as resting stages of early animals. Proc Natl Acad Sci USA 106:6519–6524.
72. Martin MW, et al. (2000) Age of Neoproterozoic bilaterian body and trace fossils, White Sea, Russia: Implications for metazoan evolution. Science 288:841–845.
73. Butterfield NJ, Knoll AH, Swett K (1994) Paleobiology of the Neoproterozoic Svanbergfjellet Formation, Spitsbergen. Fossils Strata 34:1–84.
74. Summons RE, Walter MR (1990) Molecular fossils and microfossils of prokaryotes and protists from Proterozoic sediments. Am J Sci 290-A:212–244.
75. Xiao SH, Knoll AH, Yuan XL, Pueschel CM (2004) Phosphatized multicellular algae in the Neoproterozoic Doushantuo Formation, China, and the early evolution of florideophyte red algae. Am J Bot 91:214–227.
Parfrey et al. PNAS | August 16, 2011 | vol. 108 | no. 33 | 13629
Supporting Information
Parfrey et al. 10.1073/pnas.1110633108
SI Text
Calibration Constraints. Calibration constraints (CCs) were assigned from the fossil record and were set to take into account the multiple sources of uncertainty that arise from using paleontological information to calibrate molecular clocks (1–3). The level of informativeness of the priors varied among calibration constraints to reflect the level of confidence in the timing of the split. For many lineages (including all Proterozoic CCs) the fossil record provides only a minimum divergence time, which is reflected as a very long tail in the prior probability that extends back to ∼3500 Ma. In most cases, the CC was placed at the node where the clade with an available fossil split from its sister group; for example, the first recorded angiosperm pollen (4) is used to constrain the split of angiosperms from their gymnosperm ancestors (Table 1 and Fig. 2). In cases where the fossil falls within the crown clade, the CC was placed at the base of the clade, as in the Endopterygota, where the first Mecoptera fossils constrain the split between Apis and Drosophila (Table 1). Minimum dates (offsets in BEAST) were assigned conservatively. We used radiometric dates when available, and set the minimum constraint to the youngest edge of the reported confidence interval. Thus, the minimum age of the CC for Arcellinida is 736 Ma, because arcellinid fossils are found in rocks older than 742 ± 6 Ma (Table 1) (5). For fossils assigned to geological stages, we used the upper boundary of the stage according to the 2009 International Stratigraphic Chart published by the International Commission on Stratigraphy (http://www.stratigraphy.org/). For example, angiosperm pollen is first found in Valanginian rocks (4) and so was constrained to a minimum date of 133.9 Ma.
Prior distributions were set in one of two ways depending on the level of uncertainty. For clades with robust fossil records where the maximum age of the clade is unlikely to be substantially earlier than its first occurrence (e.g., angiosperms), the prior distribution was set to include 95% of the probable age of the clade. In contrast, Proterozoic records and fossils of groups with a poor fossilization potential provide only minimum dates for lineage origin and, commonly, no information on maximum clade age (e.g., Arcellinida). In these cases the prior distribution was specified with a very long tail, as assessed in BEAUti, that extended back to ∼3500 Ma.
Selected CCs are discussed here (see Table 1 for details of the remaining CCs). Fossils of the earliest red alga, Bangiomorpha, occur in the lower section of the Hunting Formation, Canada, which is bracketed by U-Pb radiometric dates on volcanic rocks of 1267 ± 2 Ma and 723 ± 3 Ma. Direct Pb-Pb dates on carbonates correlative with those containing the fossils yield a much narrower constraint of 1198 ± 24 Ma (6), but this date remains unpublished, and radiometric dating of carbonates can be problematic. The true age of the Hunting Bangiomorpha fossils may therefore lie closer to the lower U-Pb age constraint than the upper, because of the sequence stratigraphic position of fossiliferous strata relative to constraining volcanic rocks and chemo- and biostratigraphic data consistent with a later Mesoproterozoic (>1250 Ma) age (6). In most All 720, and Phan analyses, the minimum date for the Bangiomorpha constraint was set at 1174 Ma. Given the importance of the Bangiomorpha calibration as potentially the oldest phylogenetically constraining fossil by roughly 450 myr, we also ran the All 720, and Phan analysis with the constraint for Bangiomorpha set at 720 Ma (the minimum age for the Hunting Formation) for comparison (analysis d). Because of the controversy surrounding Bangiomorpha, we have placed the calibration on the base of the red algae rather than on a particular node within the clade to be conservative.
The single CC in the Excavata is placed within the euglenids. Although the Excavata generally have a poor fossil record, the euglenid Moyeria is widely distributed in the Ordovician and Silurian, with an earliest occurrence in the Caradocian (7), dated at 450 Ma. Moyeria is thought to have been photosynthetic based on the patterning of its pellicle, indicating an early acquisition of the secondary green algal endosymbiont (8); thus, the CC is placed at the split between photosynthetic (Euglena) and heterotrophic (Entosiphon) euglenids in the tree (Fig. 1).
The calibration constraint for diatoms is based on the earliest diatom fossils from the Valanginian to Hauterivian Myogok Formation in Korea (9); a date of 133.9 Ma is used to represent the upper Valanginian boundary. The CC of this node is younger than in other clock analyses (10) because we do not rely on Pyxidicula, a putative Toarcian diatom (11) for which the material has been lost.
We include a CC for ciliates in the All 720, and Phan analyses that is based on the presence of gammacerane in Neoproterozoic sedimentary rocks (12). Tetrahymanol, the precursor of gammacerane, is commonly found in some ciliates, although it has also been found in bacteria (13). Tetrahymanol production is documented from the Oligohymenophorea and the Plagiopylea (Trimyema), which are not included in this analysis, so the CC was placed at the stem of the Oligohymenophorea (Tetrahymena and Paramecium). This CC was included despite the possibility of bacterial origin, because the 736 Ma constraint is much younger than the date estimated for ciliates (∼1150 Ma) without this constraint in the Phan analyses.
Root of the Eukaryotic Tree of Life. Although our goal was to elucidate the timing of major events in eukaryotic evolution, we also explored the impact of changing the position of the root, because rooted phylogenies are crucial for interpreting the evolutionary events in the history of a lineage. A root must be either provided or estimated for molecular clock analyses (14, 15). However, the root of the eukaryotic tree of life is difficult to determine because the common methods for rooting phylogenies are vulnerable to artifacts caused by rate heterogeneity among lineages of eukaryotes and the vast distance between eukaryotes and archaea or bacteria (16–18). Although numerous hypotheses have been proposed (19–23), the position of the root remains an open debate (16, 17, 24, 25). The most popular hypothesis of recent years places the root of eukaryotes between the Opisthokonta + Amoebozoa (unikonts) and the remaining eukaryotes (bikonts) (19, 26), and previous molecular clock analyses of eukaryotes rooted trees in this manner (10, 27, 28). However, several lines of evidence contradict the unikont/bikont split (23, 24), and alternative roots have been suggested, including at the base of Opisthokonta (23, 29), within Archaeplastida (21, 22), or along the lineage leading to Euglenozoa (20). Rooting the tree of extant eukaryotes along the branch leading to Opisthokonta is supported by ongoing gene-tree species-tree reconciliation work by Gordon Burleigh (University of Florida).
Here, we assess the impact that different positions of the root have on estimates of the age of eukaryotes. The root is (i) estimated in BEAST using the molecular clock criterion (30); (ii) placed between Opisthokonta and the rest of eukaryotes (23, 29); or (iii) placed between “Unikonta” and the rest of eukaryotes (19). PhyloBayes requires a fixed topology for molecular dating analyses, hence those analyses were run rooted either on Opistho-
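As a worked illustration of the minimum-age rule stated under Calibration Constraints (the minimum is set to the youngest edge of the reported radiometric interval), the short Python sketch below reproduces the 736 Ma value for Arcellinida from 742 ± 6 Ma. It also shows one way a long-tailed prior could be scaled so that its upper tail reaches ∼3500 Ma. The offset exponential family, the 97.5% tail probability, and all function names are assumptions made for illustration only; this section does not state which distribution family or tail probability was used in BEAUti.

```python
import math


def minimum_age(date_ma: float, error_ma: float) -> float:
    """Youngest edge (in Ma) of a radiometric date reported as date ± error."""
    return date_ma - error_ma


def exp_mean_for_tail(offset_ma: float, tail_age_ma: float, p: float = 0.975) -> float:
    """Mean of an offset exponential prior whose p-quantile falls at tail_age_ma.

    Hypothetical helper: the exponential family and p are illustrative assumptions.
    """
    if tail_age_ma <= offset_ma:
        raise ValueError("tail age must be older than the minimum age (offset)")
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    return (tail_age_ma - offset_ma) / (-math.log(1.0 - p))


def offset_exp_quantile(offset_ma: float, mean_ma: float, p: float) -> float:
    """Age (in Ma) below which a fraction p of the offset exponential prior lies."""
    return offset_ma + mean_ma * (-math.log(1.0 - p))


if __name__ == "__main__":
    # Radiometric dates quoted in the text, as (date, error) in Ma.
    dates = {
        "Arcellinida": (742.0, 6.0),
        "Bangiomorpha (Pb-Pb)": (1198.0, 24.0),
        "Hunting Formation (upper U-Pb)": (723.0, 3.0),
    }
    for name, (date, err) in dates.items():
        offset = minimum_age(date, err)
        mean = exp_mean_for_tail(offset, 3500.0)  # tail reaching ~3500 Ma (assumed p)
        print(f"{name}: minimum {offset:.0f} Ma, illustrative prior mean {mean:.1f} Myr")
```

Applied to the other dates quoted above, the same subtraction gives 1174 Ma from 1198 ± 24 Ma and 720 Ma from 723 ± 3 Ma, which agrees with the two Bangiomorpha minimum ages reported in the text.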
1. Donoghue PCJ, Benton MJ (2007) Rocks and clocks: Calibrating the Tree of Life using fossils and molecules. Trends Ecol Evol 22:424–431.
2. Ho SYW, Phillips MJ (2009) Accounting for calibration uncertainty in phylogenetic estimation of evolutionary divergence times. Syst Biol 58:367–380.
3. Rutschmann F, Eriksson T, Salim KA, Conti E (2007) Assessing calibration uncertainty in molecular dating: The assignment of fossils to alternative calibration points. Syst Biol 56:591–608.
4. Crane PR, Friis EM, Pedersen KR (1995) The origin and early diversification of angiosperms. Nature 374:27–33.
5. Porter SM, Meisterfeld R, Knoll AH (2003) Vase-shaped microfossils from the Neoproterozoic Chuar Group, Grand Canyon: A classification guided by modern testate amoebae. J Paleontol 77:409–429.
6. Butterfield NJ (2000) Bangiomorpha pubescens n. gen., n. sp.: Implications for the evolution of sex, multicellularity, and the Mesoproterozoic/Neoproterozoic radiation of eukaryotes. Paleobiology 26:386–404.
7. Gray J, Boucot AJ (1989) Is Moyeria a euglenoid? Lethaia 22:447–456.
8. Leander BS, Witek RP, Farmer MA (2001) Trends in the evolution of the euglenid pellicle. Evolution 55:2215–2235.
9. Harwood DM, Nikolaev VA, Winter DM (2007) Cretaceous records of diatom evolution, radiation, and expansion. Paleontol Soc Papers 13:33–59.
10. Berney C, Pawlowski J (2006) A molecular time-scale for eukaryote evolution recalibrated with the continuous microfossil record. Proc Biol Sci 273:1867–1872.
11. Rothpletz A (1896) On the flysch fucoids and a few other fossil algae, as well as diatoms from Liassic sponge reefs (Translated from German). Z Dtsch Geol Ges 52:154–160.
12. Summons RE, Walter MR (1990) Molecular fossils and microfossils of prokaryotes and protists from Proterozoic sediments. Am J Sci 290-A:212–244.
13. Kleemann G, et al. (1990) Tetrahymanol from the phototrophic bacterium Rhodopseudomonas palustris: First report of a gammacerane triterpene from a prokaryote. J Gen Microbiol 136:2551–2553.
14. Renner SS, Grimm GW, Schneeweiss GM, Stuessy TF, Ricklefs RE (2008) Rooting and dating maples (Acer) with an uncorrelated-rates molecular clock: Implications for north American/Asian disjunctions. Syst Biol 57:795–808.
15. Sanderson MJ, Doyle JA (2001) Sources of error and confidence intervals in estimating the age of angiosperms from rbcL and 18S rDNA data. Am J Bot 88:1499–1516.
16. Roger AJ, Hug LA (2006) The origin and diversification of eukaryotes: Problems with molecular phylogenetics and molecular clock estimation. Philos Trans R Soc Lond B Biol Sci 361:1039–1054.
17. Embley TM, Martin W (2006) Eukaryotic evolution, changes and challenges. Nature 440:623–630.
18. Tekle YI, Parfrey LW, Katz LA (2009) Molecular data are transforming hypotheses on the origin and diversification of eukaryotes. Bioscience 59:471–481.
19. Stechmann A, Cavalier-Smith T (2003) The root of the eukaryote tree pinpointed. Curr Biol 13:R665–R666.
20. Cavalier-Smith T (2010) Kingdoms Protozoa and Chromista and the eozoan root of the eukaryotic tree. Biol Lett 6:342–345.
21. Nozaki H (2005) A new scenario of plastid evolution: Plastid primary endosymbiosis before the divergence of the “Plantae,” emended. J Plant Res 118:247–255.
22. Rogozin IB, Basu MK, Csürös M, Koonin EV (2009) Analysis of rare genomic changes does not support the unikont-bikont phylogeny and suggests cyanobacterial symbiosis as the point of primary radiation of eukaryotes. Genome Biol Evol 1:99–113.
23. Arisue N, Hasegawa M, Hashimoto T (2005) Root of the Eukaryota tree as inferred from combined maximum likelihood analyses of multiple molecular sequence data. Mol Biol Evol 22:409–420.
24. Roger AJ, Simpson AGB (2009) Evolution: Revisiting the root of the eukaryote tree. Curr Biol 19:R165–R167.
25. Koonin EV (2010) The origin and early evolution of eukaryotes in the light of phylogenomics. Genome Biol 11:209.
26. Keeling PJ, et al. (2005) The tree of eukaryotes. Trends Ecol Evol 20:670–676.
27. Douzery EJP, Snell EA, Bapteste E, Delsuc F, Philippe H (2004) The timing of eukaryotic evolution: Does a relaxed molecular clock reconcile proteins and fossils? Proc Natl Acad Sci USA 101:15386–15391.
28. Hug LA, Roger AJ (2007) The impact of fossils and taxon sampling on ancient molecular dating analyses. Mol Biol Evol 24:1889–1897.
29. Stechmann A, Cavalier-Smith T (2002) Rooting the eukaryote tree by using a derived gene fusion. Science 297:89–91.
30. Drummond AJ, Ho SYW, Phillips MJ, Rambaut A (2006) Relaxed phylogenetics and dating with confidence. PLoS Biol 4:e88.
31. Huelsenbeck JP, Bollback JP, Levine AM (2002) Inferring the root of a phylogenetic tree. Syst Biol 51:32–43.
32. Hampl V, et al. (2009) Phylogenomic analyses support the monophyly of Excavata and resolve relationships among eukaryotic “supergroups”. Proc Natl Acad Sci USA 106:3859–3864.
33. Parfrey LW, et al. (2010) Broadly sampled multigene analyses yield a well-resolved eukaryotic tree of life. Syst Biol 59:518–533.
34. Stamatakis A, Hoover P, Rougemont J (2008) A rapid bootstrap algorithm for the RAxML Web servers. Syst Biol 57:758–771.
35. Abascal F, Zardoya R, Posada D (2005) ProtTest: Selection of best-fit models of protein evolution. Bioinformatics 21:2104–2105.
36. Drummond AJ, Rambaut A (2007) BEAST: Bayesian evolutionary analysis by sampling trees. BMC Evol Biol 7:214.
37. Suchard MA, Weiss RE, Sinsheimer JS (2001) Bayesian selection of continuous-time Markov chain evolutionary models. Mol Biol Evol 18:1001–1013.
38. Xie W, Lewis PO, Fan Y, Kuo L, Chen MH (2011) Improving marginal likelihood estimation for Bayesian phylogenetic model selection. Syst Biol 60:150–160.
39. Burki F, et al. (2009) Large-scale phylogenomic analyses reveal that two enigmatic protist lineages, telonemia and centroheliozoa, are related to photosynthetic chromalveolates. Genome Biol Evol 1:231–238.
40. Lartillot N, Lepage T, Blanquart S (2009) PhyloBayes 3: A Bayesian software package for phylogenetic reconstruction and molecular dating. Bioinformatics 25:2286–2288.
41. Lepage T, Bryant D, Philippe H, Lartillot N (2007) A general comparison of relaxed molecular clock models. Mol Biol Evol 24:2669–2680.
42. Linder M, Britton T, Sennblad B (2011) Evaluation of Bayesian models of substitution rate evolution: Parental guidance versus mutual independence. Syst Biol 60:329–342.
43. Ho SYW (2009) An examination of phylogenetic models of substitution rate variation among lineages. Biol Lett 5:421–424.
[Tree figure; tip labels omitted. Clade labels, top to bottom: SAR (Alveolates, Rhizaria, Stramenopiles), Haptophytes, Green algae, Red algae, Cryptomonads, Glaucocystophytes, Excavata, Amoebozoa, Opisthokonta.]
Fig. S1. Time-calibrated tree of eukaryotes using Phanerozoic calibration points, 109 taxa, rooted on Opisthokonta, and constructed in BEAST (analysis b).
Nodes are at mean divergence times, and gray bars represent 95% HPD of node age. (Upper) Geological time scale. (Lower) Absolute time scale (in Ma). Thick
vertical bars demarcate eras, and thin vertical lines denote periods, with dates derived from the 2009 International Stratigraphic Chart.
[Tree figure; tip labels omitted. Clade labels, top to bottom: SAR (Alveolates, Rhizaria, Stramenopiles), Haptophytes, Green algae, Red algae, Cryptomonads, Glaucocystophytes, Excavata, Amoebozoa, Opisthokonta.]
Fig. S2. Time-calibrated tree of eukaryotes using All (Proterozoic and Phanerozoic) calibration points with the Bangiomorpha CC set at 720 Ma, 109 taxa,
rooted on Opisthokonta, and constructed in BEAST (analysis c). Other notes as in Fig. S1.
[Tree figure; tip labels omitted. Clade labels, top to bottom: SAR (Alveolates, Rhizaria, Stramenopiles), Haptophytes, Green algae, Red algae, Glaucocystophytes, Cryptomonads, Excavata, Amoebozoa, Opisthokonta.]
Fig. S3. Time-calibrated tree of eukaryotes using All calibration points, 91 taxa, rooted on Opisthokonta, and constructed in BEAST (analysis d). Other notes as
in Fig. S1.
[Tree figure; tip labels omitted. Clade labels, top to bottom: SAR (Alveolates, Rhizaria, Stramenopiles), Haptophytes, Green algae, Cryptomonads, Red algae, Glaucocystophytes, Opisthokonta, Amoebozoa, Excavata.]
Fig. S4. Time-calibrated tree of eukaryotes using All calibration points, 109 taxa, root estimated by BEAST, and constructed in BEAST (analysis e). Other notes
as in Fig. S1.
[Tree figure; tip labels omitted. Clade labels, top to bottom: SAR (Alveolates, Rhizaria, Stramenopiles), Haptophytes, Green algae, Red algae, Cryptomonads, Glaucocystophytes, Opisthokonta, Amoebozoa, Excavata.]
Fig. S5. Time-calibrated tree of eukaryotes using Phanerozoic calibration points, 109 taxa, root estimated by BEAST, and constructed in BEAST (analysis f).
Other notes as in Fig. S1.
[Tree figure; tip labels omitted. Clade labels, top to bottom: SAR (Alveolates, Stramenopiles, Rhizaria), Haptophytes, Green algae, Cryptomonads, Glaucocystophytes, Red algae, Excavata, Opisthokonta, Amoebozoa.]
Fig. S6. Time-calibrated tree of eukaryotes using All calibration points, 109 taxa, rooted on “Unikonta” and constructed in BEAST (analysis g). Other notes as
in Fig. S1.
[Tree figure; tip labels omitted. Clade labels, top to bottom: SAR (Alveolates, Rhizaria, Stramenopiles), Haptophytes, Green algae, Red algae, Cryptomonads, Glaucocystophytes, Excavata, Opisthokonta, Amoebozoa.]
Fig. S7. Time-calibrated tree of eukaryotes using Phanerozoic calibration points, 109 taxa, rooted on “Unikonta” and constructed in BEAST (analysis h). Other
notes as in Fig. S1.
Root age range is the 95% HPD for BEAST analyses and minimum and maximum ages of 95% confidence interval for PhyloBayes. See Table S2 for details of
taxon sampling, and Table 1 for calibration constraints. All trees are available in Dataset S1. All, 22 calibration points of Phanerozoic and Proterozoic age
included; All 720, Bangiomorpha CC set to 720 Ma; CCs, calibration constraints; CIR, autocorrelated CIR model; Estim, root estimated by BEAST; model,
molecular clock model; Opis, root constrained to Opisthokonta; Phan, calibration points of Phanerozoic age included; root, position of the root; UCL,
uncorrelated lognormal; UGAM, uncorrelated gamma model; Uni, root constrained to “Unikonta”.
Table S2. Cont.
Lineage Taxon* 14-3-3 40S Actin αtub βtub Ef1α Ef2 Enolase Grc5 Hsp70cyt Hsp90 MetK Rps22a Rps23a Tsec61 Sum
*Taxa in bold are included in both the 91-taxon and 109-taxon analyses.
†Composite of Goniomonas truncata and Goniomonas cf. pacifica.
‡Composite of G. oviformis and Gromia sp. Antarctica.
§Composite of H. globosa and Heteromita sp. ATCC PRA-74.
Table S3. PhyloBayes calibrations
Node specification Calibration†
Dataset S1 (XLS)