bioRxiv preprint first posted online Mar. 27, 2017; doi: http://dx.doi.org/10.1101/120998. The copyright holder for this preprint (which was not
peer-reviewed) is the author/funder. It is made available under a CC-BY-NC 4.0 International license.

1 The genome of the contractile demosponge Tethya wilhelma and
2 the evolution of metazoan neural signalling pathways
3 Warren R. Francis1, Michael Eitel1, Sergio Vargas1, Marcin Adamski2,
Steven H.D. Haddock3, Stefan Krebs4, Helmut Blum4, Gert Wörheide1,5,6

1 Department of Earth and Environmental Sciences, Paleontology and Geobiology, Ludwig-Maximilians-Universität München, Richard-Wagner Straße 10, 80333 Munich, Germany
2 Research School of Biology, College of Medicine, Biology & Environment, Australian National University, Canberra ACT 0200 Australia
3 Monterey Bay Aquarium Research Institute, Moss Landing, CA 95039, USA
4 Laboratory for Functional Genome Analysis (LAFUGA), Gene Center, Ludwig-Maximilians-University Munich, Munich, Germany
5 GeoBio-Center, Ludwig-Maximilians-Universität München, Munich, Germany
6 Bavarian State Collection for Paleontology and Geology, Munich, Germany

4 Abstract
5 Porifera are a diverse animal phylum with species performing important ecological roles in aquatic ecosys-
6 tems, and have become models for multicellularity and early-animal evolution. Demosponges are the largest
7 class of sponges, but previous studies have relied on the only draft demosponge genome, that of Amphimedon
8 queenslandica. Here we present the 125-megabase draft genome of the contractile laboratory demosponge
9 Tethya wilhelma, sequenced to almost 150x coverage. We explore the genetic repertoire of transporters, re-
10 ceptors, and neurotransmitter metabolism across early-branching metazoans in the context of the evolution
11 of these gene families. Presence of many genes is highly variable across animal groups, with many gene
12 family expansions and losses. Three sponge classes show lineage-specific expansions of GABA-B receptors,
13 far exceeding the gene number in vertebrates, while ctenophores appear to have secondarily lost most genes
14 in the GABA pathway. Both GABA and glutamate receptors show lineage-specific domain rearrangements,
15 making it difficult to trace the evolution of these gene families. Gene sets in the examined taxa suggest that
16 nervous systems evolved independently at least twice and either changed function or were lost in sponges.
17 Changes in gene content are consistent with the view that ctenophores and sponges are the earliest-branching
18 metazoan lineages and provide additional support for the proposed ParaHoxozoa clade.

19 Introduction
20 The presence of neurons is a defining character of animals, and is symbolic of their alleged superiority over all
21 other life on earth. Nonetheless, the four non-bilaterian phyla, Porifera, Placozoa, Ctenophora and Cnidaria,
22 are most different from other animals in their sensory systems and are often considered “lower” animals in
23 common parlance. Indeed, animals such as corals and sponges appear immobile or often unresponsive, chal-
24 lenging early theorists in their ideas of what is and is not an animal. Yet we now know that representatives
25 from all four non-bilaterian phyla demonstrate dynamic responses to outside stimuli.
26

27 Neural evolution has been discussed previously in the context of paleontology (reviewed in [Wray et al.,
28 2015]) and metazoan phylogeny (reviewed in [Jékely et al., 2015]). Indeed, it has been suggested that many
29 features of bilaterian neurons and nervous systems represent separate, parallel evolutionary events from a
30 “simple” nervous system. A simple nervous system then must arise from proto-neurons [Schierwater et al.,
31 2009]; however, it is unclear what such proto-neurons might look like.
32

33 Several qualities can be used to define neurons or proto-neurons [Leys, 2015, Nickel, 2010] such as synapses,
34 electrical excitability, membrane potential, or secretory functions, though no single quality (and ultimately
35 gene set) solely defines such cells as neurons. Two non-bilaterian groups, ctenophores and cnidarians, are
36 thought to have true neurons. When considering the remaining two non-bilaterian phyla, sponges and
37 placozoans, many components of neural cells are found without any neuron-like cells having been identi-
38 fied [Srivastava et al., 2010, Riesgo et al., 2014a, Leys, 2015], although synapse-like structures have been
39 identified in placozoan fiber cells that show vesicles close to an osmophile contact [Grell and Benwitz, 1974].
40

41 Comparative analyses revealed a gradient of neural-like qualities, indicating that “neuron-or-not” classi-
42 fications are not straightforward. While ctenophores, cnidarians, and bilaterians have true neurons, struc-
43 tural and biochemical differences [Moroz et al., 2014, Moroz, 2015] led to the proposition that neurons in
44 ctenophores and cnidarians may not be homologous, but may instead be separate evolutionary outcomes from neural-
45 like precursor cells. Potentially, if neurons evolved independently, they are “easy” to evolve, since doing so
46 involves co-expression of various pan-metazoan genetic modules in the same cell type. Alternatively, early
47 rudimentary signaling systems may have been energetically costly and not especially useful in pre-Cambrian
48 oceans, and in such cases, it may have been comparatively easy to lose such genes and with them neuronal-
49 type cells.
50

51 Interpretation of neural evolution requires an accurate metazoan phylogeny, and the phylogenetic relation-
52 ships of early-branching metazoans have been a topic of continued controversy. Some analyses support the
53 traditional phylogenetic position of sponges as sister group to all other metazoans (“Porifera-sister”) [Philippe
54 et al., 2009, Pick et al., 2010, Nosenko et al., 2013, Pisani et al., 2015, Simion et al., 2017], while others suggest
55 that Ctenophora are the sister group to all other animals (“Ctenophora-sister”) [Dunn et al., 2008, Ryan
56 et al., 2013, Whelan et al., 2015], and some analyses also recover the classical view, a Coelenterata clade
57 uniting Cnidaria and Ctenophora [Philippe et al., 2009]. Importantly, phylogenomic analyses can be prone
58 to systematic artifacts under some circumstances, depending on taxon sampling [Pick et al., 2010, Philippe
59 et al., 2011], gene set [Nosenko et al., 2013], phylogenetic model [Pisani et al., 2015], or use of nucleotides
60 instead of proteins [Jarvis et al., 2014]. Other methods based on presence or absence of the genes themselves
61 have been proposed to provide a sequence-independent inference of phylogeny [Ryan et al., 2010, Ryan et al.,
62 2013, Pisani et al., 2015], relying on the assumption that gene loss is a rare event. However, non-bilaterians
63 have the additional problem that basic knowledge of many aspects of their biology is absent [Dunn et al.,
64 2015], and so the biological context that may separate or unite groups is limited.
65

66 In the context of phylogeny, the branching order critically affects whether neurons evolved multiple times
67 or were lost (see schematic in Figure 1). Given the gradient of neural-like qualities, the actual evolutionary
68 scenario may lie somewhere between a simple gain and a simple loss of neurons. While some previous studies have
69 focused on neural evolution in ctenophores [Ryan et al., 2013, Alberstein et al., 2015, Li et al., 2015] or
70 on genomic data from A. queenslandica [Krishnan et al., 2014], these alone do not provide a
71 comprehensive picture of all animals.
72

73 Here we have sequenced the genome of the contractile laboratory demosponge Tethya wilhelma [Sara
74 et al., 2001] and examined the protein repertoire in the context of genes mediating contraction and
75 other neural-like functions. Many metabolic genes show unique expansions in different sponge clades, as well
76 as other phyla, making it challenging to clearly assign functions based on similarity to human proteins. We
77 consider these expansions in the context of phylogenetics, showing that even though sponges lack neurons,
78 signaling pathways have still expanded. This gives support to the hypothesis that early neural-like cells have
79 become neurons multiple times in the history of animals.

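[Figure 1 image: four tree schematics (panels A-D) of Bilateria, Cnidaria, Placozoa, Porifera, and Ctenophora, marking lineages that have neurons and the inferred gains and losses.]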

Figure 1: Schematic of neural evolution depending on metazoan phylogeny. The presence of neu-
rons or neural-like cells in ctenophores, cnidarians, and bilaterians can be viewed differently depending on
the phylogeny. Two different metazoan phylogenies based on recent multi-gene phylogenetic analyses are
the source of the Porifera-sister (A,B) [Philippe et al., 2009, Pisani et al., 2015, Simion et al., 2017] and
Ctenophora-sister (C,D) [Ryan et al., 2013, Whelan et al., 2015] scenarios. Neurons can either have evolved
once requiring a secondary loss in sponges, placozoans, or both (A,C), or evolved twice, in ctenophores and
in cnidarians/bilaterians (B,D).
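
The event counting behind the scenarios in Figure 1 can be illustrated with a small parsimony calculation. The sketch below is not part of the original analysis: it applies Fitch's small-parsimony algorithm to two simplified topologies that are assumptions made here for illustration (Placozoa placed as sister to Cnidaria plus Bilateria in both), with neuron presence coded as 1 and absence as 0 from the main text. Fitch parsimony reports only the minimum number of state changes; it does not say whether a change is a gain or a loss.

```python
# Minimal sketch: Fitch small parsimony on nested-tuple binary trees.
# The topologies below are simplified assumptions, not trees from the paper.

def fitch(tree, states):
    """Return (possible_states, min_changes) for a nested-tuple binary tree."""
    if isinstance(tree, str):
        # Leaf: the observed state is the only possibility, no changes yet.
        return {states[tree]}, 0
    left, right = tree
    left_set, left_changes = fitch(left, states)
    right_set, right_changes = fitch(right, states)
    shared = left_set & right_set
    if shared:
        # Children agree on at least one state: no extra change needed.
        return shared, left_changes + right_changes
    # Children disagree: take the union and count one change.
    return left_set | right_set, left_changes + right_changes + 1

# 1 = neurons present, 0 = neurons absent (as described in the text).
neurons = {"Bilateria": 1, "Cnidaria": 1, "Ctenophora": 1,
           "Placozoa": 0, "Porifera": 0}

# Hypothetical simplified topologies for the two competing hypotheses.
porifera_sister = ("Porifera",
                   ("Ctenophora", ("Placozoa", ("Cnidaria", "Bilateria"))))
ctenophora_sister = ("Ctenophora",
                     ("Porifera", ("Placozoa", ("Cnidaria", "Bilateria"))))

for name, tree in [("Porifera-sister", porifera_sister),
                   ("Ctenophora-sister", ctenophora_sister)]:
    _, changes = fitch(tree, neurons)
    print(name, "minimum changes:", changes)
```

Under these assumed topologies both hypotheses need at least two changes, which is consistent with the caption: each tree can be explained either by one gain plus one loss or by two independent gains.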

80 Results
81 Genome assembly and annotation
82 We generated a total of 61 gigabases of paired-end reads from a whole specimen of T. wilhelma (Figure 2)
83 and all associated bacteria. Because of a close association with microbes, some contigs were expected to
84 have derived from bacteria, as many reads have unexpectedly high GC content (Supplemental Figures 1-4).
85 After assembly and removal of bacterial contigs, the final assembly was 125 Mb, similar to that of A. queenslandica,
86 with an N50 value of 70 kb (Supplemental Table 1). Gene annotation combined a
87 deeply sequenced RNAseq library from an adult sponge with ab initio gene predictions. Because of the high
88 density of genes, extensive manual curation was often necessary to correct genes on the same strand that
89 were erroneously merged. After correction and filtering of the ab initio predictions, we counted 37,416
90 predicted genes, comparable with the counts in A. queenslandica (40,122) [Fernandez-Valverde et al., 2015]
91 and S. ciliatum (40,504) [Fortunato et al., 2014].
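
For readers unfamiliar with the N50 statistic quoted above, the following sketch shows how it is conventionally computed from a list of contig or scaffold lengths. It illustrates the standard definition only; the function name and the example lengths are made up here and are not taken from the T. wilhelma assembly.

```python
# Minimal sketch of the conventional N50 definition: the length of the
# contig at which the running total of lengths, sorted from longest to
# shortest, first reaches half of the total assembly size.

def n50(lengths):
    """Return the N50 of an iterable of contig lengths."""
    lengths = list(lengths)
    if not lengths:
        raise ValueError("no contig lengths given")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length

# Toy example with invented lengths (arbitrary units).
example_lengths = [2, 2, 2, 3, 3, 4, 8, 8]
print(n50(example_lengths))
```

In the toy example the total is 32, and the two longest contigs (8 + 8) already cover half of it, so the N50 is 8.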
92 General trends in splice variation were similar between T. wilhelma and A. queenslandica (Supplemental
93 Tables 2 and 3), suggesting similar underlying biology or genome structure. One-to-one orthologs from T.
94 wilhelma and A. queenslandica had relatively low identity (Supplemental Figure 5), with the average identity
95 of 57.8%, showing a high genetic diversity within Porifera. The average identity is lower when compared to
96 S. ciliatum (49.7%), N. vectensis (53.5%) and human (52.0%), which is not surprising given that A. queens-
97 landica and T. wilhelma are both demosponges. Although both genomes are too fragmented to find syntenic
98 chromosomal regions, ordered blocks of genes are still identifiable between T. wilhelma and A. queenslandica
99 (Supplemental Figure 6), though not with S. ciliatum.

Figure 2: Contraction of a normal specimen. (A) Maximally expanded state of Tethya wilhelma. (B)
The same specimen approximately an hour later in the most contracted state. (C) Cartoon view of most
contracted versus expanded state. Scale bar applies to all images. Photos courtesy of Dan B. Mills.
100

101 Neurotransmitter metabolism across early-branching metazoans


102 Compared to most other metazoans, sponges have a limited set of behaviors (contraction, closure of osculum
103 or choanocyte chambers to control flow), yet respond to many signaling molecules present in bilaterians [Ell-
104 wanger and Nickel, 2006, Ellwanger et al., 2007]. Some genes involved in vertebrate-like neurotransmitter
105 metabolism have been found in sponges [Riesgo et al., 2014a, Krishnan and Schiöth, 2015], although many
106 display a sister-group relationship to homologs found in other animals and appear to have a complex evo-
107 lutionary history with duplications in sponges and other non-bilaterian animals (Figure 3, Supplemental
108 Figures 7-9), making the prediction of their functions difficult. Thus, declaring presence or absence of any
109 individual gene or genetic module is not correct in the strictest sense, since these proteins are often many-to-
110 many orthologs to human proteins with known functions, and it is not possible to computationally predict
111 which, if any, of the sponge orthologs shares its function with the human protein.
112

113 For instance, biosynthesis of monoamine neurotransmitters (dopamine, serotonin, etc.) requires two
114 enzymes, tryptophan hydroxylase and tyrosine hydroxylase. These two enzymes appear to have arisen in
115 bilaterians from duplications of an ancestral phenylalanine hydroxylase [Cao et al., 2010], though evidence
116 is lacking as to whether this ancestral protein had multiple functions that specialized after duplication (sub-
117 functionalization) or developed new functions (neofunctionalization) post-duplication. The absence of these
118 proteins in non-bilaterians seems to be ancestral; in other words, they had not evolved yet when these groups
119 split and diversified.
120

121 Among other non-bilaterians, some monoamine neurotransmitters are found in cnidarians [Carlberg and
122 Rosengren, 1985], but are mostly absent in ctenophores (or at detection limit) [Moroz et al., 2014]. Indeed,
123 previous studies were unable to find homologs of DOPA decarboxylase (AADC, Supplemental Figure 8),
124 dopamine β-hydroxylase (DBH, Supplemental Figure 7), monoamine oxidase (MAO, Supplemental Fig-
125 ure 9), or tyrosine hydroxylase (TH) in the genome of the ctenophore M. leidyi or any available ctenophore
126 transcriptome, and it was suggested that some of these proteins were absent in sponges as well (see Supple-
127 mentary Tables 17 and 19 in [Ryan et al., 2013]). However, we found orthologs of MAO and homologs of
128 AADC and DBH in several sponges, though it is unclear if they perform the same function as the human
129 proteins. Additionally, homologs of four enzymes, AADC, MAO, DBH, and ABAT, are present in single-
130 celled eukaryotes but not ctenophores, implying a secondary loss of these protein families in this phylum.

[Figure 3 image: (A) pathway diagram of glutamate and GABA metabolism; (B) pathway diagram of monoamine metabolism; (C) presence-absence table for the enzymes ABAT, GAD, GS, GATP, SSDH, GLUD, KYAT, TAT, HPD, HGD, PAH, TH, AADC, DBH, PNMT, MAO, and COMT across Plant, Filasterea, Choano, Cteno, Demo, Hexact, Hsm, Calcarea, Placozoa, Cnidaria, and Bilateria. Legend: Present, Homolog, Loss, Absent, Absent in transcriptome; markers for enzymes that require O2, produce NADH, or use VitC as a cofactor.]

Figure 3: Neurotransmitter overview across metazoans. Summary schematic of neurotransmitter
biosynthesis and degradation pathways across early-branching metazoans. Each of the four non-bilaterian
groups is presumed to be monophyletic, although some individual trees of genes or gene families may display
alternate topologies. Bold letters refer to the enzymes. Individual gene trees that display the orthology of the
clades are found in the supplemental information. Arrows are shown in one direction though many reactions
can be reversible. (A) Glutamate and GABA metabolism from the citric acid cycle through the “GABA
shunt” pathway. Because ABAT is absent in ctenophores, GLUD and TAT are potentially alternatives to
convert α-ketoglutarate to glutamate. GATP can convert GABA to succinate semialdehyde, but this enzyme
was only found in some demosponges and plants. (B) Monoamine metabolism, excluding tryptophan. (C)
Table of presence-absence for genes in parts A and B. Presence (green) refers to a 1-to-1 ortholog where
orthology is clear from the tree position. Homolog (blue) refers to a sister group position in trees before
duplications with different or unknown functions. Secondary loss (red) refers to the gene missing in the
clade, but homologs are found in non-metazoan phyla. Numbers inside the boxes indicate copy number
specific to that group; M refers to multiple duplications within the group where the copy number is variable
among species. Abbreviations for clades are as follows: Cteno, ctenophores; Hsm, homoscleromorphs; Demo,
demosponges; Hexact, hexactinellids; Choano, choanoflagellates.
131

132 GABA receptors


133 The neurotransmitter gamma-amino butyric acid (GABA) has been shown to affect contraction in T. wil-
134 helma [Ellwanger et al., 2007] and the freshwater sponge E. muelleri [Elliott and Leys, 2010]. The genome of
135 T. wilhelma contains metabotropic receptors (GABA-B, mGABARs), but not ionotropic GABA receptors
136 (GABA-A, iGABARs). While humans have two mGABARs and the ctenophore M. leidyi has only one, the
137 T. wilhelma genome has nine. Sponges appear to have undergone a large expansion of this protein family
138 (Figure 4), similar to the expansion of glutamate-binding GPCRs previously observed in sponges [Krishnan
139 et al., 2014]. Based on the structure of the binding pocket of human GABAR-B1 [Geng et al., 2013], many
140 differences are observed across the mGABAR protein family, even showing that many residues involved in
141 coordination of GABA are not conserved between the two human proteins or all other animals (Supple-
142 mental Figure 11). Contrary to previous reports [Ramoino et al., 2010], we were unable to find normal
143 mGABARs in the two calcareous sponges S. ciliatum and L. complicata. Instead, in these two species, the
144 best BLAST hits from human GABA-B receptors (the putative mGABARs) had the best reciprocal hits to
145 Insulin-like growth-factor receptors. Structurally, this was due to the normal seven-transmembrane domain
146 being swapped with a C-terminal protein kinase domain (Figure 5), meaning these are not true metabotropic
147 GABA receptors. Similarly, in the filasterean Capsaspora owczarzaki, the N-terminal ligand binding domain
148 is also exchanged with other domains, suggesting as well that these are not true metabotropic GABA recep-
149 tors.
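
The reciprocal best-hit reasoning used above for the calcisponge candidates can be sketched in a few lines. This is an illustration only and not the authors' pipeline: the hit tuples, sequence identifiers other than the human gene names, and the bit scores are invented here. It assumes each search has already been reduced to (query, subject, bit score) rows, for example from a tabular BLAST report.

```python
# Minimal sketch of a reciprocal best-hit check. All rows below are
# hypothetical; "calc_A" and "demo_A" are made-up sequence names.

def best_hits(rows):
    """Map each query to its highest-scoring subject."""
    best = {}
    for query, subject, score in rows:
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return {query: subject for query, (subject, _) in best.items()}

def reciprocal_best_hits(forward_rows, reverse_rows):
    """Keep query/subject pairs that are each other's best hit."""
    forward = best_hits(forward_rows)
    reverse = best_hits(reverse_rows)
    return {q: s for q, s in forward.items() if reverse.get(s) == q}

# Hypothetical calcisponge case: the candidate's best human hit is a
# receptor tyrosine kinase, so the pair is not reciprocal.
forward_calc = [("GABR1_HUMAN", "calc_A", 150.0),
                ("GABR1_HUMAN", "calc_B", 90.0)]
reverse_calc = [("calc_A", "IGF1R_HUMAN", 400.0),
                ("calc_A", "GABR1_HUMAN", 150.0)]

# Hypothetical demosponge case: the best hits agree in both directions.
forward_demo = [("GABR1_HUMAN", "demo_A", 300.0)]
reverse_demo = [("demo_A", "GABR1_HUMAN", 300.0),
                ("demo_A", "GABR2_HUMAN", 250.0)]

print(reciprocal_best_hits(forward_calc, reverse_calc))
print(reciprocal_best_hits(forward_demo, reverse_demo))
```

In the first case the result is empty, mirroring the situation described in the text where a high-scoring kinase domain redirects the reverse search to growth-factor receptors. A reciprocal hit on its own does not establish orthology; the text relies on domain structure and gene trees as well.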
150

151 Glutamate receptors


152 Glutamate is of particular interest as it is a key metabolic intermediate and the main excitatory neurotrans-
153 mitter in animal nervous systems, acting on two types of receptors: the metabotropic glutamate receptors
154 (mGluRs) and the ionotropic ones (iGluRs). Some sponge species possess iGluRs, though these receptors
155 were absent in the transcriptomes of several demosponges [Riesgo et al., 2014a]. We were unable to find
156 iGluRs in the genome of T. wilhelma, in the genome and transcriptomes of any other demosponge, or in the
157 genomes of two choanoflagellates (M. brevicollis and S. rosetta). The top BLAST hits in demosponges have
158 a GPCR domain instead of the ion channel domain, indicating that these are not true iGluRs (Supplemental
159 Figure 13). Because the domain structure in plants is the same as in most animal iGluRs, the ligand-binding
160 domain appears to have been swapped out in demosponges.
161

162 The Homoscleromorpha/Calcarea clade appears to have an independent expansion of iGluRs (Supplemen-
163 tal Figure 12), though the normal ion transporter domain is replaced by an SBP-bac-3 domain (PFAM
164 domain PF00497) relative to all other iGluRs (Supplemental Figure 13). Additionally, ctenophores and
165 placozoans appear to have dramatic expansions of this protein family as well [Ryan et al., 2013, Moroz
166 et al., 2014, Alberstein et al., 2015], suggesting that a small set of iGluRs was present in the common ancestor
167 of eukaryotes and has diversified multiple times in both plants and animals, while other clades appear to
168 have modified or lost these proteins.
169

170 Vesicular transporters


171 Secretory systems are a common feature of all eukaryotes, as most cells have endoplasmic reticulum to secrete
172 proteins or make membrane proteins. Neurons secrete peptides (conceptually identical to any other protein)
173 or small-molecule neurotransmitters in a paracrine fashion, specifically to other neural cells. Unlike
174 peptides, small-molecule neurotransmitters need to be loaded into vesicles by dedicated transport proteins.
175 Vesicular glutamate transporters (VGluTs, SLC17A6-8) are part of a superfamily of transporters [Sreedha-
176 ran et al., 2010] that carry glutamate, aspartate, and nucleotides. The position of sponge proteins in the tree
177 is inconsistent with a clear role in glutamate transport (Supplemental Figure 14), as several sponge clades
178 and ctenophores occur as sister group to multiple duplications. Transporters in sponges, ctenophores, and
179 choanoflagellates may well act upon glutamate or other amino acids, but this needs to be experimentally
180 investigated.
181

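[Figure 4 image: protein tree of GABA-B type receptors. Labelled clades include protostome and vertebrate GABAR-B groups, Placozoans (2), Cnidarians (9), Filasterea (1), Ctenophores (2), Homoscleromorphs (1), Hexactinellids (4), and Demosponges (4). Markers: *, **, ++. Legend: dots for bootstrap support BS>80 and BS>90.]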

Figure 4: Metabotropic GABA receptors (GABA-B type) across metazoans. Protein tree generated
with RAxML. Numbers in parentheses indicate the number of species from that group, so the 46 demosponge
mGABARs come from 4 species. Key bootstrap values are summarized as yellow or gray dots, for values of
90 or more, or 80 or more, respectively. Single star indicates sequences annotated as mGABARs in [Krishnan
et al., 2014]; double plus indicates the clade annotated as “sponge specific expansion” in [Krishnan et al.,
2014]. For complete version with protein names and all bootstrap values, see Supplemental Figure 10.

182 Similar to glutamate, GABA is loaded into vesicles with the vesicular inhibitory amino acid transporter
183 (VIAAT). Ctenophores, sponges, and placozoans lack one-to-one homologs of VIAAT (Supplemental Fig-
184 ure 15). Several other transporters are thought to transport GABA (ANTL or SLC6 class) and many other
185 amino acids. SLC6-class transporters, which transport diverse amino acids, are found in all non-bilaterian
186 groups, so the function of VIAAT may be redundant.

187 Glycine receptors


188 Glycine is known to affect the contraction of T. wilhelma [Ellwanger and Nickel, 2006]. Some ctenophore
189 iGluRs have been shown to bind glycine [Alberstein et al., 2015] due to the substitution of serine for arginine
190 (S687 in human GluN1), though this appears to be specific only to ctenophores, as essentially all other
191 iGluRs have the conserved serine/threonine at this position. Because no ionotropic glycine receptors could
192 be identified in the T. wilhelma genome (or any other sponges, ctenophores or placozoans), other proteins
193 may be responsible for mediating this effect.
194

[Figure 5 image: domain diagrams, with scale in amino acids, for vertebrate mGABARs (GABR1_HUMAN, GABR1_MOUSE, GABR2_HUMAN, GABR2_MOUSE), human receptor tyrosine kinases (INSRR_HUMAN, IGF1R_HUMAN), calcisponge mGABAR-like proteins from Sycon ciliatum and Leucosolenia complicata, and proteins from placozoans, ctenophores, demosponges, hexactinellids, a homoscleromorph, and the filasterean Capsaspora owczarzaki. Legend domains: Signal peptide, ANF_receptor, Sushi, Recep_L_domain, 7-transmembrane, TIG, Furin-like, Tyrosine kinase, Laminin G3, fibronectin3.]

Figure 5: Domain organization of GABA-B type receptors across metazoans. Scale bar displays number of
amino acids. Top reciprocal BLAST hits to human for putative mGABARs
in calcareous sponges are INSRR and IGF1R, due to high-scoring hits to the tyrosine kinase domain. All
mGABARs from demosponges, glass sponges, and the homoscleromorph Oscarella carmela share the 7-
transmembrane domain (green) with mGABARs from other animals, while calcisponge proteins have the
same ligand-binding domain (red) but instead have a protein tyrosine kinase domain (purple) at the C-
terminus, similar to growth-factor receptors. The filasterean Capsaspora owczarzaki has alternate domains
at the N-terminus.

195 Mechanical receptors


196 Some sponges can contract in response to mechanical agitation, as reported for the demosponges E. muel-
197 leri [Elliott and Leys, 2007] and T. wilhelma [Nickel, 2010]. Several diverse protein families appear to
198 be responsible for the sense of touch [Árnadóttir and Chalfie, 2010]. A subgroup of the TRP (transient
199 receptor-potential) channels, TRP-N, thought to mediate mechanosensation was determined to be absent
200 in sponges [Schuler et al., 2015], and we were unable to identify any in either T. wilhelma or S. ciliatum,
201 although other TRP-class channels were found [Ludeman et al., 2014, Schuler et al., 2015]. Because the
202 mechanosensory function of TRP channels may be redundant, we searched for PIEZO, a
203 280 kDa trimeric protein [Ge et al., 2015] involved in touch sensation in mammals [Coste et al., 2012]. Al-
204 though two homologs were found in vertebrates, we found one copy in all other animals (Supplemental
205 Figure 16) as well as fungi, plants and most other eukaryotic groups, suggesting an ancient and conserved
206 function of this protein.
207

208 Voltage-gated channels


209 Voltage-gated ion channels are necessary for the propagation of electrical signals down axons and dendrites
210 [Zakon, 2012, Moran et al., 2015], and have specificities for sodium, potassium, or calcium. Previous analyses
211 were unable to clearly identify potassium or sodium channels in sponges [Liebeskind et al., 2011]; only one
212 partial potassium channel was found in the transcriptome of the homoscleromorph Corticium candelabrum
213 [Riesgo et al., 2014a, Li et al., 2015]. We were unable to find any voltage-gated sodium or potassium channels
214 in the genome or transcriptome of any sponge. We then examined voltage-gated hydrogen channels (hvcn1), as
215 these proteins have been found in a number of single-celled eukaryotes [Smith et al., 2011], and are extremely
216 conserved. These channels were found in all sponge groups, although the high protein identity resulted in a
217 poorly-resolved tree (Supplemental Figure 19).
218 Reports of action potentials in hexactinellids [Leys et al., 1999, Leys et al., 2007, Nickel, 2010] showed that
219 sponge action potentials were inhibited by divalent cations [Leys et al., 1999], suggesting a role of calcium
220 channels instead. Because voltage-gated sodium and calcium channels arose from a duplication event [Gur
221 Barzilai et al., 2012], the ion selectivity may be variable within this protein family. Most sponges have only a
222 single CaV-channel (Supplemental Figure 18) and several Hv-channels, and no voltage-gated channel of any
223 kind was found in any glass sponge. However, all glass sponge sequences are from transcriptomes; therefore,
224 either the expression level of the true channels is low in glass sponges, or they have independently evolved
225 another mechanism to propagate action potentials.
226

227 Discussion
228 Gene content variation of metazoa
229 Among the thousands of genes in the genome, we focused on genes that may be mediating contractile be-
230 havior in T. wilhelma, and the interactions of those genes within broader metabolic pathways. Many of the
231 “housekeeping” genes in our study have lineage-specific duplications in at least one animal phylum. Consid-
232 ering the importance of “single-copy” proteins in phylogenetic analyses, as taxon sampling improves, it may
233 be found that very few or no genes are single copy across most or all animal phyla. Many other genes that
234 are critical for neural functioning in bilaterians have independent losses in other animal lineages (Figure 6).
235

Gene family expansions


Demo/Hexact Calcarea/Hsm Ctenophores Placozoans Cnidarians
mGluRs iGluRs iGluRs iGluRs VGluT
mGABARs DBH-like * Kv-channels mGABARs VIAAT
MAO/PAOX-like Hv-channels * DBH-like MAO/PAOX-like
DBH-like NaV-channels DBH-like

Gene family losses


iGluRs mGABARs *# ABAT MAO -
VIAAT DBH-like * MAO
AADC-like NaV-channels DBH-like
NaV-channels Kv-channels AADC-like
Kv-channels

Figure 6: Summary of gene expansions and losses.
Demo/Hexact refers to the clade of demosponges and glass sponges. Calcarea/Hsm refers to the unnamed
clade of calcareous sponges and homoscleromorphs. The star indicates that expansion or loss was found in
one class but not the other. Number sign indicates the domain rearrangement in calcisponges.

236 Glutamate and GABA receptor evolution


237 There is stark contrast in the relative abundance of mGABARs and iGluRs in sponges and ctenophores.
238 The relative dearth of mGABARs in ctenophores may reflect the apparent absence of amino-butyrate
239 amino-transferase (ABAT) in ctenophores, suggesting that ctenophores use an alternate pathway to produce
240 glutamate or metabolize GABA (Figure 3), rarely use GABA as a neurotransmitter, or simply are missing
241 this pathway. Other enzymes such as GLUD or TAT may perform some of the exchange between
242 α-ketoglutarate and glutamate, particularly as ctenophores have two copies of GLUD while most animals
243 have only one. Ctenophores also have multiple (variable) copies of glutamate synthase and three copies of
244 KYAT, one of which may serve to balance glutamate metabolism in these animals.
245

246 There are two explanations for the diversity of mGABARs in sponges. Given the high variability of
247 amino acids in the mGABAR binding pocket (Supplemental Figure 11), it is plausible that many of these
248 receptors do not bind GABA at all, and have diversified for other ligands. There is precedent for this as it
249 was shown that the independent expansion of ctenophore iGluRs also included several key mutations to the
250 binding pocket which changed the ligand specificity of these proteins [Alberstein et al., 2015]. For the other

251 hypothesis, all of the receptors could bind GABA, essentially mediating the same contraction signal, but
252 their kinetics could differ and be influenced by factors such as, for instance, temperature. Because sponges
253 are mostly immobile, they often can be subject to environmental variation in terms of light, oxygen, and
254 temperature. The possession of a set of proteins capable of triggering the same response (e.g. contraction)
255 with varying daily or seasonal environmental conditions (e.g. temperature) would be beneficial and may
256 explain the diverse set of receptors observed in sponges. Experimental characterization of these binding
257 domains is necessary and may even show that a combination of these hypotheses explains the diversification
258 of mGABARs in Porifera.
259

260 The apparent absence of true mGABARs in calcareous sponges (the genome of S. ciliatum and transcrip-
261 tome of L. complicata) conflicts with a previous study that identified key proteins in the GABA pathway
262 by immunostaining [Ramoino et al., 2010]. The best mGABAR BLAST hits found in the two calcisponges
263 display a conserved ligand binding domain but the seven-transmembrane domain has been swapped with
264 a tyrosine kinase domain (Figure 5). Structural similarity of the conserved N-terminal domain may result
265 in a false-positive signal in studies using immunostaining with standard antibodies [Ramoino et al., 2010].
266 On the other hand, compared to ctenophores, which apparently lack ABAT, this enzyme was found in both
267 of the calcareous sponges analyzed. Thus it would be surprising if these sponges had no capacity to create
268 or respond to GABA. Since true vertebrate-like mGABARs are found in all other sponge classes, and our
269 study could only examine two calcareous sponges, it could be that mGABAR presence is variable in this
270 class. The genome of S. ciliatum contains 40 proteins annotated as mGluRs [Fortunato et al., 2014], so a
271 third possibility is that even in the absence of true mGABARs, some of these proteins may have evolved
272 affinity for GABA and mediate its signaling in calcareous sponges.
273

274 Although a putative iGluR was identified in the transcriptome of the demosponge Ircinia fasciculata,
275 this sequence was only a fragment, so the glutamate affinity and domain structure could not be determined.
276 As with the mGABARs, the domain structure is different between the sponge classes. Otherwise, it appears
277 that only calcareous sponges and homoscleromorphs have NMDA/AMPA-like iGluRs. The presence of these
278 proteins in plants and single-celled eukaryotes suggests that at least iGluRs were present in the common
279 ancestor of all eukaryotes, and their absence in demosponges is likely the product of secondary losses.
280 In the context of contractions of T. wilhelma, the abundance of mGluRs and mGABARs could plausibly
281 work in antagonistic ways via the action of different G-proteins, making ionotropic channels unnecessary
282 for the modulation of this behavior.
283

284 Variation in neurotransmitter metabolism


285 Many of the oxidative enzymes in the monoamine pathway require molecular oxygen, suggesting an important
286 role of this molecule in both the synthesis of the neurotransmitters (with PAH, TH, and DBH) and
287 their inactivation (with MAO). Two catabolic pathways arise from tyrosine (Figure 3) and require oxygen
288 at nearly all steps. It is unclear why intermediate products of one of these two, the catecholamine pathway,
289 became neurotransmitters and the other did not, particularly as the hydroxyphenylpyruvate pathway is universally
290 found in animals and catabolic intermediates are likely to be universal.
291

292 MAO was found in most animal groups, but we were unable to find any in placozoans or ctenophores.
293 The topology of the MAO phylogenetic tree suggests a secondary loss of this protein in these phyla (Sup-
294 plemental Figure 9). Related genes (PAOX, polyamine oxidase) were found in placozoans with several
295 placozoan-specific duplications, and again, potentially one of these may catalyze the oxidation of aromatic
296 amines. The analysis of these proteins also uncovered a clade including sponges, cnidarians, and lancelets,
297 though the function of these proteins cannot be predicted based on homology searches. In vitro characterization
298 of these enzymes may reveal their function and provide evidence as to how they could have been
299 important for metabolism in early animals, and were subsequently replaced or lost in most other metazoan
300 lineages.
301

302 Remarkably, the DBH group has independent expansions in three sponge classes as well as placozoans

303 and cnidarians (Supplemental Figure 7). No DBH homologs were identified in calcareous sponges or in
304 ctenophores. A putative homolog of this group was found in the choanoflagellate M. brevicollis but not in
305 any other non-metazoan. The alignment and the phylogenetic position of the M. brevicollis protein suggest
306 that it may be a member of the copper-binding oxygenase superfamily, rather than a true homolog of DBH
307 (see Supplemental Alignment).
308

309 The presence of DBH-like and AADC-like enzymes in most animal groups suggests the possibility to
310 make phenylethanolamines (like octopamine or noradrenaline) from tyrosine, and then subsequently inac-
311 tivate them with MAO. All demosponges appear to lack AADC, and ctenophores appear to lack both of
312 these enzymes, calling into question a previous report of the detectability of monoamine neurotransmitters
313 in ctenophores [Carlberg and Rosengren, 1985].
314

315 Conserved properties of neurons


316 Neurons are defined by the presence of five key aspects: membrane potential, voltage-gated ion channels,
317 secretory pathways, ligand-gated ion channels, and cell-cell junctions to form synapses. Voltage-gated chan-
318 nels, secretory systems, and ligand-gated ion channels are thoroughly discussed above. Membrane potential
319 is maintained in animal cells by sodium-potassium pumps (ATP-ases), which are a class of cation pumps
320 exclusively found in animals [Stein, 1995, Sáez et al., 2009]. It is thought that such pumps are necessary
321 because animals are the only multicellular group that lacks any kind of cell wall, thus careful control of
322 ionic balance is necessary to resist osmotic stress [Stein, 1995]. For non-bilaterian animals, cell layers were
323 in direct contact with water, so potentially all cells needed this protein to function normally. Therefore
324 having neuron-like functionality is unlikely to rest upon the gain or loss of this gene. The last feature is
325 the presence of cell-cell connections. Many proteins involved in synapse structure or neurotransmission are
326 found in sponges [Srivastava et al., 2010, Riesgo et al., 2014a, Moran et al., 2015, Leys, 2015], though it is not
327 clear which genes are necessary for neural functioning, or may have evolved independently.

328 Neural evolution and losses


329 Based on recent phylogenies, both Porifera-sister and Ctenophora-sister evolutionary scenarios require either
330 at least one loss of neurons or two independent gains (Figure 1) of this cell-type. The only scenario that
331 allows for a single evolution of neurons and no losses is the “Coelenterata” hypothesis (reviewed in [Jékely
332 et al., 2015]), which joins cnidarians and ctenophores in a clade. However, many molecular datasets [Dunn
333 et al., 2008, Ryan et al., 2013, Whelan et al., 2015, Pisani et al., 2015] and morphological evidence [Harbison,
334 1985] argue against this scenario (but also see [Philippe et al., 2009] and [Simion et al., 2017]). One other
335 alternative is that placozoans have an unidentified neuron-like cell in a Porifera-sister context, which would
336 therefore allow for a single origin of neural systems in animals and no losses.
337

338 What do the two different scenarios mean for evolution of neuronal cells? Considering the basic properties
339 of neurons related to electrical signaling or secretory pathways, it has been shown previously that many of the
340 genes involved are universally found in animals. A single origin and multiple losses implies that the genetic
341 toolkit necessary for all of these functions was present in the same single-celled organism or the same cell
342 type (a hypothetical proto-neuron) of the last common ancestor of crown-group Metazoa, and either that
343 cell type was lost or its functions were split up.
344

345 Sponges and ctenophores both appear to have lost several gene families (Figure 6), though ctenophores
346 nonetheless have neural cells. Thus, the losses of the GABA or monoamine pathways are not critical for the
347 functioning of neural cell types overall. However, voltage-gated potassium and sodium channels are thought
348 to be essential for the propagation of electrical signals down axons and dendrites and have been found in
349 all animal groups except sponges [Moran et al., 2015]. The NaV-channel tree shows a single origin of this
350 protein family (Supplemental Figure 17), and presence of these channels in choanoflagellates suggests they
351 were present in the common ancestor of all animals; the apparent absence in sponges therefore is probably a
352 secondary loss. By comparison, ctenophores have a mostly-unique expansion of Kv-channels relative to the
353 rest of metazoans [Li et al., 2015] and a duplication in NaV channels. Together with the loss of this protein

354 family in sponges, the gene content argues for a combination of both multiple, independent gains and a loss
355 of neural-type cells and their associated functions across animals.
356

357 Properties of the earliest metazoans are unknown, including life cycle or number of cell types, but it
358 seems parsimonious to conceive that the first obligate multicellular animals did not have anything close to a
359 sophisticated nervous system [Wray et al., 2015]. Yet, the genomic evidence shows that these animals could
360 respond to environmental or paracrine signals, regulate the cell's internal ion concentrations and respond to
361 changes in their concentrations, and secrete small molecules that could serve as effectors in unconnected (but
362 proximal) cells. Thus, the earliest animals likely had the capacity to develop nerve cells using the genetic
363 toolkit they possessed, though the number of times this occurred is unclear. This capacity appears to have
364 been lost in sponges with the loss of voltage-gated channels. As we were unable to find putative genes
365 to mediate action potentials in glass sponges, either all of the four transcriptomes were incomplete or the
366 unique action potentials of glass sponges may represent a third case of the evolution of neural-like functions
367 in Metazoa.
368

369 Methods
370 Sequence data
371 Project overview can be found at spongebase.net. Reference data from the demosponge Tethya wilhelma
372 are available at: https://bitbucket.org/molpalmuc/tethya_wilhelma-genome
373

374 Raw reads for T. wilhelma are available on NCBI SRA under accession numbers SRR2163223
375 (genomic reads), SRR2296844 (mate pairs), SRR5369934 (DNA Moleculo), and SRR4255675 (RNAseq).
376

377 Genome assembly


378 Processing and assembly
379 We generated 25Gb of 100bp paired-end Illumina reads of genomic DNA and 35Gb of 125bp Illumina gel-free
380 mate-pair reads. Contigs were assembled with SOAPdenovo2 [Luo et al., 2012] using a kmer of 83bp. We
381 also generated 436Mb of Moleculo synthetic long reads. Because both haplotypes are represented in the
382 Moleculo reads, we merged the Moleculo reads using HaploMerger [Huang et al., 2012]. Contigs and merged
383 Moleculo reads were then scaffolded using the gel-free mate-pairs with SSPACE [Boetzer et al., 2011] and
384 BESST [Sahlin et al., 2014]. The first draft assembly had 7,947 contigs, totaling 145 megabases.
385

386 Removal of low-coverage contigs


387 To examine the completeness of the genome, we generated a plot of kmer coverage against GC percentage for
388 the contigs (Supplemental Figure 1) using custom Python scripts (available at http://github.org/wrf/lavaLampPlot).
389 This revealed 1,040 contigs with a coverage of zero that were carried over from the Moleculo reads and were
390 not assembled (Supplemental Figure 2), accounting for 6 megabases. As these reads likely derived from
391 bacterial contamination in the aquarium water, these contigs were removed, leaving 6,907 contigs totalling
392 138 megabases.
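
The filtering step described above can be illustrated with a minimal Python sketch. It is not the lavaLampPlot code: the function names, the per-contig coverage dictionary, and the toy sequences are assumptions made for illustration only. It computes GC percentage for each contig and sets aside contigs with zero coverage.

```python
def gc_percent(seq):
    """Return GC content of a nucleotide sequence as a percentage."""
    seq = seq.upper()
    # count only called bases, so N and other ambiguity codes are ignored
    called = sum(seq.count(base) for base in "ACGT")
    if called == 0:
        return 0.0
    return 100.0 * (seq.count("G") + seq.count("C")) / called


def split_by_coverage(contigs, coverage, min_coverage=0.0):
    """Separate contigs whose coverage exceeds min_coverage from the rest.

    contigs: dict of contig name -> sequence
    coverage: dict of contig name -> mean coverage (missing names count as 0)
    """
    kept, removed = {}, {}
    for name, seq in contigs.items():
        target = kept if coverage.get(name, 0.0) > min_coverage else removed
        target[name] = seq
    return kept, removed


# toy data, not taken from the T. wilhelma assembly
contigs = {"c1": "ATGCATGCAT", "c2": "GGGCCCGGGC", "c3": "ATATNNATAT"}
coverage = {"c1": 131.0, "c2": 0.0}

kept, removed = split_by_coverage(contigs, coverage)
for name, seq in kept.items():
    # one row per retained contig: name, length, GC%, coverage
    print(name, len(seq), round(gc_percent(seq), 1), coverage[name])
```

Each printed row corresponds to one point in a coverage-versus-GC plot; the contigs in `removed` are those that would be discarded at this step.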
393

394 Separation of bacterial contigs


395 Additionally, the plot revealed many contigs with lower coverage (20x-90x) and high GC content (50-75%)
396 suggesting the presence of bacteria (Supplemental Figure 3). Because many of these contigs were shorter
397 than 10kb, separation of the bacterial contigs was done through several steps. We found 4,858 contigs with
398 mapped RNAseq reads and GC content under 50%, as expected of metazoans. These contigs accounted for

399 88% of the sponge assembly, or 121 megabases. For the 2,014 contigs with no mapped RNAseq, we used
400 blastn to search the contigs against the A. queenslandica scaffolds and all complete bacterial genomes from
401 Genbank (5,242 sequences). Based on subtraction of bitscores, 62 contigs were identified as sponge and 565
402 were identified as bacterial. For the remaining 1,387 contigs, most of which were under 10kb, we repeated
403 the search with tblastx against A. queenslandica scaffolds and the genomes of Sinorhizobium medicae and
404 Roseobacter litoralis, which were the most similar complete genomes to the two bacterial 16S rRNAs identi-
405 fied in the contigs. After all sorting, 798 putative bacterial contigs accounted for 12.7 megabases and were
406 separated to bring the total to 6,109 sponge contigs. Contigs for the two bacteria were binned by tetranu-
407 cleotide frequency using MetaWatt [Strous et al., 2012] (Supplemental Figure 4).
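
As an illustration of sorting contigs by subtraction of bitscores, the following Python sketch compares the best hit of each contig against a sponge database and a bacterial database. The parsing assumes BLAST tabular output (`-outfmt 6`), where the bitscore is the twelfth column. The function names, the `min_difference` threshold, and the toy hits are hypothetical, and the text does not state which cutoff was actually applied.

```python
def best_bitscores(tabular_lines):
    """Return the highest bitscore per query from BLAST tabular (-outfmt 6) lines."""
    best = {}
    for line in tabular_lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 12:
            continue  # skip blank or malformed lines
        query, bitscore = fields[0], float(fields[11])
        if bitscore > best.get(query, 0.0):
            best[query] = bitscore
    return best


def classify_by_bitscore(sponge_hits, bacterial_hits, min_difference=0.0):
    """Label each contig 'sponge', 'bacterial', or 'unassigned'.

    Contigs without a hit in one database are treated as bitscore 0 there.
    min_difference is an assumed parameter, not a value from the paper.
    """
    calls = {}
    for contig in set(sponge_hits) | set(bacterial_hits):
        diff = sponge_hits.get(contig, 0.0) - bacterial_hits.get(contig, 0.0)
        if diff > min_difference:
            calls[contig] = "sponge"
        elif diff < -min_difference:
            calls[contig] = "bacterial"
        else:
            calls[contig] = "unassigned"
    return calls


# toy tabular lines with made-up subject names
sponge_lines = [
    "c1\taq_scaf1\t85.0\t900\t130\t2\t1\t900\t5\t904\t0.0\t650",
    "c2\taq_scaf9\t78.0\t120\t26\t0\t1\t120\t1\t120\t1e-20\t95",
]
bacterial_lines = [
    "c2\tbact_genome1\t97.0\t2000\t60\t0\t1\t2000\t1\t2000\t0.0\t3400",
]

calls = classify_by_bitscore(best_bitscores(sponge_lines),
                             best_bitscores(bacterial_lines))
print(sorted(calls.items()))
```

Contigs labeled `unassigned` in such a scheme would be candidates for a second, more sensitive search, as was done here with tblastx.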
408

409 Genome coverage and completeness


410 Coverage was estimated two ways: kmer frequency and read mapping. Kmers of 31bp were counted using
411 the Jellyfish kmer counter [Marçais and Kingsford, 2011] and analyzed using custom Python and R scripts
412 (“fastqdumps2histo.py” and “jellyfish gc coverage blob plot v2.R”, available at http://github.org/wrf/
413 lavaLampPlot). As expected, the kmer distribution showed two peaks, one for kmers at heterozygous
414 positions and one for homozygous positions, with the coverage peak at 131-fold coverage for homozygous
415 positions. Because of sequencing errors, this method often underestimates coverage, and so to
416 confirm this estimate we then mapped all reads to the genome using Bowtie2 [Langmead and Salzberg, 2012].
417 The sum of mapped reads divided by the total length provided an estimate of 159-fold physical
418 coverage.
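
The two coverage estimates can be sketched in Python as follows. This is a simplified illustration, not the scripts used for the analysis: the function names, the `min_depth` cutoff, and the toy histogram are assumptions. The first function divides total mapped bases by assembly length; the second reports local maxima in a kmer histogram of the kind produced by a kmer counter.

```python
def mapped_coverage(mapped_read_lengths, assembly_length):
    """Total mapped bases divided by the assembly length."""
    return sum(mapped_read_lengths) / assembly_length


def histogram_peaks(histogram, min_depth=3):
    """Return depths that are local maxima of a kmer histogram.

    histogram: dict of depth -> number of distinct kmers seen at that depth.
    Depths below min_depth are skipped, since they are dominated by
    kmers containing sequencing errors.
    """
    peaks = []
    for depth in sorted(histogram):
        if depth < min_depth:
            continue
        count = histogram[depth]
        lower = histogram.get(depth - 1, 0)
        upper = histogram.get(depth + 1, 0)
        if count > lower and count >= upper:
            peaks.append(depth)
    return peaks


# toy histogram with an error tail and two peaks (heterozygous and homozygous)
toy_histogram = {1: 1000, 2: 200, 3: 20, 4: 40, 5: 90, 6: 120, 7: 80, 8: 50,
                 9: 70, 10: 150, 11: 260, 12: 300, 13: 240, 14: 100, 15: 10}

print(histogram_peaks(toy_histogram))
print(mapped_coverage([100] * 30, 1000))
```

Real histograms are noisier than this toy example, so some smoothing would likely be needed before peak calling.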
419

420 Of the original reads, 185 million (86.5%) mapped back to the assembled sponge contigs. Completeness
421 for gene content was assessed with BUSCO [Simão et al., 2015], which identified 728 (86%) complete
422 genes and 42 (4.9%) predicted-incomplete genes. Overall, these data suggest that the genome assembly is
423 adequate for downstream analyses.
424

425 Genome annotation


426 Transcriptome versions
427 The transcriptome for T. wilhelma was assembled de novo using Trinity (release r20140717) [Grabherr et al.,
428 2011, Haas et al., 2013]. Default parameters were used, except for strand specific assembly, in silico read
429 normalization, and trimming (--SS_lib_type RF --normalize_reads --trimmomatic). This produced 127,012
430 transcripts with an average length of 913bp. Assembled transcripts were mapped to the genomic assembly
431 using GMAP [Wu and Watanabe, 2005] to produce a GFF file of the transcript mapping. Of these, 114,744
432 transcripts were mapped 166,847 times, allowing for multiple mappings.
433

434 For the genome-guided transcriptome, strand-specific RNAseq reads were mapped against the genome
435 build using Tophat2 v2.0.13 [Kim et al., 2013] with strand-specific mapping (option --library-type
436 fr-firststrand) and otherwise default parameters. Mapped reads were then joined into transcripts using StringTie
437 v1.0.2 [Pertea et al., 2015] with default parameters.
438

439 Additionally, ab initio gene models were predicted using AUGUSTUS [Stanke et al., 2008]. AUGUSTUS
440 was trained on the webAugustus server [Hoff and Stanke, 2013] using the highest expression transcripts for
441 each Trinity component and the assembled contigs. This identified 27,551 putative genes. The majority of
442 these overlapped partially or completely with a predicted gene based on the Trinity mapping or StringTie
443 genes. However, 3,866 genes (4,321 transcripts) had no overlap with any predicted exon from either the
444 Trinity or StringTie set, and were kept for the final set. Considering the possibility that some of these may
445 be pseudogenes, we aligned these proteins to the SwissProt database with BLASTP [Camacho et al., 2009].
446 Of these, only 759 had reliable hits (E-value < 10^-5) to 688 proteins. The annotated functions were diverse,
447 including proteins similar to many receptors and large structural proteins such as fibrillin (potentially any
448 protein with EGF repeats), dynein heavy chain, and titin; because very large proteins may be split across

449 multiple contigs, the predicted genes may be only fragments of the full gene. Only 42 of the hits were against
450 transposable elements.

451 Filtering of the final gene set


452 Because assembly of transcripts for both StringTie and Trinity relies on overlaps in the genome or RNAseq
453 reads, genes that overlap in the untranslated regions (UTRs) can sometimes be erroneously fused. For
454 StringTie, we developed a custom Python script to separate non-overlapping transcripts belonging to the
455 same “gene” (stringtiesplitgenes.py, available at https://bitbucket.org/wrf/sequences/). Tandem du-
456 plications can lead to RNAseq reads bridging the two tandem copies and result in both copies being called
457 the same gene. The original StringTie set contained 46,572 transcripts for 32,112 genes, while the corrected
458 set contained 33,200 genes and identified 1,088 new non-overlapping genes.
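
The idea of separating non-overlapping transcripts assigned to the same gene can be illustrated with a short interval-clustering sketch in Python. This is not the stringtiesplitgenes.py script itself: the function name, the tuple layout, and the toy coordinates are hypothetical, and all transcripts are assumed to lie on the same contig and strand.

```python
def split_nonoverlapping(transcripts):
    """Group the transcripts of one gene into clusters of overlapping spans.

    transcripts: list of (transcript_id, start, end) with inclusive coordinates.
    Returns a list of clusters, each a list of transcript ids; more than one
    cluster means the original "gene" would be split.
    """
    clusters = []
    current, current_end = [], None
    for tid, start, end in sorted(transcripts, key=lambda t: (t[1], t[2])):
        if current and start > current_end:
            # no overlap with anything seen so far: close the cluster
            clusters.append(current)
            current, current_end = [], None
        current.append(tid)
        current_end = end if current_end is None else max(current_end, end)
    if current:
        clusters.append(current)
    return clusters


# toy gene with two overlapping transcripts and one separate transcript
gene = [("t1", 100, 500), ("t3", 1200, 1500), ("t2", 400, 900)]
print(split_nonoverlapping(gene))
```

In this toy case the gene is split into two loci, which is the kind of event counted as a new non-overlapping gene above.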
459

460 Positional errors in the genome or allelic variations may result in some RNAseq reads not mapping
461 to the genome, so some genes are fragmented in the genome-guided transcriptome but not the de novo
462 assembly. Making use of the protein predictions from TransDecoder, we compared the predicted pro-
463 teins between the two transcriptomes using a custom Python script (transdecodersplitgenes.py, available
464 at https://bitbucket.org/wrf/sequences/). This identified 406 StringTie transcripts that were better
465 modeled by Trinity transcripts.
466

467 Functional gene annotation


468 Many genes of functional importance were examined manually, and the best transcript from StringTie, Trin-
469 ity, or AUGUSTUS was retained for the final gene set. In the GFF and fasta versions of the transcriptomes,
470 names of protein functions were assigned several ways. Target genes that were manually curated and edited,
471 such as those used in all trees, are named by the generic function or the annotated function of the closest
472 human protein. For instance, the dopamine beta-hydroxylase (DBH) homolog in T. wilhelma was manu-
473 ally corrected, and the position in a phylogenetic tree demonstrated that demosponges diverged before the
474 duplication which created DBH and the two DBH-like proteins in humans, thus the T. wilhelma protein
475 is annotated as DBH-like. Secondly, automated ortholog finding pipelines (HaMStR [Ebersberger et al.,
476 2009]) used for phylogeny [Cannon et al., 2016] have identified homologs in T. wilhelma, which have been
477 manually checked based on positions in the phylogenetic trees. Thirdly, single-direction BLAST results were
478 kept as annotations provided that the BLAST hit had a bitscore over 1000, or a bitscore over 300 and the
479 T. wilhelma protein covered at least 75% of the best hit against the human protein dataset from SwissProt.
480 The bitscore and length cutoffs were applied to reduce the number of annotations based on a single domain.
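
The cutoffs for single-direction BLAST annotations can be expressed as a small Python function. This is a sketch under stated assumptions: the function and parameter names are hypothetical, and coverage is taken here as the aligned length divided by the length of the best hit, which is one plausible reading of the 75% criterion.

```python
def keep_annotation(bitscore, aligned_length, hit_length,
                    high_bitscore=1000, low_bitscore=300, min_coverage=0.75):
    """Decide whether a single-direction BLAST hit is kept as an annotation.

    A hit is kept if its bitscore exceeds high_bitscore, or if it exceeds
    low_bitscore and the alignment covers at least min_coverage of the hit.
    """
    if bitscore > high_bitscore:
        return True
    coverage = aligned_length / hit_length
    return bitscore > low_bitscore and coverage >= min_coverage


# toy examples: (bitscore, aligned length, length of the best hit)
examples = [(1200, 50, 500), (400, 400, 500), (400, 300, 500), (250, 500, 500)]
for bitscore, aligned, hit_len in examples:
    print(bitscore, aligned, hit_len, keep_annotation(bitscore, aligned, hit_len))
```

The third example shows the intended effect of the cutoffs: a moderate bitscore from a partial alignment, such as a single shared domain, is not sufficient for annotation.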
481

482 Analysis of splice variation


483 Using the transcriptome from StringTie, splice variation was assessed using a custom Python script (splice-
484 variantstats.py, available at https://bitbucket.org/wrf/sequences/). In this script, several ambiguous
485 definitions were clarified to define the different splice types. Firstly, single exon genes with no variants are
486 distinguished from single exon genes with variations, that is, a gene with two exons can have a variant with
487 one exon. For loci with only two transcripts, the canonical or main transcript is defined as the one with
488 the higher expression level, as measured by the higher FPKM value reported from StringTie. For loci with
489 three or more transcripts, main or canonical exons are those included in at least two transcripts. A cassette
490 exon must occur in less than 50% of the transcripts for a locus; otherwise the case is defined as a skipped
491 exon. A retained intron is any portion that exactly spans two other exons; for highly expressed transcripts
492 this may include erroneously retained introns due to intermediates in splicing. A summary of the splicing
493 types is displayed in Supplemental Table 2.
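
The distinction between cassette and skipped exons can be illustrated with the following Python sketch. It is a simplification and not the splicevariantstats.py script: exons are compared by exact coordinates only, the label "constitutive" for exons found in every transcript is introduced here for illustration, and the function name and toy locus are hypothetical.

```python
from collections import Counter


def classify_exons(transcripts):
    """Label each exon of a locus by how many transcripts include it.

    transcripts: dict of transcript id -> list of (start, end) exons.
    An exon present in all transcripts is 'constitutive'; one present in
    fewer than 50% of transcripts is 'cassette'; otherwise it is 'skipped'.
    """
    total = len(transcripts)
    counts = Counter(exon for exons in transcripts.values() for exon in set(exons))
    labels = {}
    for exon, count in counts.items():
        if count == total:
            labels[exon] = "constitutive"
        elif count / total < 0.5:
            labels[exon] = "cassette"
        else:
            labels[exon] = "skipped"
    return labels


# toy locus with three transcripts
locus = {
    "t1": [(100, 200), (300, 400), (500, 600)],
    "t2": [(100, 200), (300, 400)],
    "t3": [(100, 200), (700, 800)],
}
for exon, label in sorted(classify_exons(locus).items()):
    print(exon, label)
```

Here exon (300, 400) occurs in two of three transcripts and is therefore treated as skipped in the third, while exons (500, 600) and (700, 800) each occur once and are treated as cassette exons.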
494

495 Intron retention was recently reported to be a common mode of alternative splicing in A. queens-
496 landica [Fernandez-Valverde et al., 2015]. We found 3,295 transcripts with 3,565 retained intron events
497 (Supplemental Table 2). We then analyzed the length of the retained introns and found the phase of the

498 retained piece to be randomly distributed (unlike cassette exons, Supplemental Table 3), suggesting that
499 many of the retained introns result from incomplete splicing rather than functional retention.
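
A minimal Python sketch of the phase tally is given below. It assumes that phase is taken as the length of the retained piece modulo three, which the text does not state explicitly; the function name and the toy lengths are hypothetical.

```python
from collections import Counter


def phase_counts(lengths):
    """Count retained pieces by length modulo 3 (phase 0 keeps the reading frame)."""
    counts = Counter(length % 3 for length in lengths)
    return {phase: counts.get(phase, 0) for phase in (0, 1, 2)}


# toy lengths, not values from Supplemental Table 3
print(phase_counts([90, 91, 92, 93]))
```

Roughly equal counts across the three phases would be consistent with a random distribution, whereas an excess of phase 0 would suggest selection to preserve the reading frame.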
500

501 Microsynteny across sponges


502 Putative synteny blocks were identified using a custom Python script (microsynteny.py, available at https:
503 //bitbucket.org/wrf/sequences/). Briefly, the script combines the gene positions on scaffolds for both
504 the query and the reference with BLASTX hits for the query against the reference. If a minimum of three
505 genes in a row on a query scaffold match to different genes on the same reference scaffold, the group is
506 kept. By default, this mandated a gap of no more than five genes before discarding the block, and that the
507 next gene must occur within 30kb. This method was designed to work for highly fragmented genomes with
508 thousands of scaffolds, so the order and direction of the corresponding genes on the reference scaffold do not
509 need to match those of the query scaffold.
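
The block-finding logic can be sketched in Python as below. This is a simplified illustration rather than the microsynteny.py script: the function name, argument layout, and toy gene identifiers are hypothetical, and the 30kb distance rule is omitted for brevity. Genes are walked in order along one query scaffold, and runs of hits to the same reference scaffold are kept if they contain at least three different reference genes, regardless of order or direction.

```python
def find_blocks(query_genes, best_hits, ref_scaffold_of, min_genes=3, max_gap=5):
    """Find putative synteny blocks along one query scaffold.

    query_genes: gene ids in order along the query scaffold
    best_hits: dict of query gene -> reference gene
    ref_scaffold_of: dict of reference gene -> reference scaffold
    Returns a list of (reference scaffold, [(query gene, reference gene), ...]).
    """
    def enough(pairs):
        # a block needs hits to at least min_genes different reference genes
        return len({ref for _, ref in pairs}) >= min_genes

    blocks = []
    open_blocks = {}  # reference scaffold -> (index of last member, pairs)
    for index, gene in enumerate(query_genes):
        ref_gene = best_hits.get(gene)
        scaffold = ref_scaffold_of.get(ref_gene)
        if scaffold is None:
            continue  # no hit, or hit with unknown position
        last_index, pairs = open_blocks.get(scaffold, (None, []))
        if last_index is not None and index - last_index - 1 > max_gap:
            # too many intervening genes: close the old block, start a new one
            if enough(pairs):
                blocks.append((scaffold, pairs))
            pairs = []
        open_blocks[scaffold] = (index, pairs + [(gene, ref_gene)])
    for scaffold, (_, pairs) in open_blocks.items():
        if enough(pairs):
            blocks.append((scaffold, pairs))
    return blocks


# toy data with made-up gene and scaffold names
query_genes = ["q1", "q2", "q3", "q4", "q5"]
best_hits = {"q1": "rA", "q2": "rB", "q3": "rX", "q4": "rC"}
ref_scaffold_of = {"rA": "ref1", "rB": "ref1", "rC": "ref1", "rX": "ref2"}

print(find_blocks(query_genes, best_hits, ref_scaffold_of))
```

Because only scaffold membership is checked, a rearranged or inverted block on the reference is still detected, which suits highly fragmented assemblies.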
510

511 StringTie transcripts for T. wilhelma were aligned against the A. queenslandica v2.0 protein set with
512 BLASTX [Camacho et al., 2009], and positions were taken from the accompanying A. queenslandica v2.0
513 GTF. The same procedure was attempted against the S. ciliatum gene models, though essentially no syn-
514 tenic blocks were detected, indicating either substantial differences in gene content or gene order between
515 demosponges and calcareous sponges.
516

517 Collection of reference data


518 Proteins for Oikopleura dioica [Denoeud et al., 2010] were downloaded from Genoscope. Gene models
519 for Ciona intestinalis [Dehal et al., 2002], Branchiostoma floridae [Putnam et al., 2008], Trichoplax ad-
520 herens [Srivastava et al., 2008], Capitella teleta, Lottia gigantea, Helobdella robusta [Simakov et al., 2013],
521 Saccoglossus kowalevskii [Simakov et al., 2015], and Monosiga brevicollis [King et al., 2008] were downloaded
522 from the JGI genome portal. Gene models for Sphaeroforma arctica, Capsaspora owczarzaki [Suga et al.,
523 2013] and Salpingoeca rosetta [Fairclough et al., 2013] were downloaded from the Broad Institute.
524

525 We used genomic data of the cnidarians Nematostella vectensis [Moran et al., 2014], Exaiptasia pall-
526 ida [Baumgarten et al., 2015], and Hydra magnipapillata as well as transcriptomes from 33 other cnidari-
527 ans [Bhattacharya et al., 2016, Zapata et al., 2015, Pratlong et al., 2015, Brinkman et al., 2015, Ponce et al.,
528 2016], mostly corals.
529

530 For demosponges, we used the genome of Amphimedon queenslandica [Srivastava et al., 2010, Fernandez-
531 Valverde et al., 2015] and transcriptomic data from: Mycale phyllophila [Qiu et al., 2015], Petrosia fici-
532 formis [Riesgo et al., 2014a], Crambe crambe [Versluis et al., 2015], Cliona varians [Riesgo et al., 2014b], Hal-
533 isarca dujardini [Borisenko et al., 2016], Crella elegans [Pérez-Porro et al., 2013], Stylissa carteri, Xestospon-
534 gia testutinaria [Ryu et al., 2016], Scopalina sp., and Tedania anhelens. We used data from the genome of the
535 calcareous sponge Sycon ciliatum [Fortunato et al., 2014] and the transcriptome of Leucosolenia complicata.
536 For hexactinellids (glass sponges), we used transcriptome data from Aphrocallistes vastus [Ludeman et al.,
537 2014], Hyalonema populiferum, Rosella fibulata, and Sympagella nux [Whelan et al., 2015]. For homosclero-
538 morphs, we used two transcriptomes from Oscarella carmela and Corticium candelabrum [Ludeman et al.,
539 2014].
540

541 We used data from the two published draft genomes of ctenophores [Ryan et al., 2013, Moroz et al., 2014],
542 as well as transcriptome data from 11 additional ctenophores: Bathocyroe fosteri, Bathyctena chuni, Beroe
543 abyssicola, Bolinopsis infundibulum, Charistephane fugiens, Dryodora glandiformis, Euplokamis dunlapae,
544 Hormiphora californensis, Lampea lactea, Thalassocalyce inconstans, and Velamen parallelum.
545

546 We used data from the unpublished draft genome of a novel placozoan species, designated H13.
547

548 Gene trees


549 For protein trees, candidate proteins were identified by reciprocal BLAST alignment using blastp or tblastn.
550 All BLAST searches were done using the NCBI BLAST 2.2.29+ package [Camacho et al., 2009]. Because
551 most functions were described for human, mouse, or fruit fly proteins, these served as the queries for all
552 datasets. Candidate homologs were kept for analysis if they reciprocally aligned by blastp to a query pro-
553 tein, usually human. Alignments for protein sequences were created using MAFFT v7.029b, with L-INS-i
554 parameters for accurate alignments [Katoh and Standley, 2013]. Phylogenetic trees were generated using ei-
555 ther FastTree [Price et al., 2010] with default parameters or RAxML-HPC-PTHREADS v8.1.3 [Stamatakis,
556 2014], using the PROTGAMMALG model for proteins and 100 bootstrap replicates with the “rapid boot-
557 strap” (-f a) algorithm and a random seed of 1234.
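
The reciprocal BLAST criterion can be illustrated with a short Python sketch. It assumes the strictest form, in which a candidate is kept only if its best hit in the reverse search is the original query; the text does not specify whether best hits were required. The function name and protein identifiers are hypothetical.

```python
def reciprocal_pairs(forward_best, reverse_best):
    """Return (query, candidate) pairs that are each other's best hit.

    forward_best: dict of query protein -> best hit in the target dataset
    reverse_best: dict of target protein -> best hit in the query dataset
    """
    pairs = []
    for query, candidate in forward_best.items():
        if reverse_best.get(candidate) == query:
            pairs.append((query, candidate))
    return sorted(pairs)


# toy best-hit tables with made-up identifiers
forward_best = {"human_DBH": "twi_0001", "human_MAOA": "twi_0002"}
reverse_best = {"twi_0001": "human_DBH", "twi_0002": "human_PAOX"}

print(reciprocal_pairs(forward_best, reverse_best))
```

In this toy case the second candidate is rejected because its reverse search returns a different protein, which is the situation that the subsequent phylogenetic trees are meant to resolve.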
558

559 Domain annotation


560 Domains for individual protein trees were annotated with “hmmscan” v3.1b1 from the HMMER package
561 [Eddy, 2011] against the PFAM-A database v27.0 [Finn et al., 2016]. Signal peptides were predicted
562 using the stand-alone version of SignalP v4.1 [Petersen et al., 2011]. Domain structures were visualized
563 using a custom Python script, “pfampipeline.py”, available at https://github.com/wrf/genomeGTFtools .
564

565 Acknowledgments
566 W.R.F. would like to thank K. Achim, M. Nickel and J. Musser for helpful discussions. This work was
567 supported by a LMUexcellent grant (Project MODELSPONGE) to G.W. as part of the German Excellence
568 Initiative, and NIH grant NIGMS-5-R01-GM087198 to S.H.D.H. The authors declare no competing interests.

569 References

571 [Alberstein et al., 2015] Alberstein, R., Grey, R., Zimmet, A., Simmons, D. K., and Mayer, M. L.
572 (2015). Glycine activated ion channel subunits encoded by ctenophore glutamate receptor genes.
573 Proceedings of the National Academy of Sciences, 112(44):E6048–E6057.
574 [Árnadóttir and Chalfie, 2010] Árnadóttir, J. and Chalfie, M. (2010). Eukaryotic Mechanosensitive
575 Channels. Annual Review of Biophysics, 39(1):111–137.
576 [Baumgarten et al., 2015] Baumgarten, S., Simakov, O., Esherick, L. Y., Liew, Y. J., Lehnert, E. M.,
577 Michell, C. T., Li, Y., Hambleton, E. A., Guse, A., Oates, M. E., Gough, J., Weis, V. M., Aranda,
578 M., Pringle, J. R., and Voolstra, C. R. (2015). The genome of Aiptasia, a sea anemone model for
579 coral symbiosis. Proceedings of the National Academy of Sciences, page 201513318.
580 [Bhattacharya et al., 2016] Bhattacharya, D., Agrawal, S., Aranda, M., Baumgarten, S., Belcaid, M.,
581 Drake, J. L., Erwin, D., Foret, S., Gates, R. D., Gruber, D. F., Kamel, B., Lesser, M. P., Levy,
582 O., Liew, Y. J., MacManes, M., Mass, T., Medina, M., Mehr, S., Meyer, E., Price, D. C., Putnam,
583 H. M., Qiu, H., Shinzato, C., Shoguchi, E., Stokes, A. J., Tambutté, S., Tchernov, D., Voolstra,
584 C. R., Wagner, N., Walker, C. W., Weber, A. P., Weis, V., Zelzion, E., Zoccola, D., and Falkowski,
585 P. G. (2016). Comparative genomics explains the evolutionary success of reef-forming corals. eLife,
586 5:1–26.
587 [Boetzer et al., 2011] Boetzer, M., Henkel, C. V., Jansen, H. J., Butler, D., and Pirovano, W. (2011).
588 Scaffolding pre-assembled contigs using SSPACE. Bioinformatics, 27(4):578–579.
589 [Borisenko et al., 2016] Borisenko, I., Adamski, M., Ereskovsky, A., and Adamska, M. (2016). Sur-
590 prisingly rich repertoire of Wnt genes in the demosponge Halisarca dujardini. BMC evolutionary
591 biology, 16(1):123.


592 [Brinkman et al., 2015] Brinkman, D. L., Jia, X., Potriquet, J., Kumar, D., Dash, D., Kvaskoff, D.,
593 and Mulvenna, J. (2015). Transcriptome and venom proteome of the box jellyfish Chironex fleckeri.
594 BMC Genomics, 16(1):407.
595 [Camacho et al., 2009] Camacho, C., Coulouris, G., Avagyan, V., Ma, N., Papadopoulos, J., Bealer,
596 K., and Madden, T. L. (2009). BLAST+: architecture and applications. BMC bioinformatics,
597 10:421.
598 [Cannon et al., 2016] Cannon, J. T., Vellutini, B. C., Smith, J., Ronquist, F., Jondelius, U., and
599 Hejnol, A. (2016). Xenacoelomorpha is the sister group to Nephrozoa. Nature, 530(7588):89–93.
600 [Cao et al., 2010] Cao, J., Shi, F., Liu, X., Huang, G., and Zhou, M. (2010). Phylogenetic analysis
601 and evolution of aromatic amino acid hydroxylase. FEBS Letters, 584(23):4775–4782.
602 [Carlberg and Rosengren, 1985] Carlberg, M. and Rosengren, E. (1985). Biochemical basis for adren-
603 ergic neurotransmission in coelenterates. Journal of Comparative Physiology B, 155(2):251–255.
604 [Coste et al., 2012] Coste, B., Xiao, B., Santos, J. S., Syeda, R., Grandl, J., Spencer, K. S., Kim, S. E.,
605 Schmidt, M., Mathur, J., Dubin, A. E., Montal, M., and Patapoutian, A. (2012). Piezo proteins are
606 pore-forming subunits of mechanically activated channels. Nature, 483(7388):176–181.
607 [Dehal et al., 2002] Dehal, P., Satou, Y., Campbell, R. K., Chapman, J., Degnan, B., De Tomaso, A.,
608 Davidson, B., Di Gregorio, A., Gelpke, M., Goodstein, D. M., Harafuji, N., Hastings, K. E. M., Ho,
609 I., Hotta, K., Huang, W., Kawashima, T., Lemaire, P., Martinez, D., Meinertzhagen, I. A., Necula,
610 S., Nonaka, M., Putnam, N., Rash, S., Saiga, H., Satake, M., Terry, A., Yamada, L., Wang, H.-G.,
611 Awazu, S., Azumi, K., Boore, J., Branno, M., Chin-Bow, S., DeSantis, R., Doyle, S., Francino, P.,
612 Keys, D. N., Haga, S., Hayashi, H., Hino, K., Imai, K. S., Inaba, K., Kano, S., Kobayashi, K.,
613 Kobayashi, M., Lee, B.-I., Makabe, K. W., Manohar, C., Matassi, G., Medina, M., Mochizuki, Y.,
614 Mount, S., Morishita, T., Miura, S., Nakayama, A., Nishizaka, S., Nomoto, H., Ohta, F., Oishi, K.,
615 Rigoutsos, I., Sano, M., Sasaki, A., Sasakura, Y., Shoguchi, E., Shin-i, T., Spagnuolo, A., Stainier,
616 D., Suzuki, M. M., Tassy, O., Takatori, N., Tokuoka, M., Yagi, K., Yoshizaki, F., Wada, S., Zhang,
617 C., Hyatt, P. D., Larimer, F., Detter, C., Doggett, N., Glavina, T., Hawkins, T., Richardson,
618 P., Lucas, S., Kohara, Y., Levine, M., Satoh, N., and Rokhsar, D. S. (2002). The draft genome
619 of Ciona intestinalis: insights into chordate and vertebrate origins. Science (New York, N.Y.),
620 298(5601):2157–2167.
621 [Denoeud et al., 2010] Denoeud, F., Henriet, S., Mungpakdee, S., Aury, J.-M., Da Silva, C.,
622 Brinkmann, H., Mikhaleva, J., Olsen, L. C., Jubin, C., Canestro, C., Bouquet, J.-M., Danks, G.,
623 Poulain, J., Campsteijn, C., Adamski, M., Cross, I., Yadetie, F., Muffato, M., Louis, A., Butcher,
624 S., Tsagkogeorga, G., Konrad, A., Singh, S., Jensen, M. F., Cong, E. H., Eikeseth-Otteraa, H.,
625 Noel, B., Anthouard, V., Porcel, B. M., Kachouri-Lafond, R., Nishino, A., Ugolini, M., Chourrout,
626 P., Nishida, H., Aasland, R., Huzurbazar, S., Westhof, E., Delsuc, F., Lehrach, H., Reinhardt, R.,
627 Weissenbach, J., Roy, S. W., Artiguenave, F., Postlethwait, J. H., Manak, J. R., Thompson, E. M.,
628 Jaillon, O., Du Pasquier, L., Boudinot, P., Liberles, D. A., Volff, J.-N., Philippe, H., Lenhard, B.,
629 Crollius, H. R., Wincker, P., and Chourrout, D. (2010). Plasticity of Animal Genome Architecture
630 Unmasked by Rapid Evolution of a Pelagic Tunicate. Science, 330(6009):1381–1385.
631 [Dunn et al., 2008] Dunn, C. W., Hejnol, A., Matus, D. Q., Pang, K., Browne, W. E., Smith, S. A.,
632 Seaver, E., Rouse, G. W., Obst, M., Edgecombe, G. D., Sørensen, M. V., Haddock, S. H. D.,
633 Schmidt-Rhaesa, A., Okusu, A., Kristensen, R. M., Wheeler, W. C., Martindale, M. Q., and Giribet,
634 G. (2008). Broad phylogenomic sampling improves resolution of the animal tree of life. Nature,
635 452(7188):745–9.
636 [Dunn et al., 2015] Dunn, C. W., Leys, S. P., and Haddock, S. H. D. (2015). The hidden biology of
637 sponges and ctenophores. Trends in Ecology & Evolution, pages 1–10.
638 [Ebersberger et al., 2009] Ebersberger, I., Strauss, S., and von Haeseler, A. (2009). HaMStR: profile
639 hidden markov model based search for orthologs in ESTs. BMC evolutionary biology, 9:157.
640 [Eddy, 2011] Eddy, S. R. (2011). Accelerated profile HMM searches. PLoS Computational Biology,
641 7(10).


642 [Elliott and Leys, 2007] Elliott, G. R. D. and Leys, S. P. (2007). Coordinated contractions effectively
643 expel water from the aquiferous system of a freshwater sponge. Journal of Experimental Biology,
644 210(21):3736–3748.
645 [Elliott and Leys, 2010] Elliott, G. R. D. and Leys, S. P. (2010). Evidence for glutamate, GABA and
646 NO in coordinating behaviour in the sponge, Ephydatia muelleri (Demospongiae, Spongillidae). The
647 Journal of experimental biology, 213:2310–2321.
648 [Ellwanger et al., 2007] Ellwanger, K., Eich, A., and Nickel, M. (2007). GABA and glutamate specif-
649 ically induce contractions in the sponge Tethya wilhelma. Journal of Comparative Physiology A:
650 Neuroethology, Sensory, Neural, and Behavioral Physiology, 193(1):1–11.
651 [Ellwanger and Nickel, 2006] Ellwanger, K. and Nickel, M. (2006). Neuroactive substances specifically
652 modulate rhythmic body contractions in the nerveless metazoon Tethya wilhelma (Demospongiae,
653 Porifera). Frontiers in zoology, 3:7.
654 [Fairclough et al., 2013] Fairclough, S. R., Chen, Z., Kramer, E., Zeng, Q., Young, S., Robertson,
655 H. M., Begovic, E., Richter, D. J., Russ, C., Westbrook, M. J., Manning, G., Lang, B. F., Haas,
656 B., Nusbaum, C., and King, N. (2013). Premetazoan genome evolution and the regulation of cell
657 differentiation in the choanoflagellate Salpingoeca rosetta. Genome biology, 14(2):R15.
658 [Fernandez-Valverde et al., 2015] Fernandez-Valverde, S. L., Calcino, A. D., and Degnan, B. M. (2015).
659 Deep developmental transcriptome sequencing uncovers numerous new genes and enhances gene
660 annotation in the sponge Amphimedon queenslandica. BMC Genomics, 16(1):1–11.
661 [Finn et al., 2016] Finn, R. D., Coggill, P., Eberhardt, R. Y., Eddy, S. R., Mistry, J., Mitchell, A. L.,
662 Potter, S. C., Punta, M., Qureshi, M., Sangrador-Vegas, A., Salazar, G. A., Tate, J., and Bateman,
663 A. (2016). The Pfam protein families database: towards a more sustainable future. Nucleic Acids
664 Research, 44(D1):D279–D285.
665 [Fortunato et al., 2014] Fortunato, S. A. V., Adamski, M., Ramos, O. M., Leininger, S., Liu, J., Ferrier,
666 D. E. K., and Adamska, M. (2014). Calcisponges have a ParaHox gene and dynamic expression of
667 dispersed NK homeobox genes. Nature, 514(7524):620–623.
668 [Ge et al., 2015] Ge, J., Li, W., Zhao, Q., Li, N., Chen, M., Zhi, P., Li, R., Gao, N., Xiao, B., and Yang,
669 M. (2015). Architecture of the mammalian mechanosensitive Piezo1 channel. Nature, 527(5):64–69.
670 [Geng et al., 2013] Geng, Y., Bush, M., Mosyak, L., Wang, F., and Fan, Q. R. (2013). Structural
671 mechanism of ligand activation in human GABA(B) receptor. Nature, 504(7479):254–9.
672 [Grabherr et al., 2011] Grabherr, M. G., Haas, B. J., Yassour, M., Levin, J. Z., Thompson, D. A.,
673 Amit, I., Adiconis, X., Fan, L., Raychowdhury, R., Zeng, Q., Chen, Z., Mauceli, E., Hacohen, N.,
674 Gnirke, A., Rhind, N., di Palma, F., Birren, B. W., Nusbaum, C., Lindblad-Toh, K., Friedman, N.,
675 and Regev, A. (2011). Full-length transcriptome assembly from RNA-Seq data without a reference
676 genome. Nature biotechnology, 29(7):644–52.
677 [Grell and Benwitz, 1974] Grell, K. G. and Benwitz, G. (1974). Spezifische Verbindungsstrukturen der
678 Faserzellen von Trichoplax adhaerens F.E. Schulze. Z. Naturforsch., 29c:790.
679 [Gur Barzilai et al., 2012] Gur Barzilai, M., Reitzel, A. M., Kraus, J. E. M., Gordon, D., Technau, U.,
680 Gurevitz, M., and Moran, Y. (2012). Convergent Evolution of Sodium Ion Selectivity in Metazoan
681 Neuronal Signaling. Cell Reports, 2(2):242–248.
682 [Haas et al., 2013] Haas, B. J., Papanicolaou, A., Yassour, M., Grabherr, M., Blood, P. D., Bowden,
683 J., Couger, M. B., Eccles, D., Li, B., Lieber, M., Macmanes, M. D., Ott, M., Orvis, J., Pochet,
684 N., Strozzi, F., Weeks, N., Westerman, R., William, T., Dewey, C. N., Henschel, R., Leduc, R. D.,
685 Friedman, N., and Regev, A. (2013). De novo transcript sequence reconstruction from RNA-seq
686 using the Trinity platform for reference generation and analysis. Nature protocols, 8(8):1494–512.
687 [Harbison, 1985] Harbison, G. R. (1985). On the classification and evolution of the Ctenophora. In
688 Conway Morris, S. C., George, J. D., Gibson, R., and Platt, H. M., editors, The Origin and Rela-
689 tionships of Lower Invertebrates, pages 78–100. Clarendon Press, Oxford.
690 [Hoff and Stanke, 2013] Hoff, K. J. and Stanke, M. (2013). WebAUGUSTUS–a web service for training
691 AUGUSTUS and predicting genes in eukaryotes. Nucleic Acids Research, 41(W1):W123–W128.


692 [Huang et al., 2012] Huang, S., Chen, Z., Huang, G., Yu, T., Yang, P., Li, J., Fu, Y., Yuan, S., Chen,
693 S., and Xu, A. (2012). HaploMerger: Reconstructing allelic relationships for polymorphic diploid
694 genome assemblies. Genome Research, 22(8):1581–1588.
695 [Jarvis et al., 2014] Jarvis, E. D., Mirarab, S., Aberer, A. J., Li, B., Houde, P., Li, C., Ho, S. Y. W.,
696 Faircloth, B. C., Nabholz, B., Howard, J. T., Suh, A., Weber, C. C., da Fonseca, R. R., Li, J., Zhang,
697 F., Li, H., Zhou, L., Narula, N., Liu, L., Ganapathy, G., Boussau, B., Bayzid, M. S., Zavidovych, V.,
698 Subramanian, S., Gabaldon, T., Capella-Gutierrez, S., Huerta-Cepas, J., Rekepalli, B., Munch, K.,
699 Schierup, M., Lindow, B., Warren, W. C., Ray, D., Green, R. E., Bruford, M. W., Zhan, X., Dixon,
700 A., Li, S., Li, N., Huang, Y., Derryberry, E. P., Bertelsen, M. F., Sheldon, F. H., Brumfield, R. T.,
701 Mello, C. V., Lovell, P. V., Wirthlin, M., Schneider, M. P. C., Prosdocimi, F., Samaniego, J. A.,
702 Velazquez, A. M. V., Alfaro-Nunez, A., Campos, P. F., Petersen, B., Sicheritz-Ponten, T., Pas, A.,
703 Bailey, T., Scofield, P., Bunce, M., Lambert, D. M., Zhou, Q., Perelman, P., Driskell, A. C., Shapiro,
704 B., Xiong, Z., Zeng, Y., Liu, S., Li, Z., Liu, B., Wu, K., Xiao, J., Yinqi, X., Zheng, Q., Zhang, Y.,
705 Yang, H., Wang, J., Smeds, L., Rheindt, F. E., Braun, M., Fjeldsa, J., Orlando, L., Barker, F. K.,
706 Jonsson, K. A., Johnson, W., Koepfli, K.-P., O’Brien, S., Haussler, D., Ryder, O. A., Rahbek, C.,
707 Willerslev, E., Graves, G. R., Glenn, T. C., McCormack, J., Burt, D., Ellegren, H., Alstrom, P.,
708 Edwards, S. V., Stamatakis, A., Mindell, D. P., Cracraft, J., Braun, E. L., Warnow, T., Jun, W.,
709 Gilbert, M. T. P., and Zhang, G. (2014). Whole-genome analyses resolve early branches in the tree
710 of life of modern birds. Science, 346(6215):1320–1331.
711 [Jékely et al., 2015] Jékely, G., Paps, J., and Nielsen, C. (2015). The phylogenetic position of
712 ctenophores and the origin(s) of nervous systems. EvoDevo, 6(1):1.
713 [Katoh and Standley, 2013] Katoh, K. and Standley, D. M. (2013). MAFFT multiple sequence align-
714 ment software version 7: improvements in performance and usability. Molecular biology and evolu-
715 tion, 30(4):772–80.
716 [Kim et al., 2013] Kim, D., Pertea, G., Trapnell, C., Pimentel, H., Kelley, R., and Salzberg, S. L.
717 (2013). TopHat2: accurate alignment of transcriptomes in the presence of insertions, deletions and
718 gene fusions. Genome biology, 14(4):R36.
719 [King et al., 2008] King, N., Westbrook, M. J., Young, S. L., Kuo, A., Abedin, M., Chapman, J.,
720 Fairclough, S., Hellsten, U., Isogai, Y., Letunic, I., Marr, M., Pincus, D., Putnam, N., Rokas, A.,
721 Wright, K. J., Zuzow, R., Dirks, W., Good, M., Goodstein, D., Lemons, D., Li, W., Lyons, J. B.,
722 Morris, A., Nichols, S., Richter, D. J., Salamov, A., JGI Sequencing, Bork, P., Lim, W. A.,
723 Manning, G., Miller, W. T., McGinnis, W., Shapiro, H., Tjian, R., Grigoriev, I. V., and Rokhsar,
724 D. (2008). The genome of the choanoflagellate Monosiga brevicollis and the origin of metazoans.
725 Nature, 451(7180):783–8.
726 [Krishnan et al., 2014] Krishnan, A., Dnyansagar, R., Almén, M. S., Williams, M. J., Fredriksson,
727 R., Manoj, N., and Schiöth, H. B. (2014). The GPCR repertoire in the demosponge Amphimedon
728 queenslandica: insights into the GPCR system at the early divergence of animals. BMC Evolutionary
729 Biology, 14(1):270.
730 [Krishnan and Schiöth, 2015] Krishnan, A. and Schiöth, H. B. (2015). The role of G protein-coupled
731 receptors in the early evolution of neurotransmission and the nervous system. The Journal of
732 experimental biology, 218(Pt 4):562–571.
733 [Langmead and Salzberg, 2012] Langmead, B. and Salzberg, S. L. (2012). Fast gapped-read alignment
734 with Bowtie 2. Nature Methods, 9(4):357–359.
735 [Leys, 2015] Leys, S. P. (2015). Elements of a ’nervous system’ in sponges. The Journal of experimental
736 biology, 218(Pt 4):581–91.
737 [Leys et al., 1999] Leys, S. P., Mackie, G. O., and Meech, R. W. (1999). Impulse conduction in a
738 sponge. The Journal of experimental biology, 202(Pt 9):1139–1150.
739 [Leys et al., 2007] Leys, S. P., Mackie, G. O., and Reiswig, H. M. (2007). The Biology of Glass Sponges.
740 Advances in Marine Biology, 52(06):1–145.
741 [Li et al., 2015] Li, X., Liu, H., Chu Luo, J., Rhodes, S. A., Trigg, L. M., van Rossum, D. B., Anishkin,
742 A., Diatta, F. H., Sassic, J. K., Simmons, D. K., Kamel, B., Medina, M., Martindale, M. Q., and
743 Jegla, T. (2015). Major diversification of voltage-gated K+ channels occurred in ancestral
744 parahoxozoans. Proceedings of the National Academy of Sciences, page 201422941.
745 [Liebeskind et al., 2011] Liebeskind, B. J., Hillis, D. M., and Zakon, H. H. (2011). Evolution of sodium
746 channels predates the origin of nervous systems in animals. Proceedings of the National Academy of
747 Sciences of the United States of America, 108(22):9154–9159.
748 [Ludeman et al., 2014] Ludeman, D. A., Farrar, N., Riesgo, A., Paps, J., and Leys, S. P. (2014).
749 Evolutionary origins of sensation in metazoans: functional evidence for a new sensory organ in
750 sponges. BMC Evolutionary Biology, 14(3):1–11.
751 [Luo et al., 2012] Luo, R., Liu, B., Xie, Y., Li, Z., Huang, W., Yuan, J., He, G., Chen, Y., Pan, Q.,
752 Liu, Y., Tang, J., Wu, G., Zhang, H., Shi, Y., Liu, Y., Yu, C., Wang, B., Lu, Y., Han, C., Cheung,
753 D. W., Yiu, S.-M., Peng, S., Xiaoqian, Z., Liu, G., Liao, X., Li, Y., Yang, H., Wang, J., Lam, T.-W.,
754 and Wang, J. (2012). SOAPdenovo2: an empirically improved memory-efficient short-read de novo
755 assembler. GigaScience, 1(1):18.
756 [Marçais and Kingsford, 2011] Marçais, G. and Kingsford, C. (2011). A fast, lock-free approach for
757 efficient parallel counting of occurrences of k-mers. Bioinformatics, 27(6):764–770.
758 [Moran et al., 2015] Moran, Y., Barzilai, M. G., Liebeskind, B. J., and Zakon, H. H. (2015). Evolu-
759 tion of voltage-gated ion channels at the emergence of Metazoa. Journal of Experimental Biology,
760 218:515–525.
761 [Moran et al., 2014] Moran, Y., Fredman, D., Praher, D., Li, X. Z., Wee, L. M., Rentzsch, F., Zamore,
762 P. D., Technau, U., and Seitz, H. (2014). Cnidarian microRNAs frequently regulate targets by
763 cleavage. Genome Research, 24(4):651–663.
764 [Moroz, 2015] Moroz, L. L. (2015). Convergent evolution of neural systems in ctenophores. Journal
765 of Experimental Biology, 218:598–611.
766 [Moroz et al., 2014] Moroz, L. L., Kocot, K. M., Citarella, M. R., Dosung, S., Norekian, T. P., Po-
767 volotskaya, I. S., Grigorenko, A. P., Dailey, C., Berezikov, E., Buckley, K. M., Ptitsyn, A., Reshetov,
768 D., Mukherjee, K., Moroz, T. P., Bobkova, Y., Yu, F., Kapitonov, V. V., Jurka, J., Bobkov, Y. V.,
769 Swore, J. J., Girardo, D. O., Fodor, A., Gusev, F., Sanford, R., Bruders, R., Kittler, E., Mills, C. E.,
770 Rast, J. P., Derelle, R., Solovyev, V. V., Kondrashov, F. A., Swalla, B. J., Sweedler, J. V., Rogaev,
771 E. I., Halanych, K. M., and Kohn, A. B. (2014). The ctenophore genome and the evolutionary
772 origins of neural systems. Nature, 510(7503):109–114.
773 [Nickel, 2010] Nickel, M. (2010). Evolutionary emergence of synaptic nervous systems: what can we
774 learn from the non-synaptic, nerveless Porifera? Invertebrate Biology, 129(1):1–16.
775 [Nosenko et al., 2013] Nosenko, T., Schreiber, F., Adamska, M., Adamski, M., Eitel, M., Hammel,
776 J., Maldonado, M., Müller, W. E. G., Nickel, M., Schierwater, B., Vacelet, J., Wiens, M., and
777 Wörheide, G. (2013). Deep metazoan phylogeny: when different genes tell different stories. Molecular
778 phylogenetics and evolution, 67(1):223–33.
779 [Pérez-Porro et al., 2013] Pérez-Porro, A. R., Navarro-Gómez, D., Uriz, M. J., and Giribet, G. (2013).
780 A NGS approach to the encrusting Mediterranean sponge Crella elegans (Porifera, Demospongiae,
781 Poecilosclerida): transcriptome sequencing, characterization and overview of the gene expression
782 along three life cycle stages. Molecular ecology resources, 454:494–509.
783 [Pertea et al., 2015] Pertea, M., Pertea, G. M., Antonescu, C. M., Chang, T.-C., Mendell, J. T., and
784 Salzberg, S. L. (2015). StringTie enables improved reconstruction of a transcriptome from RNA-seq
785 reads. Nature Biotechnology, 33(3).
786 [Petersen et al., 2011] Petersen, T. N., Brunak, S., von Heijne, G., and Nielsen, H. (2011). SignalP
787 4.0: discriminating signal peptides from transmembrane regions. Nature methods, 8(10):785–6.
788 [Philippe et al., 2011] Philippe, H., Brinkmann, H., Lavrov, D. V., Littlewood, D. T. J., Manuel,
789 M., Wörheide, G., and Baurain, D. (2011). Resolving difficult phylogenetic questions: why more
790 sequences are not enough. PLoS biology, 9(3):e1000602.
791 [Philippe et al., 2009] Philippe, H., Derelle, R., Lopez, P., Pick, K., Borchiellini, C., Boury-Esnault,
792 N., Vacelet, J., Renard, E., Houliston, E., Quéinnec, E., Da Silva, C., Wincker, P., Le Guyader, H.,
793 Leys, S., Jackson, D. J., Schreiber, F., Erpenbeck, D., Morgenstern, B., Wörheide, G., and Manuel,
794 M. (2009). Phylogenomics revives traditional views on deep animal relationships. Current biology :
795 CB, 19(8):706–12.
796 [Pick et al., 2010] Pick, K. S., Philippe, H., Schreiber, F., Erpenbeck, D., Jackson, D. J., Wrede, P.,
797 Wiens, M., Alié, A., Morgenstern, B., Manuel, M., and Wörheide, G. (2010). Improved phylogenomic
798 taxon sampling noticeably affects nonbilaterian relationships. Molecular Biology and Evolution,
799 27(9):1983–1987.
800 [Pisani et al., 2015] Pisani, D., Pett, W., Dohrmann, M., Feuda, R., Rota-Stabelli, O., Philippe, H.,
801 Lartillot, N., and Wörheide, G. (2015). Genomic data do not support comb jellies as the sister group
802 to all other animals. Proceedings of the National Academy of Sciences, 112(50):201518127.
803 [Ponce et al., 2016] Ponce, D., Brinkman, D. L., Potriquet, J., and Mulvenna, J. (2016). Tentacle tran-
804 scriptome and venom proteome of the pacific sea nettle, Chrysaora fuscescens (Cnidaria: Scyphozoa).
805 Toxins, 8(4).
806 [Pratlong et al., 2015] Pratlong, M., Haguenauer, A., Chabrol, O., Klopp, C., Pontarotti, P., and
807 Aurelle, D. (2015). The red coral (Corallium rubrum) transcriptome: a new resource for population
808 genetics and local adaptation studies. Molecular Ecology Resources, 15(5):1205–1215.
809 [Price et al., 2010] Price, M. N., Dehal, P. S., and Arkin, A. P. (2010). FastTree 2 - Approximately
810 maximum-likelihood trees for large alignments. PLoS ONE, 5(3).
811 [Putnam et al., 2008] Putnam, N. H., Butts, T., Ferrier, D. E. K., Furlong, R. F., Hellsten, U.,
812 Kawashima, T., Robinson-Rechavi, M., Shoguchi, E., Terry, A., Yu, J.-K., Benito-Gutiérrez, E. L.,
813 Dubchak, I., Garcia-Fernàndez, J., Gibson-Brown, J. J., Grigoriev, I. V., Horton, A. C., de Jong,
814 P. J., Jurka, J., Kapitonov, V. V., Kohara, Y., Kuroki, Y., Lindquist, E., Lucas, S., Osoegawa,
815 K., Pennacchio, L. A., Salamov, A. A., Satou, Y., Sauka-Spengler, T., Schmutz, J., Shin-I, T., Toy-
816 oda, A., Bronner-Fraser, M., Fujiyama, A., Holland, L. Z., Holland, P. W. H., Satoh, N., and
817 Rokhsar, D. S. (2008). The amphioxus genome and the evolution of the chordate karyotype. Nature,
818 453(7198):1064–71.
819 [Qiu et al., 2015] Qiu, F., Ding, S., Ou, H., Wang, D., Chen, J., and Miyamoto, M. M. (2015). Tran-
820 scriptome changes during the life cycle of the red sponge, Mycale phyllophila (Porifera, Demospon-
821 giae, Poecilosclerida). Genes, 6(4):1023–1052.
822 [Ramoino et al., 2010] Ramoino, P., Ledda, F. D., Ferrando, S., Gallus, L., Bianchini, P., Diaspro, A.,
823 Fato, M., Tagliafierro, G., and Manconi, R. (2010). Metabotropic γ-aminobutyric acid (GABAB)
824 receptors modulate feeding behavior in the calcisponge Leucandra aspera. Journal of Experimental
825 Zoology Part A: Ecological Genetics and Physiology, 313A(3):132–140.
826 [Riesgo et al., 2014a] Riesgo, A., Farrar, N., Windsor, P. J., Giribet, G., and Leys, S. P. (2014a). The
827 Analysis of Eight Transcriptomes from All Poriferan Classes Reveals Surprising Genetic Complexity
828 in Sponges. Molecular biology and evolution.
829 [Riesgo et al., 2014b] Riesgo, A., Peterson, K., Richardson, C., Heist, T., Strehlow, B., McCauley,
830 M., Cotman, C., Hill, M., and Hill, A. (2014b). Transcriptomic analysis of differential host gene
831 expression upon uptake of symbionts: a case study with Symbiodinium and the major bioeroding
832 sponge Cliona varians. BMC genomics, 15(1):376.
833 [Ryan et al., 2010] Ryan, J. F., Pang, K., Mullikin, J. C., Martindale, M. Q., and Baxevanis,
834 A. D. (2010). The homeodomain complement of the ctenophore Mnemiopsis leidyi suggests that
835 Ctenophora and Porifera diverged prior to the ParaHoxozoa. EvoDevo, 1(1):9.
836 [Ryan et al., 2013] Ryan, J. F., Pang, K., Schnitzler, C. E., Nguyen, A.-D., Moreland, R. T.,
837 Simmons, D. K., Koch, B. J., Francis, W. R., Havlak, P., Smith, S. A., Putnam, N. H., Haddock,
838 S. H. D., Dunn, C. W., Wolfsberg, T. G., Mullikin, J. C., Martindale, M. Q., Baxevanis, A. D.,
839 and NISC Comparative Sequencing Program (2013). The Genome of the Ctenophore Mnemiopsis leidyi and
840 Its Implications for Cell Type Evolution. Science, 342(6164):1242592.
841 [Ryu et al., 2016] Ryu, T., Seridi, L., Moitinho-Silva, L., Oates, M., Liew, Y. J., Mavromatis, C.,
842 Wang, X., Haywood, A., Lafi, F. F., Kupresanin, M., Sougrat, R., Alzahrani, M. A., Giles, E.,
843 Ghosheh, Y., Schunter, C., Baumgarten, S., Berumen, M. L., Gao, X., Aranda, M., Foret, S.,
844 Gough, J., Voolstra, C. R., Hentschel, U., and Ravasi, T. (2016). Hologenome analysis of two
845 marine sponges with different microbiomes. BMC genomics, 17(1):158.
846 [Sáez et al., 2009] Sáez, A. G., Lozano, E., and Zaldívar-Riverón, A. (2009). Evolutionary history of
847 Na,K-ATPases and their osmoregulatory role. Genetica, 136(3):479–490.
848 [Sahlin et al., 2014] Sahlin, K., Vezzi, F., Nystedt, B., Lundeberg, J., and Arvestad, L. (2014). BESST
849 - Efficient scaffolding of large fragmented assemblies. BMC Bioinformatics, 15(1):281.
850 [Sara et al., 2001] Sara, M., Sara, A., Nickel, M., and Brümmer, F. (2001). Three New Species
851 of Tethya (Porifera: Demospongia) from German Aquaria. Stuttgarter Beiträge zur Naturkunde,
852 631(15S):1–16.
853 [Schierwater et al., 2009] Schierwater, B., Kolokotronis, S., Eitel, M., and DeSalle, R. (2009). The
854 Diploblast-Bilateria Sister hypothesis. Communicative & Integrative Biology, 2(5):1–3.
855 [Schuler et al., 2015] Schuler, A., Schmitz, G., Reft, A., Ozbek, S., Thurm, U., and Bornberg-Bauer,
856 E. (2015). The rise and fall of TRP-N, an ancient family of mechanogated ion channels, in Metazoa.
857 Genome Biology and Evolution, 7(6):1–27.
858 [Simakov et al., 2015] Simakov, O., Kawashima, T., Marlétaz, F., Jenkins, J., Koyanagi, R., Mitros,
859 T., Hisata, K., Bredeson, J., Shoguchi, E., Gyoja, F., Yue, J.-X., Chen, Y.-C., Freeman, R. M.,
860 Sasaki, A., Hikosaka-Katayama, T., Sato, A., Fujie, M., Baughman, K. W., Levine, J., Gonzalez,
861 P., Cameron, C., Fritzenwanker, J. H., Pani, A. M., Goto, H., Kanda, M., Arakaki, N., Yamasaki,
862 S., Qu, J., Cree, A., Ding, Y., Dinh, H. H., Dugan, S., Holder, M., Jhangiani, S. N., Kovar, C. L.,
863 Lee, S. L., Lewis, L. R., Morton, D., Nazareth, L. V., Okwuonu, G., Santibanez, J., Chen, R.,
864 Richards, S., Muzny, D. M., Gillis, A., Peshkin, L., Wu, M., Humphreys, T., Su, Y.-H., Putnam,
865 N. H., Schmutz, J., Fujiyama, A., Yu, J.-K., Tagawa, K., Worley, K. C., Gibbs, R. A., Kirschner,
866 M. W., Lowe, C. J., Satoh, N., Rokhsar, D. S., and Gerhart, J. (2015). Hemichordate genomes and
867 deuterostome origins. Nature, pages 1–19.
868 [Simakov et al., 2013] Simakov, O., Marletaz, F., Cho, S.-J., Edsinger-Gonzales, E., Havlak, P., Hell-
869 sten, U., Kuo, D.-H., Larsson, T., Lv, J., Arendt, D., Savage, R., Osoegawa, K., de Jong, P.,
870 Grimwood, J., Chapman, J. A., Shapiro, H., Aerts, A., Otillar, R. P., Terry, A. Y., Boore, J. L.,
871 Grigoriev, I. V., Lindberg, D. R., Seaver, E. C., Weisblat, D. A., Putnam, N. H., and Rokhsar, D. S.
872 (2013). Insights into bilaterian evolution from three spiralian genomes. Nature, 493(7433):526–31.
873 [Simão et al., 2015] Simão, F. A., Waterhouse, R. M., Ioannidis, P., and Kriventseva, E. V. (2015).
874 BUSCO: assessing genome assembly and annotation completeness with single-copy orthologs.
875 Bioinformatics, 31(19):3210–3212.
876 [Simion et al., 2017] Simion, P., Philippe, H., Baurain, D., Jager, M., Richter, D. J., Di Franco, A.,
877 Roure, B., Satoh, N., Quéinnec, É., Ereskovsky, A., Lapébie, P., Corre, E., Delsuc, F., King, N.,
878 Wörheide, G., and Manuel, M. (2017). A Large and Consistent Phylogenomic Dataset Supports
879 Sponges as the Sister Group to All Other Animals. Current Biology, pages 1–10.
880 [Smith et al., 2011] Smith, S. M. E., Morgan, D., Musset, B., Cherny, V. V., Place, A. R., Hastings,
881 J. W., and DeCoursey, T. E. (2011). Voltage-gated proton channel in a dinoflagellate. Proceedings
882 of the National Academy of Sciences, 108(44):18162–18167.
883 [Sreedharan et al., 2010] Sreedharan, S., Shaik, J. H. A., Olszewski, P. K., Levine, A. S., Schiöth, H. B.,
884 and Fredriksson, R. (2010). Glutamate, aspartate and nucleotide transporters in the SLC17 family
885 form four main phylogenetic clusters: evolution and tissue expression. BMC genomics, 11(iii):17.
886 [Srivastava et al., 2008] Srivastava, M., Begovic, E., Chapman, J., Putnam, N. H., Hellsten, U.,
887 Kawashima, T., Kuo, A., Mitros, T., Salamov, A., Carpenter, M. L., Signorovitch, A. Y., Moreno,
888 M. A., Kamm, K., Grimwood, J., Schmutz, J., Shapiro, H., Grigoriev, I. V., Buss, L. W., Schier-
889 water, B., Dellaporta, S. L., and Rokhsar, D. S. (2008). The Trichoplax genome and the nature of
890 placozoans. Nature, 454(7207):955–60.
891 [Srivastava et al., 2010] Srivastava, M., Simakov, O., Chapman, J., Fahey, B., Gauthier, M. E. A.,
892 Mitros, T., Richards, G. S., Conaco, C., Dacre, M., Hellsten, U., Larroux, C., Putnam, N. H.,
893 Stanke, M., Adamska, M., Darling, A., Degnan, S. M., Oakley, T. H., Plachetzki, D. C., Zhai, Y.,
894 Adamski, M., Calcino, A., Cummins, S. F., Goodstein, D. M., Harris, C., Jackson, D. J., Leys, S. P.,
895 Shu, S., Woodcroft, B. J., Vervoort, M., Kosik, K. S., Manning, G., Degnan, B. M., and Rokhsar,
896 D. S. (2010). The Amphimedon queenslandica genome and the evolution of animal complexity.
897 Nature, 466(7307):720–6.
898 [Stamatakis, 2014] Stamatakis, A. (2014). RAxML version 8: a tool for phylogenetic analysis and
899 post-analysis of large phylogenies. Bioinformatics, 30(9):1312–1313.
900 [Stanke et al., 2008] Stanke, M., Diekhans, M., Baertsch, R., and Haussler, D. (2008). Using native and
901 syntenically mapped cDNA alignments to improve de novo gene finding. Bioinformatics, 24(5):637–
902 644.
903 [Stein, 1995] Stein, W. D. (1995). The sodium pump in the evolution of animal cells. Philosophical
904 transactions of the Royal Society of London. Series B, Biological sciences, 349(1329):263–9.
905 [Strous et al., 2012] Strous, M., Kraft, B., Bisdorf, R., and Tegetmeyer, H. E. (2012). The bin-
906 ning of metagenomic contigs for microbial physiology of mixed cultures. Frontiers in Microbiology,
907 3(DEC):1–11.
908 [Suga et al., 2013] Suga, H., Chen, Z., de Mendoza, A., Sebé-Pedrós, A., Brown, M. W., Kramer, E.,
909 Carr, M., Kerner, P., Vervoort, M., Sánchez-Pons, N., Torruella, G., Derelle, R., Manning, G., Lang,
910 B. F., Russ, C., Haas, B. J., Roger, A. J., Nusbaum, C., and Ruiz-Trillo, I. (2013). The Capsaspora
911 genome reveals a complex unicellular prehistory of animals. Nature communications, 4:2325.
912 [Versluis et al., 2015] Versluis, D., D’Andrea, M. M., Ramiro Garcia, J., Leimena, M. M., Hugenholtz,
913 F., Zhang, J., Öztürk, B., Nylund, L., Sipkema, D., van Schaik, W., de Vos, W. M., Kleerebezem,
914 M., Smidt, H., and van Passel, M. W. J. (2015). Mining microbial metatranscriptomes for expression
915 of antibiotic resistance genes under natural conditions. Scientific reports, 5(January):11981.
916 [Whelan et al., 2015] Whelan, N. V., Kocot, K. M., Moroz, L. L., and Halanych, K. M. (2015). Error,
917 signal, and the placement of Ctenophora sister to all other animals. Proceedings of the National
918 Academy of Sciences, 112(18):1–6.
[Wray et al., 2015] Wray, G. A. (2015). Molecular clocks and the early evolution of metazoan nervous systems. Philosophical transactions of the Royal Society of London. Series B, Biological sciences, 370(1684):424–431.
[Wu and Watanabe, 2005] Wu, T. D. and Watanabe, C. K. (2005). GMAP: a genomic mapping and alignment program for mRNA and EST sequences. Bioinformatics, 21(9):1859–1875.
[Zakon, 2012] Zakon, H. H. (2012). Adaptive evolution of voltage-gated sodium channels: the first 800 million years. Proceedings of the National Academy of Sciences of the United States of America, 109 Suppl(Supplement 1):10619–25.
[Zapata et al., 2015] Zapata, F., Goetz, F. E., Smith, S. A., Howison, M., Siebert, S., Church, S. H., Sanders, S. M., Ames, C. L., McFadden, C. S., France, S. C., Daly, M., Collins, A. G., Haddock, S. H. D., Dunn, C. W., and Cartwright, P. (2015). Phylogenomic Analyses Support Traditional Relationships within Cnidaria. PLoS ONE, 10(10):e0139068.


1 Supplemental Figures

[Figure: "Raw reads coverage vs GC". x-axis: Coverage (100 to 500); y-axis: GC% (0 to 100); color scale: Counts (1 to 1e+06). Annotated regions: Haploid, Diploid, Repeats, Bacteria.]

Supplemental Figure 1: Coverage vs. GC content for reads


Lava lamp plot of the unfiltered paired-end reads. Coverage was calculated as the median 31-mer coverage
for each read. High-GC reads indicate the presence of bacteria in the raw reads.
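
The per-read statistic described above can be sketched as follows. This is a minimal illustration, not the pipeline used for the figure: the function names are hypothetical, k-mers are counted on the forward strand only (no canonical reverse-complement handling), and the counting is done in memory, which would only be practical for small read sets.

```python
from collections import Counter
from statistics import median


def count_kmers(reads, k=31):
    """Count every k-mer across all reads (forward strand only, for brevity)."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts


def median_kmer_coverage(read, counts, k=31):
    """Median count of the k-mers in one read; None if the read is shorter than k."""
    kmers = [read[i:i + k] for i in range(len(read) - k + 1)]
    if not kmers:
        return None
    return median(counts[kmer] for kmer in kmers)


def gc_percent(read):
    """Percent of G and C bases in a read."""
    if not read:
        return 0.0
    gc = sum(1 for base in read.upper() if base in "GC")
    return 100.0 * gc / len(read)
```

Each read then contributes one (coverage, GC%) point to the plot; binning those points on a grid gives the count scale shown in the figure.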


[Figure: "T.wilhelma Moleculo contigs coverage vs GC". x-axis: Coverage (100 to 400); y-axis: GC% (0 to 100); color scale: Counts (1 to 100).]

Supplemental Figure 2: Coverage vs. GC content for genomic scaffolds


Heat map of percent GC versus read coverage for all Moleculo reads.


[Figure: x-axis: Coverage (50 to 300); y-axis: %GC (0 to 100); color scale: Counts (1 to 10).]

Supplemental Figure 3: Coverage vs. GC content for genomic scaffolds


Scatterplot of percent GC versus read coverage for all scaffolds. The 1,040 contigs with zero coverage
are carried over from low-coverage Moleculo reads, and likely derive from amplified contaminating DNA.


[Figure: scatterplot with axis ticks 0 to 300 (horizontal) and 20 to 80 (vertical). Legend: Non-Rhizobium 1, Non-Rhizobium 2, Non-Rhodobacter 1, Non-Rhodobacter 2, Non-Rhodobacter 3, Non-Rhodobacter 4, Possible Rhodospirales, Sponge contigs.]

Supplemental Figure 4: Separation of contigs derived from bacterial symbionts


Sponge scaffolds are in green, while scaffolds assigned to bacteria are shown in blue (Rhizobiales) and
pink (Rhodobacter). Low-coverage contigs were removed. Seven bins were identified with MetaWatt to
separate the bacterial contigs, though several bins appeared to correspond to the same bacteria.
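
As a rough illustration of the low-coverage filtering step mentioned above, the sketch below keeps only contigs at or above a coverage cutoff and records their GC content for plotting. The cutoff value, the function names, and the input dictionaries are assumptions made for this example; the threshold actually used is not stated here, and the binning itself was done with MetaWatt rather than with code like this.

```python
def gc_percent(sequence):
    """Percent of G and C bases in a contig sequence."""
    if not sequence:
        return 0.0
    seq = sequence.upper()
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)


def filter_contigs(sequences, coverage, min_coverage=10.0):
    """Return {name: (gc_percent, coverage)} for contigs at or above min_coverage.

    sequences: {contig name: nucleotide string}
    coverage:  {contig name: mean read coverage}; missing names count as zero.
    min_coverage is a hypothetical cutoff chosen only for illustration.
    """
    kept = {}
    for name, seq in sequences.items():
        cov = coverage.get(name, 0.0)
        if cov >= min_coverage:
            kept[name] = (gc_percent(seq), cov)
    return kept
```

The retained (GC%, coverage) pairs are the kind of values a GC-versus-coverage scatterplot is drawn from; sponge and bacterial contigs tend to separate along both axes.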


[Figure: x-axis: Genes ordered by identity with Aqu (0 to 500); y-axis: Percent Identity (20 to 100). Legend: Twi to A.queenslandica, Twi to S.ciliatum, Twi to N.vectensis, Twi to H.sapiens.]

Supplemental Figure 5: Percent identity between sponges


Calculated protein percent identity for 570 one-to-one orthologs between T. wilhelma and A. queenslandica
(blue), S. ciliatum (green), the anemone N. vectensis (purple) and human (red). Average identity
between T. wilhelma and A. queenslandica is shown as the dotted black line at 57%.
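
A percent-identity calculation of this kind can be sketched as below. The sketch assumes each ortholog pair has already been aligned and that identity is computed over columns where neither sequence has a gap; the denominator used for the figure is not specified, so that choice, like the function names, is an assumption.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of identical residues over columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # skip gapped columns (an assumption, see lead-in)
        compared += 1
        if a == b:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


def mean_identity(pairs):
    """Average percent identity over a list of (aligned_a, aligned_b) pairs."""
    if not pairs:
        raise ValueError("no ortholog pairs given")
    values = [percent_identity(a, b) for a, b in pairs]
    return sum(values) / len(values)
```

Sorting the per-gene values for one species pair and plotting the other pairs in the same gene order gives the layout described in the caption.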


[Figure: histogram. x-axis: Number of genes in block (ticks 3 to 21); y-axis: Number of blocks (0 to 500). Annotations: 750 total blocks; 3400 Tethya genes in blocks. Bar labels, left to right: 513, 84, 53, 34, 17, 9, 14, 8, 7, 2, 3, 1, 0, 1, 1, 1, 0, 1, 0, 0, 0.]


Supplemental Figure 6: Length of microsyntenic blocks
Histogram of number of genes in detected microsyntenic blocks between T. wilhelma and A. queenslandica
v2.0 gene models.
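
The tally behind such a histogram is simple to express. In the sketch below, each block is assumed to be a list of gene identifiers from one species; the representation and the function name are hypothetical, and block detection itself is not shown.

```python
from collections import Counter


def block_size_histogram(blocks):
    """Return ({block size: number of blocks}, total genes across all blocks)."""
    sizes = Counter(len(block) for block in blocks)
    total_genes = sum(len(block) for block in blocks)
    return dict(sizes), total_genes
```

For example, two blocks of three genes and one block of four genes give `{3: 2, 4: 1}` and a total of 10 genes.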


[Figure: phylogenetic tree. Labeled clades: Dopamine β-hydroxylase, DBH-like 1, DBH-like 2. Taxon legend: Placozoans, Cnidarians, Homoscleromorphs, Hexactinellids, Demosponges. Scale bar: 0.3.]

Supplemental Figure 7: Dopamine β-hydroxylase homologs across metazoans


Tree of DBH and DBH-like proteins across all metazoan groups, generated with RAxML using the
PROTGAMMALG model. Bootstrap values are 100 unless otherwise shown.
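
For orientation, a RAxML run with this model could be launched along the following lines. Only the PROTGAMMALG model comes from the caption; the rapid-bootstrap mode (`-f a`), the number of replicates, the random seeds, the executable name, and the file names are all assumptions chosen for illustration, and the settings actually used for these trees may differ.

```python
import shlex


def raxml_command(alignment, run_name, model="PROTGAMMALG",
                  bootstraps=100, seed=12345, executable="raxmlHPC"):
    """Build a RAxML argument list (hypothetical settings, see lead-in)."""
    return [
        executable,
        "-f", "a",              # rapid bootstrapping plus best-scoring ML tree search
        "-m", model,            # substitution model named in the caption
        "-p", str(seed),        # random seed for parsimony starting trees
        "-x", str(seed),        # random seed for rapid bootstrapping
        "-#", str(bootstraps),  # number of bootstrap replicates
        "-s", alignment,        # input alignment file
        "-n", run_name,         # suffix for output file names
    ]


if __name__ == "__main__":
    # Print the command instead of running it; pass the list to
    # subprocess.run(...) to execute it where RAxML is installed.
    print(shlex.join(raxml_command("dbh_alignment.phy", "dbh_tree")))
```

Building the argument list separately from execution keeps the settings easy to inspect and to record alongside the resulting tree.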


[Figure: phylogenetic tree. Labeled clades: Plants, Sponges, Placozoans, Cnidarians, Aromatic amino acid Decarboxylase, Hydrozoan AADC, Invertebrate AADC-like, Invertebrate HDC-like, Histidine Decarboxylase. Taxon legend: Homoscleromorphs, Calcarea, Placozoans, Cnidarians, Protostomes, Echinoderm/hemichordate, Chordates, Vertebrates, Plants. Scale bar: 0.5.]

Supplemental Figure 8: Aromatic amino acid decarboxylase homologs across metazoans


Tree of AADC and histidine decarboxylase (HDC) proteins across all metazoan groups, generated with
RAxML using the PROTGAMMALG model. Bootstrap values are 100 unless otherwise shown.


[Figure: phylogenetic tree. Labeled clades: Monoamine Oxidase A/B, MAO, MAO-like, Cnidarian and sponge group, Capsaspora and cnidarian group, S. arctica, Plant Lysine Demethylase, Primary amine Oxidase (PAOX), PAOX-like, Lysine-specific Histone Demethylase (KDM1), KDM1A, KDM1B, Ctenophore, Placozoan. Taxon legend: Demosponges, Homoscleromorphs, Calcarea, Ctenophores, Placozoans, Cnidarians, Protostomes, Chordates, Vertebrates, Plants and other outgroups. Scale bar: 0.8.]

Supplemental Figure 9: Monoamine oxidase homologs across metazoans


Tree of monoamine oxidase (MAO) and related proteins across all metazoan groups, generated with RAxML
using the PROTGAMMALG model. Searches for lysine demethylase (KDM1) and primary amine oxidase
(PAOX) were not exhaustive; these proteins were added to show the only positions of ctenophores (which have only KDM1)
and placozoans (which have only PAOX) in this protein family. Bootstrap values are 100 unless otherwise shown.


[Figure: phylogenetic tree. Labeled clades: Filasterea, Hexactinellids, Demosponges, Ctenophores, Homoscleromorph, Cnidarians, Protostome GABAR-B3, GABAR-B2, GABAR-B1, Placozoans. Taxon legend: Demosponges, Oscarella carmela, Hexactinellids, Ctenophores, Placozoans, Cnidarians, Protostomes, Chordates, Vertebrates, Capsaspora (Filasterean outgroup). Scale bar: 0.4.]

Supplemental Figure 10: mGABA receptors across metazoans


Complete version of the Figure 4 metabotropic GABA receptor (GABA-B type) protein tree, generated with
RAxML. Bootstrap values are 100 unless otherwise shown. The majority of deeper nodes were poorly
resolved; branch transparency corresponds to bootstrap support for values under 50, meaning half-transparent.


246 268 287 367 395 446


GABR1_HUMAN - P GC S S V - - YGS S S P AL S F R TH P S A GL F Y E TE A L I G - WY A GG F Q E A P L
GABR1_MOUSE - P GC S S V - - YGS S S P AL S F R TH P S A GL F Y E TE A L I G - WY A GG F Q E A P L
F1QAJ3_DANRE - P GC S S V - - YGS S S P AL S F R TH P S A GL F Y E TE A L I G - WY A GG F Q E A P L
Dmel_NP_001246033.1__GABA-B1C - AGC S T V - - YGAS S P AL S F R TH P S A GL F Y VVAA F I G - WY E EG YQE A P L
Amel_4.5|GB49131 - AGC S T V - - YGAS S P AL S F R TH P S A GL F Y VVAA F I G - WY E EG YQE A P L
C3Z4V0_BRAFL - TG C S S V - - YGS S S P AL S F R TH P S A G V F Y EDMA I I G - WY P PGY P E AP L
B3VBI8_CAEEL__gbb-1 - TG C S P V - - Y GG S S P A L S F R TH P S A GL F Y V TE A F I G - WY A GG F P E A P L
GABR2_HUMAN GG VC P S V - - F AA T TP VL A F R TV P SD GQ F DQN MA I P G - WY E G P S K F HG Y
GABR2_MOUSE GG VC P S V - - F AA T TP VL A F R TV P SD GQ F DQN MA I P G - WY E G P S K F HG Y
Q1LUN9_DANRE GG VC P S V - - F AA T TP VL A F R TV P SD GQ F D E N L A I P G - WY Q E TS K F HG F
C3YEC0_BRAFL G G TC S S V - - YDV TS P E F S F R TV P SD G L F D E RM A L NG - P Y S R TN P F H G Y
Dmel_NP_001287456.1__GABA-B2C G A A C TH V - - Y A D TH P M F T F RVVP SE GN F N E H F A I MA - T Y S E Y S R F HG Y
Amel_4.5|GB49239 G A A C TH V - - Y A D TH P M F T F RVVP SE G N F N E TWA I MG - T Y T E Y S R F HG Y
Hoilungia_stringtie|15673.1|m.20278 G SG Y S S V - - YGA TS P S L S L R T I P SD A S F Y E K EG L L G - WL S D TYG Y SN F
Triad1_g1739.t1__scaffold_2 G SG Y S S V - - YGA TS P S L S L R T I P SD A S F Y E K EG F L G - WL S E S YG Y SN F
Hoilungia_stringtie|6503.1|m.8445 G AG Y S S V - - Y S A TS P VL S FRT I ESE A S F Y E D TG L I G - WL E QQD I Y A A F
Triad1_g6057.t1__scaffold_6 G AG Y S S V - - Y SASSP VL S FRT I ESE A S F Y E D TG L I G - WL E K RD L Y A A F
Hoilungia_stringtie|6504.1|m.8438 G AD F S T V - - FGS TS P VL S Y R TV S SD AN F Y E D L G L L G - WL Q QP LGY Y P F
G5ECB2_CAEEL__gbb-2 GGQC T E V - - Y A E TH A K F A F RVVPGS VD VD E E MA L PG - YH S P N N TWR G Y
[Alignment rows (figure residue) for cnidarian, placozoan, bilaterian, sponge, ctenophore, and filasterean (CAOG) mGABAR homologs; the highlighted binding-pocket columns are described in the caption below.]

Supplemental Figure 11: Binding pocket alignment of mGABARs across metazoans


Select residues involved in the binding of GABA are highlighted, and numbers correspond to the position
in the human protein GABR1/GABAR-B1, based on the structure of GABR1 [Geng et al., 2013]. Highly
conserved residues not thought to be involved in binding are highlighted in blue. The two human proteins
are highlighted in pink. Placozoan proteins are highlighted in gray. Sponge proteins are highlighted in blue.
The two ctenophore proteins are highlighted in violet.


[Figure: phylogenetic tree of ionotropic glutamate receptors. Labeled clades: Plants, Sponges, Ctenophores, Cnidarians, Placozoans, NMDAR, DeltaR, Clumsy, KainateR, AMPAR. Ligand annotations: Glycine, Acidic ligand, Unknown ligand. Color legend: Ctenophores, Placozoans, Cnidarians, Protostomes, Chordates, Vertebrates, Plants. Scale bar: 0.5.]

Supplemental Figure 12: Phylogenetic tree of ionotropic glutamate receptors across metazoans
Protein tree generated with RAxML using the PROTCATWAG model. Bootstrap values are 100 unless
otherwise shown. These receptors are not found in the genomes or transcriptomes of demosponges or
hexactinellids, so the Sponge clade refers to calcareous sponges and homoscleromorphs. For the three sponges,
the blue star indicates sequences derived from a transcriptome. Based on [Alberstein et al., 2015], some
receptors are predicted to bind ligands other than glutamate, shown with the green star, red diamond, and
black star, for glycine, acidic ligands, and unknown, respectively. Four placozoan proteins have substitutions
at the conserved acidic residue (D723 in human GluN1), as either GY in ctenophores, or GG/WY in
placozoans; the carboxyl of the glutamic/aspartic acid is needed to coordinate the amino group of glutamate,
suggesting that these proteins do not bind an α-amino acid.
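
The argument about the conserved acidic residue can be illustrated with a small sketch. The Python below assumes a protein alignment held as a plain dictionary of equal-length gapped strings. The function names, the toy sequences, and the reference position are all hypothetical and are not taken from the alignment behind this figure. It maps an ungapped position in a reference sequence to its alignment column, then reports every sequence whose residue in that column is neither D nor E.

```python
# Hypothetical sketch: flag sequences lacking an acidic residue (D or E)
# at the alignment column of a chosen reference position.
ACIDIC = {"D", "E"}


def column_for_position(aligned_ref: str, position: int) -> int:
    """Return the 0-based alignment column that holds the given
    1-based ungapped residue position of the reference sequence."""
    seen = 0
    for col, char in enumerate(aligned_ref):
        if char != "-":
            seen += 1
            if seen == position:
                return col
    raise ValueError("position lies beyond the end of the reference")


def lacks_acidic_residue(alignment: dict, ref_name: str, position: int) -> dict:
    """Map sequence name -> residue for sequences that do not carry
    D or E at the column of the reference position."""
    col = column_for_position(alignment[ref_name], position)
    return {
        name: seq[col]
        for name, seq in alignment.items()
        if seq[col] not in ACIDIC
    }


# Toy alignment (invented sequences, for illustration only).
toy_alignment = {
    "ref": "MK-DLT",   # ungapped position 3 is D
    "seqA": "MKAELT",  # E at the same column: still acidic
    "seqB": "MK-GLT",  # G: substitution
    "seqC": "MR-WLT",  # W: substitution
}
flagged = lacks_acidic_residue(toy_alignment, "ref", 3)
print(flagged)
```

With a real alignment, the reference would be the human GluN1 row and the position 723, provided that the numbering of the sequence record matches the numbering used in the caption.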

[Figure: domain diagrams on a scale of 0 to 1500 amino acids.]
Human NMDA-class iGluRs: NMD3A_HUMAN, NMDE1_HUMAN
Human AMPA-class iGluRs: GRIA2_HUMAN
Ctenophore iGluRs: euplokamis_t_h30_09966_comp8826_c0_seq1, ML032222a, Pleurobrachia_bachei_AEX15542.1
Calcisponge/Hsm AMPA-like: oscarella_t_h60_42587_comp11410_c0_seq2, scict1.009114.1_0
Plant iGluRs: A.thaliana_NP_187061.1__GluR1.1
Calcisponge/Hsm iGluR-like: S.ciliatum_scpid25297_GLUR3.1, oscarella_t_h60_06601_comp6249_c0_seq1, L.complicata_lctid34850
Demosponge mGluR-like: Aqu2.1.29334_001, E.muelleri−Emu1128−0.9−mRNA−1, twi_ss.7125c.5|m.10167
Domain legend: Signal peptide, ANF_receptor, 7-transmembrane, bac SBP type 3, Ligand channel, NCD3G, NMDAR2_C

Supplemental Figure 13: Domain organization of ionotropic glutamate receptors across metazoans
Scale bar displays number of amino acids. Top BLAST hits for human iGluRs in demosponges appear to
be metabotropic, due to the presence of a 7-transmembrane domain instead of the ion channel, while the
ligand-binding domain is conserved. Ctenophore iGluRs and some calcisponge/homoscleromorph (Hsm)
proteins have the vertebrate-type domain organization, though the other calcisponge/homoscleromorph proteins
(main sponge group in Supplemental Figure 12) have an SBP domain.
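
The distinction drawn in this caption, ion channel versus 7-transmembrane domain, can be written as a simple rule. The following Python is a minimal sketch under stated assumptions: the domain labels are the ones in the figure legend, while the function name, the output categories, and the example domain lists are hypothetical and do not reproduce the actual annotation of any protein in the figure.

```python
# Hypothetical sketch: call a receptor ionotropic-like or metabotropic-like
# from the set of domain labels annotated on it.
from typing import Iterable

ION_CHANNEL = "Ligand channel"   # label as used in the figure legend
SEVEN_TM = "7-transmembrane"     # label as used in the figure legend


def classify_receptor(domains: Iterable[str]) -> str:
    """Return a coarse receptor class based on which pore or
    transmembrane domain is present."""
    names = set(domains)
    if ION_CHANNEL in names:
        return "ionotropic-like"
    if SEVEN_TM in names:
        return "metabotropic-like"
    return "unclassified"


# Invented domain lists, for illustration only.
examples = {
    "toy_channel_protein": ["Signal peptide", "ANF_receptor", "Ligand channel"],
    "toy_demosponge_hit": ["Signal peptide", "ANF_receptor", "7-transmembrane"],
    "toy_fragment": ["ANF_receptor"],
}
calls = {name: classify_receptor(doms) for name, doms in examples.items()}
print(calls)
```

The rule checks the channel domain first, so a protein annotated with both domains would be reported as ionotropic-like. That ordering is an arbitrary choice made for the sketch.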


[Figure: phylogenetic tree of SLC17 family proteins. Labeled clades: Nucleotide transporter (SLC17A9); Cnidarian VGLUT-like; Glutamate transporter VGLUT (SLC17A6,7,8); Sialin (SLC17A5); SLC17A4. Color legend: Demosponges, Homoscleromorphs, Calcarea, Hexactinellids, Ctenophores, Placozoans, Cnidarians, Protostomes, Echinoderm/hemichordate, Chordates, Vertebrates, Filastereans, Choanoflagellates. Scale bar: 0.6.]

Supplemental Figure 14: Vesicular glutamate transporter homologs across metazoans


Tree of VGLUT (SLC17A6-8) proteins across all metazoan groups, generated with RAxML using the
PROTGAMMALG model. Bootstrap values are 100 unless otherwise shown.
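
The display convention used in these tree captions, where only bootstrap values below 100 are printed, can be sketched as follows. This Python example assumes a Newick string in which support values are written as internal node labels directly after a closing parenthesis. The toy tree, the function names, and the regular expression are illustrative assumptions and were not used to draw the figures.

```python
# Hypothetical sketch: list the support values that would be printed on a
# tree when values equal to the maximum (100) are left unlabeled.
import re

# An internal node label is taken to be a number that directly follows ")".
SUPPORT_LABEL = re.compile(r"\)(\d+(?:\.\d+)?)")


def support_values(newick: str) -> list:
    """Return all internal-node support values found in the Newick string."""
    return [float(value) for value in SUPPORT_LABEL.findall(newick)]


def values_to_display(newick: str, maximum: float = 100.0) -> list:
    """Return only the support values below the maximum, that is,
    the ones that would be shown on the figure."""
    return [value for value in support_values(newick) if value < maximum]


# Toy tree (invented), with two supported internal nodes.
toy_tree = "((A:0.1,B:0.2)98:0.05,(C:0.1,D:0.3)100:0.02,E:0.4);"
all_supports = support_values(toy_tree)
shown = values_to_display(toy_tree)
print(all_supports, shown)
```

A regular expression is enough for this toy case. For real trees a dedicated Newick parser would be the safer choice, since quoted names and bracketed comments are not handled here.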


[Figure: phylogenetic tree of VIAAT and related transporters. Labeled clades: Na-coupled neutral AA transporter (SLC38); Unknown VIAAT-like; Cnidarian-specific VIAAT-like; GABA transporter VIAAT (SLC32A1); Similar to Aromatic and neutral AA transporter (ANTL); H-coupled AA transporter (SLC36). Color legend: Demosponges, Homoscleromorphs, Calcarea, Hexactinellids, Ctenophores, Placozoans, Cnidarians, Protostomes, Echinoderm/hemichordate, Chordates, Vertebrates, Filastereans, Choanoflagellates. Scale bar: 0.6.]

Supplemental Figure 15: Vesicular inhibitory amino acid transporter homologs across metazoans
Tree of VIAAT (SLC32A1) proteins and related transporters across all metazoan groups, generated with
RAxML using the PROTGAMMALG model. Bootstrap values are 100 unless otherwise shown.


[Figure: phylogenetic tree of Piezo homologs. Labeled clades: Red algae, Plants, Choanoflagellates, Ctenophores, Sponges, Placozoans, Cnidarians, Vertebrates, Protostomes. Symbol legend: Transcriptome, Partial protein, Joined protein. Scale bar: 0.4.]

Supplemental Figure 16: Phylogenetic tree of Piezo homologs across metazoans
Protein tree generated with RAxML using the PROTCATWAG model and 100 bootstraps. A yellow circle
indicates that the sequence was complete when joined with other genes or partial sequences; a red partial
circle indicates that the sequence is incomplete in the genome or transcriptome. A blue star indicates that
the sequence was derived from a transcriptome, so copy number cannot be determined with certainty.
Bootstrap values are 100 unless otherwise shown.


(Figure: voltage-gated sodium channel protein tree. Clade labels: Choanoflagellates; Ctenophores; Placozoans; Protostome NaV2; Vertebrate NaV1; Protostome NaV1; Cnidarian NaV Group 2; Anthozoan NaV Group; Cnidarian NaV Group 1. Colour legend: Ctenophores, Placozoans, Cnidarians, Protostomes, Echinoderm/hemichordate, Chordates, Vertebrates, Choanoflagellates. Scale bar: 0.6.)


Supplemental Figure 17: Phylogenetic tree of voltage-gated sodium channel alpha subunits across metazoans
Protein tree generated with RAxML using the PROTGAMMALG model and 100 bootstraps. Bootstrap
values are 100 unless otherwise shown.


(Figure: voltage-gated calcium channel protein tree. Clade labels: Sponges; Ctenophores; N/P/Q-type CaV; Cacophony; L-type CaV; T-type CaV. Colour legend: Demosponges, Homoscleromorphs, Calcarea, Ctenophores, Placozoans, Cnidarians, Protostomes, Chordates, Vertebrates, Choanoflagellates. Scale bar: 0.7.)

Supplemental Figure 18: Phylogenetic tree of voltage-gated calcium channels across metazoans
Protein tree generated with RAxML using the PROTGAMMALG model and 100 bootstraps. Bootstrap
values are 100 unless otherwise shown.


(Figure: circular protein tree of voltage-gated proton channels. Clade labels: Anthozoan Group 1; Anthozoan Group 2; Hydrozoans; Protostomes; Vertebrate; Calcisponges; Demosponges; Ctenophores; Placozoans; Plants; Protists. Scale bar: 0.3.)

Supplemental Figure 19: Phylogenetic tree of voltage-gated proton channels across metazoans
Protein tree generated with FastTree.


2 Supplemental Tables

Supplemental Table 1: Summary statistics of the Tethya wilhelma genome. Holobiont genomic
includes the sponge and all associated bacteria.
Feature                          Type                        Count
Assembly size (Mb)               Sponge only                 126.0
Total gaps (Mb)                  Sponge only                 1.348
Estimated kmer coverage          Sponge only                 131x
Estimated mapping coverage       Sponge only                 159.3x
Number of contigs                All                         6,907
Number of contigs                Sponge only                 6,109
Number of contigs                Bacterial                   789
GC %                             Sponge only                 39.98%
Contig N50 (kb)                  All                         70.7
Contig N50 (kb)                  Sponge only                 73.4
Contig N50 (kb)                  Bacterial                   48.2
Genome size (bp)                 Mitochondrion               19,754
Estimated mapping coverage       Mitochondrion               669.6x
GC %                             Mitochondrion               34.43%
Paired-end reads                 Holobiont genomic 100 bp    259,518,468
Total paired-end bases (Gb)      Holobiont genomic           25.951
Reads aligning back to genome    All contigs                 214,103,768
Mate-pair reads                  Holobiont genomic 125 bp    280,837,536
Total mate-pair bases (Gb)       Holobiont genomic           35.104
Moleculo (TruSeq) long reads     Holobiont genomic           125,150
Total Moleculo bases (Mb)        Holobiont genomic           436.7
Paired-end RNA-seq reads         dUTP Stranded               201,451,574
Paired-end RNA-seq bases (Gb)    dUTP Stranded               25.181
Trinity                          De novo transcripts         127,012
RNA-seq mapping fraction         All contigs                 68.3%
StringTie                        Genome guided transcripts   46,572
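
The read counts and base totals in Supplemental Table 1 can be cross-checked with simple arithmetic. The sketch below derives the mean read length implied by each library (total bases divided by read count) and the fraction of reads aligning back to the assembly. The numbers are copied from the table; treating the 214,103,768 aligned reads as a subset of the 100 bp paired-end genomic library is an assumption, since the table does not name the library.

```python
# Read counts and total bases copied from Supplemental Table 1.
LIBRARIES = {
    "paired-end genomic": (259_518_468, 25.951e9),
    "mate-pair genomic": (280_837_536, 35.104e9),
    "Moleculo long reads": (125_150, 436.7e6),
    "paired-end RNA-seq": (201_451_574, 25.181e9),
}

# Assumption: aligned reads are counted against the paired-end genomic library.
ALIGNED_READS = 214_103_768


def mean_read_length(reads: int, bases: float) -> float:
    """Mean read length in bp implied by a read count and a base total."""
    return bases / reads


def aligned_fraction(aligned: int, total: int) -> float:
    """Fraction of reads in a library that align back to the assembly."""
    return aligned / total


if __name__ == "__main__":
    for name, (reads, bases) in LIBRARIES.items():
        print(f"{name}: {mean_read_length(reads, bases):.1f} bp per read")
    total_reads = LIBRARIES["paired-end genomic"][0]
    print(f"aligned fraction: {aligned_fraction(ALIGNED_READS, total_reads):.3f}")
```

Under these assumptions the implied lengths agree with the stated 100 bp and 125 bp libraries, and the Moleculo reads average roughly 3.5 kb.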


Supplemental Table 2: Summary of splice variation for the genome-guided transcriptome for
T. wilhelma (Twi) and transcript set v2.0 for A. queenslandica (Aqu).
Splice type                    Twi transcripts   Twi events/exons   Aqu transcripts   Aqu events/exons
Cassette exons                 4779              9089               3591              5535
Canonical splicing             5049              -                  2602              -
Skipped exons                  3868              8329               1022              721
Alternative splice acceptor    -                 3747               -                 638
Intron retention               3295              3565               3437              3400
Alternative splice donor       -                 3264               -                 521
Alternative N-terminus         1964              -                  1965              -
Alternative C-terminus         1788              -                  1968              -
Intronic start                 471               -                  571               -
Intronic end                   285               -                  592               -
Non-canonical                  73                -                  135               -
Single exon with variants      246               -                  13                -
No variants                    12088             -                  24027             -
Single exon and no variants    15421             -                  12400             -

Supplemental Table 3: Skipped exon and retained intron frame, for T. wilhelma (Twi) and A.
queenslandica (Aqu). Skipped exons tend to have lengths as multiples of three.
Feature                 Position 1   Position 2   Position 3
Twi Skipped exons       6904         5179         5331
Twi Retained introns    1202         1204         1159
Aqu Skipped exons       2671         1914         1972
Aqu Retained introns    1244         1092         1101
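
One way to quantify the frame bias described in the caption of Supplemental Table 3 is a chi-square goodness-of-fit test of each row against a uniform distribution over the three positions. The sketch below is an illustration only and is not an analysis reported in the source. It assumes that the three "Position" columns are mutually exclusive frame classes and that "Position 1" corresponds to lengths divisible by three, which the table does not state explicitly. The critical value 5.991 is the standard chi-square threshold for 2 degrees of freedom at p = 0.05.

```python
from typing import Sequence

# Counts copied from Supplemental Table 3 (Position 1, Position 2, Position 3).
FRAME_COUNTS = {
    "Twi skipped exons": (6904, 5179, 5331),
    "Twi retained introns": (1202, 1204, 1159),
    "Aqu skipped exons": (2671, 1914, 1972),
    "Aqu retained introns": (1244, 1092, 1101),
}

# Chi-square critical value for 2 degrees of freedom at p = 0.05.
CHI2_CRITICAL_DF2_P05 = 5.991


def chi_square_uniform(counts: Sequence[int]) -> float:
    """Chi-square statistic for observed counts against equal expected counts."""
    expected = sum(counts) / len(counts)
    return sum((c - expected) ** 2 / expected for c in counts)


def position_fractions(counts: Sequence[int]) -> list:
    """Share of the row total falling in each position."""
    total = sum(counts)
    return [c / total for c in counts]


if __name__ == "__main__":
    for name, counts in FRAME_COUNTS.items():
        stat = chi_square_uniform(counts)
        fractions = ", ".join(f"{f:.3f}" for f in position_fractions(counts))
        verdict = "non-uniform" if stat > CHI2_CRITICAL_DF2_P05 else "consistent with uniform"
        print(f"{name}: fractions [{fractions}], chi2 = {stat:.1f} ({verdict})")
```

For T. wilhelma, the skipped-exon row departs strongly from uniformity, with about 40% of the counts in Position 1, while the retained-intron row does not.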
