
Human salivary proteome - a resource of potential biomarkers for oral cancer

Priya Sivadasan1,2, Manoj Kumar Gupta2, Gajanan J. Sathe2, Lavanya Balakrishnan2, Priyanka Palit1, Harsha Gowda2, Amritha Suresh1,3, Moni Abraham Kuriakose1,3* and Ravi Sirdeshmukh2,3*

1. Head and Neck Oncology, Mazumdar Shaw Medical Center, Narayana Health, Bangalore-560099, India
2. Institute of Bioinformatics, International Tech Park, Bangalore-560066, India
3. Mazumdar Shaw Center for Translational Research, Mazumdar Shaw Medical Foundation, Narayana Health, Bangalore-560099, India

*Corresponding Authors

Ravi Sirdeshmukh
Institute of Bioinformatics
Bangalore-560066, India
Tel.: +91 9885090963
E-mail address: ravisirdeshmukh@gmail.com, ravi@ibioinformatics.org

Moni Abraham Kuriakose
Mazumdar Shaw Medical Center, Narayana Health,
Bangalore-560099, India
Tel: +91 9902776000
E-mail address: makuriakose@gmail.com, moni.abraham@ms-mf.org

Abstract

Proteins present in human saliva offer immense potential for clinical applications. However, exploring the salivary proteome is technically challenging because amylase and albumin are present in high abundance. In this study, we used four workflows to analyze saliva from healthy individuals; these involved depletion of abundant proteins using affinity-based separation methods, followed by protein or peptide fractionation and high-resolution mass spectrometry analysis. We identified a total of 1,256 human salivary proteins, 292 of them reported for the first time. All identifications were checked for proteins/peptides shared with the salivary microbiome that could conflict with the human protein identifications. On integrating our results with previously reported analyses, we arrived at an updated human salivary proteome containing 3,449 proteins, 808 of which have been reported as differentially expressed in oral cancer tissues. The secretory nature of 598 of these 808 proteins is supported by the presence of a signal sequence, a transmembrane domain or association with exosomes. From this subset, we provide a priority list of 139 proteins along with their proteotypic peptides, which may serve as a reference for targeted investigation of secretory markers for clinical applications in oral malignancies.

Key words: Saliva; Salivary Proteome; Proteomics; Mass Spectrometry; Oral cancer

1. Introduction

Human saliva is a complex biological fluid that bathes the oral cavity and is critical to the preservation and maintenance of oral health [1, 2]. Composed of more than 99% water [3], it contains secretions from the salivary glands (parotid, submandibular, sublingual and minor salivary glands) and non-salivary components including gingival crevicular fluid, nasal and bronchial secretions, blood derivatives, desquamated epithelial linings, food components and micro-organisms [2]. The chemical composition of saliva, which primarily includes proteins, peptides, nucleic acids and enzymes, makes it an informative biological fluid for diagnosis, prognosis and post-treatment surveillance of patients with oral cancers as well as other diseases [4, 5]. Amylase and albumin account for approximately 60% of the salivary proteome [6-9]. In addition, the abundant protein portfolio includes proline-rich proteins, mucins, cystatins and statherins along with other plasmatic proteins [10-12]. A comprehensive catalogue of the less abundant salivary proteins would therefore be important and would help in the identification of disease-specific biomarkers.

Mass spectrometry (MS)-based proteomics has previously been employed to explore the salivary proteome under normal and pathological conditions. After initial studies using a 2D-MS approach, which mostly accessed highly abundant proteins, the first high-throughput proteomic analysis of saliva using an LC-MS/MS approach was published by Xie et al. [9], revealing 437 proteins. This was followed by several other reports. Denny et al., using a combination of multiple depletion and fractionation strategies, reported a total of 1,166 salivary proteins, a high proportion of which are also present in blood plasma [13]; this result was supported by another study in which many salivary proteins were found to originate from plasma [4]. A study using a capillary isotachophoresis-based multi-dimensional separation platform coupled with tandem mass spectrometry identified a total of 1,479 salivary proteins [14]. The use of hexapeptide libraries for dynamic range compression, coupled with three-dimensional peptide fractionation using preparative isoelectric focusing, strong cation exchange (SCX) chromatography and capillary reversed-phase HPLC, followed by LC-MS/MS analysis, resulted in the identification of 2,340 human salivary proteins [15], the largest number identified in any one study. These earlier studies differed with respect to saliva sampling (glandular or whole saliva), sample processing and analytical platforms.

In the present study, we carried out proteomic analysis of saliva from healthy individuals using variations of depletion and fractionation strategies followed by high-resolution mass spectrometry. Our analysis resulted in the identification of 1,256 human proteins, excluding any proteins/peptides of microbial origin present in the saliva. The identified human proteins include 292 novel identifications. By integrating our results with earlier reports, we present an updated salivary proteome as a useful reference for developing clinical applications for oral malignancies.

2. Materials and methods

2.1. Sample collection and processing

The study was approved by the Institutional Ethics Committee. The procedure for collection and processing of saliva was adapted from earlier reports [16, 17]. Briefly, unstimulated saliva samples (5 ml) were collected, with written informed consent, from healthy subjects of either sex aged 20-50 years. The individuals selected had no risk habits such as tobacco chewing, smoking or alcohol abuse. Samples were collected in the morning after rinsing the mouth with water, with subjects refraining from food and drink for at least 1 hour prior to collection. All samples were centrifuged at 2,000 rpm at 4 °C for 10 minutes to remove cells. The supernatant was then collected and centrifuged at 14,000 rpm to remove any debris. Protein estimation was carried out using the RC-DC protein assay (Bio-Rad, USA) as per the manufacturer's guidelines, and the samples were stored at -80 °C until further use.

2.2. Depletion and fractionation methods

Equal volumes of saliva were pooled by age group, and the pooled samples were processed further. One pool included samples from individuals 30-50 years of age (Pool A) and the other included samples from individuals 20-30 years of age (Pool B). We adopted two strategies to deplete abundant proteins: depletion of amylase alone, using starch affinity-based amylase capture, and depletion of amylase and plasmatic proteins, using amylase capture followed by antibody-based depletion of plasma proteins such as albumin, immunoglobulins and others. The depleted protein fraction was then either fractionated by SDS-PAGE followed by in-gel tryptic digestion, or digested in solution with trypsin, with the tryptic peptides fractionated by SCX chromatography (Figure 1; workflows 1-3, respectively). In another strategy, compression of the protein dynamic range of total salivary proteins was carried out using a hexapeptide library enrichment kit (ProteoMiner, Bio-Rad, CA, USA). The tryptic digest of the enriched protein fraction was then subjected to fractionation by SCX chromatography (Figure 1; workflow 4).

For amylase depletion, 5 ml of pooled saliva (approximately 5 mg of protein) was mixed with 1.5 g of potato starch (Sigma Aldrich, MO, USA), previously washed 3 times with water (3,000 rpm, 5 minutes), and incubated for one hour in a rotating shaker at room temperature. The mixture was then centrifuged at 3,000 rpm for 5 minutes and the supernatant was collected. The pellet was washed again to recover trapped saliva. Protein estimation was then carried out as mentioned above. Depletion of albumin, immunoglobulins and other abundant plasma proteins (transferrin, fibrinogen, immunoglobulin A, haptoglobin, alpha-1 antitrypsin, alpha-2 macroglobulin, immunoglobulin M, apolipoprotein AI, alpha-1 acid glycoprotein, complement C3, apolipoprotein AII and transthyretin) was carried out using a Human MARS-14 spin cartridge (Agilent Technologies, CA, USA) as per the manufacturer's instructions. The protein sample after amylase depletion was passed through the MARS-14 cartridge and the unbound protein was collected. The procedure was repeated multiple times to collect approximately 500 µg of depleted protein fraction for further experiments. Flow-through fractions were collected, concentrated and desalted using a 5 kDa MW cut-off ultracentrifugal filter device (Amicon, Millipore, Billerica, MA). The protein concentration of the sample was determined as mentioned above.

Two hundred micrograms of the depleted saliva protein was resolved by 10% SDS-PAGE (16 × 18 cm) and the gel was stained using colloidal Coomassie blue. Twenty-five gel slices were excised and destained using 40 mM ammonium bicarbonate in 40% acetonitrile (ACN). The proteins in the gel slices were then reduced using 5 mM DTT (60 °C for 45 minutes) and alkylated using 20 mM iodoacetamide (10 minutes at room temperature). In-gel digestion was carried out at 37 °C for 12-16 hours using modified sequencing-grade trypsin (Promega, WI, USA). Peptides were extracted from the gel pieces sequentially using 5% formic acid, 5% formic acid in 40% ACN and finally 100% ACN. The extracted peptides were dried and stored at -80 °C until LC-MS/MS analysis.

Alternatively, the depleted protein fraction was subjected to direct in-solution digestion with trypsin and the resulting peptides were fractionated by SCX chromatography. Briefly, 200 µg of protein was reduced with 5 mM DTT and alkylated using 10 mM iodoacetamide (IAA) as above. The proteins were then digested with trypsin as above, and the digested peptide mix was reconstituted in solvent A (10 mM potassium phosphate, 30% ACN, pH 2.7). Fractionation was carried out on an SCX column (Polysulfoethyl A column; 300 Å, 5 µm, 100 × 2.1 mm; PolyLC, MD, USA) using a 1200 HPLC system (Agilent Technologies, CA, USA) equipped with a binary pump, UV detector and fraction collector. Peptides were eluted using a linear salt gradient (0 to 35%) of solvent B (10 mM potassium phosphate buffer containing 30% ACN and 350 mM KCl, pH 2.7) at a flow rate of 200 µl/min. Adjacent fractions were then pooled based on the chromatographic profile to give a total of 25 fractions. The samples were dried, reconstituted in 0.1% TFA and desalted using C18 stage-tips. The desalted samples were dried and stored at -80 °C until further analysis.

For enrichment using ProteoMiner, salivary proteins were processed according to the manufacturer's instructions (ProteoMiner; Bio-Rad, CA, USA). Briefly, 10 mg of salivary protein was added to the ProteoMiner column, incubated in a rotational shaker for 2 hours at room temperature and centrifuged at 1,000 × g for 1 minute to discard the unbound fraction. The column was then washed three times with 200 µl of wash buffer, with centrifugation at 1,000 × g for 1 minute. Two hundred microlitres of deionized water was then added and the column was centrifuged at 1,000 × g for 1 minute. The enriched low-abundance proteins bound to the column were eluted with 100 µl of rehydrated elution reagent and desalted using a 5 kDa MW cut-off ultracentrifugal filter device (Amicon, Millipore, Billerica, MA), and protein estimation was carried out. The enriched protein sample was digested in solution with trypsin and the tryptic digest was subjected to SCX fractionation as described above.


2.3. LC-MS/MS analysis

A Fourier-transform LTQ-Orbitrap Velos mass spectrometer (Thermo Fisher Scientific, Bremen, Germany) equipped with a Proxeon Easy nLC was used for LC-MS/MS analysis. In-house capillary columns packed with Magic C18 AQ reversed-phase material (Michrom Bioresources; 5 and 3 μm, 100 Å) were used for HPLC. A nanospray source with a 10 µm emitter tip (New Objective, Woburn, MA) was used for ionisation at a voltage of 2 kV. Peptides were enriched on a trap column (75 µm × 2 cm) at a flow rate of 3 µL/min using solvent A (0.1% formic acid) and then resolved on an analytical column (75 µm × 10 cm). A linear gradient of 7-30% solvent B (0.1% formic acid, 95% ACN) was run at a flow rate of 350 nL/min for 80 minutes. The mass spectrometry parameters were as follows: full-scan data were acquired at a mass resolution of 60,000 at 400 m/z, and the 20 most intense peaks from each MS cycle were selected for MS/MS fragmentation at a mass resolution of 15,000 at 400 m/z. Only multiply charged peptides were selected, and 39% normalized collision energy was used for fragmentation with a 45 s exclusion time. Automatic gain control and filling time were set to 5 × 10^5 ions and 100 milliseconds for MS, and 1 × 10^5 ions and 500 milliseconds for MS/MS, respectively. The polydimethylcyclosiloxane ion (m/z 445.1200025) was used for internal calibration [18].

2.4. Protein identification and bioinformatics analysis

Mass spectrometry data were analyzed using Proteome Discoverer v1.4 software (Thermo Scientific, Bremen, Germany). Peak list file generation and database searches were carried out in SEQUEST mode. A precursor mass range of 350 to 8,000 Da and a signal-to-noise ratio of 1.5 were used as the criteria for generation of peak list files. Database searches for human protein identifications were carried out using the NCBI Human RefSeq 60 protein database. As human saliva also contains microbial flora, a separate search was carried out using a combined database of NCBI Human RefSeq 60 and oral microbial proteins from the Human Oral Microbiome Database (HOMD; www.homd.org). The Human RefSeq 60 database consists of 30,082 protein entries, whereas HOMD consists of 6,166,196 protein entries, a size difference that can affect the results of the combined searches. Therefore, we used the searches against the human protein database alone to identify all human proteins. These identifications were then compared with those from the combined database search, and any peptides shared with microbial proteins were filtered out to ensure that human protein identifications were based entirely on unique human peptides. The analysis also revealed microbial proteins, providing protein-based identification of microbes in saliva (see Ref. [19]).
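The filtering step described above can be illustrated with a minimal sketch. It assumes the search results have already been exported as plain Python collections; the function name, accessions and peptide sequences are hypothetical and are not taken from Proteome Discoverer or from the study data.

```python
def drop_shared_peptides(human_hits, microbial_peptides):
    """Keep only human proteins that retain at least one peptide not
    shared with a microbial protein.

    human_hits: dict mapping protein accession -> set of peptide sequences
                from the human-only search.
    microbial_peptides: set of peptide sequences assigned to microbial
                proteins in the combined (human + HOMD) search.
    """
    filtered = {}
    for protein, peptides in human_hits.items():
        unique = peptides - microbial_peptides  # drop shared sequences
        if unique:  # a protein left with no peptide is discarded
            filtered[protein] = unique
    return filtered


# Toy data: accessions and sequences are invented for illustration only.
human_hits = {
    "HUMAN_A": {"LSEPAELTDAVK", "AGLQFPVGR"},
    "HUMAN_B": {"VATVSLPR"},
}
microbial_peptides = {"AGLQFPVGR", "VATVSLPR"}

kept = drop_shared_peptides(human_hits, microbial_peptides)
print(kept)
```

In this toy case HUMAN_A survives on its one remaining unique peptide, while HUMAN_B is removed because its only peptide is shared.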

The parameters used for database searches included trypsin as the protease with one missed cleavage allowed, carbamidomethylation of cysteine as a fixed modification and oxidation of methionine as a dynamic modification. The precursor ion and fragment ion mass error windows were 20 ppm and 0.1 Da, respectively. The proteins and their corresponding peptide lists were obtained using the following criteria: peptide confidence, high; peptide rank, 1; and Xcorr filters applied to individual MS runs to allow 1% FDR at the peptide level, with searches against a decoy database. Only unique peptides were considered for protein identifications. Further, all single-peptide identifications were manually screened for spectral quality, peptide length and uniqueness. Single-peptide protein hits were included only if the fragmentation was scored as good, that is, 70-80% of the 'b' ion or 'y' ion series was present with optimal intensities, and the peptide was at least 6 residues long. Peptides with ambiguous spectra were not included as valid identifications.

Gene Ontology (GO) classification was performed using HPRD (http://www.hprd.org) to classify the identified proteins by subcellular localization, molecular function and biological process. SignalP 4.1 (www.cbs.dtu.dk/services/SignalP) and TMHMM 2.0c (http://www.cbs.dtu.dk/services/TMHMM) were used to predict the presence of a signal peptide or transmembrane domain in the identified proteins. The proteins were also compared with the human exosomal protein database (ExoCarta; http://exocarta.org) [20].
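The 20 ppm precursor tolerance and the 1% peptide-level FDR filter used in the searches can be illustrated with a short sketch. This is a generic target-decoy estimate (decoys divided by targets above a score threshold) and is only an assumption about the general approach; it is not the internal calculation of Proteome Discoverer. The function names and the toy scores, which are arbitrary and not real Xcorr values, are hypothetical.

```python
def ppm_error(observed_mz, theoretical_mz):
    """Mass error in parts per million."""
    return (observed_mz - theoretical_mz) / theoretical_mz * 1e6


def score_cutoff_for_fdr(psms, max_fdr=0.01):
    """Return the lowest target score at which the cumulative
    decoy/target ratio is still <= max_fdr, or None if no such score.

    psms: iterable of (score, is_decoy) tuples, higher score = better.
    """
    ranked = sorted(psms, key=lambda psm: psm[0], reverse=True)
    targets = decoys = 0
    cutoff = None
    for score, is_decoy in ranked:
        if is_decoy:
            decoys += 1
        else:
            targets += 1
            if decoys / targets <= max_fdr:
                cutoff = score
    return cutoff


# Toy peptide-spectrum matches with arbitrary scores.
psms = (
    [(float(s), False) for s in range(201, 301)]    # 100 targets
    + [(200.0, True)]                               # 1 decoy
    + [(float(s), False) for s in range(150, 200)]  # 50 targets
    + [(149.0, True), (148.0, True)]                # 2 decoys
    + [(float(s), False) for s in range(100, 148)]  # 48 targets
)

cutoff = score_cutoff_for_fdr(psms, max_fdr=0.01)
accepted = sum(1 for score, is_decoy in psms if not is_decoy and score >= cutoff)
print(cutoff, accepted)
print(round(ppm_error(1000.02, 1000.0), 3))
```

With these toy values, 150 target matches pass at a cutoff of 150.0 (one decoy among 150 targets), and a 0.02 m/z deviation at m/z 1000 corresponds to about 20 ppm, the edge of the precursor window.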

3. Results and discussion

3.1. Proteomic analysis of normal human saliva

Mass spectrometry-based proteomic studies have significantly increased the identification and coverage of human salivary proteins. Several key proteomic studies of saliva were reported by different research groups during 2005-2010. These groups used whole saliva or secretions from the parotid and submandibular/sublingual glands for proteomic analysis. Saliva was sampled either from a single individual or pooled from many subjects of different age groups, using varying collection methods. Some studies depleted certain high-abundance proteins, whereas others did not carry out depletion. Gel-based protein separation (SDS-PAGE) or peptide-based fractionation using strong cation exchange chromatography, isoelectric focusing and transient capillary isotachophoresis was employed to reduce sample complexity. Further, the mass spectrometry platforms used for LC-MS/MS analysis included LC-MALDI TOF/TOF, LTQ linear ion trap, LTQ-Orbitrap XL and QSTAR Pulsar XL instruments, which varied in their analytical capabilities. Protein identifications were based on single or multiple peptides with FDR cut-offs of 1-3% at the peptide level [8, 9, 13-15]. The number of proteins identified ranged from about 400 to 2,340, the largest dataset from a single study, reported by Bandhakavi et al. [15].

The etiology of oral cancer is primarily habit-based; nevertheless, an increasing number of patients without a history of risk habits are being diagnosed with the disease. A recent study reports oral cancer in patients at young ages, and their numbers are on the rise [21]. In our effort, we used multiple LC-MS/MS workflows to profile salivary proteins in pooled saliva samples from subjects covering a wide age range, one pool from the 30-50 years age group and the other from the 20-30 years age group, and identified proteins using high-resolution mass spectrometry. Saliva pooled from healthy individuals was subjected to depletion of amylase alone using starch; sequential depletion of amylase and other abundant plasmatic proteins, including albumin and immunoglobulins, using starch followed by antibody-based affinity columns (Human MARS-14); or enrichment of low-abundance proteins using hexapeptide library-based separation (ProteoMiner, Bio-Rad, CA, USA). Proteins were then fractionated by SDS-PAGE followed by tryptic digestion of the gel slices, or digested directly with trypsin and the peptides fractionated by SCX chromatography. LC-MS/MS analysis of the peptide fractions was carried out using HPLC coupled to a Fourier-transform LTQ-Orbitrap Velos high-resolution mass spectrometer (Thermo Fisher, Bremen, Germany). The four analysis workflows used in the study are shown in Figure 1. We identified 631 proteins from workflow 1, 549 from workflow 2, 534 from workflow 3 and 825 from workflow 4. Together, we identified a total of 15,880 peptides corresponding to 1,256 human proteins, exclusive of any sequences shared with microbial proteins present in the saliva. More than 60% of the proteins were identified on the basis of multiple peptides. Of the 1,256 human proteins, 292 are reported for the first time in saliva; 103 of these 292 were multiple-peptide hits, while the remaining 189 were based on single peptides. The single-peptide identifications were screened for confidence by applying several criteria, including a minimum peptide length of 6 residues, as described under Methods. Indeed, most of the peptides in our analysis were 8-15 residues long, and some were even longer. The analysis also provides multiple-peptide support for single-peptide protein identifications reported in earlier studies. Further, it is interesting to note that on comparison of these data with the salivary protein profile of oral cancer patients (age group 40-60 years; unpublished), more than 60% of the proteins were found to be common, which strengthens the relevance of our analysis to salivary proteins present in patient samples. In addition, we catalogued the salivary proteome comprehensively by including data from the earlier literature and, using bioinformatics considerations, compiled a priority list of proteins that can be pursued in a targeted manner for clinical applications (see below).

The non-redundant portfolio of human proteins identified in the four workflows, along with their annotations, is provided in Supplementary Table 1A. Detailed supporting information on the human proteins identified from each of the four workflows, along with the peptide information, is provided in Supplementary Tables 1B, 1C, 1D and 1E. We identified most of the proteins commonly observed in human saliva (amylase, cystatins, basic proline-rich proteins, mucins, lactotransferrin, carbonic anhydrase, lysozymes, peroxidases, albumin, thymosins and defensins) [22]. Three protein families, statherins, histatins and acidic proline-rich proteins, were not identified in our study. A likely reason is the use of whole saliva supernatant for the analysis, which is the optimal medium for use in oral pathologies. These proteins are known to have a glandular origin, undergo cleavage to smaller peptides, bind to oral tissues and be prone to proteolysis by tissue and bacterial proteases in the oral cavity, and hence may not be present at detectable levels in whole saliva [10, 23, 24].

3.2. Biological process and subcellular localization of human salivary proteins

GO analysis was carried out to group the identified salivary proteins by biological process, molecular function and subcellular localization. Subcellular localization grouping showed that cytoplasmic proteins were the largest group (32%), followed by extracellular (18%) and plasma membrane proteins (9.5%) (Figure 2A). With respect to biological process, the largest groups of proteins were involved in metabolism and energy pathways (19%), followed by protein metabolism (18%) and cell communication and signal transduction (17%) (Figure 2B).

Enzymes represented a major category (22%) of the salivary proteins identified. Hydrolases (ADP-sugar pyrophosphatase (NUDT5), tissue alpha-L-fucosidase precursor (FUCA1) and fumarylacetoacetase (FAH)), dehydrogenases (glyceraldehyde-3-phosphate dehydrogenase (GAPDH), glucose-6-phosphate 1-dehydrogenase (G6PD) and L-lactate dehydrogenase A (LDHA)) and oxidoreductases (alcohol dehydrogenase class-3 (ADH5), catalase (CAT) and glutaredoxin-1 (GLRX)) were associated with metabolism, energy pathways and detoxification [25-33]. These enzymes may also be involved in initiating the digestion of food components in the oral cavity. The other major category of proteins identified was the transport/carrier proteins, such as apolipoprotein B (APOB), albumin (ALB) and apolipoprotein H (APOH), which might have been detected in saliva through leakage from blood [34, 35]. Calcium-binding proteins such as calmodulin-like (CALML5), calcium-binding protein 39 (CAB39), members of the S100 family (S100A7, S100A8, S100A9, S100A11) and annexin family members (ANXA1, ANXA2, ANXA3) were also identified in saliva. These proteins might be associated with the remineralization of tooth enamel [36]. Vimentin (VIM), moesin (MSN) and keratin 17 (KRT17) are some of the cytoskeletal proteins identified in saliva. Other proteins identified included growth factors such as epidermal growth factor (EGF), glia maturation factor gamma (GMFG), granulin (GRN), hepatoma-derived growth factor (HDGF), platelet-derived growth factor C (PDGFC) and thymidine phosphorylase (TYMP), which may be associated with the inherent wound-healing properties of saliva [37]. Evidence suggests that growth factors also play a role in the maintenance of oral and systemic health [38, 39]. Apart from the proteins involved in normal salivary functions discussed above, human saliva also contains many proteins implicated in oral diseases. Supplementary Table 3A lists those implicated in oral malignancies and detectable in saliva, many of which are among the proteins identified in our analysis (see the section "Salivary proteome and oral cancer").

3.3. Human salivary proteome

Initial proteomic studies of saliva were carried out by two-dimensional gel electrophoresis coupled with MALDI-MS, leading to the identification of a relatively small number of proteins (50-300), mostly the more abundant ones. With developments in strategies for depleting highly abundant proteins, in fractionation methods and in tandem mass spectrometry (LC-MS/MS), the coverage of the salivary proteome has improved significantly, as discussed above [8, 9, 13-15]. The use of hexapeptide libraries (ProteoMiner, Bio-Rad, CA, USA) for compression of the protein dynamic range, coupled with three-dimensional fractionation of the tryptic peptides using sequential preparative isoelectric focusing, strong cation-exchange (SCX) chromatography and capillary reversed-phase HPLC, resulted in the identification of 2,340 human salivary components, the largest salivary proteome identified in a single study to date [15]. We integrated the results of our analyses with the five earlier reports mentioned above [8, 9, 13-15] and catalogued a non-redundant list of proteins that may be referred to as the updated human salivary proteome, consisting of a total of 3,449 proteins (Figure 3), 1,671 of which have been reported in at least two studies. Information on the secretory features, molecular class, biological process, molecular function and subcellular localization of these 3,449 proteins is provided in Supplementary Table 2.
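The integration described above amounts to taking the union of the protein lists from all studies and counting in how many studies each protein occurs. A minimal sketch follows, assuming every study's identifications have first been mapped to a common accession scheme; the function name, study labels and accessions are hypothetical placeholders, not the actual data.

```python
from collections import Counter


def integrate_studies(studies):
    """Build a non-redundant catalogue and the subset seen in >= 2 studies.

    studies: dict mapping study label -> iterable of protein accessions.
    Returns (catalogue, replicated) as two sets.
    """
    counts = Counter()
    for proteins in studies.values():
        counts.update(set(proteins))  # count each protein once per study
    catalogue = set(counts)
    replicated = {p for p, n in counts.items() if n >= 2}
    return catalogue, replicated


# Toy input: labels and accessions are invented for illustration only.
studies = {
    "study_1": ["P1", "P2", "P3"],
    "study_2": ["P2", "P3", "P4", "P4"],  # duplicate within one study
    "this_work": ["P3", "P5"],
}

catalogue, replicated = integrate_studies(studies)
print(len(catalogue), sorted(replicated))
```

Converting each list to a set before counting prevents a protein listed twice within one report from being mistaken for an independent replication.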

3.4. Salivary proteome and oral cancer



Oral cancer is the most common malignant tumor of the head and neck region, affecting over 300,000 people worldwide per year. Delayed detection is likely to be the major reason for the poor prognosis associated with the disease. Identification of biomarkers that complement clinical/pathological assessments could help to screen patients at risk, predict disease outcome and plan treatment strategies effectively. Secretory proteins are known to be released through classical and non-classical secretion processes and are of increasing interest as potential biomarkers for diseases [40]. Because saliva is the proximal fluid for oral cancer, is readily available and can be collected with minimal invasiveness, salivary proteins are ideal candidates for oral cancer detection and monitoring of disease progression [41]. Since saliva from oral cancer patients may contain secretions from the various glands mentioned above as well as proteins released from tumor cells, we looked for salivary proteins that have already been implicated in oral cancer on the basis of their differential regulation in oral cancer clinical tissues. On comparing the human salivary proteome that we compiled with the published literature on oral cancer, a total of 808 salivary proteins were found to be differentially expressed in oral cancer tissues. The details are listed in Supplementary Table 3A, and the references used for this comparison are provided separately in Supplementary Document 1.

Further, evaluating salivary proteins for their secretory features would provide additional support for their consistent occurrence and detection in saliva. The secretory features of the salivary proteins (n=3,449) were examined by mapping them to the ExoCarta database for exosomal association, and by using SignalP and TMHMM to predict the presence of a signal peptide sequence or transmembrane domain, respectively. Since signal peptides control the proper targeting of virtually all proteins to the secretory pathway, this mapping also captures glycosylated proteins that are secreted [42, 43]. We observed that 1,920 (56%) salivary proteins have secretory features based on at least one of the above three criteria, supporting their detection in saliva (Supplementary Table 2). Of these, 598 are also among the 808 salivary proteins associated with oral cancer described above and shown in Supplementary Table 3A. From these, 139 proteins were observed to match at least two of the three secretory parameters, i.e. exosomal association, signal peptide and transmembrane domain (Supplementary Table 3B). This panel could be used as a priority data set for investigation for clinical applications. Representative members of this subset are shown in Table 1.

The main biological processes associated with this subset of 139 secretory proteins were cell growth and maintenance, cell communication and signal transduction, immune response, metabolism and transport. CD44, tenascin C (TNC) and cysteine-rich transmembrane BMP regulator 1 (CRIM1) are some of the important molecules representing cell communication and signal transduction. CD44 has been reported to be involved in tumor growth and metastasis, has been implicated as a cancer stem cell marker in head and neck squamous cell carcinoma (HNSCC) and is associated with poor prognosis [44]. TNC is an extracellular matrix protein with growth-, invasion- and angiogenesis-promoting activities. It has been reported to be up-regulated in tumorigenesis and has been suggested to correlate with prognosis in various carcinomas [45]. It has also been shown to be significantly up-regulated in early disease stages [46]. CRIM1, a potential risk factor for cancer, has been identified in saliva for the first time. The extracellular matrix (ECM) is a major component of the tumor microenvironment and plays many roles during tumor progression, including supporting the proliferation and survival of tumor cells. Several members of the collagen family (COL1A1, COL1A2, COL4A2, COL6A1), fibronectin 1 (FN1) and matrix metalloproteinase 1 (MMP1), which are known to be associated with the ECM and oral cancer, were also identified in saliva in this study.
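The selection of the priority subset, that is, oral cancer-associated salivary proteins matching at least two of the three secretory parameters, can be sketched as a simple evidence count. The sketch assumes the ExoCarta, SignalP and TMHMM results are already available as sets of protein identifiers; the function name and the single-letter identifiers are hypothetical and do not correspond to real proteins.

```python
def prioritize(proteins, exosomal, signal_peptide, transmembrane, min_features=2):
    """Return {protein: number of secretory features} for proteins that
    carry at least `min_features` of the three kinds of evidence."""
    evidence_sets = (exosomal, signal_peptide, transmembrane)
    selected = {}
    for protein in proteins:
        n_features = sum(protein in evidence for evidence in evidence_sets)
        if n_features >= min_features:
            selected[protein] = n_features
    return selected


# Toy input: identifiers are invented for illustration only.
cancer_associated = {"A", "B", "C", "D"}
exosomal = {"A", "B"}
signal_peptide = {"A", "C"}
transmembrane = {"A", "B", "E"}

any_feature = prioritize(cancer_associated, exosomal, signal_peptide,
                         transmembrane, min_features=1)
priority = prioritize(cancer_associated, exosomal, signal_peptide,
                      transmembrane, min_features=2)
print(sorted(any_feature), sorted(priority))
```

Setting `min_features=1` reproduces the looser criterion of any single secretory feature, while the default of 2 corresponds to the stricter rule used for the priority list.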

Hypothesis-based targeted mass spectrometry methods are emerging as a strong option to analyze proteins accurately and quantitatively in multiple samples. Conventional multiple/selected reaction monitoring (MRM/SRM) assays or the newly developed SWATH assays provide important tools for this purpose. Using these assays, peptides from several proteins can be assayed in parallel with high precision, accuracy and reproducibility. We have provided the top 10 proteotypic peptides/most observed peptides from the Global Proteome Machine Database (GPMdb) for all 139 proteins listed in Supplementary Table 3B and have indicated those (n=99) which have been detected in normal saliva. All these 99 proteins with proteotypic peptides are part of the integrated salivary proteome catalogued, with a subset of them (n=76) being represented in our proteomic analysis. In addition, we have sorted out other salivary peptides that have been empirically observed in multiple MS analyses as discussed above. These proteins and their peptide information would be valuable for investigation by targeted mass spectrometry approaches and would constitute a prioritized reference platform for investigating their clinical applications in oral malignancies.
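
As an illustration of how a prioritized peptide list of this kind could be assembled, the sketch below ranks the peptides of one protein by the number of independent MS analyses in which they were observed and keeps the top 10. It is a minimal sketch and not the procedure used with GPMdb: the function name, the input layout (peptide sequence mapped to the set of analyses reporting it), the threshold of two analyses and the toy sequences are all assumptions made for the example.

```python
from typing import Dict, List, Set


def prioritize_peptides(
    observations: Dict[str, Set[str]],
    min_analyses: int = 2,
    top_n: int = 10,
) -> List[str]:
    """Rank the peptides of one protein by how many analyses observed them.

    `observations` maps a peptide sequence to the set of analysis
    identifiers in which it was seen (hypothetical layout).
    """
    # Keep peptides seen in at least `min_analyses` independent analyses.
    recurrent = {
        peptide: len(analyses)
        for peptide, analyses in observations.items()
        if len(analyses) >= min_analyses
    }
    # Most frequently observed first; ties are broken alphabetically so
    # that the output is deterministic.
    ranked = sorted(recurrent, key=lambda p: (-recurrent[p], p))
    return ranked[:top_n]


# Toy input with made-up peptide sequences and analysis labels.
example = {
    "AAAK": {"analysis_1", "analysis_2", "analysis_3"},
    "GGGR": {"analysis_1"},
    "LLLK": {"analysis_2", "analysis_3"},
}
print(prioritize_peptides(example))  # ['AAAK', 'LLLK']
```

Ranking by observation count alone ignores properties that matter for MRM/SRM assay design, such as peptide uniqueness and the presence of modifiable residues, so a real selection would likely add further filters.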

4. Conclusions

Identification of normal salivary proteins is invaluable in the context of emerging interest in salivary diagnostics. The proteomic analysis of human saliva from healthy individuals reported here revealed a total of 1,256 human proteins, 292 of which are being reported for the first time in high-throughput LC-MS/MS-based analyses. More than 60% of the proteins identified were based on multiple peptides, the rest being single-peptide identifications screened for high confidence. Integrating our results with earlier studies, we compiled an updated salivary proteome of 3,449 proteins, from which a subset of 139 salivary proteins with secretory potential and implications in oral cancer was derived. This would serve as a first-level reference for any targeted investigation towards clinical applications of the salivary proteins.


Acknowledgements

We acknowledge the Department of Biotechnology (DBT), Govt. of India, for financial support. PS is a recipient of a Senior Research Fellowship from the University Grants Commission, India. MKG is a recipient of a Senior Research Fellowship from the Council of Scientific and Industrial Research (CSIR), India. HG is a Wellcome Trust/DBT India Alliance Early Career Fellow. AS is a recipient of a Young Investigator award from DBT. Sneha M. Pinto (IOB) helped in the initial phase of experiments.

Conflict of interest

The authors declare no conflict of interest.

References

[1] Humphrey SP, Williamson RT. A review of saliva: normal composition, flow, and function. The Journal of prosthetic dentistry. 2001;85:162-9.
[2] Kaufman E, Lamster IB. The diagnostic applications of saliva--a review. Critical reviews in oral biology and medicine : an official publication of the American Association of Oral Biologists. 2002;13:197-212.
[3] Pedersen AM, Bardow A, Jensen SB, Nauntofte B. Saliva and gastrointestinal functions of taste, mastication, swallowing and digestion. Oral diseases. 2002;8:117-29.
[4] Loo JA, Yan W, Ramachandran P, Wong DT. Comparative human salivary and plasma proteomes. Journal of dental research. 2010;89:1016-23.
[5] Malamud D. Saliva as a diagnostic fluid. Dental clinics of North America. 2011;55:159-78.
[6] Krief G, Deutsch O, Gariba S, Zaks B, Aframian DJ, Palmon A. Improved visualization of low abundance oral fluid proteins after triple depletion of alpha amylase, albumin and IgG. Oral diseases. 2011;17:45-52.
[7] Deutsch O, Fleissig Y, Zaks B, Krief G, Aframian DJ, Palmon A. An approach to remove alpha amylase for proteomic analysis of low abundance biomarkers in human saliva. Electrophoresis. 2008;29:4150-7.
[8] Yan W, Apweiler R, Balgley BM, Boontheung P, Bundy JL, Cargile BJ, et al. Systematic comparison of the human saliva and plasma proteomes. Proteomics Clinical applications. 2009;3:116-34.
[9] Xie H, Rhodus NL, Griffin RJ, Carlis JV, Griffin TJ. A catalogue of human saliva proteins identified by free flow electrophoresis-based peptide separation and tandem mass spectrometry. Molecular & cellular proteomics : MCP. 2005;4:1826-30.
[10] Carpenter GH. The secretion, components, and properties of saliva. Annual review of food science and technology. 2013;4:267-76.
[11] Vitorino R, Lobo MJ, Ferrer-Correira AJ, Dubin JR, Tomer KB, Domingues PM, et al. Identification of human whole saliva protein components using proteomics. Proteomics. 2004;4:1109-15.
[12] Hu S, Arellano M, Boontheung P, Wang J, Zhou H, Jiang J, et al. Salivary proteomics for oral cancer biomarker discovery. Clinical cancer research : an official journal of the American Association for Cancer Research. 2008;14:6246-52.
[13] Denny P, Hagen FK, Hardt M, Liao L, Yan W, Arellanno M, et al. The proteomes of human parotid and submandibular/sublingual gland salivas collected as the ductal secretions. Journal of proteome research. 2008;7:1994-2006.
[14] Fang X, Yang L, Wang W, Song T, Lee CS, DeVoe DL, et al. Comparison of electrokinetics-based multidimensional separations coupled with electrospray ionization-tandem mass spectrometry for characterization of human salivary proteins. Analytical chemistry. 2007;79:5785-92.
[15] Bandhakavi S, Stone MD, Onsongo G, Van Riper SK, Griffin TJ. A dynamic range compression and three-dimensional peptide fractionation analysis platform expands proteome coverage and the diagnostic potential of whole saliva. Journal of proteome research. 2009;8:5590-600.
[16] Navazesh M. Methods for collecting saliva. Annals of the New York Academy of Sciences. 1993;694:72-7.
[17] Li Y, St John MA, Zhou X, Kim Y, Sinha U, Jordan RC, et al. Salivary transcriptome diagnostics for oral cancer detection. Clinical cancer research : an official journal of the American Association for Cancer Research. 2004;10:8442-50.
[18] Olsen JV, de Godoy LM, Li G, Macek B, Mortensen P, Pesch R, et al. Parts per million mass accuracy on an Orbitrap mass spectrometer via lock mass injection into a C-trap. Molecular & cellular proteomics : MCP. 2005;4:2010-21.
[19] Sivadasan P, Gupta MK, Balakrishnan L, Sathe GJ, Gowda H, Suresh A, et al. Data for Human Salivary Proteome - a resource of potential biomarkers for oral cancer. Data in brief. 2015.
[20] Simpson RJ, Kalra H, Mathivanan S. ExoCarta as a resource for exosomal research. Journal of extracellular vesicles. 2012;1.
[21] Elango JK, Gangadharan P, Sumithra S, Kuriakose MA. Trends of head and neck cancers in urban and rural India. Asian Pacific journal of cancer prevention : APJCP. 2006;7:108-12.
[22] Castagnola M, Cabras T, Vitali A, Sanna MT, Messana I. Biotechnological implications of the salivary proteome. Trends Biotechnol. 2011;29:409-18.
[23] Oppenheim FG, Salih E, Siqueira WL, Zhang W, Helmerhorst EJ. Salivary proteome and its genetic polymorphisms. Annals of the New York Academy of Sciences. 2007;1098:22-50.
[24] Campese M, Sun X, Bosch JA, Oppenheim FG, Helmerhorst EJ. Concentration and fate of histatins and acidic proline-rich proteins in the oral environment. Archives of oral biology. 2009;54:345-53.
[25] Ito R, Sekiguchi M, Setoyama D, Nakatsu Y, Yamagata Y, Hayakawa H. Cleavage of oxidized guanine nucleotide and ADP sugar by human NUDT5 protein. Journal of biochemistry. 2011;149:731-8.
[26] Chojnowska S, Zalewska A, Knas M, Waszkiewicz N, Waszkiel D, Kossakowska A, et al. Determination of lysosomal exoglycosidases in human saliva. Acta biochimica Polonica. 2014;61:85-90.
[27] Labelle Y, Phaneuf D, Leclerc B, Tanguay RM. Characterization of the human fumarylacetoacetate hydrolase gene and identification of a missense mutation abolishing enzymatic activity. Human molecular genetics. 1993;2:941-6.
[28] Li T, Liu M, Feng X, Wang Z, Das I, Xu Y, et al. Glyceraldehyde-3-phosphate dehydrogenase is activated by lysine 254 acetylation in response to glucose signal. The Journal of biological chemistry. 2014;289:3775-85.
[29] Gomez-Manzo S, Terron-Hernandez J, de la Mora-de la Mora I, Garcia-Torres I, Lopez-Velazquez G, Reyes-Vivas H, et al. Cloning, expression, purification and characterization of his-tagged human glucose-6-phosphate dehydrogenase: a simplified method for protein yield. The protein journal. 2013;32:585-92.
[30] Miyajima H, Takahashi Y, Suzuki M, Shimizu T, Kaneko E. Molecular characterization of gene expression in human lactate dehydrogenase-A deficiency. Neurology. 1993;43:1414-9.
[31] Hoog JO, Stromberg P, Hedberg JJ, Griffiths WJ. The mammalian alcohol dehydrogenases interact in several metabolic pathways. Chemico-biological interactions. 2003;143-144:175-81.
[32] Tipton DA, Braxton SD, Dabbous MK. Role of saliva and salivary components as modulators of bleaching agent toxicity to human gingival fibroblasts in vitro. Journal of periodontology. 1995;66:766-74.
[33] Fernandes AP, Holmgren A. Glutaredoxins: glutathione-dependent redox enzymes with functions far beyond a simple thioredoxin backup system. Antioxidants & redox signaling. 2004;6:63-74.
[34] Song F, Poljak A, Crawford J, Kochan NA, Wen W, Cameron B, et al. Plasma apolipoprotein levels are associated with cognitive status and decline in a community cohort of older individuals. PloS one. 2012;7:e34078.
[35] Cuevas-Cordoba B, Santiago-Garcia J. Saliva: a fluid of study for OMICS. Omics : a journal of integrative biology. 2014;18:87-97.
[36] Lamkin MS, Oppenheim FG. Structural features of salivary function. Critical reviews in oral biology and medicine : an official publication of the American Association of Oral Biologists. 1993;4:251-9.
[37] Brand HS, Ligtenberg AJ, Veerman EC. Saliva and wound healing. Monographs in oral science. 2014;24:52-60.
[38] Mandel ID. The role of saliva in maintaining oral homeostasis. Journal of the American Dental Association. 1989;119:298-304.
[39] Cowman RA, Schaefer SJ, Fitzgerald RJ. Specificity of utilization of human salivary proteins for growth by oral streptococci. Caries research. 1979;13:181-9.
[40] Stastna M, Van Eyk JE. Investigating the secretome: lessons about the cells that comprise the heart. Circulation Cardiovascular genetics. 2012;5:o8-o18.
[41] Farnaud SJ, Kosti O, Getting SJ, Renshaw D. Saliva: physiology and diagnostic potential in health and disease. TheScientificWorldJournal. 2010;10:434-56.
[42] Braakman I, Bulleid NJ. Protein folding and modification in the mammalian endoplasmic reticulum. Annual review of biochemistry. 2011;80:71-99.
[43] Tusnady GE, Simon I. Topology prediction of helical transmembrane proteins: how far have we reached? Current protein & peptide science. 2010;11:550-61.
[44] Chen J, Zhou J, Lu J, Xiong H, Shi X, Gong L. Significance of CD44 expression in head and neck cancer: a systemic review and meta-analysis. BMC cancer. 2014;14:15.
[45] Atula T, Hedstrom J, Finne P, Leivo I, Markkanen-Leppanen M, Haglund C. Tenascin-C expression and its prognostic significance in oral and pharyngeal squamous cell carcinoma. Anticancer research. 2003;23:3051-6.
[46] Fialka F, Gruber RM, Hitt R, Opitz L, Brunner E, Schliephake H, et al. CPA6, FMO2, LGI1, SIAT1 and TNC are differentially expressed in early- and late-stage oral squamous cell carcinoma--a pilot study. Oral oncology. 2008;44:941-8.

Table Legends

Table 1: Representative list of human salivary proteins with secretory potential that are associated with oral squamous cell carcinoma.

Figure Legends

Figure 1: Workflows used for human salivary protein analysis. Saliva was collected from normal healthy individuals, processed, depleted of abundant proteins, fractionated and subjected to LC-MS/MS analysis using four different workflows. The details of each workflow are as follows. Workflow 1: Protein from pool A was depleted of amylase using starch affinity and fractionated by SDS-PAGE, and proteins in the gel fractions were digested with trypsin. Saliva from pool B was used for the remaining three workflows. Workflow 2: Saliva from pool B was sequentially depleted of amylase and other high-abundance plasma proteins using starch- and antibody-based affinity methods and fractionated by SDS-PAGE, and protein was digested in-gel with trypsin. Workflow 3: Protein from pool B was sequentially depleted of amylase and other high-abundance proteins as in workflow 2. The depleted protein was digested with trypsin in solution and fractionated using SCX chromatography. Workflow 4: Saliva from pool B was subjected to enrichment of low-abundance proteins using ProteoMiner, and the enriched fraction was digested with trypsin followed by fractionation using SCX chromatography. The tryptic digests of the gel fractions (workflows 1 and 2) or of the SCX fractions (workflows 3 and 4) were subjected to LC-MS/MS analysis on an LTQ-Orbitrap Velos mass spectrometer. The raw data obtained from each run were then searched against the Human RefSeq 60 database and against the combined Human RefSeq 60 database and Human Oral Microbiome Database (HOMD) to identify the human-specific proteins.
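
The final step of Figure 1, comparing the human-only search with the combined human and HOMD search, can be pictured with the sketch below. It is only one conservative reading of that comparison and not the authors' actual pipeline: the function name, the "human"/"microbial" labels, the input layout and the toy sequences are assumptions introduced for illustration.

```python
from typing import Dict, Set


def human_specific_peptides(
    human_search: Set[str],
    combined_search: Dict[str, Set[str]],
) -> Set[str]:
    """Return peptides from the human-only search that stay unambiguous.

    `combined_search` maps each peptide identified in the combined
    human + microbial search to the sources of the proteins it matched,
    e.g. {"human"} or {"human", "microbial"} (hypothetical labels).
    """
    specific = set()
    for peptide in human_search:
        sources = combined_search.get(peptide, set())
        # Conservative choice for this sketch: keep a peptide only if the
        # combined search assigns it to human proteins and nothing else.
        # Peptides absent from the combined search are dropped as well.
        if sources == {"human"}:
            specific.add(peptide)
    return specific


# Toy input with made-up peptide sequences.
human_hits = {"AAAK", "GGGR", "LLLK"}
combined_hits = {
    "AAAK": {"human"},
    "GGGR": {"human", "microbial"},  # shared with a microbial protein
}
print(sorted(human_specific_peptides(human_hits, combined_hits)))  # ['AAAK']
```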

Figure 2: Gene Ontology-based classification of salivary proteins. Classification of the human salivary proteins identified in this study was carried out using the Human Protein Reference Database. Bar graphs show (A) subcellular localization and (B) biological processes.

Figure 3: Human salivary proteome. The datasets of proteins identified in this study and those reported earlier from LC-MS/MS analyses of saliva were integrated to generate a non-redundant list of proteins detected in human saliva, referred to as the salivary proteome. Five studies published during the years 2005-2010 and the present analyses were used to integrate the data. The number of proteins identified in each of them that make up the salivary proteome is shown in the figure. This integration yielded 3,449 proteins. Details of the proteins are given in Supplementary Table 2.

Supplementary Table Legends

Supplementary Table 1A: Non-redundant list of human salivary proteins identified in the four workflows used in the study, along with their Gene Ontology information. Proteins and peptides identified from each workflow and their sequence coverage are given separately in Supplementary Table 1 (B-E).

Supplementary Table 1B: List of proteins and peptides identified in human saliva from workflow 1, as indicated under Supplementary Table 1A.

Supplementary Table 1C: List of proteins and peptides identified in human saliva from workflow 2, as indicated under Supplementary Table 1A.

Supplementary Table 1D: List of proteins and peptides identified in human saliva from workflow 3, as indicated under Supplementary Table 1A.

Supplementary Table 1E: List of proteins and peptides identified in human saliva from workflow 4, as indicated under Supplementary Table 1A.

Supplementary Table 2: Human salivary proteome. Non-redundant catalogues of human salivary proteins from LC-MS/MS analyses reported earlier (during the years 2005-2010) and from the present study were compiled to make an updated salivary proteome. Annotations of these proteins are provided in the table. The list consists of 3,449 proteins, 1,671 of them being reported in at least 2 studies. Protein identifications (n=292) marked with an asterisk are being reported for the first time in human saliva. The list was subjected to bioinformatics analysis to obtain GO information on the proteins and their secretory potential, as indicated in the table.
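
The compilation described for Supplementary Table 2 amounts to a set union with per-protein study counts. The sketch below shows that bookkeeping on toy data; it assumes the protein identifiers from all studies have already been mapped to one common accession scheme, and the function name, study labels and identifiers are hypothetical.

```python
from collections import Counter
from typing import Dict, Set, Tuple


def integrate_studies(
    studies: Dict[str, Set[str]],
    present_study: str,
) -> Tuple[Set[str], Set[str], Set[str]]:
    """Merge per-study protein lists into a non-redundant catalogue.

    Returns (all proteins, proteins reported in at least 2 studies,
    proteins reported only in `present_study`).
    """
    counts = Counter()
    for proteins in studies.values():
        # Each study is a set, so it contributes a protein at most once.
        counts.update(proteins)
    catalogue = set(counts)
    replicated = {p for p, n in counts.items() if n >= 2}
    earlier = set().union(
        *(s for name, s in studies.items() if name != present_study)
    )
    novel = studies[present_study] - earlier
    return catalogue, replicated, novel


# Toy input with placeholder identifiers, not real accessions.
toy = {
    "earlier_1": {"P1", "P2"},
    "earlier_2": {"P2", "P3"},
    "present": {"P3", "P4"},
}
catalogue, replicated, novel = integrate_studies(toy, "present")
print(len(catalogue), sorted(replicated), sorted(novel))  # 4 ['P2', 'P3'] ['P4']
```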

Supplementary Table 3A: List of proteins differentially expressed in oral cancer tissues and detectable in human saliva. On comparison of the human salivary proteome (Figure 3 and Supplementary Table 2) with the data on differentially expressed proteins in oral cancer, a total of 808 salivary proteins were found to overlap. These proteins are listed in the table along with their differential expression status. Proteins whose differential status varies between studies are marked with an asterisk (*). Gene Ontology information and secretory features of the proteins, according to the criteria mentioned in the Methods section, are also given.

Supplementary Table 3B: Priority list of salivary proteins for investigations for clinical applications. Proteins listed in Supplementary Table 3A were screened for at least two of the three secretory parameters mentioned (ExoCarta, SignalP and TMHMM). A total of 149 proteins passed this screen, out of which only proteins whose differential expression was found to be consistent in multiple studies were selected (n=139) and are given in the table. The table also includes citations for these proteins, their disease relevance and Gene Ontology information as well as secretory features. Peptides identified for these proteins from all saliva studies covered are given, and those which are empirically observed in multiple analyses are shown in blue. In addition, the detected peptides of these proteins that match the proteotypic peptides predicted as per GPMdb are shown in red.
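
The two-stage screen described for Supplementary Table 3B (at least two of three secretory criteria, then consistent differential expression) can be written as a simple filter. The sketch below is illustrative only: the `Candidate` record, its field names and the placeholder entries are assumptions, and "consistent" is interpreted here as all reported directions of change being identical, which may differ from the criterion actually applied.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Candidate:
    gene: str
    in_exocarta: bool          # listed in ExoCarta
    has_signal_peptide: bool   # predicted by SignalP
    has_tm_helix: bool         # predicted by TMHMM
    reported_changes: Tuple[str, ...]  # e.g. ("up", "up") across studies


def passes_secretory_screen(c: Candidate) -> bool:
    """True if at least two of the three secretory criteria are met."""
    return sum([c.in_exocarta, c.has_signal_peptide, c.has_tm_helix]) >= 2


def is_consistent(c: Candidate) -> bool:
    """True if every study reports the same direction of change."""
    return len(set(c.reported_changes)) == 1


def priority_list(candidates: List[Candidate]) -> List[str]:
    """Genes that pass the secretory screen and are consistently regulated."""
    return [
        c.gene
        for c in candidates
        if passes_secretory_screen(c) and is_consistent(c)
    ]


# Placeholder entries, not taken from the supplementary tables.
toy = [
    Candidate("GENE_A", True, True, False, ("up", "up")),
    Candidate("GENE_B", True, False, False, ("up",)),         # one criterion only
    Candidate("GENE_C", True, True, True, ("up", "down")),    # conflicting reports
]
print(priority_list(toy))  # ['GENE_A']
```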

Supplementary Document 1: List of references on oral cancer tissues used to annotate the salivary proteins given in Supplementary Table 3A.
