Community profiling
Early efforts to describe "who's there" have relied on
cataloging species as designated by conserved changes in
their rDNA sequence, whether by targeted sequencing of
amplicons, microarray technologies (PhyloChip [1]), or
electrophoretic sizing techniques that are primarily
restricted to differentiating between communities (DGGE
and T-RFLP [24]). Each methodology has its own set of
limitations, and most share a reliance on the known set of
target 16S rDNA sequences, as well as the assumption
that the 16S sequence can serve as a sufficient marker for
species-level identification.
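The reliance on a known reference set can be illustrated with a small sketch. It assumes reads and reference 16S sequences are already aligned to the same length, and it uses a 97% identity cutoff, a common convention for species-level grouping that the text above does not itself specify. The function names `percent_identity` and `profile_community` are hypothetical, chosen only for this illustration.

```python
from collections import Counter

def percent_identity(a, b):
    """Fraction of matching positions between two pre-aligned, equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    matches = sum(1 for x, y in zip(a, b) if x == y)
    return matches / len(a)

def profile_community(reads, references, threshold=0.97):
    """Count reads per reference taxon; reads below the cutoff are 'unassigned'.

    `threshold` is an assumed species-level cutoff, not a value from the text.
    """
    counts = Counter()
    for read in reads:
        best_name, best_id = None, 0.0
        for name, ref in references.items():
            ident = percent_identity(read, ref)
            if ident > best_id:
                best_name, best_id = name, ident
        counts[best_name if best_id >= threshold else "unassigned"] += 1
    return counts

# Toy data: two known taxa and one read from an organism absent from the references.
references = {"taxon_A": "ACGT" * 10, "taxon_B": "TTGACCGA" * 5}
reads = [
    "ACGT" * 10,               # exact match to taxon_A
    "T" + ("ACGT" * 10)[1:],   # one mismatch in 40 positions
    "TTGACCGA" * 5,            # exact match to taxon_B
    "G" * 40,                  # novel sequence
]
profile = profile_community(reads, references)
```

In this toy run the novel read ends up in the "unassigned" bin, which is the limitation the paragraph describes: organisms without a close relative in the reference set are invisible to the profile.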
10 Analytical biotechnology
Table 1
List of NGS sequencing platforms and their expected throughputs, error types and error rates. Each platform has distinct advantages owing to cost, error rate, read length, and so on.

Platform               Run time   Read length (bp)          Throughput (Mb)   Error type     Error rate (%)
Roche
  454 FLX+             1820       -                         900               Indel          1
  454 FLX Titanium     10         -                         500               Indel          1
  454 GS               10         -                         50                Indel          1
Illumina
  GAIIx                14         2 × 150                   96,000            Substitution   >0.1
  HiSeq 2000           8          2 × 100                   400,000           Substitution   >0.1
  HiSeq 2000 V3        10         2 × 150                   <600,000          Substitution   >0.1
  MiSeq                1          2 × 150                   1000              Substitution   >0.1
Life technologies
  SOLiD 4              12         50 × 35                   71,000            A-T Bias       >0.06
  SOLiD 5500xl         8          75 × 35 PE, 60 × 60 MP    155,000           A-T Bias       >0.01
Ion torrent
  PGM 314 Chip         3          100                       10                Indel          1
  PGM 316 Chip         3          100+                      100               Indel          1
  PGM 318 Chip         3          200                       1000              Indel          1
Pacific biosciences
  RS                   -          1500                      45/SC             Insertions     15
necessitate planning for at least several hundred gigabytes of data storage per sample.
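The storage requirement can be approximated from the throughput and read-length figures in Table 1. The following is a rough sketch, not a sizing tool: it assumes throughput is given in megabases per run, that reads are kept as uncompressed FASTQ (one byte per base plus one byte per quality score), and that each read header takes about 60 bytes. The function name `fastq_size_gb` and the header size are hypothetical choices for illustration.

```python
def fastq_size_gb(throughput_mb, read_length_bp, header_bytes=60):
    """Rough uncompressed FASTQ size in gigabytes for one sequencing run.

    header_bytes is an assumed average size of the '@' header line,
    including its newline; real headers vary by platform and software.
    """
    bases = throughput_mb * 1_000_000
    n_reads = bases / read_length_bp
    # Four lines per read: header, sequence, '+' separator, quality string.
    bytes_per_read = header_bytes + (read_length_bp + 1) + 2 + (read_length_bp + 1)
    return n_reads * bytes_per_read / 1e9

# Two rows from Table 1, treating each read of a pair as a separate record.
hiseq_gb = fastq_size_gb(throughput_mb=400_000, read_length_bp=100)
miseq_gb = fastq_size_gb(throughput_mb=1000, read_length_bp=150)
```

Under these assumptions a full HiSeq 2000 run comes out on the order of a terabyte of raw FASTQ, while a MiSeq run is only a few gigabytes. Compression, multiplexing of several samples per run, and intermediate assembly files all change the real figure.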
The currently accepted approaches most capable of assembling NGS data are k-mer-based de Bruijn graph traversal methods, including programs such as Velvet, SOAPdenovo, ALLPATHS, ABySS, the CLC Bio commercial
www.sciencedirect.com
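The k-mer de Bruijn graph idea behind the assemblers named above can be sketched in a few lines. This is a toy illustration under simplifying assumptions (error-free reads, no reverse complements, no coverage information) and does not reproduce how Velvet, SOAPdenovo, ALLPATHS, or ABySS are implemented. The function names `build_de_bruijn` and `walk_unambiguous` are hypothetical.

```python
from collections import defaultdict

def build_de_bruijn(reads, k):
    """Build a graph whose nodes are (k-1)-mers.

    Every k-mer found in a read adds one edge from its (k-1)-base prefix
    to its (k-1)-base suffix.
    """
    graph = defaultdict(set)
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            graph[kmer[:-1]].add(kmer[1:])
    return graph

def walk_unambiguous(graph, start):
    """Extend a contig from `start` along the only available path.

    The walk stops at a branch (more than one successor), at a merge
    (successor with more than one predecessor), at a dead end, or on a cycle.
    """
    indegree = defaultdict(int)
    for succs in graph.values():
        for s in succs:
            indegree[s] += 1
    contig, node, seen = start, start, {start}
    while len(graph.get(node, ())) == 1:
        (nxt,) = graph[node]
        if indegree[nxt] != 1 or nxt in seen:
            break
        contig += nxt[-1]   # each step contributes one new base
        seen.add(nxt)
        node = nxt
    return contig

# Three overlapping toy reads drawn from the sequence ATGGCGTGCAAT.
reads = ["ATGGCGT", "GGCGTGC", "GTGCAAT"]
graph = build_de_bruijn(reads, k=4)
contig = walk_unambiguous(graph, "ATG")  # "ATGGCGTGCAAT"
```

In a metagenome, sequence shared between organisms and uneven coverage create many branches in this graph, so unambiguous walks end early and contigs stay short.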
Table 2
Currently available software tools for analysis and assembly of metagenomes

Software/algorithm     References
Annotation and analysis
  MG-RAST              [38]
  IMG-M                [39]
  Ergatis              [27]
  DIYA                 [40]
  CloVR                http://clovr.org/
  RATT                 [41]
  VMGAP                [31]
  CAMERA               [42]
  METAREP              [43]
Assembly
  RAY                  [44]
  Velvet               [21]
  SOAPdenovo           [45]
  Newbler              [20]
  ABySS                [46]
  ALLPATHS             [16]
  Genovo               [24]
  CLCbio               http://clcbio.com
  Meta-IDBA            [47]
  MetaVelvet           http://metavelvet.dna.bio.keio.ac.j
Mapping/alignment
  BWA                  [48]
  Bowtie               [49]
  Novoalign            [50]
  SOAP                 [51]
  MrFAST               [52]
  CloudBurst           [53]
  BFAST                [54]
  MUMmer               [26]
  MOSAIK               http://bioinformatics.bc.edu/marthlab/Mosaik
  BLAST                [26]
  MAQ                  -
Figure 1
[Flowchart. Box labels: Character- or homology-based approaches; Gene-targeted metagenomics; OTU-based and hypothesis-testing approaches; Shotgun metagenomics; Read binning; Gene calling; Taxonomy/function-subfamily community profiling; Function-subfamily profiling; Assembly; Reads; Contig binning; Contig annotation; Read annotation; Contig-based taxonomy and functional profiling; Read-based taxonomy and functional profiling; Final analytical data set for analysis and community (metagenome) comparisons.]
Current Opinion in Biotechnology
Analytical stages and steps for analysis of metagenomic data from either amplicon sequencing or whole-sample shotgun metagenome sequencing.
genomic and metagenomic data [33]. Customized features can be added by users, and analysis pipelines can be
built and shared easily among scientists within the
research community.
Acknowledgements
This study was supported partly by Laboratory-Directed Research and
Development of Los Alamos National Laboratory under grant number
20100034DR, by the U.S. Department of Energy Joint Genome Institute
through the Office of Science of the U.S. Department of Energy under
Contract No. DE-AC02-05CH11231, and by grants from the U.S. Defense
Threat Reduction Agency under contract numbers B104153I and B084531I.
Future directions
NGS has enabled us to peer at the genetic composition of
complex communities in a way not thought possible only
a few years ago. While novel tools have been developed
specifically for such massively parallel high-throughput
sequencers, the complexity of metagenomic samples has
presented difficult challenges and exposed a number of
analytical bottlenecks. Despite problems inherent in
assembly or read-based analysis, both approaches need to
be examined to gain a more complete understanding of any
metagenome project and to begin answering the basic questions in metagenomics (who, what, how). Figure 1 illustrates the types of analyses that are possible for
metagenomes with NGS technologies and the interrelatedness of read-based and assembly-based analyses. There
6. Iwai S, Chai B, Sul WJ, Cole JR, Hashsham SA, Tiedje JM:
Gene-targeted-metagenomics reveals extensive diversity of
aromatic dioxygenase genes in the environment. ISME Journal
2009, 4:279-285.
This study is the first to target the dioxygenase gene family to understand
its diversity. It is one of few studies that target genes other than the
ribosomal RNA genes, yet it relies on many tools developed for 16S community profiling studies.