Wenz Et Al. - 2004 - A Novel High-Throughput SNP Genotyping System Utilizing Capillary Electrophoresis Detection Platforms

A Novel High-Throughput SNP Genotyping System Utilizing Capillary
Electrophoresis Detection Platforms
H. Michael Wenz et al.*
Applied Biosystems, 850 Lincoln Centre Dr., Foster City, CA 94404, USA
ABSTRACT
We have developed a high-throughput SNP genotyping system that offers speed
and flexibility, produces reliable, high-quality data, and supports large-scale
genotyping projects for disease research, including association and linkage
analysis. This SNPlex Genotyping System is based on multiplex OLA/PCR and
capillary electrophoresis for high-throughput genotyping. The assay uses an
optimized universal set of ZipChute reagents for accurate read-out on
high-throughput capillary electrophoresis platforms. Genomic DNA is interrogated
with multiplexed sets of ligation probes, currently targeting up to 48 specific
SNP loci in each reaction. Such a multiplex reaction uses less than 1 ng of gDNA
per SNP genotype. A pair of universal PCR primers amplifies all ligation products
in a multiplex simultaneously. Amplicons containing internal universal cZipCode
oligonucleotide sequences are hybridized to a corresponding mix of universal
fluorescent ZipChute reagents. ZipChute reagents contain sequences complementary
to the cZipCode oligonucleotide sequences and exhibit unique pre-optimized
mobilities during electrophoresis. Genotypes are determined by identifying the
specifically hybridized ZipChute reagents, which are eluted, detected by
capillary electrophoresis, and subsequently associated with target SNPs using
the GeneMapper Analysis Software. Using a 48-plex format with the Applied
Biosystems 3730xl instrument and commercially available robotics systems, one
can process approximately 1.5 million genotypes in 5 days. Statistics for 1,250
population-validated SNPs are presented, including the in silico assay design,
conversion rate, accuracy, and call rate.
INTRODUCTION
Genome-wide linkage and association studies involving thousands of DNA
samples in combination with multiple SNP loci require tens of thousands of
genotypes per project. Such large-scale studies necessitate a genotyping
solution that fulfills the following requirements: 1) low consumption of gDNA, 2)
high throughput, 3) flexible assay platform, 4) high call rate and accuracy, 5) ease
of use, and 6) low cost. We are developing the SNPlex System, a genotyping
technology with the potential to meet all of these requirements. The assay uses a
multiplexed oligonucleotide ligation assay (OLA) on gDNA, followed by a
universal PCR. The encoded genotype information is read using a set
of common ZipChute probes. ZipChute probes are hybridized to complementary
sequences that are part of genotype-specific amplicons. These ZipChute probes
are eluted and detected by electrophoretic separation on Applied Biosystems
3730xl DNA Analyzers. This approach is an attractive alternative to existing
genotyping methodologies since it requires only three unlabeled probes per SNP,
consumes a low amount of gDNA, can be highly multiplexed, and uses widely
available capillary electrophoresis instruments.
MATERIALS AND METHODS
Fragmented gDNA is interrogated directly by a set of three unlabeled ligation
probes per SNP in a multiplex of 48 assays (Figure 1). Ligation probes are
designed using proprietary design software (Figure 3). After phosphorylation of
OLA probes and universal linkers, genotype-specific ligation and an enzymatic
clean-up are performed. PCR amplification using two universal PCR primers
follows. Biotinylated amplicons are bound to streptavidin-coated plates.
Single-stranded PCR products are interrogated by a set of universal ZipChute
probes. These probes are fluorescently labeled, have a unique sequence that is
complementary to a specific portion of the single-stranded PCR product, and
contain mobility modifiers (Figure 1). After elution, ZipChute probes are
electrophoretically separated on an Applied Biosystems 3730xl DNA Analyzer.
ZipChute probe pairs representing both alleles for a given SNP are arranged into
markers. Two bins for each marker represent both alleles of the respective SNP.
The intensities of specific signals in each bin of a marker are automatically
converted into cluster plots using GeneMapper Analysis Software version 3.5.
Cluster plots, together with data that associate a universal ZipChute pair with
the respective individual SNP, are used to automatically call genotypes
(Figure 2).
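The step from two bin intensities to a genotype call can be illustrated with a minimal Python sketch. It is not the GeneMapper algorithm, which the poster does not describe. The function names, the fixed angle cut-offs (30 and 60 degrees), and the minimum signal magnitude are all assumptions chosen for illustration, and fixed cut-offs stand in here for real cluster analysis.

```python
import math


def to_polar(bin1, bin2):
    """Convert the two allele-bin intensities of a marker to (angle_deg, magnitude)."""
    angle_deg = math.degrees(math.atan2(bin2, bin1))
    magnitude = math.hypot(bin1, bin2)
    return angle_deg, magnitude


def call_genotype(bin1, bin2, min_magnitude=500.0, low_deg=30.0, high_deg=60.0):
    """Call a genotype from two bin intensities.

    All thresholds are hypothetical; a real caller would derive cluster
    boundaries from the data instead of using fixed angles.
    """
    angle_deg, magnitude = to_polar(bin1, bin2)
    if magnitude < min_magnitude:
        return "no call"              # signal too weak, e.g. a no-template control
    if angle_deg < low_deg:
        return "allele1/allele1"      # signal mostly in bin 1
    if angle_deg > high_deg:
        return "allele2/allele2"      # signal mostly in bin 2
    return "allele1/allele2"          # comparable signal in both bins


for intensities in [(5000, 100), (3000, 3000), (100, 5000), (50, 50)]:
    print(intensities, call_genotype(*intensities))
```

Plotting angle against magnitude for every sample of one marker gives the kind of polar cluster plot referred to in the text, with homozygotes near the two axes and heterozygotes in between.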
Results
Figure 4. Cluster Plots for 8 Representative SNPs
A total of 1250 high-quality SNPs were selected from the Applera Corp.
resequencing and discovery project. The SNP targets were submitted to the
automated assay and pool design pipeline (Figure 3). Successfully designed and
synthesized OLA probe sets were tested against a panel of 92 DNA samples
from Coriell Cell Repositories. Genotype data were compared to TaqMan
probe-based assays (Ranade et al., 2001) and sequencing data.
Figure 5. Assay Reproducibility
Figure 1. Simplified Diagram of the SNPlex System Assay
[Figure: assay workflow and probe schematic. Workflow: Start; Prep genomic DNA;
Kinase probes and linkers (activate probes); Perform OLA (allelic
discrimination); Purify by enzymatic digestion (purification); Perform Universal
PCR (amplification); Capture on SA-plates (purification); Hybridize ZipChute Set
(anneal reporter probe); Wash, Elute & Load on 3730 (read out on CE); Primary
Analysis & QC data (allele calling). Time annotation: ~16 hrs. Schematic steps:
Step 1: Activation; Step 2: Ligation; Step 3: Purification; Step 4:
Amplification; Step 5: Capture; Step 6: Hybridization; Step 7: Elution. Labeled
components: fw primer site, rev primer site, ZipCode1, ZipCode2, GER, ASO1,
ASO2, LSO, spacer, ASO-Linker 1, ASO-Linker 2, LSO-Linker, gDNA, ligation
product, biotin, streptavidin, fluorescent label, ZipChute probe, mobility
modifier.]

Figure 2. The SNPlex System Detection
[Figure: Step 1: Detection. ZipChute probes; Marker (x48) = SNP identification.
Each ZipChute probe pair is representative of one SNP.]

Table 1. Summary of In Silico Probe Design Success for 1,250 SNP Loci, Selected
from the Applera Genome Initiative

                                        Total    Percent
SNPs submitted                          1250     100
Successful designs                      1166     93.3
Successful design for both strands1     762      61.0
Successful design for A only1           404      32.3

OLA probes were designed using a proprietary probe design algorithm. Both DNA
strands were initially targeted for probe design.
For 84 SNPs, both strands failed design due to common incompatible sequence
motifs, secondary structure, and repeat sequences within the genome.
Based on probe design rules, one strand is favored over the other based on
sequence composition and secondary structure. Our design filters indicated that
second strand synthesis should not be attempted.
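The Table 1 percentages follow directly from the reported counts, which the short Python check below reproduces. The counts are taken from the table. The variable names and the `percent` helper are illustrative only.

```python
# Counts taken from Table 1.
submitted = 1250
both_strands = 762
strand_a_only = 404

successful = both_strands + strand_a_only   # designs that passed on at least one strand
failed = submitted - successful             # SNPs for which both strands failed design


def percent(part, whole):
    """Percentage rounded to one decimal place, as in the table."""
    return round(100.0 * part / whole, 1)


table1 = {
    "Successful designs": percent(successful, submitted),
    "Successful design for both strands": percent(both_strands, submitted),
    "Successful design for A only": percent(strand_a_only, submitted),
    "Failed on both strands": percent(failed, submitted),
}

for label, value in table1.items():
    print(f"{label}: {value}%")
```

The 84 SNPs that failed on both strands are the remainder after subtracting the 1166 successful designs from the 1250 submitted.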
Figure 5. Polar plot representation of three SNPs (CV1163126, CV11918682, and
CV2059319) from the study described in Table 4. The three plots in A show, for
each SNP, data points for 92 DNAs. In B, four separate runs were combined into
one project and the data points for all four runs were overlaid, so each plot
shows 368 data points. In C, all 12 separate runs were combined into one project
and the data points for all 12 runs were overlaid, so each plot shows 1104 data
points. No-template controls (NTC) are plotted with their own symbol. Data
points marked with an x indicate a no call for that particular DNA.
[Figure 2 panel labels: Step 2: Analysis; Bin1 + Bin2 = Allele Identification;
Applied Biosystems 3730xl DNA Analyzer; Step 3: Genotype Clustering.]

Table 2. Automated Assay Pass Rate for Successfully Designed and Synthesized
Assays

                            Total    Percent
Tested Designs              1928     100.0
Passed Designs              1773     92.0
Tested Designs strand A     1166     100.0
Passed Designs strand A     1068     91.6
Tested Designs strand B     762      100.0
Passed Designs strand B     705      92.5
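The pooled pass rate in Table 2 is the ratio of summed counts across both strands, not the mean of the two per-strand percentages. A small Python sketch using the tabulated counts shows this. The dictionary layout and the helper name are assumptions made for illustration.

```python
# Counts taken from Table 2, keyed by strand.
tested = {"A": 1166, "B": 762}
passed = {"A": 1068, "B": 705}


def pass_rate(n_passed, n_tested):
    """Pass rate in percent."""
    return 100.0 * n_passed / n_tested


per_strand = {s: round(pass_rate(passed[s], tested[s]), 1) for s in tested}

# Pool by summing counts across strands before dividing.
total_tested = sum(tested.values())
total_passed = sum(passed.values())
pooled = round(pass_rate(total_passed, total_tested), 1)

print(per_strand, total_tested, total_passed, pooled)
```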
References
High-Throughput Genotyping with Single Nucleotide Polymorphisms (2001). Ranade, K., Chang, M.-S., Ting, C.T., Pei, D., Hsiao, C.-F., Olivier, M., Pesich, R., Hebert, J., Chen, Y.-D., Dzau, V. J., Curb, D., Olshen, R., Risch, N., Cox, D. R. and Botstein, D. Genome Res., 11, 1262-1268.
Acknowledgements
*The author wishes to acknowledge the following groups at Applied Biosystems for their support of this project:
Genomic Applications R&D, Pilot Operations R&D, Bioinformatics, Global Oligo Operations, Product Test, Genotyping
Applications Marketing, Consumables Development and Manufacturing, and Analysis Software R&D.
Note
All designs that passed strand A and strand B were tested.
For Research Use Only. Not for use in diagnostic procedures.
The PCR process is covered by patents owned by Roche Molecular Systems, Inc. and F. Hoffmann-La Roche Ltd.
Applied Biosystems and GeneMapper are registered trademarks and AB (Design), Applera, SNPbrowser, SNPlex,
ZipChute, and ZipCode are trademarks of Applera Corporation or its subsidiaries in the US and/or certain other
countries.
TaqMan is a registered trademark of Roche Molecular Systems, Inc.
Genotype clustering with polar plotting and genotype calling, using GeneMapper Analysis Software version 3.5

Table 3. SNPlex System Concordance and Call Rate

                                          Percent
Concordance between strands1              99.6%
Concordance between duplicate DNAs2       99.9%
Concordance with TaqMan assay3            99.2%
Concordance with sequencing4              98.5%
Mean call rate                            96.4%

1 To determine concordance between strands, data for approximately 190,000
genotype calls were compared.
2 Two identical DNA samples were placed on the DNA plate. Concordance was
determined based on 5574 genotype calls.
3 Concordance with the TaqMan probe-based assay is based on 89 DNA samples
in common between genotyping by SNPlex and the TaqMan probe-based assay. The
non-concordance is due to errors in both SNPlex and the TaqMan probe-based
assay, which are approximately equal. Thus, the SNPlex error rate is about half
of the 0.8% discordance (about 0.4%), corresponding to a real accuracy of
approximately 99.6%.
4 Concordance with sequencing is based on 14 DNA samples in common between
genotyping by SNPlex and sequencing.
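The accuracy estimate in footnote 3 of Table 3 can be written out as a short calculation. The Python sketch below assumes, as the footnote does, that discordant calls are split about equally between the two assays. The function name and the `share_of_errors` parameter are illustrative, and the model ignores the case where both assays miscall the same genotype.

```python
def estimated_accuracy(concordance_pct, share_of_errors=0.5):
    """Estimate one assay's accuracy from its concordance with a second assay.

    share_of_errors is the fraction of discordant calls attributed to the
    assay of interest; 0.5 reflects the footnote's equal-error assumption.
    """
    discordance_pct = 100.0 - concordance_pct
    return 100.0 - share_of_errors * discordance_pct


# 99.2% concordance with the TaqMan probe-based assay (Table 3).
print(round(estimated_accuracy(99.2), 1))
```

Attributing every discordant call to SNPlex (`share_of_errors=1.0`) returns the raw concordance of 99.2%, which is the lower bound under this model.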
Experimental design. A set of SNPs was selected for which both sequencing
and TaqMan probe-based assay data were available. This set is a subset of SNPs
with genotype information from both the Applera Genome Initiative and the
TaqMan Assays-on-Demand SNP Genotyping Products. Selected assays were
required to have between 9 and 14 high-quality sequencing reads for DNA
samples in common between the re-sequencing data and the TaqMan data. A
further criterion was to select only SNPs whose minor allele frequencies,
determined by the 5' nuclease assay, were at least 0.10 in at least one of the
major populations. These assays were previously tested against a panel of 89 DNA
samples from the Coriell Institute Cell Repository to obtain allele frequencies.

Figure 3. SNPlex System Assay Design Pipeline
[Figure: pipeline diagram. Labels: Start Design; myScience Web Portal; SNP
sequence; Input flat-file; Input XML file; Input validation; SNP Specificity;
Genome Assembly; Assay Design; Batch; Multiplex Pool Design; XML; Assay
Information File (XML); Review Design Report; Probe Sets (48-plex);
Manufacturing; Probe Sets.]

Table 4. SNPlex System Call Rate, Concordance, and Precision

Number of replicate runs    Call rate    Concordance    Precision
12                          98.68%       99.79%         99.97%
One additional SNP set, consisting of 48 high-quality SNPs with minor allele
frequencies of typically >0.1 in four major populations (Caucasian, African
American, Japanese, Chinese), was selected from the Applera Corp. SNP discovery
project. This 48-plex SNP set was run through the whole SNPlex system 12
times. For the 46 SNPs that always passed the automated GeneMapper analysis
across all twelve runs, call rate, concordance, and precision were calculated.
Concordance was determined based on 34,891 genotypes that were in common
between SNPlex and the TaqMan Assays-on-Demand SNP Genotyping Products.
Of the 576 possible assays, 560 (97.2%) passed the automatic genotyping
pipeline. Of the 16 failed assays, 12 came from one SNP that failed in all
twelve runs and 4 from a second SNP that failed in 4 out of 12 runs due to
insufficient cluster separation.
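The assay counts in this paragraph can be cross-checked with a few lines of Python. All numbers come from the text above. The variable names are illustrative.

```python
runs = 12
snps_per_run = 48

possible_assays = runs * snps_per_run          # one assay per SNP per run

# Runs failed per problematic SNP: one SNP failed every run, a second failed 4 runs.
failed_runs_per_snp = [12, 4]
failed_assays = sum(failed_runs_per_snp)
passed_assays = possible_assays - failed_assays

pass_pct = round(100.0 * passed_assays / possible_assays, 1)

# SNPs that passed automated analysis in every run.
always_passed_snps = snps_per_run - len(failed_runs_per_snp)

print(possible_assays, failed_assays, passed_assays, pass_pct, always_passed_snps)
```

The result of 46 always-passing SNPs matches the number used for the call rate, concordance, and precision figures in Table 4.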
© 2004 Applied Biosystems. All rights reserved.
Conclusions
We show initial results of SNP genotyping achieved with the SNPlex
Genotyping System, an assay for the high-throughput determination
of SNP genotypes. Genotypes are detected with low consumption of
gDNA (~0.8 ng/genotype), high reproducibility and accuracy, and
high throughput on readily available capillary
electrophoresis-based read-out systems. In this study, which used
population-validated SNPs with a minor allele frequency of at
least 5% in at least one major population, the overall system
conversion rate was ~85% (93% in silico design conversion x 92%
assay conversion). The SNPbrowser software can be used to select
these SNPs and to display their position within LD and haplotype
blocks. Where possible, those SNPs can be used as backbone SNPs
for the required application. At the 48-plex level, one can
conceivably generate approximately 1.5 million genotypes in 5 days
on an Applied Biosystems 3730xl system with commercially available
robotics systems. Genotypes are called automatically using
GeneMapper software, with the option of manual editing. The system
is currently configured at 48-plex per reaction; future system
enhancements include increased throughput through more highly
multiplexed reactions, such as 96- and 192-plex per reaction.
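The headline figures in the conclusions can be turned into back-of-the-envelope quantities with a short Python sketch. The inputs (93%, 92%, 1.5 million genotypes, 5 days, 48-plex, ~0.8 ng/genotype) are the rounded values quoted above. The derived per-day, per-reaction, and gDNA figures are not stated in the poster and assume that every reaction yields all 48 genotypes.

```python
design_conversion = 0.93      # in silico design conversion
assay_conversion = 0.92       # assay conversion
overall_conversion = design_conversion * assay_conversion

genotypes = 1_500_000         # approximate genotypes per campaign
days = 5
plex = 48                     # SNPs interrogated per reaction
gdna_ng_per_genotype = 0.8    # approximate gDNA consumption

genotypes_per_day = genotypes / days
reactions = genotypes / plex                       # 48-plex reactions needed
gdna_ng_per_reaction = gdna_ng_per_genotype * plex

print(f"overall conversion: {overall_conversion:.3f}")
print(f"genotypes per day: {genotypes_per_day:,.0f}")
print(f"48-plex reactions: {reactions:,.0f}")
print(f"gDNA per reaction: {gdna_ng_per_reaction:.1f} ng")
```

The product of the two rounded conversion rates is roughly 0.86, in line with the approximate ~85% quoted. At 96- or 192-plex, the same genotype count would need proportionally fewer reactions under the same assumptions.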
