
AFLP and microsatellite analysis

Amplified Fragment Length Polymorphism


Pros:
* Large number of markers with relatively little lab effort
* No prior information about genome needed
* Genome-wide coverage
* Small amount of DNA needed

Cons:
* Markers are dominant (i.e. heterozygotes are scored as homozygotes, since both show a band)
* Can be tedious to score
* Size homoplasy
* Reproducibility?

STEP 1: Restriction-Ligation

STEP 2: Pre-selective PCR

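[Figure: restriction fragment with ligated adaptors. The EcoRI PRE-SELECTIVE PRIMER (GTAGACTGCGTACC AATT CA) anneals at one end and the MseI PRE-SELECTIVE PRIMER at the other. Remaining sequence labels in the diagram: CA, AT, GAGTCCTGAGTA.]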

STEP 3: Selective PCR

[Figure: the FAM-labeled EcoRI SELECTIVE PRIMER (GTAGACTGCGTACC AATT CACT) and the MseI SELECTIVE PRIMER anneal to the pre-selective PCR product. Remaining sequence labels in the diagram: GTAGACTGCGTACC, AATT, CA, GACA, AT, GAGTCCTGAGTA.]

EcoRI: 6 bp cutter --> one cut every 4096 bp
MseI: 4 bp cutter --> one cut every 256 bp
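The cut frequencies above follow from the length of the recognition site. A minimal sketch, assuming a random sequence with equal base frequencies (real genomes deviate from this, e.g. AT-rich genomes contain more MseI sites); the function name is hypothetical:

```python
def expected_cut_spacing(site_length: int) -> int:
    """Mean distance (bp) between recognition sites of a given length.

    Assumption: random sequence, all four bases equally frequent,
    so a specific site of length n occurs once every 4**n bp on average.
    """
    return 4 ** site_length


# Recognition sites: EcoRI cuts GAATTC, MseI cuts TTAA.
for enzyme, site in [("EcoRI", "GAATTC"), ("MseI", "TTAA")]:
    print(f"{enzyme}: one cut every {expected_cut_spacing(len(site))} bp")
```

Because MseI sites are expected about 16 times more often than EcoRI sites, most fragments have MseI ends on both sides.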

Selective PCR product contains many unlabeled fragments that will not be visible on ABI

[Figure: schematic of restriction sites along the genome. MseI sites greatly outnumber EcoRI sites, so most fragments have MseI sites at both ends.]

Number of bands in AFLP profile is determined by

1 Genome size: larger genome ---> more bands
2 Number of selective nucleotides in selective primers
3 Dilution of PCR product: low (noise) peaks get magnified

Why optimize number of bands?


1 Size homoplasy !!!!!
2 Difficult to score

Choosing selective primer combinations


EcoRI-AGT, EcoRI-AGC
Use few of these (expensive), but allows use of multiple colors (multiplex run on ABI)

MseI-CGT, MseI-CGA, MseI-CGC, MseI-CGG, etc.
Use many of these to get enough markers (cheap)

MseI-CGTG, MseI-CG
And use these to optimize number of bands:
An additional nucleotide reduces number of peaks 4-fold
One less nucleotide increases number of peaks 4-fold
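The 4-fold rule can be used to guess how a profile will change before ordering a new primer. A rough sketch, assuming each selective nucleotide matches one quarter of the fragments; the function name and the peak count of 200 are hypothetical illustration values:

```python
def predicted_peaks(current_peaks: float, nucleotide_change: int) -> float:
    """Predict the number of peaks after changing the selective primer length.

    nucleotide_change: +1 for one additional selective nucleotide,
    -1 for one less. Assumption: every selective nucleotide matches
    a quarter of the fragments (equal base frequencies).
    """
    return current_peaks * 4.0 ** (-nucleotide_change)


peaks_with_cgt = 200  # hypothetical count observed with MseI-CGT
print("MseI-CGTG:", predicted_peaks(peaks_with_cgt, +1))  # one more nucleotide
print("MseI-CG:  ", predicted_peaks(peaks_with_cgt, -1))  # one less nucleotide
```

The prediction is only a starting point; the actual number of peaks depends on the base composition of the genome.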

Reproducibility
High reproducibility has generally been reported

However, DNA quality is a crucial component (use the same DNA extraction protocol for all samples!)

Assess quality of data by repeating several samples from scratch, i.e. starting with DNA extraction
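One way to put a number on reproducibility is the fraction of markers that are scored differently in two independent replicates of the same individual. The slides do not give a formula, so the sketch below is an assumption (a simple per-marker mismatch rate); the function name and the example profiles are hypothetical:

```python
def replicate_error_rate(profile_a, profile_b):
    """Fraction of markers scored differently between two replicates.

    Each profile is a list of 0/1 scores (band absent/present) for the
    same markers in the same order.
    """
    if len(profile_a) != len(profile_b):
        raise ValueError("replicate profiles must cover the same markers")
    if not profile_a:
        raise ValueError("profiles are empty")
    mismatches = sum(a != b for a, b in zip(profile_a, profile_b))
    return mismatches / len(profile_a)


# Hypothetical replicate pair, both started from separate DNA extractions.
first_run = [1, 0, 1, 1, 0, 0, 1, 0]
second_run = [1, 0, 0, 1, 0, 0, 1, 0]
print(replicate_error_rate(first_run, second_run))
```

Averaging this rate over several replicated individuals gives an overall estimate for the data set.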

Ideal AFLP profile

1 Well separated peaks
2 Right number of peaks
3 Little noise
4 Peaks are distributed across size range
5 High level of polymorphism

Note: Genome size is correlated with noise level.
Around 20% of primer combinations provide profiles that are suitable for high throughput genotyping.

A very fine example

Too many peaks

Optimizing AFLP reactions


1 DNA quality
2 DNA quality
3 DNA quality (a successful AFLP analysis depends crucially on this)
4 Increase restriction time to 2 hours
5 Increase ligation time to 16 hours
6 Use fresh T4 ligase
7 Increase amount of DNA (rest-lig) added to pre-selective PCR (15 ul DNA in 50 ul reaction)
8 Reduce amount of DNA in selective PCR
9 Increase number of cycles in selective PCR
10 Increase amount of Taq in selective PCR
11 Several people have reported better results with TaqI vs MseI (but this requires different adaptors)

Scoring AFLP profiles


* Normalize samples: an arbitrary cut-off peak height has to be used, and it needs to be relative since different samples have different intensity.
* Set a high cut-off for inclusion as a marker (that is, at least one individual has to reach this cut-off peak height), then reduce the peak height for scoring presence/absence in the remaining individuals.
* In Genemapper do not use the auto-bin option. Make your own bins.
* Analyze all samples for the same primer set in the same project. This allows you to assess the reliability of the marker by scrolling across samples. It also prevents you from including non-polymorphic markers, and normalization is performed on all samples at the same time.
* Do not include peaks that do not show clear presence or absence in most cases.
* Score blindly to avoid bias.
* Check for overflow from a different dye.
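The two cut-offs described above can be sketched in a few lines. This is an illustration only, not what Genemapper does internally: normalizing by the summed peak height of each sample is an assumption, and the function names, cut-off values and peak heights are hypothetical.

```python
def normalize(peaks):
    """Scale peak heights so they sum to 1 within one sample (assumed scheme)."""
    total = sum(peaks.values())
    if total == 0:
        return {size_bin: 0.0 for size_bin in peaks}
    return {size_bin: height / total for size_bin, height in peaks.items()}


def score_profiles(samples, inclusion_cutoff, presence_cutoff):
    """Return 0/1 scores per sample for polymorphic markers only.

    samples: {sample_name: {size_bin: peak_height}}
    A bin becomes a marker if at least one sample reaches inclusion_cutoff;
    presence in every sample is then scored with the lower presence_cutoff.
    """
    normalized = {name: normalize(peaks) for name, peaks in samples.items()}
    all_bins = sorted({b for peaks in samples.values() for b in peaks})
    markers = [
        b for b in all_bins
        if any(n.get(b, 0.0) >= inclusion_cutoff for n in normalized.values())
    ]
    scores = {
        name: {b: int(n.get(b, 0.0) >= presence_cutoff) for b in markers}
        for name, n in normalized.items()
    }
    # Drop markers that are scored identically in all samples.
    polymorphic = [b for b in markers if len({s[b] for s in scores.values()}) > 1]
    return {name: {b: s[b] for b in polymorphic} for name, s in scores.items()}


# Hypothetical peak heights per size bin (bp) for three individuals.
samples = {
    "ind1": {100: 900, 150: 100},
    "ind2": {100: 300, 150: 50, 200: 650},
    "ind3": {150: 80, 200: 920},
}
print(score_profiles(samples, inclusion_cutoff=0.5, presence_cutoff=0.2))
```

In this example the 150 bp bin never reaches the inclusion cut-off in any individual, so it is not used as a marker.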

Normalization

Genemapper

Freeware for scoring AFLP from ABI runs: Genographer v 1.6, GenoProfiler 2.0

A few population genetic programs for AFLP analyses

RAPDFst: Fst (Lynch and Milligan, 1994)
Assumes H-W equilibrium

MVSP, NTSYS: Jaccard coefficient, Nei and Li (1979)
Arlequin, TFPGA: AMOVA
Genalex: Φst (analog of Fst), AMOVA
Structure, BAPS: inference of population structure
Low information content
Treats multilocus data as single haplotype

Hickory: Bayesian estimation of F statistics for dominant markers
No assumption of H-W equilibrium
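The Jaccard and Nei and Li (1979) coefficients mentioned above are simple band-sharing measures and are easy to compute by hand for a pair of individuals. A minimal sketch using the standard definitions (a = bands shared, b and c = bands present in only one individual); the function names and example profiles are hypothetical and not taken from MVSP or NTSYS:

```python
def band_counts(x, y):
    """Count shared bands (a) and bands unique to x (b) or to y (c)."""
    if len(x) != len(y):
        raise ValueError("profiles must cover the same markers")
    a = sum(1 for i, j in zip(x, y) if i == 1 and j == 1)
    b = sum(1 for i, j in zip(x, y) if i == 1 and j == 0)
    c = sum(1 for i, j in zip(x, y) if i == 0 and j == 1)
    return a, b, c


def jaccard(x, y):
    """Jaccard similarity: a / (a + b + c). Shared absences are ignored."""
    a, b, c = band_counts(x, y)
    return a / (a + b + c) if (a + b + c) else 0.0


def nei_li(x, y):
    """Nei and Li (1979) similarity: 2a / (2a + b + c)."""
    a, b, c = band_counts(x, y)
    return 2 * a / (2 * a + b + c) if (a + b + c) else 0.0


# Hypothetical 0/1 AFLP profiles of two individuals.
ind1 = [1, 1, 0, 1, 0]
ind2 = [1, 0, 0, 1, 1]
print(jaccard(ind1, ind2), nei_li(ind1, ind2))
```

Both coefficients ignore shared band absences, which is sensible for dominant markers because the absence of a band can have many causes.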

Microsatellites
* Di- or tri-nucleotide repeats
* Ubiquitous
* High mutation rate (10^-2 to 10^-6)

High level of variability

Mutational mechanism
Slippage during replication
(also happens during PCR)

[Figure: a (GT)n template strand (ACCGAGTCGATC(GT)nACGCTA) paired with its complementary (CA)n strand during replication; in the second panel a CA repeat unit is looped out.]

Increases or decreases number of repeats

Slippage increases with number of repeats

Obtaining Microsatellites

Screening sequenced genomes


Screening enriched genomic library

Glenn and Schable (2005) Methods in Enzymology 395: 202-222.


This paper is particularly useful. It comes from a lab that has isolated microsatellites from 125+ species.

SELECTING LOCI
Too few repeats: low variability

Too many repeats: difficult to score, homoplasy

Choosing loci: 8-20 uninterrupted repeats

Screening of loci:
* Number of alleles
* Heterozygosity, allelic richness
* Cloning pool of PCR amplicons, followed by labeled PCR
* M13 labeled primers
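The screening statistics can be computed directly from a small test panel of genotypes. A minimal sketch, assuming diploid genotypes recorded as pairs of allele sizes; the function name and the genotypes are hypothetical, and expected heterozygosity is given without a small-sample correction. Allelic richness is not included because it needs rarefaction to a common sample size.

```python
from collections import Counter


def locus_summary(genotypes):
    """Number of alleles, observed and expected heterozygosity for one locus.

    genotypes: list of (allele_1, allele_2) tuples of allele sizes in bp.
    Expected heterozygosity is 1 - sum(p_i ** 2), uncorrected for sample size.
    """
    if not genotypes:
        raise ValueError("no genotypes given")
    allele_counts = Counter(allele for pair in genotypes for allele in pair)
    total_alleles = 2 * len(genotypes)
    h_exp = 1.0 - sum((n / total_alleles) ** 2 for n in allele_counts.values())
    h_obs = sum(a != b for a, b in genotypes) / len(genotypes)
    return {"n_alleles": len(allele_counts), "Ho": h_obs, "He": h_exp}


# Hypothetical genotypes of four individuals at one dinucleotide locus.
panel = [(120, 124), (120, 120), (124, 128), (120, 124)]
print(locus_summary(panel))
```

A locus with a single allele, or with an observed heterozygosity far below the expected value, is a candidate for exclusion.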

M13 tailed primer
Boutin-Ganache et al (2001) Biotechniques 31, 26-28

[Figure: PCR scheme with a forward primer carrying an M13-tail, a reverse primer, and a FAM-labeled M13 primer; one primer is marked "(Low concentration)".]

Some scoring issues

Great looking heterozygote

Extra peak because of partial addition of the A overhang by Taq

Stutter bands of the two high peaks due to slippage

Heterozygote

A single large allele with many repeats: lots of slippage

35 repeats: increase in slippage with increase in repeat number

How many alleles?

Find a heterozygote that clearly shows the shape of a single allele

The alleles

Electrophoresis artifacts
(Fernando et al (2001) Mol. Ecol. Notes 1, 325-328)

The figure shows the difference in peak shape of the same PCR products loaded at different concentrations

Do not overload your gel!

Also keep in mind that in different PCRs the left peak or the right peak may be dominant

Optimizing PCR
Avoid null alleles (or try to):
* Minimize annealing temp: lowest temp that produces clean bands
* MgCl2 concentration: increase reduces specificity
* Different species: design new primers (if possible)
(In my limited experience with cross-species amplification, null alleles can be a big problem)

Reduce stutter:
* Reduce number of cycles (seems to be most successful)
* Reduce amount of MgCl2
* Touchdown PCR
* 2/2/8 PCR (2 sec denaturation, 2 sec annealing, 8 sec extension)
* BSA, DMSO

Addition of A:
* Increase final extension time
* Add pigtail (GTTTCTT) on 5' end of reverse primer to facilitate addition of A overhang

Analysis Issues
Null alleles: microsats' biggest problem

Are loci in HW equilibrium? Linkage disequilibrium?
Population subdivision causes both. Null alleles only cause HW disequilibrium.

Possible solutions:
* Remove loci from analysis (if enough loci are available)
* Check if HW disequilibrium influences results by temporarily removing affected loci.
* Adjust allele and genotype frequencies (Microchecker)
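A quick first check for null alleles is the size of the heterozygote deficit at a locus. The sketch below uses a simple estimator attributed to Brookfield (1996), r = (He - Ho) / (1 + He); this choice is an assumption made here for illustration, the slides do not say which estimator is meant, and the function name and input values are hypothetical.

```python
def null_allele_frequency(h_exp, h_obs):
    """Estimate null allele frequency from a heterozygote deficit.

    Uses r = (He - Ho) / (1 + He), the simple Brookfield (1996) estimator.
    Assumes the deficit is caused only by null alleles and that no
    individuals are null homozygotes. Returns 0.0 if there is no deficit.
    """
    for value in (h_exp, h_obs):
        if not 0.0 <= value <= 1.0:
            raise ValueError("heterozygosity must lie between 0 and 1")
    return max(0.0, (h_exp - h_obs) / (1.0 + h_exp))


# Hypothetical locus: expected heterozygosity 0.6, observed 0.4.
print(round(null_allele_frequency(0.6, 0.4), 3))
```

A heterozygote deficit can also come from population subdivision, so an estimate like this should only be interpreted after checking whether the deficit is limited to a few loci.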

Some population genetics software

Microsatellite toolkit: Excel plug-in for creating Arlequin, FSTAT and Genepop files.
Microchecker: Estimate null allele frequency. Adjust allele frequencies.
Arlequin: HW equilibrium, linkage disequilibrium, Fst, exact test of differentiation, AMOVA, Mantel test
FSTAT: Allelic richness, Fst per locus (to check contribution of each locus to observed pattern of differentiation)
Structure, BAPS: Population structuring, population assignment.
Migrate: Estimates of effective population size and migration rates
Bottleneck: Check for very recent population bottlenecks
