Академический Документы
Профессиональный Документы
Культура Документы
12
Multifactorial Inheritance
and Complex Diseases
Christine W Duarte, Laura K Vaughan, T Mark Beasley,
and Hemant K Tiwari
Department of Biostatistics, Section on Statistical Genetics, University of Alabama at Birmingham,
Birmingham, AL, USA
This article is a revision of the previous edition article by Hemant K Tiwari, T Mark Beasley, Varghese George and David B Allison,
volume 1, pp 299306, 2007, Elsevier Ltd.
12.1 INTRODUCTION
The polygenic model has its origins from Fishers seminal work (2), which showed that many small, equal and
additive loci would result in a Gaussian (or normal)
distribution for a phenotype. Similarly, the combined
additive effects of many genetic and environmental factors will also produce an approximately Gaussian phenotypic distribution. To illustrate, suppose (navely)
that a quantitative trait such as percent body fat is determined by a single gene with two codominant1 alleles, A
and a, which have equal frequency (p=0.50). Assume
individuals with an A allele tend to have a higher value
of the trait, and individuals with an a allele tend to have
a lower value of the trait. If A has an additive effect, then
there are three distinct phenotypic groups, namely, high
(2), intermediate (1), and low (0). If the allele frequencies of A and a are both 0.50, then 25% of individuals
would be expected to be aa and of low-percent fat, 50%
would be expected to be Aa and of moderate-percent
fat, and 25% would be expected to be AA and of highpercent fat. Figure 12-1 gives the distribution of the trait
in a population.
Now, suppose that the trait is determined by two loci.
The second locus also has two codominant alleles, B for
high and b for low expression of the trait, with B having
an allele frequency of 0.50 and the same effect magnitude as the A allele. There are now nine possible genotypes (see Table 12-1).
1An
AA
Aa
aa
4
3
2
3
2
1
2
1
0
TA B L E 1 2 - 1 Frequency Distribution of
Genotypic Values for Two Loci
with No Linkage Disequilibrium
BB
Bb
Bb
AA
Aa
aa
0.0625
0.1250
0.0625
0.1250
0.2500
0.1250
0.1250
0.1250
0.0625
An individual can possess 0, 1, 2, 3, or 4 hightrait alleles. Assuming that the combined effects of the
two loci are also additive,2 there are five distinct phenotypes with respect to the number of high-trait alleles (see
Table12-2).
The trait distribution with respect to genotypic value
distribution is shown in Figure 12-2. As can be seen in
Figure 12-2, even with two loci, the distribution of the
phenotype starts to look Gaussian. An example of a
three-locus system with equal allele frequencies, no linkage disequilibrium,3 and equal additive effects, is shown
in Figure 12-3. It can be shown that six diallelic loci
are enough to produce population frequencies virtually
indistinguishable from a normal curve.
Many traits (or diseases) are treated as dichotomous
variables because they appear to be either present or absent
(e.g. cancer). By definition, dichotomous variables do not
approximate a Gaussian distribution. However, these
diseases may still be polygenic or multifactorial because
2That
21 =
Concordant
Pair
Discordant
Pair
Total Pairs
n11
n21
nC
n12
n22
nD
nMZ
nDZ
n
VDZ VMZ
VMZ
Var(G)
Var(T)
Var(A)
Var(T)
(a 1)ns(a)
Ks = s = 1 a = 1
s=1
a = 1 (s 1)ns(a)
these relative pairs. Methods based on the variance components framework have become a popular choice for
linkage analysis because of ease of modeling covariates
and genegene or geneenvironment interactions, and
can utilize large extended families in the model-free environment (4246). There are several programs available
to perform linkage analyses based on variance components methodology such as ACT (43), SOLAR (42), and
Mx (http://www.vcu.edu/mx/). Mx uses a structural
equation modeling approach, which is equivalent to the
other types of variance components approaches under
most circumstances. There are many more statistical
genetics packages freely available for public use, and a
complete list along with download information for each
package can be found at http://linkage.rockefeller.edu/
soft/ and http://www.soph.uab.edu/ssg/linkage/lddac.
due to relatedness or population substructure. Casecontrol and population data have been commonly used
for GWAS because of availability and convenience of
ascertainment. There are some issues associated with the
case-control design. If the disease is heterogenous, extra
attention should be paid to minimize heterogeneity in
case selection, e.g. selecting the most extreme cases or
selecting individuals from a familial disease cohort. There
has been a controversy in how to select optimal controls.
Usually, controls from the same population and residing
in same geographic area are preferred, but these can be
difficult to ascertain. The Wellcome Trust Case Control
Consortium used 3000 UK controls and 2000 cases from
each of seven different diseases and showed that using
common controls was effective, had minimal impact on
genotypic distributions, and did not lead to excess false
positives (58,59). Misclassification error in control selection could affect the power of the association analysis.
Specifically, this is true for late-onset diseases because
controls have not yet reached the age to develop the disease; this issue can be resolved by increasing the sample
size (58). Population stratification and cryptic relatedness (i.e. relatedness among the individuals in the study
that is not known to the investigator) can also increase
the false-positive findings as previously discussed. The
family-based association studies are robust to population
stratification, but it is difficult to ascertain all pedigree
members, which leads to missing data within families
and loss of power compared to case-control designs (60).
There are other issues with study design selection; an
excellent review is provided by McCarthy etal. (58).
10
comprehensive discussion of the advantages and disadvantages of these methods pertaining to GWAS.
12.9 CONCLUSIONS
Genetic modeling is a challenging art and science. Advances in molecular technology, statistical
methodology,6 and increasing availability of large
6A
11
REFERENCES
1. Badano, J. L.; Katsanis, N. Beyond Mendel: An Evolving
View of Human Genetic Disease Transmission. Nat. Rev.
Genet. 2002 Oct, 3 (10), 779789.
2. Fisher, R. A. The Correlation between Relatives on the Supposition of Mendelian Inheritance. Trans. R. Soc. Edinb.
1918, 52, 399433.
3. Falconer, D. S. The Inheritance of Liability to Certain Diseases, Estimated from the Incidence among Relatives. Ann.
Hum. Genet. 1965 Aug, 29, 5176.
4. Reich, T.; James, J. W.; Morris, C. A. The Use of Multiple
Thresholds in Determining the Mode of Transmission of
Semi-Continuous Traits. Ann. Hum. Genet. 1972 Nov, 36
(2), 163184.
5. Chakraborty, R. The Inheritance of Pyloric Stenosis Explained
by a Multifactorial Threshold Model with Sex Dimorphism
for Liability. Genet. Epidemiol. 1986, 3 (1), 115.
6. Dronamraju, K. R.; Bixler, D.; Majumder, P. P. Fetal Mortality Associated with Cleft Lip and Cleft Palate. Johns Hopkins Med. J. 1982 Dec, 151 (6), 287289.
7. Dronamraju, K. R.; Bixler, D. Fetal Mortality in Oral Cleft
Families(IV): The Doubling Effect. Clin. Genet. 1983 Jul,
24 (1), 2225.
8. Elston, R. C.; Boklage, C. E. An Examination of Fundamental Assumptions of the Twin Method. Prog. Clin. Biol. Res.
1978, 24A, 189199.
9. Hopper, J. L. Twin Concordance. In Encyclopedia of Biostatistics; John Wiley: New York, 1998; Vol. 6, pp 46264629.
10. Karlin, S.; Cameron, E. C.; Williams, P. T. Sibling and
ParentOffspring Correlation Estimation with Variable
Family Size. Proc. Natl. Acad. Sci. U.S.A. 1981 May, 78 (5),
26642668.
11. Neale, M. C. Adoption Studies. In Encyclopedia of Biostatistics; John Wiley: New York, 1998; Vol. 1, pp 7781.
12. Neale, M. C.; Cardon, L. R. Methodology for Genetic Studies of Twins and Families; Kluwer: London, 1992.
13. Carey, G. Sibling Imitation and Contrast Effects. Behav.
Genet. 1986 May, 16 (3), 319341.
14. Lykken, D. T.; McGue, M.; Tellegen, A.; Bouchard, T. J., Jr.
Emergenesis. Genetic Traits That May Not Run in Families.
Am. Psychol. 1992 Dec, 47 (12), 15651577.
15. Risch, N. Linkage Strategies for Genetically Complex Traits.
I. Multilocus Models. Am. J. Hum. Genet. 1990 Feb, 46 (2),
222228.
16. Olson, J. M.; Cordell, H. J. Ascertainment Bias in the Estimation of Sibling Genetic Risk Parameters. Genet. Epidemiol. 2000 Mar, 18 (3), 217235.
17. Visscher, P. M.; Medland, S. E.; Ferreira, M. A.; Morley,
K. I.; Zhu, G.; Cornes, B. K.; Montgomery, G. W.; Martin, N. G. Assumption-Free Estimation of Heritability from
Genome-Wide Identity-by-Descent Sharing between Full Siblings. PLoS Genet. 2006 Mar, 2 (3), e41, Epub 2006 Mar 24.
18. Morton, N. E. Sequential Tests for the Detection of Linkage.
Am. J. Hum. Genet. 1955 Sep, 7 (3), 277318.
19. Elston, R. C.; Stewart, J. A General Model for the Genetic
Analysis of Pedigree Data. Hum. Hered. 1971, 21 (6),
523542.
20. Elston, R. C.; Rao, D. C. Statistical Modeling and Analysis
in Human Genetics. Annu. Rev. Biophys. Bioeng. 1978, 7,
253286, Review.
12
13
14
99. Ng, S. B.; Buckingham, K. J.; Lee, C.; Bigham, A. W.; Tabor,
H. K.; Dent, K. M.; Huff, C. D.; Shannon, P. T.; Jabs, E.
W.; Nickerson, D. A., etal. Exome Sequencing Identifies the
Cause of a Mendelian Disorder. Nat. Genet. 2010 Jan, 42
(1), 3035.
100. Erlich, Y.; Edvardson, S.; Hodges, E.; Zenvirt, S.; Thekkat,
P.; Shaag, A.; Dor, T.; Hannon, G. J.; Elpeleg, O. Exome
Sequencing and Disease-Network Analysis of a Single Family
Implicate a Mutation in KIF1A in Hereditary Spastic Paraparesis. Genome. Res. 2011 May, 21 (5), 658664.
101. Rdelsperger, C.; Krawitz, P.; Bauer, S.; Hecht, J.; Bigham,
A. W.; Bamshad, M.; de Condor, B. J.; Schweiger, M. R.;
Robinson, P. N. Identity-by-Descent Filtering of Exome
Sequence Data for Disease-Gene Identification in Autosomal
Recessive Disorders. Bioinformatics 2011 Mar 15, 27 (6),
829836.
102. Ng, S. B.; Turner, E. H.; Robertson, P. D.; Flygare, S. D.;
Bigham, A. W.; Lee, C.; Shaffer, T.; Wong, M.; Bhattacharjee, A.; Eichler, E. E., etal. Targeted Capture and Massively
Parallel Sequencing of 12 Human Exomes. Nature 2009 Sep
10, 461 (7261), 272276.
103. Sunyaev, S.; Ramensky, V.; Koch, I.; Lathe, W., III; Kondrashov, A. S.; Bork, P. Prediction of Deleterious Human
Alleles. Hum. Mol. Genet. 2001 Mar 15, 10 (6), 591597.
104. Morgenthaler, S.; Thilly, W. G. A Strategy to Discover Genes
That Carry Multi-Sllelic or Mono-Allelic Risk for Common
Diseases: A Cohort Allelic Sums test (CAST). Mutat. Res.
2007 Feb 3, 615 (12), 2856.
105. Li, B.; Leal, S. M. Methods for Detecting Associations with
Rare Variants for Common Diseases: Application to Analysis of Sequence Data. Am. J. Hum. Genet. 2008 Sep, 83 (3),
311321.
106. Madsen, B. E.; Browning, S. R. A Groupwise Association
Test for Rare Mutations Using a Weighted Sum Statistic.
PLoS Genet. 2009 Feb, 5 (2), e1000384.
107. Price, A. L.; Kryukov, G. V.; de Bakker, P. I.; Purcell, S. M.;
Staples, J.; Wei, L. J.; Sunyaev, S. R. Pooled Association
Tests for Rare Variants in Exon-Resequencing Studies. Am.
J. Hum. Genet. 2010 Jun 11, 86 (6), 832838.
108. Yi, N.; Zhi, D. Bayesian Analysis of Rare Variants in Genetic
Association Studies. Genet. Epidemiol. 2011 Jan, 35 (1),
5769.
109. Bansal, V.; Libiger, O.; Torkamani, A.; Schork, N. J. Statistical Analysis Strategies for Association Studies Involving Rare
Variants. Nat. Rev. Genet. 2010 Nov, 11 (11), 773785.
110. Ansorge, W. J. Next-Generation DNA Sequencing Techniques. Nat. Biotechnol. 2009 Apr, 25 (4), 195203.
111. Hirst, M.; Marra, M. A. Next Generation Sequencing Based
Approaches to Epigenomics. Brief Funct. Genomics 2010
Dec, 9 (56), 455465, Review.
112. Meyerson, M.; Gabriel, S.; Getz, G. Advances in Understanding Cancer Genomes through Second-Generation Sequencing. Nat. Rev. Genet. 2010 Oct, 11 (10), 685696, Review.
113. Timmermann, B.; Kerick, M.; Roehr, C.; Fischer, A.; Isau,
M.; Boerno, S. T.; Wunderlich, A.; Barmeyer, C.; Seemann,
P.; Koenig, J., etal. Somatic Mutation Profiles of MSI and
MSS Colorectal Cancer Identified by Whole Exome Next
Generation Sequencing and Bioinformatics Analysis. PLoS
One 2010 Dec 22, 5 (12), e15661.
114. Wei, X.; Walia, V.; Lin, J. C.; Teer, J. K.; Prickett, T. D.;
Gartner, J.; Davis, S.; NISC Comparative Sequencing
Program; Stemke-Hale, K.; Davies, M. A., et al. Exome
Sequencing Dentifies GRIN2A As Frequently Mutated in
Melanoma. Nat. Genet. 2011 May, 43 (5), 442446, Epub
2011 Apr 15.
115. Schadt, E. E. Molecular Metworks As Sensors and Drivers
of Common Human Diseases. Nature 2009 Sep 10, 461
(7261), 218223, Review.
15
Biographies
hristine Duarte, PhD. Dr Duarte is an assistant professor in the Department of Biostatistics at
C
University of Alabama at Birmingham. Her current research focuses on developing statistical
methods for applying systems biology approaches to high-dimensional genomic data analysis.
Her research interests falls into three main areas: (1) prediction of complex traits and disease
from high dimensional genomic data; (2) developing network models for decomposing genetic
effects on complex traits into intermediate phenotypes; and (3) integrating cross-platform
genomic data using network modeling to explain biological processes of interest such as disease
etiology and treatment response.
aura Kelly Vaughan, PhD. Dr Vaughan is an assistant professor in the Section on Statistical
L
Genetics in the Department of Biostatistics at the University of Alabama at Birmingham. Her
research interests include integrating biological knowledge into the statistical analysis of complex genomic data, pathway analysis, and reproducible research practices.
Mark Beasley, PhD. Dr Beasley is an associate professor and has been a faculty member of
T
the UAB Department of Biostatics and the Section on Statistical Genetics (SSG) since 2001. Dr
Beasley has published over 30 articles in the area of statistical methodology, focused in five
major areas: (1) methodological problems in statistical genetics; (2) nonparametric statistics;
(3) simulation studies; (4) the use of linear models; (5) longitudinal analysis, and (6) mediation
analysis. He has also coauthored over 30 applied research articles in a variety of disciplines
including education, psychology, medicine, genetics, and gerontology, pharmacology.
emant K Tiwari, PhD. Dr Tiwari is William Student Sealy Gosset Professor in Biostatistics
H
and Interim Head of the Section on Statistical Genetics in the School of Public Health at the
University of Alabama at Birmingham. His research interests include genetic linkage analysis,
disequilibrium mapping, genome-wide association studies (GWAS), population genetics, structural variations in the genome, pharmacogenetics/pharmacogenomics, and bioinformatics. At
present, he is involved in gene mapping studies of rheumatoid arthritis, SLE, cardiovascular
diseases, and obesity-related traits.