
ENVIRONMENTAL

AND ECOLOGICAL
STATISTICS WITH R
Second Edition

Song S. Qian
The University of Toledo
Ohio, USA

Boca Raton London New York

CRC Press is an imprint of the Taylor & Francis Group, an informa business
A CHAPMAN & HALL BOOK
CRC Press
Taylor & Francis Group
6000 Broken Sound Parkway NW, Suite 300
Boca Raton, FL 33487-2742
© 2017 by Taylor & Francis Group, LLC
CRC Press is an imprint of Taylor & Francis Group, an Informa business

No claim to original U.S. Government works

Printed on acid-free paper


Version Date: 20160825

International Standard Book Number-13: 978-1-4987-2872-0 (Hardback)

This book contains information obtained from authentic and highly regarded sources. Reasonable
efforts have been made to publish reliable data and information, but the author and publisher cannot
assume responsibility for the validity of all materials or the consequences of their use. The authors and
publishers have attempted to trace the copyright holders of all material reproduced in this publication
and apologize to copyright holders if permission to publish in this form has not been obtained. If any
copyright material has not been acknowledged please write and let us know so we may rectify in any
future reprint.

Except as permitted under U.S. Copyright Law, no part of this book may be reprinted, reproduced,
transmitted, or utilized in any form by any electronic, mechanical, or other means, now known or
hereafter invented, including photocopying, microfilming, and recording, or in any information
storage or retrieval system, without written permission from the publishers.

For permission to photocopy or use material electronically from this work, please access
www.copyright.com (http://www.copyright.com/) or contact the Copyright Clearance Center, Inc. (CCC), 222
Rosewood Drive, Danvers, MA 01923, 978-750-8400. CCC is a not-for-profit organization that
provides licenses and registration for a variety of users. For organizations that have been granted a
photocopy license by the CCC, a separate system of payment has been arranged.

Trademark Notice: Product or corporate names may be trademarks or registered trademarks, and are
used only for identification and explanation without intent to infringe.
Visit the Taylor & Francis Web site at
http://www.taylorandfrancis.com
and the CRC Press Web site at
http://www.crcpress.com
In memory of my grandmother 张一贯, mother 仲泽庆, and father 钱拙.
Contents

Preface

List of Figures

List of Tables

I Basic Concepts

1 Introduction

1.1 Tool for Inductive Reasoning
1.2 The Everglades Example
1.2.1 Statistical Issues
1.3 Effects of Urbanization on Stream Ecosystems
1.3.1 Statistical Issues
1.4 PCB in Fish from Lake Michigan
1.4.1 Statistical Issues
1.5 Measuring Harmful Algal Bloom Toxin
1.6 Bibliography Notes
1.7 Exercise

2 A Crash Course on R

2.1 What is R?
2.2 Getting Started with R
2.2.1 R Commands and Scripts
2.2.2 R Packages
2.2.3 R Working Directory
2.2.4 Data Types
2.2.5 R Functions
2.3 Getting Data into R
2.3.1 Functions for Creating Data
2.3.2 A Simulation Example
2.4 Data Preparation
2.4.1 Data Cleaning
2.4.1.1 Missing Values
2.4.2 Subsetting and Combining Data
2.4.3 Data Transformation
2.4.4 Data Aggregation and Reshaping
2.4.5 Dates
2.5 Exercises

3 Statistical Assumptions

3.1 The Normality Assumption
3.2 The Independence Assumption
3.3 The Constant Variance Assumption
3.4 Exploratory Data Analysis
3.4.1 Graphs for Displaying Distributions
3.4.2 Graphs for Comparing Distributions
3.4.3 Graphs for Exploring Dependency among Variables
3.5 From Graphs to Statistical Thinking
3.6 Bibliography Notes
3.7 Exercises

4 Statistical Inference

4.1 Introduction
4.2 Estimation of Population Mean and Confidence Interval
4.2.1 Bootstrap Method for Estimating Standard Error
4.3 Hypothesis Testing
4.3.1 t-Test
4.3.2 Two-Sided Alternatives
4.3.3 Hypothesis Testing Using the Confidence Interval
4.4 A General Procedure
4.5 Nonparametric Methods for Hypothesis Testing
4.5.1 Rank Transformation
4.5.2 Wilcoxon Signed Rank Test
4.5.3 Wilcoxon Rank Sum Test
4.5.4 A Comment on Distribution-Free Methods
4.6 Significance Level α, Power 1 − β, and p-Value
4.7 One-Way Analysis of Variance
4.7.1 Analysis of Variance
4.7.2 Statistical Inference
4.7.3 Multiple Comparisons
4.8 Examples
4.8.1 The Everglades Example
4.8.2 Kemp’s Ridley Turtles
4.8.3 Assessing Water Quality Standard Compliance
4.8.4 Interaction between Red Mangrove and Sponges
4.9 Bibliography Notes
4.10 Exercises

II Statistical Modeling

5 Linear Models

5.1 Introduction
5.2 From t-test to Linear Models
5.3 Simple and Multiple Linear Regression Models
5.3.1 The Least Squares
5.3.2 Regression with One Predictor
5.3.3 Multiple Regression
5.3.4 Interaction
5.3.5 Residuals and Model Assessment
5.3.6 Categorical Predictors
5.3.7 Collinearity and the Finnish Lakes Example
5.4 General Considerations in Building a Predictive Model
5.5 Uncertainty in Model Predictions
5.5.1 Example: Uncertainty in Water Quality Measurements
5.6 Two-Way ANOVA
5.6.1 ANOVA as a Linear Model
5.6.2 More Than One Categorical Predictor
5.6.3 Interaction
5.7 Bibliography Notes
5.8 Exercises

6 Nonlinear Models

6.1 Nonlinear Regression
6.1.1 Piecewise Linear Models
6.1.2 Example: U.S. Lilac First Bloom Dates
6.1.3 Selecting Starting Values
6.2 Smoothing
6.2.1 Scatter Plot Smoothing
6.2.2 Fitting a Local Regression Model
6.3 Smoothing and Additive Models
6.3.1 Additive Models
6.3.2 Fitting an Additive Model
6.3.3 Example: The North American Wetlands Database
6.3.4 Discussion: The Role of Nonparametric Regression Models in Science
6.3.5 Seasonal Decomposition of Time Series
6.3.5.1 The Neuse River Example
6.4 Bibliographic Notes
6.5 Exercises

7 Classification and Regression Tree

7.1 The Willamette River Example
7.2 Statistical Methods
7.2.1 Growing and Pruning a Regression Tree
7.2.2 Growing and Pruning a Classification Tree
7.2.3 Plotting Options
7.3 Comments
7.3.1 CART as a Model Building Tool
7.3.2 Deviance and Probabilistic Assumptions
7.3.3 CART and Ecological Threshold
7.4 Bibliography Notes
7.5 Exercises

8 Generalized Linear Model

8.1 Logistic Regression
8.1.1 Example: Evaluating the Effectiveness of UV as a Drinking Water Disinfectant
8.1.2 Statistical Issues
8.1.3 Fitting the Model in R
8.2 Model Interpretation
8.2.1 Logit Transformation
8.2.2 Intercept
8.2.3 Slope
8.2.4 Additional Predictors
8.2.5 Interaction
8.2.6 Comments on the Crypto Example
8.3 Diagnostics
8.3.1 Binned Residuals Plot
8.3.2 Overdispersion
8.3.3 Seed Predation by Rodents: A Second Example of Logistic Regression
8.4 Poisson Regression Model
8.4.1 Arsenic Data from Southwestern Taiwan
8.4.2 Poisson Regression
8.4.3 Exposure and Offset
8.4.4 Overdispersion
8.4.5 Interactions
8.4.6 Negative Binomial
8.5 Multinomial Regression
8.5.1 Fitting a Multinomial Regression Model in R
8.5.2 Model Evaluation
8.6 The Poisson-Multinomial Connection
8.7 Generalized Additive Models
8.7.1 Example: Whales in the Western Antarctic Peninsula
8.7.1.1 The Data
8.7.1.2 Variable Selection Using CART
8.7.1.3 Fitting GAM
8.7.1.4 Summary
8.8 Bibliography Notes
8.9 Exercises

III Advanced Statistical Modeling

9 Simulation for Model Checking and Statistical Inference

9.1 Simulation
9.2 Summarizing Regression Models Using Simulation
9.2.1 An Introductory Example
9.2.2 Summarizing a Linear Regression Model
9.2.2.1 Re-transformation Bias
9.2.3 Simulation for Model Evaluation
9.2.4 Predictive Uncertainty
9.3 Simulation Based on Re-sampling
9.3.1 Bootstrap Aggregation
9.3.2 Example: Confidence Interval of the CART-Based Threshold
9.4 Bibliography Notes
9.5 Exercises

10 Multilevel Regression

10.1 From Stein’s Paradox to Multilevel Models
10.2 Multilevel Structure and Exchangeability
10.3 Multilevel ANOVA
10.3.1 Intertidal Seaweed Grazers
10.3.2 Background N2O Emission from Agriculture Fields
10.3.3 When to Use the Multilevel Model?
10.4 Multilevel Linear Regression
10.4.1 Nonnested Groups
10.4.2 Multiple Regression Problems
10.4.3 The ELISA Example—An Unintended Multilevel Modeling Problem
10.5 Nonlinear Multilevel Models
10.6 Generalized Multilevel Models
10.6.1 Exploited Plant Monitoring—Galax
10.6.1.1 A Multilevel Poisson Model
10.6.1.2 A Multilevel Logistic Regression Model
10.6.2 Cryptosporidium in U.S. Drinking Water—A Poisson Regression Example
10.6.3 Model Checking Using Simulation
10.7 Concluding Remarks
10.8 Bibliography Notes
10.9 Exercises

11 Evaluating Models Based on Statistical Significance Testing

11.1 Introduction
11.2 Evaluating TITAN
11.2.1 A Brief Description of TITAN
11.2.2 Hypothesis Testing in TITAN
11.2.3 Type I Error Probability
11.2.4 Statistical Power
11.2.5 Bootstrapping
11.2.6 Community Threshold
11.2.7 Conclusions
11.3 Exercises

Bibliography

Index
Preface

I learned statistics from Bayesian statisticians. As a result, I do not pay
attention to hypothesis testing and p-values in my work. Likewise, I do
not emphasize their use in my teaching. However, most students
from my classes remember the term “statistically significant” (or p < 0.05)
better than anything and check the R² value when evaluating a regression
model. I have talked to many of them about their experiences in learning
and using statistics to understand why they seem to be naturally drawn
to these numbers that few can explain clearly in plain language. I came to
a satisfactory explanation around 2007 when I read slides of a presentation
given by Dick De Veaux of Williams College entitled “Math is Music; Statistics
is Literature.” (This presentation is now available on YouTube.) According
to Dr. De Veaux, statistics is challenging to both students and instructors
alike, because we want to teach not only the mechanical part of statistics,
but also the process of making a judgment. As a statistics course is always
counted as a quantitative methods class, students naturally view statistics as
a mathematics class. But statistics is not mathematics. In a typical statistics
class for environmental/ecological graduate students, we use very
simple (but often tedious) mathematics. Students expect to learn statistics as
they learn mathematics. However, the mode of inference in mathematics is
deduction while the mode of inference in statistics is induction. As a result,
statistics cannot be learned by remembering rules and formulae. The process
of making a judgment requires putting the analysis in context, combining
information from multiple sources, and using logic and common sense. Learning
statistics is not about learning rules (as in mathematics) but more about
interpretation and synthesis, which requires experience (as in literature).
When deciding to write this book, I wanted to put together some examples
to illustrate the process of making a judgment and integrate these examples
to illustrate the iterative process of statistical inference. This process will
inevitably include more than one statistical topic. As a result, many examples
included in this book are used in multiple chapters. For example, I used the
PCB in fish example to illustrate the two-sample t-test in Chapter 4, simple
and multiple regression in Chapter 5, and nonlinear regression
in Chapter 6. With these examples, I try to illuminate the difference between
how we learn statistics and how we use statistics. In learning statistics, we
learn by topics (e.g., from t-test to ANOVA to linear regression, and so on).
By the end of the class, students often see statistics as a collection of unrelated
methods. When using statistics, we must first decide what the nature of the
problem is before deciding what statistical tools to use. This first step is not
always taught in a statistics class.
Using the PCB in fish example, I want to illustrate the iterative nature
of a statistical inference problem. We may not be able to identify the most
appropriate model at first. Through repeated effort on proposing the model,
identifying flaws of the proposed model, and revising the model, we hope to
reach a sensible conclusion. As a result, a statistical analysis must have subject
matter context. It is a process of sifting through data to find useful information
to achieve a specific objective. The basic problem of the PCB in fish example
is the risk of PCB exposure from consuming fish from Lake Michigan. The
initial use of the data showed a large difference between large and small fish
PCB concentrations. However, Figure 5.1 suggests that the difference between
small and large fish PCB concentrations cannot be adequately described by the
simple two-sample t-test model. Throughout Chapter 5, I used this example
to discuss how a linear regression model should be evaluated and updated. In
Chapter 6, some alternative models are presented to summarize the attempts
made in the literature to correct the inadequacies of the linear models. But I
left Chapter 6 without a satisfactory model. In Chapter 9, I used this example
again to illustrate the use of simulation for model evaluation. While writing
Chapter 9, I discovered the length imbalance. In a way, this example shows
the typical outcome of a statistical analysis: no matter how hard we try, the
outcome is never completely satisfactory. There are always more “what
if”s. However, the ability to ask “what if” is not easy to teach and learn,
because of the “seven unnatural acts of statistical thinking” required by a
statistical analysis: think critically, be skeptical, think about variation (rather
than about center), focus on what we don’t know, perfect the process, and
think about conditional probabilities and rare events [De Veaux and Velleman,
2008]. By examining the same problem from different angles, I hope to bring
home the essential message: statistical analysis is more than reporting a
p-value.
Since the publication of the first edition, I have learned more about the
problems with statistical hypothesis testing. One of these problems
lies in the terminology we use in statistical hypothesis testing. The term
“statistically significant” is particularly corruptive. The term has a specific
meaning with respect to the null hypothesis. But by declaring our result
to be “significant” without further explanation, we often mislead not only
the consumer of the result but also ourselves. In this edition, I removed the
term “statistically significant” whenever possible. Instead, I try to use plain
language to describe the meaning of a “significant” result. As I explained in
a guest editorial for the journal Landscape Ecology, a statistical result should
be measured by the MAGIC criteria of Abelson [1995]: a statistical inference
should be a principled argument and the strength of the inference should
be measured by Magnitude, Articulation, Generality, Interestingness, and
Credibility, not just a p-value or R² or any other single statistic. Throughout
the book, I emphasize interpreting a fitted model and drawing
conclusions based on the context of the problem. I have followed these
rules in all examples:
• Verbal description of a model – a clear description of the model using
nonstatistical terms should be a first step. When describing the model in
clear scientific terms, we can better judge whether the model is sensible
and whether the real world can be reasonably represented by the model.
Even for a simple model such as a t-test or ANOVA, a verbal description
can be helpful.
• Verifying model assumptions – plots, plots, and more plots.
• Verbal description of estimated model coefficients – before finalizing the
model, we should describe the estimated model coefficients in words.
This should be done even in a simple two-sample t-test.
The American Statistical Association issued a statement on p-values
[Wasserstein and Lazar, 2016]. The statement emphasizes that the use of
statistics should include the context of the problem, the process of data
collection and model formulation, and the purpose of the analysis. I will use
the statement as required reading in my class during the first and last weeks
of the semester.
Major changes made in this edition include:
• New and revised Chapters and Sections
– Sections 1.2–1.5 describe the main examples used in more than one
chapter.
– Chapter 2 is rewritten with a brief introduction to R and the use
of R for data manipulation.
– Section 5.1 is rewritten to use the PCB in fish example as the lead
example for linear regression models.
– New section 5.3.1 introduces the ELISA data collected during the
Toledo water crisis in 2014.
– New section 6.1.3 presents the use of a self-starter function for
nonlinear regression.
– Sections 8.5–8.6 present the multinomial regression and the
connection between multinomial and Poisson models.
– Section 9.2 is revised to include nonlinear regression simulation.
– Two-way ANOVA is removed from section 10.3.
– Section 10.4.3 is added to introduce the ELISA example as a
multilevel modeling problem.
– Section 10.5 is added to introduce nonlinear multilevel models.
– Section 10.6.1 uses new examples for generalized multilevel models.
– Chapter 11 is added to discuss the use of simulation in evaluating
hypothesis-testing-based methods. This chapter demonstrates the
importance of putting a statistical test in the context of a real-world
problem. We should ask: what is the scientific problem at
hand, what is the null hypothesis in the context of the problem,
and what alternatives are supported when the null is rejected? Once
these questions are answered, we often have a better understanding
of the problem and can be better prepared for making a sound
judgment.
• Exercises are added to the end of each chapter.
• Online materials (data and R code) are available on GitHub
(https://github.com/songsqian/eesR).

Song S. Qian
Sylvania, Ohio, USA
July 2016
