February 1, 2010
Mine Çetinkaya, mine@stat.ucla.edu
Introductory Statistics with R
UCLA SCC
Outline
Preliminaries
Data sets
Descriptive Statistics
Probability Models
Linear Regression
Upcoming Mini-Courses
Exercises
Preliminaries
Software Installation
Installing R on a Mac
Go to http://cran.r-project.org/
R Help
?plot
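A minimal sketch of a few other ways to reach R's documentation, besides typing ?plot at the prompt. The pattern "^med" and the choice of sd are arbitrary illustrations, not taken from the slides.

```r
# help("plot") is the function form of ?plot. It is assigned here so the
# script runs quietly; at the prompt, ?plot opens the page directly.
plot_help <- help("plot")

# When the exact name is unknown, apropos() lists visible objects whose
# names match a regular expression.
matches <- apropos("^med")
matches

# args() shows a function's arguments without opening the full help page.
args(sd)
```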
Data sets
Loading data into R
Viewing data sets in R
dim(survey)
[1] 1325   29
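A minimal sketch of loading a data set and checking its shape. The course file and its loading command do not survive in these slides, so the example builds a tiny CSV in memory; survey_demo and its three columns are hypothetical stand-ins, not the real survey.

```r
# Hypothetical stand-in for the survey file: a three-row CSV held in a string.
csv_text <- "gender,hand,ageinmonths
female,right,235
male,left,228
female,right,241"

# read.csv() accepts a file path or, as here, literal text.
survey_demo <- read.csv(text = csv_text, stringsAsFactors = TRUE)

dim(survey_demo)      # number of rows, then number of columns
names(survey_demo)    # column names
head(survey_demo, 2)  # first two rows
str(survey_demo)      # compact overview of each column's class
```

With a real file, the first argument would be its path, e.g. read.csv("path/to/file.csv"), where the path is whatever applies on your machine.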
names(survey)
 [1] "gender"      "hand"        "eyecolor"    "glasses"     "california"
 [6] "birthmonth"  "birthday"    "birthyear"   "ageinmonths" "height"
[11] "graduate"    "oncampus"    "time"        "walk"        "hsclass"
...
attach(survey)
If attaching reports a masked object (for example, sleep), the object with that name in the data frame will be seen before another object with the same name that is lower in the search() path. Thus, your object is masking the other.
To detach a data frame, i.e. remove it from the search() path of available R objects (we won't do that now):
detach(survey)
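A small sketch of the masking behaviour described above, assuming a hypothetical data frame survey_demo whose column sleep collides with the built-in sleep data set from the datasets package. It also shows with(), which gives similar convenience without changing the search path.

```r
# Hypothetical data frame with a column named like the built-in `sleep` data set.
survey_demo <- data.frame(sleep  = c(7, 6.5, 8),
                          gender = c("female", "male", "female"))

attach(survey_demo)           # R notes that `sleep` is masked from package:datasets
second_on_path <- search()[2] # the attached data frame sits right after .GlobalEnv
attached_mean <- mean(sleep)  # finds survey_demo$sleep, not datasets::sleep

detach(survey_demo)           # remove the data frame from the search path
sleep_is_builtin <- is.data.frame(sleep)  # `sleep` is the built-in data set again

# with() evaluates an expression inside the data frame without attaching it.
with_mean <- with(survey_demo, mean(sleep))
```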
Descriptive Statistics
Variable classes
Displaying categorical data
Displaying quantitative data
Describing distributions numerically
Variable classes
class(instructor)
[1] "factor"
[1] "character"
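A brief sketch of how class() distinguishes the kinds of variables met in a survey. The three vectors are hypothetical examples, not columns of the course data.

```r
# Hypothetical vectors standing in for survey columns.
gender_demo  <- factor(c("female", "male", "female"))  # categorical
height_demo  <- c(64.5, 70, 66)                        # quantitative
comment_demo <- c("none", "late", "none")              # free text

class(gender_demo)    # "factor"
class(height_demo)    # "numeric"
class(comment_demo)   # "character"

levels(gender_demo)   # the categories a factor can take

# Text can be converted to a factor when it should be treated as categorical.
comment_factor <- as.factor(comment_demo)
class(comment_factor)
```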
Tables
table(gender)
gender
female   male
   882    443
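As a sketch, the same counts can be turned into relative frequencies with prop.table(). The vector gender_demo is rebuilt here from the counts reported above (882 female, 443 male), since the survey file itself is not part of these slides.

```r
# Rebuild a gender vector from the reported counts.
gender_demo <- factor(rep(c("female", "male"), times = c(882, 443)))

counts <- table(gender_demo)
counts

# Relative frequencies: each count divided by the total.
props <- prop.table(counts)
round(props, 3)
```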
Contingency tables
        hand
gender   ambidextrous left right
  female            9   67   806
  male             11   45   387
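The command behind this table is not preserved on the slide. A two-way table() call is one way to produce output of this form; the sketch below rebuilds hypothetical raw vectors from the cell counts above and adds margins and row proportions.

```r
# Hypothetical raw data rebuilt from the cell counts shown above.
gender_demo <- rep(c("female", "male"), times = c(882, 443))
hand_levels <- c("ambidextrous", "left", "right")
hand_demo <- c(rep(hand_levels, times = c(9, 67, 806)),   # females
               rep(hand_levels, times = c(11, 45, 387)))  # males

# Naming the arguments labels the rows and columns of the table.
tab <- table(gender = gender_demo, hand = hand_demo)
tab

addmargins(tab)                          # add row and column totals
row_props <- prop.table(tab, margin = 1) # distribution of hand within each gender
round(row_props, 3)
```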
[Figure: Barplot of Gender; bars for female and male, y-axis showing counts]
[Figure: Barplot of Gender; bars for female and male, y-axis: Relative Frequency]
[Figure: hand (ambidextrous, left, right) by gender (females, males)]
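The plotting commands for these figures are not preserved. The sketch below shows one way to draw comparable bar plots with barplot(), using the counts from the tables above; the grouped layout for hand by gender is an assumption about what the original figure looked like.

```r
counts <- c(female = 882, male = 443)   # counts reported in the gender table

# Counts, then relative frequencies.
mids <- barplot(counts, main = "Barplot of Gender", ylab = "Frequency")
barplot(counts / sum(counts), main = "Barplot of Gender",
        ylab = "Relative Frequency")

# Hand by gender as grouped bars; rows become the bars within each group.
hand_by_gender <- rbind(ambidextrous = c(9, 11),
                        left         = c(67, 45),
                        right        = c(806, 387))
colnames(hand_by_gender) <- c("females", "males")
barplot(hand_by_gender, beside = TRUE, legend.text = TRUE)
```

barplot() returns the bar midpoints invisibly, which is handy for adding labels with text().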
Pie charts
Pie charts display counts as percentages of individuals in each
category.
[Figure: pie chart of gender; labeled slice: male 33%]
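A minimal sketch of a pie chart with percentage labels, using the gender counts reported earlier (882 female, 443 male). The label format is an illustrative choice; the slide's own code is not preserved.

```r
counts <- c(female = 882, male = 443)

# Convert counts to whole-number percentages and build slice labels.
pct <- round(100 * counts / sum(counts))
slice_labels <- paste0(names(counts), " ", pct, "%")

pie(counts, labels = slice_labels, main = "Gender")
```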
Histograms
Display the number of cases in each bin
[Figure: histogram of ageinmonths; x-axis: ageinmonths (200 to 350), y-axis: Frequency]
Relative Frequency
[Figure: Histogram of Age in Months; x-axis: Age in Months (200 to 350), y-axis: Density]
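A sketch of both histogram styles with hist(). The data are simulated (age_demo is not the survey's ageinmonths), so the shapes will differ from the figures; only the calls are the point.

```r
set.seed(1)
# Simulated ages in months, loosely centred where the slides report the mean.
age_demo <- round(rnorm(500, mean = 238, sd = 16))

# Frequency histogram: bar heights are counts per bin.
h <- hist(age_demo, xlab = "ageinmonths", main = "Histogram of ageinmonths")

# Density histogram: bar areas sum to 1.
hist(age_demo, freq = FALSE, xlab = "Age in Months",
     main = "Histogram of Age in Months")

total_in_bins <- sum(h$counts)
total_area <- sum(h$density * diff(h$breaks))
```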
Stem-and-Leaf Plots
Preserve individual data values.
stem(ageinmonths)

  20 | 48
  21 | 004444555566666666666666666777777777778888888888889999999999999999
  22 | 00000000000000000000000000111111111111111122222222222222222222333333+258
  23 | 00000000000000000000000000000000000000000000001111111111111111111111+379
  24 | 00000000000000000000000000000000000000000000111111111111111111111111+170
  25 | 00000000000001111111111111112222222222222222222223333333344444444445+24
  26 | 000000000001111111111222222333334444444444556666778889
  27 | 00111222222344566789
  28 | 01334558888
  29 | 0004569
  30 | 267
  31 | 02257
  32 | 44
  33 | 5
  34 | 89
  35 | 3
Boxplots
[Figure: boxplot of ageinmonths; axis ticks at 200, 250, 300, 350]
fivenum(ageinmonths)
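A short sketch relating fivenum() to the boxplot it underlies. The vector age_demo is a small hypothetical sample, chosen so that one value lies far enough out to be drawn as an outlier.

```r
# Hypothetical ages in months (not the survey data).
age_demo <- c(204, 210, 215, 228, 228, 231, 235, 240, 247, 262, 353)

# Minimum, lower hinge, median, upper hinge, maximum.
five <- fivenum(age_demo)
five

# boxplot() returns the statistics it draws; $out holds points beyond the whiskers.
b <- boxplot(age_demo, horizontal = TRUE, xlab = "ageinmonths")
b$out
```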
Summary
Categorical variables:
summary(hand)
ambidextrous         left        right
          20          112         1193
Quantitative variables:
summary(ageinmonths)
Median   Max.
 235.0  353.0
Measures of center
Mean (arithmetic average):
mean(ageinmonths)
[1] 237.8309
Median:
median(ageinmonths)
[1] 235
Mode:
[1] 228
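Base R has no built-in function for the statistical mode, and the command used on the slide is not preserved. One common approach, sketched here on a hypothetical vector age_demo, is to tabulate the values and pick the most frequent one.

```r
# Hypothetical ages in months (not the survey data).
age_demo <- c(228, 228, 235, 241, 228, 250, 235)

mean_value   <- mean(age_demo)
median_value <- median(age_demo)

# Tabulate the values, then take the name of the largest count.
# which.max() returns the first maximum, so ties resolve to the smallest value.
freq <- table(age_demo)
mode_value <- as.numeric(names(freq)[which.max(freq)])

c(mean = mean_value, median = median_value, mode = mode_value)
```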
Mode (alternative)
To find the mode, you may also use the Mode function in the
prettyR package.
[1] "228"
[Figure: histogram of ageinmonths with the Mean and Median marked; x-axis: ageinmonths, y-axis: Frequency]
Measures of spread
Range (Min, Max):
range(ageinmonths)
IQR:
IQR(ageinmonths)
[1] 15
Standard deviation:
sd(ageinmonths)
[1] 16.03965
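A sketch of the three measures on a small hypothetical sample, with the standard deviation also computed by hand to show that sd() divides by n - 1.

```r
# Hypothetical ages in months (not the survey data).
age_demo <- c(204, 210, 215, 228, 228, 231, 235, 240, 247, 262, 353)

rng <- range(age_demo)        # minimum and maximum
width <- diff(rng)            # a single number: max minus min
iqr_value <- IQR(age_demo)    # third quartile minus first quartile
sd_value <- sd(age_demo)

# Sample standard deviation from its definition (divisor n - 1).
n <- length(age_demo)
manual_sd <- sqrt(sum((age_demo - mean(age_demo))^2) / (n - 1))
```

The single outlying value (353) inflates the range and the standard deviation far more than the IQR, which is why the IQR is often preferred for skewed data.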
Probability Models
Geometric
Geometric distribution
If the probability of success is 0.35, what is the probability that the
first success will be on the 5th trial?
dgeom(4, 0.35)
[1] 0.06247719
Note: dgeom gives the density (or probability mass function for discrete
variables), pgeom gives the distribution function, qgeom gives the
quantile function, and rgeom generates random deviates. This is true for
the functions used for Binomial, Poisson and Normal calculations as well.
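The first argument above is 4 rather than 5 because R's geometric distribution counts the failures before the first success. The sketch below checks dgeom() against the formula (1 - p)^4 * p and exercises the other three prefixes; the particular arguments to pgeom, qgeom and rgeom are illustrative choices.

```r
p <- 0.35

# First success on trial 5 means 4 failures followed by 1 success.
by_formula <- (1 - p)^4 * p
by_dgeom   <- dgeom(4, p)

# P(first success on or before trial 5) = 1 - P(5 failures in a row).
within_5 <- pgeom(4, p)

# Smallest number of failures whose cumulative probability reaches 0.5.
median_failures <- qgeom(0.5, p)

# Five simulated counts of failures before the first success.
set.seed(1)
draws <- rgeom(5, p)
```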
Binomial
Binomial distribution
If the probability of success is 0.35, what is the probability of
3 successes in 5 trials?
dbinom(3, 5, 0.35)
[1] 0.1811469
[1] 0.2351694
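A sketch checking dbinom() against the binomial formula. The question behind the second output is not preserved on the slide; its value is consistent with the probability of at least 3 successes in 5 trials, which is computed two ways below under that assumption.

```r
n <- 5
p <- 0.35

# Exactly 3 successes: choose(5, 3) * p^3 * (1 - p)^2.
exactly_3  <- dbinom(3, size = n, prob = p)
by_formula <- choose(n, 3) * p^3 * (1 - p)^2

# At least 3 successes (assumed reading of the second output):
# sum the mass at 3, 4, 5, or take the upper tail beyond 2.
at_least_3     <- sum(dbinom(3:5, size = n, prob = p))
at_least_3_alt <- pbinom(2, size = n, prob = p, lower.tail = FALSE)
```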
Poisson
Poisson distribution
The number of traffic accidents per week in a small city has a
Poisson distribution with mean equal to 3. What is the probability
of two accidents in a week?
dpois(2, 3)
[1] 0.2240418
[1] 0.1991483
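A sketch checking dpois() against the Poisson formula. The question behind the second output is not preserved; its value is consistent with the probability of at most one accident in a week, which is computed below under that assumption.

```r
lambda <- 3

# Exactly two accidents: exp(-lambda) * lambda^2 / 2!.
exactly_2  <- dpois(2, lambda)
by_formula <- exp(-lambda) * lambda^2 / factorial(2)

# At most one accident (assumed reading of the second output).
at_most_1     <- ppois(1, lambda)
at_most_1_alt <- dpois(0, lambda) + dpois(1, lambda)
```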
Normal
Normal distribution
Scores on an exam are distributed normally with a mean of 65 and
a standard deviation of 12. What percentage of the students have
scores below 50?
[1] 0.1056498
[1] 0.5558891
Normal
[1] 80.37862
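The commands behind the normal-distribution outputs are not preserved on the slides. The sketch below reproduces the three values with pnorm() and qnorm(). Only the first question (scores below 50) is stated in the source; reading the second output as the share of scores between 60 and 80, and the third as the 90th percentile, are inferences from the numbers.

```r
mu    <- 65
sigma <- 12

# Share of scores below 50 (the stated question).
below_50 <- pnorm(50, mean = mu, sd = sigma)

# Same value via the z-score (50 - 65) / 12 = -1.25.
below_50_z <- pnorm((50 - mu) / sigma)

# Share of scores between 60 and 80 (assumed reading of the second output).
between_60_80 <- pnorm(80, mu, sigma) - pnorm(60, mu, sigma)

# Score that 90% of students fall below (assumed reading of the third output).
pct_90 <- qnorm(0.90, mean = mu, sd = sigma)
```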
Linear Regression
Scatterplots, Association, and Correlation
Simple Linear Regression
Scatterplots
Is there an association between amount of alcohol consumed and
maximum speed?
[Figure: scatterplot of speed (y-axis, 0 to 150) against alcohol (x-axis, 20 to 80)]
Correlation
[1] 0.2309745
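The plotting and correlation commands are not preserved on the slides. A minimal sketch with plot() and cor() follows, on simulated data (alcohol_demo and speed_demo are hypothetical, so the correlation will not match the value above). Two values are set to missing to show why the use argument matters when a variable contains NAs.

```r
set.seed(42)
# Simulated stand-ins for the survey variables.
alcohol_demo <- runif(200, min = 0, max = 80)
speed_demo   <- 88 + 0.9 * alcohol_demo + rnorm(200, sd = 20)
speed_demo[c(5, 17)] <- NA   # mimic missing survey responses

plot(alcohol_demo, speed_demo, xlab = "alcohol", ylab = "speed")

# With missing values, cor() returns NA unless told to drop incomplete pairs.
r_default  <- cor(alcohol_demo, speed_demo)
r_complete <- cor(alcohol_demo, speed_demo, use = "complete.obs")
r_complete
```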
Call:
lm(formula = speed ~ alcohol)

Residuals:
    Min      1Q  Median      3Q     Max
-90.769  -8.725   1.275  11.275  91.541

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  88.7248     0.6511 136.261   <2e-16 ***
alcohol       0.9469     0.1108   8.549   <2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 21.83 on 1297 degrees of freedom
  (26 observations deleted due to missingness)
Multiple R-squared: 0.05335,  Adjusted R-squared: 0.05262
F-statistic: 73.09 on 1 and 1297 DF,  p-value: < 2.2e-16
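A sketch of fitting and using a simple linear regression with lm(). The data frame driving_demo is simulated, so its estimates will differ from the output above; the final lines apply the fitted line reported in that output (intercept 88.7248, slope 0.9469) to an illustrative alcohol value of 10.

```r
set.seed(42)
# Simulated stand-in for the survey variables, with two missing responses.
driving_demo <- data.frame(alcohol = runif(200, min = 0, max = 80))
driving_demo$speed <- 88 + 0.9 * driving_demo$alcohol + rnorm(200, sd = 20)
driving_demo$speed[c(5, 17)] <- NA

# Rows with missing values are dropped by default, as in the output above.
fit <- lm(speed ~ alcohol, data = driving_demo)
summary(fit)

# In simple regression, R-squared equals the squared correlation.
r2 <- summary(fit)$r.squared
r  <- cor(driving_demo$alcohol, driving_demo$speed, use = "complete.obs")

# Scatterplot with the fitted line, and a prediction at alcohol = 10.
plot(speed ~ alcohol, data = driving_demo)
abline(fit)
pred_10 <- predict(fit, newdata = data.frame(alcohol = 10))

# Prediction from the coefficients reported in the slide's output.
slide_pred_10 <- 88.7248 + 0.9469 * 10
```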
Upcoming Mini-Courses
Thank you
Any questions?
Exercises
Solution to Exercise 1
[Figure: Minutes (axis ticks at 50, 100, 150) by category: bicycle, bus, motorcycle, other, segway, skateboard, walk]
Solution to Exercise 2
Solution to Exercise 3