
Ignacio Cascos Fernandez

Departamento de Estadística
Universidad Carlos III de Madrid

Computer lab 1, Descriptive Statistics 2014-2015.


At the computer labs, we will be using R, one of the most widely used
statistical packages/languages. It is a free software environment for
statistical computing and graphics.
You can download the latest version from http://www.r-project.org/
First, we will type some simple commands related to descriptive statistics
in the console. Type each of the following commands and then press Enter.
> 3*4+2-5^2
> x=c(-1.1,2.2,5.3,4.7,1.6,2.2,4.3,2.2,1.1)
> x
> length(x)
> abs(x)
> x^2
> sum(x)
> help(sum)
> table(x)
> mean(x)
> median(x)
> mean(x,trim=1/9)
> var(x)
> quantile(x)
> quantile(x,.25)
> stem(x)
> x>2
> sum(x>2)/length(x)
> y=(1:10)
> y=seq(0,7,0.5)
> y+1
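
As a hedged illustration of two of the commands above (a sketch, not part of the original handout): with 9 observations, trim=1/9 should discard one value from each end of the sorted sample before averaging, and sum(x>2)/length(x) is the proportion of values greater than 2, which mean(x>2) should also return. The names x_sorted, trimmed_by_hand, trimmed_builtin, prop_above_2 and prop_via_mean are introduced here only for illustration.

```r
x <- c(-1.1, 2.2, 5.3, 4.7, 1.6, 2.2, 4.3, 2.2, 1.1)

# Trimmed mean by hand: sort, drop the smallest and the largest value,
# then average what is left
x_sorted <- sort(x)
trimmed_by_hand <- mean(x_sorted[2:(length(x) - 1)])
trimmed_builtin <- mean(x, trim = 1/9)

# Proportion of observations greater than 2, computed two ways
prop_above_2 <- sum(x > 2) / length(x)
prop_via_mean <- mean(x > 2)

print(c(trimmed_by_hand, trimmed_builtin))
print(c(prop_above_2, prop_via_mean))
```

The comparison x > 2 gives a logical vector, and R counts TRUE as 1 when summing or averaging, which is why both expressions agree.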

> y=matrix(c(1,2,4,3,7,9),ncol=2,byrow=T)
> y
> y[1,]
> y[1,2]
> y=data.frame(y)
> y$X2
> mean(y$X2)
> attach(y)
> mean(X2)
> X1[X2==9]
> detach(y)
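
The attach()/detach() pair above makes the columns of a data frame visible by name. As a minimal sketch of an alternative (not part of the original handout), the same values can be obtained with the $ operator or with with(), which leave the search path untouched. The names mean_x2, x1_where_x2_is_9 and x1_via_with are hypothetical.

```r
y <- data.frame(matrix(c(1, 2, 4, 3, 7, 9), ncol = 2, byrow = TRUE))

# Same results as the attach()/detach() version, without attaching anything
mean_x2 <- mean(y$X2)
x1_where_x2_is_9 <- y$X1[y$X2 == 9]

# with() evaluates the expression using the columns of y
x1_via_with <- with(y, X1[X2 == 9])

print(mean_x2)
print(x1_where_x2_is_9)
print(x1_via_with)
```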

The following are basic graphical commands.


> win.graph()
> dev.off()
> boxplot(x)
> hist(x)
> hist(x,breaks=seq(-1.5,5.5,.5))
> aux1=seq(-2,2,.2)
> aux2=aux1^2
> plot(aux1,aux2)
> points(aux1[c(1,2)],aux2[c(1,2)],col="red")
> points(aux1[c(3,4)],aux2[c(3,4)],pch=3)
> var(aux1,aux2)
> cor(aux1,aux2)
> y=c(2,3,5,5,2,3,4,2,2)
> lsfit(x,y)
> plot(x,y)
> abline(lsfit(x,y))
> cor(x,y)
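
To connect lsfit() and cor() with the usual formulas, here is a hedged sketch (not part of the original handout) that recomputes the slope, intercept and correlation by hand and uses the fitted line to predict at a new point. The value x0 = 3 and all the *_by_hand names are assumptions chosen only for illustration.

```r
x <- c(-1.1, 2.2, 5.3, 4.7, 1.6, 2.2, 4.3, 2.2, 1.1)
y <- c(2, 3, 5, 5, 2, 3, 4, 2, 2)

# Least squares fit: first coefficient is the intercept, second is the slope
fit <- lsfit(x, y)
intercept <- unname(fit$coefficients[1])
slope <- unname(fit$coefficients[2])

# Slope and intercept from the textbook formulas
slope_by_hand <- cov(x, y) / var(x)
intercept_by_hand <- mean(y) - slope_by_hand * mean(x)

# Correlation as the covariance divided by both standard deviations
cor_by_hand <- cov(x, y) / (sd(x) * sd(y))

# Prediction at a hypothetical new value x0
x0 <- 3
y_hat <- intercept + slope * x0

print(c(intercept, slope, y_hat))
print(c(cor_by_hand, cor(x, y)))
```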

Download the file alaska.txt from Aulaglobal2 to your computer, open it
with Notepad, and have a look at its structure. The alaska data set
consists of 15 variables on the 3611 commercial passenger flights departing
from or arriving in Alaska in January 2011.
The 15 variables are:

YEAR               year (2011)
MONTH              month (January)
DAY_OF_MONTH       day of month
UNIQUE_CARRIER     airline code
ORIGIN_AIRPORT_ID  identification number of the origin airport
ORIGIN_CITY_NAME   origin city name
DEST_CITY_NAME     destination city name
DEP_TIME_HOUR      departure time (hour)
DEP_TIME_MIN       departure time (minute)
DEP_DELAY          difference in minutes between scheduled and actual
                   departure time (early departures show negative numbers)
DEP_DELAY_NEW      difference in minutes between scheduled and actual
                   departure time (early departures set to 0)
ARR_DELAY          difference in minutes between scheduled and actual
                   arrival time (early arrivals show negative numbers)
ARR_DELAY_NEW      difference in minutes between scheduled and actual
                   arrival time (early arrivals set to 0)
AIR_TIME           flight time in minutes
DISTANCE           distance between airports in miles
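
The two versions of each delay variable differ only in how early flights are recorded. As a minimal sketch, assuming the "set to 0" version is obtained by replacing negative delays with 0, the relationship can be reproduced with pmax(). The delays below are made-up numbers, not values from alaska.txt, and the names dep_delay and dep_delay_new are hypothetical.

```r
# Made-up delays in minutes (negative = left early); not taken from alaska.txt
dep_delay <- c(-5, 0, 12, -1, 30)

# Early departures set to 0, all other delays unchanged
dep_delay_new <- pmax(dep_delay, 0)

print(dep_delay)
print(dep_delay_new)
```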

Source: http://transtats.bts.gov/
Type:
> alaska=read.table(".../alaska.txt",header=T)
> attach(alaska)
> names(alaska)
> DISTANCE[UNIQUE_CARRIER=="AS"]
> sum(DISTANCE[UNIQUE_CARRIER=="AS"])
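
The commands above select the distances of one carrier with a logical condition. The sketch below repeats that idea on a tiny made-up table so that it runs without alaska.txt. The data values, the temporary file, and the names toy, toy_file, flights, total_as and total_by_carrier are all assumptions for illustration, and the real file may differ in separators or quoting.

```r
# Hypothetical toy data standing in for alaska.txt (values are made up)
toy <- data.frame(UNIQUE_CARRIER = c("AS", "US", "AS", "DL"),
                  DISTANCE = c(571, 1448, 909, 1542))
toy_file <- tempfile(fileext = ".txt")
write.table(toy, toy_file, row.names = FALSE)

# Read it back the same way the handout reads alaska.txt
flights <- read.table(toy_file, header = TRUE)

# Logical indexing through the data frame, so attach() is not needed
as_distance <- flights$DISTANCE[flights$UNIQUE_CARRIER == "AS"]
total_as <- sum(as_distance)

# Total distance for every carrier at once
total_by_carrier <- tapply(flights$DISTANCE, flights$UNIQUE_CARRIER, sum)

print(total_as)
print(total_by_carrier)
unlink(toy_file)
```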

Assignment: Complete instructions 1 to 7. For each one, write the commands that you typed to obtain the requested statistic or figure, together with the
numerical result obtained or, where a chart is requested, the plot itself.
1. Compute the mean, median, and variance of the flight time of the 3611
flights. Then compute the mean, median, and variance of the flight
time of the flights (*) Juneau (Juneau,AK).
2. What percentage of flights were shorter than 2 hours and 15 minutes?
What flight time was exceeded by 65% of the flights?
3. Obtain a box plot for the delay at arrival. What was the delay of the
most delayed flight (at arrival)?
4. Obtain a stem-and-leaf chart for the flight time of flights operated by
US Airways (US). How many flights operated by US Airways lasted
longer than 300 but shorter than 310 minutes?
5. Obtain a histogram for the delay at (**). Is the histogram right-skewed
or left-skewed?
6. Plot the delay at departure (axis X) versus the delay at arrival (axis Y).
The observations corresponding to carrier AS (Alaska Airlines) should
go in a different color or plotting character.
7. Plot the distance (axis X) versus the flight time (axis Y) for flights (*)
Juneau. Find the correlation between the variables. Add the regression
line that allows you to predict the flight time of a flight (*) Juneau,
in terms of the distance that it covers. What is the approximate flight
time, in minutes, of a flight (*) Juneau that covers a distance of 1000
miles?
First (alphabetically ordered) family name starting with A-F:
(*) departing from
(**) departure.
First family name starting with G-Z:
(*) with destination
(**) arrival.
