
Welcome to CME 195
Introduction to R

Xiaotong Suo

Course overview
Two parts of this short course:
R basics
R in statistics

Course prerequisite
No programming course is required. If you have
already taken CS 106B or a course at the same
level, you will probably find this course too basic.
Some knowledge of statistics would help, but I
will go over the basics in class.

Homework
It is a short course. We will have 4 homework
assignments, and you have to get at least 60% on
each one to pass this course.

Today's Agenda

Introduction to R
Variables
Functions
Special values in R
Very brief introduction to vectors

What is R?
R is software for statistical computing and data
analysis. It is an implementation of the S language.
R is freely distributed software
(www.r-project.org) with contributions from
developers around the world. It is one of
the main software packages for statistical computing.

Getting started
Download R at www.r-project.org. RStudio
has a nice interface and you can get it for free at
www.rstudio.com.

Getting started
There are two ways to work in R:
A conventional approach: you open a file,
write a program describing what you intend to do,
and run that program.
An interactive approach: you interact with R and
do whatever you want to do, one step at a time.
You type in expressions, and R evaluates them and
returns a value if needed.
We combine both approaches most of the time.

Variables
A variable in computer science is a name given
to some storage location. In more practical
terms, it is a binding between a symbol and a
value.
x <- 20
y <- x + 1
x + 2 -> z
assign("u", x + 2)
Note that R is case-sensitive!

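Variables continued
Both <- and = assign values to a variable, but there
are some differences between the two.
For example, if we type:
median(x <- 1:10)
versus
median(x = 1:10)
The first call creates the variable x and then passes it to
median. The second only passes 1:10 as the argument named x
and creates no variable.
In general, you might run into problems using = as the
assignment operator, so <- is preferred!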

Variables continued
We can create a vector v, which holds many
values, as follows:
v <- c(1,2,3,4,5)
(We will discuss vectors in R in more detail
next lecture.)
Here, c means concatenation.

Variables continued
It is important to understand R's organization.
As you create new variables in R, they are kept
in the computer's memory. It is sometimes useful
to know what variables are currently in memory
and to be able to save or delete them.
ls()
ls.str()
Both commands list existing variables; ls.str() also shows the structure of each one.

Variables continued

x <- rnorm(1)
u <- x + 1
Save x and u in the file x1.RData
save(list = c("x", "u"), file = "x1.RData")
Delete x and u
rm(list = c("x", "u"))
Delete all variables
rm(list = ls())
Load the data in x1.RData
load(file = "x1.RData")
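The save, delete and load steps above can be tried without touching your own files. This is a sketch only: it writes to R's temporary directory, and the names path and x_before are introduced here for illustration.

```r
x <- rnorm(1)
u <- x + 1
x_before <- x                        # keep a copy to compare against later

# Hypothetical location: a file inside R's per-session temporary directory.
path <- file.path(tempdir(), "x1.RData")

save(list = c("x", "u"), file = path)   # write x and u to disk
rm(list = c("x", "u"))                  # delete them from memory
load(file = path)                       # restore them under their original names

all.equal(x, x_before)                  # TRUE
```

Note that load() restores objects under the names they were saved with, so it can silently overwrite variables that already exist.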

Working directory
At the beginning of each R session, a directory
called the working directory is attached to the
session.
To see the current working directory:
getwd()

To set the working directory, pass the path as a string:
setwd("path/to/directory")  # placeholder path, replace with your own
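A small sketch of switching the working directory and switching back. It uses R's temporary directory as a stand-in target, and old_wd is a name chosen here.

```r
old_wd <- getwd()      # remember where we started

setwd(tempdir())       # move to a directory that is guaranteed to exist
getwd()                # now reports the temporary directory

setwd(old_wd)          # go back to the original working directory
```

Restoring the original directory at the end keeps later relative file paths from pointing somewhere unexpected.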

Working directory continued

Whenever you try to read or save a file without the
full path, the working directory (wd) will be used.
Typically the wd is the directory from which you start
R.
At the end of an R session, you can choose to save all
the objects in memory. A file named .RData is then
created for this purpose. Next time, when you start R from
the same directory, this .RData file will be
automatically loaded.

Working directory continued

You can load the .RData from another
directory with load().
Note that only the file .RData is automatically
loaded, whereas other files such as filename.RData are
not. You need to load them with the function
load.

Working directory continued

Another important concept to know is the
search path. That is the sequence of
environments in which R searches for
whatever variable or function you request.
You can see that hierarchy with search().
This hierarchy changes as you add or remove
packages in your R session.

Type ?environment in R to find out how to get, set
and create environments.
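A short sketch of how attaching a package changes the result of search(). The package tools is used only because it ships with every R installation; before and after are names chosen here.

```r
before <- search()        # search path at this point
library(tools)            # attach a package that comes with R
after <- search()         # search path after attaching

setdiff(after, before)    # entries added by library(), empty if tools was already attached
"package:tools" %in% after   # TRUE
```

The first entry of search() is ".GlobalEnv", the environment where variables you create at the prompt are stored.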

Functions
Besides variables, functions are the other most
important concept in computer programming.
A function is a piece of code that takes some
input called arguments, performs a specific
task and possibly returns a value. To use
a function properly, we must set up its
arguments properly.
In R we specify arguments either by name or
by position.

Functions continued
The function rnorm we used earlier:
u <- rnorm(100,0,2) # by position
x <- rnorm(n = 100, mean = 0, sd = 2) # by name
We can find the arguments of a given function by
using the function args:
args(rnorm)
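To check that the two calling styles are equivalent, one can fix the random seed before each call. This is a sketch; a and b are names chosen here, and the sample size is reduced to 5 to keep the output short.

```r
set.seed(1)
a <- rnorm(5, 0, 2)                     # arguments matched by position

set.seed(1)
b <- rnorm(sd = 2, n = 5, mean = 0)     # matched by name, in any order

identical(a, b)   # TRUE
```

Naming arguments makes the order irrelevant, which is why the second call can list sd first.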

Functions continued
There are a lot of built-in functions in R. Before
writing your own function, check whether an
existing function is available first.
Sometimes it is hard to google "summation in R".
Instead, you can google "summation in R cran".
If you know the name of a built-in function but
are not sure how to use it, type:
?rnorm

Functions continued
We can also define a function:
f <- function(x, i) {
  x[i] <- 4
}
w <- c(10, 11, 12, 13)
f(w, 1)
w
w is not changed when we call the function f on it.
Therefore, w is passed by value, which means that R
makes a copy of w and changes the first element of the
copied w. We will talk more about this in later classes.
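Because the function works on a copy, the usual way to get the modified vector is to return it and assign the result. A sketch, where set_elem and w2 are hypothetical names introduced for illustration:

```r
# Returns a modified copy of x; the caller's vector is left untouched.
set_elem <- function(x, i, value) {
  x[i] <- value
  x              # the last expression is the return value
}

w <- c(10, 11, 12, 13)
w2 <- set_elem(w, 1, 4)

w    # 10 11 12 13 (unchanged)
w2   # 4 11 12 13
```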

Special values in R
NA is used to represent missing values and
stands for "not available".
v <- c(1,2,3)
length(v) <- 4
R automatically fills in an NA at the end of v since no
value is provided.
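A brief sketch of how NA behaves in later computations, using the same vector. The names total and total_rm are chosen here.

```r
v <- c(1, 2, 3)
length(v) <- 4          # v is now 1 2 3 NA

is.na(v)                # FALSE FALSE FALSE TRUE

total <- sum(v)                    # NA: a missing value makes the sum unknown
total_rm <- sum(v, na.rm = TRUE)   # 6: missing values are dropped first
```

Use is.na() to test for missing values. A comparison such as v == NA returns NA rather than TRUE or FALSE.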

Special values continued

Inf and -Inf: If a computation results in a
number that is too big, R will return Inf for a
positive number and -Inf for a negative
number.
2^1024
-2^1024
1/0

Special values continued

NaN: Sometimes a computation will produce a result that
makes little sense. In these cases, R often
returns NaN, which stands for "not a number".
Inf - Inf
0/0
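R provides test functions for these special values. A sketch, with big, neg and undefined as names chosen here:

```r
big <- 2^1024            # too large for a double: Inf
neg <- -2^1024           # -Inf
undefined <- Inf - Inf   # NaN

is.infinite(big)         # TRUE
neg < 0                  # TRUE
is.nan(undefined)        # TRUE
is.na(undefined)         # TRUE: NaN also counts as missing
is.nan(NA)               # FALSE: NA is missing, but it is not NaN

is.finite(c(1, Inf, NaN, NA))   # TRUE FALSE FALSE FALSE
```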

NULL
NULL: A null object in R, represented by the
symbol NULL. NULL is often used as an
argument in functions to mean that no value
was assigned to the argument.
f1 <- function(arg1, arg2 = NULL) {}  # body omitted
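Inside a function, is.null() tells you whether such an argument was supplied. The function describe below is a hypothetical example written for this illustration, not part of the course material.

```r
describe <- function(arg1, arg2 = NULL) {
  if (is.null(arg2)) {
    paste("arg1 only:", arg1)            # caller did not supply arg2
  } else {
    paste("arg1 and arg2:", arg1, arg2)
  }
}

describe(1)       # "arg1 only: 1"
describe(1, 2)    # "arg1 and arg2: 1 2"
length(NULL)      # 0: NULL has no elements
```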

Data structures
In order to work with a language we need to
know the objects that language offers. R offers
5 basic objects: vectors, matrices, factors,
data frames and lists.

Vectors
A vector is a collection of objects which all
have the same data type (also called mode). R
supports many different modes: integer,
double, logical, character and complex.

Vectors continued - creating a vector

To create a vector, use the function vector,
or simply create a new variable.
x1 <- vector("double", 2)
x2 <- 1 # The variable x2 is a vector of size 1

Another common way of creating a vector is
using the concatenation function c or the : operator.
x3 <- c(1,2,10)
x4 <- 1:10

Vectors continued
We use [] to access the elements of a vector.
Thus x[1] is the first element of x, etc.
x3[1]+2
x3[3:10] # indices past the end of x3 give NA
x3[1]

Note: indexing in R is 1-based!
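A few more indexing patterns, as a sketch on a small vector defined here for illustration:

```r
x <- c(5, 6, 7)

x[1]      # 5: the first element
x[2:3]    # 6 7: a range of positions
x[-1]     # 6 7: a negative index drops that position
x[5]      # NA: position 5 does not exist
x[0]      # numeric(0): index 0 selects nothing
```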

Vectors continued
x1 <- vector("double", length = 20) # a vector of real numbers
x2 <- c(1:10) # a vector of integers
x3 <- c(T, F, FALSE, TRUE) # a vector of logical values
x4 <- c("MY", "NAME", "IS") # a vector of characters
x5 <- c(2+2i, complex(real = cos(pi/3), imaginary = sin(pi/3))) # a vector of complex numbers
typeof(x4) # or class(x4) or mode(x4)
typeof(x5)
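For reference, a sketch of what typeof reports for each kind of vector, and what happens when types are mixed. The variable mixed is a name chosen here.

```r
typeof(1:10)             # "integer"
typeof(c(1, 2.5))        # "double"
typeof(c(TRUE, FALSE))   # "logical"
typeof("a")              # "character"
typeof(2+2i)             # "complex"

# All elements of a vector share one type, so mixed input is coerced.
mixed <- c(1, "a", TRUE)
mixed                    # "1" "a" "TRUE"
typeof(mixed)            # "character"

mode(1:10)               # "numeric": mode groups integer and double together
```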

Next time
5 basic objects: more on vectors, matrices,
factors, data frames and lists.
