CME 195: Introduction to R
Xiaotong Suo

Course overview
Two parts of this short course:
R basics
R in statistics

Course prerequisite
No programming course is required. If you have already taken CS 106B or a course at the same level, you will probably not like this course.
Some knowledge of statistics would help, but I will go over the basics in class.

Homework
This is a short course. There will be 4 homework assignments, and you must score at least 60% on each one to pass the course.

Today's Agenda
Introduction to R
Variables
Functions
Special values in R
Very brief introduction to vectors

What is R?
R is software for statistical computing and data analysis. It is an implementation of the S language.
R is freely distributed software (www.r-project.org) with contributions from developers around the world.
It is one of the main software packages for statistical computing.

Getting started
Download R at www.r-project.org.
RStudio has a nice interface, and you can get it for free at www.rstudio.com.

Getting started
There are two ways to work in R:
A conventional approach: you open a file, write a program describing what you intend to do, and run that program.
An interactive approach: you interact with R and do whatever you want to do, one step at a time. You type in expressions, and R evaluates them and returns a value if needed.
Most of the time we combine both approaches.

Variables
A variable in computer science is a name given to some storage location. In more practical terms, it is a binding between a symbol and a value.
x <- 20
y <- x + 1
x + 2 -> z
assign("u", x + 2)
Note that R is case-sensitive!
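
The assignment forms on this slide can be tried out as follows. This is only a small sketch: the names x, y, z and u come from the slide, while checking for an upper-case X is an illustration added here and assumes no object called X exists in the session.

```r
x <- 20              # leftward assignment
y <- x + 1           # y is now 21
x + 2 -> z           # rightward assignment, z is now 22
assign("u", x + 2)   # assign() takes the variable name as a string

# R is case-sensitive: x and X are different names
print(exists("x"))   # TRUE
print(exists("X"))   # FALSE, assuming nothing named X has been created
```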
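
Variables continued
Both <- and = assign values to a variable, but there are some differences between the two. For example, compare:
median(x <- 1:10)
versus
median(x = 1:10)
The first call creates a variable x and then takes its median; the second only passes 1:10 as the argument named x and creates no variable.
In general, you might run into problems using = as the assignment operator, so <- is preferred!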
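
A short sketch of the difference between the two median() calls. The helper names m1, m2, created_by_equals and created_by_arrow are hypothetical and exist only to record what happens.

```r
# Start clean: remove any x left over from earlier code
if (exists("x", inherits = FALSE)) rm(x)

# Inside a call, '=' names an argument; it does not create a variable
m1 <- median(x = 1:10)
created_by_equals <- exists("x", inherits = FALSE)   # FALSE

# '<-' first assigns 1:10 to x, then passes that value to median()
m2 <- median(x <- 1:10)
created_by_arrow <- exists("x", inherits = FALSE)    # TRUE

print(c(m1, m2))   # both medians are 5.5
```

Both calls return the same median; the only difference is whether x exists afterwards.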

Variables continued
We can create a vector v, which holds many values, as follows:
v <- c(1,2,3,4,5)
Here, c means concatenation.
(We will discuss vectors in more detail in the next lecture.)
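
As a small illustration of c(), the sketch below builds the vector from the slide and then extends it. The name v2 is hypothetical and used only for this example.

```r
v <- c(1, 2, 3, 4, 5)   # five values combined into one vector
v2 <- c(v, 6, 7)        # c() also joins an existing vector with new values

print(length(v))    # 5
print(length(v2))   # 7
```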

Variables continued
It is important to understand R's organization. As you create new variables in R, they are kept in the computer's memory. It is sometimes useful to know which variables are currently in memory and to be able to save or delete them.
ls()
ls.str()
Both commands list existing variables.
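
A minimal sketch of what the two listing commands show. The variables a_number and a_word are hypothetical and created only so that there is something to list.

```r
a_number <- 3.14
a_word <- "hello"

current <- ls()   # character vector with the names of existing variables
print(current)

print(ls.str())   # the same variables, each shown with its type and value
```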

Variables continued
x <- rnorm(1)
u <- x + 1
Save x and u in the file x1.RData:
save(list = c("x", "u"), file = "x1.RData")
Delete x and u:
rm(list = c("x", "u"))
Delete all variables:
rm(list = ls())
Load the data in x1.RData:
load(file = "x1.RData")
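
The save, delete and load steps can be checked as a round trip. This sketch writes to R's temporary directory rather than the working directory, which is a choice made here and not part of the slide; data_file and x_before are hypothetical names.

```r
x <- rnorm(1)
u <- x + 1
x_before <- x   # keep a copy to compare against after loading

# Write into the per-session temp directory so no existing file is touched
data_file <- file.path(tempdir(), "x1.RData")
save(list = c("x", "u"), file = data_file)

rm(list = c("x", "u"))                   # x and u are gone
print(exists("x", inherits = FALSE))     # FALSE

load(file = data_file)                   # x and u come back with their old values
print(identical(x, x_before))            # TRUE
```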

Working directory
At the beginning of each R session, a directory called the working directory is attached to the session.
To see the current working directory:
getwd()
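
The working directory can also be changed with setwd(), which the slide does not mention. A brief sketch, assuming it is acceptable to switch temporarily to R's temporary directory; old_dir is a hypothetical name.

```r
old_dir <- getwd()   # remember where we started
print(old_dir)

setwd(tempdir())     # switch to the per-session temporary directory
print(getwd())

setwd(old_dir)       # switch back so later file paths still resolve
```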

Functions
Besides variables, functions are the other most important concept in computer programming.
A function is a piece of code that takes some input, called arguments, performs a specific task, and possibly returns a value.
To use a function properly, we must set up its arguments properly. In R we specify arguments either by name or by position.

Functions continued
The function rnorm we used earlier:
u <- rnorm(100, 0, 2) # by position
x <- rnorm(n = 100, mean = 0, sd = 2) # by name
We can find the arguments of a given function by using the function args:
args(rnorm)
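
To see that the two calling styles are equivalent, the sketch below fixes the random seed before each call so that the draws can be compared. Using set.seed() and the extra variable y are additions made for this illustration.

```r
set.seed(1)
u <- rnorm(100, 0, 2)                   # arguments matched by position

set.seed(1)
x <- rnorm(n = 100, mean = 0, sd = 2)   # arguments matched by name

set.seed(1)
y <- rnorm(sd = 2, n = 100)             # named arguments in any order; mean keeps its default of 0

print(identical(u, x))   # TRUE
print(identical(x, y))   # TRUE
```

Naming arguments makes a call longer but removes any dependence on their order.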

Functions continued
There are a lot of built-in functions in R. Before writing your own function, check whether an existing function is available first.
Sometimes it is hard to google "summation in R". Instead, you can google "summation in R cran".
If you know the name of a built-in function but are not sure how to use it, look up its help page:
?rnorm
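
Besides ?rnorm, R has a few other lookup tools that are not on the slide. A small sketch: the interactive forms are shown as comments because they open help pages, and matches is a hypothetical name.

```r
# At the R prompt you can type:
#   ?rnorm                      help page for a function whose name you know
#   help("rnorm")               the same lookup, written as a function call
#   ??"normal distribution"     search the installed help pages for a phrase

# apropos() lists object names on the search path that contain a pattern
matches <- apropos("norm")
print(matches)

# args() shows a function's arguments and their defaults
print(args(rnorm))
```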

Functions continued
We can also define a function:
f <- function(x, i){
  x[i] = 4
}
w = c(10, 11, 12, 13)
f(w, 1)
w
w is not changed when we call the function f on it. w is passed by value, which means that R makes a copy of w and changes the first element of the copy.
We will talk more about this in later classes.
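
Because the function only changes its own copy, the usual way to get the modified vector out is to return it and assign the result. A minimal sketch; set_to_four and w_new are hypothetical names, not part of the slide.

```r
set_to_four <- function(x, i) {
  x[i] <- 4   # changes the function's local copy only
  x           # return the modified copy to the caller
}

w <- c(10, 11, 12, 13)
w_new <- set_to_four(w, 1)

print(w)       # 10 11 12 13: the original is untouched
print(w_new)   # 4 11 12 13
```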

Special values in R
NA is used to represent missing values and stands for "not available".
v <- c(1,2,3)
length(v) = 4
R automatically fills in an NA at the end of v since no value is provided.
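
A short sketch of how NA behaves once it is in a vector. The calls to is.na() and the na.rm argument of mean() go slightly beyond the slide and are included only to show how missing values are usually detected and skipped.

```r
v <- c(1, 2, 3)
length(v) <- 4          # extend v; the new slot has no value
print(v)                # 1 2 3 NA

print(is.na(v))         # FALSE FALSE FALSE TRUE

print(mean(v))                 # NA: a missing value makes the mean unknown
print(mean(v, na.rm = TRUE))   # 2: the mean of the non-missing values
```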

NULL
NULL is the null object in R, represented by the symbol NULL.
NULL is often used as an argument in functions to mean that no value was assigned to the argument.
f1 = function(arg1, arg2 = NULL)
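
One common way to use a NULL default is to test for it with is.null() inside the function. The sketch below is an assumption about how such a function might look: add_optional and its body are hypothetical, since the slide shows only a signature.

```r
add_optional <- function(arg1, arg2 = NULL) {
  if (is.null(arg2)) {
    arg1          # no second argument supplied: return arg1 unchanged
  } else {
    arg1 + arg2   # second argument supplied: use it
  }
}

print(add_optional(10))      # 10
print(add_optional(10, 5))   # 15
print(length(NULL))          # 0: NULL has no elements
```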

Data structures
To work with a language, we need to know the objects that the language offers.
R offers 5 basic objects: vectors, matrices, factors, data frames and lists.

Vectors
A vector is a collection of objects that all have the same data type (also called mode).
R supports many different modes: integer, double, logical, character and complex.
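
The five types listed above can be inspected with typeof(). The last two lines show what "all have the same data type" means in practice: when values of different types are combined, R converts them to one common type. The names types and mixed are hypothetical.

```r
types <- c(
  typeof(1L),       # "integer"
  typeof(1.5),      # "double"
  typeof(TRUE),     # "logical"
  typeof("a"),      # "character"
  typeof(2 + 2i)    # "complex"
)
print(types)

mixed <- c(1, "a", TRUE)   # different types in one call to c()
print(typeof(mixed))       # "character": every element was converted to a string
```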

Vectors continued
We use [] to access the elements of a vector. Thus x[1] is the first element of x, and so on.
x3[1]+2
x3[3:10]
X3[1]
Note: indexing in R is 1-based!
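
A small sketch of indexing with []. The vector x and its values are made up for this example.

```r
x <- c(10, 20, 30, 40, 50)

print(x[1])           # 10: the first element, since indexing starts at 1
print(x[2:4])         # 20 30 40: a range of positions
print(x[1] + 2)       # 12: an element can be used like any other value
print(x[length(x)])   # 50: the last element
print(x[7])           # NA: position 7 does not exist in a vector of length 5
```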

Vectors continued
x1 <- vector("double", length=20) # a vector of real numbers
x2 = c(1:10) # a vector of integers
x3 = c(T,F,FALSE,TRUE) # a vector of logical values
x4 = c("MY","NAME","IS") # a vector of characters
x5 = c(2+2i, complex(real=cos(pi/3), imaginary=sin(pi/3))) # a vector of complex numbers
typeof(x4) # or class(x4) or mode(x4)
typeof(x5)

Next time
5 basic objects: more on vectors, matrices, factors, data frames and lists.