
14-05-17 2:59 PM R tutorial - BCH441Hwiki

Page 1 of 18 http://biochemistry.utoronto.ca/undergraduates/courses/BCH441H/wiki/index.php/R_tutorial
From BCH441Hwiki
R tutorial
This is a tutorial introduction to R for users with no previous background in the platform or the language.
Contents
The environment
Installation
User interface
The Help system
Working directory
.Rprofile - startup commands
... unix systems
... Mac OS X systems
...Windows systems
Workspace
Packages
Scripts
Simple commands
Operators
Functions
Variables
Scalar data
Vectors
Matrices
Lists
Data frames
Writing your own functions
Notes
Further reading and resources

The environment
In this section we discuss how to download and install the software, how to configure an R session, and what
working with the R environment involves.
Installation
1. Navigate to http://probability.ca/cran/ [1] and follow the link to your computer's operating system.
2. Download a precompiled binary (or "build") of the R "framework" to your computer and follow the
instructions for installing it. You don't need tools, or GUI versions for now, but do make sure that the
program is the correct one for your version of your operating system.
3. Launch R.
The program should open a window (the "R console") and greet you with its input prompt, awaiting your input:
>
The sample code on this page sometimes copies input/output from the console, and sometimes shows the actual
commands only. The > character at the beginning of a line is always just R's input prompt; it is shown here
only to illustrate the interactive use of the program and you do not need to type it. If a line starts with [1] or
similar, this is R's output on the console. A # character marks the following text as a comment, which is not
executed by R. In principle, commands can be copied and pasted into the console or into a script;
obviously, you don't need to copy the comments. In addition, I use syntax highlighting
(http://www.mediawiki.org/wiki/Extension:SyntaxHighlight_GeSHi) on R script to color language keywords,
numbers, strings, etc. differently from other text. This improves readability, but keep in mind that the colours you
see on your computer will be different. One more thing about the console: use your keyboard's up-arrow key to
retrieve previous commands, then move into the line with the left-arrow key to edit it; hit enter to execute the modified line.
User interface
R comes with a GUI [2] to lay out common tasks. For example, there are a number of menu items, many of
which are similar to other programs you will have worked with ("File", "Edit", "Format", "Window", "Help"
...). All of these tasks can also be accessed through the command line. In general, GUIs are useful when you are
not sure what you want to do or how to go about it; the command line is much more powerful when you have
more experience and know your way around in principle. R gives you both options.
In addition to the Console, there are a number of other windows that you can open (or that open automatically).
They all can be brought to the foreground with the Windows menu and include help, plotting, package browser
and other windows.
Let's look at some functions of R that refer to how you work, not what you do.
The Help system
Help is available for all commands and for the R command line syntax. As well, help is available to find the
names of commands when you are not sure of them.
("help" is a function, arguments to a function are passed in parentheses "()")
> help(rnorm)
>
(shorthand for the same thing)
> ?rnorm
>
(what was the name of that again ... ?)
> ?binom
No documentation for 'binom' in specified packages and libraries:
you could try '??binom'
> ??binom
>
(found "Binomial" in the list of keywords)
> ?Binomial
>
That's all fine, but you will soon notice that R's help documentation is not all that helpful for newcomers
(who need the most help). If you look at the bottom of a help page, you will usually find examples of
command usage; these often make matters clearer than the terse and principled help text above. Or you can
just Google for what interests you; this is often the quickest way to find working example code. Also, as a
result of a Google search it may turn out, for example, that something can't be done (easily), and you won't find
things that can't be done at all in the help system. You may want to include "r language" in your search terms,
although Google is usually pretty good at figuring out what kind of "r" you are looking for, if your query
includes a few terms vaguely related to statistics.
There is also an active R-help mailing list (https://stat.ethz.ch/mailman/listinfo/r-help) which you can post to
or at least search the archives: your question probably has been asked and answered before.
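As a small sketch of how these tools fit together, the following uses three standard functions that complement help() and ??. apropos() lists objects whose names match a pattern, help.search() is the function that ?? calls, and example() runs the examples found at the bottom of a help page. The search terms are chosen only for illustration.

```r
# List objects on the search path whose names contain "binom"
apropos("binom")

# help.search() is what ?? calls; here restricted to the stats package
hits <- help.search("binomial", package = "stats")

# Run the examples from the bottom of the help page for mean()
example(mean)
```

apropos() is often the quickest route when you remember only part of a function name.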
Working directory
To locate a file on a computer, one has to specify the filename and the directory in which the file is stored; this is
sometimes called the path of the file. The "working directory" for R is either the directory in which the R
program has been installed, or some other directory, as initialized by a startup script. You can execute the
command getwd() to list what the "Working Directory" is currently set to:
> getwd()
[1] "/Users/steipe/R"
It is convenient to put all your R input and output files into a project-specific directory and then define this to be
the "Working Directory". Use the setwd() command for this. setwd() requires a parameter in its parentheses: a
string with the directory path. Strings in R are delimited with " or ' characters. If the directory does not exist, an
Error will be reported. Make sure you have created the directory. On Mac and Unix systems, the usual
shorthand notation for relative paths can be used: ~ for the home directory, . for the current directory, .. for the
parent of the current directory.
On Windows systems, you need to know that backslashes "\" have a special meaning for R: they work as
escape characters. Thus R gets confused when you put them into string literals, such as Windows path names.
R has a simple solution: replace all backslashes with forward slashes, and R will translate them back
when it talks to your operating system. Instead of C:\documents\projectfiles you write
C:/documents/projectfiles. [3]
My home directory...
> setwd("~")
> getwd()
[1] "/Users/steipe"
(Relative path: home directory, up one level, then down into chen's home directory)
> setwd("~/../chen")
> getwd()
[1] "/Users/chen"
(Absolute path: specify the entire string)
> setwd("/Users/steipe/abc/R_samples")
> getwd()
[1] "/Users/steipe/abc/R_samples"
Task:
1. Create a directory for your sample files and use setwd("your-directory-name")
to set the working directory.
2. Confirm that this has worked by typing getwd().
The Working Directory functions can also be accessed through the Menu, under Misc.
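The task above can also be done entirely from the console. The following is a minimal sketch, assuming a hypothetical project directory named R_samples; it is placed inside R's temporary directory here so that the example is safe to run anywhere. Substitute your own path.

```r
# Hypothetical project directory (inside tempdir() only for this demonstration)
project_dir <- file.path(tempdir(), "R_samples")

# Create the directory if it does not exist yet; setwd() fails otherwise
if (!dir.exists(project_dir)) {
    dir.create(project_dir, recursive = TRUE)
}

# setwd() invisibly returns the previous working directory, so we can restore it
old_wd <- setwd(project_dir)
getwd()

# Go back to where we started
setwd(old_wd)
getwd()
```

file.path() joins path components with forward slashes, which R accepts on all platforms, including Windows.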
.Rprofile - startup commands
Often, when working on a project, you would like to start off in your working directory right away when you
start up R, instead of typing the setwd() command. This is easily done in a special R script that is executed
automatically on startup [4]. The name of the script is .Rprofile and R expects to find it in the user's home
directory. You can edit this file with a simple text editor like TextEdit (Mac), Notepad (Windows) or gedit
(Linux).
Besides setting the working directory, other items that might go into such a file could be
libraries that you often use
constants that are not automatically defined
functions that you would like to preload.
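For illustration, an .Rprofile covering these items might look like the sketch below. The path is a placeholder, and the constant and helper names (e, lg) are hypothetical choices, not something R requires.

```r
# ~/.Rprofile : hypothetical example, adjust names and paths to your setup

# Start in the project directory, but only if it actually exists
project_dir <- "/path/to/your/project"    # placeholder path
if (dir.exists(project_dir)) {
    setwd(project_dir)
}

# A constant that is not automatically defined in R
e <- exp(1)

# A small function to preload: logarithm to base 10
lg <- function(x) { log(x) / log(10) }

# A library you often use could be loaded here, for example:
# library(seqinr)
```

Guarding setwd() with dir.exists() keeps R from reporting an error at startup if the directory has been moved or renamed.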
... unix systems
Navigate to your home directory (cd ~).
Open a textfile
Type in: setwd("/path/to/your/project")
Save the file with a filename of .Rprofile. (Note the dot prefix!)
... Mac OS X systems
On Macs, filenames that begin with a dot are not normally shown in the Finder. You can either open a Terminal
window and use nano to edit the file instead of TextEdit, or configure the Finder to show you such so-called
"hidden files" by default. To do this:
1. Open a Terminal window;
2. Type (at the $ prompt): defaults write com.apple.Finder AppleShowAllFiles YES
3. Restart the Finder by accessing Force quit (under the Apple menu), selecting the Finder and clicking
Relaunch.
4. If you ever want to revert this, just do the same thing but set the default to NO instead.
In any case: the procedure is the same as for unix systems. A text editor you can use is nano in a Terminal
window.
...Windows systems
...
Workspace
During an R session, you might define a large number of variables, datastructures, load packages and scripts
etc. All of this information is stored in the so-called "Workspace". When you quit R you have the option to save
the Workspace; it will then be reloaded in your next session.
List the current workspace contents: initially it is empty. (R reports an object of type "character" with a length of 0.)
> ls()
character(0)
>
Initialize three variables (multiple commands on one line can be separated with a semicolon ";")
> a <- 1; b <- 2; eps <- 0.0001
> ls()
[1] "a" "b" "eps"
>
Remove one item. (Note: the parameter is not the string "a", but the variable name a.)
> rm(a)
> ls()
[1] "b" "eps"
>
We can use the output of ls() as input to rm() to remove everything and clear the Workspace. (cf. ?rm for details)
> rm(list = ls())
> ls()
character(0)
>
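Saving and reloading can also be done explicitly rather than only when quitting. The sketch below uses the standard save() and load() functions; the file name my_workspace.RData and its location in the temporary directory are assumptions made for the example.

```r
# Define a few variables
a <- 1; b <- 2; eps <- 0.0001

# Save selected objects to a file (hypothetical file name)
ws_file <- file.path(tempdir(), "my_workspace.RData")
save(a, b, eps, file = ws_file)

# Remove them ...
rm(a, b, eps)
exists("eps", inherits = FALSE)

# ... and restore them from the file
load(ws_file)
ls()
```

save.image() does the same for the entire workspace in one call.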

Packages
R has many powerful functions built in, but one of its greatest features is that it is easily extensible. Extensions
have been written by legions of scientists for many years, most commonly in the R programming language
itself, and made available through CRAN, the Comprehensive R Archive Network (http://cran.r-project.org/).
A package is a collection of code, documentation and sample data files. To use packages, you need to install
them (once), and add them to your current session (for every new session). You can get an overview of installed
and loaded packages by opening the Package Manager window from the Packages & Data Menu item. It
gives a list of available packages you currently have installed, and identifies those that have been loaded at
startup, or interactively.
Task:
Navigate to http://cran.r-project.org/web/packages/ and read the page.
Navigate to http://cran.r-project.org/web/views/ (the CRAN task views).
Follow the link to Genetics and read the synopsis of available packages. The
library seqinr sounds useful, but check first whether it is already installed.
library() opens a window of installed packages in the library; search() shows which ones are currently
loaded.
> library()
> search()
[1] ".GlobalEnv" "tools:RGUI" "package:stats" "p
[5] "package:grDevices" "package:utils" "package:datasets" "p
[9] "Autoloads" "package:base"
In the R packages available window, confirm that seqinr is not yet installed.
Follow the link to seqinr to see what standard information is available with a
package. Then follow the link to Reference manual to access the documentation
pdf. This is also sometimes referred to as a "vignette" and contains usage hints and
sample code.
Read the help for vignette. Note that there is a command to extract R sample code from a vignette, to
experiment with it.
> ?vignette
>
Install seqinr from the closest CRAN mirror and load it for this session. Explore some functions.
> ??install
> ?install.packages
> install.packages("seqinr")
--- Please select a CRAN mirror for use in this session ---
trying URL 'http://probability.ca/cran/bin/macosx/contrib/2.13/seqi
Content type 'application/x-gzip' length 4528528 bytes (4.3 Mb)
opened URL
==================================================
downloaded 4.3 Mb


The downloaded packages are in
/var/folders/dq/dqPEEPbF0ApRU/-Tmp-//RtmpBlw/downloaded_pac
>
> library("seqinr")
> ls("package:seqinr")
[1] "a" "aaa" "AAstat"
[4] "acnucclose" "acnucopen" "al2bp"
[...]
[205] "where.is.this.acc" "words" "words.po
[208] "write.fasta" "zscore"
> ?a
> a("Tyr")
[1] "Y"
> choosebank()
[1] "genbank" "embl" "emblwgs" "swissprot"
[...]
[31] "refseqViruses"
The fact that these methods work shows that the library has been downloaded,
installed and loaded, and its functions are now available. Just for fun and
demonstration, let's use these functions to download a sequence and calculate some
statistics (however, not to digress too far, without further explanation at this point).
Copy the code below and paste it into the R-console
choosebank("swissprot")
query("seq", "N=MBP1_YEAST")
mbp1 <- getSequence(seq)
closebank()
x <- AAstat(mbp1[[1]])
barplot(sort(x$Compo))
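Because a package is installed once but loaded in every session, it can be convenient to check for it before calling library(). The following sketch uses the standard requireNamespace() function for the check; it deliberately does not install anything by itself and only prints a hint.

```r
pkg <- "seqinr"

# TRUE if the package is installed and can be loaded, FALSE otherwise
is_installed <- requireNamespace(pkg, quietly = TRUE)

if (is_installed) {
    # attach the package for this session
    library(pkg, character.only = TRUE)
} else {
    message("Package ", pkg, " is not installed; run install.packages(\"",
            pkg, "\") once, then try again.")
}
```

The character.only = TRUE argument is needed because the package name is held in a variable rather than typed literally.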
Scripts
My preferred way of running R is not strictly through the console. I open a new file - a script - and enter my R
commands into the file. Then I execute the commands directly from the script. I may try things in the console,
experiment, change parameters etc. - but ultimately everything I do goes into the file. This has four major
advantages:
The script is an accurate record of my procedure so I know exactly what I have done;
I add numerous comments to record what I was thinking when I developed it;
I can immediately reproduce the entire analysis from start to finish, simply by rerunning the script;
I can reuse parts easily, thus making new analyses quick to develop.
Task:
Use the File menu to open a New Document (on Mac) or New Script (on
Windows).
Enter the following code (copy from here and paste):
# sample script:
# define a vector
a <- c(1, 1, 2, 3, 5, 8, 13)
# list its contents
a
# calculate the mean of its values
mean(a)
Save the file in your working directory (e.g. with the name sample.R).
Placing the cursor in a line and pressing command-return (on the Mac; ctrl-r on
Windows) will execute that line and you see the result on the console. You can also select
more than one line and execute the selected block with this shortcut. Alternatively, you
can run the entire file. In the console type:
source("sample.R")
However: this will not print output to the console. When you run a script, if you want to
see text output you need to explicitly print() it.
Change your script to the following, save it and source() it.
# sample script:
# define a vector
a <- c(1, 1, 2, 3, 5, 8, 13)
# list its contents
print(a)
# calculate the mean of its values
print(mean(a))
Confirm that the print(a) command also works when you execute the line
directly from the script.
N.B.: if you want to save your output, you can divert it to a file with the sink() command. You can read
about the command by typing:
?sink
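A minimal sketch of sink() in use, assuming a hypothetical output file sample_output.txt in the temporary directory: the first call diverts console output to the file, and a second call without arguments restores normal output.

```r
out_file <- file.path(tempdir(), "sample_output.txt")   # hypothetical file name

a <- c(1, 1, 2, 3, 5, 8, 13)

sink(out_file)      # from here on, output goes to the file
print(mean(a))
sink()              # restore output to the console

# Read the file back to confirm what was written
readLines(out_file)
```

Forgetting the closing sink() is a common reason for a console that seems to have stopped printing results.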

Simple commands
The R command line evaluates expressions. Expressions can contain constants, variables, operators and
functions of the various datatypes R recognizes.
Operators
The common arithmetic operators are recognized in the usual way. Try the following operators on numbers:
5
5 + 3
5 + 1 / 2
3 * 2 + 1
3 * (2 + 1)
2^3 # Exponentiation
8 ^ (1/3) # Third root via exponentiation
7 %% 2 # Modulo operation (remainder of integer division)
7 %/% 2 # Integer division
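Two points about these operators are easy to check for yourself: integer division and modulo together recover the original number, and operator precedence determines the result when parentheses are left out. A short sketch with arbitrarily chosen values:

```r
x <- 7; y <- 2

# integer division and remainder recombine to the original value
(x %/% y) * y + (x %% y) == x

# exponentiation binds more tightly than the unary minus
-2^2        # evaluated as -(2^2)
(-2)^2      # parentheses change the order of evaluation

# the range operator binds more tightly than multiplication
1:3 * 2     # evaluated as (1:3) * 2
```

When in doubt, add parentheses; ?Syntax lists the full order of precedence.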
Functions
Most of R's functionality is expressed through functions. These are either defined by default (built-in), loaded
in specific packages (see above), or they can be easily defined by you (see below). In general a function is
invoked through a name, followed by one or more arguments (also parameters) in parentheses, separated by
commas. Whenever I refer to a function, I write the parentheses to identify it as such and not a constant or other
keyword, e.g. log(). Here are some examples for you to try and play with:
cos(pi) #"pi" is a predefined constant.
sin(pi) # Note the rounding error. This number is not really different from zero.
sin(30 * pi/180) # Trigonometric functions use radians as their argument - this conv
exp(1) # "e" is not predefined, but easy to calculate.
log(exp(1)) # functions can be arguments to functions - they are evaluated from the
log(10000) / log(10) # log() calculates natural logarithms; convert to any base by
exp(complex(r=0, i=pi)) #Euler's identity
There are several ways to populate the argument list for a function and R makes a reasonable guess what you
want to do. Consider the specification of a complex number in Euler's identity above. The function complex()
can work with a number of arguments that are given in the documentation (see: ?complex). These include
length.out, real, imaginary, and some more. The length.out argument creates a vector with one or more
complex numbers. If nothing else is specified, this will be a vector of complex zero(s). If there are two, or three
arguments, they will be placed in the respective slots. However, since the arguments are named, we can also
define which slot of the argument list they should populate. Consider the following to illustrate this:
complex(1)
complex(4)
complex(1, 2) # imaginary part missing - defaults to zero
complex(1, 2, 3) # one complex number
complex(4, 2, 3) # four complex numbers
complex(real = 0, imaginary = pi) # defining via named parameters
complex(imaginary = pi, real = 0) # same thing - if names are used, order is not imp
complex(re = 0, im = pi) # names can be abbreviated ...
complex(r = 0, i = pi) # ... to the shortest string that is unique among the named p
complex(i = pi, 1, 0) # Think: what have I done here? Why does this work?
exp(complex(i = pi, 1, 0)) # (The complex number above is the same one as in Euler'
Variables
In order to store the results of evaluations, you can freely assign them to variables. Variables are created
internally whenever you first use them (i.e. they are allocated and instantiated). Variable names are case
sensitive. There are a small number of reserved strings, and a very small number of predefined constants, such
as pi. However these constants can be overwritten - be careful. Read more at:
?make.names
?reserved
To assign a value to a variable, use the assignment operator <-. You could also use the = sign, but this is too
easily confused with the equality test ==, and such errors are hard to debug. Try:
a <- 5
a
a + 3
b <- 8
b
a + b
a == b # equality test
a != b # not equal
a < b # less than
Note that all of R's data types can be assigned to variables (as well as functions and other objects).
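The warning about overwriting constants, and the difference between = and ==, can be tried out safely. This is a small sketch with arbitrary values; removing our own variable with rm() makes the predefined constant visible again.

```r
pi <- 3          # masks the predefined constant with our own variable
cos(pi)          # now computed with the wrong value

rm(pi)           # remove our variable: the built-in constant is visible again
cos(pi)

a <- 5
a == 5           # a comparison: returns a logical value, a is unchanged
a = 6            # an assignment: a now holds 6
a
```

This is why the tutorial prefers <- for assignment: a single missing = turns a test into an assignment without any error message.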

Scalar data
Scalars are single numbers, the "atomic" parts of more complex datatypes. We have covered many of the
properties of scalars above, e.g. the use of constants and their assignment to variables. To round this off, here
are some remarks on the types of scalars R uses and on coercion, or casting one datatype into another. The
following scalar types are supported:
Boolean constants: TRUE and FALSE. This type has the mode logical;
Integers and floats (floating point numbers), which have the mode numeric, and complex numbers, which have the mode complex;
Strings. These have the mode character.
Other modes exist, such as list, function and expression, all of which can be combined into complex
objects.
The function mode() returns the mode of an object and typeof() returns its type. Consider:
a <- 3 > 5; a; mode(a); typeof(a) # Note: 3 > 5 is a logical expression, its value
a <- 3 < 5; a; mode(a); typeof(a)

a <- 3.0; a; mode(a); typeof(a) # Double precision floating point number
a <- 3.0e0; a; mode(a); typeof(a) # Same value, exponential notation

a <- 3; a; mode(a); typeof(a) # Note: numbers are double precision floats by d
a <- as.integer(3); a; mode(a); typeof(a) # If we really want an integer, we must

a <- "3"; a; mode(a); typeof(a) # Forcing the number to be interpreted as a charac

# More coercions. For each of these, first think what result you would expect:
as.numeric("3") # character as numeric
as.numeric("3.141592653") # string as numeric
as.numeric("pi") # another string as numeric
as.numeric(pi) # not a string, but a predefined constant

as.logical(0)
as.logical(1)
as.logical(-1)
as.logical(pi) # any non-zero number is TRUE ...
as.logical("pi") # ... but not non-numeric types. NA is "Not Available".

Vectors
Since we (almost) never do statistics on scalar quantities, R obviously needs ways to handle collections of data
items. In its simplest form such a collection is a vector: an ordered list of items of the same type. Vectors
are created from scratch with the c() function, which concatenates individual items into a list. Vectors have
properties, such as length; individual items in vectors can be combined in useful ways.
#Create a vector and list its contents and length:
f <- c(1, 1, 2, 3, 5, 8, 13, 21)
f
length(f)

# Various ways to retrieve values from the vector.
f[1] # By index: "1" is first element.
f[length(f)] # length() is the index of the last element.
1:4 # This is the range operator
f[1:4] # using the range operator (it generates a sequence and returns it in a vect
f[4:1] # same thing, backwards
seq(from=2, to=6, by=2) # The seq() function is a flexible, generic way to generate
seq(2, 6, 2) # Same thing: arguments in default order
f[seq(2, 6, 2)]

# ...using an index vector with positive indices
a <- c(1, 3, 4, 1) # the elements of index vectors must be valid indices of the tar
f[a] # Here, four elements are retrieved from f[]

# ...using an index vector with negative indices
a <- -(1:4) # If elements of index vectors are negative integers, the corresponding
f[a] # Here, the first four elements are omitted from f[]
f[-((length(f)-3):length(f))] # Here, the last four elements are omitted from f[]

# ...using a logical vector
f>4 # A logical expression operating on the target vector returns a vector of logic
f[f>4]; # We can use this logical vector to extract only elements for which the log

# Example: extending the Fibonacci series for three steps.
# Think: How does this work? What numbers are we adding here and why does the resul
f <- c(f, f[length(f)-1] + f[length(f)]); f
f <- c(f, f[length(f)-1] + f[length(f)]); f
f <- c(f, f[length(f)-1] + f[length(f)]); f
Many operations on scalars can be simply extended to vectors and R computes them very efficiently by
iterating over the elements in the vector.
f
f+1
f*2

# computing with two vectors of same length
a <- f[-1]; a # like f[], but omitting the first element
b <- f[1:(length(f)-1)]; b # like f[], but shortened by the last element
c <- a / b # the "golden ratio", phi (~1.61803 or (1+sqrt(5))/2 ), an irrational num
c
abs(c - ((1+sqrt(5))/2)) # Calculating the error of the approximation, element by e

Matrices
If we need to operate with several vectors, or multi-dimensional data, we make use of matrices or more
generally k-dimensional arrays in R. Matrix operations are very similar to vector operations; in fact a matrix
actually is a vector for which the number of rows and columns have been defined.
The most basic form of such a definition is the dim() function. Consider:
a <- 1:12; a
dim(a) <- c(2,6); a
dim(a) <- c(2,2,3); a
dim() also allows you to retrieve the number of rows and columns. For example:
dim(a) # returns a vector
dim(a)[3] # only the third value of the vector
If you have a two-dimensional matrix, the functions nrow() and ncol() will also give you the number of rows
and columns, respectively. Obviously, dim(a)[1] is the same as nrow(a).
As an alternative to dim(), matrices can be defined using the matrix() or array() functions (see there), or
"glued" together from vectors by rows or columns, using the rbind() or cbind() functions respectively:
a <- 1:4
b <- 5:8
c <- rbind(a, b); c
d <- cbind(a, b); d
e <- cbind(d, 9:12); e
Addressing (retrieving) individual elements or slices from matrices is simply done by specifying the appropriate
indices, where a missing index indicates that the entire row or column is to be retrieved:
e[1,] # first row
e[,2] # second column
e[3,2] # element at index row=3, column = 2
e[3:4, 1:2] # submatrix
Note that R has numerous functions to compute with matrices, such as transposition, multiplication, inversion,
calculating eigenvalues and eigenvectors and more.
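To give a flavour of these functions, here is a brief sketch on a small symmetric matrix with arbitrarily chosen values. t(), %*%, solve() and eigen() are all part of base R.

```r
# A 2 x 2 matrix, filled column by column
m <- matrix(c(2, 1, 1, 3), nrow = 2)
m

t(m)                 # transposition
m %*% m              # matrix multiplication (note: m * m multiplies element by element)

m_inv <- solve(m)    # inversion
m %*% m_inv          # the identity matrix, up to rounding error

ev <- eigen(m)       # eigenvalues and eigenvectors
ev$values
ev$vectors
```

The distinction between * and %*% is a frequent source of confusion: the first works element by element, as for vectors, and only the second is the matrix product.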

Lists
While the elements of matrices and arrays all have to be of the same type, lists are more generally ordered
collections of components. Lists are created with the list() function, which works similarly to the c() function.
Components are accessed through their index in double square brackets, or through their name, if the name has
been defined. Here is an example:

pUC19 <- list(size=2686, marker="ampicillin", ori="ColE1", accession="L01397", BanI
pUC19[[1]]
pUC19[[2]]
pUC19$ori
pUC19$BanI[2]
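The different ways of accessing list components can be compared side by side in a second, self-contained sketch. The list below is a hypothetical example built from the pBR322 values used in the data frame section of this tutorial.

```r
# A list can mix components of different types and lengths
plasmid <- list(name = "pBR322", size = 4361, markers = c("Amp", "Tet"))

plasmid[[2]]             # by position: the second component
plasmid$markers[2]       # by name, then by index within the component
plasmid["size"]          # single brackets return a list of length 1
plasmid[["size"]]        # double brackets return the component itself

plasmid$ori <- "ColE1"   # assigning to a new name adds a component
names(plasmid)
length(plasmid)
```

The difference between single and double brackets matters as soon as you compute with the result: plasmid[["size"]] + 1 works, whereas plasmid["size"] + 1 reports an error.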
Data frames
Data frames combine features of lists and matrices. They are one of the most important data objects in R,
because the result of reading an input file is usually a data frame. Let's create a little data file and save it in the
current working directory. You can use the "New document" command from the menu and save the following
data as e.g. vectors.tsv (for "tab separated values").
Name	Size	Marker	Ori	Sites
pUC19	2686	Amp	ColE1	EcoRI, SacI, SmaI, BamHI, XbaI, PstI, HindIII
pBR322	4361	Amp, Tet	ColE1	EcoRI, ClaI, HindIII
pACYC184	4245	Tet, Cam	p15A	ClaI, HindIII
This data set uses tabs as column separators and it has a header line. Similar files can be exported from Excel or
other spreadsheet programs. Read this as a data frame as follows:
Vectors <- read.table("vectors.tsv", sep="\t", header=TRUE)
Vectors
You can edit the data through a spreadsheet-like interface with the edit() function.
V2 <- edit(Vectors)
Here is a collection of examples of subsetting data from this frame:
Vectors[1, ]
Vectors[2, ]
Vectors[ ,2 ]

Vectors$Name

Vectors$Size > 3000
Vectors$Name[Vectors$Size > 3000]
Vectors$Name[Vectors$Ori != "ColE1"]

Vectors[order(Vectors$Size), ]

grep("Tet", Vectors$Marker)
Vectors[grep("Tet", Vectors$Marker), ]
Vectors[grep("Tet", Vectors$Marker), "Ori"]
as.vector(Vectors[grep("Tet", Vectors$Marker), "Ori"])
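If you would rather not create the file by hand, the same kind of data can be built directly with data.frame() and written out with write.table(). The sketch below does this for four of the columns and reads the file back; the variable names V and V2 and the use of the temporary directory are choices made only for this example.

```r
# Build the data frame directly (the Sites column is omitted for brevity)
V <- data.frame(
    Name   = c("pUC19", "pBR322", "pACYC184"),
    Size   = c(2686, 4361, 4245),
    Marker = c("Amp", "Amp, Tet", "Tet, Cam"),
    Ori    = c("ColE1", "ColE1", "p15A"),
    stringsAsFactors = FALSE
)

# Write it as tab separated values, then read it back
tsv_file <- file.path(tempdir(), "vectors.tsv")
write.table(V, tsv_file, sep = "\t", quote = FALSE, row.names = FALSE)
V2 <- read.table(tsv_file, sep = "\t", header = TRUE, stringsAsFactors = FALSE)

str(V2)                          # compact overview of columns and their types
V2$Name[V2$Size > 3000]          # names of all vectors larger than 3000 bp
V2[grep("Tet", V2$Marker), ]     # all rows with a tetracycline marker
```

Setting stringsAsFactors = FALSE keeps text columns as plain character strings, which makes the behaviour the same across R versions.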

Writing your own functions
Writing your own functions in R is easy and gives you access to flexible, powerful and reusable solutions.
Functions are assigned to function names and invoking the function returns some value, vector or other data
object.
lg <- function(x) { log(x) / log(10) }
lg(10000) # should be 4
We can use loops and control structures inside functions. For example the following creates a series of
Fibonacci numbers.
fib <- function(n) {
    if (n < 1) { return( c(0) ) }
    else if (n == 1) { return( c(1) ) }
    else if (n == 2) { return( c(1, 1) ) }
    else {
        v <- c(1, 1)
        for ( i in 3:n ) {
            v <- c(v, v[length(v)-1] + v[length(v)])
        }
        return( v )
    }
}
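As a variation on the same idea, the sketch below allocates the result vector once and fills it by index instead of growing it with c() in every iteration. The name fib_prealloc is hypothetical, and unlike the function above it returns an empty vector for n < 1; this is one possible design, not the only reasonable one.

```r
fib_prealloc <- function(n) {
    # nothing to compute for n < 1: return an empty numeric vector
    if (n < 1) { return( numeric(0) ) }

    v <- numeric(n)                    # allocate all n slots at once
    v[seq_len(min(n, 2))] <- 1         # the first one or two values are 1

    if (n > 2) {
        for (i in 3:n) {
            v[i] <- v[i - 1] + v[i - 2]    # each value is the sum of the two before it
        }
    }
    return( v )
}

fib_prealloc(1)
fib_prealloc(8)
```

The explicit check for n > 2 matters because in R the expression 3:n counts downwards when n is smaller than 3.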
This concludes our introduction to R.

Notes
1. This is the CRAN mirror site at the University of Toronto; any other mirror site will do. You may
access a choice of mirror sites from the R-project homepage (http://r-project.org).
2. Graphical User Interface
3. ... at least that's how I believe it works. If a Windows user would care to confirm this, I would
appreciate it.
4. Actually, the first script to run is Rprofile.site which is found on Linux and Windows machines in the
C:\Program Files\R\R-{version}\etc directory. But not on Macs.

Further reading and resources
R on Wikipedia (http://en.wikipedia.org/wiki/R_(programming_language))
Introduction to R at CRAN (http://cran.r-project.org/doc/manuals/R-intro.html)
User-contributed documents about R at CRAN (http://cran.r-project.org/other-docs.html)
including for example E. Paradis' R for Beginners and J. Lemon's Kickstarting R.
The "Task-views" section of CRAN (http://cran.r-project.org/web/views/) : thematically
organized collections of R-packages.
The "Views" section of Bioconductor
(http://www.bioconductor.org/packages/release/BiocViews.html) , and
Bioconductor annotated workflows. (http://www.bioconductor.org/help/workflows/)
Quick-R how-to's (http://www.statmethods.net/index.html)
R tagged questions on stackoverflow. (http://stackoverflow.com/tags/r/info)
Cross Validated statistics questions on stackexchange. (http://stats.stackexchange.com/)

Retrieved from "http://biochemistry.utoronto.ca/undergraduates/courses/BCH441H/wiki/index.php/R_tutorial"
Categories: Applied Bioinformatics | R
This page was last modified on 17 February 2014, at 00:01.
