
LECTURE 1

The R Language.
Installing R, packages, R objects (vectors, factors, matrices, data frames)

Bibliography:
1. B. S. Everitt, T. Hothorn, A Handbook of Statistical Analyses Using R, Chapman & Hall, 2009
2. M. J. Crawley, The R Book, Wiley, 2013
3. P. Dalgaard, Introductory Statistics with R, Springer, 2008
4. L. Wasserman, All of Statistics: A Concise Course in Statistical Inference, Springer Texts
in Statistics, 2004

Course contents:
- introduction to the R language
- point estimation
- maximum likelihood method, least squares method
- confidence intervals
- statistical hypothesis testing
- tests on the mean (t.test, wilcox.test) and on the variance (var.test, bartlett.test)
- correlation coefficients (Pearson, Spearman, Kendall); tests on the correlation coefficient
  (cor.test)
- ANOVA (pairwise.t.test, oneway.test, kruskal.test, friedman.test)
- prop.test, binom.test, chisq.test, fisher.test, prop.trend.test for frequency tables
- goodness-of-fit tests

The R Language.

- R is a high-level language and environment for data analysis and graphics in statistics.
- The root of R is the S language, developed by John Chambers and colleagues (Becker,
  Wilks) at Bell Labs starting in the mid-1970s.
- The R language is very similar in appearance to S.
- Why R: cutting-edge statistical techniques are developed in R; it is widely used (a
  large proportion of the world's leading statisticians use R); it helps in understanding
  the literature; good back-up and support are available; you can write your own
  functions; user-contributed extensions; open source.
- The base distribution of R and a large number of user-contributed extensions are available
  under the terms of the Free Software Foundation's GNU General Public License in
  source code form. This licence has two major implications for the data analyst working
  with R: the complete source code is available, so the practitioner can investigate
  the details of the implementation of a particular method, and the practitioner can make
  changes and distribute modifications to colleagues.
- R is widely used for teaching undergraduate and graduate statistics classes at
  universities all over the world, in part because it is free.
- The base distribution of R is maintained by a small group of statisticians, the R
  Development Core Team.
- A huge amount of additional functionality is implemented in add-on packages authored
  and maintained by a large group of volunteers.
- http://www.R-project.org
  All resources are available from this page: the R system itself, a collection of add-on
  packages, manuals, documentation and more.
- Commands are typed at the command line; a menu is also available.

The R system for statistical computing consists of two major parts: the base system and a
collection of user contributed add-on packages. The R language is implemented in the base
system. Implementations of statistical and graphical procedures are separated from the base
system and are organised in the form of packages. A package is a collection of functions,
examples and documentation. The functionality of a package is often focused on a special
statistical methodology. Both the base system and packages are distributed via the
Comprehensive R Archive Network (CRAN) accessible under
http://CRAN.R-project.org

Installing R:
- http://www.R-project.org -> Download -> CRAN mirror -> Download R for Windows ->
  base -> Download R 3.3.1 for Windows
- The same page links to Manuals and The R Journal (which covers add-on packages,
  programmer's niche, help desk, applications)
- To discuss: the Manuals page and what it contains; look through one of the books

Help in R:

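> help()
> help(tapply)
> help.search("lapply") # search the help pages for a string
> ?mean
> help(package = "AssocTests") # the package must be installed
> vignette(package = "animation") # list the vignettes of an installed package
> find("z.test") # where on the search path this name is defined
> apropos("lm") # names of objects that contain "lm"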

Examples and demos:
> example(mean)
> demo(graphics)
Setting the working directory:
- getwd()
- setwd("path/to/directory")
- from the menu
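
A minimal sketch of changing the working directory and restoring it afterwards. The session's temporary directory (tempdir()) stands in for a real project folder, and the file name note.txt is made up; both are assumptions chosen only so the example runs anywhere.

```r
# Remember the current working directory so it can be restored later.
old_dir <- getwd()

# tempdir() stands in for a real project folder (assumption for illustration).
setwd(tempdir())

# Relative paths are now resolved inside the new working directory.
writeLines("hello", "note.txt")
found <- file.exists("note.txt")

# Clean up and return to the original directory.
file.remove("note.txt")
setwd(old_dir)
```

Saving the old directory first makes the change easy to undo at the end of a script.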

Objects in R:
> objects()
Packages in R:
> install.packages("sandwich") # install a package
> library("sandwich") # attach the package
List of all packages: https://cran.r-project.org/web/packages/
Task Views on the main page: packages grouped by subject area.
The base distribution comes with these packages already installed:
boot, nlme, KernSmooth, MASS, base, class, cluster, datasets, foreign,
grDevices, graphics, grid, lattice, methods, mgcv, nnet, rpart, spatial,
splines, stats, stats4, survival, tcltk, tools, utils

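> library(help = spatial) # summary of the package contents
> library(spatial) # the package must be attached for the next line to work
> objects(grep("spatial", search())) # objects exported by the attached package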
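
A common pattern is to install a package only when it is missing and then attach it. The sketch below wraps that in a helper called load_package, which is a hypothetical name and not part of R. It is demonstrated with stats4, which ships with the base distribution, so no download is triggered.

```r
# Hypothetical helper (not part of R): attach a package, installing it first
# if it is not already available.
load_package <- function(pkg) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    install.packages(pkg)   # needs an internet connection and a CRAN mirror
  }
  library(pkg, character.only = TRUE)
}

# "stats4" is already installed with R, so this only attaches it.
load_package("stats4")

# Check that the package now appears on the search path.
attached <- "package:stats4" %in% search()
attached
```

For a contributed package such as sandwich, the same call would install it on first use.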

Command line vs scripts:
- Scripts in R: File -> New script
- You can type and edit in the script window; to execute a line or group of lines,
  highlight them and press Ctrl+R (the Control key and R together). The lines are
  automatically transferred to the command window and executed.
- Script files use the .R file extension.
- File -> Open script
- Editors with more features: Tinn-R ("this is not notepad" for R) is very good,
  or you might like to try RStudio, which has the nice feature of allowing you to scroll
  back through all of the graphics produced in a session. These and others are free to
  download from the web.
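
Besides highlighting lines in the editor, a whole script file can be run from the command line with source(). A small sketch follows; the script contents and the temporary file are made up for illustration.

```r
# Write a two-line script to a temporary .R file
# (a stand-in for a script saved from the editor).
script_path <- tempfile(fileext = ".R")
writeLines(c("a <- 2 + 3", "b <- a^2"), script_path)

# Run every line of the script, as if it had been typed at the prompt.
source(script_path)

# The objects created by the script are now in the workspace.
b
```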

Data Import and Export

> csvForbes2000 <- read.table("Forbes2000.csv", header = TRUE,
+ sep = ",", row.names = 1) # also works for .txt files
- read.csv(): the same as read.table(), with header = TRUE and sep = "," as defaults
> write.table(Forbes2000, file = "Forbes2000.csv",
+ sep = ",", col.names = NA)

> data("Forbes2000", package = "HSAUR") # the world's 2000 leading companies (2004)

> str(Forbes2000) # what this data frame contains
> class(Forbes2000) # the type of the object
> dim(Forbes2000)
> nrow(Forbes2000)
> ncol(Forbes2000)
> names(Forbes2000)
> class(Forbes2000[, "rank"])
> table(Forbes2000[, "category"]) # frequency table of one column
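
The export and import calls above can be tried without the Forbes2000 file. The sketch below uses a small made-up data frame (the firm names and figures are hypothetical) and a temporary file, writes it with write.table() and reads it back with read.table().

```r
# A small made-up data frame with row names (hypothetical values).
firms <- data.frame(sales   = c(94.7, 134.2, 76.7),
                    profits = c(17.9, 15.6, 6.5),
                    row.names = c("FirmA", "FirmB", "FirmC"))

# Export: col.names = NA writes an empty header cell above the row names,
# so the header lines up with the data columns.
csv_path <- tempfile(fileext = ".csv")
write.table(firms, file = csv_path, sep = ",", col.names = NA)

# Import: the first column of the file holds the row names.
firms2 <- read.table(csv_path, header = TRUE, sep = ",", row.names = 1)

str(firms2)               # structure of the data frame
dim(firms2)               # number of rows and columns
all.equal(firms, firms2)  # compare the original with the re-imported copy
```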

# comments in R

+ continuation prompt: the command is not yet complete
R is case-sensitive
Arithmetic with R
+, -, *, /
^, %%, %/%
> 2+3*5
> 2/3
> 2^(-5)
> 17%%5 # remainder of the division (17 modulo 5)
> 17%/%5 # integer division (the quotient of 17 divided by 5)
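
The two operators are linked by the identity a = (a %/% b) * b + a %% b. A short sketch that checks it, with an extra negative dividend added here for illustration:

```r
a <- 17
b <- 5
q <- a %/% b     # integer quotient: 3
r <- a %% b      # remainder: 2
q * b + r == a   # TRUE

# With a negative dividend the quotient is rounded down,
# so the remainder takes the sign of the divisor.
(-17) %/% 5      # -4
(-17) %% 5       # 3
```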
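R calculates to a high precision, but by default displays only 7 significant digits. You can change
the display to x digits using options(digits = x)
> 1/3
> options(digits = 16)
> 1/3
Rounding functions, for a numeric x:
> floor(x) # largest integer not greater than x
> ceiling(x) # smallest integer not less than x
> round(x, digits = 2) # round to 2 decimal places
> trunc(x) # drop the fractional part (round towards zero)
> signif(x, digits = 2) # round to 2 significant digits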
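
A sketch of the rounding functions on concrete values (the numbers are chosen here for illustration). It also shows that options(digits = ) changes only how a number is printed, not the stored value.

```r
x <- 3.14159
floor(x)                     # 3
ceiling(x)                   # 4
round(x, digits = 2)         # 3.14
signif(123456, digits = 2)   # 120000

# floor() and trunc() differ for negative numbers.
floor(-2.7)                  # -3
trunc(-2.7)                  # -2

# Change the display precision, then restore the previous setting.
old <- options(digits = 3)
third <- 1/3
print(third)                 # printed with 3 significant digits
options(old)
print(third, digits = 16)    # the full stored precision is still available
```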
Variables (assignment in R): <-, -> or =
> x <- 100
> x
> (1 + 1/x)^x
> x <- 200

> (1 + 1/x)^x
> x <- x + 1 # the expression on the right is evaluated first
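
The expression (1 + 1/x)^x approaches the number e as x grows. The sketch below shows the three assignment forms and then evaluates the expression on a grid of values; the grid n is an illustrative choice.

```r
# Three equivalent ways to assign the value 100.
x <- 100
100 -> y
z = 100

# (1 + 1/n)^n for growing n, computed in one vectorised step.
n <- 10^(1:6)
approx_e <- (1 + 1/n)^n
approx_e
exp(1)        # the limit, e

# The right-hand side is evaluated first, then stored back in x.
x <- x + 1    # x is now 101
```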
Mathematical functions

Objects in R
- Vectors = indexed lists of values (finite ordered collections of elements); of 3 types
  (numeric, logical, character string)

Basic functions for constructing vectors: c() # combine, seq(from, to, by), rep(x, times).
> x <- seq(1, 20, by = 2)
> y <- rep(3, 4)
> z <- c(x, y)
The : operator
> x <- 100:110
> y <- 10:1

> length(x)
> x[2]
> x[-2] # all elements except the second
> x[c(2, 4, 6)]
> x+2; x*2; x^2
> x^y # x has 11 elements and y has 10, so y is recycled and R issues a warning
Useful functions for vectors: sum(...), prod(...), max(...), min(...), sqrt(...), sort(x).
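
A short sketch that puts the constructors, indexing and element-wise arithmetic together. The recycling and sort() examples use small vectors chosen here for illustration.

```r
x <- seq(1, 20, by = 2)   # 1 3 5 ... 19 (10 elements)
y <- rep(3, 4)            # 3 3 3 3
z <- c(x, y)              # x followed by y, 14 elements

x[c(2, 4, 6)]             # 3 7 11
x[-2]                     # everything except the second element
sum(x)                    # 100

# Arithmetic is element-wise; a shorter vector is recycled.
1:6 + c(10, 20)           # 11 22 13 24 15 26
sort(c(3, 1, 2))          # 1 2 3
```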
Missing data: NA
> a <- c(2, 3, NA)
> mean(a) # NA, because one element is missing
> mean(a, na.rm = TRUE) # missing values are removed before averaging
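
Missing values can be located with is.na() before deciding how to treat them. A minimal sketch on the same vector:

```r
a <- c(2, 3, NA)
is.na(a)                # FALSE FALSE TRUE
sum(is.na(a))           # 1 missing value
mean(a)                 # NA
mean(a, na.rm = TRUE)   # 2.5
a[!is.na(a)]            # 2 3: keep only the observed values
```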
Logical vectors:
> x <- c(TRUE, TRUE, FALSE, TRUE, FALSE, FALSE)
> y <- c(5, -3, 4, 1, -9, 6)
> z <- (y > 3) # generates a logical vector
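
Logical vectors are mostly used to select elements and to count how many satisfy a condition. A sketch with the same x and y:

```r
x <- c(TRUE, TRUE, FALSE, TRUE, FALSE, FALSE)
y <- c(5, -3, 4, 1, -9, 6)
z <- (y > 3)   # TRUE FALSE TRUE FALSE FALSE TRUE

y[z]           # 5 4 6: elements of y where z is TRUE
which(z)       # 1 3 6: positions where z is TRUE
sum(z)         # 3: TRUE counts as 1, FALSE as 0

x & z          # element-wise AND: TRUE FALSE FALSE FALSE FALSE FALSE
x | z          # element-wise OR:  TRUE TRUE TRUE TRUE FALSE TRUE
```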

Factors
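
A factor stores categorical data as integer codes together with a set of levels. A minimal sketch using factor(), levels() and table(); the blood-group data are made up for illustration.

```r
# Made-up categorical data (assumption for illustration).
blood <- c("A", "B", "A", "O", "AB", "O", "A")

f <- factor(blood)
levels(f)        # the distinct categories, sorted by default
as.integer(f)    # the integer codes stored internally
table(f)         # number of observations per level

# The order of the levels can be set explicitly.
f2 <- factor(blood, levels = c("O", "A", "B", "AB"))
levels(f2)       # "O" "A" "B" "AB"
```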

Matrices = two-dimensional arrays holding data of a single type

a) starting from a vector
> v <- 1:12
> dim(v) <- c(3, 4) # the matrix is filled column by column
> v
b) rbind(), cbind()
> v1 <- c(2, 3, 4); v2 <- c(5, 8, 7)
> rbind(v1, v2)

> cbind(v1, v2)
c) matrix(data, nrow = n, ncol = m, byrow = TRUE); if byrow = TRUE is omitted, the elements are
placed column by column
> m1 <- matrix(1:12, nrow = 3)
> m2 <- matrix(1:12, nrow = 3, byrow = TRUE)
> rownames(m1) <- c("L1", "L2", "L3")
> colnames(m2) <- c("col1", "col2", "col3", "col4") # m2 has 4 columns, so 4 names are needed
> m1; m2
> m2[2, 3]
> m2[, 1]
> m2[1, ]
> m2[-2, ]; m2[, -3]; m2[-2, -3]
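
The difference between filling by column and by row is easiest to see by comparing the same entry in both matrices. A sketch under the same definitions; naming the rows of m2 is added here for illustration.

```r
m1 <- matrix(1:12, nrow = 3)                 # filled column by column
m2 <- matrix(1:12, nrow = 3, byrow = TRUE)   # filled row by row
m1[2, 3]           # 8
m2[2, 3]           # 7

# Once rows and columns are named, entries can be selected by name.
rownames(m2) <- c("L1", "L2", "L3")
colnames(m2) <- c("col1", "col2", "col3", "col4")
m2["L2", "col3"]   # 7

# Negative indices drop rows or columns.
m2[-2, -3]         # rows 1 and 3, columns 1, 2 and 4

# rbind() stacks vectors as rows, cbind() as columns.
v1 <- c(2, 3, 4); v2 <- c(5, 8, 7)
dim(rbind(v1, v2)) # 2 3
dim(cbind(v1, v2)) # 3 2
```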

Data frames=
