R Programming Exam With Solutions

Modern Statistical Computing in R
Instructors Albert Satorra and Ferran Carrascosa

July 26, 15:30 to 17:00, 2018, UPF (4nd quarter)
This is the Final Exam of Modern Statistical Computing in R. The exam is composed of 10 short queries
(Q1 to Q10) plus one Question. Queries and Question have a context that is implicitly defined by the related
R syntax. Be very careful with the syntax! Q1-Q10 account for 80% of the grade,
the Question for the remaining 20%. Time for this exam is 1:30h.
Q1
x<- ("A","B","C")
is.numeric(x)
The result of the above R syntax will be:

1. TRUE
2. FALSE
3. there is a syntax error
Write the response number of your choice with a brief justification comment (one/two lines)
-
- > x<- ("A","B","C")
Error: unexpected ',' in "x<- ("A","
Option 3 because it should have been x<-c("A","B","C"). (student text in the exam 26-7-2018)
-
-
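As a minimal sketch of the point made in the answer, the snippet below uses the corrected assignment with c() and checks the type of the resulting vector. The object name parsed is introduced here only for illustration; it captures the parse failure of the original line.

```r
# Corrected version of the Q1 assignment: c() combines the values into a vector.
x <- c("A", "B", "C")

is.numeric(x)    # FALSE: the elements are character strings
is.character(x)  # TRUE
class(x)         # "character"

# The original line already fails when R parses it,
# so is.numeric() is never reached.
parsed <- try(parse(text = 'x <- ("A","B","C")'), silent = TRUE)
inherits(parsed, "try-error")  # TRUE
```

So even with the syntax repaired, is.numeric(x) would return FALSE.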
Q2
x<- rep(1,10)
sd(x)
[1] 0
The result of the above R syntax will be:
1. 0
2. 1
-
-Option 1 because x is a vector of ten 1's; since all the values are
the same, the variance (and hence the standard deviation) is 0.
(student text in the exam 26-7-2018)
-
-
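The reasoning in the answer can be checked by computing the sample standard deviation by hand. This is only a sketch; the names n and sd_manual are introduced for illustration.

```r
x <- rep(1, 10)          # ten copies of the value 1
n <- length(x)

# Sample standard deviation by hand: sqrt(sum((x - mean)^2) / (n - 1)).
# Every deviation from the mean is 0, so the result is 0.
sd_manual <- sqrt(sum((x - mean(x))^2) / (n - 1))

sd_manual == sd(x)       # TRUE, both are 0
```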
Q3
Consider the following syntax (which simulates data for x and y) and the regression analysis results on that
data
set.seed(3984) ; # seed number
x<-runif(1000)
u<-runif(1000)-0.5
y <- 0+ .3*x + 0.6*u
plot(x,y)
[Figure: scatter plot of y against x]
res<- lm(y~x)
A<-coefficients(summary(res))
A
                Estimate Std. Error    t value     Pr(>|t|)
(Intercept) -0.009588047 0.01088475 -0.8808694 3.786006e-01
x            0.321875626 0.01869217 17.2198105 2.210016e-58
e<-res$residuals
qqnorm(e)
qqline(e, col="red")
[Figure: Normal Q-Q plot of the residuals e; x-axis "Theoretical Quantiles", y-axis "Sample Quantiles", with the red reference line drawn by qqline()]

One of our colleagues claims: “In this regression the standard error cannot be trusted since the residuals are clearly
non-normal”. To check this further, we run the following syntax:
MT<-c()
for (i in 1:1000){
x<-runif(1000)
u<-runif(1000)-0.5
y <- 0+ .3*x + 0.6*u
res<- lm(y~x)
A<-coefficients(summary(res))[2,1]
MT<-c(MT,A)
}
sd(MT)
[1] 0.018433
In view of all the above, we can say that the statement of our colleague
1. is correct
2. is not correct
3. the value obtained of sd(MT) has nothing to do with the statement of our colleague.
-
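-Option 2. By Monte Carlo simulation we estimate the coefficient beta1 one thousand times, each time on newly simulated data.
After that we compute the standard deviation of those estimates.
It (0.0184) is nearly the same as the standard error computed by the regression (0.0187),
thus we can trust the regression.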
-
-
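The comparison behind this answer can be sketched more compactly with replicate(). The helper simulate_fit and the choice of 500 replications are assumptions made for this illustration, not part of the exam code.

```r
set.seed(3984)  # same seed as in Q3; any seed would do

# One simulated data set and the coefficient table of its lm() fit, as in Q3.
simulate_fit <- function(n = 1000) {
  x <- runif(n)
  u <- runif(n) - 0.5          # uniform (clearly non-normal) errors
  y <- 0 + 0.3 * x + 0.6 * u
  coefficients(summary(lm(y ~ x)))
}

# Standard error of the slope as reported by a single regression.
se_reported <- simulate_fit()["x", "Std. Error"]

# Empirical standard deviation of the slope over 500 fresh data sets.
slopes <- replicate(500, simulate_fit()["x", "Estimate"])
se_monte_carlo <- sd(slopes)

c(reported = se_reported, monte_carlo = se_monte_carlo)
```

If the two numbers are close, the reported standard error describes the actual sampling variability of the slope well, even though the errors are uniform rather than normal.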
Q4
Executing the syntax below produces
X<-matrix(rbinom(2*1000,4,.2),1000,2)
M<-t(X)%*%X
iM<- solve(M)
sum(diag(iM%*%M))
[1] 2
1. a random number
2. 1000
3. 2
-
-
-Option 3 because iM%*%M is the 2x2 identity matrix, so the trace sums two ones. (student text in the exam 26-7-2018)
-
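A short check of this argument, assuming an arbitrary seed for reproducibility (the name I2 is introduced here for illustration): the product of a matrix with its inverse is numerically the identity, so its trace equals the number of columns of X.

```r
set.seed(1)  # arbitrary seed, only for reproducibility
X  <- matrix(rbinom(2 * 1000, 4, 0.2), 1000, 2)
M  <- t(X) %*% X          # 2 x 2 cross-product matrix
iM <- solve(M)            # its inverse

I2 <- iM %*% M            # numerically the 2 x 2 identity matrix
all.equal(I2, diag(2), check.attributes = FALSE)  # TRUE up to rounding error
sum(diag(I2))             # trace: one for each column of X
```

The data in X are random, but the trace does not depend on them as long as M is invertible.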
Q5
In relation to the y and x above in Q3
library(MASS)
boots<-c()
for (i in 1:300){
n<- length(y)
ind<- sample(1:n,n)
res<- rlm(y[ind]~x[ind])
A<-coefficients(summary(res))[2,2]
boots<- c(boots,A)
}
sd(boots)
[1] 9.181486e-18
Be careful with the sample() syntax. The value produced by the syntax will be
1. approximately .018
2. 0 (at the machine level )
3. approximately .0.31
-
-
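-Option 2 because the sample function is called without replacement: each ind is only a permutation of 1:n, so every fit uses the same data and returns the same standard error.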
-
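The effect of the replace argument can be sketched as follows. This illustration departs from the exam code in two labeled ways: it uses lm() instead of rlm() to stay within base R, and it collects the slope estimate rather than the standard error column. The helper slope_on is hypothetical.

```r
set.seed(3984)
n <- 1000
x <- runif(n)
u <- runif(n) - 0.5
y <- 0 + 0.3 * x + 0.6 * u

# Slope estimate from lm() fitted on the rows selected by `ind`.
slope_on <- function(ind) unname(coef(lm(y[ind] ~ x[ind]))[2])

# Without replacement: each `ind` is just a reordering of 1:n.
perm_slopes <- replicate(300, slope_on(sample(1:n, n)))

# With replacement: a genuine bootstrap resample.
boot_slopes <- replicate(300, slope_on(sample(1:n, n, replace = TRUE)))

sd(perm_slopes)   # essentially 0: the same data are refitted every time
sd(boot_slopes)   # a usable bootstrap estimate of the slope's standard error
```

Only the second loop produces variation between fits, which is what a bootstrap needs.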
Q6
The following syntax produces
data<-sample(1:4, 100, replace=TRUE)
prop.table(table(data))
data
1 2 3 4
0.22 0.20 0.26 0.32
The value obtained will be of the form
1.
data
1 2 3 4
0.23 0.27 0.23 0.27
2.
data
1 2 3 4
23 27 23 27
3. There is a syntax error

-
- Option 1 because it is supposed to give the proportions of each value in the variable "data"
-
-
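The difference between options 1 and 2 is the difference between prop.table(table(data)) and table(data) alone. A minimal sketch, with an arbitrary seed and the illustrative names counts and props:

```r
set.seed(10)  # arbitrary seed
data <- sample(1:4, 100, replace = TRUE)

counts <- table(data)          # absolute frequencies, they add up to 100
props  <- prop.table(counts)   # relative frequencies, they add up to 1

sum(counts)
sum(props)
all.equal(as.numeric(props), as.numeric(counts) / 100)  # TRUE
```

Because the sample is random, the individual proportions change from run to run; only their form (values between 0 and 1 that sum to 1) is fixed.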
Q7
The syntax
data<- 1:30
a<- data > 3
is.numeric(a)
[1] FALSE
results in
1. FALSE
2. TRUE
-
- Option 1 because the comparison stores a logical value in "a" for every element of data,
answering the question with TRUE's and FALSE's; such a vector is not numeric, it is logical.
-
-
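A brief sketch of the same idea: the comparison yields a logical vector, which only becomes numeric once arithmetic is applied to it.

```r
data <- 1:30
a <- data > 3        # element-wise comparison gives a logical vector

is.numeric(a)        # FALSE
is.logical(a)        # TRUE
sum(a)               # 27: TRUE counts as 1, so this counts the values above 3
is.numeric(sum(a))   # TRUE: the result of the arithmetic is numeric
```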
Q8
The following syntax
length(rep(1:3,3))
[1] 9
results in
1. 3
2. 9
3. 6
Write the response number of your choice with a brief justification comment (one/two lines)
-
-
- Option 2 because rep(1:3,3) repeats 1:3 three times, which gives a vector of 9 elements.
-
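For illustration, the sketch below contrasts the times behaviour used in Q8 with the each argument of rep(); the names v_times and v_each are introduced here only for the example.

```r
v_times <- rep(1:3, 3)          # second argument is `times`: 1 2 3 1 2 3 1 2 3
v_each  <- rep(1:3, each = 3)   # `each` repeats element by element: 1 1 1 2 2 2 3 3 3

length(v_times)   # 9
length(v_each)    # 9
identical(sort(v_times), v_each)  # TRUE: same values, different order
```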
Q9
The following R object
> da
x fx
1 1 1
2 1 1
3 2 2
4 3 3
results in
> mean(da$x)
[1] 1.75
> mean(da$fx)
[1] NA
Warning message:
In mean.default(da$fx) : argument is not numeric or logical: returning NA
>
This implies that

is.matrix(da)
results in
1. TRUE
2. FALSE
3. error of computation
-
-
- Option 2 because "da" contains both numeric and factor variables,
so it cannot be a matrix; it is a data frame.
-
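The exam does not show how da was built, so the following is a hypothetical reconstruction that reproduces the printed behaviour: x is numeric and fx is assumed to be a factor with the same labels.

```r
# Hypothetical reconstruction of `da`: x numeric, fx a factor.
da <- data.frame(x = c(1, 1, 2, 3), fx = factor(c(1, 1, 2, 3)))

is.matrix(da)       # FALSE
is.data.frame(da)   # TRUE
mean(da$x)          # 1.75

# mean() on the factor returns NA with a warning, as in the exam output.
m_factor <- suppressWarnings(mean(da$fx))
is.na(m_factor)     # TRUE

# Converting the factor labels back to numbers recovers the mean.
mean(as.numeric(as.character(da$fx)))  # 1.75
```

A matrix holds a single data type, whereas a data frame can mix a numeric column with a factor column, which is why the two calls to mean() behave differently.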
Q10
With respect to the number P computed in the following syntax
data<-sample(1:10,1000,rep=TRUE)
O<-table(data)
E<-1000*rep(1/10,10)
T<- sum(((O-E)**2)/E)
P<-1-pchisq(T,9)
[1] 0.2077303
1. It is very likely, probability 95%, that P > 0.05.
2. It is very likely that P > 0.95.
3. P follows a chi-square distribution with 9 degrees of freedom
-
- Option 1. The null hypothesis for the chi-square goodness of fit test holds in this case, so P
has approximately a uniform distribution in the interval (0,1);
so it is approximately 95% probable to be above .05.
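The claim that P is approximately uniform can be checked by repeating the Q10 computation many times. This is a sketch under stated assumptions: the helper one_p, the seed and the 2000 replications are choices made for the illustration, and the statistic is named stat rather than T to avoid masking the shorthand for TRUE.

```r
set.seed(2018)  # arbitrary seed

# One goodness-of-fit P-value, computed as in Q10.
one_p <- function(n = 1000, k = 10) {
  data <- sample(1:k, n, replace = TRUE)
  O <- table(factor(data, levels = 1:k))  # keep all k cells, even if one is empty
  E <- n * rep(1 / k, k)                  # expected counts under equal probabilities
  stat <- sum((O - E)^2 / E)
  1 - pchisq(stat, k - 1)
}

P <- replicate(2000, one_p())
mean(P > 0.05)   # close to 0.95 when the null hypothesis holds
```

A histogram of P (for example with hist(P)) should look roughly flat over the interval (0,1).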
Question
The file logitConditionalEffectPlot.R, on the course website, models the variable Vot (voting yes/no for
party A) as a function of the log of income (Lrenda) and gender (Genere; Gender = 0 is male). Below is the
syntax and its execution in the R console.
# source("/Users/albert/FUNCTIONS/logitConditionalEffectPlot.R")
library(foreign)
data<- as.data.frame(read.spss("http://84.89.132.1/~satorra/dades/M2014dadesSIM.sav"))
dim(data)
[1] 800 4
head(data)
  Lrenda Ldespeses Genere Vot
1  9.477     4.503      1   1
2 11.435     6.147      1   0
3 10.686     4.961      0   0
4 10.407     3.993      0   0
5 10.814     5.746      0   0
6  9.944     4.950      0   1
attach(data)
# this is my syntax for the conditional effect plot

reg <- glm(Vot ~ Lrenda , binomial)
summary(reg)
Call:
glm(formula = Vot ~ Lrenda, family = binomial)

Deviance Residuals:
    Min       1Q   Median       3Q      Max
-2.5206  -0.9692   0.4540   0.9087   2.5511

Coefficients:
            Estimate Std. Error z value Pr(>|z|)
(Intercept)   12.389      1.027   12.07   <2e-16 ***
Lrenda        -1.208      0.101  -11.96   <2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

(Dispersion parameter for binomial family taken to be 1)

    Null deviance: 1101.02  on 799  degrees of freedom
Residual deviance:  900.73  on 798  degrees of freedom
AIC: 904.73

Number of Fisher Scoring iterations: 4

beta<-reg$coefficients
x<-seq(min(Lrenda)-3,max(Lrenda)+2, length=200)
Logit <- beta[1] + beta[2]*x

prob <- 1/(1 + exp(-Logit))
reg3 <- glm(Vot ~ Lrenda + Genere, binomial)

beta3 <- reg3$coefficients
# x<-seq(min(Lrenda)-3,max(Lrenda)+2, length=200)
Logit1 <- beta3[1] + beta3[2]*x+ beta3[3]*1
Logit0 <- beta3[1] + beta3[2]*x+ beta3[3]*0
prob1 <- 1/(1 + exp(-Logit1))
prob0 <- 1/(1 + exp(-Logit0))
plot(Lrenda, Vot , main ="conditional effect plot: Vot vs Lrenda + Gender ")
lines(x, prob, col="gray", lwd=3)
lines(x, prob0, col="blue", lwd=3)
lines(x, prob1, col="orange", lwd=3)
abline(v = 10, lty = 3, lwd=0.8)
abline(v = 13, lty = 3, lwd=0.8)
legend(11, 0.9, c("Gender=0","Gender=1","overall"), col=c("blue","orange","grey"),lwd=3)
[Figure: "conditional effect plot: Vot vs Lrenda + Gender"; x-axis Lrenda (7 to 13), y-axis Vot (0.0 to 1.0); fitted probability curves for Gender=0, Gender=1 and overall]
Comment briefly on the main actions of the code and the results obtained in the analysis. Use language that
can be understood by a non-statistician. Maximum length one page.
-
-
-
-
-
-
-
-
-
-
-
-
-
-
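The probability curves in the code above are computed by hand as 1/(1 + exp(-Logit)). As a sketch of the mechanics only, the same curve can be obtained with predict(..., type = "response"). The data below are simulated stand-ins, not the course file, and the coefficients used to generate them (in particular the 0.5 for Genere) are assumptions made for this illustration.

```r
set.seed(1)  # arbitrary seed

# Simulated stand-in for the course data (NOT the real M2014dadesSIM.sav file).
n <- 800
Lrenda <- rnorm(n, mean = 10.5, sd = 0.8)
Genere <- rbinom(n, 1, 0.5)
Vot    <- rbinom(n, 1, plogis(12 - 1.2 * Lrenda + 0.5 * Genere))
sim    <- data.frame(Lrenda, Genere, Vot)

reg3 <- glm(Vot ~ Lrenda + Genere, family = binomial, data = sim)

# Grid of log-income values for Genere = 1.
grid <- data.frame(Lrenda = seq(7, 13, length.out = 200), Genere = 1)

# Manual inverse logit, as in the exam code.
b <- coef(reg3)
prob_manual <- 1 / (1 + exp(-(b[1] + b[2] * grid$Lrenda + b[3] * 1)))

# The same curve through predict().
prob_predict <- predict(reg3, newdata = grid, type = "response")

all.equal(as.numeric(prob_manual), as.numeric(prob_predict))  # TRUE
```

Passing data = sim to glm() also avoids the need for attach(), which keeps the variables used in the model explicit.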
