Submitted by
INTRODUCTION TO R
How to install R Studio
Four Panes in R
Import of Data Sheet in Excel
Descriptive statistics
Correlation
Hypothesis Testing: Two sample - Independent sample t test
Hypothesis Testing: Two sample - Paired Sample t test
Hypothesis Testing: One-way ANOVA
Hypothesis Testing: F test
Hypothesis Testing: Chi square test
3) Press enter
STEPS :
3) Press ENTER
3) Press ENTER.
STEPS :
3) Click the fill handle at the corner of the K1 cell that holds the sum and drag down to K9 to get
the corresponding sums across both subjects for all the students.
STEPS :
STEPS :
STEPS :
2) Press ENTER.
1. Select the cell where you want to apply the countif function.
2. Type =COUNTIF(D2:D9,">75").
3. Press ENTER.
STEPS:
1. Select the cell where you want to apply the sumif function.
2. Type =SUMIF(C2:C9,B17,K2:K9)
3. Press ENTER.
SYNTAX: =AVERAGEIF(range,criteria,average_range)
STEPS:
1. Select the cell where you want to apply the averageif function.
2. Type =AVERAGEIF(C2:C9,C2,D2:H9)
3. Press ENTER.
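The three conditional functions above (COUNTIF, SUMIF, AVERAGEIF) follow the same pattern: test each cell of a range against a criterion, then count, sum or average the matching cells. A minimal Python sketch of that logic is given below. The student names, classes and marks are hypothetical values made up for illustration, and the helper names countif, sumif and averageif are not Excel or library functions.

```python
# Hypothetical marks table; names, classes and values are illustrative only.
rows = [
    {"name": "Asha", "cls": "A", "marks": 82},
    {"name": "Ravi", "cls": "B", "marks": 71},
    {"name": "Meera", "cls": "A", "marks": 90},
    {"name": "John", "cls": "B", "marks": 64},
]


def countif(values, predicate):
    """Count the values that satisfy the predicate, like =COUNTIF(range, criteria)."""
    return sum(1 for v in values if predicate(v))


def sumif(keys, criterion, values):
    """Sum the values whose key equals the criterion, like =SUMIF(range, criteria, sum_range)."""
    return sum(v for k, v in zip(keys, values) if k == criterion)


def averageif(keys, criterion, values):
    """Average the values whose key equals the criterion, like =AVERAGEIF(range, criteria, average_range)."""
    matched = [v for k, v in zip(keys, values) if k == criterion]
    if not matched:
        # Excel reports #DIV/0! when no cell matches the criterion.
        raise ZeroDivisionError("no cells match the criterion")
    return sum(matched) / len(matched)


marks = [r["marks"] for r in rows]
classes = [r["cls"] for r in rows]

above_75 = countif(marks, lambda m: m > 75)   # same idea as ">75" in COUNTIF
total_a = sumif(classes, "A", marks)          # total marks of class A
mean_b = averageif(classes, "B", marks)       # average marks of class B
print(above_75, total_a, mean_b)
```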
STEPS :
1) Select the cell where you want to enter the formula and get the result.
3) Press ENTER.
STEPS :
1) Click the cell where the VLOOKUP is to be calculated, i.e., “B29”, and type the formula
for VLOOKUP.
2) Specify the cell in which you will enter the value whose data you’re looking for, i.e., “A29”.
3) Specify the data which you want VLOOKUP to use for its search in the table_array box, i.e.,
“A2:H9”.
4) Specify the column number which VLOOKUP will use to find the relevant data in the
col_index_num box, i.e., “2”.
6) Press ENTER.
1) Click the cell where the HLOOKUP is to be calculated, and type the formula for HLOOKUP.
2) Specify the cell in which you will enter the value whose data you’re looking for, i.e., “N12”.
3) Specify the data which you want HLOOKUP to use for its search in the table_array box, i.e.,
“B12:I14”.
4) Specify the row number which HLOOKUP will use to find the relevant data in the
row_index_num box, i.e., “2”.
6) Press ENTER.
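The lookup steps above can be sketched in Python to show what the two functions do: VLOOKUP searches down the first column of a table and returns a value from the chosen column, while HLOOKUP searches along the first row and returns a value from the chosen row. This sketch covers only the exact-match mode (range_lookup set to FALSE). The tables and the helper names vlookup and hlookup are hypothetical, written for illustration.

```python
# Hypothetical vertical table: the first column holds the lookup keys (roll numbers).
students = [
    [101, "Asha", 82],
    [102, "Ravi", 71],
    [103, "Meera", 90],
]

# Hypothetical horizontal table: the first row holds the lookup keys (subjects).
by_subject = [
    ["Maths", "Science", "English"],
    [82, 75, 68],
]


def vlookup(lookup_value, table, col_index_num):
    """Exact-match search down the first column; returns None where Excel shows #N/A."""
    for row in table:
        if row[0] == lookup_value:
            return row[col_index_num - 1]  # Excel column indexes start at 1
    return None


def hlookup(lookup_value, table, row_index_num):
    """Exact-match search along the first row; returns None where Excel shows #N/A."""
    for col, header in enumerate(table[0]):
        if header == lookup_value:
            return table[row_index_num - 1][col]  # Excel row indexes start at 1
    return None


name = vlookup(102, students, 2)
science = hlookup("Science", by_subject, 2)
print(name, science)
```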
OTHER TOOLS
1. TRANSPOSE TABLE IN EXCEL
STEPS :
1) Select the given data, i.e., A1:H9, and copy it.
2) Select the area where the transpose is to be pasted, i.e., A12:I19.
3) In the “Home” ribbon, go to Paste > Transpose.
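Transposing turns rows into columns, which is why the 9-row by 8-column range A1:H9 needs an 8-row by 9-column paste area. A small Python sketch of the same operation is shown below. The table contents are hypothetical.

```python
# Hypothetical table: 3 rows by 2 columns.
table = [
    ["Name", "Maths"],
    ["Asha", 82],
    ["Ravi", 71],
]

# zip(*table) groups the first items of every row, then the second items, and so on,
# so each original column becomes a row.
transposed = [list(column) for column in zip(*table)]

for row in transposed:
    print(row)
```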
STEPS:
1) Select the data, i.e., the students' marks that are to be highlighted.
2) Go to Home > Conditional Formatting > Top/Bottom Rules > Top 10 Items.
3) Enter 10 to highlight the top 10 values and select Yellow Fill with Dark Yellow Text.
STEPS:
STEPS:
STEPS:
4) In the number tab, select TEXT. In alignment, select BOTTOM. In font select CAMBRIA,
ITALICS and SINGLE UNDERLINE. Then select a border and a colorfill.
NUMBER ALIGNMENT
FONT
BORDER
STEPS :
NOTE : If any other value is entered, Excel shows an error message, i.e., “The value you entered is
not valid.”
STEPS:
Now click the items you wish to add to the quick access toolbar, by selecting them and then
clicking “add”.
6. Click OK.
STEPS: Apply the formula of relative frequency, i.e., H13/H20, and so on for all the frequencies.
3. Percentage Frequency
STEPS: Apply the formula of percentage frequency, i.e., H13/H20*100, and so on for all the frequencies.
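The two formulas above divide each class frequency by the total frequency, and multiply by 100 for the percentage. The Python sketch below shows the same arithmetic. The class intervals and counts are hypothetical, and the variable total plays the role of the total cell (H20 in the steps above).

```python
# Hypothetical frequency table: class interval -> frequency.
frequencies = {"0-10": 4, "10-20": 6, "20-30": 10}

total = sum(frequencies.values())  # corresponds to the total-frequency cell

# Relative frequency = class frequency / total frequency.
relative = {cls: f / total for cls, f in frequencies.items()}

# Percentage frequency = class frequency / total frequency * 100.
percentage = {cls: f / total * 100 for cls, f in frequencies.items()}

for cls in frequencies:
    print(cls, round(relative[cls], 3), round(percentage[cls], 1))
```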
4. Bar Graph
STEPS:
4) Choose PivotTable report to be placed on the existing worksheet and enter the location for the
same.
5) Click OK.
6) Tick all the fields to add them to the report. Drag the “Sales value” field into Values,
“Product” into Column Labels, and “Company” and “Salesperson” into Row Labels.
STEPS:
STEPS :
1. Since the descriptive statistics tool requires numeric data, we will convert the non-numeric data
into numeric data by generating the codes given in the table below.
3) Press ENTER.
Problem
Age Dummy
42 0
76 0
56
67
65
65
89
76
45
45
65
78
55
52
53
44
65
76
89
44
54
45
56
56
56
76
Hypothesis Testing :-
H0: µ ≤ 40
H1: µ > 40
Age
Mean 61.15384615
Variance 195.7353846
Observations 26
Pooled Variance 188.2071006
Hypothesized Mean Difference 40
df 26
t Stat 2.101328716
P(T<=t) one-tail 0.022727409
t Critical one-tail 1.70561792
P(T<=t) two-tail 0.045454818
t Critical two-tail 2.055529439
Decision Rule:
Inference:
Conclusion:
Research Problem
Hypothesis
Group A Group B
76 95
87 97
98 87
45 89
66 87
78 45
76 76
88 56
78 76
87 87
54 45
65 76
75 45
89 88
65 76
78 66
54 78
87 56
45 77
Hypothesis Testing :-
H0: µA = µB
H1: µA ≠ µB
Group A Group B
Mean 73.21052632 73.78947368
Variance 236.5087719 287.3976608
Observations 19 19
Pooled Variance 261.9532164
Hypothesized Mean Difference 0
df 36
t Stat -0.110252646
P(T<=t) one-tail 0.456410681
t Critical one-tail 1.688297714
P(T<=t) two-tail 0.912821361
t Critical two-tail 2.028094001
Decision Rule:
Inference:
Conclusion:
Research problem
Two types of drugs were used on 5 and 7 patients, respectively, to reduce their weight. Drug A
was imported and drug B was indigenous. The decrease in weight after using the
drugs for 6 months was as follows:
Drug A Drug B
10 8
12 9
13 12
11 14
14 15
10
9
Hypothesis
H0: No significant difference
H1: Significant difference
µA = efficiency of drug A
µB = efficiency of drug B
H0: µA - µB = 0
H1: µA - µB ≠ 0
Drug A Drug B
Mean 12 11
Variance 2.5 7.333333333
Observations 5 7
Pooled Variance 5.4
Hypothesized Mean Difference 0
df 10
t Stat 0.73493092
P(T<=t) one-tail 0.239630988
t Critical one-tail 1.812461123
P(T<=t) two-tail 0.479261977
t Critical two-tail 2.228138852
Decision rule
Inference
Here t stat (0.73) is less than t critical (2.22), therefore accept null
hypothesis.
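The figures in the Excel output above can be cross-checked by hand. The Python sketch below recomputes the pooled variance and the t statistic for the Drug A and Drug B data using only the standard library. The function name pooled_t is a hypothetical helper, and the calculation assumes equal population variances, as the Excel tool "t-Test: Two-Sample Assuming Equal Variances" does.

```python
from math import sqrt
from statistics import mean, variance

# Weight decrease data from the research problem above.
drug_a = [10, 12, 13, 11, 14]
drug_b = [8, 9, 12, 14, 15, 10, 9]


def pooled_t(x, y):
    """Two-sample t statistic with a pooled variance (equal variances assumed)."""
    n1, n2 = len(x), len(y)
    df = n1 + n2 - 2
    # Weighted average of the two sample variances.
    pooled_var = ((n1 - 1) * variance(x) + (n2 - 1) * variance(y)) / df
    t_stat = (mean(x) - mean(y)) / sqrt(pooled_var * (1 / n1 + 1 / n2))
    return t_stat, pooled_var, df


t_stat, pooled_var, df = pooled_t(drug_a, drug_b)
print(round(t_stat, 4), round(pooled_var, 2), df)
```

The statistic is then compared with the two-tail critical value for 10 degrees of freedom (2.228 in the table above).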
Conclusion
Research Problem
Is there sufficient evidence to suggest that the mean time to exhaustion is
greater after chocolate milk than after carbohydrate replacement drink? Use
a significance level of 0.05.
2 47.08 50.1
3 57.51 41.67
4 46.9 32.69
5 29.1 46.33
6 57.5 31.63
7 23.87 20.61
8 28.65 14.99
9 35.37 20.11
Hypothesis Testing: -
H0:- µCM ≤ µCD
H1:- There is sufficient evidence to suggest that the mean time to exhaustion
is greater after chocolate milk than after carbohydrate replacement drink.
:- µCM > µCD
Decision Rule:
Inference:
Since t stat (1.98) is more than t critical (1.85), therefore reject null
hypothesis.
Conclusion:
Research problem
Coaching was given to students for statistical software after their results
were evaluated in January in order to improve their performance in the April
exams. Determine if the coaching was successful.
Jan May
45 56
54 57
44 45
56 67
34 44
45 34
34 34
67 76
45 56
54 45
67 76
56 87
56 66
56 65
76 45
Hypothesis Testing: -
H0 :-µmay ≤µjan
H1 :-µmay >µjan
Jan May
Mean 54.0625 58.0625
Variance 164.3291667 258.0625
Observations 16 16
Pearson Correlation 0.591118937
Hypothesized Mean Difference 0
df 15
t Stat -1.19611891
P(T<=t) one-tail 0.125107938
t Critical one-tail 1.753050356
P(T<=t) two-tail 0.250215876
t Critical two-tail 2.131449546
Decision rule
Inference
Here t stat (-1.19) is less than t critical (1.75), therefore accept null
hypothesis.
Conclusion
Research problem
A diet was given to 8 patients for weight loss. The weights (in lb) are as follows.
Determine whether the diet was effective.
Before After
162 168
170 136
184 147
164 159
172 143
176 161
159 143
170 145
Hypothesis Testing: -
H0 :- The diet given to 8 patients for weight loss was not effective. The patients
did not lose weight.
:- µB≤µA
H1 :- The diet given to 8 patients for weight loss was effective. The patients lost
weight.
:- µB>µA
Decision rule
Inference
Here t stat (3.70) is more than t critical (1.89), therefore reject null
hypothesis.
Conclusion
Therefore, the diet given to 8 patients for weight loss was effective. The patients
lost weight.
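The paired t statistic quoted above can be reproduced from the Before and After columns. The Python sketch below works on the per-patient differences, which is the defining step of a paired test. It uses only the standard library, and the variable names are chosen for illustration.

```python
from math import sqrt
from statistics import mean, stdev

# Weights (in lb) from the research problem above.
before = [162, 170, 184, 164, 172, 176, 159, 170]
after = [168, 136, 147, 159, 143, 161, 143, 145]

# A paired test analyses the difference within each patient.
diffs = [b - a for b, a in zip(before, after)]

n = len(diffs)
df = n - 1
# t = mean difference / standard error of the mean difference.
t_stat = mean(diffs) / (stdev(diffs) / sqrt(n))
print(round(t_stat, 2), df)
```

With 7 degrees of freedom, this statistic is compared with the one-tail critical value used in the inference above (1.89).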
α = 0.05.
Research problem
Determine whether or not there is a significant difference between the
variances of two sets.
Group 1 Group 2
150 125
175 165
160 130
130 155
160 170
145 150
Hypothesis Testing: -
H0:- σ1² = σ2²
H1:- There is a significant difference between the variances of the two sets.
:- σ1² ≠ σ2²
Group 1 Group 2
Mean 153.3333333 149.1666667
Variance 236.6666667 334.1666667
Observations 6 6
df 5 5
F 0.708229426
P(F<=f) one-tail 0.357123518
F Critical one-tail 0.1980069
Here the variance of group 1 is less than the variance of group 2, so we swap the
groups so that the larger variance is in the numerator.
Group 2 Group 1
Mean 149.1666667 153.3333333
Variance 334.1666667 236.6666667
Observations 6 6
df 5 5
F 1.411971831
P(F<=f) one-tail 0.357123518
F Critical one-tail 5.050329058
Decision rule
Inference
Here F(1.411) is less than F critical (5.050), therefore accept null hypothesis.
Here P(F<=f) (0.357) is greater than α (0.05), therefore accept null hypothesis
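The F statistic is the ratio of the two sample variances, with the larger variance placed in the numerator. The Python sketch below recomputes it for Group 1 and Group 2 using the standard library. It only reproduces the statistic; the critical value is taken from the Excel output above.

```python
from statistics import variance

# Data from the research problem above.
group_1 = [150, 175, 160, 130, 160, 145]
group_2 = [125, 165, 130, 155, 170, 150]

var_1 = variance(group_1)  # sample variance (n - 1 in the denominator)
var_2 = variance(group_2)

# Put the larger variance in the numerator so that F >= 1.
f_stat = max(var_1, var_2) / min(var_1, var_2)
df_num = df_den = len(group_1) - 1  # both groups have 6 observations

print(round(var_1, 4), round(var_2, 4), round(f_stat, 4))
```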
Conclusion
Research problem
Determine whether or not there is a significant difference between variance
of mathematics score of two class groups.
Class 1 Class 2
65 76
76 54
65 57
76 65
56 76
45 66
Hypothesis Testing: -
H0: - σ1² = σ2²
H1: - There is a significant difference between the variances of
mathematics scores of the two class groups.
: - σ1² ≠ σ2²
Decision rule
Inference
Here F (1.68) is less than F critical (5.05), therefore accept null hypothesis.
Here P(F<=f) (0.29) is greater than α (0.05), therefore accept null hypothesis.
Conclusion
Research problem
The net annual returns (the returns on investment after deducting all relevant
fees) in percentage are given. Can investors do better by buying mutual funds
directly from banks or other financial institutions than by purchasing mutual
funds through brokers? Can we conclude at the 5% significance level that directly
purchased mutual funds outperform mutual funds bought through brokers?
H0: - µd ≤µb
H1: - µd>µb
Direct Broker
Mean 6.6312 3.7232
Known Variance 37.48818 43.33928
Observations 50 50
Hypothesized Mean Difference 0
z 2.287177862
P(Z<=z) one-tail 0.011092722
z Critical one-tail 1.644853627
P(Z<=z) two-tail 0.022185444
z Critical two-tail 1.959963985
Decision rule
Inference
Here z (2.28) is more than z critical (1.64), therefore reject null hypothesis.
Here P(Z<=z) (0.01) is smaller than α (0.05), therefore reject null hypothesis.
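The z statistic and one-tail p-value above follow directly from the summary figures in the table. The Python sketch below recomputes them with the standard library's NormalDist class, taking the means, known variances and sample sizes from the Excel output as inputs. The variable names are chosen for illustration.

```python
from math import sqrt
from statistics import NormalDist

# Summary figures from the z-test table above.
mean_direct, mean_broker = 6.6312, 3.7232
var_direct, var_broker = 37.48818, 43.33928  # treated as known variances
n_direct = n_broker = 50

# z = difference of means / standard error of the difference.
std_error = sqrt(var_direct / n_direct + var_broker / n_broker)
z = (mean_direct - mean_broker) / std_error

standard_normal = NormalDist()          # mean 0, standard deviation 1
p_one_tail = 1 - standard_normal.cdf(z)  # upper-tail probability
z_critical = standard_normal.inv_cdf(0.95)  # one-tail critical value at alpha = 0.05

print(round(z, 4), round(p_one_tail, 4), round(z_critical, 4))
```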
Conclusion
Research problem
The Marks for three different groups in economics, science, history are given below.
Determine whether there is a significant difference between the mean of population.
Hypothesis Testing: -
SUMMARY
Groups Count Sum Average Variance
Economics 9 435 48.33333 23.5
Science 7 420 60 32.33333
History 9 393 43.66667 50.5
ANOVA
Source of Variation SS df MS F P-value F crit
Total 1871.84 24
Decision rule
Inference
Here F(15.19) is more than F-critical (3.44), therefore reject null hypothesis.
Here Pvalue (7.16E-05) is smaller than α (0.05), therefore reject null hypothesis.
Conclusion
Therefore we will reject the null hypothesis, and at least one of the means is different.
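The F value used in the inference can be rebuilt from the SUMMARY table alone, because the between-group and within-group sums of squares depend only on each group's count, sum and variance. The Python sketch below does this as a rough cross-check. The dictionary layout and variable names are assumptions made for illustration.

```python
# (count, sum, variance) for each group, copied from the SUMMARY table above.
groups = {
    "Economics": (9, 435, 23.5),
    "Science": (7, 420, 32.33333),
    "History": (9, 393, 50.5),
}

n_total = sum(count for count, _, _ in groups.values())
grand_mean = sum(total for _, total, _ in groups.values()) / n_total

# Between-group SS: squared distance of each group mean from the grand mean, weighted by count.
ss_between = sum(
    count * (total / count - grand_mean) ** 2 for count, total, _ in groups.values()
)
# Within-group SS: each sample variance multiplied by its degrees of freedom.
ss_within = sum((count - 1) * var for count, _, var in groups.values())

df_between = len(groups) - 1
df_within = n_total - len(groups)

f_stat = (ss_between / df_between) / (ss_within / df_within)
print(round(ss_between + ss_within, 2), df_between, df_within, round(f_stat, 2))
```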
Economics Science
Mean 48.33333 60
Variance 23.5 32.33333
Observations 9 7
Science History
Mean 60 43.66667
Variance 32.33333 50.5
Observations 7 9
Pooled Variance 42.71429
Hypothesized Mean Difference 0
df 14
t Stat 4.959051
P(T<=t) one-tail 0.000105
t Critical one-tail 1.76131
P(T<=t) two-tail 0.00021
t Critical two-tail 2.144787
t-Test: Two-Sample Assuming Equal Variances
History Economics
Mean 43.66667 48.33333
Variance 50.5 23.5
Observations 9 9
Pooled Variance 37
Hypothesized Mean Difference 0
df 16
t Stat -1.62747
P(T<=t) one-tail 0.061584
t Critical one-tail 1.745884
P(T<=t) two-tail 0.123167
t Critical two-tail 2.119905
Decision rule
- If T stat is greater than T
Economics Science
H0 :- µe = µs
H1 :- µe ≠ µs
Reject Null
Science History
H0 :- µs = µh
H1 :- µs ≠ µh
Reject Null
History Economics
H0:- µh= µe
H1:- µh ≠ µe
Accept Null
Research problem
Determine whether there is a significant difference between the marks of students,
subject-wise or student-wise.
Student a 42 69 35
b 53 54 40
c 49 58 53
d 53 64 42
e 43 64 50
Hypothesis Testing: -
Economics 5 240 48 28
Science 5 309 61.8 34.2
History 5 220 44 54.5
ANOVA
Source of Variation SS df MS F P-value F crit
Rows 60.93333 4 15.23333 0.300263 0.869889 3.837853
Columns 872.1333 2 436.0667 8.595269 0.010172 4.45897
Error 405.8667 8 50.73333
Total 1338.933 14
Decision rule
Inference (Row)
Here F (0.30) is less than F-critical (3.83), therefore accept null hypothesis.
Here Pvalue (0.86) is more than α (0.05), therefore accept null hypothesis.
Inference (Column)
Here F (8.59) is more than F-critical (4.45), therefore reject null hypothesis.
Here Pvalue (0.01) is smaller than α (0.05), therefore reject null hypothesis.
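The Rows and Columns lines of the ANOVA table can be reproduced from the marks table. The Python sketch below computes the sums of squares and F ratios for a two-factor ANOVA without replication, with students as rows and subjects as columns. The column order (Economics, Science, History) is inferred from the column sums in the summary (240, 309, 220), and the variable names are chosen for illustration.

```python
# Marks of students a-e (rows) in Economics, Science, History (columns).
table = [
    [42, 69, 35],
    [53, 54, 40],
    [49, 58, 53],
    [53, 64, 42],
    [43, 64, 50],
]

n_rows, n_cols = len(table), len(table[0])
grand_mean = sum(sum(row) for row in table) / (n_rows * n_cols)

row_means = [sum(row) / n_cols for row in table]
col_means = [sum(col) / n_rows for col in zip(*table)]

# Sums of squares for each source of variation.
ss_rows = n_cols * sum((m - grand_mean) ** 2 for m in row_means)
ss_cols = n_rows * sum((m - grand_mean) ** 2 for m in col_means)
ss_total = sum((x - grand_mean) ** 2 for row in table for x in row)
ss_error = ss_total - ss_rows - ss_cols

df_rows, df_cols = n_rows - 1, n_cols - 1
df_error = df_rows * df_cols
ms_error = ss_error / df_error

f_rows = (ss_rows / df_rows) / ms_error  # student-wise effect
f_cols = (ss_cols / df_cols) / ms_error  # subject-wise effect
print(round(f_rows, 4), round(f_cols, 4))
```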
Conclusion
Research problem
30
22
20
20
23
18
21
23
21
20
21
22
24
24
19
23
22
24
Hypothesis Testing: -
H0: - µ=20
H1: - µ≠20
Age Dummy
Mean 22 0
Known Variance 25 0.0001
Observations 22 1
Hypothesized Mean Difference 20
z 1.876083758
P(Z<=z) one-tail 0.03032189
z Critical one-tail 1.644853627
P(Z<=z) two-tail 0.06064378
z Critical two-tail 1.959963985
Decision rule
Inference
Here z (1.87) is more than z critical (1.64), therefore reject null hypothesis.
Here P(Z<=z) (0.03) is smaller than α (0.05), therefore reject null hypothesis.
Conclusion
The mean annual return from directly purchased mutual funds is larger than the
mean of broker purchased funds.
Direct Broker
9.33 3.24
6.94 -6.76
16.17 12.8
16.97 11.1
5.94 2.73
12.61 -0.13
3.33 18.22
16.13 -0.8
11.2 -5.75
1.14 2.59
4.68 3.71
3.09 13.15
7.26 11.05
2.05 -3.12
13.07 8.94
0.59 2.74
13.57 4.07
0.35 5.6
2.69 -0.85
18.45 -0.28
4.23 16.4
10.26 6.39
7.1 -1.9
-3.09 9.49
5.6 6.7
5.27 0.19
8.09 12.39
15.05 6.54
13.21 10.92
1.72 -2.15
14.69 4.63
-2.97 -11.07
1037 9.24
-0.63 -2.67
Hypothesis Testing: -
H0:- µm - µb ≤ 0
H1:- µm - µb > 0
Direct Broker
Mean 27.56142857 3.895306122
Known Variance 37.488 43.339
Observations 49 49
Hypothesized Mean Difference 0
z 2.287311718
P(Z<=z) one-tail 0.011088817
z Critical one-tail 1.644853627
P(Z<=z) two-tail 0.022177635
z Critical two-tail 1.959963985
Decision rule
Inference
Here z (2.28) is more than z critical (1.64), therefore reject null hypothesis.
Here P(Z<=z) (0.01) is smaller than α (0.05), therefore reject null hypothesis.
Conclusion
15-25 65 76 72 213
26-35 60 40 64 164
36-45 45 52 50 147
46-55 55 65 60 180
Hypothesis Testing: -
Decision rule:
If Chi square stats are greater than tabulated value, reject null hypothesis.
Inference
Here Chi square stats (0.768) are less than tabulated value (12.592), therefore accept
null hypothesis.
Conclusion
There is no association between brand preference and age group.
In order to install R Studio, we first need to install R. The steps to install R are as
follows:
1. Go to CRAN, click Download R for Windows, click Base, and download the installer for
the latest R version.
2. Right-click the installer file and select Run as Administrator from the pop-up menu.
3. Select the language to be used during installation.
This doesn’t change the language used by R; all messages and Help files remain in English.
4. Follow the instructions of the installer.
You can safely use the default settings and just keep clicking Next until R starts installing.
After installing R, we can install R Studio. The steps to install R Studio are as
follows:
There are four panes, also known as windows, in R Studio. These panes are given below:
1. Source
2. Console
3. Environment/History
4. Files/Plots/Packages/Help
Source: This is the part of the window where we write our code. Our code will not be evaluated
until we run it in the console.
Console: This is the pane where our code from the source is evaluated by R. We can
also use the console to perform quick calculations that we don’t need to save.
Environment/History: This is the part of the window where we can see which objects are in
our workspace.
Files/Plots/Packages/Help: This is the pane where we can see file directories, view
plots, see our packages and access R help.
IMPORTING FILES