Understand Statistical Symbols

Learning the Language of the Statistician
The following slides contain many of the symbols we
will be using in this class. These are the symbols we
will be using in formulas. While I do not require you
to memorize all of the formulas, it is important that
you know what these symbols mean. You will be
expected to memorize a few of the simpler formulas
for the departmental final.
To do responsible research, you must assimilate,
integrate and apply. This PowerPoint presentation
concentrates on assimilating this basic information.
Quantity                  Sample       Sampling Distribution                          Population
--------------------------------------------------------------------------------------------------------
Individual Score          $y_i$                                                       $y_i$
Sample Size               $n$
Mean                      $\bar{y}$                                                   $\mu$ (Mu)
Standard Deviation        $s$          $\sigma/\sqrt{n}$, estimated by $s/\sqrt{n}$   $\sigma$ (Sigma)
Variance                  $s^2$                                                       $\sigma^2$
Sum                       $\Sigma$
Proportion                $P$                                                         $\pi$ (Pi)
Hypothesized Mean                                                                     $\mu_0$
Hypothesized Proportion                                                               $\pi_0$ (also written $p_0$)
Stating Hypotheses with Symbols
One Sample Hypothesis Test for a Proportion

o Null hypothesis
$H_0: P = \pi_0$. The sample proportion is the same as the population proportion.
o Research hypothesis
$H_1: P \neq \pi_0$. The sample proportion is NOT the same as the population
proportion. If you have a theory, you can use a one-tailed test and
indicate that it is greater or less than the population proportion.
One Sample Hypothesis Test for a Mean

o Null hypothesis
$H_0: \bar{y} = \mu_0$. The sample mean is the same as the population mean.
o Research hypothesis
$H_1: \bar{y} \neq \mu_0$. The sample mean is not the same as the population mean. If you
have a theory, you can use a one-tailed test and indicate that it is greater
or less than the population mean.
Chi Square
o Null hypothesis
$H_0: E = O$. The expected value equals the observed value.
The dependent variable is NOT contingent on the independent variable
in the population.
o Research hypothesis
$H_1: E \neq O$. The expected value does not equal the observed value.
The dependent variable is contingent on the independent
variable in the population.
NOTE: For an Elaborated Chi Square you simply state that $E = O$ for all of the
independent/dependent combinations for the null hypothesis. For the
research hypothesis you state that $E \neq O$ for at least one of the combinations.
You would actually test each dependent/independent combination
separately.
One-Way Anova - with 2 groups

o Null hypothesis
$H_0: \mu_1 = \mu_2$. The Means are equal, OR the Mean of Group 1 is the same as
the Mean of Group 2 in the population.
o Research hypothesis
Two-Tailed (the one the computer uses)
$H_1: \mu_1 \neq \mu_2$. The Means are not equal, OR the Mean of Group 1 is not the
same as the Mean of Group 2 in the population.
One-Tailed (state a direction)
$H_1: \mu_1 < \mu_2$ or $\mu_1 > \mu_2$. The Mean of Group 1 is lower than the Mean of
Group 2 in the population, or the Mean of Group 1 is higher than the
Mean of Group 2 in the population.
One-Way Anova - with more than 2 groups*

o Null hypothesis
$H_0: \mu_1 = \mu_2 = \dots = \mu_k$. The Means of all the groups are equal.
o Research hypothesis
Two-Tailed (the one the computer uses)
$H_1$: not all of $\mu_1, \mu_2, \dots, \mu_k$ are equal. The Mean of one group is
not equal to the Mean of at least one other group.
o * This is still bi-variate. You don't have more variables, only more categories
in the categorical variable.
Bi-Variate Regression
o Null hypothesis
$H_0: \beta_1 = 0$. The regression slope is not different from 0 in the population.
There is no relationship between the independent and dependent variables
in the population.
o Research hypothesis
$H_1: \beta_1 \neq 0$. The slope is different from 0 in the population.
There is a relationship between the independent and dependent variables in
the population.
Multi-Variate Regression
o Null hypothesis
$H_0: \beta_1 = \beta_2 = \dots = \beta_k = 0$. None of the regression slopes is different from 0 in the population.
There is no relationship between the independent variables and the dependent variable
in the population.
o Research hypothesis
$H_1$: at least one of $\beta_1, \dots, \beta_k \neq 0$. At least one of the slopes is different from 0 in the population.
There is a relationship between the dependent variable and at least one
of the independent variables in the population.
Matching Variables with Types of Analysis

Chi-square (2 categorical variables)
type of car you drive by gender
race by political preference
race by eye color
gender by YES/NO questions
Anova (1 categorical and one continuous variable)
gender by yearly income
gender by score on self esteem index
race by yearly income
political preference by yearly income
age by whether or not you have children
Bi-Variate Regression (2 continuous variables)
yearly income by years of education
years married by marital satisfaction (scale score)
age by number of children
Multiple Regression (continuous/dummy independent and continuous dependent)
number of dates per year by yearly income, age, height, gender (dummy variable)
poverty rates by sex ratio, percent single headed household, percent employed
Statistics That Do Not Use Hypotheses
Confidence Intervals
o We generally do not state a hypothesis for a Confidence Interval.
Confidence Intervals are used to estimate a population mean or
proportion based on a sample mean or proportion. Opinion polls
use Confidence Intervals to predict election results etc.
Pearson Correlation (correlation co-efficient or r)

o We generally do not associate Pearson Correlation Matrices with
hypotheses. We generally use Pearson Correlation Matrices for
diagnostic purposes and to test the strength of bi-variate relationships.
Equations/Formulas: Z Tests
Z scores
o $Z = \frac{y_i - \mu}{\sigma}$
o Where $y_i$ = individual's score
o $\mu$ = population mean
o $\sigma$ = population standard deviation
o Information needed
Population mean and standard deviation
o Example of when we would use this
If you knew an individual's SAT/ACT score, you could
determine what percentile they scored in (e.g., the 95th percentile),
OR if you know what percentile they are in, you can
determine their score.
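The SAT/ACT example can be sketched in a few lines of Python. This is only an illustration: it assumes the scores follow a normal curve, and the population mean of 500 and standard deviation of 100 are hypothetical numbers, not values from the slides. The function names are made up for this example.

```python
from statistics import NormalDist

def z_score(y_i, mu, sigma):
    """Z = (y_i - mu) / sigma."""
    return (y_i - mu) / sigma

def percentile_from_score(y_i, mu, sigma):
    """Percent of the population scoring below y_i, assuming a normal curve."""
    return NormalDist().cdf(z_score(y_i, mu, sigma)) * 100

def score_from_percentile(percentile, mu, sigma):
    """Reverse lookup: the score that sits at a given percentile."""
    return mu + NormalDist().inv_cdf(percentile / 100) * sigma

# Hypothetical test with population mean 500 and standard deviation 100.
z = z_score(650, mu=500, sigma=100)
pct = percentile_from_score(650, mu=500, sigma=100)
print(f"Z = {z:.2f}, percentile = {pct:.1f}")
```

Both directions use the same Z formula: the first converts a score to a Z score and looks up the area below it, the second converts an area back to a Z score and then to a raw score.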
Equations for Inferential Statistics
Summary Statistics
o Mean
$\bar{y} = \frac{\sum y_i}{n}$
o Median
Order values and count up this far: $\frac{n + 1}{2}$
o Variance
$s^2 = \frac{\sum (y_i - \bar{y})^2}{n - 1}$
o Standard Deviation
$s = \sqrt{s^2}$
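As a quick check of the summary statistics, the sketch below computes the mean, median, variance (with n - 1 in the denominator) and standard deviation by hand and compares them with Python's built-in `statistics` module. The six scores are hypothetical.

```python
import statistics

scores = [2, 4, 4, 5, 7, 8]  # hypothetical sample of six scores
n = len(scores)

# Mean: sum of the scores divided by the sample size.
mean = sum(scores) / n

# Median: middle of the ordered values (average of the two middle
# values when n is even).
median = statistics.median(scores)

# Variance: squared deviations from the mean, divided by n - 1.
variance = sum((y - mean) ** 2 for y in scores) / (n - 1)

# Standard deviation: square root of the variance.
standard_deviation = variance ** 0.5

print(mean, median, round(variance, 2), round(standard_deviation, 2))
```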
Inferring a Population Mean or Proportion Based on a
Sample Mean or Proportion
The following slides focus on how to estimate a
population mean or proportion if we ONLY have a
random sample.
In these cases we estimate one point in the
population (e.g., the mean IQ of USU students),
BUT we build a confidence interval around this
single point, generally a 95% confidence interval,
to allow for error.
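A minimal sketch of a 95% confidence interval around a sample mean follows. It assumes the usual large-sample form, sample mean plus or minus 1.96 standard errors, with the standard error estimated as s divided by the square root of n; for small samples a t value would normally replace 1.96. The IQ scores are hypothetical.

```python
from statistics import NormalDist, mean, stdev

def mean_confidence_interval(sample, confidence=0.95):
    """Large-sample interval: y-bar +/- z * s / sqrt(n)."""
    n = len(sample)
    y_bar = mean(sample)
    standard_error = stdev(sample) / n ** 0.5
    # Z value that leaves (1 - confidence) / 2 in each tail; about 1.96 for 95%.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return y_bar - z * standard_error, y_bar + z * standard_error

# Hypothetical IQ scores from a random sample of students.
iq_sample = [98, 105, 110, 102, 95, 120, 108, 101, 99, 112]
low, high = mean_confidence_interval(iq_sample)
print(f"95% confidence interval: {low:.1f} to {high:.1f}")
```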
A One or Large Sample Hypothesis Test

In the following slides we compare a sample mean
or proportion with a population mean or proportion.
We want to know if our sample mean or proportion
is different from the population mean or proportion.
The population mean or proportion could actually
be a mean/proportion that is specified by a theory
or by past research (rather than a number
computed from a population data set).
Equations/Formulas for One Sample Hypothesis Tests
One sample hypothesis test for Proportion
o $Z = \frac{P - \pi_0}{\sqrt{\pi_0 (1 - \pi_0) / n}}$
What do the symbols mean
o $P$ = proportion in the sample
o $\pi_0$ = proportion or hypothesized proportion in the population
o $n$ = sample size
o $Z$ = computed statistic
One sample hypothesis test for Mean
o $Z = \frac{\bar{y} - \mu_0}{s / \sqrt{n}}$
What do the symbols mean
o $\bar{y}$ = mean in the sample
o $\mu_0$ = mean or hypothesized mean in the population
o $n$ = sample size
o $s$ = standard deviation of the sample, used as an estimate of the standard deviation in the population
o $s / \sqrt{n}$ = computation for estimating the standard error: the standard deviation
of the sample divided by the square root of the sample size
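The two one-sample tests can be sketched as follows. The formulas used here are the standard large-sample forms, Z = (P - pi_0) / sqrt(pi_0 (1 - pi_0) / n) for a proportion and Z = (y-bar - mu_0) / (s / sqrt(n)) for a mean, which is an assumption about what the slides showed. The numbers and function names are hypothetical.

```python
from math import sqrt

def z_for_proportion(p, pi_0, n):
    """Compare a sample proportion p with a hypothesized proportion pi_0."""
    standard_error = sqrt(pi_0 * (1 - pi_0) / n)
    return (p - pi_0) / standard_error

def z_for_mean(y_bar, mu_0, s, n):
    """Compare a sample mean y_bar with a hypothesized mean mu_0."""
    standard_error = s / sqrt(n)  # sample sd divided by sqrt of sample size
    return (y_bar - mu_0) / standard_error

# Hypothetical poll: 56% of 400 respondents, tested against 50%.
z_prop = z_for_proportion(p=0.56, pi_0=0.50, n=400)

# Hypothetical sample: mean 103, sd 15, n = 100, tested against 100.
z_mean = z_for_mean(y_bar=103, mu_0=100, s=15, n=100)

print(round(z_prop, 2), round(z_mean, 2))
```

With a two-tailed test at the .05 level, a computed Z beyond plus or minus 1.96 would lead to rejecting the null hypothesis.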
Symbols for Statistics that Infer the Relationship in the
Sample to the Population
Test          Symbol(s)             Interpretation
Chi Square    $\chi^2$              Chi Square statistic
Regression    $\beta$               beta, slope in population
              $b$                   slope in sample
              $a$                   alpha, intercept or constant in prediction formula
              $x_1, \dots, x_k$     value of the X variables
              $\hat{y}$             y-hat or predicted Y
              $\bar{y}$             Y bar or the mean of Y
Anova         $y_i - \bar{y}$       deviation of an individual score from the mean
              $\mu$                 Mu or mean in population
Equations/Formulas for Inferential Statistics
Chi-Square Equation
o $\chi^2 = \sum \frac{(O - E)^2}{E}$, where $O$ is the observed value and $E$ is the expected value
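A small sketch of the chi-square computation, assuming the standard form: sum over all cells of (O - E) squared divided by E, with each expected value E taken as row total times column total divided by the grand total. The 2 x 2 table (for example, gender by a YES/NO question) is hypothetical.

```python
def chi_square(observed):
    """Return (chi-square statistic, degrees of freedom) for a table of counts."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)

    statistic = 0.0
    for i, row in enumerate(observed):
        for j, o in enumerate(row):
            # Expected count if the two variables were unrelated.
            e = row_totals[i] * col_totals[j] / grand_total
            statistic += (o - e) ** 2 / e

    df = (len(observed) - 1) * (len(observed[0]) - 1)
    return statistic, df

# Hypothetical counts: rows are two groups, columns are YES / NO.
table = [[30, 20],
         [20, 30]]
chi2, df = chi_square(table)
print(chi2, df)
```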
Pearson Correlation Coefficient and $R^2$
Formula
o $r = \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum (x_i - \bar{x})^2 \sum (y_i - \bar{y})^2}}$
o $R^2$ = r squared
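The correlation coefficient can be computed directly from deviations around the two means. The sketch below assumes the standard definition of Pearson's r (sum of cross-products of deviations divided by the square root of the product of the two sums of squared deviations); the five data pairs are hypothetical.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation from deviations around the two means."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    cross_products = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    ss_x = sum((x - x_bar) ** 2 for x in xs)
    ss_y = sum((y - y_bar) ** 2 for y in ys)
    return cross_products / sqrt(ss_x * ss_y)

# Hypothetical paired data (e.g., years of education and an attitude score).
x_values = [1, 2, 3, 4, 5]
y_values = [2, 4, 5, 4, 5]

r = pearson_r(x_values, y_values)
r_squared = r ** 2  # proportion of variance in Y associated with X
print(round(r, 3), round(r_squared, 3))
```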
Multiple Regression
Prediction Equation
$\hat{y} = a + b_1 x_1 + b_2 x_2 + b_3 x_3 + \dots$
$\hat{y}$ = predicted score for the dependent variable
$a$ = intercept or constant
$b$ = slope or parameter estimate for an independent variable: the unit increase in the Y
variable for every 1 unit increase in X
$x$ = value of the X variable, taken from the codebook
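Using the prediction equation is plain arithmetic: multiply each X value by its slope, add the products, and add the intercept. The sketch below uses made-up coefficients; in practice the intercept and slopes come from the regression output.

```python
def predict(intercept, slopes, x_values):
    """y-hat = a + b1*x1 + b2*x2 + ... for one case."""
    return intercept + sum(b * x for b, x in zip(slopes, x_values))

# Hypothetical estimates: a = 10, b1 = 2, b2 = 0.5, b3 = -3.
a = 10
slopes = [2, 0.5, -3]

# Hypothetical case: x1 = 4, x2 = 20, x3 = 1 (a dummy variable coded 0/1).
y_hat = predict(a, slopes, [4, 20, 1])
print(y_hat)  # 10 + 8 + 10 - 3 = 25.0
```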
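Anova
o Formula
o $TSS = \sum y_{ij}^2 - \frac{G^2}{n}$, where $G$ is the grand total of all scores
o $SSB = \sum n_i (\bar{y}_i - \bar{y})^2$, where $n_i$ and $\bar{y}_i$ are the size and mean of group $i$ and $\bar{y}$ is the overall mean
o $SSW = TSS - SSB$
TSS = Total Sum of Squares
SSB = Sum of Squares Between
SSW = Sum of Squares Within
$F = \frac{s_B^2}{s_W^2}$ = F statistic
$s_B^2 = SSB / (k - 1)$
$s_W^2 = SSW / (n - k)$
df between = k - 1
df within = n - k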
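The F statistic can be traced step by step in code. This sketch assumes the usual definitions: TSS is the sum of squared deviations of every score from the overall mean (equivalent to the sum of squared scores minus the squared grand total over n), SSB weights each group's squared distance from the overall mean by its size, and SSW is what is left over. The three groups are hypothetical.

```python
def one_way_anova(groups):
    """Return the sums of squares, F statistic and degrees of freedom."""
    scores = [y for group in groups for y in group]
    n, k = len(scores), len(groups)
    grand_mean = sum(scores) / n

    tss = sum((y - grand_mean) ** 2 for y in scores)
    ssb = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)
    ssw = tss - ssb

    s2_between = ssb / (k - 1)   # mean square between groups
    s2_within = ssw / (n - k)    # mean square within groups
    return {
        "TSS": tss, "SSB": ssb, "SSW": ssw,
        "F": s2_between / s2_within,
        "df_between": k - 1, "df_within": n - k,
    }

# Hypothetical scores for three groups of three cases each.
result = one_way_anova([[1, 2, 3], [2, 3, 4], [5, 6, 7]])
print(result)
```

The ratio SSB / TSS from the same output gives the proportion of the total sum of squares explained by group membership.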
Anova and Regression

Sums of Squares
Anova
o TSS = Total Sum of Squares
o SSW = Sum of Squares within each group
o SSB = Sum of Squares between the groups
SSB/TSS = R square or the proportion of the total sum of squares that is
explained by group membership
Regression
o TSS = Total Sum of Squares
o SSM = Sum of Squares Model
o SSE = Sum of Squares Error
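Two Sample T-test
o Formula
$t = \frac{\bar{y}_1 - \bar{y}_2}{s_{\bar{y}_1 - \bar{y}_2}}$
The denominator is the estimated standard error of the difference between the two means.
This part is computed as follows:
$s_{\bar{y}_1 - \bar{y}_2} = s_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}$
Pooled standard deviation:
$s_p = \sqrt{\frac{(n_1 - 1) s_1^2 + (n_2 - 1) s_2^2}{n_1 + n_2 - 2}}$
where $s_1$ = standard deviation of sample 1 and $s_2$ = standard deviation of sample 2
What symbols mean
$t$ = computed test statistic, compared with the critical value from the t distribution
$\bar{y}_1$ = mean of sample one
$\bar{y}_2$ = mean of sample two
$n_1$ = size of sample 1 and $n_2$ = size of sample 2
Degrees of freedom = df = $n_1 + n_2 - 2$
Uses a T distribution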
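A sketch of the pooled two-sample t computation follows. It assumes the standard pooled form: the pooled standard deviation combines the two sample variances weighted by their degrees of freedom, and the standard error of the difference is the pooled standard deviation times the square root of (1/n1 + 1/n2). The two samples are hypothetical.

```python
from math import sqrt
from statistics import mean, variance

def two_sample_t(sample_1, sample_2):
    """Return (t statistic, degrees of freedom) using the pooled sd."""
    n1, n2 = len(sample_1), len(sample_2)
    df = n1 + n2 - 2

    # Pooled standard deviation from the two sample variances.
    pooled_variance = ((n1 - 1) * variance(sample_1)
                       + (n2 - 1) * variance(sample_2)) / df
    s_p = sqrt(pooled_variance)

    # Estimated standard error of the difference between the two means.
    standard_error = s_p * sqrt(1 / n1 + 1 / n2)

    t = (mean(sample_1) - mean(sample_2)) / standard_error
    return t, df

# Hypothetical scores for two independent groups.
t, df = two_sample_t([5, 6, 7, 8, 9], [1, 2, 3, 4, 5])
print(round(t, 2), df)
```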
Mann Whitney
o Focuses on ranks (medians) rather than on means
o Two Groups
o Formula
$Z = \frac{T_1 - E(T_1)}{\sqrt{Var(T_1)}}$
$E(T_1) = \frac{n_1 (n + 1)}{2}$
Rank values from smallest to largest
Sum ranks in smaller group = $T_1$
Compute $E(T_1)$
Compute Variance: $Var(T_1) = \frac{n_1 n_2}{n} s^2$
$s^2 = \frac{\sum (y_i - \bar{y})^2}{n - 1}$, computed on the ranks
Uses a Z distribution.
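The ranking steps can be sketched as below. Two assumptions are worth flagging: tied values receive the average of the ranks they occupy, and the variance of T1 is taken as (n1 * n2 / n) times the variance of the ranks, which reduces to n1 * n2 * (n + 1) / 12 when there are no ties. The helper names and the data are hypothetical.

```python
from math import sqrt

def average_ranks(values):
    """Rank from smallest to largest; ties share the average rank."""
    positions = {}
    for position, value in enumerate(sorted(values), start=1):
        positions.setdefault(value, []).append(position)
    return [sum(positions[v]) / len(positions[v]) for v in values]

def mann_whitney_z(group_a, group_b):
    """Z = (T1 - E(T1)) / sqrt(Var(T1)), with T1 from the smaller group."""
    smaller, larger = sorted([list(group_a), list(group_b)], key=len)
    n1, n2 = len(smaller), len(larger)
    n = n1 + n2

    ranks = average_ranks(smaller + larger)
    t1 = sum(ranks[:n1])                 # sum of ranks in the smaller group
    expected_t1 = n1 * (n + 1) / 2

    mean_rank = sum(ranks) / n
    s2 = sum((r - mean_rank) ** 2 for r in ranks) / (n - 1)
    var_t1 = n1 * n2 / n * s2

    return (t1 - expected_t1) / sqrt(var_t1)

# Hypothetical scores; the first group is the smaller one.
z = mann_whitney_z([1, 2, 3], [4, 5, 6, 7])
print(round(z, 3))
```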
Kruskal Wallis
o Focuses on ranks (medians) rather than on means
o More than Two Groups
o Formula
$H = \frac{12}{n (n + 1)} \sum \frac{T_i^2}{n_i} - 3 (n + 1)$
$T_i$ = total sum of ranks for sample $i$
$n$ = total number of cases
$n_i$ = number of cases for sample $i$
Uses $\chi^2$ Distribution
Degrees of Freedom = k - 1 (where k is number of groups)
Use when you want to compare more than two groups, and the distribution is
not normal.
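The Kruskal Wallis statistic can be sketched the same way. The code assumes the standard form H = 12 / (n (n + 1)) times the sum of T_i squared over n_i, minus 3 (n + 1), with no correction for ties; tied values simply share an average rank. The helper names and the three groups are hypothetical.

```python
def average_ranks(values):
    """Rank from smallest to largest; ties share the average rank."""
    positions = {}
    for position, value in enumerate(sorted(values), start=1):
        positions.setdefault(value, []).append(position)
    return [sum(positions[v]) / len(positions[v]) for v in values]

def kruskal_wallis_h(groups):
    """Return (H statistic, degrees of freedom) without a tie correction."""
    pooled = [value for group in groups for value in group]
    n = len(pooled)
    ranks = average_ranks(pooled)

    weighted_sum = 0.0
    start = 0
    for group in groups:
        n_i = len(group)
        t_i = sum(ranks[start:start + n_i])  # rank total for this sample
        weighted_sum += t_i ** 2 / n_i
        start += n_i

    h = 12 / (n * (n + 1)) * weighted_sum - 3 * (n + 1)
    return h, len(groups) - 1

# Hypothetical scores for three groups.
h, df = kruskal_wallis_h([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
print(round(h, 2), df)
```

The computed H is compared with a chi-square distribution with k - 1 degrees of freedom.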

Formulas for Sample Size
Sample size $(n) = \frac{0.9604 N}{D^2 (N - 1) + 0.9604}$
D = margin of error (usually .05)
N = population size
.9604 = a constant related to being at least 95% sure ($1.96^2 \times 0.25$)
This sample size is large enough that we can be at least
95% sure we can generalize to the population with a
margin of error of .05
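The sketch below shows one common way to turn these three ingredients into a sample size. It assumes the standard finite-population formula n = 0.9604 N / (D^2 (N - 1) + 0.9604), where 0.9604 is 1.96 squared times 0.25; the exact layout of the slide's formula is not recoverable, so treat this as an illustration rather than the slide's own equation. The population sizes are hypothetical.

```python
from math import ceil

def sample_size(population_size, margin_of_error=0.05, constant=0.9604):
    """Assumed formula: n = c*N / (D^2 * (N - 1) + c), rounded up."""
    n = (constant * population_size
         / (margin_of_error ** 2 * (population_size - 1) + constant))
    return ceil(n)

# Hypothetical populations of 500 and 10,000 people.
for population in (500, 10_000):
    print(population, sample_size(population))
```

Rounding up is a deliberate choice here: a fractional respondent is not possible, and rounding down would fall slightly short of the target margin of error.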
Prepared by Dr. Carol Albrecht
