
Assignment 1:

Sample standard deviation (s): It is obtained by taking the square root of the sample
variance.
s = sqrt( sum of (x - xbar)^2 / (n - 1) )
Population standard deviation (sigma): It is obtained by taking the square root of the
population variance.
sigma = sqrt( sum of (x - mu)^2 / N )
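As a small illustration of the two definitions above, the sketch below uses Python's standard statistics module on a made-up data set (the numbers are hypothetical, not taken from the assignment). The stdev function divides by n-1 and pstdev divides by N.

```python
import statistics

# Hypothetical data, chosen only for illustration.
data = [2, 4, 4, 4, 5, 5, 7, 9]

# Sample standard deviation: square root of the sample variance (divides by n - 1).
s = statistics.stdev(data)

# Population standard deviation: square root of the population variance (divides by N).
sigma = statistics.pstdev(data)

print(f"s = {s:.3f}, sigma = {sigma:.3f}")
```

For the same data, s comes out larger than sigma because it divides by the smaller number n-1.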
degrees of freedom: n-1 is called the degrees of freedom. Once the sample mean is fixed,
the first n-1 observations have freedom to be whatever value they wish, but the nth value
has no freedom: it is determined by the others.
z-score: The z-score represents the distance that a data value is from the mean in
terms of the number of standard deviations. It is found by subtracting the mean from the
data value and dividing the result by the standard deviation.
population z-score: z = (x - mu) / sigma
sample z-score: z = (x - xbar) / s
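The z-score formula above can be sketched as a one-line function. The function name z_score and the example numbers are hypothetical; the same function covers the population case (mu, sigma) and the sample case (xbar, s).

```python
def z_score(x, mean, sd):
    """Number of standard deviations that x lies from the mean."""
    return (x - mean) / sd

# Hypothetical example: a value of 85 when the mean is 70 and the
# standard deviation is 10 lies 1.5 standard deviations above the mean.
z = z_score(85, 70, 10)
print(z)
```

A negative result would mean the data value lies below the mean.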
percentiles: Percentiles are values that divide a sample of data into one hundred
groups containing (as far as possible) equal numbers of observations. For example,
30% of the data values lie below the 30th percentile.
quartile: Quartiles are values that divide a sample of data into four groups containing
(as far as possible) equal numbers of observations. A data set has three quartiles.
References to quartiles often relate to just the outer two, the upper and the lower
quartiles; the second quartile being equal to the median. The lower quartile is the data
value a quarter way up through the ordered data set; the upper quartile is the data
value a quarter way down through the ordered data set.
interquartile range (IQR): It is the range of the middle 50% of the observations in a
data set. It is the difference between the first and third quartiles and is found using
IQR=Q3-Q1
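A minimal sketch of the quartile and IQR definitions above, using statistics.quantiles from the Python standard library on hypothetical data. Textbooks and software use slightly different conventions for locating quartiles, so other tools may give slightly different values for the same data.

```python
import statistics

# Hypothetical data set, sorted only to make the quartiles easy to see.
data = sorted([7, 1, 5, 3, 6, 2, 4])

# n=4 asks for the three cut points that split the data into four groups.
q1, q2, q3 = statistics.quantiles(data, n=4)

# Interquartile range: the spread of the middle 50% of the observations.
iqr = q3 - q1

print(q1, q2, q3, iqr)
```

Here q2 is the median, in line with the definition of the second quartile.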
Sources:
Sullivan, Michael. Statistics.
http://www.stats.gla.ac.uk/steps/glossary/presenting_data.html#quart
Assignment 4:
exploratory data analysis: An approach to analyzing data for the purpose of
formulating hypotheses worth testing.
five-number summary: A five-number summary is very useful when we have so many
data that it is sufficient to present a summary of the data rather than the whole data
set. It consists of 5 values: the most extreme values in the data set (maximum and
minimum values), the lower and upper quartiles, and the median. A five-number
summary can be represented in a diagram known as a box plot.
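The five values listed above can be collected with a short helper. This is only a sketch: the function name five_number_summary and the data are hypothetical, and the quartiles follow the default convention of statistics.quantiles, which may differ slightly from a given textbook's rule.

```python
import statistics

def five_number_summary(data):
    """Return (minimum, lower quartile, median, upper quartile, maximum)."""
    q1, median, q3 = statistics.quantiles(data, n=4)
    return min(data), q1, median, q3, max(data)

# Hypothetical data set.
summary = five_number_summary([3, 7, 8, 5, 12, 14, 21, 13, 18])
print(summary)
```

These are the five values a box plot draws: the ends of the whiskers, the edges of the box, and the line inside the box.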
box plot: It is a type of graph which is used to show the shape of the distribution, its
central value, and its variability. It is the graphic representation of a five-number
summary.
A box plot is especially helpful for indicating whether a distribution is skewed and
whether there are any outliers in the data set. Box plots are also very useful when large
numbers of observations are involved and when two or more data sets are being
compared.

Assignment 5:
Observational study: An observational study draws inferences about the possible effect of
a treatment on subjects, where the assignment of subjects into a treated group versus
a control group is outside the control of the investigator. The researcher basically
observes the behavior of the individuals in the study without trying to influence the
outcome of the study.
Designed experiment: A study in which the researcher assigns the individuals to a
certain group, intentionally changes the value of the explanatory variable, and then
records the value of the response variable for each group.
*Basically, the difference between an observational study and a designed experiment
is the control the researcher has over the groups or individuals. In an observational
study the researcher only observes behavior, while in a designed experiment there are
usually two groups, a control group and a treatment group. The researcher uses scientific
controls in a designed experiment to control bias and produce reliable and valid data.
Because an observational study lacks this control, it doesn't allow a researcher to claim
causation, only association.
Random Sample: Random sampling is a sampling technique where we select a group
of subjects (a sample, n) for study from a larger group (a population, N). Each
individual is chosen entirely by chance and each member of the population has a
known, but possibly non-equal, chance of being included in the sample. By using
random sampling, the likelihood of bias is reduced.
Simple random sampling: It is the basic sampling technique where we select a group
of subjects (a sample, n) for study from a larger group (a population, N). Each
individual is chosen entirely by chance and each member of the population has an
equal chance of being included in the sample.
*Basically, the difference between Random Sample and Simple Random Sampling is
that in the Simple Random one every individual in a population has an equal chance of
being selected, while in a general Random Sample the chances are known but may not be
equal. In both cases, choosing individuals by chance helps to avoid bias.
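Simple random sampling as described above can be sketched with random.sample from the Python standard library, which draws without replacement and gives every member the same chance of selection. The population of ID numbers, the sample size, and the seed are all hypothetical choices made for illustration.

```python
import random

# Hypothetical population of N = 100 individuals, identified by number.
population = list(range(1, 101))

# A seeded generator makes the illustration repeatable.
rng = random.Random(42)

# Simple random sample of n = 10: no individual can be chosen twice.
sample = rng.sample(population, 10)
print(sorted(sample))
```

A sample with unequal but known selection chances would need a different method; this sketch covers only the equal-chance case.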
http://en.wikipedia.org/wiki/Experimental_control#Controlled_experiments
Assignment 6:
Normal Quantile Plot: A quantile-quantile (Q-Q) plot is used to see if a given set of
data follows some specified distribution; in a normal quantile plot that distribution is
the normal distribution. The plot should be approximately linear if the specified
distribution is the correct model.
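The points of a normal quantile plot can be computed as in the sketch below. The plotting position (i - 0.375) / (n + 0.25) is one common convention and is an assumption here, as are the function name and the data. Each sorted data value is paired with the z-score expected at that position if the data were normal.

```python
from statistics import NormalDist

def normal_quantile_pairs(data):
    """Pair each sorted value with its expected standard normal z-score."""
    xs = sorted(data)
    n = len(xs)
    std_normal = NormalDist()  # mean 0, standard deviation 1
    pairs = []
    for i, x in enumerate(xs, start=1):
        f = (i - 0.375) / (n + 0.25)  # assumed plotting position
        pairs.append((x, std_normal.inv_cdf(f)))
    return pairs

# Hypothetical measurements.
pairs = normal_quantile_pairs([31.4, 29.8, 33.1, 30.5, 32.0])
for x, z in pairs:
    print(f"{x:6.1f}  {z:+.3f}")
```

Plotting the data values against the z-scores and checking whether the points fall close to a straight line is the visual test the definition describes.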
Assignment 7:
Uniform Probability Distribution: When a random variable X can take any value in a
particular interval and all values in that interval are equally likely, the random
variable X is said to follow a uniform probability distribution.
probability density function: It is an equation used to compute probabilities of
continuous random variables. It must satisfy two properties: 1.- The total area under
the graph of the equation over all possible values of the random variable must equal
1. 2.- The height of the graph of the equation must be greater than or equal to 0 for all
possible values of the random variable. That is, the graph of the equation must lie on
or above the horizontal axis for all possible values of the random variable.
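To make the two properties concrete, here is a sketch for a uniform distribution on a hypothetical interval from a = 0 to b = 30. The function names are illustrative. Because the graph is a rectangle of height 1/(b - a), areas can be computed as width times height.

```python
def uniform_pdf(x, a, b):
    """Height of the uniform density: 1/(b - a) inside [a, b], 0 outside."""
    return 1.0 / (b - a) if a <= x <= b else 0.0

def uniform_prob(c, d, a, b):
    """P(c <= X <= d) for X uniform on [a, b]: area of a rectangle."""
    lo, hi = max(c, a), min(d, b)
    return max(hi - lo, 0.0) / (b - a)

# Property 1: the total area over all possible values equals 1.
print(uniform_prob(0, 30, 0, 30))

# Property 2: the height is never negative.
print(uniform_pdf(5, 0, 30), uniform_pdf(40, 0, 30))

# Probability that X falls between 10 and 20.
print(uniform_prob(10, 20, 0, 30))
```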
Table 2:

Assignment 8:
Sampling distribution: The probability distribution or probability density function of a
statistic, taken over all possible random samples of a given size.
Assignment 9:
General Conference random quote!
I want it absolutely clear when I stand before the judgment bar of God that I declared
to the world . . . that the Book of Mormon is true
Assignment 10:
Point estimate: A single value that expresses the result of a given estimation.
Confidence interval: A range of values that expresses the result of a given estimation.
Level of confidence: The probability value associated with a confidence interval.
Margin of error: A statistic expressing the amount of random sampling error in a
survey's results.
Assignment 11:
Margin of error formula:

Determining the Sample Size n formula:
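These notes do not say which setting the two formulas refer to. Assuming the case of estimating a population mean with a known population standard deviation sigma, a common form is E = z * sigma / sqrt(n) for the margin of error and n = (z * sigma / E)^2, rounded up, for the sample size, where z is the critical value for the chosen level of confidence. The sketch below follows that assumption; the function names and numbers are hypothetical.

```python
import math
from statistics import NormalDist

def critical_z(confidence):
    """Critical value z with area (1 - confidence) / 2 in each tail."""
    alpha = 1 - confidence
    return NormalDist().inv_cdf(1 - alpha / 2)

def margin_of_error(sigma, n, confidence=0.95):
    """E = z * sigma / sqrt(n), assuming a mean with known sigma."""
    return critical_z(confidence) * sigma / math.sqrt(n)

def sample_size(sigma, E, confidence=0.95):
    """Smallest whole n whose margin of error is at most E."""
    return math.ceil((critical_z(confidence) * sigma / E) ** 2)

# Hypothetical example: sigma = 15, desired margin of error 2, 95% confidence.
n = sample_size(15, 2)
print(n, margin_of_error(15, n))
```

The sample size is always rounded up, since rounding down would give a margin of error larger than the one requested.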

Assignment 13:
Hypothesis: A proposed explanation for an observable phenomenon.
Hypothesis testing: A procedure used to determine how likely it is that the overall
effect would be observed if the hypothesized relation did not really exist.
Rare Event Rule for Inferential Statistics (from class)
Null hypothesis: The hypothesis that states that there is no relation between the
phenomena whose relation is under investigation, or at least not of the form given by the
alternative hypothesis.
Alternative hypothesis: The alternative to the null hypothesis. It states that there is
some kind of relation, and may be one-sided or two-sided, positive or negative.
Two-tailed test: A statistical hypothesis test in which the values for which we can reject
the null hypothesis, H0, are located in both tails of the probability distribution.
Left-tailed test: A hypothesis test where the rejection region is located to the extreme
left of the distribution. A left-tailed test is conducted when the alternative hypothesis
(HA) states that the parameter is less than a given quantity.
Right-tailed test: A hypothesis test where the rejection region is located to the
extreme right of the distribution. A right-tailed test is conducted when the alternative
hypothesis (HA) states that the parameter is greater than a given quantity.
Type I error: The mistake of rejecting the null hypothesis when it is true.
Type II error: The mistake of failing to reject the null hypothesis when it is false.
Level of significance: The probability of rejecting the null hypothesis when it is true.
Typical values of this are 0.05 and 0.01.
Assignment 14:
Statistically significant: A result is called statistically significant if it is unlikely to
have occurred by chance.
P-value: The probability value (p-value) of a statistical hypothesis test is the probability
of getting a value of the test statistic as extreme as or more extreme than that
observed by chance alone, if the null hypothesis, H0, is true.
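As an illustration of the p-value definition, the sketch below assumes the test statistic is a z statistic that follows the standard normal distribution when H0 is true (an assumption; other tests use other distributions). The function name p_value and the example numbers are hypothetical.

```python
from statistics import NormalDist

def p_value(z, tail):
    """P-value for a z statistic under a left-, right-, or two-tailed test."""
    cdf = NormalDist().cdf  # standard normal cumulative distribution
    if tail == "left":
        return cdf(z)                   # area to the left of z
    if tail == "right":
        return 1 - cdf(z)               # area to the right of z
    if tail == "two":
        return 2 * (1 - cdf(abs(z)))    # area in both tails
    raise ValueError("tail must be 'left', 'right', or 'two'")

# Hypothetical example: z = 2.1 in a two-tailed test at the 0.05 level.
alpha = 0.05
p = p_value(2.1, "two")
print(p, "reject H0" if p < alpha else "fail to reject H0")
```

The comparison with alpha in the last line is the usual decision rule: a p-value below the level of significance leads to rejecting H0.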

Assignment 16:
Student's t-Distribution: A probability distribution that arises in the problem of
estimating the mean of a normally distributed population when the sample size is
small. It is the basis of the popular Student's t-tests for the statistical significance of
the difference between two sample means, and for confidence intervals for the
difference between two population means.
How to construct a Confidence Interval for mu with sigma unknown:
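A standard construction is xbar plus or minus t * s / sqrt(n), where s is the sample standard deviation and t is the critical value of the t-distribution with n-1 degrees of freedom. The sketch below assumes the critical value is looked up in a t-table and passed in by hand; the function name t_interval and the data are hypothetical.

```python
import math
import statistics

def t_interval(data, t_crit):
    """Confidence interval for mu when sigma is unknown.

    t_crit is the t critical value for the chosen confidence level
    with len(data) - 1 degrees of freedom, taken from a t-table.
    """
    n = len(data)
    xbar = statistics.mean(data)
    s = statistics.stdev(data)          # sample standard deviation
    margin = t_crit * s / math.sqrt(n)  # margin of error
    return xbar - margin, xbar + margin

# Hypothetical sample of n = 10, so 9 degrees of freedom.
data = [12.1, 11.8, 12.4, 12.0, 11.9, 12.3, 12.2, 11.7, 12.5, 12.1]
lower, upper = t_interval(data, t_crit=2.262)  # table value for 95%, 9 df
print(f"({lower:.3f}, {upper:.3f})")
```

The interval is centered on the sample mean, so the point estimate always lies in the middle of it.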

An explanation of the effect of outliers:


Assignment 19:
Response variable: Also called the dependent variable, it is the observed variable in an
experiment or study whose changes are determined by the presence or degree of one
or more independent variables.
Explanatory variable: A variable which is used in a relationship to explain or to predict
changes in the values of another variable; the latter is called the dependent variable.
Scatter diagram: A useful summary of a set of bivariate data (two variables),
usually drawn before working out a linear correlation coefficient or fitting a regression
line. It gives a good visual picture of the relationship between the two variables, and
aids the interpretation of the correlation coefficient or regression model.
Correlation: An established relationship between two variables, either positive or
negative; that is, an increase or decrease in one variable is in some way related to
an increase or decrease in another variable.
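The linear correlation coefficient mentioned above can be computed directly from its definition, as in this sketch. The function name correlation and the data are hypothetical. The result lies between -1 and 1, with the sign showing whether the relationship is positive or negative.

```python
import math

def correlation(xs, ys):
    """Linear correlation coefficient r for paired data."""
    n = len(xs)
    xbar = sum(xs) / n
    ybar = sum(ys) / n
    # Sums of products and squares of the deviations from the means.
    sxy = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))
    sxx = sum((x - xbar) ** 2 for x in xs)
    syy = sum((y - ybar) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical bivariate data: explanatory variable x, response variable y.
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
print(correlation(x, y))
```

A value close to 1 or -1 indicates that the points in the scatter diagram lie close to a straight line.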
Lurking variable: A variable that has an important effect on the relationship among
the variables in a study but is not included among the variables studied.
Assignment 23:
Definitions for p, p-hat (p with an upside down v over it), x, and n.
Assignment 24:
Chi-square test for homogeneity of proportions: The chi-square test for homogeneity is
a test made to determine whether several populations are similar or equal or
homogeneous in some characteristics.
