Overview
Most items that are studied in a statistics course are statistics (lower case s),
that is, values computed from a sample. If we truly have access to the population
information, such as a complete database, then the answer is often already known.
A sample allows us to make an educated decision without having all the facts.
Inferential Statistics - Uses methods that take a result from a sample, extend it to
the population, and measure the reliability of the result.
Descriptive Statistics
Sometimes the difference between discrete and continuous blurs. An example I
often use in class is to ask an individual how old they are. I may get an answer
such as 20 years old. I tell them that they are lying and often get funny looks. I ask
the class how I know they are lying. Sometimes a student sees that age is
continuous, but we treat it as discrete. A very humorous example comes up when
the statistic is cited that the average number of children per family in the United
States is 2.01. I have heard the question, "How can someone have 0.01 children?"
The person asking is taking it too literally, assuming that because the number of
children is discrete, every measure computed from it must be discrete as well.
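The point about 2.01 children can be checked with a few lines of arithmetic. The sketch below uses made-up family sizes (my own hypothetical numbers, not census data) to show that every individual value is a whole number while the average is not.

```python
from statistics import mean

# Hypothetical counts of children in ten families (made-up data for illustration).
children_per_family = [0, 1, 2, 2, 3, 2, 1, 4, 2, 4]

# Every individual value is a whole number, so the variable itself is discrete.
all_whole = all(isinstance(count, int) for count in children_per_family)

# The mean is a summary measure, and it does not have to be a whole number.
average_children = mean(children_per_family)

print(all_whole)         # True
print(average_children)  # 2.1
```

No family in the list has 2.1 children; the 2.1 describes the group, not any one family.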
Consider ten babies who are raised in a lab. Everything in their entire life is
controlled. If these individuals do poorly in their studies, then the factors that
explain their results can be easily listed. How about ten individuals NOT raised
in a lab? There might be 1000 factors that determine their success or failure. In
mathematics, I have seen administration look for that one magic explanatory
variable that explains why students do not get through developmental
mathematics. Over the last twelve years the student might have had at least ten
different teachers. Could the combination of these instructors have shaped what
the student knows? Could the last ten generations of their family determine their
aptitude in a subject? Could their diet be a factor?
Typically, in a business the focus is put on the variables that most likely explain
the results that are seen. For example, increasing oil prices might be explained
mainly by these variables: market control of prices, competition, and international
politics. In Statistics, we can analyze the connection between any two variables,
but the individual's knowledge of the subject area must be an important
component in choosing which variables to analyze.
Observational studies attempt to measure the value of a variable without
influencing it. This can be difficult to accomplish in many areas. An individual
being studied can change how they act simply because they are being observed.
For example, administration will come in to observe an instructor's classroom
technique. Typically, the instructor will change their technique because of the
observation.
Lurking Variable - An explanatory variable that was not considered in a study, but
that affects the value of the response variable in the study.
Because observational studies cannot control every aspect of the individual, no
direct claim can be made about what causes the result. That does not mean there
is no direct connection between the two variables, but the outcome could instead
be driven by the other two thousand variables that were not considered.
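To see how a lurking variable can produce a connection by itself, here is a small simulation sketch. Everything in it is an assumption made for illustration: the names `lurking`, `x`, and `y`, the sample size, and the use of normally distributed noise. Neither `x` nor `y` affects the other; both depend only on the lurking variable.

```python
import random

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(xs, ys))
    sxx = sum((a - mean_x) ** 2 for a in xs)
    syy = sum((b - mean_y) ** 2 for b in ys)
    return sxy / (sxx * syy) ** 0.5

rng = random.Random(0)  # fixed seed so the sketch is repeatable
n = 2000

# The lurking variable is never recorded by the hypothetical study.
lurking = [rng.gauss(0, 1) for _ in range(n)]

# x and y each depend on the lurking variable plus their own noise.
# There is no direct link from x to y in how the data is generated.
x = [z + rng.gauss(0, 1) for z in lurking]
y = [z + rng.gauss(0, 1) for z in lurking]

# Correlation as the study would see it, with the lurking variable ignored.
r_observed = pearson_r(x, y)

# Correlation after removing the lurking variable's contribution.
x_adjusted = [a - z for a, z in zip(x, lurking)]
y_adjusted = [b - z for b, z in zip(y, lurking)]
r_adjusted = pearson_r(x_adjusted, y_adjusted)

print(round(r_observed, 2), round(r_adjusted, 2))
```

The observed correlation is clearly positive even though `x` and `y` were built independently of each other, and it drops to roughly zero once the lurking variable is accounted for. In a real observational study the lurking variable is not available to subtract out, which is why no causal claim can be made.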
For purposes of this class, all samples are assumed to be simple random samples
drawn from the population. If the sample is not taken correctly, then the results
are invalid and may not represent what is being studied. Even if a problem does
not say so, this requirement is assumed to hold.
Example: Consider if the population size is 1000 and the desired sample size is 30
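The example above gives only the two sizes. Assuming it refers to a simple random sample, which is what this class assumes, one way to draw it might look like the sketch below. Numbering the individuals 1 through 1000 and the seed value are my own assumptions.

```python
import random

population_size = 1000
sample_size = 30

# Assumption: each individual in the population is labeled 1 through 1000.
population_ids = range(1, population_size + 1)

# Fixed seed only so the sketch gives the same sample each run.
rng = random.Random(42)

# random.sample draws without replacement, and every set of 30 labels
# is equally likely to be chosen, which is what "simple random" requires.
sample_ids = sorted(rng.sample(population_ids, sample_size))

print(sample_ids)
```

Because the draw is without replacement, no individual can appear in the sample twice.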
homogeneous - Similar
heterogeneous - Dissimilar
Bias - Occurs when the results of the sample are not representative of the population.
Three sources of Bias in Sampling:
Sampling bias
Nonresponse bias
Response bias
Bias is sometimes easy to see after the fact, but before and during the event it
can be difficult to spot. Sometimes it takes an outside party to ask the questions
that point out what was overlooked.
Sampling Bias - Means that the technique used to obtain the sample's individuals
tends to favor one part of the population over another. It can also result from
undercoverage, which occurs when the proportion of one segment of the
population is lower in a sample than it is in the population.
Response Bias - Exists when the answers on a survey do not reflect the true
feelings of the respondent.
This course is about analyzing the results of a sample, not about the details of
how a study should be performed. If you are interested in that area, a research
class will give detailed information. Some basic definitions are still covered here
to give the student an overview.
Basic Definitions
Placebo - An innocuous medication, such as a sugar tablet, that looks, tastes, and
smells like the experimental medication.
Single-Blind Experiment - One in which the experimental unit (or subject) does
not know which treatment he or she is receiving.
Double-Blind Experiment - One in which neither the experimental unit nor the
experimenter knows what treatment is being administered to the experimental unit.
person before and after a treatment, husband and wife, same geographical
location)
Randomized Block Design - Used when the experimental units are divided into
homogeneous groups called blocks. Within each block, the experimental units are
randomly assigned to treatments.
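The randomized block definition can be sketched in a few lines. The block names, unit labels, treatment names, and the helper `assign_within_blocks` are all hypothetical, chosen only to illustrate the idea of randomizing inside each block rather than across the whole group.

```python
import random

def assign_within_blocks(blocks, treatments, seed=None):
    """Randomly assign treatments to units separately inside each block.

    blocks: dict mapping a block name to a list of experimental units.
    treatments: list of treatment labels.
    Returns a dict mapping each unit to (block name, treatment).
    """
    rng = random.Random(seed)
    assignment = {}
    for block_name, units in blocks.items():
        shuffled = list(units)  # copy so the caller's list is not reordered
        rng.shuffle(shuffled)   # the random order decides who gets what
        for position, unit in enumerate(shuffled):
            # Cycle through the treatments so each block is split evenly.
            treatment = treatments[position % len(treatments)]
            assignment[unit] = (block_name, treatment)
    return assignment

# Hypothetical homogeneous groups (blocks) of experimental units.
blocks = {
    "freshmen": ["F1", "F2", "F3", "F4"],
    "seniors": ["S1", "S2", "S3", "S4"],
}
treatments = ["treatment", "control"]

assignment = assign_within_blocks(blocks, treatments, seed=1)
for unit, (block_name, treatment) in sorted(assignment.items()):
    print(unit, block_name, treatment)
```

Each block ends up with the same number of units in every treatment, so a difference between blocks cannot be mistaken for a difference between treatments.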
It is interesting that many individuals who have taken multiple statistics courses
still do not follow the basic ideas. One experiment done recently at a local
college did not have a control group. The experiment looked at various
anti-bacterial soaps and how much each reduced the number of bacteria. It was
an interesting experiment, but without a control group there was nothing to
compare the results against, so their effectiveness could not be interpreted.
Interestingly enough, one of the products increased the number of bacteria,
which could mean either a very poor product or an experiment that was done
improperly.