Statistics is as old as recorded history.
Thousands of years before Christ, data had already been compiled on
population, births, deaths and taxes. During biblical times, Moses and David
undertook what are now called censuses. In the Middle Ages, governments began
to compile records on land ownership.
Henry VIII ordered the registration of dead people in 1532 in order to keep
abreast of the mounting number of deaths due to the Plague. About this time in
France, the government ordered the registration of marriages aside from
gathering the usual statistics. Today, with the development of probability theory,
we are able to use statistical methods that not only describe important features of
the data but also allow us to proceed beyond the collected data into the
area of decision making through generalizations and predictions.
The word Statistics comes from statista, an Italian word
which means statesman. The word statistics was first used by Gottfried
Achenwall (1719-1772), a professor at Marburg and Göttingen in
Germany. However, its use was popularized in the work of Sir John Sinclair
entitled Statistical Account of Scotland (1791-1799) (Garcia, 2003).
Statistics may be defined as the science that deals with the collection,
organization, presentation, analysis, and interpretation of data in order to be able to
draw judgments or conclusions that help in the decision-making process.
The two parts of this definition correspond to the two main divisions of
Statistics. These are Descriptive Statistics and Inferential Statistics. Descriptive
Statistics, which is referred to in the first part of the definition, deals with the
procedures that organize, summarize and describe quantitative data. It seeks
merely to describe data. Inferential Statistics, implied in the second part of the
definition, deals with making a judgment or a conclusion about a population
based on the findings from a sample that is taken from the population.
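
To make the distinction concrete, the sketch below first summarizes a sample (descriptive) and then uses it to estimate a population mean with an approximate 95% confidence interval (inferential). The scores are hypothetical, and the interval uses the normal approximation with z = 1.96, which is an assumption made here for simplicity and not a method taken from the text.

```python
import math
import statistics

# Hypothetical sample of 10 exam scores drawn from a larger population.
sample = [72, 85, 90, 66, 78, 88, 95, 70, 81, 75]

# Descriptive statistics: summarize the data that were actually collected.
mean = statistics.mean(sample)
sd = statistics.stdev(sample)  # sample standard deviation (n - 1 denominator)

# Inferential statistics: generalize from the sample to the population.
# Approximate 95% confidence interval for the population mean (assumes z = 1.96).
margin = 1.96 * sd / math.sqrt(len(sample))
ci = (mean - margin, mean + margin)

print(f"sample mean = {mean:.1f}, sample sd = {sd:.2f}")
print(f"approx. 95% CI for population mean: ({ci[0]:.1f}, {ci[1]:.1f})")
```

The first two numbers only describe the ten scores at hand; the interval is a statement about the unseen population, and it is only as good as the sampling and the approximation behind it.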
1.2 Classification of Variables
Variables differ in "how well" they can be measured, i.e., in how much
measurable information their measurement scale can provide. There is obviously
some measurement error involved in every measurement, which determines the
"amount of information" that we can obtain. Another factor that determines the
amount of information that can be provided by a variable is its "type of
measurement scale." Specifically, variables are classified as (a) nominal, (b)
ordinal, (c) interval or (d) ratio.
a. Nominal variables allow for only qualitative classification. That is, they
can be measured only in terms of whether the individual items belong to
some distinctively different categories, but we cannot quantify or even
rank order those categories. For example, all we can say is that two
individuals are different in terms of variable A (e.g., they are of different
race), but we cannot say which one "has more" of the quality represented
by the variable. Typical examples of nominal variables are gender, race,
color, city, etc.
b. Ordinal variables allow us to rank order the items we measure in terms of
which has less and which has more of the quality represented by the
variable, but still they do not allow us to say "how much more." A typical
example of an ordinal variable is the socioeconomic status of families. For
example, we know that upper-middle is higher than middle, but we cannot
say, for example, by what percentage it is higher. Also, this very distinction between
nominal, ordinal, and interval scales itself represents a good example of
an ordinal variable. For example, we can say that nominal measurement
provides less information than ordinal measurement, but we cannot say
"how much less" or how this difference compares to the difference
between ordinal and interval scales.
c. Interval variables allow us not only to rank order the items that are
measured, but also to quantify and compare the sizes of differences
between them. For example, temperature, as measured in degrees
Fahrenheit or Celsius, constitutes an interval scale. We can say that a
temperature of 40 degrees is higher than a temperature of 30 degrees by
10 degrees.
d. Ratio variables are very similar to interval variables; in addition to all the
properties of interval variables, they feature an identifiable absolute zero
point, thus they allow for statements such as x is two times more than y.
Typical examples of ratio scales are measures of time or space. For
example, as the Kelvin temperature scale is a ratio scale, not only can we
say that a temperature of 200 degrees is higher than one of 100 degrees,
we can correctly state that it is twice as high. Interval scales do not have
the ratio property. Most statistical data analysis procedures do not
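
The difference between interval and ratio scales can be checked numerically. The sketch below uses two hypothetical temperature readings: differences are meaningful on the Celsius scale, but a statement such as "twice as high" only holds on the Kelvin scale, which has an absolute zero.

```python
# Hypothetical temperature readings in degrees Celsius (an interval scale).
t1_c, t2_c = 20.0, 40.0

def to_kelvin(celsius):
    """Convert Celsius to Kelvin (a ratio scale with an absolute zero)."""
    return celsius + 273.15

t1_k, t2_k = to_kelvin(t1_c), to_kelvin(t2_c)

# Differences are the same on both scales, so comparing differences is valid.
diff_c = t2_c - t1_c
diff_k = t2_k - t1_k

# Ratios disagree: 40 degrees C is not "twice as hot" as 20 degrees C.
ratio_c = t2_c / t1_c   # ratio of the Celsius numbers (not physically meaningful)
ratio_k = t2_k / t1_k   # ratio on the Kelvin scale (meaningful)

print(f"difference: {diff_c:.1f} C, {diff_k:.1f} K")
print(f"ratio of Celsius values: {ratio_c:.2f}, ratio of Kelvin values: {ratio_k:.2f}")
```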
2 Steps in Conducting a Statistical Investigation/Inquiry
2.4 Presentation of Data
Once we have organized the data, we need to present the data in a form
that will be easy to read, understand and analyze. Most often this will be
accomplished by using a graph such as a column graph, bar graph, pie chart, dot
plot or line chart. The particular type of graph to be used will depend on the
purpose of the investigation.
2.5 Analysis of Data
Analysis refers to the application of appropriate statistical methods to the obtained
data.
2.6 Interpretation of Data
After the presentation and analysis of the data, it is time to examine and
interpret the data, to decide on what it means and to ultimately draw conclusions
from it. This may involve identifying trends and patterns from the graph, and
identifying how those trends and patterns change over time or across categories.
2.7 Inferences/Modeling
The trends and patterns identified in interpreting the data will lead to
drawing conclusions and inferences. Inferences are assumptions or conclusions
that are rationally and logically made, based on given facts or circumstances. A
statistical model can also be developed. A statistical model is a class of
mathematical model, which embodies a set of assumptions concerning the
generation of some sample data, and similar data from a larger population. In
simple terms, statistical modeling is a simplified, mathematically formalized way
to approximate reality (i.e., what generates your data) and optionally to make
predictions from this approximation. The statistical model is the mathematical
equation that is used.
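
As an illustration of a statistical model expressed as an equation, the sketch below fits a straight line y = a + b*x by ordinary least squares and uses it to make a prediction. The data (hours studied versus exam score) are hypothetical, and the linear form is only one possible choice of model, picked here as an assumption for the example.

```python
# Hypothetical data: hours studied (x) and exam score (y).
hours = [1, 2, 3, 4, 5]
scores = [52, 58, 61, 68, 71]

def fit_line(x, y):
    """Ordinary least-squares fit of y = a + b*x; returns (a, b)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    b = sxy / sxx            # slope
    a = mean_y - b * mean_x  # intercept
    return a, b

a, b = fit_line(hours, scores)

def predict(x):
    """Prediction from the fitted model equation."""
    return a + b * x

print(f"model: score = {a:.2f} + {b:.2f} * hours")
print(f"predicted score for 6 hours: {predict(6):.1f}")
```

The fitted equation is the simplified approximation of reality; predicting for 6 hours goes beyond the observed data and inherits whatever the linear assumption gets wrong.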
3 Presentation of Data
After data collection, the next step is to present these data using the best
method. Data should be properly presented so that the viewer can easily see and
identify the significant characteristics of the data.
Data can be presented in textual form, tabular form or graphical form. Data
can also be presented as ungrouped or grouped data. Ungrouped data are not
organized into classes; they are simply arranged from lowest to highest or highest to lowest.
Grouped data, on the other hand, are organized and arranged into different
classes.
3.1 Textual Form
This form is usually used for ungrouped data. It involves enumerating the
important characteristics of the data, which are oftentimes arranged in increasing
or decreasing order, and giving emphasis to significant figures and identifying
special features of the data.
Example 1:
A university gives a standardized entrance exam consisting of 100
questions to 30 entering freshmen. The academic ability scores of the students are
the data collected by the administration to determine placement.
Another way to present data is by means of a stem-and-leaf plot. This is one
way of sorting out the raw data using a pattern. It involves separating a number
into two parts. In a single-digit number, the stem is zero. In a two-digit number,
the stem refers to the first digit and the leaf is the second digit. In a three-digit
number, the stem consists of the first two digits and the leaf consists of the last
digit.
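
The splitting rule described above can be sketched in a few lines. The scores below are hypothetical (they are not the data from Example 1), and the function name `stem_and_leaf` is chosen here only for illustration.

```python
from collections import defaultdict

def stem_and_leaf(values):
    """Group non-negative integers by stem (all digits but the last) and leaf (last digit)."""
    plot = defaultdict(list)
    for v in sorted(values):
        stem, leaf = divmod(v, 10)  # e.g. 7 -> (0, 7), 58 -> (5, 8), 100 -> (10, 0)
        plot[stem].append(leaf)
    return dict(plot)

# Hypothetical scores out of 100.
scores = [7, 45, 52, 58, 61, 61, 73, 89, 100]
plot = stem_and_leaf(scores)

# Print every stem between the smallest and largest, including empty ones.
for stem in range(min(plot), max(plot) + 1):
    leaves = " ".join(str(leaf) for leaf in plot.get(stem, []))
    print(f"{stem:>2} | {leaves}")
```

Printing the empty stems as well keeps the shape of the distribution visible, which is the main reason to use this plot instead of a plain sorted list.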
a. Present the data in Example 1 using a stem-and-leaf plot.
b. Based on the plot, how many scores are below 50%? Above 50%?
3.2 Tabular Form
Another way of presenting data is by the use of a table. It is easier to compare data
in tabular form. Tables show better and more complete information regarding
the data. Tables should include the following important parts:
Table number
Table title
Column header
Row classifier
Body
Source note
This is called a frequency table. It shows the data arranged in different classes or
categories and the corresponding frequency in each class or category.
Table 1 is an example of ungrouped data.
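
A frequency table of the kind described above can be built by counting. The sketch below uses hypothetical quiz scores (not the data of Table 1) and produces both an ungrouped table, with one row per distinct value, and a grouped table, with one row per class; the helper `grouped_frequencies` and the class width of 2 are assumptions made for the example.

```python
from collections import Counter

# Hypothetical quiz scores (not the data from Table 1).
scores = [3, 5, 4, 5, 2, 4, 5, 3, 4, 5]

# Ungrouped frequency table: one row per distinct value.
ungrouped = Counter(scores)

def grouped_frequencies(values, lower, width, n_classes):
    """Count values falling into consecutive classes [low, low + width - 1]."""
    table = {}
    for i in range(n_classes):
        low = lower + i * width
        high = low + width - 1
        table[(low, high)] = sum(low <= v <= high for v in values)
    return table

# Grouped frequency table with class width 2: classes 2-3 and 4-5.
grouped = grouped_frequencies(scores, lower=2, width=2, n_classes=2)

print("Score  Frequency")
for value in sorted(ungrouped):
    print(f"{value:>5}  {ungrouped[value]:>9}")

print("Class  Frequency")
for (low, high), freq in grouped.items():
    print(f"{low}-{high:<3}  {freq:>9}")
```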
Independent Sampling
Independent samples are those samples selected from the same population,
or different populations, which have no effect on one another. That is, no correlation
exists between the samples.
There are two ways of selecting a sample: non-probability
sampling and probability sampling.
For instance, in proportional quota sampling, if you know that the population has
40% women and 60% men, and that you want a total sample size of
100, you will continue sampling until you get those percentages and then you
stop. Non-proportional quota sampling is a bit less restrictive. In this method, the
researcher is not concerned
with having numbers that match the proportions in the population.
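
The stopping rule of proportional quota sampling can be sketched as follows. The respondent stream is hypothetical and generated at random only so the example runs; the function name `quota_sample` and the record layout are likewise assumptions. Respondents are taken in arrival order, which is what makes this a non-probability method.

```python
import random

def quota_sample(stream, quotas, key):
    """Take respondents in arrival order until every category's quota is filled."""
    counts = {category: 0 for category in quotas}
    sample = []
    for person in stream:
        category = key(person)
        # Accept the respondent only if their category still has room.
        if counts.get(category, 0) < quotas.get(category, 0):
            sample.append(person)
            counts[category] += 1
        if counts == quotas:
            break  # all quotas met: stop sampling
    return sample

# Hypothetical stream of respondents; the seed only makes the run repeatable.
rng = random.Random(0)
stream = [{"id": i, "sex": rng.choice(["woman", "man"])} for i in range(1000)]

# Proportional quotas for a sample of 100 from a 40% / 60% population.
quotas = {"woman": 40, "man": 60}
sample = quota_sample(stream, quotas, key=lambda p: p["sex"])

n_women = sum(p["sex"] == "woman" for p in sample)
print(len(sample), n_women, len(sample) - n_women)
```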
Stratified Sampling
There may often be factors which divide up the population into sub-populations
(groups / strata) and the measurement of interest may vary among the different
sub-populations. This has to be accounted for when a sample from the population is
selected in order to obtain a sample that is representative of the population. This is
achieved by stratified sampling.
A stratified sample is obtained by taking samples from each stratum or
sub-group of a population. When a sample is to be taken from a population with several
strata, the proportion of each stratum in the sample should be the same as in the
population.
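
Proportional allocation, as described above, can be sketched as follows: each stratum contributes to the sample in the same proportion as it appears in the population, and units within a stratum are drawn by simple random sampling. The population, the stratum names, and the function `proportional_stratified_sample` are hypothetical.

```python
import random

def proportional_stratified_sample(strata, sample_size, seed=None):
    """Draw a simple random sample from each stratum, sized in proportion to the stratum."""
    rng = random.Random(seed)
    total = sum(len(members) for members in strata.values())
    sample = {}
    for name, members in strata.items():
        # Rounded proportional allocation; rounding may shift the total by a unit or two.
        n = round(sample_size * len(members) / total)
        sample[name] = rng.sample(members, n)
    return sample

# Hypothetical population of 1,000 students split into three year levels.
strata = {
    "first_year": list(range(0, 500)),
    "second_year": list(range(500, 800)),
    "third_year": list(range(800, 1000)),
}

sample = proportional_stratified_sample(strata, sample_size=50, seed=1)
for name, members in sample.items():
    print(name, len(members))
```

With strata of 50%, 30% and 20% of the population, a sample of 50 is split 25, 15 and 10.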
Stratified sampling techniques are generally used when the population is
heterogeneous, or dissimilar, where certain homogeneous, or similar, sub-populations
can be isolated (strata). Simple random sampling is most appropriate when the entire
population from which the sample is taken is homogeneous. Some reasons for using
stratified sampling over simple random sampling are:
a. the cost per observation in the survey may be reduced;
b. estimates of the population parameters may be wanted for each
sub-population;
c. increased accuracy at a given cost.
Cluster Sampling