Probab

1. Basic Concepts
Statistics is as old as recorded history. Thousands of years before Christ,
data had already been compiled on population, births, deaths and taxes. During
biblical times, Moses and David undertook what are now called censuses. In the
middle ages, governments began to compile records on land ownership.
Henry VIII ordered the registration of dead people in 1532 in order to keep
abreast of the mounting number of deaths due to the Plague. About this time in
France, the government ordered the registration of marriages aside from
gathering the usual statistics. Today, with the development of probability
theory, we are able to use statistical methods that not only describe important
features of the data but also allow us to proceed beyond the collected data into
the area of decision making through generalizations and predictions.
The word Statistics comes from statista, an Italian word which means
statesman. The word statistics was first used by Gottfried Achenwall
(1719-1772), a professor at Marburg and Göttingen in Germany. However, its use
was popularized in the work of Sir John Sinclair entitled Statistical Account
of Scotland (1791-1799) (Garcia, 2003).
Statistics may be defined as the science that deals with the collection,
organization, presentation, analysis, and interpretation of data in order to be
able to draw judgments or conclusions that help in the decision-making process.
The two parts of this definition correspond to the two main divisions of
Statistics. These are Descriptive Statistics and Inferential Statistics. Descriptive
Statistics, which is referred to in the first part of the definition, deals with the
procedures that organize, summarize and describe quantitative data. It seeks
merely to describe data. Inferential Statistics, implied in the second part of the
definition, deals with making a judgment or a conclusion about a population
based on the findings from a sample that is taken from the population.
1.1 Definition of Statistical Terms

Population or Universe refers to the totality of objects, persons, places or things
used in a particular study: all members of a particular group of objects (items) or
people (individuals), etc., which are the subjects or respondents of a study.
Sample is any subset of a population, or a few members of a population.
Data are facts, figures and information collected on some characteristics of a
population or sample. These can be classified as qualitative or quantitative data.
Ungrouped (or raw) data are data which are not organized in any specific way.
They are simply the collection of data as they are gathered.
Grouped data are raw data organized into groups or categories with
corresponding frequencies. Organized in this manner, the data are referred to as
a frequency distribution.
Parameter is a descriptive measure of a characteristic of a population.
Statistic is a measure of a characteristic of a sample.
Constant is a characteristic or property of a population or sample which is
common to all members of the group.
Variable is a measure, characteristic or property of a population or sample
that may have a number of different values. It differentiates a particular member
from the rest of the group. It is the characteristic or property that is measured,
controlled, or manipulated in research. Variables differ in many respects, most
notably in the role they are given in the research and in the type of measures that
can be applied to them.
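The difference between a parameter and a statistic can be sketched in a few lines of Python. This is only an illustration: the population of ten heights, the sample size of 4 and the fixed seed are all assumptions made for the example, not data from the text.

```python
import random
from statistics import mean

# Hypothetical population: heights (cm) of all 10 members of a small club.
population = [150.2, 152.4, 155.0, 158.1, 160.3,
              161.7, 165.5, 168.0, 170.2, 172.6]

# Parameter: a descriptive measure computed from the whole population.
population_mean = mean(population)

# Statistic: the same measure computed from a sample (a subset of the population).
rng = random.Random(0)              # fixed seed so the sketch is repeatable
sample = rng.sample(population, 4)
sample_mean = mean(sample)

print(f"parameter (population mean): {population_mean:.2f}")
print(f"statistic (sample mean):     {sample_mean:.2f}")
```

The parameter stays fixed for a given population, while the statistic would generally change if a different sample were drawn.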
1.2 Classification of Variables
Variables can be classified according to functional relationship,
continuity of values and scale of measurement.
1.2.1 Classification of Variables according to Functional Relationship
As far as functional relationship is concerned, variables can be either
independent or dependent. Independent variables are those that are manipulated,
whereas dependent variables are only measured or registered. For example, when
variable Y is dependent on variable X, any change in X will have an effect on Y.
X is the predictor or causal variable and Y is the presumed effect or response to
the change in X.
1.2.2 Classification of Variables according to Continuity of Values
Variables are classified as continuous or discrete according to continuity of
values. A continuous variable is a characteristic that can assume any value on
the measurement scale employed and can take the form of a decimal.
Examples are weight (55.0 kg), height (152.4 cm), temperature (37.5 °C), etc. It
answers the question "how much?"
A discrete (discontinuous) variable is a characteristic which can only assume
designated values. It can take for its values only whole numbers such as 1, 2,
3, 4, 5, etc., or values that come from equal-sized intervals such as the
numerical passing grades given by professors: 1.00, 1.25, 1.50, 1.75, 2.00,
2.25, 2.50, 2.75 and 3.00 (Panopio, 2004). It answers the question "how many?"
1.2.3 Classification of Variables according to Scale of Measurement
Variables differ in "how well" they can be measured, i.e., in how much
measurable information their measurement scale can provide. There is obviously
some measurement error involved in every measurement, which determines the
"amount of information" that we can obtain. Another factor that determines the
amount of information that can be provided by a variable is its "type of
measurement scale." Specifically, variables are classified as (a) nominal, (b)
ordinal, (c) interval or (d) ratio.
a. Nominal variables allow for only qualitative classification. That is, they
can be measured only in terms of whether the individual items belong to
some distinctively different categories, but we cannot quantify or even
rank order those categories. For example, all we can say is that 2
individuals are different in terms of variable A (e.g., they are of different
race), but we cannot say which one "has more" of the quality represented
by the variable. Typical examples of nominal variables are gender, race,
color, city, etc.
b. Ordinal variables allow us to rank order the items we measure in terms of
which has less and which has more of the quality represented by the
variable, but still they do not allow us to say "how much more." A typical
example of an ordinal variable is the socioeconomic status of families. For
instance, we know that upper-middle is higher than middle but we cannot
say, for example, by what percentage it is higher. Also, this very distinction
between nominal, ordinal, and interval scales itself represents a good example of
an ordinal variable. For example, we can say that nominal measurement
provides less information than ordinal measurement, but we cannot say
"how much less" or how this difference compares to the difference
between ordinal and interval scales.
c. Interval variables allow us not only to rank order the items that are
measured, but also to quantify and compare the sizes of differences
between them. For example, temperature, as measured in degrees
Fahrenheit or Celsius, constitutes an interval scale. We can say that a
temperature of 40 degrees is higher than a temperature of 30 degrees, by
10 degrees.
d. Ratio variables are very similar to interval variables; in addition to all the
properties of interval variables, they feature an identifiable absolute zero
point, thus they allow for statements such as x is two times more than y.
Typical examples of ratio scales are measures of time or space. For
example, as the Kelvin temperature scale is a ratio scale, not only can we
say that a temperature of 200 degrees is higher than one of 100 degrees,
we can correctly state that it is twice as high. Interval scales do not have
the ratio property. Most statistical data analysis procedures do not
distinguish between the interval and ratio properties of the measurement scales.
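The Kelvin example can be checked with a short calculation. The sketch below converts the two temperatures from the text to Celsius (an interval scale) and compares differences and ratios on both scales; the function and variable names are illustrative only.

```python
def kelvin_to_celsius(kelvin: float) -> float:
    """Convert a Kelvin temperature to degrees Celsius."""
    return kelvin - 273.15

hot_k, cold_k = 200.0, 100.0
hot_c, cold_c = kelvin_to_celsius(hot_k), kelvin_to_celsius(cold_k)

# Differences are meaningful on both scales (interval property).
diff_k = hot_k - cold_k      # 100 kelvins
diff_c = hot_c - cold_c      # 100 Celsius degrees (up to rounding)

# Ratios are meaningful only on the ratio scale (Kelvin has an absolute zero).
ratio_k = hot_k / cold_k     # 2.0: "twice as high" is a valid statement
ratio_c = hot_c / cold_c     # about 0.42: not a meaningful "times as hot" figure

print(diff_k, round(diff_c, 2), ratio_k, round(ratio_c, 2))
```

The same two temperatures give a ratio of 2 in Kelvin but about 0.42 in Celsius, which is why "twice as high" is only a valid statement on the ratio scale.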
2 Steps in Conducting a Statistical Investigation/Inquiry
A statistical inquiry is a process of transforming raw data into useful
information that can tell us more about a subject and allow us to make
recommendations and possibly make predictions of future outcomes. A proper
system is essential for conducting a statistical investigation/inquiry. Careful
planning is necessary to get the best results at the least cost and time. The
following points can be considered during the planning stage:
- Objective of the inquiry should be fully known
- Scope of the inquiry should be determined
- Nature of information to be collected should be decided
- Unit of data collection should be defined
- Source of data collection should be determined
- Method of data collection should be decided beforehand
- Choice of frame should be made
- Reasonable standard of accuracy should be fixed
A statistical inquiry can be conducted by following these step-by-step procedures:
1. Collection of data
2. Organization
3. Processing
4. Presentation
5. Analysis of data
6. Interpretation of data
7. Inferences/modeling
2.1 Collection of Data
Collection refers to data gathering. This involves acquiring
information through questionnaires, interviews, experimentation, tests or
examinations and other forms of data gathering instruments. The person who
conducts the inquiry is an investigator, the one who helps in collecting
information is an enumerator, and information is collected from a respondent.
Data can be primary or secondary. According to Wessel, "Data collected in the
process of investigation are known as primary data." These are collected for the
investigator's use from the primary source. Secondary data, on the other hand, are
collected by some other organization for their own use, but the investigator also
gets them for his use. According to M.M. Blair, "Secondary data are those already in
existence for some other purpose than answering of the question in hand."
2.2 Organization of Data
This refers to the arrangement of the data collected into a form that gives
structure and order to the data. A common way of accomplishing this is to use a
frequency table. The way the data will be organized will vary as a function of the
nature of the statistical investigation.
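As a minimal sketch of organizing raw data into a frequency table, the following Python snippet counts how often each category occurs. The twelve responses are hypothetical and stand in for whatever data an investigation actually collects.

```python
from collections import Counter

# Hypothetical raw (ungrouped) responses: year level of 12 surveyed students.
raw_data = ["1st", "2nd", "1st", "3rd", "2nd", "1st",
            "4th", "2nd", "1st", "3rd", "2nd", "1st"]

def frequency_table(values):
    """Return (category, frequency) pairs sorted by category."""
    counts = Counter(values)
    return sorted(counts.items())

table = frequency_table(raw_data)
for category, frequency in table:
    print(f"{category:<5}{frequency:>3}")
```

Sorting by category is one possible ordering; a table could equally be ordered by frequency, depending on the purpose of the investigation.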
2.3 Processing of Data
Data processing involves translating the answers on a questionnaire into a
form that can be manipulated to produce statistics. In general, this involves
coding, editing, data entry, and monitoring the whole data processing procedure.
The main aim of checking the various stages of data processing is to produce a
file of data that is as error free as possible. Adopting a methodical and consistent
approach to each processing task is important to ensure that the processing is
completed satisfactorily and on schedule.
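Coding and editing can be illustrated with a small sketch. The code book, the five-point answer scale and the sample answers are assumptions made for this example; a real survey would define its own codes and edit checks.

```python
# Hypothetical code book: maps questionnaire answers to numeric codes.
CODE_BOOK = {"strongly disagree": 1, "disagree": 2, "neutral": 3,
             "agree": 4, "strongly agree": 5}

def code_responses(answers):
    """Code each answer; collect answers that fail the edit check."""
    coded, rejected = [], []
    for position, answer in enumerate(answers):
        key = answer.strip().lower()          # editing: normalise case and spacing
        if key in CODE_BOOK:
            coded.append(CODE_BOOK[key])
        else:
            rejected.append((position, answer))   # flag for manual review
    return coded, rejected

coded, rejected = code_responses(["Agree", " neutral", "agre", "Strongly Agree"])
print(coded)      # numeric codes ready for data entry
print(rejected)   # answers that could not be coded
```

Keeping the rejected answers with their positions, rather than silently dropping them, supports the aim of producing a data file that is as error free as possible.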
2.4 Presentation of Data
Once we have organized the data, we need to present the data in a form
that will be easy to read, understand and analyze. Most often this will be
accomplished by using a graph such as a column graph, bar graph, pie chart, dot
plot or line chart. The particular type of graph to be used will depend on the
purpose of the investigation.
2.5 Analysis of Data
Analysis refers to the application of appropriate statistical methods to the
obtained data.
2.6 Interpretation of Data
After the presentation and analysis of the data, it is time to examine and
interpret the data, to decide on what it means and to ultimately draw conclusions
from it. This may involve identifying trends and patterns from the graph, and
identifying how those trends and patterns change over time or across categories.
2.7 Inferences/Modeling
The trends and patterns identified in interpreting the data will lead to
drawing conclusions and inferences. Inferences are assumptions or conclusions
that are rationally and logically made, based on given facts or circumstances. A
statistical model can also be developed. A statistical model is a class of
mathematical model, which embodies a set of assumptions concerning the
generation of some sample data, and similar data from a larger population. In
simple terms, statistical modeling is a simplified, mathematically-formalized way
to approximate reality (i.e., what generates your data) and optionally to make
predictions from this approximation. The statistical model is the mathematical
equation that is used.
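One of the simplest statistical models is a straight line fitted by ordinary least squares. The sketch below is only an illustration of "an equation used to approximate the data and make predictions"; the text does not prescribe this particular model, and the study-hours data are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least-squares estimates (intercept, slope) for y = a + b*x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

# Hypothetical data: hours studied (X) and exam score (Y).
hours = [1, 2, 3, 4, 5]
scores = [52, 55, 61, 64, 68]

a, b = fit_line(hours, scores)
predicted = a + b * 6      # prediction for 6 hours, beyond the observed data
print(f"score = {a:.1f} + {b:.1f} * hours; predicted at 6 hours: {predicted:.1f}")
```

Here the fitted equation is the model, and using it for 6 hours is a prediction that goes beyond the collected data, so it rests on the assumption that the linear pattern continues.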
3 Presentation of Data
After data collection, the next step is to present these data using the best
method. Data should be properly presented so that the viewer can easily see and
identify the significant characteristics of the data.
Data can be presented in textual form, tabular form or graphical form. Data
can also be presented as ungrouped or grouped data. Ungrouped data are not
organized into classes; at most they are arranged from lowest to highest or
highest to lowest. Grouped data, on the other hand, are organized and arranged
into different classes.
3.1 Textual Form
This form is usually used for ungrouped data. It involves enumerating the
important characteristics of the data, which are oftentimes arranged in increasing
or decreasing order, giving emphasis to significant figures and identifying
special features of the data.
Example 1:
A university gives a standardized entrance exam consisting of 100
questions to 30 entering freshmen. The academic ability scores of the students are
the data collected by the administration to determine placement.
Another way to present data is by means of a stem-and-leaf plot. This is one
way of sorting out the raw data using a pattern. It involves separating a number
into two parts. In a single-digit number, the stem is zero. In a two-digit number,
the stem refers to the first digit and the leaf is the second digit. In a three-digit
number, the stem consists of the first two digits and the leaf consists of the last
digit.
1. Present the data in Example 1 using a stem-and-leaf plot.
2. Based on the plot, how many scores are below 50%? Above 50%?
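The stem-and-leaf rule described above can be sketched in Python. The scores used here are hypothetical and chosen only to show one-, two- and three-digit cases; they are not the 30 scores of Example 1, which are not reproduced in the text. The sketch assumes non-negative whole-number scores.

```python
from collections import defaultdict

def stem_and_leaf(scores):
    """Map each stem (all digits but the last) to its sorted leaves (last digit)."""
    plot = defaultdict(list)
    for score in sorted(scores):
        stem, leaf = divmod(score, 10)
        plot[stem].append(leaf)
    return dict(plot)

# Hypothetical scores only; the 30 scores of Example 1 are not reproduced here.
scores = [7, 34, 38, 45, 47, 52, 52, 59, 61, 68, 73, 100]

plot = stem_and_leaf(scores)
for stem in sorted(plot):
    leaves = " ".join(str(leaf) for leaf in plot[stem])
    print(f"{stem:>2} | {leaves}")

below_half = sum(1 for s in scores if s < 50)   # scores below 50 out of 100
```

Dividing by 10 gives the stem as the quotient and the leaf as the remainder, so a single-digit score gets stem 0 and a score of 100 gets stem 10, as the rule states.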
3.2 Tabular Form
Another way of presenting data is by the use of a table. It is easier to compare data
in tabular form. Tables show better and more complete information regarding
the data. Tables should include the following important parts:
- Table number
- Table title
- Column header
- Row classifier
- Body
- Source note
A table that shows the data arranged in different classes or categories, with the
corresponding frequency in each class or category, is called a frequency table.
Table 1 is an example of ungrouped data.
3.2.1 Frequency Distribution

Frequency distribution - a tabulation of data showing the number of times a
score appears.
Frequency - the number of times a score or a group of scores (class) occurs in a
population or sample.
Class or category - a group of scores in a population or sample.
Class width or class size (c) - the difference between any two consecutive
lower limits.
Class mark or class midpoint - halfway between the upper limit and lower limit
of a class.
Upper class boundary - halfway between the upper limit of one class and the
lower limit of the next class (the exact limit). The upper class boundary of the
next class can also be obtained by adding the class width to the upper class
boundary.
Lower class boundary - equal to the upper class boundary of the preceding
class. The lower class boundary of the next class can also be obtained by adding
the class width to the lower class boundary.
Relative frequency - the frequency of one score or group of scores divided by
the total frequency of all the observations.
Cumulative frequency - the frequency of any class plus the frequencies of all
preceding classes in a distribution.
Frequency Distribution Table for Interval Data
There is no hard and fast rule in organizing or constructing a frequency
distribution table. However, we should be guided by the number of elements of the
gathered data. The number of class intervals in a given situation depends on the
nature, magnitude and range of the data. Here are some suggestions:
- The number of class intervals should be between 5 and 20.
- Select the class width to be an odd positive integer such as 3, 5, 7, 9, and
  so on.
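The terms defined in this section can be tied together in one short sketch that builds a grouped frequency distribution. It assumes whole-number scores (so class boundaries lie 0.5 beyond the class limits) and uses hypothetical data, a class width of 5 and five classes; none of these values come from the text.

```python
def grouped_frequency_distribution(data, lowest_limit, width, n_classes):
    """Build a frequency distribution for integer data with equal class widths."""
    total = len(data)
    rows, cumulative = [], 0
    for k in range(n_classes):
        lower = lowest_limit + k * width          # lower class limit
        upper = lower + width - 1                 # upper class limit
        freq = sum(1 for x in data if lower <= x <= upper)
        cumulative += freq
        rows.append({
            "limits": (lower, upper),
            "boundaries": (lower - 0.5, upper + 0.5),
            "class_mark": (lower + upper) / 2,
            "frequency": freq,
            "relative_frequency": freq / total,
            "cumulative_frequency": cumulative,
        })
    return rows

# Hypothetical integer scores; class width 5 (an odd integer), 5 classes.
scores = [21, 23, 24, 26, 27, 27, 29, 31, 32, 33,
          33, 34, 36, 38, 39, 41, 42, 44, 44, 45]
table = grouped_frequency_distribution(scores, lowest_limit=21, width=5, n_classes=5)
for row in table:
    print(row)
```

With an odd class width and whole-number limits, every class mark is itself a whole number (23, 28, 33, ...), which is one practical reason for the suggestion to choose an odd width.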
3.3 Graphical Form
Nominal, ordinal and interval data can also be presented graphically in the form
of a pie chart, bar graph, line graph, histogram or frequency polygon. Some people
prefer this kind of presentation of data to the textual and tabular forms because
the data are easier to comprehend, compare and interpret.
Graph - a pictorial representation of a set of data.
Pie chart - also termed a circle graph, which shows the proportion of each class
using relative frequency.
Bar graph - a method of presenting the data using vertical or horizontal bars or
rectangles with class marks on the x-axis and frequencies on the y-axis.
Histogram - a graph that shows the frequencies of scores by the height of the
bars, with the upper boundary of a class coinciding with the lower boundary of
the next class.
Frequency polygon - a graph on which the frequencies of classes are plotted at the
class marks and the plotted points are connected by straight lines. It is a closed
figure and is normally drawn after a histogram is constructed.
Ogive - a line graph made up of connected line segments whose endpoints are the
class boundaries and whose heights at the endpoints are the cumulative
frequencies. Less than cumulative frequencies and greater than cumulative
frequencies can be presented using an ogive.
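The points that make up an ogive can be computed before any drawing is done. The sketch below returns the (class boundary, cumulative frequency) pairs for both the less-than and the greater-than ogive; the boundaries and frequencies are hypothetical, and plotting is left to whatever graphing tool is at hand.

```python
from itertools import accumulate

# Hypothetical grouped data: class boundaries and class frequencies.
first_lower_boundary = 20.5
upper_boundaries = [25.5, 30.5, 35.5, 40.5, 45.5]
frequencies = [3, 4, 5, 3, 5]

def less_than_ogive(first_lower, uppers, freqs):
    """Return (boundary, cumulative frequency) points of a less-than ogive."""
    points = [(first_lower, 0)]               # nothing lies below the first boundary
    points += list(zip(uppers, accumulate(freqs)))
    return points

def greater_than_ogive(first_lower, uppers, freqs):
    """Return (boundary, cumulative frequency) points of a greater-than ogive."""
    total = sum(freqs)
    lowers = [first_lower] + uppers[:-1]
    remaining = [total - c for c in accumulate([0] + freqs[:-1])]
    return list(zip(lowers, remaining)) + [(uppers[-1], 0)]

less = less_than_ogive(first_lower_boundary, upper_boundaries, frequencies)
greater = greater_than_ogive(first_lower_boundary, upper_boundaries, frequencies)
print(less)
print(greater)
```

The less-than ogive rises from 0 to the total frequency, while the greater-than ogive falls from the total frequency to 0 over the same class boundaries.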
4 Sampling Techniques
Sampling is the process of selecting units (e.g. people, organizations) from a
population of interest. A sample must be representative of the target population.
The target population is the entire group a researcher is interested in; the group
about which the researcher wishes to draw conclusions.
Independent Sampling
Independent samples are those samples selected from the same population,
or different populations, which have no effect on one another. That is, no correlation
exists between the samples.
There are two ways of selecting a sample. These are non-probability
sampling and probability sampling.
4.1 Non-Probability Sampling
Non-probability sampling is also called judgment or subjective sampling. This
method is convenient and economical but the inferences made based on the findings
are not so reliable. The most common types of non-probability sampling are
convenience sampling, purposive sampling and quota sampling.
In convenience sampling, the researcher uses a device in obtaining the
information from the respondents which favors the researcher but can cause bias to
the respondents.
In purposive sampling, the selection of respondents is predetermined
according to the characteristic of interest set by the researcher. Randomization is
absent in this type of sampling.
There are two types of quota sampling: proportional and non-proportional.
In proportional quota sampling, the major characteristics of the population are
represented by sampling a proportional amount of each. For instance, if you know the
population has 40% women and 60% men, and that you want a total sample size of
100, you will continue sampling until you get those percentages and then you
stop. Non-proportional quota sampling is a bit less restrictive. In this method, a
minimum number of sampled units is specified for each category, and the
researcher is not concerned with having numbers that match the proportions in
the population.
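The proportional quota example (40% women, 60% men, sample of 100) can be sketched as a loop that accepts respondents in the order they become available until each quota is filled. The stream of respondents is hypothetical, and the function name is only illustrative.

```python
def proportional_quota_sample(stream, quotas):
    """Take respondents in arrival order until every category quota is filled."""
    remaining = dict(quotas)
    sample = []
    for person, category in stream:
        if remaining.get(category, 0) > 0:
            sample.append((person, category))
            remaining[category] -= 1
        if not any(remaining.values()):
            break                      # all quotas met: stop sampling
    return sample

# Quotas from the text: 40% women and 60% men in a sample of 100.
quotas = {"woman": 40, "man": 60}

# Hypothetical stream of available respondents (not randomly selected).
stream = [(i, "woman" if i % 3 == 0 else "man") for i in range(1000)]
sample = proportional_quota_sample(stream, quotas)
print(len(sample))
```

Because respondents are taken as they come rather than drawn at random, the sample matches the population proportions but remains a non-probability sample.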
4.2 Probability Sampling
In probability sampling, every member of the population has a known chance
of being selected as part of the sample. There are several probability
techniques. Among these are simple random sampling, stratified sampling and cluster
sampling.
Random Sampling
Random sampling is a sampling technique where a group of subjects (a
sample) is selected for study from a larger group (a population). Each individual is
chosen entirely by chance and each member of the population has a known, but
possibly non-equal, chance of being included in the sample. By using random
sampling, the likelihood of bias is reduced.
Simple Random Sampling
Simple random sampling is the basic sampling technique where a group of
subjects (a sample) is selected for study from a larger group (a population). Each
individual is chosen entirely by chance and each member of the population has an
equal chance of being included in the sample. Every possible sample of a given size
has the same chance of selection; i.e. each member of the population is equally likely
to be chosen at any stage in the sampling process.
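A simple random sample can be drawn with Python's standard `random` module, whose `sample` method selects without replacement. The sampling frame of 50 ID numbers, the sample size and the seed are assumptions made for this sketch.

```python
import random

def simple_random_sample(population, n, seed=None):
    """Draw n distinct members; every subset of size n is equally likely."""
    rng = random.Random(seed)
    return rng.sample(population, n)

# Hypothetical sampling frame of 50 student ID numbers.
frame = list(range(1, 51))
sample = simple_random_sample(frame, 10, seed=42)
print(sorted(sample))
```

Fixing the seed makes the draw repeatable for checking purposes; in an actual study the seed would normally be left unset or recorded for documentation.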
Stratified Sampling
There may often be factors which divide up the population into sub-populations
(groups / strata) and the measurement of interest may vary among the different
sub-populations. This has to be accounted for when a sample from the population is
selected in order to obtain a sample that is representative of the population. This is
achieved by stratified sampling.
A stratified sample is obtained by taking samples from each stratum or
sub-group of a population. When a sample is to be taken from a population with several
strata, the proportion of each stratum in the sample should be the same as in the
population.
Stratified sampling techniques are generally used when the population is
heterogeneous, or dissimilar, where certain homogeneous, or similar, sub-populations
can be isolated (strata). Simple random sampling is most appropriate when the entire
population from which the sample is taken is homogeneous. Some reasons for using
stratified sampling over simple random sampling are:
a. the cost per observation in the survey may be reduced;
b. estimates of the population parameters may be wanted for each
sub-population;
c. increased accuracy at given cost.
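Proportional allocation, where each stratum appears in the sample in the same proportion as in the population, can be sketched as follows. The three strata, their sizes and the total sample size of 20 are hypothetical.

```python
import random

def stratified_sample(strata, total_n, seed=None):
    """Proportional allocation: sample each stratum in proportion to its size."""
    rng = random.Random(seed)
    population_size = sum(len(members) for members in strata.values())
    sample = {}
    for name, members in strata.items():
        # round() may make the sizes sum to slightly more or less than total_n
        n_stratum = round(total_n * len(members) / population_size)
        sample[name] = rng.sample(members, n_stratum)
    return sample

# Hypothetical population of 200 students split into three strata by year level.
strata = {
    "first_year": [f"F{i}" for i in range(100)],
    "second_year": [f"S{i}" for i in range(60)],
    "third_year": [f"T{i}" for i in range(40)],
}
sample = stratified_sample(strata, total_n=20, seed=1)
print({name: len(members) for name, members in sample.items()})
```

Within each stratum the members are chosen by simple random sampling, so the strata sizes are fixed by design while the individuals are left to chance.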
Cluster Sampling
Cluster sampling is a sampling technique where the entire population is divided
into groups, or clusters, and a random sample of these clusters is selected. All
observations in the selected clusters are included in the sample.
Cluster sampling is typically used when the researcher cannot get a complete
list of the members of a population they wish to study but can get a complete list of
groups or 'clusters' of the population. It is also used when a random sample would
produce a list of subjects so widely scattered that surveying them would prove to be
far too expensive, for example, people who live in different postal districts in the UK.
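A minimal sketch of cluster sampling: whole clusters are drawn at random and every member of each chosen cluster enters the sample. The district names and household labels are hypothetical.

```python
import random

def cluster_sample(clusters, n_clusters, seed=None):
    """Randomly choose whole clusters and keep every member of each one."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)
    members = [m for name in chosen for m in clusters[name]]
    return chosen, members

# Hypothetical clusters: households listed by postal district.
clusters = {
    "district_A": ["A1", "A2", "A3"],
    "district_B": ["B1", "B2"],
    "district_C": ["C1", "C2", "C3", "C4"],
    "district_D": ["D1", "D2"],
}
chosen, members = cluster_sample(clusters, n_clusters=2, seed=7)
print(chosen, members)
```

Only the list of clusters is needed to draw the sample, which matches the situation described above where a complete list of individual members is not available.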
