LESSON 12:
MEASUREMENT & SCALING
Levels of Measurement
We know that the level of measurement is a scale by which a
variable is measured. For 50 years, with few detractors, science
has used the Stevens (1951) typology of measurement levels
(scales). There are three things you need to remember about
this typology:
• Anything that can be measured falls into one of the four types.
• The higher the level of measurement, the more precision in
measurement.
• Every level up contains all the properties of the previous level.
The four levels of measurement, from lowest to highest, are as follows:
• Nominal
• Ordinal
• Interval
• Ratio
Most opinion and attitude scales or indexes in the social
sciences are ordinal in nature.
Interval Level of Measurement
The interval level of measurement describes variables that have
more or less equal intervals, or meaningful distances between
their ranks. For example, if you were to ask somebody if they
were first, second, or third generation immigrant, the assumption
is that the distance, or number of years, between each generation
is the same.
Ratio Level of Measurement
The ratio level of measurement describes variables that have equal
intervals and a fixed zero (or reference) point. It is possible to
have zero income, zero education, and no involvement in crime,
but rarely do we see ratio level variables in social science since it’s
almost impossible to have zero attitudes on things, although
“not at all”, “often”, and “twice as often” might qualify as ratio
level measurement.
Advanced statistics require at least interval level measurement.
Therefore:
• The researcher always strives for interval level measurement and
accepts ordinal level (which is the most common) only when
there is no alternative.
• Variables should be conceptually and operationally defined
with levels of measurement in mind, since the level affects
the analysis of data later.
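One way to see why the level of measurement affects later analysis is to look at which measure of central tendency each level supports. The sketch below follows the usual reading of the Stevens typology (mode for nominal, median for ordinal, mean for interval and ratio). The names `Level` and `central_tendency` and the sample data are assumptions made for illustration, not part of the lesson.

```python
from enum import IntEnum
from statistics import mean, median_low, mode


class Level(IntEnum):
    """Stevens levels, ordered from lowest to highest (hypothetical helper)."""
    NOMINAL = 1
    ORDINAL = 2
    INTERVAL = 3
    RATIO = 4


def central_tendency(values, level):
    """Return (name, value) of the strongest summary the level supports."""
    if level >= Level.INTERVAL:
        # Equal intervals make addition and averaging meaningful.
        return "mean", mean(values)
    if level == Level.ORDINAL:
        # Only rank order is meaningful; median_low never averages two codes.
        return "median", median_low(values)
    # Nominal categories can only be counted.
    return "mode", mode(values)


# Illustrative data only (not from the lesson).
print(central_tendency(["urban", "rural", "urban"], Level.NOMINAL))
print(central_tendency([1, 2, 2, 3, 5], Level.ORDINAL))
print(central_tendency([10, 20, 30], Level.INTERVAL))
```

Because each level contains the properties of the levels below it, a ratio variable could also be summarized by its median or mode; the function simply picks the most informative option available.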
RESEARCH METHODOLOGY
contamination of other variables creeping into your study.
Anything you do to standardize or clarify your measurement
instrument to reduce user error will add to your reliability.
It’s also important to consider, as early as possible, the time
frame that is appropriate for what you’re studying. Some social and
psychological phenomena (most notably those involving
behaviour or action) lend themselves to a snapshot in time.
If so, your research need only be carried out for a short period of
time, perhaps a few weeks or a couple of months. In such a case,
your time frame is referred to as cross-sectional. Cross-sectional
research is sometimes criticized as being unable to determine cause
and effect. When cross-sectional data fail to depict the cause-effect
relationship, a longer time frame is called for. This is called
longitudinal research, and it may add years to carrying out your study.
There are many different types of longitudinal research, such as
those that involve time-series (such as tracking a third world nation’s
economic development over four years or so). The general rule is
that the more variables you have operating in your study, and the
more confident you want to be about cause and effect, the stronger
the case for longitudinal research.
Methods of Measuring Reliability
How do you measure the reliability of a particular measure?
There are four good methods of measuring reliability:
• Test-retest
• Multiple forms
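As a rough illustration of the test-retest method, reliability is commonly estimated as the correlation between scores from two administrations of the same instrument to the same respondents. The sketch below computes a Pearson correlation by hand. The function name `pearson_r` and the scores are hypothetical and only show the arithmetic.

```python
from math import sqrt


def pearson_r(first, second):
    """Pearson correlation between two equally long lists of scores."""
    if len(first) != len(second) or len(first) < 2:
        raise ValueError("need two score lists of equal length (at least 2)")
    mean_first = sum(first) / len(first)
    mean_second = sum(second) / len(second)
    # Sum of cross-products of deviations from each mean.
    cross = sum((a - mean_first) * (b - mean_second)
                for a, b in zip(first, second))
    ss_first = sum((a - mean_first) ** 2 for a in first)
    ss_second = sum((b - mean_second) ** 2 for b in second)
    if ss_first == 0 or ss_second == 0:
        raise ValueError("scores must vary for a correlation to be defined")
    return cross / sqrt(ss_first * ss_second)


# Hypothetical scores for five respondents, tested twice some weeks apart.
test_scores = [12, 15, 9, 20, 17]
retest_scores = [13, 14, 10, 19, 18]
print(round(pearson_r(test_scores, retest_scores), 3))
```

A value close to 1 suggests that respondents keep roughly the same relative standing across the two administrations. What counts as acceptable depends on the instrument and the field.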
structuring devices.
To quantify dimensions that are essentially qualitative, rating scales
or ranking scales are used.
Rating Scales
One uses rating scales to judge properties of objects without
reference to other similar objects. These ratings may be in such
forms as “like-dislike,” “approve-indifferent-disapprove,” or other
classifications using even more categories. There is little conclusive
support for choosing a three-point scale over scales with five or
more points. Some researchers think that more points on a rating
scale provide an opportunity for greater sensitivity of measurement
and extraction of variance. The most widely used scales range
from three to seven points, but it does not seem to make much
difference which number is used, with two exceptions. First, a
larger number of scale points is needed to produce accuracy with
single-item versus multiple-item scales. Second, in cross-cultural
measurement, the culture may condition respondents to a standard
metric, such as a ten-point scale in Italy.
Ranking Scales
In ranking scales, the subject directly compares two or more objects
and makes choices among them. Frequently, the respondent is
asked to select one as the “best” or the “most preferred.” When
there are only two choices, this approach is satisfactory, but it often
results in “ties” when more than two choices are found. For
example, respondents are asked to select the most preferred among
three or more models of a product. Assume that 40 percent choose
model A, 30 percent choose model B, and 30 percent choose model
C. Which is the preferred model? The analyst would be taking
a risk in naming A, because 60 percent chose some model other
than A. Perhaps all B and C voters would place A last, preferring
either B or C to it. This ambiguity can be avoided
by using some of the techniques described in this section.
Some of the measurement scales are discussed below:
How would you rate the following aspects of your food store?
               Extremely                     Extremely
               important                     unimportant
Service        1    2    3    4    5    6    7
Check outs     1    2    3    4    5    6    7
Bakery         1    2    3    4    5    6    7
Deli           1    2    3    4    5    6    7
Semantic Differential Scale
A semantic differential scale is constructed using phrases describing
attributes of the product to anchor each end. For example, the left
end may state, “Hours are inconvenient” and the right end may
state, “Hours are convenient”. The respondent then marks one
of the seven blanks between the statements to indicate his/her
opinion about the attribute.
The semantic differential takes an approach similar to Likert
scaling in that it seeks a range of responses between extreme
polarities, but it places the ordinal range of responses between
two keywords expressing opposite “ideas” or concepts. Babbie’s
example illustrates the concept.
Semantic Differential: Feelings about Musical Selections
               Very    Some-   Neither   Some-   Very
               Much    what              what    Much
Enjoyable      ___     ___     ___       ___     ___    Unenjoyable
Discordant     ___     ___     ___       ___     ___    Harmonic
Traditional    ___     ___     ___       ___     ___    Modern
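A minimal sketch of how responses to a semantic differential like the musical-selections example might be coded and summarized. The 1 to 5 coding (1 = "very much" the left-hand word, 5 = "very much" the right-hand word), the sample responses, and the function name `profile` are assumptions for illustration. Because the positions are ordinal, the sketch summarizes each adjective pair with a median rather than a mean.

```python
from statistics import median

# Adjective pairs from the musical-selections example: (left pole, right pole).
PAIRS = [
    ("Enjoyable", "Unenjoyable"),
    ("Discordant", "Harmonic"),
    ("Traditional", "Modern"),
]
POINTS = 5  # very much / somewhat / neither / somewhat / very much


def profile(responses):
    """Summarize each pair as (median position, pole the responses lean to).

    `responses` is a list of dicts keyed by the left pole of each pair,
    with values from 1 (left pole) to POINTS (right pole).
    """
    midpoint = (POINTS + 1) / 2
    summary = {}
    for left, right in PAIRS:
        positions = [r[left] for r in responses]
        if any(not 1 <= p <= POINTS for p in positions):
            raise ValueError(f"positions for {left}/{right} must be 1..{POINTS}")
        centre = median(positions)
        if centre < midpoint:
            lean = left
        elif centre > midpoint:
            lean = right
        else:
            lean = "Neither"
        summary[f"{left}/{right}"] = (centre, lean)
    return summary


# Hypothetical answers from five respondents about one musical selection.
answers = [
    {"Enjoyable": 1, "Discordant": 4, "Traditional": 3},
    {"Enjoyable": 2, "Discordant": 5, "Traditional": 3},
    {"Enjoyable": 2, "Discordant": 4, "Traditional": 2},
    {"Enjoyable": 1, "Discordant": 5, "Traditional": 4},
    {"Enjoyable": 3, "Discordant": 4, "Traditional": 3},
]
for pair, (centre, lean) in profile(answers).items():
    print(pair, centre, lean)
```

Keeping the left and right poles explicit avoids having to decide which end is "positive", which matters for pairs such as Traditional/Modern where neither pole is obviously favourable.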
assumption that a respondent choosing one level of response
would give the same type of response to all inferior levels.
The number of response sets that violate the scalar pattern is
compared to the number that do reflect the pattern, and the result
is referred to as the coefficient of reproducibility. Again, Babbie’s
illustration provides a very clear understanding.
Guttman Scaling and Coefficient of Reproducibility
               Response   Number     Index    Scale    Total
               Pattern    of Cases   Scores   Scores   Scale Errors
Scale Types    + + +      612        3        3         0
               + + -      448        2        2         0
               + - -       92        1        1         0
               - - -       79        0        0         0
Mixed Types    - + -       15        1        2        15
               + - +        5        2        3         5
               - - +        2        1        0         2
               - + +        5        2        3         5
\[ \text{Coefficient of Reproducibility} = 1 - \frac{\text{Number of Errors}}{\text{Number of Guesses}} \]
\[ \text{In the example} = 1 - \frac{27}{1{,}258 \times 3} = 1 - \frac{27}{3{,}774} = .993 \text{ or } 99.3\% \]
The entire exercise is really just a way of indicating that the degree
to which a set of responses accurately reflects the scalar assumptions
is an indication of the degree to which the entire set could be
recreated from the scale itself. What the above illustration shows
is that if we were to project an imaginary “sample” from the
coefficient of reproducibility of 99.3%, then the projection would
reflect the real sample to that degree. Guttman scaling shows that
a well constructed scale can very accurately reproduce the profile of
a response set. But then, you only know the coefficient of
reproducibility after you have run the survey and crunched the
numbers, so it is not a predictive tool; it is a proof of the strength
of the scale as a measure.
A brief word on typologies is in order. So far, we have limited
ourselves to an examination of unidirectional variables; that is,
one thing in one direction (attitudes for or against abortion, etc.).
Often relationships are better explained as the function of the
intersection of several variables. This is referred to as a typology.
Remember what we have noted about making sure that your
indices and scales are composed of single-dimension indicators.
Recall that while “religion” can have a strong correlation with
“attitudes on abortion”, that does not mean that a question on
religion belongs in an index or scale of questions on “attitudes on
abortion”. But, if you wish to examine the intersection of the
two, you can construct a typology effectively showing, for example,
that “Catholics” may be “conservative” on “abortion” but remain
“liberal” on “other human rights”.
Babbie warns us that typologies are useful as independent variables
(“religion” may be a good causal factor in “attitudes on abortion”)
but can be problematic as dependent variables (explaining the
“why” isn’t always clear). Catholics may be more anti-abortion
because the church has forbidden it, but what of other groups?
You can get onto some very shaky ground using typologies as the
“effect” or dependent variable.
Example of Semantic Differential
How would you describe Kmart, Target, and Wal-Mart on the
following scale?
Bright         ___ ___ ___ ___ ___  dark
Low quality    ___ ___ ___ ___ ___  high quality
Conservative   ___ ___ ___ ___ ___  innovative
Stapel Scale
It is similar to the semantic differential scale except that numbers
identify points on the scale, only one statement is used (if
the respondent disagrees, a negative number should be marked), and
there are 10 positions instead of seven. This scale does not require
that bipolar adjectives be developed and it can be administered by
telephone.
Q-sort Technique
In the Q-sort technique the respondent is forced to construct a
normal distribution by placing a specified number of cards in each
of 11 stacks according to how desirable he/she finds the
characteristics written on the cards. This technique is faster and less
tedious for subjects than paired comparison measures. It also
forces the subject to conform to quotas at each point of the scale
so as to yield a normal or quasi-normal distribution.
Thus we can say that the objective of Q-Technique is intensive
study of individuals.
Selection of an Appropriate Attitude Measurement Scale
We have examined a number of different techniques available
for the measurement of attitudes. Each method has
certain strengths and weaknesses. Almost all the techniques can be
used for the measurement of any component of attitudes, but not
all techniques are suitable for all purposes. The selection
depends upon the stage and size of research.
Generally, Q-sort and semantic differential scales are preferred in
the preliminary stages. The Likert scale is used for item analysis.
For specific attributes the semantic differential scale is very
appropriate. Overall the semantic differential is simple in concept
and the results obtained are comparable with more complex, one-
dimensional methods. Hence it is widely used.
Limitations of Attitude Measurement Scales
The main limitation of these tools is the emphasis on describing
attitudes rather than predicting behaviour. This is primarily because
of a lack of models that describe the role of attitudes in behaviour.
Tutorial
Prepare a questionnaire on any one of the following objectives:
1. To know the corporate productivity
2. Job analysis / needs and satisfaction level of employees /
motivation level of employees / job involvement, etc.
3. Product testing / feedback on after-sales services
References
Cooper, Donald R. Business Research Methods. Tata McGraw-Hill.
Kothari, C. R. Quantitative Techniques, 3rd ed. Vikas Publishing House.
Levin, R. I. & Rubin, D. S. Statistics for Management. Prentice Hall
of India, 2002.