How to Design and Report Likert Scale

Hi, Marcie,
Do you have examples for reporting the results from Likert Scale
survey? I am looking for a simple and straightforward example for 690
students.

I have my cultural studies data and report, but my study was a little
complicated.

As I learned, Likert Scale is often treated as Ordinal data, and therefore
Weighted Mean should be reported instead of regular mean. Analyse-it,
however, does not calculate Weighted Mean.

I'm not sure what you mean by a weighted mean--in this situation. What would be
the weighting?

I show students both nonparametric and traditional parametric procedures for
handling ordinal data---the former being more accurate/stringent than the latter.

However, plenty of people treat ordinal data as continuous --- especially when
survey data are supplemented with other measures.

Attached is a small survey report I just finished --- part of a blended learning
initiative I'm working on for Academic Affairs and ITS. You're welcome to use it!

M.

First, how to design one:

• http://www.gifted.uconn.edu/siegle/research/instrument%20Reliability%20and%20Validity/Likert.html
(See the many different types of Likert Scale)
• http://www.socialresearchmethods.net/kb/scallik.htm (a mini job-aid with good
examples)

Likert scale: A Likert scale (pronounced 'lick-ert') is a type of psychometric response
scale often used in questionnaires, and is the most widely used scale in survey research.
When responding to a Likert questionnaire item, respondents specify their level of
agreement to a statement. The scale is named after Rensis Likert, who published a report
describing its use (Likert, 1932).

Sample Question presented using a five-point Likert Scale

A typical test item in a Likert scale is a statement; the respondent is asked to indicate
their degree of agreement with the statement. Traditionally a five-point scale is used;
however, many psychometricians advocate using a seven- or nine-point scale.

Ice cream is good for breakfast

1. Strongly disagree
2. Disagree
3. Neither agree nor disagree
4. Agree
5. Strongly agree

Likert scaling is a bipolar scaling method, measuring either positive or negative
response to a statement. Sometimes Likert scales are used in a forced choice method
where the middle option of "Neither agree nor disagree" is not available. Likert scales
may be subject to distortion from several causes. Respondents may avoid using
extreme response categories (central tendency bias); agree with statements as
presented (acquiescence response bias); or try to portray themselves or their group in
a more favorable light (social desirability bias).

Scoring and analysis: http://www.answers.com/topic/likert-scale

After the questionnaire is completed, each item may be analyzed separately or item
responses may be summed to create a score for a group of items. Hence, Likert scales are
often called summative scales.

Responses to a single Likert item are normally treated as ordinal data, because, especially
when using only five levels, one cannot assume that respondents perceive the difference
between adjacent levels as equidistant. When treated as ordinal data, Likert
responses can be analyzed using non-parametric tests, such as the Mann-
Whitney test, the Wilcoxon signed-rank test, and the Kruskal-Wallis test.[1]
When responses to several Likert items are summed, they may be treated as interval data measuring a latent
variable. If the summed responses are normally distributed, parametric statistical tests such as the analysis
of variance can be applied.

Examples:

• Attitudes toward Computer (20 questions, but each participant gets one
summed score)
• Some personality tests (scored the same way)

Data from Likert scales are sometimes reduced to the nominal level by combining all
agree and disagree responses into two categories of "accept" and "reject". The Cochran
Q test and the McNemar test are common statistical procedures used after this transformation.
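
The collapsing step can be sketched in a few lines of Python. This is only an illustration, assuming the five-point coding shown earlier (1 = strongly disagree to 5 = strongly agree). The text does not say what happens to the neutral midpoint, so leaving it out here is an assumption; the function name and the sample responses are hypothetical.

```python
from collections import Counter

def collapse(response):
    """Map a 1-5 Likert code to 'accept', 'reject', or None (neutral)."""
    if response in (4, 5):
        return "accept"
    if response in (1, 2):
        return "reject"
    return None  # assumption: the neutral midpoint (3) is left out

# Hypothetical responses to a single item, for illustration only.
responses = [5, 4, 3, 2, 4, 1, 5, 3, 4]
collapsed = [collapse(r) for r in responses]

# Count the two nominal categories, ignoring the dropped neutrals.
tally = Counter(c for c in collapsed if c is not None)
print(tally)
```

The resulting accept/reject counts are the kind of input that the procedures named above work on.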

Example of a Likert Scale (ordinal) Survey and Data Analysis

Data set: the one posted on Course Website: Cultural differences in online Learning

See the Survey at: http://surveymonkey.com/s.asp?u=61201883523 (Question 4:
Students’ Perceptions on Teachers and Teaching in General)

Mini-Data Analysis for two Class Periods


• Set the data correctly (Data, Variable)
• Analyze Data_Practice_continuous (Descriptive Analysis; the first
time, leave “factor” unchecked; the second time, check it, then compare
the results)
• Look at the results. What conclusions can you draw?
• Analyze Data_Practice_Ordinal (Inferential statistics)

Summary of Data (descriptive analysis), generated by SurveyMonkey

Following are several questions about your perceptions on or expectations about
teachers and teaching in general; please click on the button to indicate your choice:

                                      Strongly                                     Strongly   Response
Statement                             agree      Agree      Undecided  Disagree   disagree   Average
I typically consider my teachers      34% (45)   58% (78)   7% (9)     1% (2)     0% (0)     1.76
  to have wisdom.
I usually have a great deal of        29% (39)   58% (78)   9% (12)    4% (5)     0% (0)     1.87
  respect for my teachers.
I feel me and my teachers are         9% (12)    40% (53)   26% (35)   23% (30)   2% (3)     2.69
  essentially equals.
I think there should be express       27% (36)   50% (66)   17% (23)   6% (8)     0% (0)     2.02
  rules of conduct in every class
  which all students should follow.
I expect my teachers to be            45% (60)   43% (57)   8% (11)    3% (4)     1% (1)     1.71
  recognized experts in the field
  which they teach.
I am more comfortable when my         7% (9)     30% (40)   23% (31)   35% (47)   5% (7)     3.02
  teacher conducts class in a
  formal manner rather than
  informally.

Total Respondents          134
(filtered out)             3
(skipped this question)    1
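
The "Response Average" column can be reproduced from the counts in the summary: it is the mean of the response codes, weighted by how many respondents chose each code. That may be what "weighted mean" refers to in the exchange at the top of this handout, though that is an interpretation. A small Python sketch follows, assuming the coding 1 = strongly agree through 5 = strongly disagree; the short item labels and function name are illustrative, not SurveyMonkey's.

```python
import statistics

# Counts per response code, 1 (strongly agree) to 5 (strongly disagree),
# copied from the SurveyMonkey summary table.
counts = {
    "wisdom": [45, 78, 9, 2, 0],
    "respect": [39, 78, 12, 5, 0],
    "equals": [12, 53, 35, 30, 3],
    "rules of conduct": [36, 66, 23, 8, 0],
    "experts": [60, 57, 11, 4, 1],
    "formal manner": [9, 40, 31, 47, 7],
}

def expand(freqs):
    """Turn category counts back into one response code per respondent."""
    return [code for code, n in enumerate(freqs, start=1) for _ in range(n)]

summary = {}
for item, freqs in counts.items():
    codes = expand(freqs)
    summary[item] = {
        "n": len(codes),
        "mean": round(statistics.mean(codes), 2),  # frequency-weighted mean of the codes
        "sd": round(statistics.stdev(codes), 2),   # sample SD (n - 1 denominator)
        "median": statistics.median(codes),
    }
    print(item, summary[item])
```

The means printed by this sketch can be compared against the Response Average column. The median is included because it is the usual summary when the responses are treated as ordinal.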


Note: this summary does not give us the Mean and SD, so I still needed to
analyze the raw data.

Data: (Part of the Raw Data; SurveyMonkey will give you a numerical version)

Culture QS
Open-Ended
Response     Q1Wisdom  Q2Respect  Q3Equal  Q4rulesconduct  Q5experts  Q6formalmanner

Chinese      2         1          2        2               1          5
Chinese      2         2          2        1               1          4
Chinese      2         2          2        2               2          2
Chinese      1         1          2        1               1          2
Chinese      1         1          2        2               1          2
Chinese      3         1          2        2               4          4
Chinese      1         1          1        1               2          4

American     2         2          4        2               1          3
American     1         1          3        1               2          2
American     2         4          4        2               3          4
American     1         1          4        1               1          1
american     2         2          3        1               1          3
American     1         1          2        2               1          4
American     2         2          3        2               2          2
american     2         2          2        2               3          5
American     2         2          4        1               2          1
American     2         2          1        4               1          4
Descriptive Analysis: (from 1 strongly agree to 5 strongly disagree)

                    n      Mean     SD
Q1Wisdom4           128    1.750    0.6148
Q2Respect4          128    1.875    0.6987
Q3Equal4            127    2.669    0.9682
Q4rulesconduct4     127    2.031    0.8351
Q5experts4          127    1.701    0.7592
Q6formalmanner4     128    3.047    1.0338

Note: The Likert scale has to be set as continuous data in order for Analyse-it to run descriptive
statistics; this is not accurate but acceptable.
The rigorous analysis is to get a weighted mean, which Analyse-it does not do.
Oftentimes, researchers go right into inferential statistics and skip the descriptive statistics, since
they are less informative.

[End of Descriptive Analysis]


Inferential Analysis on the differences among the three groups (American,
Chinese, Korean) For later discussion

1) Participants’ perceptions on teacher and teaching in general (pre-survey): Item 4 on the
pre-survey assessed participants’ perceptions and expectations on teacher and teaching in general.
The three questions that are closely related to sense of Power Distance were analyzed
inferentially with the Kruskal-Wallis Analysis of Variance test, with cultural identity being the
independent variable. The results indicate that:
a) There were significant differences in participants’ perceptions about being equal with their
instructor. The Korean group had the highest mean rank (45.53) on a scale of 1 (strongly agree) to
5 (strongly disagree). By contrast, the Anglo-American group had the lowest mean rank (29.77)
and therefore perceived their instructors more as equals.
b) There was no significant difference in participants’ perceptions about rules of conduct in
online classes. The Chinese group had the lowest mean rank (29.36), an indication of a stronger
agreement about implementing specific rules of conduct. This result aligned with some of their
narrative comments about “feeling lost” and hoping for more guidance.
c) There were highly significant differences in their perceptions on course conduct. Again the
Chinese had the lowest mean rank, an indication of a stronger agreement about conducting
courses in a formal manner.
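
To make the mean ranks and the Kruskal-Wallis statistic less of a black box, here is a minimal pure-Python sketch of the test with the usual tie correction and chi-square approximation. It is not the Analyse-it implementation. It is run only on the Q3Equal ratings of the 17 Chinese and American rows shown in the raw-data excerpt above, so it has two groups instead of three and is not expected to reproduce the mean ranks reported here. All function names are illustrative.

```python
import math
from collections import Counter

def average_ranks(values):
    """Rank values 1..N; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1  # mean of 1-based positions start+1..end+1
        for pos in range(start, end + 1):
            ranks[order[pos]] = shared
        start = end + 1
    return ranks

def chi2_upper_tail(x, df):
    """P(chi-square with integer df > x), built up from the df=1 or df=2 closed form."""
    if df % 2:
        prob, k = math.erfc(math.sqrt(x / 2)), 1
    else:
        prob, k = math.exp(-x / 2), 2
    while k < df:
        prob += (x / 2) ** (k / 2) * math.exp(-x / 2) / math.gamma(k / 2 + 1)
        k += 2
    return prob

def kruskal_wallis(groups):
    """groups maps a label to a list of ratings; returns (H, p, mean ranks)."""
    labels = [label for label, vals in groups.items() for _ in vals]
    values = [v for vals in groups.values() for v in vals]
    n = len(values)
    rank_sums = Counter()
    for label, rank in zip(labels, average_ranks(values)):
        rank_sums[label] += rank
    h = 12 / (n * (n + 1)) * sum(
        rank_sums[g] ** 2 / len(vals) for g, vals in groups.items()
    ) - 3 * (n + 1)
    ties = sum(t ** 3 - t for t in Counter(values).values())
    h /= 1 - ties / (n ** 3 - n)  # tie correction; undefined if every rating is identical
    mean_ranks = {g: rank_sums[g] / len(vals) for g, vals in groups.items()}
    return h, chi2_upper_tail(h, len(groups) - 1), mean_ranks

# (culture, Q3Equal) pairs from the raw-data excerpt; labels are kept as typed,
# so their case is normalized before grouping.
rows = [
    ("Chinese", 2), ("Chinese", 2), ("Chinese", 2), ("Chinese", 2),
    ("Chinese", 2), ("Chinese", 2), ("Chinese", 1),
    ("American", 4), ("American", 3), ("American", 4), ("American", 4),
    ("american", 3), ("American", 2), ("American", 3), ("american", 2),
    ("American", 4), ("American", 1),
]
groups = {}
for culture, rating in rows:
    groups.setdefault(culture.capitalize(), []).append(rating)

h, p, mean_ranks = kruskal_wallis(groups)
print(f"H = {h:.2f}, p = {p:.4f}, mean ranks = {mean_ranks}")
```

With samples this small the chi-square approximation is rough, so the p value from the excerpt should be read as an illustration of the mechanics only. A lower mean rank corresponds to stronger agreement, as in the results above.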

2) Post-survey: approaching superior and peer when completing individual assignments and
team work: Other responses to the post-survey that reflect the impact of Power Distance include:
a) learners’ comfort level in approaching the instructor/facilitator/TA for help with individual
assignments and/or teamwork; and b) their comfort level in approaching peers for help with
individual assignments and/or teamwork. Participants rated their comfort level from very
comfortable (1), to somewhat comfortable (2), uncomfortable (3), and very uncomfortable (4).
The lower their mean rating, the higher their comfort level. The Kruskal-Wallis Analysis of Variance
was used again to compare the three groups’ mean ranks on comfort level in
approaching “superior” or their peers when completing individual assignments and, if
applicable, team work.

[Note: Because the regular Mean of Likert Scale does not make much sense, I skipped the
descriptive Analysis and Went right into Inferential Analysis—Analysis of Variance using
the Non-Parametric Kruskal-Wallis statistic]

Item 1. Individual Assignment: Approaching Superior for Help (two-tailed test)

O. IA: Approach "Superior" 5     n     Rank sum    Mean rank
American5                        31      950.0       30.65
Chinese5                         15      682.5       45.50
Korean5                          29     1217.5       41.98

Kruskal-Wallis statistic    7.15
p5                          0.0280 (chisqr approximation, corrected for ties)

When the level of significance is set at α = 0.05, the small p value (0.028) indicates a significant
difference in participants’ ratings for approaching “superior” in individual assignments. The
American group, not surprisingly, had the lowest mean rank (30.65), an indication of greater
comfort level in approaching the instructors for help; and the Chinese group had the highest mean
rank (45.50) and thus lower comfort level in approaching their instructors.
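
The numbers in the Item 1 output can be cross-checked by hand. The Python sketch below recomputes each mean rank as rank sum divided by n, computes the Kruskal-Wallis statistic from the rank sums without a tie correction, and converts the reported statistic to a p value. The uncorrected H comes out at roughly 6.3, lower than the reported 7.15, which is consistent with the output's statement that it was corrected for ties (the correction needs the raw ratings, which are not shown here). Variable names are illustrative.

```python
import math

# Group sizes and rank sums copied from the Item 1 output above.
sizes = {"American": 31, "Chinese": 15, "Korean": 29}
rank_sums = {"American": 950.0, "Chinese": 682.5, "Korean": 1217.5}

n_total = sum(sizes.values())

# All ranks 1..N must add up to N(N+1)/2, a quick consistency check on the table.
assert sum(rank_sums.values()) == n_total * (n_total + 1) / 2

# Mean rank = rank sum / group size.
mean_ranks = {g: rank_sums[g] / sizes[g] for g in sizes}

# Kruskal-Wallis H before any tie correction.
h_uncorrected = (
    12 / (n_total * (n_total + 1))
    * sum(rank_sums[g] ** 2 / sizes[g] for g in sizes)
    - 3 * (n_total + 1)
)

# With three groups the chi-square approximation has 2 degrees of freedom,
# where the upper-tail probability is exactly exp(-H / 2).
p_from_reported_h = math.exp(-7.15 / 2)

print({g: round(r, 2) for g, r in mean_ranks.items()})
print(round(h_uncorrected, 2), round(p_from_reported_h, 4))
```

The same three steps apply to the other three items, since each compares three groups.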

Item 2. Individual Assignment: Approaching Peer for Help (two-tailed test)

n    73 (cases excluded: 2 due to missing values)

P. IA: Approach Peer by Group6    n     Rank sum    Mean rank
American6                         31      911.5       29.40
Chinese6                          15      372.5       24.83
Korean6                           27     1417.0       52.48

Kruskal-Wallis statistic    26.46
p6                          <0.0001 (chisqr approximation, corrected for ties)

When α = 0.05, the small p value (<0.0001) indicates highly significant differences in participants’
comfort level in approaching peers for help with individual assignments. The Chinese group had
the lowest mean rank (24.83, higher comfort level), while the Korean group had the highest mean
rank (52.48, lower comfort level).

Item 3. Teamwork: Approaching Superior for Help

n    58 (cases excluded: 17 due to missing values)

U. Team: Approach "Superior" by Group6    n     Rank sum    Mean rank
American6                                 31      814.5       26.27
Chinese6                                  15      509.0       33.93
Korean6                                   12      387.5       32.29

Kruskal-Wallis statistic    2.88
p6                          0.2364 (chisqr approximation, corrected for ties)

p = 0.236 (> α = 0.05) indicates no significant difference in participants’ comfort level in
approaching superiors for help when completing teamwork.

Item 4. Teamwork: Approaching Peer for Help (two-tailed test)

n    58 (cases excluded: 17 due to missing values)

V. Team: Approach Peer by Group6    n     Rank sum    Mean rank
American6                           31      806.5       26.02
Chinese6                            15      319.5       21.30
Korean6                             12      585.0       48.75

Kruskal-Wallis statistic    24.80
p6                          <0.0001 (chisqr approximation, corrected for ties)

The high Kruskal-Wallis statistic (24.8) and the small p value (<0.0001) again indicate a highly
significant difference in participants’ comfort level in approaching peers for help with team work.
The Korean group (mean rank = 48.75) contributed greatly to this difference. However, the
statistical power might have been reduced in this test because of the 17 missing rating values
from the Korean group. As mentioned in the curriculum analysis, many of the Korean courses did
not involve team work, and many participants chose “non applicable” for this survey question.

Summary: Influence of power distance evidenced by the four tests: Conforming to the existing
findings about Power Distance, the American group (mainly Anglo-American) had the lowest
PDI score, while the Chinese group had the highest PDI score. Possibly because of their sense of
PDI, the American group felt the most comfortable in approaching their instructors for help,
while the Chinese and Korean groups felt less comfortable in doing so. Chinese students, because of their
large class size, did not have much opportunity to interact with the instructors. Still, their reported
comfort level in approaching the instructors was low. As to approaching their peers for help, the
Chinese group felt the most comfortable in completing both individual assignments and team
work, the American group felt comfortable, while the Korean group felt the least comfortable in
completing both individual assignments and teamwork. Again, the Koreans’ cultural perceptions
on CMC might have influenced their ratings here. As some of the Korean participants
commented, peers or classmates online can be “strangers.” As to the high comfort level of the
Chinese, it is worth noting that most of these Chinese students worked in self-formed teams and
they therefore were comfortable about approaching their peers for help.

The four Kruskal-Wallis analyses on the post-survey items had revealing results. Although there
was no significant difference in the three groups’ comfort level in approaching superiors for help
with team work, there were significant differences in their rating for approaching superiors in
individual assignments, and there were highly significant differences in their levels of comfort in
approaching peers for help with individual assignments and with team work. Power Distance
indeed affected students’ ways in approaching instructors and their peers. By contrast, individuals
were able to overcome their sense of Power Distance when working as a group. In other words,
individuals became “braver” when working as a team to approach their instructors for help.

(From Wang’s Cultural Studies of Online Learning, British Journal of Educational
Technology)

More Likert Scale Examples:


http://www.socialresearchmethods.net/kb/scallik.htm
Defining the Focus. As in all scaling methods, the first step is to define what it is
you are trying to measure. Because this is a unidimensional scaling method, it is
assumed that the concept you want to measure is one-dimensional in nature.
You might operationalize the definition as an instruction to the people who are
going to create or generate the initial set of candidate items for your scale.

Generating the Items. Next, you have to create the set of potential scale items.
These should be items that can be rated on a 1-to-5 or 1-to-7 Disagree-Agree
response scale. Sometimes you can create the items by yourself based on your
intimate understanding of the subject matter. But, more often than not, it's helpful
to engage a number of people in the item creation step. For instance, you might
use some form of brainstorming to create the items. It's desirable to have as
large a set of potential items as possible at this stage; about 80-100 would be
best.

Rating the Items. The next step is to have a group of judges rate the items.
Usually you would use a 1-to-5 rating scale where:

1. = strongly unfavorable to the concept
2. = somewhat unfavorable to the concept
3. = undecided
4. = somewhat favorable to the concept
5. = strongly favorable to the concept

Administering the Scale. You're now ready to use your Likert scale. Each
respondent is asked to rate each item on some response scale. For instance,
they could rate each item on a 1-to-5 response scale where:

1. = strongly disagree
2. = disagree
3. = undecided
4. = agree
5. = strongly agree

There are a variety of possible response scales (1-to-7, 1-to-9, 0-to-4). All of these
odd-numbered scales have a middle value, which is often labeled Neutral or Undecided.
It is also possible to use a forced-choice response scale with an even number of
responses and no middle neutral or undecided choice. In this situation, the
respondent is forced to decide whether they lean more towards the agree or
disagree end of the scale for each item.

The final score for the respondent on the scale is the sum of their ratings for all of the
items (this is why this is sometimes called a "summated" scale). On some scales, you will
have items that are reversed in meaning from the overall direction of the scale. These are
called reversal items. You will need to reverse the response value for each of these
items before summing for the total. That is, if the respondent gave a 1, you make it a 5; if
they gave a 2 you make it a 4; 3 = 3; 4 = 2; and, 5 = 1.
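
The reversal rule above amounts to subtracting each rating from (lowest value + highest value), which is 6 on a 1-to-5 scale. A minimal Python sketch of scoring one respondent follows; the function names, the sample ratings, and the choice of which items are reversed are all hypothetical.

```python
def reverse_score(value, low=1, high=5):
    """Flip a rating on a low..high scale (1<->5, 2<->4, 3 stays 3)."""
    return low + high - value

def scale_total(ratings, reversal_items=()):
    """Sum one respondent's ratings, flipping the 1-based item numbers in reversal_items."""
    return sum(
        reverse_score(r) if item in reversal_items else r
        for item, r in enumerate(ratings, start=1)
    )

# Hypothetical respondent: items 2 and 4 are worded in the opposite direction.
ratings = [5, 1, 4, 2, 3]
total = scale_total(ratings, reversal_items={2, 4})
print(total)
```

The `low` and `high` parameters let the same rule cover other response scales, such as 1-to-7 or 0-to-4.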

Example: The Employment Self Esteem Scale

Here's an example of a ten-item Likert Scale that attempts to estimate the level of
self esteem a person has on the job. Notice that this instrument has no center or
neutral point -- the respondent has to declare whether he/she is in agreement or
disagreement with the item.

INSTRUCTIONS: Please rate how strongly you agree or disagree with each of
the following statements by placing a check mark in the appropriate box.

Response options for every item: Strongly Disagree / Somewhat Disagree / Somewhat Agree / Strongly Agree

1. I feel good about my work on the job.
2. On the whole, I get along well with others at work.
3. I am proud of my ability to cope with difficulties at work.
4. When I feel uncomfortable at work, I know how to handle it.
5. I can tell that other people at work are glad to have me there.
6. I know I'll be able to cope with work for as long as I want.
7. I am proud of my relationship with my supervisor at work.
8. I am confident that I can handle my job without constant assistance.
9. I feel like I make a useful contribution at work.
10. I can tell that my coworkers respect me.

Usability Glossary: Likert scale

a type of survey question where respondents are asked to rate the level at which they
agree or disagree with a given statement. For example:

I find this software easy to use.

strongly disagree 1 2 3 4 5 6 7 strongly agree

A Likert scale is used to measure attitudes, preferences, and subjective reactions. In
software evaluation, we can often objectively measure efficiency and effectiveness with
performance metrics such as time taken or errors made. Likert scales and other
attitudinal scales help get at the emotional and preferential responses people have to the
design. Is it attractive, fun, professional, easy?

Producing Means and Standard Deviations:


http://www.uni.edu/its/us/document/stats/spss2.html

The DESCRIPTIVES procedure in SPSS produces means and standard deviations for
variables. It also prints the minimum and maximum values. It is reasonable to print
means for Likert scale questions, since the coded numbers give us a feel for the
direction of the average answer. The standard deviation is also important, as it
gives us an indication of the typical distance from the mean. A low standard deviation
means that most observations cluster around the mean. A high standard deviation
means that there was a lot of variation in the answers. A standard deviation of 0 is
obtained when all responses to a question are the same. The following code produces
descriptive statistics for questions 1 to 20 (variables q1 to q20). The minimum and
maximum values tell us the range of answers given by our survey population.

* Mean, standard deviation, minimum and maximum for q1 through q20.
descriptives
  variables = q1 to q20.

                                                  Valid
Variable     Mean    Std Dev    Minimum    Maximum    N     Label

Q1           4.65      .66         2          5        80   question 1
Q2           4.59      .66         2          5        85   question 2
Q3           4.36      .75         2          5        90   question 3
Q4           4.72      .51         3          5        74   question 4
Q5           3.89     1.11         1          5        92   question 5
Q6           3.26     1.45         1          5       101   question 6
Q7           3.92     1.14         1          5        88   question 7
Q8           4.26      .90         1          5        94   question 8
Q9           4.32      .88         2          5        90   question 9
Q10          4.45      .86         2          5        75   question 10
Q11          3.86     1.45         1          5        95   question 11
Q12          3.71     1.26         1          5       110   question 12
Q13          4.62      .71         2          5        90   question 13
Q14          4.37      .85         2          5        97   question 14
Q15          3.08     1.39         1          5       109   question 15
Q16          4.45      .89         1          5        91   question 16
Q17          4.56      .81         1          5        79   question 17
Q18          2.68     1.34         1          5       116   question 18
Q19          4.54      .74         2          5        90   question 19
Q20          4.39      .76         2          5        96   question 20
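
For readers without SPSS, the same kind of table can be approximated in Python. This is a rough sketch, not a replacement for DESCRIPTIVES: it assumes missing answers are stored as None and counts valid responses separately for each question, which is why N can differ from row to row in the output above. The data and names are hypothetical.

```python
import statistics

def descriptives(columns):
    """columns maps a variable name to a list of ratings (None = missing)."""
    table = {}
    for name, values in columns.items():
        valid = [v for v in values if v is not None]  # drop missing answers
        table[name] = {
            "mean": statistics.mean(valid),
            "sd": statistics.stdev(valid),  # sample SD (n - 1 denominator)
            "min": min(valid),
            "max": max(valid),
            "valid_n": len(valid),
        }
    return table

# Hypothetical ratings for two questions; q2 has identical answers throughout.
data = {
    "q1": [5, 4, 5, None, 2, 5],
    "q2": [3, 3, 3, 3, None, 3],
}
result = descriptives(data)
for name, row in result.items():
    print(name, row)
```

In this made-up data every valid answer to q2 is the same, so its standard deviation is 0, as described in the paragraph above.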
