CHAPTER 5

Collecting Data

Chapter 5 Organizer

Collecting Data
• Qualitative data collection techniques: Observations; Interviews; Journals; Existing documents and records (academic documents, professional documents, and cultural documents)
  – Characteristics: Accuracy, Credibility, Dependability
• Quantitative data collection techniques: Surveys, questionnaires, and rating scales; Checklists
  – Characteristics: Validity, Reliability
In this chapter, we enter the second stage—the acting stage—of conducting a classroom-based
action research project. Recall that the acting stage is composed of data collection, which will be
PART III • "WHAT DO I DO WITH ALL THESE DATA?"
discussed in the present chapter, and data analysis, the topic for Chapter 6. As you will soon learn, there are
numerous techniques that can be used to collect both qualitative and quantitative data for your teacher-
led action research studies.
Action Research
• Identifying and limiting the topic
• Gathering information
• Reviewing related literature
• Developing a research plan
• Collecting data
• Analyzing data
• Developing an action plan
• Sharing and communicating results
• Reflecting on the process
Observations
As human beings, we are constantly observing and taking note of the world around us. Furthermore, as
teachers, we are constantly observing our students. However, on a daily basis, we typically observe our
noting preliminary interpretations of what has been observed (Leedy & Ormrod, 2005). Bogdan and Biklen
(2007) refer to these interpretations as observer’s comments, or OCs. Observer’s comments often shed
light on the emerging patterns from your observational data. Including observer’s comments in your
observation notes is also one way to integrate reflection into the process of action research. The separation
of these two types of commentaries is critical so that actual observations are not confused with what you
think the observed event means. Teachers conducting action research studies need to remain as objective
as possible in the records kept and data collected. As an aside, this need for objectivity also dictates that
you not censor what you record in your notes with your “teacher’s eyes”—do not hesitate to record
something even if it reflects negatively on your teaching (Hubbard & Power, 2003); after all, you are trying
to learn about and improve your professional practice. In addition, interpretations of observations may
change over time as you collect more data; having a record of these changing interpretations can be
invaluable over the course of your study. An example of a page from a book of field notes I recorded several
years ago during a study of positive reinforcement in a preschool setting, depicting this two-column format
of actual observations and associated observer’s comments, is shown in Figure 5.1.
Written field notes can be problematic, however. They are often insufficient to capture the richness
and detail of what one is observing (Leedy & Ormrod, 2005). Videotape can be a helpful tool for
recording observations, although it is not without limitations of its own. Background noise may
prevent you from hearing what you had hoped to focus your videotaped observation on. Furthermore,
video cameras can only capture what is happening in a given direction (i.e., the
direction the camera is facing). Leedy and Ormrod (2005) suggest that, prior to beginning any formal
observations, researchers should experiment and become familiar with various methods of recording
observations in order to find what works best for the particular setting and situation. It is, however,
important to remember that whatever mechanism you use to record your observations, you simply cannot
physically record everything that you see or that is happening (Mills, 2007); it is best not to put pressure
on yourself to try to do so.
On a practical note, several tips may facilitate your observations and the development of your
observation skills. If you decide to observe and to record those observations using field notes, you may
want to consider carrying a clipboard or legal pad with you for several days prior to beginning your
observations and recording any field notes. It is important that the act of recording field notes becomes a
part of your daily routine, as opposed to something that “feels” unfamiliar, extraneous, or irrelevant.
Similarly, if you decide that you will record your observations through the use of a video camera, you may
want to set up the camera several days in advance of your recording. This is important because both you
and your students, or other participants, will be more comfortable being videotaped if you and they are
accustomed to seeing the camera in the classroom. Again, it becomes part of the daily routine or setting.
Interviews
An alternative to observing people is to directly ask them questions. This can be accomplished in
several ways. Interviews are conversations between the teacher-researcher and participants in the study
in which the teacher poses questions to the participant (Schmuck, 1997). Interviews can be conducted with
individuals or with groups. It is best to prepare an interview guide, containing either specific or general
questions to be asked, prior to conducting any interviews.
Figure 5.1 A Sample Fieldnote Page, the Left Column Showing Actual Observations and the Right Column Showing Preliminary Interpretations

Obs. #3 | June 10 | Time: 10:15–11:00

Observations:
There were very few forms of interactions between the children and the teachers. The children were playing; behaving, for the most part. One of the teachers was pushing two girls on swings and the other teacher was sitting near the wading pools, watching the children. Carol said several things to certain children. She repeatedly used phrases such as, "Don't do that," "Don't throw water," "Don't throw that in the pool," and "You're gonna break the sprinkler . . . don't do that!"

Observer's Comments (OC):
I don't think that, in the entire time I was there today, I heard one positive comment or saw one positive gesture. It seemed that the teachers were in only a supervisory role. All they appeared to be doing was supervising the behavior and actions of the children in order to prevent accidents or injuries. I'm not saying that this is wrong; on the contrary, it is necessary when conducting an activity of this nature, especially with very young children. I just expected to hear some positive behaviors being praised in addition to the negative being addressed.

Observations:
Several children came close to hurting themselves and/or others. One three-year-old girl tried to pour water over the head of a one-year-old. Two boys were throwing beach balls into the pool and inadvertently hitting smaller children who were playing in the pool.

Observer's Comments (OC):
I began to wonder if this type of activity (i.e., supervisory in nature) did not permit the use of many positive comments. Maybe these teachers leave those types of positive reinforcement for classroom activities. Perhaps activities that require quicker thought and action on the part of the teachers—in order to prevent children from being hurt, or worse—don't allow for positive comments or identification of children to model positive behaviors.

Observations:
The children continued to play in the pools, the sprinkler, and the swings. I observed very little verbal interaction between the teachers and the children. Initially, most of what I heard came from Carol. She made several comments to the children, such as "Don't do that" and "You need to ride that bike over there." Carol's daughter picked up a garden hose and began playing with it. Twice Carol told the girl to stop playing with the hose and put it down, but to no avail. The third time she spoke to her, she said, "You better put that down or it will turn into a snake and bite you."

Observer's Comments (OC):
Carol's comment was not in jest. She said it with a firm tone in her voice. I didn't like hearing this. I was always taught never to threaten children, regardless of their age and regardless of how idle the threat. I find myself expecting to see and hear this kind of behavior from Carol and not from Marilyn, as I have not yet heard her say something of this nature.
usually not a concern when collecting qualitative data; it is typically more desirable for the researcher to
have some flexibility and to be able to ask clarifying questions (not initially included on the interview
guide), to pursue information not initially planned for, and to seek different information from different
people (Leedy & Ormrod, 2005).
When gathering truly qualitative data, interviews are probably best conducted following
semistructured or open-ended formats. In semistructured interviews, the researcher asks several “base”
questions but also has the option of following up a given response with alternative, optional questions that
may or may not be used by the researcher, depending on the situation. When developing interview guides,
it is best to keep your questions brief, clear, and stated in simple language (Johnson, 2008; Schwalbach,
2003). For example, if we were interviewing students regarding their opinions of our school, we might ask
the following questions, where the italicized questions represent the optional, follow-up, probing
questions:
The semistructured interview guide that I used in my positive reinforcement study is shown in Figure
5.2, and a portion of the transcript from one interview I conducted is shown in Figure 5.3.
Open-ended interviews provide the respondent with only a few questions, very broad in nature.
The intent is to gather very different kinds of information from different individuals, depending largely on
how each interprets the questions. For example, an open-ended series of interview questions about school
climate might include the following:
As mentioned earlier, interviews are conducted not only with individuals but also with groups. A focus
group is the name given to simultaneous interviews of people making up a relatively small group, usually
no more than 10 to 12 people (Leedy & Ormrod, 2005). This type of interview typically lasts between 1 and
2 hours. Focus groups are especially useful when time is limited and because people often are more
comfortable talking in a small group, as opposed to individually. Furthermore, interactions among the focus
group participants may be extremely informative due to the tendency for people to feed off others’
comments. However, when conducting a focus group interview, it is important to ensure that each
participant is provided with the opportunity to speak and share her or his perspective (Mills, 2007). There
can be a tendency for one or two individuals to dominate the discussion; it is the responsibility of the
teacher-researcher to closely monitor the discussion in order to prevent this from happening. The set of
guiding questions I used for a study incorporating data collected via a focus group is provided in Figure 5.4.
Qualitative data may also be collected via the use of e-mail interviews (Mills, 2007). With schools
becoming increasingly networked, teacher-researchers can easily collect data from colleagues, parents, and
students by sending out a series of questions in an e-mail message. One benefit of doing so is that when
the respondent replies to your e-mail questions, the transcription of the interview has already been done
for you. However, you must be cautious of possible ethical complications and realize that e-mail responses
are not necessarily anonymous or confidential (Mills, 2007). Other individuals who may have access to a
server may be able to intercept e-mail responses from targeted respondents.
Hubbard and Power (2003) also remind teacher-researchers not to forget about the value of informal
interviews—that is, those that are spontaneous, that take place throughout the data collection process,
and that are typically part of the daily interactions with students in a classroom setting. Teachers are
constantly asking students questions, trying to gather various types of information from them.
Schmuck (1997) provides a discussion of the relative advantages and limitations of conducting
interviews as part of action research studies. Advantages include the fact that interviews permit the
teacher-researcher to probe further and ask for clarification in a participant’s response to a given question.
Figure 5.2 Semistructured Interview Guide Used in the Positive Reinforcement Study

• What do you see as acceptable forms of positive reinforcement for children in your school?
• What do you think the meaning of positive reinforcement is for you?
  – Do you think it is the same for your teachers? Why or why not?
  – Do you think it is the same for your students? Why or why not?

Figure 5.3 Portion of a Transcript From a Semistructured Interview, Using the Guide Shown in Figure 5.2

CM: How would you describe positive reinforcement? How would you define that, or what does that mean to you?

"Carol": Positive reinforcement means not yelling at the children. It means talking to them in a positive way. Sometimes you can lose your temper. I try not to use time-out a whole lot. I give them choices. If you're going to throw the blocks, then you're going to pick them up. If you're going to hit someone in the head with that toy, then you're going to go apologize to them. And tell them the difference between right and wrong instead of, . . . take for instance E., who likes to throw toys at everybody. Instead of putting him in the corner and my picking up all the toys he's thrown, I make a game out of it. Instead of "E., pick them up, pick them up," we count them as we put them in. So he's still having to do what he did—you know, having to clean up his mess—but we're making a game out of it. Instead of "this was wrong and you're going to sit in the corner for this."

CM: So they don't see it so much as a punishment. Rather, you try to turn it into something constructive?

"Carol": Right. Like this morning, he punched a little girl in the face, and Gail and I both agreed that he needs to sit out of the group for a little while.

CM: So it really depends on the situation? It would be hard to take that situation and turn it into something positive.

"Carol": Right. It depends on what they've done and if they keep doing it all day long. Then they need time away. That's why we have that carpet out there. If the child needs to leave the room and get away from the other children for 5 minutes, they go out and sit on the quiet rug.

In addition, data can be collected—and, therefore, preserved—through the use of audio- and videotapes, although you want to be sure that individuals being interviewed are not made to feel uncomfortable by the presence of an audio or video recorder. Finally, for respondents who cannot or who are unwilling to share
their thoughts, feelings, or perceptions in writing, sitting down and carrying on a conversation about them
is often a reasonable alternative. On the other hand, interviews can be extremely time consuming. Not only
does it take time to collect data from individuals during a verbal conversation, but before the data can be
analyzed, the interviews must be transcribed so that the responses can be read and processed. The general
rule of thumb that I learned in my graduate school days is that for every hour of audiotaped interview, you
can expect approximately 8–9 hours of transcription work, depending on the quality of the recording.
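Budgeting by this rule of thumb is simple arithmetic. The sketch below is purely illustrative; the interview lengths are hypothetical, and the 8–9 hours-per-hour rate is the rule of thumb mentioned above.

```python
# Rough transcription-time budget using the rule of thumb above:
# roughly 8-9 hours of transcription per hour of recorded interview.
interview_hours = [1.0, 0.5, 1.5]  # three hypothetical recorded interviews
RATE_LOW, RATE_HIGH = 8, 9         # transcription hours per recorded hour

total_recorded = sum(interview_hours)
low_estimate = total_recorded * RATE_LOW
high_estimate = total_recorded * RATE_HIGH
print(f"Expect roughly {low_estimate:.0f}-{high_estimate:.0f} hours of transcription")
```

Three hours of recordings, in other words, commit you to roughly a full work week of transcription, which is worth knowing before you schedule the interviews.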
Other limitations of interviews include the fact that respondents are not able to retain their anonymity.
Many people are simply uncomfortable with a tape recorder lying on the table between them and the
interviewer. Finally, respondents often fear that something they have said may be used against them at
some point in the future. An additional responsibility of the teacher-researcher is to put the mind of the
interviewee at ease about such possibilities.
Journals
Data journals may be kept by both teachers and students and can provide valuable insight into the
workings of a classroom (Mills, 2007). In a way, student journals give the teacher information
similar to homework, in that teachers can gain a sense of students' daily thoughts, perceptions, and
experiences in the classroom. Teacher journals can similarly provide teacher-researchers with the
opportunity to maintain narrative accounts of their professional reflections on practice. They truly become
an ongoing attempt by teachers “to systematically reflect on their practice by constructing a narrative that
honors the unique and powerful voice of the teachers' language" (Mills, 2007, p. 70) by reflecting not only observations but also the feelings and interpretations associated with those observations.

Figure 5.4 Sample of Guiding Questions Used for a Focus Group Interview

1. (a) What were your overall perceptions of the process used to gather student feedback on your teaching?
   (b) What aspects of the process did you like?
   (c) What aspects did you dislike?

3. (a) What changes have you made to any of your teaching behaviors as a result of the student feedback?
   (b) What behaviors, if any, are you considering changing in your teaching as a result of the student feedback?

4. (a) What unanticipated benefits did you experience as a result of this process of collecting student feedback?
   (b) What negative consequences did you experience as a result of this process of collecting student feedback?

5. (a) Is this method, that of using rating scales, the most appropriate way to collect student feedback?
   (b) What method(s) might work better? Why?

6. (a) For what specific school situations or student groups would this method of collecting student feedback not be appropriate?
   (b) What could be changed in order to make it more suitable in this context or to these students?

9. (a) What specific things could be changed in order to improve this process of collecting student feedback?

10. (a) Based on your experience, will you continue to collect student feedback in this manner?
    (b) If not, will you continue to collect this information but do so by using a different method? Can you describe that method?

Upon completion of the above questions, explain to the participants that the meeting is about to end. Ask them to take a moment and think about what has been discussed. Then, one by one, ask them if they have any additional comments. If necessary, explore relevant or new comments in greater depth.
Class journals are another means of incorporating journaling into your action research data collection.
A class journal is a less formal version of a student journal. Johnson (2008) suggests that a blank
notebook be passed around the class on a periodic basis or put in a learning center for an extended amount
of time. Students are encouraged to enter their thoughts, ideas, perceptions, feedback, or other forms of
response, such as pictures or diagrams, as they wish. Teachers may want to provide some sort of guidelines
for making entries into the class journal so that it does not become a “quasi-teacher-approved” form of
graffiti that may be offensive to other students (Johnson, 2008).
Figure 5.5 Sample of a Data Collection Form for Existing Student Data

Column headings: Student Name | Number of Days Absent | Reason for Absences | Discipline Referrals | Reasons for Discipline Referrals | Referral for Special Program? (Y/N) | Referral for Social Services? (Y/N) | Retained? (Y/N)
dealing with the validity of qualitative data, researchers are essentially concerned with the
trustworthiness—for example, the accuracy and believability—of the data. Trustworthiness is
established by examining the credibility and dependability of qualitative data. Credibility involves
establishing that the results of qualitative research are credible or believable from the perspective of the
participant in the research (Trochim, 2002c). On the other hand, the concept of dependability emphasizes
the need for the researcher to account for the ever-changing context within which research occurs. The
researcher is responsible for describing the changes that occur in the setting and how these changes
affected the way the researcher approached the study (Trochim, 2002c).
There are three common practices, typical aspects of any qualitative research study, that can help
ensure the trustworthiness of your data. The first of these is triangulation, or the use of multiple data
sources, multiple data-collection methods, and perhaps even multiple teacher-researchers in order to
support the ultimate findings from the study (Glesne, 2006; Hubbard & Power, 2003). A given finding is
supported by showing that independent measures of it tend to agree with each other or at least do not
directly contradict each other (Hubbard & Power, 2003). For example, when you observe Susan actually
doing something that she has told you in an interview that she does and that is also indicated on an open-
ended questionnaire (see Figure 5.6), you likely will have more confidence in concluding that it is probably
an accurate depiction of Susan's practice. In other words, your interview data have been supported by your
observation data and by the questionnaire responses. Had any of the three sources of data contradicted
each other, you likely would have arrived at a different conclusion, perhaps that Susan was telling you what
you wanted to hear, although in reality she did not practice it.
Figure 5.6 Triangulation of Data Sources: Interview With Susan, Observations of Susan, and Susan's Questionnaire Responses
A second practice that can help ensure the quality of your data is known as member checking. This
procedure involves the sharing of interview transcripts, analytical thoughts (such as observation notes
with observer’s comments), and drafts with the participants of the study. The purpose of sharing these
data sources is to make sure that you have represented your participants and their ideas accurately (Glesne,
2006). A third and final procedure involves prolonged engagement and persistent observation. The
idea here is that the more time you spend “in the field,” so to speak, the more you are able to develop trust
with and get to know your participants, learn the culture of their setting (whether it be a classroom or
school building), and observe patterns of behavior to the point of being routine (Glesne, 2006). Observing
or interviewing only once or twice will not afford you this luxury.
numerical scale. Quantitative data collection techniques include surveys, questionnaires, checklists, and
rating scales, as well as tests and other more formal types of measurement instruments. Generally
speaking, quantitative data collection techniques are more efficient, in that you can collect data from
numerous individuals simultaneously. However, the depth of those data does not begin to compare to that
resulting from the use of qualitative techniques.
Students would be instructed to select one of the four possible responses. This type of question is
easily quantifiable; you simply count the number of students who select each option. Furthermore, it is
relatively easy to report the “results” of this item. You might summarize your data and conclude the
following:
It is important to realize that this type of question may be misleading or controlling (Johnson, 2008).
If, in our example, the favorite subject of a given respondent is a foreign language class, how is that
person supposed to respond to the question? Any option that person selects will actually provide
inaccurate information. One alternative is to anticipate such an occurrence by revising the item to read
as follows:
Open-ended items allow the respondents to provide a seemingly limitless number of responses. For
example, we could have reworded our “favorite subject” question as an open-ended question by simply
asking,
Here we might get a wide variety of responses. It is then the responsibility of the researcher to “analyze”
the resulting data by grouping similar items together and then tallying the number of responses in each
category. The result might look like this:
Obviously, this form of the question provides a more accurate sense of what students really like.
The only problem associated with asking open-ended items like this is that you have the sometimes
messy task of grouping responses into similar categories before you can count the responses
(Johnson, 2008).
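The grouping-and-tallying step described above can be sketched in a few lines of code. This is an illustrative sketch only; the responses and the category groupings below are hypothetical, and the genuinely messy part is still the researcher's judgment in deciding which answers belong together.

```python
from collections import Counter

# Hypothetical open-ended responses to "What is your favorite subject?"
responses = [
    "math", "mathematics", "PE", "gym", "science",
    "math", "Spanish", "biology", "gym", "art",
]

# Step 1: group similar answers into categories. These mappings are
# examples; the researcher must decide the groupings.
categories = {
    "math": "Mathematics", "mathematics": "Mathematics",
    "PE": "Physical Education", "gym": "Physical Education",
    "science": "Science", "biology": "Science",
    "Spanish": "Foreign Language", "art": "Art",
}

# Step 2: tally the number of responses in each category.
tally = Counter(categories[r] for r in responses)
for subject, count in tally.most_common():
    print(f"{subject}: {count}")
```

The same `Counter` approach works for closed-response items, where step 1 disappears because the options themselves are the categories.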
The main difference between a survey, or questionnaire, and a rating scale is that surveys are more
appropriate for content-based types of questions (similar to our example above), whereas rating scales
are appropriate when asking individuals to respond to a set of questions where their response indicates
the strength (e.g., the extent of agreement, level of frequency, degree of understanding) of that
response (Johnson, 2008). Rating scales can be used very effectively to measure students’ attitudes,
perceptions, or behaviors. There are two main types of scales that appear in items on a rating scale:
Likert and Likert-type scales. A Likert (pronounced “lick-ert”) scale begins with a statement and then
asks individuals to respond on an agree-disagree continuum. The Likert scale typically ranges from
strongly agree to strongly disagree. I typically recommend using a 5-point scale, with the 5 points
defined as follows:
1 = strongly disagree
2 = disagree
3 = no opinion
4 = agree
5 = strongly agree
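Scoring responses coded this way is straightforward. The sketch below computes an item mean using the 5-point coding shown above; the set of responses is hypothetical.

```python
# 5-point Likert coding, as defined above.
LIKERT = {
    "strongly disagree": 1,
    "disagree": 2,
    "no opinion": 3,
    "agree": 4,
    "strongly agree": 5,
}

# Hypothetical responses from five students to one statement.
responses = ["agree", "strongly agree", "no opinion", "agree", "disagree"]

scores = [LIKERT[r] for r in responses]
mean_score = sum(scores) / len(scores)
print(f"Item mean: {mean_score:.2f}")  # higher = stronger agreement
```

A mean near 5 indicates strong agreement with the statement; a mean near 1 indicates strong disagreement.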
There tends to be quite a bit of disagreement among those with expertise in conducting research
through the use of surveys regarding the appropriateness of including a neutral point on a scale. By
including it, you allow your respondents to indicate that they truly are neutral or have no opinion, if in
fact that is the case for them. However, if provided with a neutral option, there is a tendency for people
not to think much about how they truly feel; they simply select the neutral option, which may not
represent their true belief (i.e., the data they provide are inaccurate). On the other hand, if individuals
truly are indifferent or have no opinion and you do not provide this option—because you are operating
under the assumption that no one is truly neutral about anything—you “force” them to choose
something that they do not really believe, thus providing inaccurate data once again. There is no right or
wrong when it comes to deciding on the inclusion of a neutral point on your rating scale. However, you
should consider the implications of both including and excluding such a point and then design your scale
accordingly. Figure 5.7 presents a portion of a rating scale that I used in a study that focused on students
providing their teachers with feedback on their classroom teaching. Notice the format of the Likert-scaled
items. Also notice that a higher number corresponds to a higher level of agreement with a given
statement.
A similar type of scale is a Likert-type scale. This type of scale also exists on a continuum, but
something other than extent of agreement is being measured. For example, a Likert-type item might
require participants to respond on a scale that examines quality (“excellent . . . poor”), frequency of
occurrence (“always . . . never”), or level of comfort (“very comfortable . . . not at all comfortable”)
(Mertler & Charles, 2008). An example of a Likert-type scale, used in a study of prekindergarten-to-
kindergarten transitions, is shown in Figure 5.8.
I want to mention one more thing about using surveys and rating scales with students. Teacher-
researchers need to be sure that the various aspects—not just the reading level—of the instrument are
appropriate for the age or grade level of students. Although I recommended earlier that a 5-point scale is
typically appropriate, one could see how that might create difficulties for young children—they obviously
would not be able to discriminate between adjacent points on the scale. However, do not shy away from
using such data collection instruments with younger children. You would likely provide fewer options on
Figure 5.7 Portion of a Rating Scale Used to Gather Student Feedback on Classroom Teaching

The purpose of this questionnaire is for you to help your teachers to improve. Several statements about your teacher are listed below. Please circle the number, using the code below, that describes how much you agree with each statement. Your responses will be anonymous; please do not place your name anywhere on this form. Please respond to each statement as honestly as you possibly can and by circling only one number for each statement.

1 = Strongly Disagree   2 = Disagree   3 = No Opinion   4 = Agree   5 = Strongly Agree
the scale and perhaps even use graphics for the children to respond to. Several years ago, I was part of a
research team that attempted to “survey” kindergarten students as part of the prekindergarten-to-
kindergarten transitions study. We had the teachers read the statements to the children and then asked
them to put an X through the face that represented how they felt (see Figure 5.9).
Unfortunately, the children had no idea—and our explanations did not help at all—what the numbers
were for. They were instructed to locate the number 1 on their response sheet, as the teacher read the first
Figure 5.8 Portion of a Likert-Type Scale Used in the Prekindergarten-to-Kindergarten Transitions Study

Directions: Please list all students and rate each student on the eight characteristics listed as they relate to the beginning of school. Use the numbered scale listed below. In addition, feel free to add any comments that would aid in describing the adjustment of the students.

1 = Not at All   3 = Some of the Time   5 = All of the Time

Adjustment Indicators (eight items, numbered 1 through 8)
statement number, and then place their X on the appropriate face. After the first few statements, we realized
that they were simply placing the X over the same faces in the first row. Several of the children had response
sheets that looked like this:
1. [image: a sample response sheet with each face in the first row marked with an X]
Obviously, you can see the problems that this created with respect to the accuracy of our data! On the
spur of the moment, we decided to revise the nature of the response sheet and came up with what you see
in Figure 5.10. Using this format, we could direct the children’s attention to the box with a certain image in
it and have them place the response only in that box.
A key advantage of surveys and rating scales is that they are very effective at gathering
data concerning students' attitudes, perceptions, or opinions. They are essentially written versions of
structured interview guides, where individuals respond to a specific set of questions in writing, as opposed
to responding orally. Rating scales and other closed-response items can be answered, and the responses
can be tallied or counted quickly. Integrating the use of computer software can make this process of
tallying even quicker.
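As a sketch of how software can speed up the tallying, assuming closed-response answers have been entered as a simple list (the response labels below are hypothetical, not drawn from any particular survey):

```python
from collections import Counter

# Hypothetical closed-response answers collected for a single survey item.
responses = ["agree", "strongly agree", "agree", "disagree",
             "agree", "neutral", "strongly agree", "agree"]

# Counter tallies the number of responses for each option in one pass.
tally = Counter(responses)
print(tally.most_common())
# → [('agree', 4), ('strongly agree', 2), ('disagree', 1), ('neutral', 1)]
```

The same approach scales to a spreadsheet export: read each column of responses and tally it with a Counter.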
There are, of course, also limitations to the use of surveys for action research projects. Analyzing
responses to open-ended items can sometimes be time consuming, due to the fact that responses may be
ambiguous (Schmuck, 1997). This limitation can be overcome by replacing open-ended items with rating
scales or other closed-response items. Another limitation is that if the teacher-researcher is not clear about
an individual response, there is no opportunity or mechanism for asking respondents to clarify their
answer, as with interviews.
At this point, I would like to offer several suggestions—adapted from several sources (Johnson, 2008;
Mills, 2007; Schmuck, 1997; Schwalbach, 2003)—regarding the development and use of surveys and rating
scales as a means of collecting action research data. When developing a new instrument, it is important
to apply these suggestions.
Checklists
A checklist is a list of behaviors, characteristics, skills, or other entities that a researcher is interested in
investigating (Johnson, 2008; Leedy & Ormrod, 2005). The primary difference between a checklist and a
survey or rating scale is that checklists present only a dichotomous set of response options, as opposed to
some sort of continuum. Instead of indicating the extent, degree, or amount of something, checklists enable
the teacher-researcher to indicate simply if the behavior or characteristic is observed or present or if it is not
observed or present. Checklists are quicker for the teacher-researcher to use than are surveys and rating
scales; however, they provide data that are not nearly as detailed as those resulting from the use of rating
scales.
If you are observing students, of any age, and are using a checklist to record behaviors, you will want to
keep the list of behaviors or characteristics to a manageable number. Otherwise, you may become
overwhelmed with the sheer volume of things you must observe and record on the checklist. A sample
student checklist is presented in Figure 5.11.
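Because checklist responses are dichotomous, they map naturally onto true/false values. A minimal sketch, using hypothetical behaviors and students, might look like this:

```python
# Each checklist entry records only whether a behavior was observed (True)
# or not observed (False); there is no continuum as with a rating scale.
# The behaviors and students below are hypothetical.
behaviors = ["Respects others", "Stays on task", "Completes work"]
observations = {
    "Student A": {"Respects others": True, "Stays on task": False,
                  "Completes work": True},
    "Student B": {"Respects others": True, "Stays on task": True,
                  "Completes work": True},
}

# Tally how many students exhibited each behavior.
tally = {b: sum(obs[b] for obs in observations.values()) for b in behaviors}
print(tally)
# → {'Respects others': 2, 'Stays on task': 1, 'Completes work': 2}
```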
Figure 5.11 A Sample Student Checklist Looking at Independent Reading at the Elementary Level
[Checklist form with fields for grade and date, followed by dichotomous behavior items such as
"Respects others" and "Stays on task."]
to measure, based on the focus of our research? Though any data that you might collect may be entirely
accurate, the critical factor is whether or not they are appropriate and accurate for your purposes (i.e., do they
enable you to accurately answer your research questions?). For example, imagine that a reading teacher
uses the results from the reading portion of a standardized test to group her students into above-average,
average, and below-average reading groups. Then imagine that a social studies teacher uses those same
reading scores to identify students who he believes would be successful in an advanced placement history
course. The first interpretation and use of the scores is valid; the second is not, because the data were
put to a purpose they were never intended to serve. The determination of
the validity of data ultimately has a substantial effect on the interpretation of those data, once they have
been analyzed, and the subsequent conclusions drawn from those results (Mertler & Charles, 2008).
Presently, validity is seen as a unitary concept (AERA, APA, & NCME, 1999), combining that which has
been previously described as four distinct types of validity: content, concurrent, predictive, and construct.
It is defined as the “degree to which all the accumulated evidence supports the intended interpretation of
test scores for the proposed purpose” (p. 11). Validity of quantitative data can be determined through the
examination of various sources of evidence of validity. Although similar to the four outdated types of
validity, the five sources of validity evidence are unique in their own right (Mertler & Charles, 2008). These
five sources of evidence of validity are based on the following: test (or instrument) content, response
processes, internal structure, relations to other variables, and consequences of testing. Many of these
sources of validity evidence are more appropriate for large-scale testing programs, especially where it is
important for the results to be generalizable to much larger populations than simply those individuals
included in a research study. Since this is not a purpose or goal of classroom-based action research, I am
suggesting that teacher-researchers be most concerned with evidence of validity based on instrument
content. This source of evidence is based on the relationship between the content addressed on a test, or on
another instrument used for data collection, and the underlying construct (or characteristic) it is trying
to measure. For example, assume we wanted to survey students to determine their attitudes toward
learning mathematics. We would want to ensure that the questions we asked on the survey dealt directly
with various aspects of learning math, not learning in any other subject areas or questions that were
completely extraneous to the construct of “learning mathematics.” As another example, consider a test you
might administer to students on their understanding of the process of photosynthesis. If you wanted to be
able to draw conclusions specifically about their understanding of this scientific process, you would need
to be sure to ask only questions related to the process. If unrelated questions were also asked of students
on the test—and provided that they contributed to the overall score on the test—interpreting the scores
as an indication of their understanding only of photosynthesis would not be a valid, legitimate use of those
scores. This type of evidence is typically based on subjective, logical analysis of content coverage on the
test and can be established by critical review by teachers, as well as by the judgments of experts in the
particular content field. In other words, although it is a subjective process, it is important for teacher-
researchers to critically examine the individual items and overall content coverage on a survey, rating scale,
checklist, test, or quiz in order to ensure that they are measuring what they intended to measure.
Reliability, a second essential characteristic of quantitative data, refers to the consistency of collected
data. If you hear three accounts of a minor car accident from three different individuals, but each account
differs as to what happened, who was involved, and what the results were, you will likely have little
confidence in any of the versions you have heard. In other words, the accounts (the data) are inconsistent
and, therefore, unreliable. If, however, each account is essentially similar, the information you have received
is consistent and may be considered reliable. Similarly, if you administer a certain test repeatedly under
identical circumstances but find that you get different results each time, you will conclude that the test is
unreliable. If, however, you get similar results each time you administer the test, you will consider the
results reliable and, therefore, potentially useful for your purposes (Mertler & Charles, 2008).
As with the determination of the validity of quantitative data, there are several methods of determining
the reliability of data (Mertler & Charles, 2008), not all of which are appropriate for teachers conducting
classroom-based research. Reliability of quantitative data is usually established by correlating the results with
themselves or with other quantitative measures. Three different methods are used—test-retest, equivalent
forms, and internal consistency. Internal consistency is a statistical estimate of the reliability of a test that is
administered only once. For this reason, this type of reliability estimate is most useful for classroom teachers
conducting research. One of the easiest internal consistency formulas to use is the Kuder-Richardson
formula 21 (also known as KR-21). The resulting statistic will range from 0.00 to 1.00; the closer the value
is to 1.00, the more reliable your data are. The formula for calculating KR-21 internal consistency is
r = [(K)(SD)² − M(K − M)] / [(SD)²(K − 1)]
where r is the reliability index, K is the number of items on the test or instrument, M is the mean or average
score, and SD is the standard deviation of the scores. Imagine that a test consists of 40 items, that the mean
is equal to 27.3, and that the standard deviation is 4.64. The internal consistency reliability for the exam,
using the KR-21 formula, is
r = [(40)(4.64)² − 27.3(40 − 27.3)] / [(4.64)²(40 − 1)]
  = (861.18 − 346.71) / 839.65
  ≈ 0.61
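The calculation is simple enough to script. Here is a sketch of the KR-21 formula as a Python function, checked against the 40-item example from the text:

```python
def kr21(k, mean, sd):
    """Kuder-Richardson formula 21: an internal-consistency reliability
    estimate for a test administered once, computed from the number of
    items (k), the mean score, and the standard deviation of scores."""
    return (k * sd ** 2 - mean * (k - mean)) / (sd ** 2 * (k - 1))

# The example from the text: 40 items, mean 27.3, standard deviation 4.64.
r = kr21(40, 27.3, 4.64)
print(round(r, 2))  # → 0.61
```

The closer the result is to 1.00, the more reliable the data; values near 0.00 indicate inconsistent scores.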
We often think of validity and reliability as two distinct concepts, but in fact they share an important
relationship (Mertler & Charles, 2008). It is possible for scores obtained from an instrument to be reliable
(consistent) but not valid (measuring something other than what was intended). In contrast, scores cannot
be both valid and unreliable—if scores measure what was intended to be measured, it is implied that they
will do so consistently. Therefore, reliability is a necessary, but not sufficient, condition for validity. When
establishing the validity and reliability of your research data, always remember the following adage: A valid
test is always reliable, but a reliable test is not necessarily valid (Mertler & Charles, 2008).
This class is made up of thirty-one average and above-average science students. I chose this
last class of the day for purely logistical reasons. With only one computer in my classroom, I
needed to borrow eleven computers daily from neighboring teachers. Seventh period was the
most agreeable period to the other teachers. An extra advantage of using the last period of the
day was that students could return the computers after the final dismissal bell and not take
valuable class time for this task.
My data were generated by comparing these students’ attitudes toward learning science at the
beginning of the school year, during my study, and at the conclusion of the study period. The
students’ attitudes and reactions were documented by the students themselves, by their parents,
and by my own observations. Collecting data from three sources allowed for triangulation of the
findings in this study. Data triangulation helped reduce the likelihood of error in the findings when
similar results were reported from two or more of the sources. I surveyed all of the class members
and their parents at the beginning and the end of my study.
During the first six weeks of school, I reviewed the scientific method, the metric system,
scientific measurement, and laboratory safety. At this point multimedia technology was not part
of the curriculum. Some hands-on activities were used at this time. The students worked both
individually and in groups. To determine each student’s level of enthusiasm for learning science,
during this time I administered a survey which contained the following questions: How do you
like learning science? How have you liked learning science so far this year? How enthusiastic
are you about exploring science at home? Students were asked to rate their answers to each
question using a scale of 1 to 5. The scale was represented by (1) a very unenthusiastic response,
(2) an unenthusiastic response, (3) indifference, (4) an enthusiastic response, and (5) a very
enthusiastic response.
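Summarizing such 1-to-5 ratings for comparison across survey administrations takes only a few lines; the ratings below are hypothetical, not the teacher's actual data:

```python
# Hypothetical 1-5 enthusiasm ratings for one survey question, where
# 1 = very unenthusiastic and 5 = very enthusiastic.
beginning = [4, 5, 3, 4, 2, 5, 4, 3, 4, 5]
end = [5, 5, 4, 4, 3, 5, 5, 4, 4, 5]

def summarize(ratings):
    """Return the mean rating and a count of each scale point."""
    mean = sum(ratings) / len(ratings)
    counts = {point: ratings.count(point) for point in range(1, 6)}
    return mean, counts

print(summarize(beginning)[0])  # → 3.9
print(summarize(end)[0])        # → 4.4
```

Comparing the two means (and the shift in the counts) gives a quick picture of any change in enthusiasm over the study period.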
Additionally, I sent home parent surveys with each student in order to solicit and record the
parents’ opinions concerning their child’s enthusiasm for learning science. The survey included
two questions: How enthusiastic is your child about learning science? How enthusiastically does
your child do science activities at home? I used the same rating scale for the parents that I used
with the students.
At the beginning of the second six weeks I introduced a unit on oceanography.
Oceanography was used as the unit of study primarily because of the number of resource
materials available to the students through the media center. It was during this unit that I began
to integrate technology into my curriculum. As the unit was introduced I asked my students to
look through the oceanography chapters in their textbooks and make a prioritized list of the
eleven subtopics in physical and biological oceanography they would like to study. Students
were grouped according to their interest as much as possible and were assigned to work in
groups of two or three to develop a multimedia presentation that would be used as an
instructional tool for the other students.
During this period I began to introduce them to the multimedia computer program, HyperStudio
(Wagner, 1994). HyperStudio is a program that allows the user to combine sound, graphics, and
animation with text to make creative and entertaining presentations. The introduction of
HyperStudio and the development of the student presentations took six weeks to complete.
Throughout the study I observed and made notes as to how the students were working and
their reactions to class. These observations were guided by several questions: What problems are
the students encountering as they work on their multimedia presentations? Are the students
having problems with content? Are there problems working in groups? Are they having problems
using the multimedia software? These observations and notes were useful in making sense of any
fluctuations I found in the end-of-study student surveys. I was able to discern the source of
problems so that content difficulties or friction within groups was not confused with a loss of
enthusiasm for technology.
At the end of the oceanography unit I had each group of students share their presentations
with the rest of the class. After the presentations, each group was asked to comment to the class
on how they enjoyed developing their works. I noted these student comments as they were
presented to the class. Each student was also asked to make written, individual comments to me,
responding to the following questions: What problems did you encounter while you were
developing your presentation? What did you learn about your topic while you were developing
your presentations? Did you learn from the other students’ presentations? Would you like to do
another presentation on some other topic in science? Again I surveyed the parents of these
students to gain information about their child’s interest in learning science. I asked the following
questions: Is your child talking about science at home? Is your child eager to share what we are
doing and learning in science class? Do you feel that your child is learning science? Why or why
not? How enthusiastic is your child about learning science? How enthusiastically is your child
doing science activities at home? I again surveyed the students asking the same questions that I
had asked in the beginning survey.
Two weeks prior to my starting date, a video camera was placed in my first period classroom and
left on so that the students would become comfortable in the presence of the camera in the room.
Students were given numbers on construction paper and asked to hold on to them for later use.
On day one the first period class was videotaped for the first time. At the close of the period
students were asked to complete a four-question survey. They were asked not to use their names,
but instead, they were asked to use a number that was given to them earlier. I jotted down notes
on how the class session went in a teacher journal.
The week continued with the second taping three days later. Student surveys were filled out
for the entire week. Entries were made in the teacher journal whenever I could remember. This
turned out to be about three times during that first week.
During the second week the class was taped on Monday and Thursday. At the end of the
second week, modifications were made to survey questions 1 and 3 due to mixed
responses given by students.
I continued to tape my first period science class twice a week for a total of five weeks. Student
surveys were given to all students on a random basis throughout the five-week period. Journal
entries were made daily.
Recall that the purpose of this action research study is to improve teachers’ classroom-based
assessments in an effort to improve student achievement.
The teachers who make up Team North at Jones Middle School felt very comfortable about the way
in which their data would fit into the pretest-posttest control group design that they selected for their
action research study. They developed consent forms that the students in Team North and the students
in Team East, as well as their parents, signed. The forms requested permission for the students’ fall
(October) and spring (March) test scores, resulting from the two administrations of the statewide
proficiency test, to be used for an additional purpose—their action research study.
Approximately 4 weeks after each administration of the test, the individual student test reports
came back to the school. For each of the students in the two teams, the four Team North teachers
pulled the test report from the student’s cumulative folder in the main office. From the test report, they
recorded the scaled score (ranging from 200 to 500) for each of the four main subtests: language arts,
mathematics, science, and social studies. They recorded the scores for each subtest (where “1” indicated
the fall test scores and “2” indicated the spring scores), along with each student’s identification number
and team membership (where Team North was coded “1” and Team East “2”) in a spreadsheet, which
looked like this:
student_id group la_1 math_1 sci_1 ss_1 la_2 math_2 sci_2 ss_2
They double-checked each of the students’ test scores across the two administrations of the
proficiency test for accuracy of entry into the database. When all of the scores had been verified, they
prepared their data (and themselves) for the next step—data analysis!
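Assuming the spreadsheet columns shown above, the teachers' verification pass can be sketched in Python (all student IDs and scores below are hypothetical):

```python
# Each dictionary mirrors one spreadsheet row: "_1" columns hold fall scores,
# "_2" columns hold spring scores; group 1 = Team North, group 2 = Team East.
# All student IDs and scaled scores here are hypothetical.
rows = [
    {"student_id": 1001, "group": 1, "la_1": 310, "math_1": 295, "sci_1": 305,
     "ss_1": 288, "la_2": 334, "math_2": 322, "sci_2": 318, "ss_2": 301},
    {"student_id": 2001, "group": 2, "la_1": 302, "math_1": 315, "sci_1": 290,
     "ss_1": 297, "la_2": 311, "math_2": 319, "sci_2": 300, "ss_2": 305},
]

def verify(rows):
    """Double-check data entry: every scaled score must fall within 200-500."""
    score_keys = [k for k in rows[0] if k not in ("student_id", "group")]
    return all(200 <= row[k] <= 500 for row in rows for k in score_keys)

print(verify(rows))  # → True
```

A check like this catches transposed digits or misplaced entries before the analysis stage, which is exactly what the teachers' manual double-checking accomplished.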
Recall that the purpose of this action research study is to improve students’ reading comprehension
skills within a Title I context.
In order to address her initial research question—which proposed to examine differences in students’
reading comprehension skills following the use of revised teacher-developed comprehension items, based on
pretest and posttest diagnostic test scores—Kathleen needed to select an appropriate and valid measure of
reading comprehension. After reviewing the various diagnostic tests with which she was familiar and had
experience administering, Kathleen selected the Woodcock Reading Mastery Test–Revised (Form H) to
administer to her reading students in September and in May. From the resulting student score reports,
she would extract the Reading Comprehension Cluster score, which appears as a percentile rank. An
average score on this subtest is the 50th percentile; Kathleen’s upper-elementary students typically score
near the 35th percentile. She obviously hoped to improve that performance over the course of the
school year.
Kathleen’s second research question dealt with the perceptions held by both her students and
herself regarding the students’ reading comprehension skills. She proposed to collect two forms of
data to enable her to address the nature of those perceptions. First, she would conduct daily
observations of her students and record both what she saw and any analytical thoughts she may
have had while conducting the observations. The focus of her observations would be the degree to
which the students could answer oral and written questions after having read a passage from a book.
Specifically, she would look for how her students used the strategies for reading comprehension
that they had been taught. Second, Kathleen also wanted to periodically ask her students direct
questions regarding the use of those reading comprehension strategies. She designed a semistructured
interview guide for conducting these student interviews. Her interview guide included the following
questions:
• What strategies do you use to help you understand what you read?
• Do you enjoy reading?
Kathleen planned to interview each student at least twice at roughly 2-month intervals during the
course of her action research project. She anticipated learning more about their perceptions of reading,
in general, and reading for understanding. She was also curious as to whether those perceptions would
change over time.
Recall that the purpose of this action research study is to improve students’ understanding of the
processes of mitosis and meiosis.
Sarah taught her mitosis and meiosis unit in January and February and administered the unit test in
mid-February. In order to address her first research question, Sarah collected her students’ test scores
and recorded them on a spreadsheet, excerpts of which follow:
Student        Period   Instruction   Test Score
Adam F.           1          1            98
Becky S.          1          1            74
Chris W.          1          1            85
Michael M.        3          2            87
Nancy T.          3          2            91
Ophelia J.        3          2            95
Notice that Sarah recorded not only the test score for each individual student but also the
class period (i.e., 1 or 3 in the excerpt of the table) and the type of instruction that class received
(i.e., “1” = traditional instruction, and “2” = traditional instruction plus supplemental resources).
She knew that this information would be necessary for her to be able to conduct any
comparative statistical analyses. Because she utilized a two-group comparative design, she knew
from her coursework that she would need to analyze the test scores using an independent-
samples t test.
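An independent-samples t statistic can be computed with the standard library alone. This sketch uses the pooled-variance (Student's) form, and all scores are hypothetical rather than Sarah's actual data:

```python
from statistics import mean, variance

def independent_t(sample_a, sample_b):
    """Student's independent-samples t statistic with pooled variance,
    comparing the means of two unrelated groups."""
    na, nb = len(sample_a), len(sample_b)
    pooled = (((na - 1) * variance(sample_a) + (nb - 1) * variance(sample_b))
              / (na + nb - 2))
    return (mean(sample_a) - mean(sample_b)) / (pooled * (1 / na + 1 / nb)) ** 0.5

# Hypothetical unit-test scores for the two instructional conditions.
traditional = [74, 85, 80, 77, 82]       # period 1: traditional instruction
supplemented = [87, 91, 95, 88, 90]      # period 3: supplemental resources
t = independent_t(supplemented, traditional)
print(round(t, 2))  # → 4.48
```

In practice, the resulting t value would be compared against a critical value for the appropriate degrees of freedom to judge whether the difference between groups is statistically significant.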
To help answer her second research question, Sarah first created a collaborative classroom space on
the Internet in the form of a blog (at http://www.blogger.com). In order to initiate the classroom
“discussion,” she posed a couple of general questions, in separate threaded discussion boards, about
mitotic and meiotic processes.
All of her students were required to post a minimum of two comments or questions. She believed
that the blogs could provide her with rich qualitative data, since the students could respond to each
other’s questions and comments. In addition, use of blogs has a distinct advantage from a data
collection perspective—all of the students’ submissions would be recorded for her to analyze at a
later time.
She also conducted what she called “group oral exams” and “individual oral exams,” which, as you
can imagine, did not excite her students very much. However, she explained to them that she really just
wanted them to discuss what they knew and clearly understood about the processes of mitosis and
meiosis, both in a group setting and individually. She informed them that these oral exams would count
in their grades but that the written unit test would serve as the primary basis for their grades in this
unit. During one day in each class, she engaged the students in a group discussion about what they had
learned. For example, she asked one person to begin discussing the steps in mitosis, and at some point
during the response, she stopped that student and asked another to continue from that point. Sarah
took notes during each class, highlighting what the students in each class seemed to clearly understand
and what they continued to struggle with. Before the end of the day, she had already noticed some
patterns emerging.
The next day, each student was called up to Sarah’s desk for the individual oral exam. These were very
structured (due to time constraints) and consisted of four brief questions, which she asked of each
student while carefully recording the responses.
The final question for each student came from a set of questions about the stages of mitosis. Sarah
showed the Java-based animation of the entire process of mitosis, stopping it at random, but different, places
for each student. She then asked each student to identify in which phase the process had been stopped.
After recording all of this information and gathering all of the blog entries, Sarah was ready to begin
the analysis of her data.
Summary
1. Qualitative data are narrative, appearing primarily as words.
• Qualitative data are usually collected through observations, interviews, or journals or by
obtaining existing documents or records.
• Observations involve carefully and systematically watching and recording what you see and hear
in a given setting.
• Classroom observations may be structured, semistructured, or unstructured.
• Unstructured or semistructured observations allow for the flexibility to attend to other events
occurring in the classroom.
• Classroom observations are usually recorded in the form of field notes, which may include
observer’s comments.
• Interviews are typically formal conversations between individuals.
• Interviews typically follow an interview guide, which may be structured, semistructured, or
open-ended.
• Interviews can also be conducted with groups of individuals in an interview known as a focus
group.
• Interviews may also be conducted informally or via e-mail.
• Journals may also be kept by both teachers and students in order to provide valuable insights
into the workings of a classroom.
• Existing documents and records, originally gathered for reasons other than action research, are
abundantly available in schools and may be used as additional sources of information. These
include classroom artifacts, such as student work.
• It is important for teacher-researchers to establish the trustworthiness of their data. This
includes the accuracy, credibility, and dependability of one’s qualitative data.
2. Quantitative data are numerical and include just about anything that can be counted, tallied, or
rated.
• Surveys are lists of statements or questions to which participants respond.
• Questionnaires are one specific type of survey involving the administration of questions or
statements in written form.
• Items on surveys can consist of open-ended questions or closed-response rating scales.
• A closed-response question or statement provides the respondent with a number of choices
from which to select. Analysis of the resulting data involves counting the number of responses
for each option.
• Open-ended items allow for a seemingly limitless number of possible responses. Analysis of
these data involves categorizing responses into similar groups and then counting them.
• Surveys and rating scales are effective at gathering data simultaneously from numerous
individuals but can sometimes be time consuming to analyze.
• Checklists are a simple form of rating scale where only a dichotomy of response options (e.g.,
present/not present) exists.
• Tests and other formal instruments can be used as quantitative data, provided they are
supplemented with other forms of data.
• Validity of quantitative data has to do with the extent to which the data are what they are
believed to be.
• Reliability refers to the consistency of quantitative data and is determined statistically.
• Remember the following: A valid test is always reliable, but a reliable test is not necessarily valid.