This was produced from a copy of a document sent to us for microfilming. While the
most advanced technological means to photograph and reproduce this document
have been used, the quality is heavily dependent upon the quality of the material
submitted.
The following explanation of techniques is provided to help you understand
markings or notations which may appear on this reproduction.
1. The sign or "target" for pages apparently lacking from the document
photographed is "Missing Page(s)". If it was possible to obtain the missing
page(s) or section, they are spliced into the film along with adjacent pages.
This may have necessitated cutting through an image and duplicating
adjacent pages to assure you of complete continuity.
2. When an image on the film is obliterated with a round black mark it is an
indication that the film inspector noticed either blurred copy because of
movement during exposure, or duplicate copy. Unless we meant to delete
copyrighted materials that should not have been filmed, you will find a
good image of the page in the adjacent frame.
3. When a map, drawing or chart, etc., is part of the material being photographed, the photographer has followed a definite method in "sectioning" the material. It is customary to begin filming at the upper left hand corner of a large sheet and to continue from left to right in equal sections with small overlaps. If necessary, sectioning is continued again, beginning below the first row and continuing on until complete.
4. For any illustrations that cannot be reproduced satisfactorily by
xerography, photographic prints can be purchased at additional cost and
tipped into your xerographic copy. Requests can be made to our
Dissertations Customer Services Department.
5. Some pages in any document may have indistinct print. In all cases we
have filmed the best available copy.
University
Microfilms
International
300 N. Zeeb Road, Ann Arbor, MI 48106
18 Bedford Row, London WC1R 4EJ, England
8012402
OEHMKE, THERESA MARIA, PH.D., 1979
University Microfilms International
300 N. Zeeb Road, Ann Arbor, MI 48106
Copyright 1979
by
OEHMKE, THERESA MARIA
by
Theresa Maria Oehmke
Graduate College
The University of Iowa
Iowa City, Iowa
CERTIFICATE OF APPROVAL
PH.D. THESIS
Thesis committee:
Thesis supervisor
Member
Member
Member
DEDICATION
To
ACKNOWLEDGEMENT
For their help in the preparation of this thesis, I owe an expression of appreciation to a large number of persons; only a few of whom I
shall mention by name.
To Professor Harold Schoen I extend my thanks for all the guidance,
motivation, direction and assistance he gave me from the initiation to
the completion of this study.
Special thanks are due George Immerzeel, Joan Duea, Earl Ockenga
and John Tarr and the personnel at the Malcolm Price Laboratory School
for their encouragement and support during several phases of this investigation.
A debt of gratitude to Professors H. D. Hoover and A. N. Hieronymus
is acknowledged for their assistance on some of the statistical details
of the testing procedure and for providing me with the opportunity to
collect some of the pertinent data in this study.
I would like to express my appreciation to William M. Smith for the
use of his expertise on the technical aspects of the use of the computer
on data processing.
Thanks is given to Ada Burns for her amazing ability to type what I
thought I had written.
Finally, I would like to thank my husband Bob, and son Jim, and all
my friends who were always ready with an encouraging word.
TABLE OF CONTENTS

LIST OF TABLES

LIST OF FIGURES

CHAPTER

I. INTRODUCTION
   Purpose
   Overview of the Study
   IPSP Test Development
   IPSP Test Validation
      Reliability
      Content Validity
      Concurrent Validity
      Discriminant Validity
   Operational Definitions Used in This Study
   Overview

II. [Remaining chapter titles and all page numbers are not recoverable from this reproduction.]

APPENDIX A.
APPENDIX B.
APPENDIX C. FORMS
APPENDIX D.
APPENDIX G.

BIBLIOGRAPHY
LIST OF TABLES

[Table titles and page numbers are largely illegible in this reproduction. Recoverable entries include Tables 4 through 7, each titled "Reliability Analysis," and tables dated October 1978 and March 1979.]
LIST OF FIGURES

[Figure titles and page numbers are largely illegible in this reproduction. The one recoverable entry is Figure 2, Interviewing Procedures.]
CHAPTER I
INTRODUCTION
One becomes aware not only of the vast quantity of research that is being done but also of the many and diverse methods used to study the problem solving processes.
Studies range from simply observing individual students as they
solve problems to factor analysis of paper-pencil measures of problem
solving. A data-gathering method that has come into prominent use today
is the structured one-to-one interview.
Drawing on his own experiences and his analyses of what others think they are doing when they solve problems, Wallas suggested a four-step model: preparation, incubation, illumination, and verification. Drawing upon many years of experience as a mathematician and a teacher, Polya (1957, 1962) proposed another four-step model: understand the problem, make a plan, carry out the plan, and look back at the complete solution.
Restle and
Davis (1962) suggested that the problem solver goes through a number of
independent but sequential stages. The student solves a subproblem at
each stage, thereby allowing him to go on to the next step.
These and
Would the teacher be able to use such a test to help plan for
and the
However,
2. Judge the content validity of the IPSP test using the judgments of a panel of experts.
1.
Get to
A.
B.
C.
2.
3.
Do It
A. Choose the necessary computation
B. Estimate from a diagram
C. Compute from a diagram
D. Use a table
E. Compute from an equation
4. Look Back
A. Identify problems that can be solved in the same
way as a given one
B. Vary conditions in a given problem
C. Check a solution with the conditions of the
problem
Figure 1. Steps and Component Skills of the IPSP Test Model
Like standardized tests, the IPSP test can be efficiently administered to large groups of students and machine scored, with various norm data easily obtainable.
Major questions concerned the validity of the test, if indeed, a reliable test with reliable subtests could be constructed. By utilizing
the Iowa Testing Program's tryout facilities it was possible to construct experimental test units, administer them to representative samples of Iowa fifth through eighth graders, and revise the units based
on the item analyses and test data. Also, over 100 students
were interviewed at various stages in the test development process as a
concurrent check on the test validity.
the test, i.e., noise, distractions, poor lighting, or lack of uniformity in giving test instructions. The most difficult factors to control are the subject's attributes.
Reliability of a score for a testing sample is defined as the ratio of the variance of the true score to the variance of the observed score:

    r_tt = s_T^2 / s_X^2

where r_tt = reliability of the test, s_T^2 = variance of the true score, and s_X^2 = variance of the obtained score. Reliability is also related to the internal consistency of the items. It may be calculated as

    r_tt = (J / (J - 1)) * (1 - (sum of s_Xi^2) / s_X^2)

where J is the number of items and X_i is the score on the i-th item.
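As an illustration only (not part of the original thesis), the internal-consistency form of this reliability coefficient can be computed from a matrix of item scores. The function name and sample data below are hypothetical.

```python
# Sketch of the internal-consistency reliability estimate
# r_tt = (J/(J-1)) * (1 - sum(item variances) / variance(total scores)).
# Illustrative only; names and data are not from the thesis.

def internal_consistency(scores):
    """scores: one list of J item scores per student."""
    J = len(scores[0])
    totals = [sum(row) for row in scores]

    def var(xs):
        # population variance
        m = sum(xs) / len(xs)
        return sum((x - m) ** 2 for x in xs) / len(xs)

    item_vars = [var([row[i] for row in scores]) for i in range(J)]
    return (J / (J - 1)) * (1 - sum(item_vars) / var(totals))
```

With perfectly consistent items (every student scores all items alike) the estimate reaches 1.0; as item responses become unrelated, it falls toward 0.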
Content Validity
The basic content validity question is: Are the items in question
a representative sample of the construct or subject matter domain to be
measured?
The mathematics educators were the IPSP project team and advisory board.
... series of problems to a child and asking him to think aloud as he attempted to solve each problem.
The discriminant validity issue refers to the degree to which the subtest scores differ from each other and from scores on other similar tests. Discriminant validity in this study is approached through the use of matrices of correlations corrected for attenuation. Intercorrelated variables include steps 1, 3, and 4 of the IPSP test and the aforementioned subtests of the ITBS. The IPSP subtest scores should not be highly correlated with each other nor with the ITBS subtests.
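For illustration (a sketch, not taken from the thesis), the classical correction for attenuation divides an observed correlation by the geometric mean of the two measures' reliabilities. The function name and numbers are hypothetical.

```python
# Classical correction for attenuation:
# r_corrected = r_xy / sqrt(r_xx * r_yy)
# Illustrative sketch; not the thesis's own computation.

def correct_for_attenuation(r_xy, r_xx, r_yy):
    """r_xy: observed correlation between tests X and Y;
    r_xx, r_yy: reliability coefficients of X and Y."""
    return r_xy / (r_xx * r_yy) ** 0.5
```

With perfectly reliable measures (r_xx = r_yy = 1) the correction leaves the observed correlation unchanged; lower reliabilities inflate it toward the estimated true-score correlation.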
1. Problem solving
To search consciously for some action appropriate to attaining a clearly conceived, but not immediately attainable aim. To solve a problem means to find such action (Polya, 1962).
2. Cognitive processes
Actions of cognitive performance such as perceiving, remembering, thinking, and desiring that depend on the subject's performance capacities.
... subject's choices of solution strategies; b) ...
Chapter III describes the framework and design of the IPSP test, procedures used to develop the verbal problems, scoring methods for the think aloud interviews, and data gathering procedures. The results are discussed in Chapter IV. A summary of the validation results, implications for future research, and implications for the teaching of mathematical problem solving are discussed in Chapter V.
CHAPTER II
from audio-tape to paper as they use the think aloud technique.
Hollander's (1973) review focuses on studies related to word problems for students in grades three through eight. The review includes
studies carried out from 1922 to 1969 in seven categories: problem
analysis, computation, general reading ability, specific reading skills,
specificity of the problem statement, the problem situation, and language factors.
Webb (1975) notes that
...Because of the complexity of problem solving processes and the number of variables associated with problem solving, research in this area has been too diverse to have any real consolidation... (p. 1)
His review focuses on studies that involve problem solving tasks and
strategies. These studies were conducted from 1967 to 1973 and involved students in grades three through eight.
Lucas (1972) discusses the nature of problem solving, the search
mode used by information processors, some formal models of the problem
solving process and some techniques used by earlier researchers to
externalize thought processes.
Models of Problem Solving
The attempt to describe the thought processes used in mathematical
problem solving is not a new quest.
the individual thought processes used in formulating and proving mathematical conjectures.
to be solved."
"Sudden illumination" is the third stage of the psychologist
Wallas' (1926) model. In an attempt to analyze and thereby
further understand the problem solving process, Wallas observed accounts
of thought processes related to him by his students, colleagues and
friends.
incubation: the second stage, during which one rests from any conscious thought about the problem at hand and/or consciously thinks of another problem;
illumination: the third stage, during which the idea and/or solution appears as a 'flash' or 'aha'.
Wallas added a fourth stage, verification, during which the validity of the idea is tested, which Helmholtz did not describe but which Poincare vividly describes in his accounts.
[Footnote: A review which instituted an inquiry into the habits of mind and methods of work of mathematicians during the early 20th century.]
Polya (1957, 1962) also developed a four-step model for problem solving, whose aim was to enable the student to ask questions which focus on the essence of the problem. Drawing upon many years of experience as a mathematician and a teacher of mathematics, he describes the following four-step model:
(1) Understanding the problem: The student analyzes the problem by looking at the data and asking the questions: Is it possible to satisfy the conditions of the problem? Is there redundant or insufficient data?
(2) Devising a plan: The student tries to find a connection between the data and the unknown; the student should eventually choose a plan or strategy for the solution.
(3) Carrying out the plan: The student carries out the plan, checking each step.
(4) Looking back: The student examines the results and/or arguments. The student also attempts to relate the method or result to other problems.
In a discussion of Polya's model, Mayer (1977) points out that some of Polya's ideas (restating the given and the goal) are examples of the Gestalt idea of "restructuring."
He concludes that: ...
Criticism of retrospection refers to Bloom and Broder (1950), who stated that the difficulties with retrospection lie in remembering all the steps in one's thought processes, including errors and blind alleys, and in reproducing these steps without rearranging them into a more coherent, logical order.
It appears that Claparede (1917; 1934) was the first to use a third approach, the "think aloud" technique (Kilpatrick, 1967).
This technique does not require the subjects to think and observe themselves thinking at the same time. Subjects need not analyze their thought processes, nor are they required to have special training.
There are, however, potential difficulties with the think aloud technique: interference of speech with thinking, the lapse into silence when
the subject is deeply engrossed in thought, and the essential difference
between the verbalized solution and the one found silently.
Kilpatrick
(1967) summarizes the views of several authors (Rota, 1966; Brunk, Collister, Swift and Slayton, 1958; Gagne and Smith, 1962; Dansereau and
Gugg, 1966) concerning these difficulties and concludes that:
...The method of thinking aloud has the special virtues of being both productive and easy to use. If the subject understands what is wanted, that he is not only to solve the problem but also to tell how he goes about finding a solution, and if the method is used with an awareness of its limitation, then one can obtain detailed information about thought processes... (p. 8).
One of the first attempts to systematically gather empirical evidence was by Duncker (1945), who studied the problem solving protocols of subjects who were given a problem and asked to "think aloud." Two of the problems that he used were the tumor problem:
...Given a human being with an inoperable stomach
tumor, and rays which destroy organic tissues at
sufficient intensity, by what procedure can one
free him of the tumor by these rays and at the
same time avoid destroying the healthy tissue which
surrounds it?... (p. 1).
and the 13 problem:
...Why are all six place numbers of the form 276,276,
591,591, 112,112 divisible by 13?...(p. 31).
Duncker illustrated a typical solution protocol for the tumor problem with a flow chart and observed that the problem solving process
starts from a general solution, then progresses to a functional solution
and then to a specific solution.
In a more recent attempt to gather data empirically, Restle and
Davis (1962) developed a model which describes the subject as going
through sequential stages when solving a problem. Each stage is a subproblem with its own subgoal. Thus, the individual solves a sequence
of subproblems which then enables him to continue on to the next stage.
The model states that the number of stages, k, can be estimated as k = t^2/s^2, where t and s^2 appear to denote the mean and variance of solution times. They do not describe the stages, and assume one can observe human beings working on well-structured problems that the subjects find difficult but not unsolvable.
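Under the common reading of this stage model (an assumption here, since the thesis text is garbled at this point), k is estimated from the mean and variance of observed solution times. A minimal sketch:

```python
# Sketch of the Restle-Davis style stage estimate k = t^2 / s^2,
# assuming t is the mean solution time and s^2 its (population) variance.
# Illustrative only; not the thesis's own computation.

def estimated_stages(times):
    """times: list of solution times for one problem across subjects."""
    n = len(times)
    mean = sum(times) / n
    var = sum((t - mean) ** 2 for t in times) / n
    return mean * mean / var
```

Intuitively, if solving consists of k similar independent stages, total times cluster more tightly (relative to their mean) as k grows, which is what the ratio captures.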
process, would it be possible to measure a person's ability at each step? Would this information be more useful to a teacher than just the single number-right, percentile, or grade-level-equivalent score? How can this type of evaluation be effected?
A paper and
(1973) summarized what they and other authors (Ray, 1955; John, 1954; Keisler, 1969) view as characteristics of a good problem solving process test.
1.
The test should yield a variety of continuous measures concerning the outcomes of the problem solving, the processes,
and the intellectual skills involved.
2.
of the end product and not of any particular step in his model. His content validation procedure used the judgment of a panel of experts, and his split-half reliabilities ranged from .60 to .82.
Foster (1972) developed a problem solving test which had a multiple-choice format for some items and an open-ended format for other
items.
format.
For the
interview test he used Lucas' point system which gave 1 point for
'Approach', 2 points for 'Plan', and 2 points for 'Result'. The written
test was scored using the number of correct answers. The correlation
between the rankings on the written tests and the interviews was .68,
below the criterion Zalewski had set prior to his study.
Summary
Many researchers, including psychologists, mathematicians and educators, have investigated the problem solving processes using multistep
models.
CHAPTER III
(1) that the test can be machine scored, (2) that the items should measure problem solving subskills and not just the ability to get a final answer, and (3) that the test should be based on the IPSP testing model as developed by the IPSP team.
A search of the literature was completed to locate instruments
which measure problem solving processes and any items which were found
were classified according to the IPSP testing model.
Instruments were reviewed (e.g., Wearne, 1976). In addition, many
items were written to test subskills in each category. The objective
was to build a large item bank to be used during the formative period.
Thus, valid items which satisfied item analysis and reliability criteria
in tryouts could be selected for inclusion in the IPSP test.
A first draft of the IPSP test was examined by two authors of the
Iowa Test of Basic Skills (ITBS) at The University of Iowa, by the IPSP
team, and by the IPSP Advisory Board.
There was a consensus that the items did measure the subskills in the IPSP testing model. However, there was also a consensus that the items were "too wordy" and tended to be "cute."
Table 1

Phases of Validation of the IPSP Test
(Number of items is given as Total (Subtest).)

Test Date: December, 1977. Sample Tested: Malcolm Price Lab School.
  Form 561, Grades 5,6, 40 (12,4,12,12): Pilot Interview Study; Pretest
  Form 562, Grades 5,6, 40 (12,4,12,12): Pilot Interview Study; Posttest
  Form 781, Grades 7,8, 40 (12,4,12,12): Pilot Interview Study; Pretest
  Form 782, Grades 7,8, 40 (12,4,12,12): Pilot Interview Study; Posttest

Test Date: January, 1978. Sample Tested: Representative Sample of Iowa Students. Phase: Final Tryout Series.
  Form 563, Grades 5,6, 20 (6,2,6,6)
  Form 564, Grades 5,6, 20 (6,2,6,5)
  Form 581, Grades 5,6,7,8, 20 (6,2,6,6)
  Form 582, Grades 5,6,7,8, 20 (6,2,5,7)
  Form 783, Grades 7,8, 20 (6,2,6,6)
  Form 784, Grades 7,8, 20 (4,2,7,7)

Test Date: May, 1978. Sample Tested: Two Fifth Grade Classes from an Iowa City Elementary School. Phase: Final Interview Study.
  Form 561, 40 (12,4,12,12)

Test Date: October, 1978. Sample Tested: One Hundred Fifth and Sixth Grade Classes from Iowa Schools.
  Form 565, Grades 5,6, 30 (10,0,10,10)
  Form 785, Grades 7,8, 30 (10,0,10,10)
The emphasis in this phase, as well as the emphasis in this report, was placed on the relationship between the interview data and IPSP test scores.
Design and Development of Interview Procedures
A preliminary list of interviewing procedures was developed using
the experiences that were gained during the first round of interviews
described in the previous section. The purpose of the later interviews
was to discover the strategies the child uses to solve verbal problems.
Hence it was decided that the interviewer should not lead the child into
selecting a particular heuristic. Any questions asked by the interviewer to elicit more information should not lead the child. There were
also instances of nonverbal leading that occurred during the trial interviews. For example, a child unsure of which operation to use would say, "I think I should multiply?" and then look at the interviewer's face to get some sort of reaction. Whether the child in fact multiplied or performed some other operation depended on the reaction of the interviewer.
seventh grader.
Some of the
If
the child was in deep thought working on the problem, the interviewer
would ask a question about the overt observable behavior of the child,
e.g., "Are you doing some multiplication now?" or "Are you adding now?",
when it was obvious that the child was doing that particular computation
using paper and pencil. The child usually mumbled "yes" or shook his
or her head and went on with the arithmetic. This strategy was used as
an indicator on the tape to let the coder know what the student was
doing during that time.
A second strategy that worked very well was to make a comment similar to the following:
seemed to encourage the child to vocalize and give more details, yet
did not appear to lead the students into using a specific strategy. The
final form of the list of interview procedures was the result of five
tryouts and revisions. Figure 2 shows the final form used in this validation study.
The Quantification Scale
Development
Since one of the goals of this study was to assess the relationship
between the IPSP test and the results of the think aloud interviews, a
quantification code based on the IPSP testing model was needed to process the interview findings. Some related research was found.
Kilpatrick (1967) developed a coding scheme to analyze the protocols used by his subjects in a think aloud interview, but did not
attempt to quantify these protocols. Lucas (1972) used a modification
of Kilpatrick's coding scheme with calculus students. His five-point
scoring code is based on three categories: Approach (one point), Plan
(two points), and Result (two points).
Interviewing Procedures

1. Problems should be typed one on a page, preferably placed at the side of the page so that the student can use the rest of the page for any computing, drawing diagrams, tables, or any type of thinking.

2. Start the interview with 2 sample problems, thus allowing the student to become familiar with the routine and with the type of information the interviewer would like to find. At all times make a conscious effort to put the child at ease.

3. Tell the student that no information will be given on whether the answer or strategies are correct, since you want to get the best possible data.

4. Do encourage the students to go on by making comments such as, "You're doing just fine." "That's good, you're telling me what you're thinking." "Go ahead." BUT DO NOT LEAD THE STUDENT INTO USING A STRATEGY.

5. Don't go any longer than about 15-20 seconds without recording something on tape. EXCEPTION: If the student is doing computations or drawing a diagram or making tables, etc., make some sort of statement such as, "You're making a table," etc.

6. Encourage the student to vocalize his thinking as much as possible.

7. If a student falls silent while writing or drawing, prompt him by reading what he has written or ask him what he is doing. However, rule 5 takes precedence over rule 7.

8a. If a student doesn't answer, or doesn't make any comments about his thinking, wait about 15 seconds and ask, "Can you tell me what you are thinking?" Wait another 10 seconds or so and ask again, this time, "Are you trying to figure something out?" If nothing happens, call this IMPASSE. Now ask the question, "Would you like a hint or another problem?"

8b. If the child says yes, this would indicate to you that the rating part of the data gathering is over, but continue to get diagnostic data. This can be done by asking the student to identify the area that presented the trouble and why he had this trouble, e.g., didn't know method, lack of understanding of the problem, read problem incorrectly, etc.

Figure 2
Figure 2 (cont'd.)

8c. If the student says no, then allow more time and ask him if he would tell you what he is thinking or what method he is trying, or would he try to do his figuring on paper. Then repeat steps 8a, 8b and 8c again.

9. If the student is not trying to solve the problem, get him on the right track, but after IMPASSE.

10. For the first half of the problems observe the student. Does he have a habit of LOOKING BACK? If not, follow step 11.

11. If the student does NOT have the habit of LOOKING BACK, and has already been given the first half of the problems, then lead him on with prompts listed on the LOOKING BACK coding sheet, e.g., "Did you check your answer with the conditions of the problem?" "Did you check your answer?" "How sure are you that your answer is correct?"

DON'T

1. ...

2. Do not give any tutoring or prompting until after the IMPASSE, and then only if the child asks a question. However, do use the procedure listed in step 8a.

3. Do not summarize what the child has done. Try to get him/her to do it.
Written tests were administered to these same subjects who were then
ranked according to the number of correct answers. The correlation
coefficient between the written tests and interviews was .68. Zalewski
concluded that a higher correlation is necessary before the written test scores can be used as a substitute or predictor for interview results.
Webb (1975) also used an adaptation of the coding system developed
by Kilpatrick and Lucas. He used the "Approach," "Plan," and "Result"
scoring system and obtained a frequency count from a check list of problem solving process variables.
From the preceding discussion it appeared that no 3-step quantification scheme was available to investigate the relationships between
interviews and IPSP test results. A first attempt at developing the
scale was made using Kilpatrick's processing sequence with some modifications in order to follow the IPSP testing model. In trying to quantify these processing sequences the procedures became very cumbersome.
A new attempt was made in which flow charts were designed for each step
of the model. Again when it came time to assign a number at the various
branches the instrument became unmanageable. Another attempt was made
in which behavior in each step of the testing model was assigned three
numbers: 0, 1, or 2. The 0 category would be those processes which were totally incorrect or a response such as "I don't know what to do." The 2 category would contain responses which were completely correct, and the 1 category would contain the intermediate responses. This new procedure was used with the audio tapes from the first interviews.
These tryouts showed that at least one more category was needed and that the categories were not explicit enough for each step of the model.
These revisions were made and the resulting instrument now had four categories: 0, 1, 2, and 3, with descriptors for each category. This new instrument was used to process additional tapes and further revisions were made. At this stage the instrument was examined by
the same two mathematics educators who were consulted on the interview
form. Each category and its descriptors were thoroughly discussed.
Step 4, the looking back step, presented the greatest difficulty. It was decided to include this process under step 1 and to be more explicit with the descriptors under step 4.
After general agreement on the appropriateness of the scale was
reached, three raters quantified audio tapes of interviews using this
form. After a few minor additions the instrument was considered to be
in "final" form. As a final test, each rater analyzed the same three
interviews on audio tapes.
Use of the Scale
The final form of the quantification scale is given in Figures 3, 4, and 5. Behavior which involved reading, analyzing and understanding the problem was classified as step 1 behavior. Briefly, a score of 0 was assigned to a student who failed completely to understand a problem; 1 was assigned to a student whose analysis of the problem was incorrect ...
[Figure 3, the step 1 scale, is only partially legible in this reproduction. Surviving descriptors: "Rereads problem; appears to know there is something missing but cannot state what is wrong and makes a false start (missing data)." "Tries to solve the problem without regard to using data correctly. After a brief trial and error realizes he is not using data correctly but cannot correct the situation."]

Figure 4
Looking Back

0: [descriptors not legible in this reproduction]

1: Expresses uncertainty about answer. Says it's probably wrong (or some version) and attempts to give a reason for his/her uncertainty.

2: Makes an attempt to check the answer but is not successful enough to be convinced that it is right or wrong. Checks computations involved in answer but does not check to see if answer satisfies conditions of problem. Errors here should be major.

3: Attempts to check the values of an unknown or the validity of an argument. Tries to decide whether the answer makes sense (i.e., realistic, reasonable estimates). Checks that all pertinent data has been used. Suggests a new problem that can be solved in the same way. Successfully attempts to simplify the problem. Checks solution by retracing steps or substitution. Checks that solution satisfies conditions of problem.

Figure 5
and were given a score of 0 for step 4. Briefly, a score of 1 was assigned if some uncertainty was expressed but no systematic check was made; 2 was assigned if a check was attempted but was either incorrect or incomplete; 3 was assigned if a valid check of the computation, conditions and/or reasonableness of the solution was carried out. Again, specific criteria were described for each numerical score, but an important point is that the step 4 score was not affected by any behavior preceding a tentative solution. An exception was that students were assigned 0 on step 4 if no tentative solution was reached.
The following two examples will illustrate the scoring scheme.
Ann, a sixth grader, was presented with this problem:
A bag of XL-50 brand marbles contains 25 marbles and costs 19¢.
Ann read the problem aloud and this is the transcribed interview:
A:
Uh . . . Oh boy . . . hm . . .
I:
A:
. . .O.K.
(multiplies)
I:
A: What I should do with 25, 19, and 125, because I know with those
numbers I have to do something.
(Silence)
(Rereads the problem)
(Silence)
A:
and ability to use tables and diagrams, as illustrated in other problems, were good. On this problem, she was given a score of 0 for step 1; 3 for step 3; 0 for step 4. If she had made a computational error or misused an equation, she would have received a 0, 1, or 2 on step 3. A similar pattern emerged in her solution to other problems.
Dave, a fifth grader, is an example of a student who was able to
understand most of the problem settings presented to him, but had difficulty carrying out his solution strategies. This is illustrated with
the following example of a single-step problem:
I:
D:
D:
It'd be $5.60.
I:
D: Yes.
Dave chose to add eight 75's, which was a correct strategy. However, he had difficulty in finding the sum. To make the computation easier, he correctly noted that 8 sevens is the same as 4 fourteens. He was scored a 3 on step 1, and 2 on step 3 on this problem. His score on step 4 was 0 since he did not exhibit any behavior in that category. Later, with prompting, Dave realized that he had left out the 8 fives, and he corrected himself.
Pilot Sample
All of the students in grades five through eight in the Malcolm
Price Laboratory School, Cedar Falls, Iowa, were involved in this pilot
study. The students were randomly divided into two groups across grade
levels.
eight.
However, within the groups, the fifth and sixth graders were administered one form of the test while the seventh and eighth graders completed
another form of the test.
teacher was asked to divide each of their classes into an upper ability
and lower ability half and to select one "verbal" student from each
half. This resulted in the selection of 32 students, four from each
grade, to be involved in think aloud interviews. All of the interviews
were conducted by the investigator and followed the interview form discussed in the previous section.
Students in
group two completed the IPSP test after the interviews were completed.
Correlation coefficients between interview and IPSP test scores were
then computed. Figure 6 shows the time schedule for the study.
[Figure 6, the time schedule for the study, is only partially legible. Recoverable rows (date: responsibility): 11/28: classroom teacher; 11/29: the investigator; 11/30: classroom teacher, the investigator. The Activity column is not recoverable.]
Interview Problems
One hundred open-ended verbal problems were developed for grades
five through eight independently from those on the IPSP test. These
problems were reviewed by members of the IPSP staff. Samples from the
100 problems were administered to six volunteer students in grades five
through eight in think aloud interviews. Information obtained from
these interviews and suggestions from the staff were used in revising
some problems and eliminating others. A pool of 65 problems resulted.
These problems were then classified into seven levels: level one con-
tained simple one-step word problems, and each succeeding level con-
tained problems that were increasingly difficult in both the concepts
and the computations required. Each problem was typed on a half sheet
of paper so the student could do any needed computations on that paper.
These problems are included in Appendix C.
The Interviews
The investigator conducted all 32 interviews. Because the interviews were taking place during the regular school day, a rather brief
time limit of 20 minutes per student was allotted. The first five minutes were used in talking to the student about the procedure to be used
and in presenting two sample problems. Students were encouraged to talk
but were not given any hints or told whether what they were doing was
correct.
The student's responses to the sample problems were used by the
interviewer to choose the difficulty level of the first problem to be
presented during the interview.
44
discussion of the comments that were made by this student in the interview recording.
Incorrect answers were all right since the interviewers were more
interested in what the child was thinking than in his working the
problem correctly.
The procedure used was as follows. The first day, one fifth grade
class completed the IPSP test; the second day, students from both fifth
grade classes were asked to solve the same five verbal problems in a
45
think aloud interview; and the third day, the other fifth grade class
completed the IPSP test.
The IPSP tests were scored and the audio-tapes of the think aloud
interviews were coded using the quantification scale described at the
beginning of this chapter. The investigator coded all 55 students.
However, a doctoral student in mathematics education and a senior
mathematics major also coded several student interviews, and the percent
of interrater agreement was computed.
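The percent of interrater agreement is the proportion of responses that two raters coded identically. A small sketch with hypothetical codings:

```python
# Proportion of agreement between two raters who each coded the same
# ten interview responses (made-up codings, not the study's data).
rater_a = [5, 3, 0, 4, 2, 5, 1, 3, 4, 0]
rater_b = [5, 3, 1, 4, 2, 5, 1, 2, 4, 0]

agreement = sum(a == b for a, b in zip(rater_a, rater_b)) / len(rater_a)
print(agreement)  # 8 of 10 codings match -> 0.8
```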
The time schedule and the five verbal problems for this investigation
are shown in Figure 7.

ITBS and the IPSP Test
To further describe the IPSP test against more familiar measures,
the relationship between the four scales of the ITBS and the three subtests of the IPSP test was investigated. The four scales of the ITBS
(forms 5 & 6, levels 11 through 14) used in this study were: Test R,
Reading Comprehension; Test W-2, Reading Graphs and Tables; Test M-l,
Mathematics Concepts; Test M-2, Mathematics Problem Solving. The three
subtests of the IPSP test were the step 1, step 3, and step 4 subtests.
Figure 7. Time schedule and sample activities for the final interview
study. [Activities: IPSP test administered May 8, 1978; think aloud
interviews May 9, 1978. Portions of the figure, including a problem
beginning "Tom's throw was 34 meters," are illegible in this copy.]
Figure 7 (cont'd.)
Together you and I had $6.00. We spent a total of $3.20 for a record
and some ice cream. We each took half of the money that was left.
Which question below could be answered using this information?
1) How much did the ice cream cost?
2) How much did the record cost?
3) Could we buy another record at the same price?
4) How much money did each of us have left?
[Drawing: the arc of a thrown football, from a starting point to where
the throw landed, labeled 40 yards.]

I threw a football 40 yards. The picture above shows the path that the
football followed. At its highest point, about how high was the
throw above the ground?
1) 50 yards
2) 10 yards
3) 30 yards
4) 5 yards
48

49
tion served as a pretest for the IPSP project evaluation and the March
administration was the posttest.
In order to determine the level of relationship among the subtests
in each IPSP test, raw correlations between pairs of subtests were computed.
50
CHAPTER IV
essary in this study since the IPSP test appears to be the first effort
to develop a test to measure this set of skills. It was decided that
the concurrent measure would be data from students as they thought aloud
while solving open-ended verbal problems in one-to-one interviews.
A major problem that this procedure presented was that no reliable
and valid technique for gathering quantitative data from individual
51
interviews relative to the three steps in the IPSP test was available.
Consequently a quantification scheme as described in Chapter III was
developed. The scheme was used on tapes from both the Pilot and the
Final Interview studies.
Table 2 displays the relationship between the think aloud interview and
the IPSP test (forms 561 and 781).
Table 2

Analysis of IPSP Test with Interviews: Pilot Study

Correlation Coefficients (Interview vs. IPSP Test)

                           Interview
  IPSP Test        Step 1   Step 3   Step 4
  Step 1             .28      .23      .19
  Step 3             .46      .30      .11
  Step 4             .12      .01      .09

Reliability Coefficients

                           Step 1   Step 3   Step 4
  Form 561                   .77      .72      .74
  Form 781                   .74      .72      .79
  Inter-rater Agreement*     .86      .95      .86
53
(1968, p. 204) sample analog of the Cronbach estimate for the measure of
internal consistency of the five problems given to the 55 students in
this study. The reliability coefficients and the relationship between
the interview data and IPSP test results are displayed in Table 3.
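The internal-consistency estimate referred to here is of the Cronbach alpha family. As a sketch of the computation (the 0/1 item scores below are invented, not the study's data):

```python
# Cronbach's alpha for a small matrix of item scores
# (rows = students, columns = items); illustrative data only.
scores = [
    [1, 1, 1, 0, 1],
    [1, 0, 1, 1, 1],
    [0, 0, 1, 0, 0],
    [1, 1, 1, 1, 0],
    [0, 1, 0, 0, 1],
    [1, 1, 1, 1, 1],
]

def variance(xs):
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)

def cronbach_alpha(data):
    k = len(data[0])  # number of items
    item_vars = [variance([row[i] for row in data]) for i in range(k)]
    total_var = variance([sum(row) for row in data])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)

alpha = cronbach_alpha(scores)
```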
Based on these findings there appears to be a rather strong relation-
ship between the two measurement techniques in step 1 and step 3. The
.13 correlation in step 4, the looking back step, verified observations
that were made by the interviewers; i.e., this step was rarely
observed in the think aloud interviews. This observation has also been
made by other researchers (Kantowski, 1975; Kilpatrick, 1967).
Even
when the interviewer asked the student to look back at a problem and
check the answer, the student would not do so on the next problem.
Table 3

Analysis of IPSP Test with Interviews: Final Study

Correlation Coefficients (Interview vs. IPSP Test)

                           Interview
  IPSP Test        Step 1   Step 3   Step 4
  Step 1             .64      .64      .20
  Step 3             .56      .55      .11
  Step 4             .52      .50      .13

Reliability Coefficients

                           Step 1   Step 3   Step 4
  Form 561                   .79      .74      .73
  Interview Problems         .78      .93      .40
  Inter-rater Agreement*     .73      .93     1.00

*Proportion of agreement based on 5 problems each from 15 students.
55
ITBS and the IPSP Test
IPSP Test Administration
Six experimental forms of the IPSP test were developed for adminis-
tration in January, 1978: two equivalent forms for grades five and six,
two equivalent forms for grades seven and eight, and two equivalent
forms for grades five, six, seven, and eight. There was an average of
six items in each of the three steps with 18 as the total number of
items.
In particular, forms 563 for grades five and six and 783 and
582 for grades seven and eight, respectively, were used. While these
forms are not equivalent to the final IPSP test forms, they are constructed from roughly the same proportion of item types. Their relationship to the ITBS subtests should certainly be a good approximation
of the relationship of the final IPSP tests to the same subtests.
Table 4

Reliability Analysis, January, 1978

Form 563
  Grade  Number   Step   Items   Mean    SD    SE(m)   r(xx)
    5     147       1      6     3.56   1.61   1.07     .56
                    3      6     4.33   1.44    .86     .64
                    4      6     3.56   1.61   1.03     .59
                  Total   18    11.46   3.79   1.74     .79
    6     140       1      6     4.07   1.47   1.04     .50
                    3      6     4.81   1.19    .81     .54
                    4      6     4.26   1.57    .94     .64
                  Total   18    13.14   3.26   1.66     .74

Form 564
    5     144       1      7     4.01   1.63   1.11     .54
                    3      6     3.71   1.46    .99     .54
                    4      5     2.95   1.48    .94     .60
                  Total   18    10.68   3.71   1.78     .77
    6     138       1      7     4.45   1.67   1.06     .60
                    3      6     4.33   1.27    .97     .42
                    4      5     3.30   1.52    .87     .67
                  Total   18    12.08   3.70   1.70     .79

Note: SD = Standard Deviation; SE(m) = Standard Error of Measurement;
r(xx) = Cronbach alpha reliability.
57
Table 5

Reliability Analysis, January, 1978

Form 783
  Grade  Number   Step   Items   Mean    SD    SE(m)   r(xx)
    7     116       1      5     2.56   1.24   1.01     .34
                    3      6     2.25   1.59   1.05     .56
                    4      7     2.44   1.27   1.17     .15
                  Total   18     7.25   3.26   1.87     .67
    8     130       1      5     2.86   1.39    .95     .53
                    3      6     2.61   1.79   1.06     .65
                    4      7     2.53   1.46   1.16     .37
                  Total   18     8.00   3.70   1.85     .75

Form 784
    7     119       1      6     3.19   1.55   1.06     .53
                    3      6     3.24   1.55   1.14     .46
                    4      6     1.33   1.26    .95     .43
                  Total   18     7.76   3.17   1.85     .66
    8     121       1      6     3.56   1.46   1.08     .45
                    3      6     3.59   1.56   1.09     .51
                    4      6     1.78   1.33   1.06     .37
                  Total   18     8.93   3.38   1.88     .69

Note: SD = Standard Deviation; SE(m) = Standard Error of Measurement;
r(xx) = Cronbach alpha reliability.
Table 6

Reliability Analysis, January, 1978

Form 581
  Grade  Number   Step   Items   Mean    SD    SE(m)   r(xx)
    5     134       1      6     2.57   1.47   1.13     .41
                    3      5     2.82   1.33   1.01     .42
                    4      7     3.16   1.54   1.14     .45
                  Total   18     8.56   3.41   1.87     .70
    6     131       1      6     3.18   1.59   1.10     .52
                    3      5     3.08   1.11   1.05     .11
                    4      7     3.96   1.74   1.10     .60
                  Total   18    10.22   3.49   1.88     .71
    7     118       1      6     3.49   1.53   1.09     .49
                    3      5     3.33   1.23    .94     .41
                    4      7     4.25   1.55   1.13     .47
                  Total   18    11.07   3.45   1.83     .72
    8     144       1      6     3.90   1.62   1.04     .59
                    3      5     3.48   1.26    .91     .48
                    4      7     4.71   1.78   1.02     .67
                  Total   18    12.09   3.97   1.73     .81

Note: SD = Standard Deviation; SE(m) = Standard Error of Measurement;
r(xx) = Cronbach alpha reliability.
59
Table 7

Reliability Analysis, January, 1978

Form 582
  Grade  Number   Step   Items   Mean    SD    SE(m)   r(xx)
    5     137       1      6     2.84   1.51   1.08     .49
                    3      6     3.25   1.46   1.12     .41
                    4      6     2.61   1.26   1.16     .15
                  Total   18     8.69   3.21   1.93     .64
    6     138       1      6     3.51   1.50   1.05     .51
                    3      6     4.04   1.48   1.03     .52
                    4      6     3.03   1.28   1.12     .24
                  Total   18    10.59   3.35   1.83     .70
    7     114       1      6     3.92   1.63    .98     .64
                    3      6     4.20   1.24   1.04     .29
                    4      6     3.31   1.59   1.07     .55
                  Total   18    11.43   3.67   1.80     .76
    8     144       1      6     4.05   1.52    .95     .61
                    3      6     4.68   1.33    .90     .54
                    4      6     3.57   1.33   1.11     .30
                  Total   18    12.31   3.38   1.72     .74

Note: SD = Standard Deviation; SE(m) = Standard Error of Measurement;
r(xx) = Cronbach alpha reliability.
60
ITBS Subtests
The four scales of the ITBS that were compared with the IPSP test
were Test R, Reading Comprehension; Test W-2, Reading Graphs and Tables;
Test M-l, Mathematics Concepts; and Test M-2, Mathematics Problem Solving.
Reliability estimates for these ITBS tests, based on a national
representative sample, are given in Appendix D. ITBS results were
obtained from the October, 1977, administration of the ITBS to the same
students who were administered the IPSP forms in January, 1978.
To further illustrate these relationships, the correlations were
corrected for attenuation using

    r(corrected) = r(xy) / (r(xx) r(yy))^(1/2)

where r(xy) is the observed correlation between tests x and y, and
r(xx) and r(yy) are the reliability coefficients of x and y,
respectively.
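The correction for attenuation divides an observed correlation by the geometric mean of the two reliabilities. A minimal sketch (the three input values below are illustrative, not taken from the study):

```python
# Correction for attenuation: r_corrected = r_xy / sqrt(r_xx * r_yy).
# Example: observed correlation .64 between two measures with
# reliabilities .79 and .73.
import math

def correct_for_attenuation(r_xy, r_xx, r_yy):
    return r_xy / math.sqrt(r_xx * r_yy)

print(round(correct_for_attenuation(0.64, 0.79, 0.73), 2))  # -> 0.84
```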
Table 8

Correlations Between IPSP Subtests and
Iowa Test of Basic Skills Tests

[Rows: Steps 1, 3, 4, and Total for IPSP forms 563 (grades five and
six, N = 142 and 140), 783 (grade seven, N = 119), and 582 (grade
eight, N = 144); columns: ITBS Reading Comprehension, Graphs,
Mathematics Concepts, and Mathematics Problem Solving. Individual
cells cannot be aligned in this copy; the legible step correlations
range from about .27 to .68, and the Total correlations from about
.51 to .68.]
The squared correlations and multiple R-squared values are displayed in
Figures 8 through 11 as bar graphs.
Of the three steps, step 3 appears to hold the least relationship
to the four ITBS subtests while step 4 appears to hold the greatest re-
lationship. Note, however, that this trend did not appear in grades
seven and eight, as can be observed in Figures 10 and 11.
Another obser-
vation is that, of the four ITBS subtests, reading comprehension consistently contributed the least to the multiple R-squared.
Certainly
the data provide a strong indication that the IPSP subtests measure
skills which are different from those measured by any one ITBS subtest
or any combination of them.
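The multiple R-squared values discussed here combine the four ITBS subtests as joint predictors of an IPSP subtest score. For the two-predictor case there is a simple closed form, sketched below with illustrative correlations (not the study's exact values):

```python
# Multiple R-squared for two predictors from pairwise correlations:
# R^2 = (r1^2 + r2^2 - 2*r1*r2*r12) / (1 - r12^2).
# Example: a step score correlating .55 and .63 with two subtests
# that correlate .45 with each other.
def multiple_r_squared(r1, r2, r12):
    return (r1 ** 2 + r2 ** 2 - 2 * r1 * r2 * r12) / (1 - r12 ** 2)

r2_value = multiple_r_squared(0.55, 0.63, 0.45)
```

Because the two predictors overlap, the joint R-squared here (about .49) exceeds the best single predictor's squared correlation (.63 squared, about .40) by less than the second predictor would add on its own.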
There were two equivalent final forms of the IPSP test for grades five
and six and two for grades seven and eight. Each level consists of 30
items with each of the three steps containing 10 items. Students were
given 40 minutes
to complete the test. The October administration served as a pretest in
the final IPSP project evaluation and the March administration was the
posttest. The sample was obtained from a pool of nearly 200 Iowa classrooms whose teachers volunteered to participate in the IPSP evaluation.
In all, each form of the test was administered to over 1000 Iowa students at each of the four grade levels. The March sample was the same
as the October sample except for student attrition, turnover and absenteeism.
63
Figure 8. Proportion of variance in the IPSP Step 1, Step 3, and Step 4
scores accounted for by: (a) all four ITBS subtests together (multiple
R-squared); (b) Reading Comprehension; (c) Graph Skills; (d) Mathematics
Concepts; (e) Mathematics Problem Solving. [Bar graph not reproducible
in this copy.]

Figure 9. The corresponding bar graph for the next grade level. [Not
reproducible in this copy.]

Figure 10. The corresponding bar graph for grade seven. [Not
reproducible in this copy.]

Figure 11. The corresponding bar graph for grade eight. [Not
reproducible in this copy.]
67
of problem solving instruction, a fact which may have affected the
March data. Hence, we will use the data from the October administration
in this section.
The means, standard deviations, and reliability coefficients are
presented in Tables 9 and 10 and in Appendix F. The reliability coef-
ficients for these forms of the IPSP test range from .67 to .78 for the
individual steps and from .84 to .87 for the total test.
These coeffi-
cients, particularly those for the total, are well within the acceptable
estimates of internal consistency, especially for a rather short test.
In order to determine whether the IPSP subtests measure different
underlying skills, Pearson Product Moment correlation coefficients were
computed for each pair of subtests and corrected for attenuation. These
results, shown in Table 11, can be viewed as a measure of the relationship between the IPSP subtests after adjustment for the lowering
effect of their unreliability. To provide a standard for judging how
high the correlations between subtests should be, a comparison was made
between these corrected correlations and similar statistics computed for
the Iowa Test of Basic Skills data from a nationally representative
sample of 2558 fifth graders. As shown in Table 11, the corrected
correlations for the IPSP subtests range from .77 to .90; the corrected correlations for the four ITBS subtests range from .75 to .85.
It seems reasonable to conclude that the IPSP subtest scores are no
more highly related than are the ITBS tests of quite different content
68
Table 9

Reliability Analysis
Sample of Iowa Students, October 1978

Form 565
  Grade  Number   Step   Items   Mean    SD    r(xx)
    5     1215      1     10     5.41   2.54    .77
                    3     10     6.44   2.10    .72
                    4     10     4.96   2.47    .78
                  Total   30    16.81   6.22    .87
    6     1314      1     10     6.62   2.44    .77
                    3     10     7.23   1.87    .68
                    4     10     5.99   2.36    .77
                  Total   30    19.83   5.76    .86

Form 785
    7     1078      1     10     6.16   2.43    .77
                    3     10     5.86   2.10    .67
                    4     10     5.37   2.18    .69
                  Total   30    17.38   5.72    .84
    8     1101      1     10     6.93   2.30    .77
                    3     10     6.48   2.08    .68
                    4     10     5.96   2.19    .70
                  Total   30    19.38   5.57    .84
69
Table 10

Reliability Analysis
Sample of Iowa Students, March 1979

Form 566
  Grade  Number   Step   Items   Mean    SD    r(xx)
    5     1161      1     10     6.40   2.16    .69
                    3     10     7.04   1.62    .58
                    4     10     5.50   2.22    .71
                  Total   30    18.94   4.98    .81
    6     1184      1     10     7.25   2.05    .71
                    3     10     7.57   1.56    .59
                    4     10     6.39   2.14    .71
                  Total   30    21.20   4.75    .81

Form 786
    7     1024      1     10     6.28   2.15    .70
                    3     10     6.04   2.14    .66
                    4     10     5.61   2.39    .73
                  Total   30    17.93   5.67    .83
    8      910      1     10      --    2.13    .72
                    3     10      --    2.13    .68
                    4     10      --    2.29    .72
                  Total   30    19.31   5.55    .84

[The grade eight step means for form 786 are illegible in this copy.]
Table 11

Correlations Corrected for Attenuation
of October 1978 IPSP Subtests

IPSP Test
  Form   Grade   Number    S1-S3   S1-S4   S3-S4
  565      5      1215      .86     .88     .80
  565      6      1314      .90     .77     .82
  785      7      1078      .80     .85     .82
  785      8      1101      .83     .82     .77

ITBS Test (Grade 5, Level 11, Number = 2558)
  R-W2    R-M1    R-M2    W2-M1   W2-M2   M1-M2
   .74     .80     .82     .75     .75     .85
71
described in the previous paragraphs. A high relationship between
scores on tests which measure such apparently different content is
usually attributed to a general intelligence factor.
72
CHAPTER V
forms of the IPSP test for grades five and six (565 and 566) and two
equivalent forms for grades seven and eight (785 and 786).
Each of the
73
Phase 1:
An important goal of the IPSP test development was that the test
results should be highly related to data collected via individual "think
aloud" interviews.
74
steps of the IPSP test. Conclusions five and six are based on the
findings in this phase of the study.
C5:
Correlations corrected
for unreliability were computed for each pair of subtest scores. The
lower these corrected correlations are, the less related are the sub-
tests. The corrected IPSP correlations were compared with similar
statistics for the four subtests of the Iowa Test of Basic Skills used
in phase 2.
Conclusions 7 and 8 are based on findings in phase 3.
C7:
In Chapter II it was
pointed out that many researchers suggested that the problem solving
75
process is a multistep process, but it is not clear how skills within
each step are related to overall problem solving skills. That is, the
three scores that the IPSP test yields are measures of skills when the
problem solving process is broken down, but they may not be measures of
how well these skills can be synthesized to solve a new problem. Likewise, the specific skills in the IPSP testing model within the three
steps may not accurately reflect the complexity of the steps. Thus
there may be other important skills which have been overlooked.
The constraint that the IPSP test be machine scorable could be considered as a limitation in that light. The student must choose a constructed answer rather than construct one of his/her own. However, the
advantage of easy administration may outweigh this limitation.
The students for the January, 1978, test comprised a nearly representative sample of Iowa students. Although the sample for the October,
1978, test was not representative, it was chosen from over 200 volunteer
teachers with over 1000 students in each of the grades five through
eight.
Thus any statistics obtained from the IPSP test should be viewed
in this light.
As previously mentioned, there are limitations in the use of the
think aloud interviewing method for determining the processes students
use to solve problems. The presence of an observer may change the behavior of the problem solver who may also not accurately report his
thinking. However, the think aloud approach seems to be the best method
available for directly observing the problem solving process.
The strength of the relationship between the IPSP test and interview data is not only a function of the IPSP test but also of the
76
particular quantification scheme used for the interviews. While the
scheme in this study was carefully developed and yielded high interrater
agreement and internal consistency estimates, other reasonable approaches may have given quite different correlations with the IPSP test.
Classroom Implications
Two implications of the IPSP test development for the teaching of
problem solving will be discussed: the use of student score profiles as
a diagnostic tool, and the use of questions in the form of those on the
IPSP test in homework assignments and classroom tests.
First, the IPSP test development effort illustrates that it is
possible to construct a psychometrically sound test based on the three
steps from the problem solving model. Profiles of students' scores
showing percentile ranks on the three steps of the problem solving model
can be used as a diagnostic tool. From the profiles (Figures 12 and
13) of Ann and Dave, the two children whose interviews were described in
Chapter III, it can be seen that Ann is weakest in understanding the
problem while Dave is comparatively weak at carrying out his solution
strategies. Diagnostic assessments such as this would enable the
teacher to provide special instruction to students or classes who are
comparatively low in one of the steps. The IPSP problem solving mod-
ules are designed to facilitate such instruction (Immerzeel et al.,
1977).
77
Figure 12. Ann's percentile-rank profile on the three IPSP steps and
the total test. [Profile chart, percentile ranks 10 through 90; not
reproducible in this copy.]
78
Figure 13. Dave's percentile-rank profile on the three IPSP steps and
the total test. [Profile chart, percentile ranks 10 through 90; not
reproducible in this copy.]
79
validation, over one hundred students were observed individually as
they solved problems. Very few of them looked back after solving the
problems, possibly because they were not aware that they should or
because they thought the first answer had to be the correct one.
A frequently observed approach might be called trial and error
with the four operations. The student would try a particular operation
(e.g., addition) which involved all the numbers in the problem regardless of their pertinence to the solution. The student would then look
at the answer and decide whether to choose another operation or accept
this answer. Even when closely questioned, most of these students were
unable to explain their reasoning. The most frequently heard responses
were:
"It (the answer) doesn't look right," or "I don't know" (why I
80
of developing these skills. The following are two examples of problem
settings and questioning sequences that could easily be included on a
worksheet or a test.
Example 1.
Tell which information is needed and which information is extraneous (not needed) to answer each question a to d.
Weekend Telephone Discount Rates
[Rate table, only partially legible in this copy: Chicago, first
minute $.19; other legible entries are $.14, $.20, and $.15.]
81
a) Is the above solution correct?
b) Suppose machine A is speeded up to produce 85 hamburger patties
per minute (instead of 76).
82
IPSP Test
1. A previously stated limitation of the IPSP test was that it
was administered to an Iowa population. To further verify the results
obtained in this study the IPSP test should be validated with other
samples of fifth through eighth graders. The IPSP test has also been
used with a group of remedial college mathematics students with reliable
results (Bellile, 1980).
2. An instructional study should be conducted in which one group of
students was taught those skills which the IPSP
test indicated they lacked, while another group was taught general problem solving skills. Performance on a problem solving posttest would be
the criterion. A pilot study with this design was run in December,
83
1978, with neutral results. The study is described in Appendix
G. A
Is it taught?
Should it be taught?
If so,
how?
b)
Step 1 and step 4 skills are very closely related. Perhaps looking back involves essentially the same processes
as getting to know the problem initially. The unique
aspects of the relationship between these skills should be
investigated so that better teaching methods can be
designed.
84
IPSP test. To improve problem solving skill, we must look
to special types of reading, organizational and logical
skills in connection with understanding of mathematical
concepts.
d) It is interesting that step 3 skills were less closely related to the ITBS measures than were skills in the other
steps.
85
APPENDIX A
86
APPENDIX A
87
provides a language to use in discussing and analyzing a problem.
Each
step in the model is crucial; each step has implications for teaching.
What can the teacher do to help students get to know the problem?
can the teacher do to help students choose what to do?
What
THE CALCULATOR
Now that hand-held calculators are commonly available, they may be
used to solve problems which previously were beyond the ability of many
students.
Problems can be used which are more like those which arise
tational drudgery, the student may focus on the problem solving process
and those skills needed to solve problems.
Throughout the Iowa Problem-Solving Project students are expected
to use a calculator whenever they wish.
is appropriate.
88
The Iowa Problem-Solving Project has written one module specifically to introduce students to using calculators. A second, more advanced, module will be developed later. Throughout the other modules,
the calculator will be used as needed.
THE MODULES
In solving problems, a variety of tools, skills, and strategies
are needed.
The booklet,
usually about 30 pages long, will provide experiences to develop a particular skill. The card deck, usually about 100 problems, will provide
practice in solving problems which are especially suited to that skill.
USING THE MATERIALS
The eight modules (2 for the calculator, 6 for the problem solving
skills) will be taught in grades 5, 6, 7, and 8, two modules each year.
Students having all eight modules will have a rich experience seldom
found in existing programs.
89
A module is expected to be taught in the mathematics period, replacing the usual instruction during its use.
approximately one week and involve some discussion with the teacher.
The problem cards, however, are suited to individual, partnership, or
small group work.
in a deck, but rather they will select those problems suited to their
ability and interest.
problem deck also be taken from the regular mathematics period, after
the work in the skills booklet has been completed.
the Project will revise the materials, readying them for a broader
tryout.
The tryout of the materials will be conducted in Iowa schools (to
get feedback from varied settings).
the tryout schools will be conducted jointly by the Project team and
representatives of Area Education Agencies.
Following the tryout, the materials will be revised and readied
for wide distribution.
of 1978.
EVALUATION OF THE PROJECT
An integral part of the Project is the development of instruments
to measure students' problem-solving skill. Existing instruments are
90
generally narrow in scope, tapping only a few of the tools commonly
used, and often interwoven with computation difficulties.
Concurrently with module development, the Project is building a
problem-solving test: a test that is sensitive to the four steps in the
model and that encompasses the variety of skills found in the Project.
The test development involves the tryout of many test items and
validation in the classroom using trained observers and teachers.
91
SAMPLE PROBLEM
What sum are you most likely to get when you roll two dice?
Roll two dice 25 times and keep track of the sums.
Will you add 2 numbers each time you roll the dice?
GET TO KNOW
THE PROBLEM
CHOOSE WHAT TO DO
Could you use a graph?
DO IT
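The dice question on this sample card can be explored with exactly the tally it asks for. A quick sketch of the experiment (any seed will do; in the long run 7 is the most likely sum):

```python
# Roll two dice 25 times, keep track of the sums, and print a simple
# bar graph of the tallies.
import random
from collections import Counter

random.seed(1)
sums = Counter(random.randint(1, 6) + random.randint(1, 6)
               for _ in range(25))
for total in range(2, 13):
    print(total, "*" * sums[total])
```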
IPSP CALENDAR
[Timeline chart spanning September 1976 through November 1978;
development, try-out, and revision periods are shown for each of the
activities listed below.]
Develop Instructional
Model for ProblemSolving Process
Problem-Solving
Process Handbook
for Teachers
Using the Calculator
with Whole Numbers
Using Guesses in
Problem-Solving
Using Tables in
Problem-Solving
Using Resources in
Problem-Solving
Using the Calculator
with Decimals
Using Models in
Problem-Solving
Using Computation in
Problem-Solving
Using Equations in
Problem-Solving
Problem-Solving
Instruments for
Summative Evaluation
Study of ProblemSolving Project
93
FLOW CHART OF IPSP MODULE DEVELOPMENT
Write Problem Bank Examples -> Identify and Write Objectives for Skills
Booklet -> Complete Problem Bank -> Write Skills Booklet -> Pilot
Module -> Evaluate (revise module and repeat if needed) -> Write
Teachers' Guide -> Try-out Module (revise module and repeat if
needed) -> Evaluate -> Revise Module -> Disseminate
94
APPENDIX B
95
APPENDIX B
ACTIVITY
September, 1976
October-December, 1976
January, 1977
March, 1977
January-September, 1977
October, 1977
December, 1977
January, 1978
January-April, 1978
96
March-April, 1978
April-May, 1978
Final Interview Study at an Iowa City elemen-
tary school; form 561 was administered to
55 students who were also interviewed in a
think aloud setting
January-August, 1978
October, 1978
March, 1979
April-September, 1979
97
FLOW CHART OF IPSP EVALUATION INSTRUMENTS DEVELOPMENT
Identify problem-solving behaviours to be measured -> Write trial test
items -> Initiate validation procedure -> Interview tryouts -> Analyze
data -> Revise if needed (this analyze-and-revise cycle is repeated
through successive administrations)
98
APPENDIX C
99
APPENDIX C
How many
Suppose you had 10 pieces of candy and you gave away 4 of them.
How many pieces would you have left?
Judy baked 18 cookies and Jay baked 24 cookies for the homeroom
party. How many cookies did Judy and Jay bake altogether?
[Illustration: a bakery price sign for apple danish; most of the
lettering is illegible in this copy.]
100
At the fair Art burst 4 balloons with his first set of darts and
6 balloons with his second set. How many balloons did Art burst
altogether?
Kelly spent 68 cents for lunch and has 21 cents left. How much
did she have to begin with?
At the beginning of the year the school library had 2768 books .
At the end of the year 163 had not been returned. How many books
were left ?
Jerry had 18 baseball cards and David has 23 cards. Then David
gave 10 of his cards to Jerry. How many baseball cards do Jerry
and David have altogether?
Amy and Beth washed the 10 windows in their house. How many
windows did each girl wash?
[Illustration: a baseball diamond showing second base, third base, and
home plate.]
Mr. Price earned $75 in each of 8 weeks. How much did he earn
for all 8 weeks?
Danny who is 11 years old saves all of his allowance. If he gets
75 cents a week how much will he have saved at the end of 4 weeks?
Jackie worked for 5 hours and earned a total of $8.75. What was
Jackie's average pay per hour?
101
3.2. [Problem based on an illustration of a roller coaster; text
illegible in this copy.]

3.3. [Problem text illegible in this copy.]

3.4.
A new school has 2 classrooms for each of the grades. How many
classrooms will there be for grades one through six?
3.5.
3.6.
In art class one day Mrs . White arranged the tables in 2 rows
with 12 in each row. The next day she arranged the tables in 3
rows with 11 tables in each row. On which day did she have more
tables in the room and how many more were there?
3.7.
Suppose you have traded two big marbles for 4 little marbles. At
that rate how many big marbles will you need for 16 little
marbles?
4.1.
At a school fair 25 candy apples were sold, but only 1/5 as many
plain apples. How many apples were sold altogether?
4.2.
4.3.
Joe is saving money to buy a car which costs $475. He will have
enough money to buy the car if he saves $235 more. How much has
he saved?
4.4.
Jenny was given $15 for a birthday present. She used the money
to buy a calculator which had been reduced from its regular price
of $12.99 to $7.82. How much did Jenny have left?
4.5.
At
102
Susie has $2. How long can
she afford to park her car?
PARKING RATES:
  First half hour            75 cents
  Second half hour           50 cents
  Each additional half hour  25 cents

HALF PRICE SALE ! ! !
SKATEBOARDS
buy one for $24.99
order another one for ONLY $12.49!!!!
plus $4.55 for handling and postage on each order of 2 skateboards
sent to the same address.
Doughnuts are sold for 15 cents each or 6 for 85 cents. What is the
largest number of doughnuts you can buy for $4.55?
If a whole number is divided by 6 what is the greatest value the
remainder may have?
Ten boxes of apples each weighing 32 kilograms, and 5 boxes of
pears were delivered to a store. How many kilograms of fruit
were delivered?
Mr. Terry had $15.30 to buy some tickets to a game. He bought 3
adult tickets at $2.25 each and 5 children's tickets at 75 cents each.
How much money did Mr. Terry have left?
103
Mr. Jones earns $23.75 a day. What is his monthly pay if he
works 20 days a month at 6 hours each day?
What is the area of the shaded
part of the figure at the right?
3 mm
  Driver    lap 1   lap 2   lap 3   lap 4
  Hoyt       55.9     --     56.8    56.0
  Gunner     54.8     --     56.2    55.4
  Jones      54.6     --     56.6    54.9

[The lap 2 column is illegible in this copy.]
104
Jake answered
the job. The
sales totaled
did Jake earn
Jo's math scores for the first 4 weeks were 84, 95, 89, and 88.
What was her average score? The math teacher said anyone who had
an average of 90 or better for the first 5 weeks would get an A.
What is the lowest score Jo could make on her fifth test to get
an A?
A rectangular lot is 120 feet by 78 feet. A swimming pool built
on the lot is a rectangle 15 ft. by 17 ft. How much of the lot is
left after constructing the pool?
  LABEL                    % ORANGE JUICE
  Orange Juice                  100
  Orange Juice Blend           70-95
  Orange Juice Drink           30-70
  Orange Drink                 10-35
  Orange Flavored Drink    less than 10
105
7-4.
7.5-
7-6.
7.7-
7.8.
There are two special rectangles. Their lengths and widths are
whole numbers. For each rectangle the area and the perimeter is
the same number. Find the lengths and widths of the rectangles.
7.9.
7.10. A farmer sowed a plot with oats and two more plots with wheat.
He harvested 4 times as much oats as he sowed and 8 times as much
wheat. How much more wheat did he harvest than oats?
106
APPENDIX D
107
APPENDIX D
IOWA TEST OF BASIC SKILLS RELIABILITY ANALYSIS
NATIONAL REPRESENTATIVE SAMPLE
Table 12

Iowa Test of Basic Skills Reliability Analysis
National Representative Sample

  Grade   N      Level   Test   Mean    SD     SE(m)   r(xx)
    5    2558     11      R     33.6   13.67    3.7     .93
                         W-2     9.1    4.24    2.1     .75
                         M-1    17.8    7.02    2.9     .82
                         M-2    12.5    5.34    2.4     .80
    6    2578     12      R     32.1   13.04    3.8     .92
                         W-2    10.3    4.30    2.4     .70
                         M-1    19.1    7.71    3.0     .85
                         M-2    12.3    5.65    2.5     .81
    7    2600     13      R     37.2   14.38    4.0     .92
                         W-2    12.6    5.19    2.3     .80
                         M-1    21.9    8.80    3.0     .88
                         M-2    13.1    5.90    2.5     .82
    8    2679     14      R     40.3   14.93    3.9     .93
                         W-2    13.6    5.38    2.3     .81
                         M-1    22.6    8.70    3.1     .88
                         M-2    14.2    5.61    2.5     .80

Note: SD = Standard Deviation; SE(m) = Standard Error of Measurement;
r(xx) = Split-Halves Reliability.
108
APPENDIX E
109
APPENDIX E
y=P
where
R, W , M.. , M
RR+V2+SMl+V2+d
Basic Skills, Reading, Graph Skills, Mathematics Concepts and Mathematics Problem Solving, respectively. Also, for each of these tests we
have a standard deviation and reliability coefficient.
V
PR'
CT
W2' p w 2 '
etc
w=y-d
Denote these by
of the set of
p ,p
,p
and p
F, .
M2
From t h e p r o c e d u r e d e s c r i b e d on p . 200 of Lord & Novick (1968)
we d e r i v e t h e
formula:
44?R + 4 2 a W 2 P W 2
p -
2 2
+ P
2" 2"
M i a M 1 PM 1 + P M/M 2 P M 2
2~ 2"
2" 2
"
Let w = y − d. Then ρ_w = ρ_y, since w differs from y only by the constant d.

Writing w = Σ_i β_i·w_i, where the w_i are the component scores, the variance of w satisfies

    Nσ_w² = Σ(w − w̄)²
          = Σ(Σ_i β_i(w_i − w̄_i))²
          = Σ_i β_i² Σ(w_i − w̄_i)² + 2 Σ_{i<j} β_i β_j Σ(w_i − w̄_i)(w_j − w̄_j)
          = N Σ_i β_i²σ_i²

when the cross-product terms vanish, that is, when the component scores are uncorrelated. Here w̄_i is the mean of w_i and N is the number of examinees.
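As a numeric sketch of the composite-reliability formula, with hypothetical unit weights β_i = 1 (the weights actually used are not given here), uncorrelated components as assumed in the derivation, and the Level 14 standard deviations and reliabilities from Table 12 (Appendix D):

```python
# Composite reliability rho_y = sum(b^2 s^2 r) / sum(b^2 s^2),
# assuming uncorrelated components. Weights beta are hypothetical (all 1).
sigma = {"R": 14.93, "W2": 5.38, "M1": 8.70, "M2": 5.61}   # SDs, Level 14
rho   = {"R": 0.93,  "W2": 0.81, "M1": 0.88, "M2": 0.80}   # reliabilities
beta  = {k: 1.0 for k in sigma}                            # hypothetical weights

num = sum(beta[k] ** 2 * sigma[k] ** 2 * rho[k] for k in sigma)
den = sum(beta[k] ** 2 * sigma[k] ** 2 for k in sigma)
rho_y = num / den

print(round(rho_y, 3))  # → 0.898
```

As expected, the composite is more reliable than any single component except Reading, since errors of measurement partially cancel across subtests.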
APPENDIX F
ANALYSIS OF IPSP TEST RESULTS
OCTOBER 1978 AND MARCH 1979 ADMINISTRATIONS
Test Form 565
This test was administered to a sample of Iowa fifth and
sixth graders on or about October 1, 1978. Pertinent results
by grade level are reported here.
            Grade 5 (N=1215)            Grade 6 (N=1314)
Subtest     X̄      S.D.   Reliability   X̄      S.D.   Reliability
1           5.41   2.54   0.77          6.62   2.44   0.77
2           6.44   2.10   0.72          7.23   1.87   0.68
3           4.96   2.47   0.78          5.99   2.36   0.77
Total       16.31  6.22   0.87          19.83  5.76   0.86
Percentile Ranks

            Subtest 1       Subtest 2       Subtest 3
Raw Score   5th    6th      5th    6th      5th    6th
10          97     94       97     95       99     98
9           89     80       88     81       95     90
8           81     —        74     62       86     77
7           71     51       58     42       76     62
6           59     38       40     25       64     48
5           46     27       25     13       51     34
4           32     17       14     6        37     22
3           19     9        6      3        25     13
2           10     4        3      1        14     6
1           4      1        1      1        6      2
0           1      1        1      1        1      1
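Converting a raw subtest score to a percentile rank is a direct table lookup; a minimal sketch using the Subtest 1, Grade 5 column above (raw scores 10 down to 0):

```python
# Percentile ranks for Subtest 1, Grade 5 (values from the table above).
pct_rank_s1_g5 = {10: 97, 9: 89, 8: 81, 7: 71, 6: 59,
                  5: 46, 4: 32, 3: 19, 2: 10, 1: 4, 0: 1}

def percentile(raw_score: int) -> int:
    """Return the percentile rank for a raw Subtest 1 score (0-10)."""
    return pct_rank_s1_g5[raw_score]

print(percentile(7))  # → 71
```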
Percentile Ranks, Total Test (Grades 5 and 6, by raw score; the table is not legible in this reproduction)
            Grade 5 (N=1161)            Grade 6 (N=1184)
Subtest     X̄      S.D.   Reliability   X̄      S.D.   Reliability
1           6.40   2.16   0.69          7.25   2.05   0.71
2           7.04   1.62   0.58          7.57   1.56   0.59
3           5.50   2.22   0.71          6.39   2.14   0.71
Total       18.94  4.98   0.81          21.20  4.75   0.81
Percentile Ranks

            Subtest 1       Subtest 2       Subtest 3
Raw Score   5th    6th      5th    6th      5th    6th
10          96     93       98     96       99     97
9           87     78       88     81       94     89
8           74     60       69     57       85     75
7           58     41       46     33       72     58
6           42     26       26     15       58     41
5           26     15       13     6        42     26
4           15     8        4      2        27     14
3           7      3        1      1        15     7
2           3      2        1      1        6      3
1           1      1        1      1        2      1
0           1      1        1      1        1      1
Percentile Ranks, Total Test

Raw Score   Grade 5   Grade 6
30          99        99
29          99        98
28          98        95
27          96        91
26          92        85
25          88        77
24          83        69
23          77        61
22          70        53
21          63        44
20          56        36
19          49        29
18          42        23
17          36        18
16          30        14
(rows for raw scores 15 and below are not legible in this reproduction)
            Grade 7 (N=1078)            Grade 8 (N=1101)
Subtest     X̄      S.D.   Reliability   X̄      S.D.   Reliability
1           6.16   2.43   0.77          6.93   2.30   0.77
2           5.86   2.10   0.67          6.48   2.08   0.68
3           5.37   2.18   0.69          5.96   2.19   0.70
Total       17.38  5.72   0.84          19.38  5.57   0.84
Percentile Ranks

            Subtest 1       Subtest 2       Subtest 3
Raw Score   7th    8th      7th    8th      7th    8th
10          96     94       99     97       99     92
9           86     79       94     88       95     80
8           73     61       83     74       87     65
7           59     45       68     56       74     49
6           46     32       50     39       60     34
5           33     21       34     25       45     20
4           22     13       21     13       29     10
3           13     7        11     6        16     2
2           6      3        4      3        —      1
1           1      1        1      1        —      1
0           1      1        1      1        —      1
Percentile Ranks, Total Test

Raw Score   Grade 7   Grade 8
30          99        99
29          99        98
28          99        96
27          97        93
26          94        88
25          90        83
24          85        78
23          80        71
22          76        64
21          70        57
20          64        51
19          59        44
18          53        37
17          47        32
16          41        27
(rows for raw scores 15 and below are not legible in this reproduction)
            Grade 7 (N=910)             Grade 8 (N=1024)
Subtest     X̄      S.D.   Reliability   X̄      S.D.   Reliability
1           6.28   2.15   0.70          6.76   2.13   0.72
2           6.04   2.14   0.66          6.60   2.13   0.68
3           5.61   2.39   0.73          6.15   2.29   0.72
Total       17.93  5.67   0.83          19.51  5.55   0.84
Percentile Ranks

            Subtest 1       Subtest 2       Subtest 3
Raw Score   7th    8th      7th    8th      7th    8th
10          97     95       98     97       98     97
9           88     84       91     87       92     87
8           75     68       79     70       81     76
7           61     51       63     53       68     62
6           44     35       48     36       55     46
5           29     22       33     23       41     32
4           16     12       19     13       28     20
3           7      6        10     6        16     10
2           3      2        3      3        7      4
1           1      1        1      1        3      1
0           1      1        1      1        1      1
Percentile Ranks, Total Test (Grades 7 and 8, by raw score; the table is not legible in this reproduction)
APPENDIX G
PILOT TEACHING STUDY
A pilot teaching study was conducted in December, 1978, in conjunction with the pilot interview study. The schedule of activities is shown in Figure 14. ... of each ... Those students ... IPSP teachers, while group three was taught by the regular classroom teacher.

The group means on the IPSP post-test were not significantly different. However, a number of improvements in the design and treatments could yield different results.
Schedule for Pilot Study

Activity    Date           Responsibility
—           11/28          classroom teacher
—           11/29          the investigator
—           11/30          classroom teacher
—           —              the investigator
—           —              the investigator
—           12/5           the investigator
—           12/6, 12/7     involved teachers
—           12/8           —
Snow day    12/9
Weekend     12/10, 12/11
—           12/12          involved teachers
—           12/13          involved teachers

Figure 14
123
1. ...day and Wednesday went as scheduled, but because of snow the school was closed on Thursday (scheduled for the last day of the treatment) and Friday (scheduled for the post-test day). Since school was not in session Saturday and Sunday, the students were absent from school for four consecutive days. ... over again.

2. During this long weekend, the teacher who taught group two became ill and had to be replaced. Fortunately, her replacement was the director of the IPSP, who had been visiting the classes intermittently throughout the treatment and was well known by the students.

3. When the groups met on the first day and discovered that there were mixed grade levels, the seventh and eighth graders asked, "Why are we here?"

4. ... students and were involved in the development of the modules which were being used, they found it difficult to teach such a diverse group: "The older and brighter students tended to be more aggressive."

5. ... such activities as basketball practice, science projects, and music lessons to attend these treatment sessions. Their negative attitude was another potential difficulty for groups one and two.
The teaching study should be carried out again with the following suggested changes.

1. ...

2. Do not give the treatment across all grade levels, but combine fifth and sixth graders, and seventh and eighth graders.

3. ... was the most important step in the model. Perhaps studies which focus on the other steps should be undertaken also.
BIBLIOGRAPHY
La psychologie de l'intelligence.

Claparede, E. La genese de l'hypothese. 1-154.

Heath, 1933.

Psychological Monographs, 1945, 58.
Goldberg, D. J. The effects of training in heuristic methods on the
ability to write proofs in number theory. Unpublished doctoral
dissertation, Teachers College, Columbia University, 1973.
Guilford, J. P. The nature of human intelligence. New York: McGraw-Hill, 1967.
Krutetskii, V. A. The psychology of mathematical abilities in schoolchildren. Translated from the Russian by Joan Teller. J. Kilpatrick and I. Wirszup (Eds.). Chicago: University of Chicago Press, 1976.
Kuder, G. F., & Richardson, M. W. The theory of the estimation of test reliability. Psychometrika, 1937, 2(3), 151-160.
Lester, F. K. Mathematical problem solving in the elementary school: Some educational and psychological considerations. Paper prepared for the Research Workshop on Problem Solving in Mathematics Education, Center for the Study of Learning and Teaching Mathematics, The University of Georgia, May, 1975.
Lord, F. M., & Novick, M. R. Statistical theories of mental test scores. Reading, Mass.: Addison-Wesley, 1968.
Lucas, J. F. An exploratory study in diagnostic teaching of elementary
calculus. Unpublished doctoral dissertation, University of Wisconsin, 1972.
Luchins, A. S. Mechanization in problem solving. Psychological Monographs, 1942, 54(6).
Lundsteen, S. W., & Michael, W. B. Validation of three tests of cognitive style in verbalization for third and sixth grades. Educational and Psychological Measurement, 1966, 26(2), 449-461.
Mayer, R. E. Thinking and problem solving: An introduction to human cognition and learning. Glenview, Illinois: Scott, Foresman and Company, 1977.

McKeachie, W. J., & Doyle, C. L. ... Wesley, 1970.
Mikhal'skii, K. A. The solution of complex arithmetic problems in auxiliary school. In J. Kilpatrick, E. G. Begle, I. Wirszup, & J. W. Wilson (Eds.), Soviet studies in the psychology of learning and teaching mathematics (Vol. 9). Stanford: School Mathematics Study Group, 1975.

National Collection of Research Instruments for Mathematical Problem Solving. Gerald Kulm (Ed.). Purdue University, 1976.
Newell, A., Shaw, J. C., & Simon, H. A. Empirical explorations with the Logic Theory Machine. In E. A. Feigenbaum & J. Feldman (Eds.), Computers and thought. New York: McGraw-Hill, 1963.

Englewood Cliffs:
Paige, J. M., & Simon, H. A. Cognitive processes in solving algebra
word problems. In B. Kleinmuntz (Ed.), Problem solving. New York:
John Wiley and Sons, 1966, pp. 51-119.
Poincare, H. Science and method. Translated by F. Maitland. New York: Charles Scribner's Sons, 1914.

Polya, G. How to solve it (2nd ed.). 1957.

Polya, G. Mathematical discovery: On understanding, learning and teaching problem solving (Vol. 1). New York: John Wiley and Sons, 1962.
Post, T. R. The effects of the presentation of a structure of the problem-solving process upon problem-solving ability in seventh grade
mathematics (Doctoral dissertation, Indiana University, 1967).
Abstract: Dissertation Abstracts, 1968, 28, 4545A, No. 11.
Proudfit, L. Measuring problem-solving processes in elementary children. Unpublished manuscript, Indiana University, Bloomington, 1979.

Ray, W. S. Complex tasks for use in human problem solving research. Psychological Bulletin, 1955, 52, 134-149.

Reitman, W. Some Soviet investigations of thinking, problem solving and related areas. In R. A. Bauer (Ed.), Some views on Soviet psychology. Washington: American Psychological Association, 1962, pp. 29-61.
Restle, F., & Davis, J. H. Success and speed of problem solving by
individuals and groups. Psychological Review, 1962, 69, 520-536.
Riedesel, C. A. Problem solving: Some suggestions from research.
Arithmetic Teacher, 1969, 16, 54-58.
Scandura, J. M. Mathematical problem solving. American Mathematical Monthly, March 1974, 81, 273-280.
Shulman, L. S. Psychology and mathematics education. In E. G. Begle (Ed.), Mathematics education. Yearbook, National Society for the Study of Education, 1970, 69, 23-71.

Shulman, L. S., & Elstein, A. S. Studies of problem solving, judgment, and decision making: Implications for educational research. In F. Kerlinger (Ed.), Review of research in education 3. Itasca, Illinois: Peacock Publ., 1975.