Test Review

Running Head: TEST REVIEW

Test Review: Test of Written Language Third Edition
John Laing
University of Calgary
APSY 652
Adam McCrimmon

Test Review: Test of Written Language Third Edition

The Test of Written Language Third Edition (TOWL-3) is a norm-referenced test that assesses the written English-language proficiency of individuals or groups of students (Yarger, 1996). The authors of the TOWL-3 are Donald D. Hammill and Stephen C. Larsen (Hammill & Larsen, 1996). The TOWL-3 was published in 1996 by Pro-Ed, Inc., 8700 Shoal Creek Boulevard, Austin, Texas. The price of the TOWL-3 is approximately $160 (Hansen, 1998). Although there is no time limit, the TOWL-3 can be administered in approximately 90 minutes (Bucy, 1998; Hammill & Larsen, 1996; Hansen, 1998).

Overview

The Test of Written Language (TOWL) was first developed in 1978 as a comprehensive test for evaluating written language (Hammill & Larsen, 1996). The TOWL was a seven-subtest assessment that could be administered to students in Grades 3 through 8. It was based on five components of writing: the mechanical, productive, conventional, linguistic and cognitive components (Hammill & Larsen, 1996). The TOWL was designed to measure written language using both contrived and spontaneous testing formats. Contrived testing formats measure units of written language such as punctuation, capitalization, spelling, vocabulary and grammar. Spontaneous testing formats assess skills in composition, creativity and intellectual abilities through the analysis of written essays and stories. Hammill and Larsen modified the TOWL in 1983 by updating the test based on the current literature, improving the normative sample and making the test more user friendly. However, in

1988 the TOWL-2 was created. The TOWL-2 did not include the mechanical or productive components of written language; however, the number of subtests was increased to 10. In 1996 Hammill and Larsen created the TOWL-3. Improvements incorporated in the TOWL-3 include a shorter administration time; greater ease of use for young and poor writers; updated and improved normative data; evidence for equivalent-forms reliability; strengthened reliability and validity; and the inclusion of age and grade equivalents (Hammill & Larsen, 1996). The TOWL-3 storage box includes a 134-page examiner's manual, 25 Form A student response booklets, 25 Form B student response booklets and 50 profile/story scoring forms (Hansen, 1998). The TOWL-3 can be used to assess students between the ages of 7 years and 17 years, 11 months. Like the TOWL, the TOWL-3 uses spontaneous and contrived test formats. Both formats include conventional, linguistic and cognitive writing components (Hansen, 1998). Based on the three writing components and the two test formats, eight subtests were developed: Vocabulary, Spelling, Style, Logical Sentences, Sentence Combining, Contextual Conventions, Contextual Language and Story Construction. Subtest one, Vocabulary, assesses a student's ability to write a sentence incorporating a stimulus word. Subtest two, Spelling, requires the student to write a sentence from dictation while applying spelling rules accurately. Subtest three, Style, focuses on capitalization and punctuation rules; the student is required to write a sentence from dictation using correct capitalization and punctuation. Subtest four, Logical Sentences, requires the student to edit an illogical sentence so that it makes better sense. Subtest five, Sentence Combining, measures the student's ability to combine several short sentences into one well-written sentence. Subtest six,

Contextual Conventions, requires the student to write a story in response to a stimulus picture. The student earns points for accurate use of capitalization, punctuation, spelling and other elements of writing such as paragraph indentation. Subtest seven, Contextual Language, evaluates the student's story for quality of vocabulary, sentence structure and grammar. Lastly, subtest eight, Story Construction, is used to evaluate the plot, prose, character development, interest and other compositional aspects of the student's story (Hammill & Larsen, 1996).

Recommended Use

Written language is one of the most important forms of communication; however, it may also be one of the most complex (Burns & Symington, 2003; Hammill & Larsen, 1996). According to Hammill and Larsen (1996), written language consists of characters, letters or words used to comprehend or express thoughts. To use written language effectively, one must possess both receptive and expressive capabilities; that is, written language depends on the ability to read and write successfully. Although written language includes reading and writing components, the TOWL-3 measures only the writing component. According to Burns and Symington (2003), valid assessment of written language is important because of its complexity and because it is imperative for academic success. To write meaningfully, students must master three basic cognitive abilities: (1) the conventional component, (2) the linguistic component, and (3) the cognitive component. The TOWL-3 measures these abilities through the use of both spontaneous and contrived testing formats. The use of conventional, linguistic and cognitive writing components, coupled with spontaneous and contrived test formats, allows for a more accurate assessment of how well the student understands the conventions and forms of written language, as well as how the student

actually writes (Hansen, 1998). The conventional component of writing refers to the accepted rules established for spelling, punctuation and capitalization. The linguistic component involves the use of serviceable syntactic and semantic structures. The cognitive component refers to the ability to produce logical, coherent and sequenced written products. Based on the student's responses to given stimuli, the Vocabulary, Spelling, Style, Logical Sentences and Sentence Combining subtests are used to measure the student's contrived writing ability (Hansen, 1998). Using writing samples created by the student, the Contextual Conventions, Contextual Language and Story Construction subtests measure ability in the spontaneous test format. After the completion of all eight subtests, three scores are produced: Spontaneous Writing, Contrived Writing and Overall Writing. The Spontaneous Writing score indicates how well the student applied the writing components in an actual passage that he or she composed. The Contrived Writing score indicates the student's ability to use the smallest units of written language such as spelling, punctuation, word usage and capitalization. The TOWL-3 can be used to assess the writing ability of students aged 7 years to 17 years, 11 months. Valid assessment is necessary to highlight problems with written expression, as well as to direct appropriate intervention. The TOWL-3 can be used to identify students who need special help with written language, to identify strengths and weaknesses, and to monitor existing programs and interventions used to improve writing skills (Hansen, 1998). Furthermore, the TOWL-3 may support research in writing, as it can provide a large amount of data across a broad demographic of students.
Examiners should have formal training in assessment, as well as supervised practice using the TOWL-3. Training should provide examiners with a basic understanding of general

procedures governing test administration, scoring and interpretation, as well as specific information about educational evaluation. Examiners should also be familiar with the following: the nature of written language; the components of writing; useful testing formats and models for assessing writing; the TOWL-3 subtests; the procedures for using the subtests to generate scores; and the purpose of written language assessment (Hammill & Larsen, 1996). The TOWL-3 is unique and has many strengths. First, it is relatively inexpensive in comparison to other educational tests and comes with an examiner's manual that is informative, comprehensive and easy to read. The test can be administered over more than one sitting, which may be particularly important when testing younger students or students who have trouble maintaining their attention for longer periods of time. The TOWL-3 can also be used across a wide range of ages, from 7 years to 17 years, 11 months, and it can be administered to groups of students or to individual students. Because the TOWL-3 provides Form A and Form B, examiners can evaluate student progress using pre-testing and post-testing that is not contaminated by memory. In addition, the TOWL-3 is comprehensive in that it measures writing competence through both essay analysis (spontaneous format) and traditional test formats (contrived format). According to Hammill and Larsen (1996), the TOWL-3 is unbiased relative to gender and race and has a large normative base, allowing the test to be administered to a large population of students. Lastly, composite quotients are available for Overall Writing, Contrived Writing and Spontaneous Writing, making the TOWL-3 easy to score and interpret.

Major Features

The TOWL-3 comprises eight subtests. Five subtests (Spelling, Style, Vocabulary, Sentence Combining, and Logical Sentences) use the Contrived Writing format. These subtests measure the student's writing skills in isolation. For example, the student is required to incorporate a word into a sentence, spell from dictation and combine sentences accurately (Bucy, 1998). The three remaining subtests (Contextual Conventions, Contextual Language and Story Construction) follow a Spontaneous Writing format. For example, the student produces a story that is evaluated for vocabulary, grammar, plot and punctuation. The spontaneous writing sample is created by the student after they are presented with a picture. The student is asked to write a story based on the picture presented in either Form A or Form B. Form A depicts a futuristic scene including astronauts and space ships. Form B displays a prehistoric scene including people wearing animal skins and a prehistoric animal. The Vocabulary subtest requires the student to write a sentence using a specific word presented to them. For example, the student reads the word "see" and then must write a sentence using the word. One point is given if the sentence demonstrates an understanding of the word, and no points are given if the sentence does not clearly show that the student understands the word. The Style and Spelling subtests require the student to write a sentence from dictation. The sentence is then evaluated for spelling, punctuation and capitalization. For example, the examiner reads "The car sped" and the student writes the sentence. The student receives 1 point if there are no spelling errors and 1 point if there are no capitalization or punctuation errors. Errors result in a score of 0.

The Logical Sentences subtest requires the student to correct an illogical sentence. For example, the student would read "The man blinked his nose" and would be required to change the sentence to "The man blinked his eye." Spelling and punctuation are disregarded for this subtest. The student receives 1 point if the sentence is fixed in a logical and grammatically correct manner. For the Sentence Combining subtest the student is required to combine several short sentences into one grammatically correct and meaningful sentence. The student receives 1 point if the sentence is meaningful and grammatically correct and 0 points if it is not. The Contextual Conventions subtest requires the student to write a story based on the picture presented in Form A or Form B. The student's story is evaluated for the correct use of punctuation, spelling and other rules of writing. The story is scored using 12 criteria such as the correct use of capital letters to begin sentences, the number of paragraphs and the use of quotation marks. For the Contextual Language subtest, the student's story is evaluated for vocabulary, sentence structure and grammar. As outlined in the examiner's manual, the story is evaluated on 14 criteria, including the use of complete sentences, the use of conjunctions to construct compound sentences, and subject-verb agreement. Lastly, the Story Construction subtest evaluates the student's story for story elements including character development, plot development and thematic development. The Story Construction subtest is evaluated using 12 criteria including story beginning, relatedness to the stimulus picture, plot development, and character development.

Each subtest produces a raw score, a standard score and a percentile. The standard scores for each subtest are derived from the raw scores. Standard scores are the best way to evaluate the student's ability across all eight subtests. Based on the standard score, a description of the student's ability for each subtest is given in the examiner's manual. For example, a standard score ranging from 8 to 12 on any given subtest indicates that the student displayed average ability on that subtest. Composite scores can also be created for Contrived Writing, Spontaneous Writing and Overall Writing. The composite scores are a special type of standard score that allows the examiner to estimate a student's overall writing competence, spontaneous writing competence and contrived writing competence. For example, a composite score ranging from 121 to 130 for the Overall Writing, Spontaneous Writing or Contrived Writing domains indicates superior ability in that domain. The composites are important because they indicate the student's skill in relation to the components built into the test. Students who have good Contrived Writing composites understand the rules of test taking, have knowledge regarding the elements of writing and have mastered the basic skills of written language. Students who display good Spontaneous Writing composites have a good understanding of how to incorporate the elements of writing into meaningful writing. Furthermore, students who have high Spontaneous Writing composites show their mastery of writing as a communication medium (Hammill & Larsen, 1996).
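
To make the subtest standard-score scale concrete, the following is a minimal sketch rather than the manual's procedure. It uses the subtest scale reported in this review (mean of 10, standard deviation of 3) and the 8 to 12 "average" band quoted above. The function names are hypothetical, and the percentile approximation assumes normally distributed scores, which is an assumption of the sketch; the norm tables in the examiner's manual remain the authoritative source.

```python
from statistics import NormalDist

# Subtest standard-score scale as reported in this review.
SUBTEST_MEAN, SUBTEST_SD = 10, 3


def subtest_z(standard_score: float) -> float:
    """Distance from the normative mean, in standard-deviation units."""
    return (standard_score - SUBTEST_MEAN) / SUBTEST_SD


def approx_percentile(standard_score: float) -> float:
    """Approximate percentile rank, ASSUMING a normal distribution.

    This is an illustration only; it does not reproduce the manual's tables.
    """
    return 100 * NormalDist().cdf(subtest_z(standard_score))


def is_average(standard_score: int) -> bool:
    """True for the 8-12 band, the only subtest band quoted in the review."""
    return 8 <= standard_score <= 12


if __name__ == "__main__":
    for score in (7, 10, 13):
        print(score, round(subtest_z(score), 2),
              round(approx_percentile(score), 1), is_average(score))
```

Under these assumptions, a standard score of 13 lies one standard deviation above the mean, which corresponds to roughly the 84th percentile.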

Administration

The TOWL-3 can be administered to groups or individuals. Detailed instructions for test administration are outlined in Chapter 3 of the examiner's manual. The test begins with the Spontaneous Story followed by the Contrived subtests. Contrived subtests are administered in order, beginning with Vocabulary, after the student has had 15 minutes to complete their story. When testing individuals, the examiner begins with the first item in each Contrived subtest. A ceiling is reached when the student makes three consecutive errors. Although the examiner also begins with the first item when administering the Contrived subtests to groups, the subtest is concluded when the majority of the group attains the ceiling (Yarger, 1996). Each Contrived subtest has verbatim written directions for the examiner to read when explaining the subtest to students. For the Spontaneous subtests, students are asked to look at a stimulus picture and write a story about it. Students are given specific instructions to help them plan and organize their story. Samples illustrating a variety of ability levels are provided in the examiner's manual to make scoring easier. The story is scored based on conventions, language use and story construction.

Scoring and Interpretation

Examiners can calculate five types of scores using the TOWL-3: (1) Raw scores, (2) Age and Grade equivalents, (3) Percentiles, (4) Subtest Standard scores, and (5) Composite quotients. Details regarding test interpretation are outlined in Chapter 4 of the examiner's manual. The student's identifying and demographic information is recorded in section one of the Profile/Scoring Form. The Raw score, Percentile and Standard score for each subtest are recorded in section two. Raw scores are recorded first; Percentiles and Standard scores are recorded using

information found in Appendix A of the examiner's manual. Section three contains the results of any other relevant tests that have been administered to the student. Standard scores obtained from other test measures should be converted into TOWL-3 equivalents using Table 4.1 of the manual. Standard scores for the TOWL-3 subtests are assigned to section four. The Standard scores of the subtests are combined to produce each composite score based on the test components. Section five contains the scores of the three Spontaneous subtests. The examiner is to score the Spontaneous subtests only if the student's story contains 40 or more words. Specific guidelines, including the subtest criteria for accurate scoring, are listed in Chapter 3 of the manual. Section five of the Profile/Scoring Form contains graphs on which test results can be plotted. The examiner plots the standard scores for the subtests and the quotients for the composites on the profile provided. As a result, the examiner can see any scatter among the subtests, the composites and the other test results. Raw scores have limited value and cannot be used to make clinical interpretations about a student's performance. Raw scores are not comparable to one another because they are calculated differently from one subtest to another. Caution should also be used with age and grade equivalents. Age and grade equivalents have inadequate statistical properties, are often misleading, and their use is discouraged by many authorities including the American Psychological Association (Hammill & Larsen, 1996). Raw scores are converted into percentiles using Appendix A of the manual. A percentile is a value on a scale of 100 that indicates the percentage of scores in the normative sample that are equal to or below a given score. Subtest standard scores are the best way to evaluate a student's writing ability.
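
The ceiling rule described under Administration (three consecutive errors) and the 40-word minimum for scoring the story can be sketched as follows. This is an illustration under stated assumptions, not the manual's procedure: the function names are hypothetical, items are assumed to be scored 0 or 1 in administration order, and counting words by splitting on whitespace is a simplification that the manual may define differently.

```python
def raw_score_with_ceiling(item_scores, ceiling_run=3):
    """Sum 0/1 item scores, stopping once `ceiling_run` consecutive items are missed.

    `item_scores` is a hypothetical list of item scores in administration order.
    """
    raw, consecutive_misses = 0, 0
    for score in item_scores:
        raw += score
        consecutive_misses = consecutive_misses + 1 if score == 0 else 0
        if consecutive_misses == ceiling_run:
            break  # ceiling reached: later items are not administered
    return raw


def story_is_scorable(story: str, min_words: int = 40) -> bool:
    """Spontaneous subtests are scored only when the story has 40 or more words.

    Whitespace splitting is an ASSUMED word-counting rule.
    """
    return len(story.split()) >= min_words


if __name__ == "__main__":
    print(raw_score_with_ceiling([1, 1, 0, 0, 0, 1, 1]))
    print(story_is_scorable("word " * 40))
```

In the first call, the two correct items after the third consecutive error are ignored, because testing would have stopped at the ceiling.
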
For each subtest the mean score is 10 with a standard deviation of 3. Standard scores provide equivalent

indices for each subtest; therefore, subtests can be compared using standard scores, allowing for accurate evaluation of strengths and weaknesses across all eight subtests. The interpretation of a student's writing ability based on standard scores can range from very poor to very superior. Subtest scores are useful in hypothesizing why a person did poorly on a composite; however, decisions pertaining to diagnosis and placement should rest primarily on the interpretation of quotients. Subtest scores should only be interpreted in terms of the specific skills they measure.

Standardization of the Test

The TOWL-3 was normed in 1995 on 2,217 students in Grades 2 through 12, from 25 states, representing each major U.S. region (Northeast, North Central, South and West) (Hansen, 1998). The normative sample is representative of the U.S. population, according to the 1990 census, with regard to gender, residence, geographic region, race, disability, and parental income and education (Hammill & Larsen, 1996; Yarger, 1996). Furthermore, gender, ethnic and racial biases were absent (Hammill & Larsen, 1996). Students with disabilities make up 10% of the standardization sample and include children with learning disabilities (6%), speech-language disorder (3%), mental retardation (1%) and other disabilities (1%) (Bucy, 1998). A total of 970 students in general classes were tested in the four regions. In addition, 54 volunteers, selected from among TOWL-2 test users, tested an additional 1,247 students (Hansen, 1998). To further support the representativeness of the sample, the demographic information was stratified by age group. Data presented in Chapter 6 of the manual indicate that the stratified variables conform to national expectations at each age group covered by the test's norms. Specifically, geographic region, gender, race, residence and ethnicity were stratified by age group. Norms for the TOWL-3 subtests are presented in terms of standard scores.
Standard scores have a mean of 10

and a standard deviation of 3. This scale was selected because it is already familiar to examiners. Percentiles are also provided for the TOWL-3 subtests and composites. In addition, age and grade equivalents are available for the raw score obtained by the student. Equivalents were determined from the average scores of all students included in the normative sample. The age and grade equivalents can be found in Appendix C of the examiner's manual.

Psychometric Properties

Evidence of test reliability was provided in several ways. Internal consistency (coefficient alpha) was determined for each subtest and composite score at each of the 11 age levels. Composite scores for Contrived Writing and Spontaneous Writing on both test forms meet minimal standards (.90) for educational decision making (Bucy, 1998). Furthermore, coefficient alpha for the Overall Writing composite on both forms meets acceptable standards (.97). Internal consistency for some subtests meets acceptable standards; however, the Contextual Conventions and Contextual Language subtests do not (.76) (Bucy, 1998). Guilford's formula was used to calculate coefficients for the composites, and a z transformation technique was used to average the coefficients (Hayward, Stewart, Phillips, Norris & Lovell, 2007). All subtest alphas were .80 or higher, except for Contextual Conventions, and all composite alphas meet the acceptable standard of .90 or higher. Evidence of alternate-forms reliability was also provided, although in less detail (Hayward et al., 2007). Data provided in the manual show differences between the alternate forms, favoring Form B. Specific procedures such as the time between administrations, the order and format of administration, and the sample size are not provided.
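
The z transformation averaging mentioned above can be illustrated with Fisher's z transformation, the usual way of averaging correlation-type coefficients. This is a minimal sketch, not the authors' exact computation: the function name is hypothetical, the coefficients are made-up illustrative values rather than figures from the manual, and no weighting by sample size is applied.

```python
import math


def average_coefficients(coefficients):
    """Average correlation-type coefficients via Fisher's z transformation.

    Each coefficient r is mapped to z = atanh(r), the z values are averaged
    (unweighted, an ASSUMPTION of this sketch), and the mean is mapped back
    with tanh.
    """
    z_values = [math.atanh(r) for r in coefficients]
    mean_z = sum(z_values) / len(z_values)
    return math.tanh(mean_z)


if __name__ == "__main__":
    # Illustrative values only; these are not taken from the TOWL-3 manual.
    print(round(average_coefficients([0.80, 0.90]), 3))
```

Averaging in the z metric gives a result slightly above the arithmetic mean when the coefficients are high and unequal, because the transformation stretches the scale near 1.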

Reliability coefficients at each age level were acceptable for Spontaneous Writing (.76-.89), Contrived Writing (.86), and Overall Writing (.83-.96). Test-retest reliability was determined using only 27 second-grade students and 28 Grade 12 students, with a two-week delay between tests. Subtest coefficients are in the .70s and .80s, with composite coefficients in the .80s to .90s. Coefficients of at least .90 are considered a more acceptable standard (Hayward et al., 2007). Two Pro-Ed staff members tested inter-rater reliability. Interscorer reliability was determined to be acceptable for all composites and four of the eight subtests (Bucy, 1998). The two staff members scored 38 randomly selected protocols from the normative sample and showed high inter-rater reliability, with coefficients of .83 to .97 (Hayward et al., 2007). Evidence for test validity was presented in terms of content, criterion and construct validity. Content validity involves ensuring that the test items hold up statistically and that the skills to be measured are congruent with current knowledge about the particular area to be assessed (Hammill & Larsen, 1996). Content validity is demonstrated in three ways. First, the manual contains a rationale for each subtest's content and format. Second, empirical support is provided by the results of the classical item analysis used during the development of the subtest items. Lastly, validity of the items is supported by the results of differential item functioning analyses used to show the absence of bias in a subtest's items. Evidence for criterion validity was demonstrated through a study of 76 elementary students. Moderate correlations were found between TOWL-3 scores and teacher rating scale scores of academic abilities; however, additional studies with larger samples, at all ages and with different criterion measures are needed (Bucy, 1998).

Evidence for construct validity was provided through an examination of test performance at each age level and of correlations of subtest scores with age (Hayward et al., 2007). Strong evidence is provided for construct validity using logical analysis, correlation analysis and factor analysis (Bucy, 1998; Hansen, 1998). Hammill and Larsen provide coefficients that show strong correlations with age for ages 7 to 12, but not for ages 13 to 17, thus providing evidence for construct validity (Hayward et al., 2007).

Evaluation

The TOWL-3 is greatly improved from its earlier versions (Bucy, 1998; Hansen, 1998; Hayward et al., 2007). It is most useful in identifying writers whose skills are significantly below those of their peers (Bucy, 1998). Strengths of the TOWL-3 include the following: it can be administered and scored in a relatively short amount of time; computer scoring is available; subtest items are untimed, making the test fairer to young children and children with disabilities; the test is based on a strong conceptual model of written language; subtest reliabilities are generally acceptable and evidence of validity is good; the test can be administered over more than one sitting; Form A and Form B allow for pre- and posttesting; and the test can be used across a wide range of ages. Although the TOWL-3 is much improved over previous versions, some weaknesses still exist. Subjectivity remains in the scoring procedures for Story Construction, as well as Contextual Conventions. Also, scoring the TOWL-3 involves a number of steps, making the process difficult for classroom teachers. Examiners are required to have formal training in educational assessment and should have supervised practice administering the TOWL-3. These requirements can be time consuming for professionals who have many responsibilities, and a financial cost may be associated with supervised practice.

The norming sample is adequate; however, a larger sample, with data showing the proportionality of the subgroups to population statistics, could be helpful (Hansen, 1998). Furthermore, a scale score can be computed from raw scores of 0. This creates a problem at younger ages because a raw score of 0 in Sentence Combining yields a Subtest Standard score of 9 for a 7-year-old (Bucy, 1998). Guidance on scoring and interpreting the protocols of students who obtain subtest raw scores of 0 should be included in the examiner's manual. The TOWL-3 meets a need for a measure of written language for both diagnostic and research purposes (Bucy, 1998; Hansen, 1998; Hayward et al., 2007). Although it is a useful test, its utility with younger children is questioned (Bucy, 1998; Hayward et al., 2007). The TOWL-3 can be administered and scored much faster than the TOWL-2, and the entire test was renormed on a representative sample of students ages 7 through 17. Studies support the reliability of the TOWL-3; however, additional validity studies are needed (Bucy, 1998). I would use the TOWL-3 particularly for students above the age of 8 years who exhibit significant writing difficulties compared to their peers. The TOWL-3 is based on a sound model of written language and can provide a comprehensive assessment of a student's writing ability.

References

Bucy, J. (1998). Test review of the test of written language-3. From J. C. Impara & B. S. Plake (Eds.), The thirteenth mental measurements yearbook [Electronic version]. Retrieved November 28, 2009, from the Buros Institute's Test Reviews Online website: http://www.unl.edu/buros

Burns, M., & Symington, T. (2003). A comparison of the spontaneous writing quotient of the test of written language (3rd ed.) and teacher ratings of writing progress. Assessment for Effective Intervention, 28(29), 29-34.

Hammill, D. D., & Larsen, S. C. (1996). Test of Written Language-3 (TOWL-3). Austin, TX: Pro-Ed, Inc.

Hansen, J. (1998). Test review of the test of written language-3. From J. C. Impara & B. S. Plake (Eds.), The thirteenth mental measurements yearbook [Electronic version]. Retrieved November 28, 2009, from the Buros Institute's Test Reviews Online website: http://www.unl.edu/buros

Hayward, D. V., Stewart, G. E., Phillips, L. M., Norris, S. P., & Lovell, M. A. (2008). At a glance test review: Test of written language-3 (TOWL-3). Language, Phonological Awareness, and Reading Test Directory (pp. 1-5). Edmonton, AB: Canadian Center for Research on Literacy. Retrieved November 27, 2009, from http://www.uofaweb.ualberta.ca/elementaryed/ccrl.cfm

Yarger, C. (1996). An examination of the test of written language-3. Volta Review, 98(1), 211-216.