1 Applied linguistics (AL) is a notoriously conflicted term, with a long history in the literature. There are narrow interpretations, such as 'linguistics applied' (cf. Brumfit 1997; Widdowson 1998), and broad ones, such as those of Rampton (1997) and Samson (2012), who argued that AL draws on theoretical foundations much larger than just linguistics, including psychology, education, sociology and anthropology. The latter is in keeping with the definition given by the American Association for Applied Linguistics (AAAL): 'a wide range of theoretical, methodological and educational approaches from various disciplines from the humanities to the social, cognitive, medical, and natural sciences' (n.d.). This broader definition is characterized by interdisciplinarity (according to Samson 2012) and 'the theoretical and empirical investigation of real-world problems in which language is a central issue' (Brumfit 1997: 93). Here we have adopted this broad definition of the term.
2 We have adopted the term 'doctoral dissertation' as it is the term used in Canadian universities.
3 We have used the term 'language assessment' to refer to broad issues in language assessment and testing. We regard language assessment as an umbrella term, which includes language testing, but, consistent with research terminology in the field (cf. journals such as Language Testing and Language Assessment Quarterly), we use language assessment, language testing and language assessment and testing interchangeably.
identifying synonyms or paraphrases and matching key vocabulary in the text were found to favor Mandarin speakers, while items involving skimming for the gist, connecting or relating information presented in different parts of the text, and drawing an inference favored Arabic speakers. These results provide evidence for the validity of the bottom-up and top-down
reading strategy framework (Grabe 2009; Grabe & Stoller 2011) and a substantive method
for interpreting group differences on the CLBA reading assessment. This study employed verbal protocol data and sophisticated statistical approaches such as differential item functioning analysis, which are common research methodologies and analytical approaches in language testing (Bachman 2000). This mixed method study was elegantly designed and executed
and has led to a number of publications (e.g. Abbott 2007). The same characteristics and
contribution are also seen in the study by Gao (2007) discussed below, again in the area of
reading assessment.
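For readers less familiar with differential item functioning (DIF) analysis, the following is a minimal sketch of one common DIF procedure (the Mantel-Haenszel common odds ratio), not the specific analysis Abbott used; the data layout and column names are hypothetical.

```python
import numpy as np
import pandas as pd

def mantel_haenszel_dif(df, item, group_col="group", total_col="total"):
    """Mantel-Haenszel common odds ratio for one dichotomous item.

    Hypothetical layout: df has 0/1 item columns, a 'group' column
    ('reference'/'focal') and a 'total' test score used as the matching variable.
    """
    num, den = 0.0, 0.0
    for _, stratum in df.groupby(total_col):       # stratify by total score
        ref = stratum[stratum[group_col] == "reference"]
        foc = stratum[stratum[group_col] == "focal"]
        n = len(stratum)
        if ref.empty or foc.empty:
            continue
        a = (ref[item] == 1).sum()   # reference group correct
        b = (ref[item] == 0).sum()   # reference group incorrect
        c = (foc[item] == 1).sum()   # focal group correct
        d = (foc[item] == 0).sum()   # focal group incorrect
        num += a * d / n
        den += b * c / n
    # An odds ratio well above 1 suggests the item favours the reference group.
    return num / den if den else np.nan
```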
Gao's (2007) study addressed the call to integrate cognitive psychology with assessment
to inform test design and validation and to provide detailed diagnostic feedback. One
approach to this is to model item statistics, in particular item difficulty. This study used
a cognitive-psychometric approach to model the reading items included in the Michigan
English Language Assessment Battery (MELAB) and tested the model through four stages:
a. An initial cognitive model was hypothesized to underlie MELAB reading performance, based on a review of the processes associated with reading and reading test-taking by ESL learners. The model was then validated by
b. having the cognitive demands made by the test items analyzed by raters,
c. collecting students' verbal reports of the processes they used to arrive at the correct
responses, and
d. examining the relationship between the proposed cognitive processes and item difficulty
estimates using a tree-based regression.
The study demonstrated the value of using multiple sources of evidence, using a mixed
method design and employing different participants (e.g. raters and test-takers) to evaluate a
cognitive model and the relationship between cognitive psychology and language assessment.
Considering the interdisciplinary nature of language testing (Bachman 2000), studies of this
nature have particular strength and make an original contribution to our understanding of
test validation.
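As an illustration of stage (d), the following is a minimal sketch of a tree-based regression of item difficulty on coded cognitive attributes; the attribute names and values are hypothetical stand-ins, not Gao's data or model.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor, export_text

# Hypothetical data: one row per item; columns are 0/1 codes for whether a
# rater judged the item to demand a given cognitive process.
X = np.array([[1, 0, 0], [1, 1, 0], [0, 1, 1], [0, 0, 1], [1, 0, 1], [0, 1, 0]])
item_difficulty = np.array([-1.2, -0.4, 0.8, 0.3, 0.1, 0.5])  # e.g. calibrated b-values

tree = DecisionTreeRegressor(max_depth=2, min_samples_leaf=2)
tree.fit(X, item_difficulty)

# The splits indicate which coded processes best separate easy from hard items.
print(export_text(tree, feature_names=["matching", "inferencing", "connecting"]))
```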
Zheng (2010) examined motivation, anxiety, global awareness and linguistic confidence
and their relationship to language test performance within the context of Chinese university
students taking the College English Test Band 4 (CET-4) in China. Using a mixed method
approach, through concurrent survey and interview inquiries, this study explored whether
and how the selected psychological factors contributed to students' CET performance.
Results from exploratory factor analysis revealed that Chinese university students displayed
three types of instrumental motivation (mark orientation, further-education orientation
and job orientation), two types of anxiety (language anxiety and test anxiety) and two
types of confidence (linguistic confidence and test confidence). The results of confirmatory
factor analysis led to a modified socio-educational model of motivation with some context-specific concepts (new instrumental orientations, global awareness and linguistic confidence)
that more accurately represented the characteristics of the Chinese university students.
The results of structural equation modeling confirmed that attitude toward the learning
situation and integrative orientation were two strong indicators of motivation, which in
turn influenced language achievement and confidence. The negative impact of anxiety on
language achievement was confirmed. Certain group differences were found between male
and female students, high and low achievers, students from the arts programs and those from
the science programs, and students who started to learn English before Grade 7 and those
who did so after Grade 7. The interview findings further confirmed stronger instrumental
than integrative orientations within the study context. This study tested well-developed
motivation and anxiety models in the Chinese context and expanded our knowledge of
theory development in English language education. The implications of this study point
to the importance of understanding language test-takers' characteristics in their macro- and
micro-learning contexts, thus providing evidence for the meaning of test scores and informing
test score use.
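By way of illustration, the following is a minimal sketch of an exploratory factor analysis step of the kind reported above; the response matrix is a random stand-in and the settings are not Zheng's.

```python
import numpy as np
from sklearn.decomposition import FactorAnalysis

rng = np.random.default_rng(0)
# Toy stand-in for Likert-scale questionnaire data:
# rows = students, columns = survey items on motivation, anxiety and confidence.
responses = rng.integers(1, 6, size=(200, 12)).astype(float)

fa = FactorAnalysis(n_components=3, rotation="varimax")
fa.fit(responses)

# The loading matrix (items x factors) shows which items cluster on which
# latent factor (e.g. mark vs. further-education vs. job orientation).
print(np.round(fa.components_.T, 2))
```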
What impressed us is that all three studies above (1) used multiple sources of evidence to support their findings; (2) collected data concurrently or sequentially at different stages; and (3) conducted data analysis using advanced statistical procedures alongside sophisticated interpretative or qualitative analysis. All three are of high quality and have been published
(cf. Abbott 2007; Gao & Rogers 2011).
Farnia (2006) and Limbos (2005) took a different approach, collecting data through
individual testing of reading with younger students. Both studies therefore addressed the
nature of reading development at the school level and involved students learning English as
a first and a second language. Farnia (2006) was concerned with uncovering the patterns
of growth in text-reading fluency and reading comprehension in two groups of students:
children with English as a second language (ESL) and those with English as a first language
(EL1). The study explored the role of linguistic and cognitive predictors in understanding this
growth. The sample consisted of 50 EL1 and 107 ESL children whose performance on the
potential predictors of memory, phonological processing, spelling, word level reading skills
and oral language proficiency (vocabulary and grammar) was assessed yearly from Grade 1
to Grade 6. Text-reading fluency and reading comprehension were tracked from Grade 2 to
Grade 6.
Three interrelated studies were conducted. Study 1 focused on growth trajectories in
lower, word-level and high-level (comprehension and fluency) reading skills in Grades 1–3 and Grades 4–6. Hierarchical Linear Modeling indicated that ESL children developed faster
than EL1 children in word-level reading skills, aspects of phonological processing, memory
and oral language. Spelling and language proficiency skills were significant concurrent and
longitudinal predictors of reading comprehension in Grades 4–6. Study 2 revealed similar
patterns of text-reading fluency trajectories for the EL1 and ESL children. Regardless of
initial text-reading fluency level and language group, students who were able to read words
fluently at the outset demonstrated a faster growth rate from Grades 2 to 6. Study 3 further explored the facilitative relationship between text-reading fluency and reading comprehension in Grades 4–6, and results showed that growth in text-reading fluency among the EL1 and ESL students contributed to their later performance in reading comprehension. The study also considered implications for assessment, the identification of developmental risk factors, and practical means to improve
text-reading fluency and reading comprehension for ESL children in comparison to EL1
students. Studies of this nature are extremely important for countries like Canada where
schooling involves both groups of students with reading development at various levels.
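To make the growth-modeling approach concrete, the following is a minimal sketch of a hierarchical linear (mixed-effects) growth model with random intercepts and slopes; the data are synthetic toy values and the specification is not Farnia's.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(1)
rows = []
for child in range(60):                                  # toy sample of 60 children
    grp = "EL1" if child < 20 else "ESL"                  # hypothetical group labels
    intercept = rng.normal(50, 5)
    slope = rng.normal(8 if grp == "ESL" else 6, 1)       # toy: ESL children grow faster
    for grade in range(1, 7):
        rows.append({"child": child, "group": grp, "time": grade - 1,
                     "reading": intercept + slope * (grade - 1) + rng.normal(0, 3)})
df = pd.DataFrame(rows)

# Random intercept and slope per child; the group x time interaction tests
# whether the two language groups grow at different rates.
model = smf.mixedlm("reading ~ time * group", data=df,
                    groups=df["child"], re_formula="~time")
print(model.fit().summary())
```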
Limbos (2005), in a similar study, explored the common misconception that assessment
of reading disabilities in ESL students must be delayed for two or three years until oral
language skills have developed. Using a battery of measures assessing various domains of
reading, memory and cognitive and oral language, the study collected data from 339 Grade
1 students (107 EL1 students and 232 ESL) and 253 Grade 3 students (80 EL1 and 173 ESL).
Confirmatory factor analysis corroborated the theoretical constructs of Phonology, Oral
Language, Verbal Memory and Reading at Grades 1 and 3. Structural equation modeling
was used to examine the relationship of these constructs to Grades 1 and 3 reading criteria
and supported the phonological-core deficit model of reading. Logistic regressions showed
that at both Grades 1 and 3, a composite Phonology score consistently made a significant,
unique contribution to the prediction of at-risk EL1 and ESL students. The data suggested
that it is possible to identify EL1 and ESL students who are highly or moderately at risk at
Grade 1. As mentioned above, the Limbos (2005) and Farnia (2006) studies are of critical
importance to Canada because both EL1 and ESL students were studied and compared
within the same classroom context. Such studies have important implications for teaching
and assessment in multilingual and multicultural classrooms and important impacts on policy
within the Canadian education context.
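As an illustration of the final analytic step described above, the following is a minimal sketch of a logistic regression predicting at-risk status from a phonology composite; the column names and data are hypothetical, not Limbos's measures.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(2)
n = 300                                                   # toy sample
df = pd.DataFrame({
    "phonology":     rng.normal(0, 1, n),                 # hypothetical composite score
    "oral_language": rng.normal(0, 1, n),
    "group":         rng.choice(["EL1", "ESL"], n),
})
# Toy outcome: lower phonology scores raise the odds of being at risk.
logit_p = -1.0 - 1.5 * df["phonology"] + 0.2 * df["oral_language"]
df["at_risk"] = (rng.random(n) < 1 / (1 + np.exp(-logit_p))).astype(int)

# Does the phonology composite make a unique contribution to predicting
# at-risk status once oral language and language group are accounted for?
model = smf.logit("at_risk ~ phonology + oral_language + C(group)", data=df)
print(model.fit().summary())
```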
test, and the third, the first official administration of the new test. This sequential and mixed
method design allowed the researcher to examine different aspects of the test's impact on
stakeholders and raters. This program of research is CRITICAL in that:
• It integrated social and political values into the test validation process, as suggested by Cheng (2008) and Moss et al. (2006);
• Test stakeholders, including the test-takers (teachers applying for certification) themselves, were not only consulted but also determined to a great extent the direction of the research program. Test-takers are increasingly being studied as they bear the highest stakes of all (Fox & Cheng 2007; Cheng & DeLuca 2011);
• The conflicting views and competing interests of the stakeholders were embraced. Similar results were found in previous studies (e.g. Fox 2003; Qi 2007).
Baker's (2010) three interrelated studies demonstrate a responsible and progressive approach to researching the assessment of language proficiency for professional certification, an area of language assessment that remains relatively under-researched.
Shih (2006), Tan (2009) and Wang (2010) examined, primarily using survey methods
(questionnaires and interviews), the impact and washback of testing programs elsewhere in
the world. Shih (2006) used a case study approach and investigated a range of stakeholders' perceptions of the Taiwanese General English Proficiency Test (GEPT) and its washback on schools' policies, teaching and learning of English. The GEPT is a test of English
language proficiency for Chinese learners of English at all levels and has gained societal
acceptance for educational and employment purposes in Taiwan. The study was conducted
at applied foreign language departments in two technological (vocational) institutes. The
principal research methods used were interviews with the departmental chair, teachers,
students and their immediate family members, and observations of English language courses.
Such research methods are often used in traditional washback studies. However, Shih's inclusion of family members as stakeholders is unusual in studies of this nature. His findings revealed several features of the GEPT's washback on teaching and learning within the research context. The dissertation concluded by proposing a washback model of students' learning; this will be very useful for future research into washback, as this area of empirical research is relatively new (about 20 years old) and is in need of both theoretical and methodological
guidance.
Wang (2010) explored the washback effects of the CET (College English Test) on teacher
beliefs, interpretations and practices, in particular how the teacher factor is manifested in
the washback phenomenon as proposed by Watanabe (2004). Wang (2010) also investigated
the pedagogical as well as the social and personal influences of the test on teachers' beliefs
and interpretations and practices within the Chinese tertiary context. Participants were 195
tertiary-level EFL teachers in non-English programs. The study asked whether tests are a major constraint on College English instructional innovation in China and which aspects of the teacher factor (e.g. teacher beliefs, knowledge or experiences) present a barrier
to the implementation of instructional change. A mixed methods approach of a teacher
survey and in-depth case studies was used to collect data. Qualitative analysis involved the
use of the constant comparative method, while quantitative analysis involved descriptive
and inferential statistics (e.g. exploratory factor analysis, confirmatory factor analysis and
structural equation modeling). The findings from this study suggest that the CET, coupled
with various interrelated components of the teacher factor, does indeed create a washback
effect. Given the complexities underlying the washback phenomenon, educational change in curriculum and assessment is not sufficient on its own to bring about teacher change in pedagogical strategies, as previously suggested by Cheng (2005) and Wall (2005). Wang
(2010) emphasized that for fundamental changes in teacher practice to occur, they must be
accompanied by other changes in teachers' knowledge, beliefs, attitudes and thinking that
inform such practice.
Tan's (2009) study, in a secondary setting in Malaysia, examined a change in the language
of instruction for mathematics and science subjects from Bahasa Malaysia to English in
Malaysia. This policy has two objectives: to promote student learning of mathematics and
science, and to increase student proficiency in English. The Education Ministry also chose to
create a washback effect by introducing a bilingual high-stakes secondary exit examination.
As is common in applied linguistics, a new test (a revised testing system) is introduced to bring
about changes: what is assessed becomes what is taught. This is an area that has been studied
by many researchers (e.g. Cheng 2005; Wall 2005). To examine the policy, the study drew on
the insights and perspectives offered by the literature on educational change, content-based
instruction and washback in language testing. This was a mixed methods study involving
a longitudinal and cross-sectional research design. The results point to the complexity of
educational change processes and indicate that the classroom implementation of this policy
is affected by multiple factors.
Mullen (2009) studied the impact of using a proficiency test as a placement tool. This
is not an impact and washback study in the traditional sense, in that the study focused on
the impact (area, extent and direction) of using a test for a purpose for which it was not
originally designed. In particular, the research questions asked if a standardized proficiency
test, the Test of English for International Communication (TOEIC), when used for placement
purposes, had an impact, defined as one of the six qualities in the model of Test Usefulness
(Bachman & Palmer 1996). Increasing numbers of such studies of language testing are being
carried out, as tests designed with one purpose are retrofitted for another (e.g. Fox 2009;
Doe 2011; Fox & Hartwick 2011). In Phase I of this sequential mixed methods evaluative
study, 15 teachers and 677 North American university students who were studying L2
English answered questionnaires about their experiences of taking the TOEIC. First, the
results showed that the group of 126 misplaced student participants differed significantly
from the 551 correctly placed students on the dimensions of college English experience,
age and first and second languages. The first of these was correlated most strongly with
misplacement. These differences were found to relate to the TOEIC and its practice of
predicting productive skill ability from formally measured receptive skill ability. Second,
teachers who held Master's degrees had significantly fewer student failures in their classrooms than teachers with Bachelor's degrees. In response to the study's finding that impact exists
in the form of student misplacement, in Phase II, teacher and student interviews focused on
their experiences related to misplacement. Five teachers, who held a Bachelor's or Master's degree, and 13 students, eight misplaced and five correctly placed, supported the findings of Phase I, showing that the practice of predicting productive ability from receptive ability
http://journals.cambridge.org
IP address: 134.117.10.200
resulted in misplacement for certain groups of students. In turn, this misplacement resulted
in three consequences for stakeholders, involving the apparent validity of the TOEIC for
making course-level changes, the misplaced students' willingness to learn, and the teachers'
decision to award pass grades to the misplaced. There is evidence in this study that when a
proficiency test is used for placement purposes, misplacement occurs with identifiable groups
of students and impacts all stakeholders. Studies of this nature are extremely important not
only for Canada but also for English-medium universities around the world, where a rapidly increasing number of international students choose to pursue their degrees in English. One
of their first steps is to take an entrance English test and be placed into English programs
using such test scores.
analytic scale; and that experienced raters exhibited a higher degree of agreement (26.1%)
than novice raters (20%).
Kim's (2010) study is extremely complex methodologically, weaving qualitative and quantitative approaches together across two phases with multiple embedded studies and five complex research questions. Like Barkaoui, she defines her approach as 'grounded on pragmatism' (p. 57) and explicitly acknowledges that mixed methods research can 'reduce biases and limitations inherent in a single method, while strengthening the validity of inquiry' (ibid.). Of course, one of the dangers of such complex studies is that they are more difficult
to control for coherence. In this regard, Barkaoui's study is more effective, but Isaacs (2010,
see below), who also uses a mixed method approach in her consideration of pronunciation in
speaking tests, utilizes the increasingly popular manuscript-based thesis option (Baker 2010),
which is particularly suited to complex, multi-step, mixed methods research, but remains
quite rare in Canadian doctoral research.
At the beginning of her dissertation, Isaacs explains that she has chosen to prepare a
manuscript-based thesis, 'comprised of a collection of three studies that are part of the same overall program of research, in lieu of a traditional thesis' (Isaacs 2010: iv). All the studies
were co-authored, with Isaacs as first author in each case, and one of the studies (Isaacs
& Trofimovich 2011) had been published by the time she completed her dissertation. The
advantages of such a manuscript approach in doctoral dissertation research are clear not
only in terms of the inherent coherence of the dissertation itself, but also with regard to
the next steps for doctoral graduates, who are enabled to apply for positions following their
degree completion with a clear program of research and a list of publications in peer-reviewed
journals.
Isaacs' (2010) program of research examined variation in raters' judgments of L2 pronunciation, an under-researched construct, which continues to be the 'EFL/ESL orphan' (Gilbert, cited in Isaacs 2010: 184). Specifically, she states that her purpose was to 'better understand major constructs in L2 pronunciation research, improve current measurement practice, and, ultimately, reinvigorate the conversation on L2 pronunciation assessment, which has been virtually absent from the research agenda since the time of Lado (1961)' (Isaacs 2010: 24).
In study one, with her co-researcher (Isaacs & Trofimovich 2011), Isaacs takes a particularly
novel approach, recruiting 30 music majors and 30 non-music majors to rate 40 L2 speech
samples for three key features of pronunciation: comprehensibility, accentedness and fluency.
With their musical experience, music majors are more attuned to SOUND qualities and have a heightened sense of the musicality, rhythm, flow, cadence and clarity of speech. Isaacs reports
that music majors assigned significantly lower ratings than non-music majors for accentedness,
particularly for low ability learners, but phonological memory and attention control did not
influence their ratings. In study two, Isaacs and her co-researcher investigate the effects of
scale length and rater experience on raters judgments of the same three features. The results
showed that experienced and novice raters achieved a high degree of consensus about the
highest and lowest scoring L2 speakers, but had difficulty differentiating between scale levels
in the absence of guidance from the rating instrument. Study three is the direct outcome of
studies one and two, as Isaacs confronts the key issue: 'what is the construct?' (Bachman 2007:
41), recognizing that the numerical scales used in studies one and two did not sufficiently
articulate what separated one scale point from another in terms of specific performance
features that raters should attend to. Like Barkaoui (2008) and Kim (2010), Isaacs attempts
to construct an empirically-driven L2 scale that builds into the scoring rubrics 'the qualities of the L2 performance that appear to be most salient to raters' (Isaacs 2010: 127).
class). Through methodical attention to detail he documents feedback events (within and
outside the classroom) that effect the socialization of these L2 learners. Ultimately, this
allows him to suggest how these processes unfold 'in both predictable and unpredictable fashions' (p. 156) within particular contexts. This leads him to propose a model of feedback across the dimensions of text, classroom and institution, which serves 'pedagogic, socialization, institutional and economic functions' (p. 147). Seror's study thus takes him
from a consideration of micro-level texts and individuals to discussions of macro-level social
and institutional contexts; in contrast, Neumann (2010) focused her research on what two
teachers of writing attended to when assessing the writing of 33 L2 students, studying in
an English-medium university in Quebec, and how these teachers' feedback impacts their students' learning.
The overall purpose of Neumann's complex, mixed method, multi-phase study was to
investigate how teachers of writing assess the grammatical ability of their university-level
L2 student writers. She began in phase one of her study with a statistical analysis of 33
students' essay exams, hand-tagging, coding and tallying features of their writing that have been linked in the research literature to grammatical ability (e.g. clauses, sentences and t-units, and morphological and syntactic errors). She then used multiple regression analysis
to determine which type of features, ACCURACY MEASURES (morphological errors/t-units) or
COMPLEXITY MEASURES (syntactic errors/t-units), best predicted the grammar grades assigned
by the teachers. Simultaneously, she applied systemic functional linguistic analysis (e.g. Eggins
2004) to assess the students' ability to manage information in their texts.
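To make the regression step concrete, the following is a minimal sketch of regressing teacher-assigned grammar grades on per-t-unit accuracy and complexity measures; the values are invented toy data, not Neumann's.

```python
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical data: one row per essay, with the two per-t-unit error measures
# and the grammar grade assigned by the teacher.
essays = pd.DataFrame({
    "morph_errors_per_tunit":  [0.10, 0.25, 0.05, 0.40, 0.15, 0.30, 0.20, 0.35],
    "syntax_errors_per_tunit": [0.20, 0.15, 0.10, 0.35, 0.05, 0.25, 0.30, 0.10],
    "grammar_grade":           [8.5, 6.0, 9.0, 4.5, 8.0, 5.5, 6.5, 7.0],
})

# Which set of measures better predicts the grade teachers assigned? Comparing
# this full model with accuracy-only and complexity-only models (e.g. by
# adjusted R^2) mirrors the question posed above.
model = smf.ols("grammar_grade ~ morph_errors_per_tunit + syntax_errors_per_tunit",
                data=essays).fit()
print(model.summary())
```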
In phase two, a questionnaire survey and interviews were conducted with the students
to evaluate the students' knowledge and understanding of their teachers' assessment criteria
for grammar. Finally, in phase three, she interviewed the two teachers about the criteria
they applied in assessing the grammatical ability of their students, and elicited their views
regarding priorities.
Neumann reports that grammatical accuracy was 'the single most important assessment criterion for [these] L2 writing teachers' (2010: 79). By including students in her study,
she is able to identify unintended consequences of the teachers' focus on accuracy in
writing, which, as she points out, inadvertently communicated a highly reductive construct
of grammatical ability in academic writing to their students, and ultimately had a negative
impact on the learning potential of these students. Like Seror (2008), Neumann finds a
gap between the teachers' intentions and the students' interpretation and use of feedback and concludes that 'The teachers in this study certainly took the responsibility of writing assessment seriously' (Neumann 2010: 180), but were largely unaware of 'the tensions and contradictions that resulted from the interplay...between the teachers' pedagogical and assessment goals (to improve students' linguistic accuracy in writing), and the resulting, albeit unintended [negative] washback effect (students' error avoidance strategies)' (ibid.).
She notes that the teachers aimed to reduce error through their feedback by attracting
their students' attention to what was incorrect in their writing. The students did become
aware, but instead of learning more about writing by taking risks and trying out new ways
of expressing their ideas, they avoided potentially negative comments on grammatical errors
by relying on structures that were already familiar to them.
Song (2007) also explores learners' perceptions, but from a phenomenological perspective
(Creswell 1998; Creswell & Plano-Clark 2007), although she does not identify her study as
phenomenological, and prefers to use the overarching term of NARRATIVE INQUIRY (Clandinin
& Connelly 2000) to describe her approach. She briefly discusses interpretive biography (Song
2007: 51) as another possible perspective on her method. However, from the beginning of
her study, her approach is consistent with phenomenological research. As Volkmann (cited in
Leedy 1997) points out, 'Attention to experience and intention to describe experience are the central qualities of phenomenological research' (Leedy 1997: 88). Or as Leedy observes in defining phenomenological research designs: 'the researcher often has personal experience with the phenomenon and aims to heighten his or her own awareness of the experience while simultaneously examining the experience through the eyes of other participants' (1997: 161).
Song, having been a Chinese ESL learner herself, frames her stories of six Chinese ESL
learners' perceptions of classroom assessment within her own story, 'lived and told' (Song 2007: 1). She weaves theories of classroom assessment into representation and interpretation, seeking to arrive at themes or patterns in her own experience and those of her participants that are 'invariable across all manifestations of the phenomenon' (Tesch 1994: 197). Her open-ended approach focuses on relationships as being 'at the heart of thinking narratively...[and] key to what it is that narrative inquirers do' (Song 2007: 176).
Song's 'research journey' (p. 181) leads her to conclude that perceptions of classroom assessment are embedded in curricular contexts and intimately related to second language acquisition. She seems to argue for what is self-evident: that a learner's perception of classroom assessment 'varies from person to person, and alters with the changing landscape of the classroom' (Song 2007: i). She begins and ends with recognitions that:
1. Assessment is a way of knowing, for both teachers and learners;
2. Assessment is power. Teachers have the power to impose assessment for a range of
pedagogical purposes, but students can also exercise power by assessing critically,
creatively, and reflectively (p. 105); and
3. Classroom assessment is a site of 'in-between' (between process and product; learner and teacher; learner and other learners; researcher and researched; theory and 'life or at least stories of life contained in the inquiry') (Clandinin & Connelly 2000, cited in Song 2007: 177).
Rather than exploring perceptions of assessment, Colby (2010) examines the role assessment
can play in supporting L2 acquisition. Specifically, she developed and implemented pedagogic
tasks in order to investigate their impact on the learning of L2 students in two intact, pre-university English for academic purposes (EAP) classes. She subsequently collected evidence
of the students' use of a grammatical feature (the use of WOULD and WILL in contingent
use contexts) as a result of their exposure to the pedagogic tasks. The tasks were designed
according to Assessment for Learning (AFL) principles (see Colby-Kelly & Turner 2007
for details), so that Colby could 'apply these AFL procedures in an L2 classroom setting, and...investigate their effect on learning' (Colby 2010: ii). The investigation of AFL-inspired
tasks and procedures and their impact on learning led in her dissertation to evidence of
the ASSESSMENT BRIDGE (AB), which she defines as 'the area of classroom practice linking assessment, teaching, and learning' (ibid.).
Citing Creswell & Plano-Clark (2007), Colby describes her research design as 'sequential, exploratory mixed methods' (2010: 59). Her research is quasi-experimental, because she is
working with two intact classes. She identifies a control and a treatment group, and through
the clever use of a split plot design manages to collect impressive evidence of the impact of
the AFL-inspired tasks on student learning over time.
Typically, a split plot project takes place over two time periods, between which the treatment
and control groups and their teachers reverse roles (Hatch & Lazaraton 1991). However,
Colby (2010) had to alter the design in order to take advantage of participant availability
and address constraints imposed by the operational requirements of the school where the
research took place. In the end, therefore, three teachers and four different classes participated
in the study, in two time periods, over a total of six weeks. Colby's (2010) research provides a
window on the challenges and benefits of longitudinal work in classroom-based research. Her
innovative approaches to meeting operational constraints with alternative research strategies
provide a model for field researchers undertaking longitudinal studies in situ. The pedagogic
tasks she develops will be of interest to other EAP teachers who are implementing AFL
principles in their teaching, as will the suggested teacher training strategies she applies.
Kwan rightly points out that there is little research on YLLs' early literacy development,
and little is known about the specific impact of phonics instruction in supporting it, or how
much phonics instruction is ideal. There has been concern that phonics instruction might
detract from an L2 child's acquisition of oral language proficiency. Since oral proficiency is
essential to a whole-language approach to reading (focusing on top-down meaning rather
than bottom-up sound–letter correspondence in learning to read), Kwan also examines the
impact of systematic phonics instruction on oral language proficiency. As mentioned above,
he does not find that phonics instruction impedes (nor does it enhance) oral proficiency. The
benefits of systematic phonics instruction in Kwan's study, for both L1 and L2 readers, quite
outweigh any concerns about the time it takes up.
Kwan acknowledges a problem in his research design, which confounds the timing of
instruction with the amount of instruction. He notes that systematic pre-testing of his
participants in the year prior to the study would have provided him with baseline data
enabling him to compare children who completed one year of systematic phonics instruction
in JK with children who completed one year in SK. Nor was Kwan able to consider the role
of language background of the L2 learners who spoke many different first languages. The
way the phonics program was implemented by each of the participating teachers and schools
(every day or every other day) was also an acknowledged limitation of the study (Kwan 2006:
147). However, quasi-experimental research designs at the level of the classroom always face
such challenges (as evidenced in Colby's (2010) research). The importance of Kwan's research is that it throws much-needed light on an under-researched area and provides valuable insights
into the early literacy requirements of ESL (and L1) readers.
Gunning (2011) combines a general questionnaire survey study with a case study in her
mixed method investigation of 138 sixth-grade students studying ESL in the French-speaking
province of Quebec. She notes the critical need for a study of strategy use by elementary
school children because the curriculum requires teachers to integrate strategy training and
assessment into their teaching (2011: 65) of ESL. She investigates the relationship between
strategy instruction (SI) by ESL teachers and strategy use by learners, and how they impact
learning. She finds that the children reported using mainly affective and compensatory
categories of strategies, such as asking for help and risk-taking (2011: i), although this varied
in relation to their level of proficiency in English. Children with high levels of proficiency in
English reported using more strategies (affective and cognitive). She also reports that whether
or not the children liked studying English had a significant influence on their strategy use.
The case study participants were drawn from survey study participants, who were divided
into two sub-groups in order to investigate the effects of SI on the students in-class strategy
use and to assess the impact of their strategy use on success in ESL oral interaction tasks.
Gunning describes how 'Two intact, similar groups of participants from two different schools served as a treatment group (n = 27) and a control group (n = 26) in the quasi-experimental part of the research' (ibid.). Like Colby (2010), Gunning devised pedagogic tasks, guided by
the principles of AFL, which were implemented at classroom level.
Gunning's mixed method study begins in phase 1 with the quantitative survey, followed
in phase 2 with a qualitative and quantitative investigation of strategy use at the classroom
level. Gunning identifies seven sources of evidence over the two phases of the study. Its
complexity is evident in her 59-page methodology section, which includes a description of
the development and validation of the instruments used in the study, including the strategy
questionnaire, task-based questionnaire, oral interaction measures, strategy log, field notes,
video recordings and interview protocols. The richness and thickness of her descriptive
detail is such that only an experienced researcher could control the quantity of data used.
She notes that 'The strategy assessment techniques and instruments, and the combination of a case study with a quasi-experimental component, contributed to a mixed methods analysis that allowed me to infer from the integration of the qualitative and quantitative data that the SI helped raise the children's consciousness about strategies, and that the children's strategy use facilitated their English oral interaction' (Gunning 2011: 240).
Although research with adult language learners still seems to dominate doctoral research
on teaching, learning and assessment, it is reassuring to see work like that of Kwan (2006),
Gunning (2011), Farnia (2006) and Limbos (2005) that systematically investigates the impact
of specific pedagogic approaches on the learning of YLLs. Gunning observes that classroom-based assessment's (CBA) time as a paradigm has arrived (2011: 238), and that research
is needed to investigate the specific characteristics of pedagogic tasks and procedures that
appear to support learning, and collect evidence that such tasks and procedures directly
impact learning. Kwan (2006), Gunning (2011) and Colby (2010) have contributed to this
long-overdue research agenda, while Seror (2008) and Neumann (2010) provide thoughtful
considerations of the role of feedback on L2 learning.
Baba (2007) argues that much more work is needed to fully develop the construct of
vocabulary knowledge as it is operationalized in tests, and to better understand the relationship
between vocabulary knowledge and writing performance. She found evidence that lexical proficiency was 'crucial' (p. 157) for the students in her study, although not every
dimension of lexical proficiency was equally related to writing performance. She argues for
continued research on the role of tasks in relation to contexts in eliciting writing, more probing
interviews with writers to look in greater depth at their accounts of how and why they use
the words they do, and longitudinal research to trace the development of their vocabulary
knowledge over time.
Her calls for more longitudinal research into lexical richness in writing are answered in part
by the work of Douglas (2010), who investigated vocabulary knowledge as it was evidenced
in the lexical richness of novice native speaker (NS) and non-native English speaker (NNES)
undergraduate writers on a test of writing, and related initial measures of lexical richness to
academic outcomes. He found that vocabulary knowledge 'appears as an underlying variable related to students' academic outcomes at university' (p. 206).
Douglas's quantitative study drew on a sample of 745 students (561 NS students and
184 NNES students) who were required to take a writing test upon entry to undergraduate
programs at the university where the study took place. These students allowed their tests to be
used anonymously for research purposes and filled in an information form, which provided the
researcher with demographic information and a means of linking their initial test performance
to university course outcomes. He used a corpus linguistic approach and lexical analysis tools
available online to investigate the degree of lexical richness in the students written tests. A
number of tools were used to analyze the test material for indices of lexical richness, based on
a corpus developed by the researcher from the written entry test and from responses to a provincial test written at the end of Grade 12 as a high school matriculation requirement. Academic
outcome measures were derived through detailed analysis of university transcripts.
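For illustration, the following is a minimal sketch of two simple lexical richness indices of the general kind referred to above (a type-token ratio and the proportion of tokens beyond a toy high-frequency list); it does not reproduce the online tools Douglas used.

```python
import re

# Toy high-frequency list; real profiling tools use much larger frequency bands.
HIGH_FREQ = {"the", "a", "of", "to", "and", "in", "is", "that", "it", "was"}

def lexical_richness(text: str) -> dict:
    """Return two crude lexical richness indices for a piece of writing."""
    tokens = re.findall(r"[a-z']+", text.lower())
    if not tokens:
        return {"ttr": 0.0, "beyond_high_freq": 0.0}
    types = set(tokens)
    return {
        "ttr": len(types) / len(tokens),                                    # type-token ratio
        "beyond_high_freq": sum(t not in HIGH_FREQ for t in tokens) / len(tokens),
    }

print(lexical_richness("The analysis of the essays was based on the corpus of written tests."))
```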
Like Baba (2007), Douglas found evidence of the importance of vocabulary knowledge,
lexical depth and breadth in eventual academic success at university. He found that NNESs
who do not have the depth or breadth of vocabulary knowledge of their NS peers often persist (Fox 2005) in their academic study in spite of many failures and setbacks. As Douglas points
out, however, 'While [these] NNES students eventually graduated from university...they were faced with diminished academic outcomes in terms of Grade Point Averages, Length of Program, Courses Attempted and Not Earned, and Academic Standing' (Douglas 2010: ii). Hierarchical regression analysis found that lexical richness upon entry to university was
a significant predictor of academic outcomes. In this regard, Cervatiuc's (2007) research is
also particularly important. Her research was aimed at obtaining insight into the factors
that account for high levels of lexical proficiency amongst highly successful academic or
professional adult NNES.
The 20 participants in Cervatiuc's study had arrived in Canada after the age of 18, scored
at native-like levels on tests of vocabulary size (i.e. from 13,500 to 20,000 base words) and
had vocabulary profiles consistent with native-like writing standards (Morris & Cobb 2004).
She found that both situational and linguistic factors had an impact on their acquisition
of vocabulary, but 'the two underlying forces that drive their success and activate a unique combination of situational factors, L2 input, individual differences, and learning strategies
are AWARENESS of inner and outer resources and WILLPOWER to consistently employ these resources in order to make language and vocabulary gains' (Cervatiuc 2007: iv).
Baba's (2007) focus on metacognitive strategy use in her Japanese EFL writers and Douglas's (2010) findings regarding PERSISTENCE support Cervatiuc's (2007) conclusions
regarding her 20 highly proficient NNESs. Drawing on a vocabulary size test, language
proficiency assessments, interviews, questionnaires and samples of her participants' writing,
she cross-analyzes or triangulates findings, guided by principles of grounded theory (Glaser
& Strauss 1967; Hutchinson 1997).
Although she does not describe it as such, Cervatiuc's study employs a mixed method
approach with both quantitative and qualitative phases in her research design. In her
own view, her study is mainly interpretative (qualitative dominant) because 'meaning is quintessential in interpretative research (Bogdan & Bicklen 1998) which focuses on participants' perspectives or perceptions, values and attitudes' (Cervatiuc 2007: 73). Taking
into account her participants' perceptions of how they acquired, used and developed their
lexical proficiency allows her to develop rich profiles of the 20 exceptionally successful L2
learners in her study and provides the initial groundwork for a theory of successful L2
vocabulary acquisition.
[Table: the doctoral dissertations reviewed, classified by methodological orientation (e.g. quantitative, mixed methods): Song (2007), Fleming (2007), Sterzuk (2007), Seror (2008), Limbos (2005), Farnia (2006), Kwan (2006), Douglas (2010), Abbott (2005), Shih (2006), Gao (2007), Baba (2007), Cervatiuc (2007), Barkaoui (2008), Mullen (2009), Tan (2009), Baker (2010), Colby (2010), Isaacs (2010), Kim (2010), Neumann (2010), Wang (2010), Zheng (2010), Gunning (2011); 16 mixed methods studies; Total: 24.]
Sterzuk 2007; Tan 2009). There was also a balance between studies conducted in Canada
(e.g. Abbott 2007 and Mullen 2009) and elsewhere in the world (e.g. China and Malaysia).
Indeed, these studies reflect the social composition of Canadian society and two trends: the
increasing number of international students on university campuses and English language
learners at the school level.
The breadth of the literature that these researchers drew on to inform their work was
particularly impressive, crossing disciplinary, epistemological and methodological boundaries.
The depth of their inquiry was evident in their management of complex research designs
and analyses. As a collection, breadth lent vigor and strength to the doctoral research
considered here, while depth ensured its quality. Taken together, these researchers have made
an important contribution to scholarship in language testing and assessment in Canada and
internationally.
What are the directions of future doctoral research on language assessment in Canada and
internationally? First of all, we expect more research to be conducted at the intersection of
assessment, teaching and learning, where testing is not viewed or researched in isolation, but
rather embedded within the context where it occurs. We also predict that an increasing
number of studies will explore the central role that assessment plays in teaching and
learning, as in classroom-based assessment, for example. This type of research will be more
interdisciplinary, contextually based, and involve multiple stakeholders. Methodologically,
we will continue to see more studies with mixed methods, multi-methods and multi-phases: employing, in a sense, a 'program of research' approach. We will see more use of a (multiple) manuscript format, which enables doctoral researchers to firmly establish a
program of research, become actively involved with the dissemination of research findings,
and publish during the course of their doctoral study. This format allows for a more
mentored, collaborative, and team-based approach to doctoral study and research, and
a more comprehensive preparation of doctoral researchers for a wide range of academic
and research careers. It is in these doctoral research studies that we see the future of our
field.
References
Abbott, M. L. (2005). English reading strategies: Differences in Arabic and Mandarin speaker performance on the CLBA reading assessment (doctoral dissertation). Retrieved from Theses Canada (32659077).
Abbott, M. L. (2007). A confirmatory approach to differential item functioning on an ESL reading assessment. Language Testing 24.1, 1–30.
Alderson, C. (2007). The challenge of (diagnostic) testing: Do we know what we are measuring? In J. Fox, M. Wesche, D. Bayliss, L. Cheng, C. Turner & C. Doe (eds.), Language testing reconsidered. Ottawa, ON: University of Ottawa Press, 21–39.
Baba, K. (2007). Dimensions of lexical proficiency in writing summaries for an English as a foreign language test
(doctoral dissertation). Retrieved from Theses Canada (33748472).
Bachman, L. F. (2000). Modern language testing at the turn of the century: Assuring that what we count counts. Language Testing 17.1, 1–42.
Bachman, L. F. (2007). What is the construct? The dialectic of abilities and contexts in defining constructs in language assessment. In J. Fox, M. Wesche, D. Bayliss, L. Cheng, C. Turner & C. Doe (eds.), Language testing reconsidered. Ottawa, ON: University of Ottawa Press, 41–71.
Bachman, L. F. & A. S. Palmer (1996). Language testing in practice. Oxford: Oxford University Press.
Baker, B. A. (2010). In the service of the stakeholder: A critical, mixed methods program of research in high-stakes
language assessment (doctoral dissertation). Retrieved from ProQuest (NR74366).
Barkaoui, K. (2007). Participants, texts, and processes in second language writing assessment: A narrative review of the literature. The Canadian Modern Language Review 64, 97–132.
Barkaoui, K. (2008). Effects of scoring method and rater experience on ESL essay rating processes and outcomes
(doctoral dissertation). Retrieved from Theses Canada (35096546).
Bogdan, R. C. & S. K. Bicklen (1998). Qualitative research in education. Boston, MA: Allyn and Bacon.
Brumfit, C. (1997). How applied linguistics is the same as any other science. International Journal of Applied Linguistics 7.1, 86–94.
Cervatiuc, A. (2007). Highly proficient adult non-native English speakers' perceptions of their second language vocabulary learning process (doctoral dissertation). Retrieved from ProQuest (NR33791).
Cheng, L. (2005). Changing language teaching through language testing: A washback study. Cambridge: Cambridge University Press.
Cheng, L. (2008). Washback, impact and consequences. In E. Shohamy & N. H. Hornberger (eds.), Encyclopedia of language and education, Vol. 7: Language testing and assessment. New York: Springer, 349–364.
Cheng, L. & C. DeLuca (2011). Voices from test-takers: Further evidence for test validation and test use. Educational Assessment 16.2, 104–122.
Cheng, L., Y. Watanabe & A. Curtis (eds.) (2004). Washback in language testing: Research contexts and methods.
Mahwah, NJ: Lawrence Erlbaum.
Clandinin, J. & M. Connelly (2000). Narrative inquiry: Experience and story in qualitative research. San Francisco,
CA: Jossey-Bass.
Colby-Kelly, C. & C. Turner (2007). AFL research in the L2 classroom and evidence of usefulness: Taking formative assessment to the next level. Canadian Modern Language Review 64.1, 9–37.
Colby, D. C. (2010). Using Assessment for Learning practices with pre-university level ESL students: A mixed methods study of teacher and student performance and beliefs (doctoral dissertation). Retrieved from ProQuest (NR61979).
Creswell, J. W. (1998). Qualitative inquiry and research design: Choosing among five traditions. Thousand Oaks,
CA: Sage.
Creswell, J. W. & V. L. Plano-Clark (2007). Designing and conducting mixed methods research. Thousand Oaks,
CA: Sage.
Cumming, A. (1990). Expertise in evaluating second language compositions. Language Testing 7, 31–51.
Doe, C. (2011). The integration of diagnostic assessment into classroom instruction. In D. Tsagari & I. Csepes (eds.), Classroom-based language assessment: Language testing and evaluation. Frankfurt: Peter Lang, 63–76.
Douglas, S. R. (2010). Non-native English speaking students at university: Lexical richness and academic success
(doctoral dissertation). Retrieved from ProQuest (NR69496).
Eggins, S. (2004). An introduction to systemic functional linguistics (2nd edn). New York: Continuum.
Farnia, F. (2006). Modeling growth in reading fluency and reading comprehension in EL1 and ESL children: A
longitudinal individual growth curve analysis from first to sixth grade (doctoral dissertation). Retrieved from
Theses Canada (33265070).
Fleming, D. J. (2007). Becoming Canadian: Punjabi ESL learners, national language policy and the Canadian
language benchmarks (doctoral dissertation). Retrieved from Theses Canada (33664937).
Fox, J. (2003). From products to process: An ecological approach to bias detection. International Journal of Testing 3.1, 21–48.
Fox, J. (2005). Rethinking second language acquisition requirements: Problems with language-residency criteria and the need for language assessment and support. Language Assessment Quarterly 2.2, 85–115.
Fox, J. (2009). Moderating top-down policy impact and supporting EAP curricular renewal: Exploring the potential of diagnostic assessment. Journal of English for Academic Purposes 8, 26–42.
Fox, J. & L. Cheng (2007). Did we take the same test? Differing accounts of the Ontario Secondary School Literacy Test by first and second language test takers. Assessment in Education: Principles, Policy & Practice 14.1, 9–26.
Fox, J. & P. Hartwick (2011). Taking a diagnostic turn: Reinventing the portfolio in EAP classrooms. In D. Tsagari & I. Csepes (eds.), Classroom-based language assessment. Frankfurt: Peter Lang, 47–61.
Fulcher, G. (1996). Does thick description lead to smart tests? A data-based approach to rating scale construction. Language Testing 13.2, 208–238.
Gao, L. (2007). Cognitive-psychometric modeling of the MELAB reading items (doctoral dissertation). Retrieved
from Theses Canada (33905480).
Gao, L. & W. T. Rogers (2011). Use of tree-based regression in the analyses of L2 reading test items. Language Testing 28, 77–104.
Glaser, B. G. & A. L. Strauss (1967). The discovery of grounded theory. Chicago, IL: Aldine.
Grabe, W. (2009). Reading in a second language: Moving from theory to practice. New York: Cambridge
University Press.
Grabe, W. & F. L. Stoller (2011). Teaching and researching reading. Harlow, UK: Pearson Education Limited.
Greene, J. C., V. J. Caracelli & W. F. Graham (1989). Toward a conceptual framework for mixed-method evaluation designs. Educational Evaluation and Policy Analysis 11, 255–274.
Gunning, P. (2011). ESL strategy use and instruction at the elementary school level: A mixed methods investigation
(doctoral dissertation). Retrieved from ProQuest (NR77521).
Hamp-Lyons, L. (1990). Second language writing: Assessment issues. In B. Kroll (ed.), Second language writing: Research insights for the classroom. Cambridge: Cambridge University Press, 69–87.
Hamp-Lyons, L. (1995). Rating non-native writing: The trouble with holistic scoring. TESOL Quarterly 29, 759–762.
Hatch, E. & A. Lazaraton (1991). The research manual: Design and statistics for Applied Linguistics. Rowley, MA: Newbury House.
Hutchinson, S. A. (1997). Education and grounded theory. In R. Sherman & R. Webb (eds.), Qualitative research in education: Focus and methods. Philadelphia, PA: Falmer Press, 123–140.
Isaacs, T. (2010). Towards defining a valid assessment criterion of pronunciation proficiency in non-native English-speaking graduate students (doctoral dissertation). Retrieved from ProQuest (MR24877).
Isaacs, T. & P. Trofimovich (2011). Phonological memory, attention control, and musical ability: Effects of individual differences on rater judgments of L2 speech. Applied Psycholinguistics 32, 113–140.
Ishii, D. N. (2009). Language dia-logs: A collaborative approach for providing effective feedback on ESL learners' verb errors in writing (doctoral dissertation). Retrieved from Theses Canada (37943444).
Kane, M. T. (2002). Validating high-stakes testing programs. Educational Measurement: Issues and Practice 21.1, 31–41.
Kim, Y. (2010). An argument-based validity inquiry into the empirically-derived descriptor-based diagnostic assessment
in ESL academic writing. Unpublished doctoral dissertation. University of Toronto.
Kwan, A. B. (2005). Impact of systemic phonics instruction on young children learning English as a second language
(doctoral dissertation). Retrieved from Theses Canada (32659359).
Lado, R. (1961). Language testing: The construction and use of foreign language tests. London: Longman.
Leedy, P. (1997). Practical research: Planning and design (6th edn). Upper Saddle River, NJ: Prentice Hall.
Limbos, M. (2005). Early identification of second-language students at risk for reading disability (doctoral dissertation). Retrieved from Theses Canada (32659383).
McKay, P. (2006). Assessing young language learners. Cambridge: Cambridge University Press.
McNamara, T. & C. Roever (2006). Language testing: The social dimension. Malden, MA: Blackwell
Publishing.
Messick, S. (1989). Validity. In R. L. Linn (ed.), Educational measurement (3rd edn). New York: Macmillan, 13–103.
Messick, S. (1996). Validity and washback in language testing. Language Testing 13, 243–256.
Mislevy, R. J., L. S. Steinberg & R. C. Almond (2003). On the structure of assessment arguments. Measurement: Interdisciplinary Research and Perspectives 1.1, 3–62.
Morris, L. & T. Cobb (2004). Vocabulary profiles as predictors of TESL student performance. System 32.1, 75–87.
Moss, P. A., B. J. Girard & L. C. Haniford (2006). Validity in educational assessment. Review of Research in Education 30, 109–162.
Mullen, A. (2009). The impact of using a proficiency test as a placement tool: The case of Test of English for International Communication (TOEIC) (doctoral dissertation). Université Laval, Quebec, Canada.
Neumann, H. (2010). What's in a grade? A mixed methods investigation of teacher assessment of grammatical ability in L2 academic writing (doctoral dissertation). Retrieved from ProQuest (NR77532).
Qi, L. (2007). Is testing an efficient agent for pedagogical change? Examining the intended washback of the writing task in a high-stakes English test in China. Assessment in Education 14.1, 51–74.
Rampton, B. (1997). Retuning in applied linguistics. International Journal of Applied Linguistics 7.1, 3–25.
Samson, M. (2012). What applied linguists do: An investigation of research practices in the field (Master's research essay). Carleton University, Ottawa, Canada.
Seror, J. (2008). Socialization in the margins: Second language writers and feedback practices in university content
courses (doctoral dissertation). University of British Columbia, Canada.
Shih, C. M. (2006). Perceptions of the General English Proficiency Test and its washback: A case study of two Taiwan
technological institutes (doctoral dissertation). Retrieved from ProQuest (NR16000).
Shohamy, E. & T. McNamara (2009). Language tests for citizenship, immigration, and asylum. Language Assessment Quarterly: An International Journal 6, 1–5.
Song, Y. H. (2007). A narrative inquiry into classroom assessment: Stories of six Chinese adult learners of English as
a second language (doctoral dissertation). Retrieved from Theses Canada (33969044).
Sterzuk, A. (2007). Dialect speakers, academic achievement, and power: First Nations and Métis children in Standard English classrooms (doctoral dissertation). Retrieved from Theses Canada (34491819).
Suzuki, W. (2009). Languaging, direct correction, and second language writing: Japanese university students of English
(doctoral dissertation). Retrieved from Theses Canada (37943556).
Tan, H. M. (2009). Changing the language of instruction of mathematics and science in Malaysia: The PPSMI policy
and washback effect of bilingual high-stakes secondary school exit exams (doctoral dissertation). Retrieved from
Theses Canada (39290869).
Teddlie, C. & A. Tashakkori (2009). Foundations of mixed methods research: Integrating quantitative and qualitative
approaches in the social and behavioral sciences. Thousand Oaks, CA: Sage.
Tesch, R. (1994). The contribution of a qualitative method: Phenomenological research. In M.
Langenbach, C. Vaughan & L. Aagaard (eds.), An introduction to educational research. Needham Heights,
MA: Allyn & Bacon, 143157.
Turner, C. & J. Upshur (2002). Rating scales derived from student samples: Effects of the scale maker and student sample on scale content and student scores. TESOL Quarterly 36.1, 49–70.
Wakamoto, N. (2007). The impact of extroversion/introversion and associated learner strategies on English
language comprehension in a Japanese EFL setting (doctoral dissertation). Retrieved from Theses Canada
(33748547).
Wall, D. (2005). The impact of high-stakes examinations on classroom teaching: A case study using insights from testing
and innovation theory. Cambridge, UK: Cambridge University Press.
Wang, J. (2010). A study of the role of the teacher factor in washback (doctoral dissertation). Retrieved from
ProQuest (NR74872).
Watanabe, Y. (2004). Teacher factors mediating washback. In L. Cheng, Y. Watanabe & A. Curtis (eds.), Washback in language testing: Research contexts and methods. Mahwah, NJ: Lawrence Erlbaum, 129–146.
Weigle, S. C. (1994). Effects of training on raters of ESL compositions. Language Testing 11, 197–223.
Widdowson, H. G. (1998). Retuning, calling the tune, and paying the piper: A reaction to Rampton. International Journal of Applied Linguistics 8.1, 147–151.
Yang, Y. (2008). Corrective feedback and Chinese learners' acquisition of English past tense (doctoral dissertation). Retrieved from Theses Canada (38060335).
Zheng, Y. (2010). Chinese university students' motivation, anxiety, global awareness, linguistic confidence, and English test performance: A causal and correlational investigation (doctoral dissertation). Retrieved from Theses Canada (39291111).
LIYING CHENG Ph.D. is Professor at the Faculty of Education, Queen's University, Kingston, Ontario,
Canada. Her research interests are the impact of large-scale testing on instruction, the relationships
between assessment and instruction, and the academic and professional acculturation of international
and immigrant students, workers and professionals to Canada. Her recent books are English language
assessment and the Chinese learner (co-edited with A. Curtis, Taylor & Francis, 2010); Language testing
reconsidered (co-edited with J. Fox et al., University of Ottawa Press, 2007); Changing language teaching
through language testing (single-authored, Cambridge University Press, 2005); and Washback in language
testing: Research contexts and methods (co-edited with Y. Watanabe and A. Curtis, Lawrence Erlbaum,
2004).
JANNA FOX Ph.D. is Associate Professor in the School of Linguistics and Language Studies, Carleton
University, Ottawa, Ontario, Canada. Her research interests include language testing and assessment,
the development of academic literacy within and across university disciplines, the scholarship of
teaching in linguistically and culturally diverse contexts, and the interplay between language policy,
curricula, assessment, and stakeholder impact. Her recent work in language testing and assessment is
published in Language Testing, Language Assessment Quarterly and Educational Measurement: Issues and Practice.
Her work on academic literacies is published in Written Communication, Multimodal Communication and
the Journal of English for Academic Purposes. In 2012, she received (with Natasha Artemeva) the College
Composition and Communication Award for Best Article on Pedagogy or Curriculum in Technical
or Scientific Communication (for 'Awareness versus production: Probing students' antecedent genre knowledge', in the October 2010 issue of the Journal of Business and Technical Communication).