Abstract
The research topic was external assessment for art and design at pre-university level
(age 17+). The research field was focused on formal assessment in art education,
especially on the visual arts at the end of secondary schooling (ages 17-18). In the
literature terms such as 'art', 'arts' and 'art and design' are used in the context of art
education. Throughout the study such terms were used interchangeably according to the
context. The principal audience for this study included key stakeholders in the
national examinations in Portugal - government, government agencies, university
admissions systems, and art teachers. It may also be relevant to the international
community of researchers in art and arts education and related subjects, and to
specialists in arts assessment in general. The study was undertaken in two stages: (1)
analysis of current art examinations at pre-university level in Portugal and England
and (2) the development of a conceptual framework for external assessment in art
education in general, and a pilot and trial of a new instrument and assessment
procedures for art examinations in Portugal. The first stage included three sections:
(1) The literature on assessment was reviewed and current problems in assessing the
arts were identified, including issues of quality such as reliability, validity, impact
and practicality. It was established that valid and reliable assessment in art education
required assessment instruments with clear instructions and criteria that allow
multiple forms of evidence to be considered, and that such assessment procedures should
include provision for in-service teacher training, standardisation and moderation. A
conceptual framework was developed for evaluating art examinations. (2) The
current Portuguese system of art examinations (age 18) was critiqued. Analysis of
relevant documentation and questionnaires conducted with Portuguese art teachers
and art students enabled identification of key issues such as the lack of validity and
reliability in the system, as well as collecting proposals for alternative improved
forms of assessment. (3) The English system of art and design examinations (age
17+) was partially reviewed through direct observation of one example of
examination procedures, analysis of documents and recording of the views of
selected stakeholders. The second stage also included two sections that described and
evaluated: (1) the design and piloting of a new framework for external assessment of
art education in Portugal based on a portfolio assessment extended task, in-service
teacher training and standardisation procedures; (2) the trial of a new external
assessment instrument and procedures in five schools in Portugal.
The new assessment instruments and procedures were compared with the current
Portuguese art examination, establishing that greater validity and reliability were
achievable. The study concluded with discussion of the negative and positive aspects
of the proposed conceptual framework, and the implications for assessment stakeholders
in Portugal, including the need for reform in Portuguese art education. Further
research into alternative forms and methods of assessment was also recommended.
Keywords: Art Education; Arts; Art and Design; Art Examinations; Assessment and
Evaluation; Authentic Assessment; Portfolio Assessment.
List of Contents
Pages
Abstract i
List of Contents ii
List of Figures ix
List of Tables x
Acknowledgements xii
Introduction 1
Summary 49
2.1 Reliability 51
2.1.1. Problems of reliability in art examinations 52
2.1.1.1. Instrument 53
2.1.1.2. Assessor and examiner training 54
2.1.1.3. Judgements and moderation 54
2.1.1.4. Threats to reliability summarised 55
2.2. Validity 55
2.2.1. Some types of validity 56
2.2.1.1. Content validation 57
2.2.1.2. Face validation 57
2.2.1.3. Response validation 58
2.2.1.4. Washback validation 58
2.2.1.5. Criterion-related validation 59
2.2.1.6. Validation related to examination bias 60
2.2.1.7. Construct validation 61
2.2.2. Problems of validity in art examinations 62
2.2.2.1. Assessment instrument 62
2.2.2.2. The examination underlying model 63
2.3. Impact 65
2.3.1. Effects on society and individuals 66
2.3.2 Effects on students 67
2.3.3. Effects on teachers and teaching methods 67
2.3.4. Effects on schools 68
2.3.5. Effects on standards and curriculum 69
2.3.6. Effects on higher education 69
2.3.7. Ethical considerations 70
2.3.8. Evaluating the impact of examinations 70
2.4. Practicality 71
2.5. Conceptual framework for evaluating art examinations 72
Summary 75
3.4.2. Quantitative data 90
3.4.2.1. Questionnaires 90
3.4.2.2. Other data 93
3.5. Data analysis 93
3.5.1. Document analysis 93
3.5.2. Analysis of observation notes 94
3.5.3. Analysis of interviews 94
3.5.4. Descriptive statistics 98
3.5.5. Multi-faceted analysis 98
3.6. Reliability of data 101
3.7. Triangulation 102
3.8. Ethical considerations 103
Summary 105
5.4.1. The role of the QCA 164
5.4.2. The structure of awarding bodies 164
5.4.3. A modular structure of courses and examinations 166
5.4.4. Scheme of assessment 169
5.4.5. The specifications 169
5.4.6. Rationales for the specifications 170
5.4.7. Instructions for the conduct of examinations 172
5.4.8. Assessment instruments 176
5.4.8.1. Coursework 176
5.4.8.2. The ‘controlled test’ and question papers 176
5.4.8.3. Type and format of the assessment instruments 177
5.4.8.4. Process orientated instrument 179
5.4.8.5. Bias 181
5.4.8.6. Gender bias 181
5.4.8.7. Bias related to non-English students 182
5.4.8.8. Bias related to geographical location and economic background 182
5.4.8.9. Teachers’ influence upon students’ achievement 183
5.5. Assessment criteria 184
5.5.1. Mark schemes 186
5.5.2. Criterion referenced marking 188
5.6. Assessment procedures 191
5.6.1. In-service teacher training 191
5.6.2. Internal marking and internal moderation 192
5.6.3. Standardisation 194
5.6.3.1. Observation of standardisation procedures 194
5.6.4. Moderation 198
5.7. Conclusions in terms of validity summarised 199
5.7.1. Weaknesses 199
5.7.2. Strengths 200
5.8. Conclusions in terms of reliability summarised 201
5.8.1. Weaknesses 201
5.8.2. Strengths 201
5.9. Practicality 202
5.10. Impact 203
5.11. Suggestions for art and design examinations 205
Summary 206
6.2.2.3. Self-evaluation skills 212
6.2.3. Defining a content framework for the new art and design external assessment 213
6.2.3.1. Kinds of evidence 215
6.2.3.2. Portfolio 215
6.2.3.3. Portfolio evidence 215
6.2.3.4. Assessment criteria 217
6.2.3.5. Assessment procedures 217
6.3. Piloting the new assessment instrument 218
6.3.1. Pilot sample (trial 1) 219
6.3.2. Place, time and programme 220
6.3.3. Feedback from the pilot 221
6.3.3.1. Underlying model 223
6.3.3.2. Format 223
6.3.3.3. Bias 229
6.3.3.4. Tasks 230
6.3.3.5. Time and resources constraints 232
6.3.3.6. Criteria and weightings 234
6.3.3.7. Impact 235
6.3.3.8. Assessment procedures 236
6.4. Suggestions for improvement 238
Summary 239
Chapter 7: Main Study: Trial of the new external assessment instrument and procedures in five schools in Portugal 240
7.3.8. Criteria and weightings 268
7.3.9. Bias 270
7.3.10. Assessment procedures 273
7.3.11. Consequences/Impact 274
7.3.11.1. Results 274
7.3.11.2. Effects upon students 275
7.3.11.3. Effects upon teachers 275
7.3.11.4. Effects upon schools and curriculum 277
7.3.12. Reliability of results 278
7.4. Comparing the current MTEP examination and new model 279
7.4.1. Validities 279
7.4.2. Reliability 279
7.4.2.1. Summary of the FACETS 281
7.4.2.2. Reliability index 284
7.4.2.3. Raters' measurement reports 284
7.4.2.4. Probability curves 288
7.4.2.5. Expected ogives 290
7.4.3. Consequences/impact 292
7.4.4. Practicalities 292
Summary 293
Glossary of terms 325
Bibliography 334
Appendices (Volume 2)
List of Figures
Figures Page
List of Tables
Tables Page
Table 45: Responses to questions about weightings 270
Table 46: Responses to questions about bias 271
Table 47: Responses about assessment procedures 273
Table 48: Responses about consistency of marking 274
Table 49: Responses about impact 277
Table 50: Comparing inter- and intra-rater reliability between the current examination and the new assessment instrument and procedures 280
Table 51: Measurable data summary: current examination 283
Table 52: Measurable data summary: trial 1 283
Table 53: Measurable data summary: trial 2 283
Table 54: Current examination raters' measurement report 286
Table 55: Trial 1 raters' measurement report 286
Table 56: Trial 2 raters' measurement report 287
Table 57: Current examination probability curves 289
Table 58: Trial 1 probability curves 289
Table 59: Trial 2 probability curves 290
Table 60: Current examination expected score ogive 291
Table 61: Trial 1 expected score ogive 291
Table 62: Trial 2 expected score ogive 292
Acknowledgements
Particular thanks are due to the Portuguese art teachers and students who took part in
the research and to my fellow students in the Centre for Art Education and
International Research and the Centre for Research in Testing, Evaluation and
Curriculum at the University of Surrey Roehampton for their lively discussions and
companionship.
And finally my thanks are due to my family and friends for their support and
encouragement.
Introduction
This chapter establishes the research problem by defining issues and current
The field of 'art education' can be elusive to define with precision because a variety
of terms are used in the literature, often interchangeably. For example, in the English
National Curriculum the term 'Art' was used until 2000 when after much lobbying
the subject title was changed to 'Art and Design'. Despite this change the National
Curriculum sub-text always stated that the subject title included 'Art, Craft and
Design'. The English awarding bodies have also used 'Art' and 'Art and Design
Education', although the term 'Visual Culture Education' is becoming more common:
it is debatable whether these terms are synonymous. In Portugal the terms: ‘arte’,
‘artes’, ‘artes visuais’, ‘arte e design’, ‘educação artística’ are used and these have
been translated in this thesis as ‘art’. When the term 'Arts Education' is used it covers
a range of visual and performing arts including for example, dance, drama and music
education. Where any of these terms are used within quotations in this thesis they
are the choice of the original author. To avoid confusion, discussion of these quotes
uses the same terms as the original unless there is an obvious reason not to do so.
(1994, p. 271), assessment should be closely integrated with the curriculum. Its
principal functions include diagnosing the causes of the learner's success or failure,
providing motivation for the learner, providing valid and meaningful accounts of
what has been achieved, and offering a means of evaluating teaching programmes.
Assessment can be used to provide feedback and motivation for teachers, parents and
instrument for ranking and selecting students (Eisner, 1998; Robinson, 1982; Gipps
consequences for students because it is one of the indicators used to select them for
systems. Different procedures and instruments are used in different countries, often
Assessment in the context of this research refers to the process of arriving at a view
reports or statements which result from that process, and from broader evaluation of
activities such as courses, programmes or some other larger aspect of the educational
process (Eisner, 1985). Examinations and tests are techniques used to assess students'
In art education it is usual to find that multiple viewpoints and value positions are
celebrated (Boughton, 1996, p.77). Boughton (1996) raised several problems for
educators who have to judge student art works, particularly at the senior school level.
For example:
Modernist and postmodernist views are two different underlying paradigms, which
may problematise the achievement of consensus about the quality of art works. In a
and instrumentalist justifications are proposed for art, using criteria emphasising
critical analysis and questioning, and accepts pluralist narratives of art (Efland,
Different viewpoints about art education present problems for assessment. For
modernist perspective; and copies and reinterpretations of art and design exemplars
criteria for judging drawing performance such as faithful copying of reality may not
In a modernist paradigm, events of judging artworks were often limited to formalist
criteria. In the postmodern paradigm judgements are more complex; they always
paradigms in art education is not yet resolved at the time of this research; however, it
particular problems.
On the one hand, dynamic, socio-cultural changes affect artistic expression of all
kinds; debates about, for example, reconstructing or maintaining the cultural identity
about art and education. On the other hand, the curriculum reforms currently being
outcomes and clearer specifications of achievement standards for the arts. At a time
when long-standing beliefs about the nature of art are under challenge, teachers are
being asked to account for their students' learning in clearer and more precise ways
(Boughton, 1999).
In Europe the subject of art education is not well-defined (Schönau, 1999). The
centralised assessment, and some of the traditional ways of defining art, raising the
need to debate the impact of distinctions between modern and postmodern ways of
defining art and art learning in assessment. Selecting the type of evidence of
in the sense that political issues, the ideologies of curriculum planners and practical
assessment instruments and procedures. What is actually assessed in art and design
influences what is taught and how it is taught, and what and how students learn. Examinations
organise and legitimate knowledge in ‘testable form’, always on the assumption that
appropriate knowledge can be organised and tested. The choice of what is tested or
examined, and the criteria used, influences curriculum practice. Large scale external
be omitted from examination papers because they are difficult to assess or because
also points out that '…systems of assessment have been criticised for putting a
examinations should have validity. In other words, assessment should serve its
prescribed purposes, reflecting the aims and objectives of the courses and units
taught (Eisner, 1998). Examination results should be reliable, minimizing the effects
of error: reliability refers to the consistency of scores obtained by the same persons
when re-examined with the same test on different occasions, or with different sets of
equivalent items, or under variable examining conditions (Anastasi, 1988, p.188). For
example, two or three qualified judges (assessors or examiners) rating the same works
and using the same
and utility or practicality raise difficult problems for external assessment, since these
examiners and test developers must make a choice of content, competencies and the
of what is most relevant and important in art education. They must define the criteria
expressing the qualities to be valued in student artwork and provide guidelines for
making judgements.
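The notion of reliability as consistency between judges can be made concrete with a small numerical sketch. The following Python fragment is illustrative only: the two raters, the ten portfolios and all of the marks are invented for the purpose of the example and are not data from this study. It computes two crude indicators of inter-rater reliability, exact agreement and the Pearson correlation between the two sets of marks.

```python
# Illustrative sketch only: the marks below are hypothetical, not data from
# this study. Two examiners mark the same ten portfolios on a 0-20 scale.
from statistics import mean

rater_a = [14, 16, 9, 18, 12, 15, 11, 17, 13, 10]
rater_b = [13, 17, 10, 18, 11, 14, 12, 16, 13, 9]

# Exact agreement: proportion of portfolios awarded an identical mark.
exact = sum(a == b for a, b in zip(rater_a, rater_b)) / len(rater_a)

# Pearson correlation: do the examiners rank the portfolios consistently,
# even where their absolute marks differ?
ma, mb = mean(rater_a), mean(rater_b)
cov = sum((a - ma) * (b - mb) for a, b in zip(rater_a, rater_b))
var_a = sum((a - ma) ** 2 for a in rater_a)
var_b = sum((b - mb) ** 2 for b in rater_b)
r = cov / (var_a * var_b) ** 0.5

print(f"exact agreement = {exact:.2f}")  # identical marks on 2 of 10 works
print(f"correlation     = {r:.2f}")      # close to 1: consistent ranking
```

In a sketch of this kind, a high correlation combined with low exact agreement would indicate judges who rank work consistently but differ in severity; this is precisely the kind of discrepancy that in-service training, standardisation and moderation procedures are intended to resolve.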
Assessing the arts has been considered an especially problematic field and doubts
have often been expressed about whether such assessment can be properly objective
(Boughton, 1996). Key difficulties include the extraordinarily varied nature of art
criteria in words (Boughton, 1996; Eisner, 1985). Criteria express the qualities
assessment (Boughton, 1996; Eisner, 1985; Blaikie, 1996; Schönau, 1996). The
concepts that underpin the criteria need to be shared by the whole art education
qualities in art students' achievements are tacit in nature, and are difficult to express
adequately in words. According to Boughton (1996, p.72), the problem with the use
of criteria is that they express inexact human values and are subject to various
interpretations by judges who may hold different views about the significance, or
precise meaning of any given criterion. He raised important questions for
Even if examination papers are well constructed in terms of validity, and the criteria
are well defined, assessment of students' examination work may still lack reliability
(Beattie, 1994; Blaikie, 1996; Boughton, 1996; Steers, 1996). An important aspect of
examinations is the method or procedures used to judge students' artworks. In Portugal,
art and design external assessment occurs at the end of secondary education in the
12th year of school (age 17-18). A single judge assesses examination works. In
England students of the same age are usually assessed by at least two judges and
(Schönau, 1996; Steers, 1994). Moderation is the means by which the marks of
different teachers in different centres/schools are equated with one another and
For Blaikie (1996), moderation is a fundamental aspect of reliability in qualitative
subjective and informed opinions. Boughton (1997) refers to the importance of group
discussion in assessing tasks. Beattie (1994) notes the necessity of a second judging
for evaluation of the product, and she argues: 'This procedure is extremely important
assumption is that if judges can agree about the merit of student work then the
judgement process has demonstrated objectivity. If different judges, using the same
criteria can agree about the standard of work over a variety of cases, then a high
measure of reliability has been achieved, and confidence can be placed in the
judgements. This assumption is derived from the natural sciences and is based on the
belief that a 'true' score for student work exists, and may be approximated through
agreement between judges and, if they share the same assumptions about the value of
the students' art works, high reliability is obtained. Moderation meetings and the
exhibition and publication of exemplars for each level of achievement may serve the
The English system of moderation appeared to have advantages over the Portuguese
system because classroom teachers are the principal assessors of students' work.
Teachers mark students’ work and those marks are ‘moderated’ by external assessors
who verify the marking of a sample of candidates’ work at each school. Moderation
may take the form of a sample of all the candidates’ work being sent to a moderator
at an awarding body, or moderators may visit schools when, for example, the product
common standards and their judgements are usually subject to various checks and
balances.
of alternatives is generally not discussed. It is all too easy for teachers whose
judgements are not challenged to resist change and judge their students' work
asked about the role and competence of judges. Boughton (1996, p. 300) suggests:
examinations at the end of secondary school (age 18) are teachers without any kind
are supposed to receive some measure of specific training for their work from
this study (see Chapter 5). The literature suggests that a major source of unreliability
objectives. In theory, if assessors share the same views and values they can act as one
In Portugal, art external assessment at the end of secondary school (age 18) is limited to
tests administered during a short period of controlled time. The question papers are
not available to the students before the date of the examination; students are not
allowed to use preliminary sketches or preparatory research and there are limited
objectives, criteria and question papers are made available to students, teachers and
examinations do not take the form of pencil and paper tests as in Portugal, but rather
of practical art and design course tasks with, in some cases, a written component.
The assessment procedures used in art examinations raise specific problems. Ideally
the techniques used to assess students' achievement should have the necessary
negative impact on curriculum practice. But this is rarely the case (Shohamy, 2001).
In some systems reliability might be the main requirement, in other systems it might
publications on assessment in general (Boavida, & Barreira, 1993; Estrela, 1994;
Fernandes, 1992; Lemos, 1993; Pacheco, 1995; Ribeiro, 1997; Valadares & Graça,
1998), but very little research has been completed on assessment in art. In England
and the United States, some research in art education assessment has been carried
out; however, the American art educator Zimmerman (1992) claims that there is a
particular need for research and development on the interface between teachers and
assessment procedures. The nature of the concepts involved in studio art makes the
Although some research has already been undertaken, perhaps most notably by the
Dutch Institute for Educational Measurement (Schönau, 1996), little in the way of
During the year 2001, approximately 4000 students in Portugal took national art
examinations in general art courses and approximately 1500 students took national
art examinations in technological art courses at the end of secondary school (Source:
Plastic Expression; Theory of Design and History of Art. The format is pencil and
paper tests requiring short essays and drawing exercises. Despite the considerable
number of hours of study allocated to them, Studio Art and Technologies are the only
specialisms in the Portuguese art curriculum which are not assessed externally. This
fact leads students to consider Studio Art and Technologies as less important in the
curriculum (Eça, 1999). In the Portuguese system, therefore, students are not given
an opportunity to reveal their true knowledge and ability, and this reduces the
validity of the system. Important skills and capacities in students' learning and
achievement, which are specific to art production, are left out of external assessment
thus raising doubts about the content validity of the assessment system. In the
Secundário, 1992) assessment objectives and criteria are not clearly defined.
are left free to judge students' coursework and examination work according to their
own, often very different, interpretations. In previous research (Eça, 1999), the
techniques used in Portuguese art and design external assessment were judged to be
inappropriate, and serious doubts about the validity and reliability of the system were
raised. Portuguese art educators expressed clear dissatisfaction with the national
system of external assessment for art disciplines. It was concluded that there was an
urgent need to reform and to improve external assessment in art and design courses
Education (GCE) Art and Design examinations in 2001 (Source: Joint Council For
General Qualifications). Art and design examinations, in both studio art and
external set assignments at the end of the course. The components of examinations
allow for the inclusion of process and product in coursework and external set
assignments. Students can use preparatory research during the examination. The
external examinations are not pencil and paper tests. Written essays, work journals,
1986). These are particularly problematic in art and design where a consensus about
the interpretation and use of criteria is difficult to obtain because of the wide-ranging
frequently employed and in art and design there are no right or wrong answers, and
There is a need to reform Portuguese art and design examinations at age 18 since
evidence exists that the current system is deficient in most aspects of validity
(Afonso, 1998). At the beginning of the research it was hypothesised that a critical
evaluation of art and design examination systems in Portugal and England should
analysis of weaknesses and strengths of both systems might facilitate analysis of key
Therefore, the aims of the research reported in this thesis were to understand the
theory, policy and practice of art and design examinations in England and in Portugal
reliability, validity, impact and practicality or utility of the particular national art and
design examinations at age 17/18. The principal audience for this study included key
agencies, university admissions systems, and art teachers. It may also be relevant to
the international community of researchers in art and arts education and related
The intention was to carry out a number of small empirical studies of assessment
art external assessment. It was intended that the final conclusions and
recommendations would: (1) establish a firm basis for making proposals to the
Portuguese Government about the development of new forms of art and design
external assessment and, (2) contribute to the international debate in art education
Taking into account the problem statement and research aims, the following research questions were formulated:
1. To what extent do English and Portuguese art and design external assessment
systems at age 17+ have the necessary validity, reliability and practicability to
assess art students’ achievement fairly and accurately?
2. What is the impact of the English and Portuguese art and design examinations at
age 17+, on teaching and learning?
3. How can the validity and reliability of Portuguese art and design external
assessments at pre-university level be improved?
4. What kind of conceptual framework would attain more valid and reliable art and
design external assessment instruments and procedures at pre-university level in
Portugal?
5. How might the conceptual framework for such assessment be applicable in more
general international contexts of art external assessment at pre-university level?
In order to compare and contrast the two systems of external assessment, the following subsidiary questions were addressed:
1. What types and models of art assessment instruments are currently in use
in England and Portugal at age 17+?
2. What are the similarities and differences between them?
3. What concepts and theories underpin art and design external assessment
in Portugal and England?
4. How are assessors and examiners trained in Portugal and England?
5. How is students' examination work judged in Portugal and England?
What criteria are applied? How is agreement reached about defining
criteria?
defined and key problems associated with assessment and assessment within
Chapter 2: consists largely of detailed considerations of particular qualities of
specific sources of invalidity are identified. The chapter ends with a proposal
Chapter 3: describes the design of the research. The first part concerns the
two research stages. The second part describes considerations of the data
Chapter 4: the first part is an overview of the history of art and design
their validity, reliability, impact and practicality. The second part of the
chapter reports the findings of the analysis of the current assessment model
examination in order to present a list of key features that might improve the
current model.
instruments and procedures of assessment; analysis of the researcher’s
Chapter 6: (1) proposes a possible new framework for art and design
examinations together with a rationale. (2) the content framework for the
Chapter 7: This chapter reports on and evaluates the larger-scale trial of the
new external assessment instrument and procedures in five schools in Portugal.
Chapter 8: In the final chapter, findings from all the strands of the research
are synthesised to answer the research questions and conclusions are drawn
Chapter 1
and design is often seen as especially problematic because of the nature of the
obtaining a consensual view of criteria and how to interpret them. Key terms
are defined and significant issues related to assessment and assessment within
1.1. Assessment
cycle and can include a range of methods for monitoring and evaluating student
performance and attainment. These methods range from formal testing and
The word 'assess' is derived from the Latin ad + sedere, meaning 'to sit down
together'. Wiggins (1993) points out that the etymology of the word alerts us to the
centred’ act:
persons: the student (or his/her work) and the assessor. MacGregor (1996, p. 29)
making decisions about values'. The assessor is viewed as someone who has
grading the performance. This implies a power relationship between the assessor and
the person being assessed. Assessment can be constructive when it provides guidance
for the student and self-reflection for the teachers and schools. However, it can also
performance.
students and teachers. From this point, ethical and social dimensions have to be taken
into account. Assessment and especially assessment in art and design cannot be
1.1.1. Roles of assessment
Assessment is a broad concept with the capacity to fulfil multiple purposes. It should
be closely integrated with the curriculum (Broadfoot, 2000). Its principal functions
include diagnosis of the causes of young people's success or failure, motivation for
the learner and teacher, the provision of valid and meaningful accounts of what has
been achieved and it may form a part of the evaluation of teaching programmes
Sutherland (1996) argues that three uses of assessment have dominated its history in
Britain so far: as a device for raising standards, as a device for measuring deviation
standards. According to Gipps and Stobart (1997, p. vi), professional assessment and
managerial assessment are two different functions that need to be taken into
consideration. These are defined as: (1) professional assessment, where assessment
primarily helps the teacher in the process of educating the pupil, and (2) managerial
assessment, which uses test and assessment results to help manage the education
system. Assessment for learning can be equated with professional purpose: it takes
place at any time but is more useful during a period of teaching when the information
produced is being used by the classroom teacher, and perhaps by other teachers in a
may also mean assessment of pupils' understanding and mental processes, rather than
According to Robinson (1982, p.82) 'The principal function of assessment in schools
However, assessment has other functions and needs to be viewed in light of the
treated separately (Cardinet, 1993; Rosales, 1992; Gipps and Stobart, 1993). The
motivation for the student or teacher, providing a basis for selection to monitor
national or local standards, and control of the curriculum may conflict (Lambert &
Lines, 2000; Broadfoot, 2000). Although the same assessment instrument can
provide data that is useful for different situations, there is a need to view assessment
For the purposes of the research reported in this thesis the main function of
1.1.1.1. Feedback and motivation
Robinson (1982) argues that assessment and evaluation have important roles in
education because they can provide feedback for teachers and schools. He states:
Feedback is information that provides performers with direct, usable insights, based
and even fewer tests, provide what students most need, namely information designed
‘s terms (Wolf et al, 1991, p.183) becomes 'an episode of learning'. Wiggins (1993)
argues that many classroom teachers believe that a grade and a short series of
she might improve his or her performance. Grades and scores are symbols, or
question of assigning symbols. Students must understand and share the meaning of
constructive dialogue rather than a matter of marking works with numbers or letters.
externally defined by experts may not be efficient. Feedback for students and
teachers is complex and needs to be explained through sophisticated forms of
…we now feel able to suggest ways forward. A pupil’s self-esteem and
confidence need not be at risk in a genuine conversation designed to
support her own reflection and help her make informed and constructive
judgements while at the same time hearing the opinions and advice of her
partner’ (Ross et al., 1993, p. 164).
‘language only experts can decode’ and ‘measures only easy-to-score variables’. This
Ross et al. argued that grading in the arts, 'must be criterion related and not based
each student, based upon what they are able to achieve in a supportive environment.
A system of assessment, which relies only upon quantitative data, may not provide
adequate information. To provide feedback and motivation for students and teachers,
results should be expressed not only through symbols but also through detailed
verbal reports about performance. Furthermore
there is a risk of misuse of data and incorrect interpretation of results if the only
information available is revealed through grades and examination/test scores, such
pervasive consequences:
provider is obliged to respond to criticism from those the provider serves. The ability
framework in which the client has formal power to demand a response from the
provider, to influence the providing, and to change service providers if the responses
are deemed unacceptable'. In the case of education, the client may be a student,
formal relationship. And as the word's roots in European legal history suggest,
accountability presumes moral equality: no one is above the law. Wiggins (1993)
argues that teachers and administrators are obliged to answer questions about how
their students are doing, irrespective of what they choose to report (p. 257).
However, students and teachers are ill served by having complex performances
Accountability may also be described as ‘quality control’. Wiggins (1993) states that
‘if the schools set clear, public targets; if the parents have a clear and present voice;
if the students, former students, and institutional customers have a voice then we will
have accountability that will improve schools’ (p. 277). Accountability is a political
Assessment may take many forms. Informally, teachers are assessing pupils all the
time, as indeed pupils are assessing them. For Ash et al. (2000):
broad appraisal including many sources of evidence and aspects of
pupils' knowledge, understanding, skills and attitudes, or a particular
occasion or instrument. An assessment instrument may be any method or
procedure, formal or informal, for providing information about pupils'
learning: individual discussion, homework, project outcomes, mock
examination work, group critiques (p.135).
Assessment is formative ‘when the evidence is used to adapt the teaching work to
meet learning needs’ (Black et al., 2003, p. 2). Formative assessment can occur many
times in every lesson. It can involve several different methods to encourage students
to express what they are thinking and several ways of acting on such evidence (Black
et al., 2003, p. 2). The results may enable a teacher to give remedial help at an early
stage of learning, or change the emphasis of a course if required. They may also help
a student to identify and focus on areas of weakness. Ash et al. (2000) point out that
learning activity. ‘It has been recognised by most art and design educators that
ongoing feedback during the process of making, rather than assessing and grading an
work forward. If pupils do not receive regular feedback on their work, they quickly
lose motivation and become unsure of their own assessment of success or failure’.
Summative assessment
end of a project, scheme of work or course of study and therefore
often takes the form of pupil self-assessment providing an opportunity for pupils to
assessment provides teachers with insights into pupils' understanding of their own
not only as a form of assessment but also as a form of learning. As Ross et al (1993,
p.164) point out: ‘Pupils must first know what they have done before they are in any
position to judge [and understand] how they have done’. Furthermore, the process of
students to monitor their own learning continuously, to help students reflect on their
own abilities and performance, related to specified content and skills and related to
Diagnostic assessment approaches pupils' work and behaviour as evidence for the
analysis of their ability in a given field (it is often used to discover learning needs). It
can be used constructively as a vehicle for discussion between teacher and pupil
where both parties consider progress and set targets for future development (Ash et
al., 2000, p. 136). Diagnostic assessment can be a powerful motivating factor for
pupils.
1996; Hannan, 1985). For Ash (2000) negotiated assessment in the form of
assessment process. Two main aspects of negotiated assessment can be found: (1) in
the relationship between student and teacher/assessor, and (2) in the relationship
Students and teachers can work on formative assessment together by sharing criteria,
(Shoamy, 2000). ‘Assessment done properly should begin with conversations about
performance’ (Wiggins, 1993, p.13) and this only happens by transparent means of
occurs when all the users of the system understand and agree with the applied
procedures. In that sense, teachers should not be marginalized from the development
and application of external assessment. Debate and consensus should form part of the
This research focused on external assessment, which is often associated with external
that designs the instrument mark the work submitted by students externally. External
summative forms of assessment. For example, English art and design examinations
such as the GCSE taken at age 16+ can be considered both formative and summative
From its beginnings in the universities of the eighteenth century and the school
Educational assessment can be seen as the representation of the desire to discipline
Broadfoot (2000):
individuals for preferment. However, this is not always the case since education and
assessment systems tend to favour specific social classes, genders and races.
Assessment systems do not necessarily introduce more equality and fairness into the
system; they may offer less, with more streaming and stratification by family
income, wealth and race (Hilliard, 1991; Pullin, 1994; Madaus, 1994; Darling-
and may function to maintain social stability. Foucault (1977) argues that
Educational assessment can no longer claim to be naive about such
matters or uninterested in them. Future theory and practice must be more
self-conscious about these implications of the process; in particular,
research and development in assessment must take more seriously the
ethical issues of what counts as acting in the 'best interests' of the
candidate, and take more trouble to ascertain candidates' views of the
process and its outcomes (p. 178).
Foucault (1977) documents the move from the direct, explicit, feudal exercise of
power through physical coercion, to the modern social practices of self-control and
Foucault's argument is that the power of examinations derives from the use of
According to Barrett (1990), assessment refers to the judgement of a process or
assessment task (examination question papers for example). The judgement is based
on a list of known and previously stated criteria and expressed by qualitative and/or
express their judgements using scores or marks. Using mark schemes they attribute a
terms of the variable balance of knowledge, skills and understanding about art
Armstrong (1994) states that arguments for assessing art emanate from three arenas:
(a) professional education, (b) the government, and (c) the local community.
Assessment and especially examinations in the arts subjects may also be viewed as a
form of legitimisation of their role in the curriculum. 'More and more art teachers see
argues:
Art educators cherish their uniqueness. At the same time, that uniqueness
contributes to a precarious position in which art teachers too frequently
find themselves. Few persons understand and value the uniqueness of art
education as an important part of general education for all students. Art
educators need to demonstrate that the learning unique to art contributes
to the education of all students and that students do produce evidence of
learning in art. Without assessing what students learn in art, art education
will remain in a peripheral value position in formal education.
Assessment provides an objective basis for the claims that arise from art
teachers' hunches and chance, casual observations made daily at the
scene of action in the art classroom (p.9).
that assessment in the arts must be flexible and include several outcomes:
In the case of arts the objects of assessment often include the products and processes
of art making. Beattie (1996) recommends that assessment components in the arts
But she also states that the process of making is an essential part of learning and
or product. Beattie (1995, p. 54) makes the point that recent research in cognitive
psychology, in educational and teaching theories, and the current educational reform
initiatives provide a strong basis for a paradigm shift from emphasis on assessing
process is an important aspect of education, what Resnick and Resnick (1992) call
understanding about art are often based on tests requiring capacities of recall and
technical ability. Tests might not assess thinking and critical skills validly. In Eisner's
(1979) words, such methods provide a biased, even distorted, picture of the reality
context-specific and do not easily allow direct testing (Beattie, 1996, p.53). Steers
Gipps & Stobart (1997) also refer to the need for appropriate use of assessment
instruments according to the purposes for which they are designed. They argue:
The multiple-choice test, true and false test, was considered reliable and
objective. Though, single correct answers and single correct solutions are
not the norm of intellectual life, nor are they the norm of daily life. The
problems that people confront in their intellectual endeavours, like the
problems they confront in ordinary living, are replete with reasonable
alternative solutions (p.144).
In the body of research that has been carried out in the assessment field there is a
move away from tests (Amabile, 1983; Gardner and Grunbaum, 1986). Tests have
been criticised for encouraging teachers to set tasks that promote de-contextualised
rote learning and that narrow the curriculum to basic skills with low cognitive
1997). According to Wiggins (1993, pp. 54-67) the use of ‘secret tests’ is a vestige of
Wood (1987) and Gibbs (1994) argue that there has already been a paradigm shift in
proliferation of service industries and the changing character of work have created
a result a fast growing interest can be seen in the more formative, holistic,
(Wolf, 1995). It remains the case, however, that traditional forms of assessment are
not easily replaced, embedded as they are in complex histories, cultures and power
modern societies that they blind us to the potential for alternative forms of
assessment'.
For Torrance (2000) the way forward includes the necessity of assessment practices
curriculum knowledge that can be local. He states that what justifies and legitimates
multicultural society with global forms of communication, it is difficult to state that
one kind of selection of knowledge is more relevant than another; and Torrance
raises doubt about the veracity of obtaining a 'truer score'; his question is whether:
‘…a true score can ever be produced for and adhere to an individual?’ For Torrance
results must be a function of the interplay of task, context, individual response and
would appear to be the appropriate goal for a system of educational assessment that
considers that assessment of thoughtful ‘mastery’ should ask students to justify their
self-assessment central.
Instead of asking students to respond to a test in order to determine, for example, their
art learning, Gardner (1989) has proposed that learning ought to be organised around
meaningful projects carried out over significant periods of time, and that evidence of
student thinking ought to be collected throughout that period, what he has called, a
‘process-folio’. His process-folios include concept sketches and early drafts of ideas,
along with final products. Reflection sheets, interview forms, and journals are used to
help students, teachers, parents, and administrators trace the evolution of particular
ideas as well as general progress over time. Assessment of students' learning is
based more upon the process than the final product. In this way the assessment more
successfully captures the way in which learning occurs within the discipline. The
term, which has been increasingly used in the United States in recent years to
expression that has arisen in response to a growing recognition that standardised tests
capture only a small dimension of human intellectual functioning and it reflects the
The impact of Gardner’s ideas on assessment has been more radical in the American
practice exists. But in American art education, assessment has not been a particular
University’s Project Zero and the Arts Propel (Blaikie, 1994; Gardner, 1996)
employers and other third parties with a more comprehensive teacher-
written report on pupils' achievement (see Broadfoot et al, 1988). The
movement has also given significance to achievements outside the
mainstream academic curriculum and indeed, on occasions, to
achievements outside school. In this we can certainly see evidence of
attempts to accentuate the importance of the local and allow the voice of
the pupil to be heard.... (p.185).
Records of achievement have been criticised for potentially exposing pupils to even
more scrutiny than traditional examinations, since every aspect of a pupil's life, in
and out of school, becomes potential material for inclusion (Hargreaves, 1989,
(2000, p. 185) concludes: 'I am minded to suggest that the Record of Achievement
1.2.2.4. Portfolio
According to Klenowski (2002, p. 26), a portfolio is a collection of work that can include a diverse
assessment. Careful critical self-evaluation is an integral process and involves
judging the quality of one’s performance and the learning strategies involved. The
learning. Using portfolios for assessment of students may present several advantages.
As Zimmerman (1992, p.17) points out, through portfolios students can be observed
taking risks, solving problems creatively, and learning to judge their own
According to Eisner, (1998) the tasks used to assess students should reveal how
students go about solving a problem, not only the solutions they formulate. He points
out:
The new assessment practices will need to provide tasks that resemble in
significant ways the challenges that people encounter in the course of
ordinary living. This will require an entirely different frame of reference
for the construction of assessment tasks than the frame we now employ
… the challenge to assessment is to somehow create tasks that give
students opportunities to display their understanding of the vital and
connected features of the ideas, concepts, and images they have explored.
Portfolios and records of achievement seem to present advantages for art assessment
information through several tasks. Portfolios include process and product; through
practice.
The strength of using portfolios derives from students’ ownership of the learning
process. This contrasts with other forms of assessment where the teacher or examiner
is in full charge of the process. The student in portfolio assessment is actively engaged
in thinking about his or her own work, the learning involved and the progress he or
active creators rather than passive receivers of knowledge, giving them opportunities
to examine and analyse their ideas, beliefs, and values in collaboration with others.
Portfolio use supports this process and helps to develop increased awareness of their
learning (Klenowski, 2002, p. 111). When students manage their portfolios they are
also developing important organising skills which serve to extend their sense of
and inventive as they strive to develop a portfolio of work that captures their unique,
personal achievements.
raised: extended tasks that encourage thinking and reasoning activities take time and
are open to different types of valid response. Klenowski (2002) points out some
obvious practical problems that need to be taken into account such as the in-service
provide for students. Problems associated with portfolio implementation include the
mark or grade the portfolios, support for students to understand the portfolio process
Reliable assessment can be improved and facilitated by written criteria and defined
frequently employ criteria to express the qualities valued in student artwork and to
guide the judgement of arts educators. The simple expression of the criteria used for
standards express the degree to which they exist. A criterion for judgement is not
target. Judgement of a work against pre-specified criteria does not necessarily mean
a good standard has been achieved, even if all the criteria have been evidenced in the
work.
Boughton (1999) argues that different cultures emphasise different criteria. Even
within a single cultural tradition, the criteria used for judgement may demand
different emphases according to the genre of the work especially for perceived
‘originality’. For example, contemporary work using new technologies and recycled
imagery may raise different issues for judgement than work undertaken within
traditional styles and using older media. Art works and art students' performances are
'profiles' can be inadequate, not only because words do not do justice to visual
Standards imply the existence of uniform exemplar answers. However in art and
design education such ‘answers’ are seldom the aim; rather, diversity of response
and outcome is sought. Boughton (1997) considers that: ' … the judgement task is
far more complex because the products being judged are expected to reflect a degree
of original thought, and some sense of the idiosyncratic [personal] characteristic of
their creators' (p.203). According to Boughton (1999) it is not possible, nor would it
He points out: '…what is much more appropriate is the use of criteria which express
However, assessors cannot exercise judgement or use any assessment tools unless
they are clear about the criteria and standards to be used when making their
judgements. The use of criteria and holistic judgements, rather than standards
variations of outcome in the students' work. Criteria provide guidelines for making
judgements. Any single criterion is able to indicate desirable qualities which may or
provide an array of windows through which a judge may look. They do not provide a
These windows may sharpen a judge's focus, but it still remains for the
judge to value the qualities of the work, to understand its genre and
context in which it was produced. An important element in the
construction of assessment policy in studio art that is most likely to be
responsive to variation of genre and cultural values is the employment of
criteria rather than standards for the guidance of examiners and a single,
holistic judgement following attention directed by the criteria (p.343).
'recognised measure' which comes into being through some kind of imposed act of
of arts education, particularly if educational managers and the community are to be
convinced that arts educators are able to reliably determine the quality of student
predictable content outcomes, the real issue is the problem of cultural relevance and
validity in assessment.
However, it must be borne in mind that the expression of a criterion in words can be
reductionist. Often the most important qualities in art students' achievement are tacit
or elusive in nature, and they are very difficult to express in words. The nature of
'aesthetic response' is one of these. One temptation is to write criteria that can be
easily interpreted, rather than to attempt to write those that are more appropriate, but
For Boughton (1996) the problem with the use of criteria is that they express inexact
human values and are subject to various interpretations by judges who may hold
different views about the significance, or precise meaning of any given criterion.
The problem with the notion of art is that it too is a social construction
subject to various interpretations in different social, educational,
historical contexts. It is no wonder that education bureaucrats who value
secure and predictable worlds – who lust after precision and reliability –
become anxious about the validity and reliability of qualitative
judgements. In the age of educational reform, characterised by an
obsession with standards, reporting frameworks, and accountability
measures, the value of qualitative assessment information becomes more
and more difficult to justify to those who are preoccupied with statistics
and management strategies. At the same time as the industrialists and
politicians are demanding from educators increased precision in their
judgements, clear standards, and higher levels of accountability, the
impact of postmodern ideas is having the opposite effect in the world of
art (p.72).
must be broadly, even loosely, cast to allow for diversity, for plural visions of what
art is and how art may be created' (p. 47). Localising standards is a necessary
Teachers must have in mind some goals that are measurable and then
design instruction that will move students toward the achievement of
those goals. This is a highly useful way of thinking about instruction, but
this approach also has limitations and, I would argue, should not be used
exclusively. In the visual arts, open-ended exploration and
experimentation – true creativity – are valid ends as well (p. 63).
how a balance can be struck between the need to define clear standards in advance to
instruction [previously established] and the need for standards to emerge as a result of
In the case of studio art examinations, where practical skills are assessed, the
considered. Schönau (1999, p. 32) answers: 'Yes and no. No, it is not, as no two art
works are the same. Yes, in practice art teachers do use standards. They compare
what they see with what they have seen before…. Art teachers set their own criteria
how much does each criterion relate to the final score? What is the
maximum level of competency? It is well known practice that teachers
never give a maximum score to their students, because things can always
be more perfect (Schönau, 1999, p.32).
as functionally more than the sum of its parts. According to Schönau (1999, p. 32) ‘…
judging studio work using criteria is not fair when it ends with adding up separate
scores. For how do the separate scores relate to the overall score?' However Schönau
argues for the use of national standards or minimum levels of competency when
attainment targets or grade descriptors must be used in art and design examinations
but how they might be expressed in order to be commonly understood and applied. In
The difficulty of defining domains of knowledge in art and design education is not
the only problem in external examinations. Another important question is about the
possibility (or impossibility) of defining standards in art and design examinations and
assessment context, different judges’ views present a problem for system managers.
According to MacGregor (1992) the criteria used by teachers to judge studio work
show little variation in countries such as the United Kingdom, Canada and the Netherlands:
competencies to develop and interpret a theme, level of technical expertise, and
techniques and processes. However it seems that criteria are sometimes confused
with standards, and standards may not be written. The marking schemes used in art and design
establishing clear and unambiguous criteria and attainment targets increases in the
field of art and design where teachers use different underlying concepts. Beattie
(1997, p. 229) points out that '…establishing criteria, objectives, and standards can
help a state or school determine what it values and seeks to accomplish, an authentic
reform initiative'. But the real problem in art and design examinations is not the type
of criteria but how they are written and applied in order to facilitate validity and
examinations and this may be possible through the development of assessment
and objectives.
Hargreaves et al. (1996) conclude that teachers may reach agreement about the
and constructs:
Another current problem in assessing the arts concerns teacher in-service training.
Connoisseurship is taken as the art of appreciation, the ability to define the quality of an object or
environment (Armstrong, 1994). Eisner (1985) observes that assessors should be
qualified artists and teachers who know how to interpret the visual terms used in the
Steers (1994, p.296) also considers that art and design education in schools requires
subject, supported by adequate resources of time and materials. The teachers' role in
the assessment process is essential and without their collaboration and agreement
(1996): ‘This ownership can only be effective if the system provides an efficient and
connoisseurship, but not all systems provide such in-service training (Schönau, 1999,
p.33).
Processes of teacher in-service training and strategies for reaching agreement about
criteria and standards such as moderation and standardisation are further discussed in
Summary
In this chapter, general and specific art and design assessment literature was located
and discussed. Several key roles for assessment were identified: as a way to provide
feedback and motivation for teachers and students; as a mechanism of control; and a
tests and alternative forms of assessment, for example, records of achievement and
portfolios were briefly analysed in terms of their advantages for art assessment and it
was evident that tests are not normally the best instrument for assessment in the arts.
Specific assessment problems of the arts were described and the following issues are
• The nature of art raises difficulties in defining a shared language. There are
reliability of assessment.
Chapter 2
examinations.
2.1. Reliability
An instrument for assessing art learning by tests or other means must demonstrate
consistency, allowing candidates with similar abilities, in different geographical
regions, to obtain the same result. In test literature a test is considered
reliable if two identical students obtain the same results (Cardinet, 1993; Ribeiro,
1997; Valadares & Graça, 1998). However in reality it is unlikely that there will be
identical candidates (on two or more occasions) and in order to verify consistency
between test results some proxy measures are used such as: (a) test-retest reliability,
which is based on gaining the same marks if the test is repeated; (b) mark-remark
reliability which looks at the agreement between markers; and (c) parallel
forms/split-half reliability in which similar tests give similar marks (Gipps & Stobart,
1997). Reliability is also concerned with evaluation of the marking or rating process,
including inter-rater reliability (an estimate of reliability based on the degree to
which different assessors agree in their
assessment of candidates’ performance) and intra-rater reliability (the degree of
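The proxy measures described above lend themselves to simple computation. The sketch below is illustrative only: the marks, markers and candidate numbers are hypothetical, not data from this study. It estimates test-retest reliability with a Pearson correlation between two sittings, and mark-remark (inter-rater) reliability with Cohen's kappa, a common chance-corrected agreement statistic.

```python
def pearson(xs, ys):
    # Test-retest reliability: correlation between marks from two sittings.
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

def cohen_kappa(a, b):
    # Mark-remark reliability: agreement between two markers on the same
    # scripts, corrected for the agreement expected by chance alone.
    cats = sorted(set(a) | set(b))
    n = len(a)
    observed = sum(x == y for x, y in zip(a, b)) / n
    expected = sum((a.count(c) / n) * (b.count(c) / n) for c in cats)
    return (observed - expected) / (1 - expected)

# Hypothetical marks (out of 20) for ten candidates on two sittings.
sitting1 = [12, 15, 9, 18, 14, 11, 16, 10, 13, 17]
sitting2 = [11, 16, 10, 17, 13, 12, 15, 9, 14, 18]
print(round(pearson(sitting1, sitting2), 2))

# Hypothetical grades awarded by two independent markers to the same scripts.
marker_a = ["B", "A", "C", "A", "B", "B", "A", "C", "B", "A"]
marker_b = ["B", "A", "C", "B", "B", "C", "A", "C", "B", "A"]
print(round(cohen_kappa(marker_a, marker_b), 2))
```

In practice examination boards work with far larger samples and more elaborate statistics, but the underlying logic is the same: consistency between repeated measures, or agreement between independent markers.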
Reliability as understood by test developers may not be as useful for teachers, and
especially for art teachers who usually assess variable outcomes of coursework or
individual students. He states that it is more appropriate to work with the concepts of
general,
means that the student has not attained: very often, all it means is that the question
has not been asked in the right way for that individual, and the verdict cannot be
that has been disclosed is observed, correctly interpreted and accurately recorded.
The concept of fidelity will be developed later through the discussion of validity.
The major problems associated with reliability in art examinations can be attributed
to: (1) at the a priori stage, the construction of the examination instrument; and (2) at the a posteriori stage,
teachers, assessors, examiners’ training, and judgment of students' work through
2.1.1.1. Instrument
In art and design examinations tests are seldom the sole instrument. Several
instruments can be used and complement each other. This may present advantages
for the reliability of results because it provides several assessments of the same trait.
confidence that the events interpreted and appraised are not aberrant or exceptional,
regularly observe those behaviours that occur consistently or those that suggest
repeatedly assess the same thinking process (Armstrong, 1994). Gipps & Stobart
Here, Gipps & Stobart (1997) raise an important aspect of reliability that is
particularly acute in art and design. In fact coursework or portfolios usually provide
more reliability than a single test because they allow repeated measures of the same
trait. Threats to reliability lie not only in the instrument but also in the use of
methods and procedures that are unclear or weakly
defined and which result in inconsistency of the whole process. For example the use
of vague scoring instruments such as performance and grade descriptors, criteria and
instruments to judge students’ works. Boughton (1999, p. 72) considered that the arts
measurements of learning in the arts describe very little about highly personal
in the arts is the use of moderation (Askin, 1985; Chalmers, 1981; Steers, 1987).
According to Steers (1996, p. 185): ‘Moderation is the means by which the marks of
different teachers in different centres are equated with one another and through
which the validity and reliability of assessment are confirmed’. Moderators calibrate
interpretation of criteria (Schönau, 1996; Steers, 1994; Blaikie, 1996; Heyfron, 1986).
(Steers, 1988), Australia (Boughton, 1994) and the Netherlands (Schönau, 1996), in
examination (Chalmers, 1981) and for the Advanced Placement examinations used in
the United States of America to select some students for progression to higher
results.
To sum up, threats to reliability in art and design are usually a consequence of a
assessor training. Standardisation meetings for assessors, when carefully planned and
2.2. Validity
Validity is used in the context of this research as a unitary concept: unless the type of
validity is specified in the text, the term validity is used to include all types of
specific educational context and the specific inferences made from examination
what has been taught. Assessment of either process or product requires critical
scrutiny of the soundness of the interpretations that will be made from the derived
sought through two main stages: A priori validation involving a scrutiny of the
posteriori validation involving the investigation of the way the examination
appears to have worked after the event; it involves both the analysis of scoring data
2.2.1. Some types of validity
Validity is often discussed in test literature. Beattie (1996) makes the point that '…
although validity is currently viewed as a unitary concept, the scope of questions and
issues is broad, subsuming all of the historically discrete validity types (e.g., content,
criterion-oriented, construct, face, and the like)'. In assessment and testing literature
1988). For the purposes of this research, seven different types of validity have been
(if the assessment instrument covers the skills necessary for good performance,
or all the aspects of the subject taught). Content validation is also about ensuring
the authenticity of the tasks or, in other words, if the assessment instrument is
are constructed using a poor sample of artistic competencies or domains; tasks and
way; failure to restrict inferences from examination results to what an examinee can
Face validation is also about checking if the assessment instrument measures what it
appears to measure. Whereas content validity is judged by experts, face validity is
judged by various groups of non-experts who come into
contact with the instrument taking into account the feedback from test takers and
the examination paper, for example, whether or not students understand the
instructions and if the tasks motivate them. Using the designation '
a selected sample.
2.2.1.4. Washback validation
Washback validation examines the effects of the examination on the teaching and
learning situation, checking if the instrument and
underlying model of artistic skills and knowledge are related to the learning goals.
Washback validity is also concerned with the tasks and scoring process, verifying if
they clearly reflect the specified model and domains – in other words, if the tasks
enable examinees to engage their artistic skills and knowledge in a reasonable and
authentic way.
Teachers may be influenced by the knowledge that their students are planning to take
a certain test, and adapt their methodology and the content of lessons accordingly to
reflect the demands of the test. The result may be positive or negative. Sources of
model of artistic skills and knowledge which are divorced from learning goals; tasks
and scoring procedures that do not fully and clearly reflect the specified underlying
model and domains of artistic skills and knowledge; tasks and scoring procedures
that do not enable the examinees to actively engage their artistic skills and
knowledge in a reasonable and authentic way; and tasks and scoring procedures that
(Hasselgren, 1998).
Criterion-related validation is an a posteriori evaluation of the examination that can
Predictive validity relates to whether the test predicts accurately or well some future performance. Concurrent validity concerns whether the assessment instrument correlates with, or gives substantially the same results as, another
use of external criteria that measure different competencies from the assessment
instrument in question and by failing to look for evidence to ensure that the
In describing the typical validation procedures that involve mere correlation between
test scores and criterion scores, Messick (1989, p.10) notes that 'criterion scores are
measures to be evaluated like all measures. They too may be deficient in capturing
the criterion domain of interest’. The solution, he argues, is to evaluate the criterion
measures, as well as the tests, in relation to construct theories of the criterion domain.
Validation related to examination bias or equity is concerned with verifying the assessment instrument, checking whether the tasks and procedures place one or another group of examinees at an advantage. Sources of bias can include:
A powerful case needs to be made that the task for assessing process or
product is clearly unbiased and fair for all students involved. Aspects of
content and format might favour some group over another. The emphasis
on written or oral descriptions of learning processes in reflective journals
may be unfair to disadvantaged students, minorities, or non-standard
language speakers. A gender bias might also occur. Females tend to be
more reflective than males; therefore, reflecting on their processes might
be easier for them. An example of a contextual bias that might occur with
respect to the evaluation of an art product stems from different emphases
art educators place on product assessment criteria. In one context, the
teacher may consistently emphasise formal criteria, while in another
context, value is placed on conceptual and affective criteria.
Critical to the assessment of procedural knowledge is the opportunity
afforded all students to learn and practice methods of inquiry
(Beattie, 1995, pp. 52-53).
In order to identify points of bias it is important to ask experts to revise the task prior to
the construct. The term construct can be viewed as definitions of abilities that permit
us to state specific hypotheses about how these abilities are, or are not, related to
other abilities, and about the relationship between these abilities and observed
empirical and (2) a priori theoretical (Weir, 2004). Anastasi (1988) describes
specific empirical techniques that can contribute to construct validation. For example: correlations with other tests; factor analysis; internal
and construct irrelevancies. The more fully examination developers are able to
describe the construct that the examination instrument is attempting to measure at the
a priori stage, the more meaningful might be the statistical procedures contributing to
constructs, descriptors of performance and the scoring system are adequate. A trial
examination design.
in order to provide tools and empirical evidence to support the instruments and
the creation of descriptors of performance and division of constructs in the
scoring system.
The major problems of validity in art and design examinations are related to: (1) the
model and domains of competencies and (3) a consensual concept of the most
In art and design, one single event such as a short period examination seldom
1987; Dorn, 1988; Efland, Koroscik & Parsons, 1991). A range of tasks and
The choice of the assessment instrument is crucial to attaining both validity and reliability in examinations. In the case of art and design, the complexity of the
subject raises problems about the use of traditional assessment instruments structured
2.2.2.2. The examination underlying model
acknowledging concepts such as: artistic skills, knowledge and understanding and
judgement. For the purpose of this research it might be helpful to have a sharper
definition of whether the intention in art and design examinations is to assess student
Beattie (1996) cited Glaser’s (1991) dimensions of competence. Glaser (1991) has
identified four dimensions of developing performance that can be useful concepts for
assessment:
appropriate assessment instruments for art and design. The dimensions are
appropriate for the specialist field of art and design where knowledge and
understanding, problem-solving skills and aesthetic appreciation all need to be
evidenced.
design curriculum should underpin it for the purposes of theory-based validity. The
assessment instrument should cover the artistic skills, knowledge and understanding
necessary for good performance, or all the important aspects of the subject taught
within that model. But it is widely agreed that art and design is a subject where there
are no single or ‘correct’ answers. Boughton (1997) notes that the problem with
targets and criteria, are influenced by trends and fashions in art. Moreover, in the light
of postmodern theories, the nature of art defies attempts at a tidy definition, and
However, the problem of different visions of art and design can be overcome by
agreement and consensus between the users. The scoring methods should be
The underlying model of skills, knowledge and understanding being assessed should
model (Weir, 2004). The agreed aims and objectives of art education are essential
elements to take into account in the development of art and design examinations.
One necessary condition for the success of the examination is a ‘high level of
consensus about the aims of art education’ (Steers, 1996, p. 189), rather than
2.3. Impact
The impact of an assessment process is concerned with the effect created by that
process on society and on the lives of the individuals who are affected by test results. The impact of an assessment
instrument and procedures includes all sorts of consequential validities and washback. Examination formats and selections of syllabus content are not value-free; their effects
are both educational and societal. The former refers to the changes as a result of
assessment practices and knowledge examined, while societal effects are concerned
with the impact of examinations on gate keeping, ideology, ethicality, morality and
fairness.
2.3.1. Effects on society and individuals
affected by the results, are forced to change their behaviour and comply with the
examination’s demands in order to gain the benefits associated with high scores. Examinations are used ‘to gather evidence upon which inferences are drawn about the unobservable’ (McNamara, 1999); they might be imperfect tools, but their power and use have strong consequences for society and for individuals.
Examinations are not used solely by educators and students – they can also serve
political agendas. Shohamy’s (2001, p. xxii) comments about tests are also valid in
For the examinees, examinations are threatening rituals detached from reality with
long-term effects (Shohamy, 2001, p.14). The uses of examination results can have detrimental effects for examinees since such uses can create ‘winners and losers,
Examination results are used as indicators for placing students in class levels, for
2.3.3. Effects on teachers and teaching methods
Art and design examinations influence teaching practice. To predict intended and
unintended results there is a need to study the impact of similar previous assessments
on curriculum and instruction. Beattie (1996, p.53) makes the point that:
The art curriculum might be changed to align more closely with the
emphasised object of assessment. Moreover, the methods by which
each is assessed could bring about a distortion of teaching with positive
or negative results. Task format might discourage other viable methods
of teaching concepts and skills.
Similarly, Spolsky (1995, p.56) argues that the power of examinations inevitably narrows the educational process. This is because once teachers have familiarised
The examinations act as a curriculum model for art teachers because of the
importance of results for students and schools. The competence of a teacher may be
results teachers tend to pay a great deal of attention to exemplary work graded as
the case of art examinations, where ‘originality’ is often an assessment criterion, this
effect can be pervasive. Steers claims: ‘When original work is seen to be rewarded
by the examination system it is rapidly imitated by students and teachers alike and by
the time of the next annual round of examinations an innovative approach is reduced
to a cliché’ (Steers, 1994b). On the one hand examinations offer models of ‘good
answers’ but on the other hand such models are contrary to the very nature of art
Examinations are powerful methods of control and a means for monitoring the
performance of schools. The examination results are the basis for accountability:
schools are compared and differentiated by the ranking of their examination results
in league tables. The impact of an examination is concerned with the extent to which
the examination results are used in the way intended, and are successful in bringing
about the aims of the examination. A misinterpretation of the results threatens the
2.3.5. Effects on standards and curriculum
standardisation, they are used for purposes of monitoring and controlling the
education systems (Shohamy, 2001, p.50). Examination results are the recognised
such views.
Steers (1994) points out: ‘… while governments might have a legitimate interest in
Examinations act as disciplinary tools, punishing and rewarding, and also act as a means of controlling the number of students entering higher education. The study
of predictive validity in the case of art and design GCE/A level examinations might
- Do higher education courses in art value GCE/A level art and design
examinations results?
For students, examinations are not synonymous with fairness; often they perceive
examinations as biased, unjust, unfair and depending on luck (Shohamy, 2001, p.14).
Examinations can have detrimental effects on students; they are used as disciplinary
tools, providing both rewards and sanctions (Foucault, 1979). Student voices are
to appear through the establishment of codes of practice (e.g. Code of Ethics for
may prove relevant to analyse the views of examination users about the ethical
aspects of examinations.
examinations have within the contexts in which they are used. Examination
developers should operate with the aim that their examinations do not have a
negative impact and, as far as possible, strive to achieve positive impact. This
syllabus;
moderators)
might provide evidence to demonstrate that the examination is or is not
2.4. Practicality
Practicality is concerned with aspects such as convenience, flexibility and cost effectiveness. According to Steers (1994) these factors are important because
awarding bodies and school centres have to operate within the overall time and
resources available. Aspects such as the type and quantity of students' work to be
included in an art examination, the number of assessors, and the period of the
examination are all important considerations in the context of finite fiscal constraints.
• Central and local costs in terms of: production of question papers, marking
• Security, transport and storage of materials;
The examination constraints described in this chapter are used here as the basis of a
framework (see Figure 1 below) for evaluating the validity, reliability, impact and
practicality of art and design examinations, taking into account the identified major
• Validity
Aspects concerning: (1) the extent to which the design of the examination
understanding and skills in art and design, (2) the constructs used, (3) the
of format, directives, standards and scoring criteria, (5) the relationship with
- Using methods and procedures that may prevent candidates from performing
in the way intended. Using tasks and marking procedures that do not fully
and skills. Using tasks and marking procedures that encourage and reward
and outcomes.
gender.
• Reliability
different concepts and distinct philosophies in art and design education, for
example in the design of the instrument, standardisation, moderation and post-moderation processes.
- Instructions and procedures for marking that are unclear or weakly defined.
• Impact
examination results are interpreted and acted upon, together with the after-effects
(referring to art and design knowledge, understanding and skills) which are
results.
- Inauthentic assessment instrument: tasks that do not enable the candidates to engage their artistic skills and knowledge in an authentic way.
Summary
In this chapter the reliability, validity, impact and practicality of art and design
examinations have been discussed in order to establish the key problems and issues
using tasks and criteria that place one or other group of examinees at an advantage.
established and, finally, questions about the convenience, flexibility and cost-practicality of the system. The chapter ends with a proposed conceptual framework
to evaluate essential aspects of an art and design examination through document
[Figure 1: Conceptual framework for evaluating art and design examinations. It relates:
• Validities: content validity, face validity, response validity, bias, construct/theory-based validity, predictive validity and concurrent validity;
• Reliabilities: inter-rater reliability and intra-rater reliability;
• Impact: contexts of use; effects on society and individuals, on schools, on the curriculum, on teachers, on higher education and on students; feedback; ethics of examinations;
• Assessment instrument: type, format, directives, content, question papers, tasks, assessment objectives, criteria, weightings, constructs and underlying models;
• Assessment procedures: processes, mark schemes, training, consensus, internal marking, standardisation, moderation, post-moderation, consistency in marking and grading;
• Practicality.]
Chapter 3
Design of the research
The first part of this chapter reports on the choice of a combined methodology
to be used for the purposes of the study and evaluation of validity and
reliability aspects in current art examinations in Portugal and England and for
the search and trial of new procedures to improve art examinations in Portugal.
The second part of the chapter describes the selected qualitative and
quantitative instruments and methods of data collection.
Three main areas of investigation were established corresponding to the overall aims
of this research:
goes on in each assessment culture but, more importantly, to the meanings behind the
practices. Initially this led to the choice of a qualitative research method. According
However, it was recognised that using a qualitative approach alone would limit the
different data collection instruments was considered necessary to obtain reliable
Taking into account the previous considerations, the research design was mixed-method, combining both qualitative and quantitative data in case studies. The main
research, although time consuming, was particularly useful for testing out the validity
3.3. Stages of research
[Figure: Stages of the research.
Stage 1: Review of literature; analysing aspects of validity, reliability, impact and practicality in current art examinations in Portugal and England.
Stage 2: Defining a new model of external assessment for Portugal; Trial 1: piloting in one school; evaluating the model; redefining it; Trial 2: trial in five schools; evaluating the model; examining the implications; suggesting improvements.]
3.3.1. Stage 1
Stage 1, which concerned the study of current art examinations in Portugal and England, was mainly conducted during 2001-2002. This stage was intended to
analyse aspects of validity, reliability, impact and practicality in both systems and
3.3.2. Stage 2
Based on the analysis and findings obtained during Stage 1 and an additional literature review, a blueprint for a new external assessment was designed (see p. 214). Rationales, draft
instruments and procedures were established and verified through a priori validation
taking into account the views and advice of experienced Portuguese and English art
and design teachers who provided useful comments on how to redefine the blueprint.
3.3.2.1. Trial 1
A first trial or pilot of the new assessment instrument was conducted in a Portuguese
secondary school with seven art teachers and 51 students who volunteered to
participate in late 2002 and early 2003. Five meetings were convened with the
teachers between 29th September 2002 and 18th January 2003 in order to discuss
assessment objectives, criteria and grade descriptors (see p. 221). The meetings acted
both as training and standardisation and were also used to provide feedback and
3.3.2.2. Trial 2
After the first trial evaluation and subsequent revision of the instruments and
procedures a second trial was conducted in five Portuguese secondary schools during
the period April-July 2003. A list of 15 art teachers potentially willing to participate
in the experiment was identified from the questionnaire responses (Appendix VII-2,
question 14.9). These were contacted and the purpose of the trial, activities and schedule were explained. Ten teachers agreed to participate with their students. All but one of their students accepted, making a total of 117 students. The trial included
After the trials, the experimental assessment instrument and procedures were
analysed. Data was collected through: (1) the researcher’s notes of school visits; (2) interviews with teachers and a sample of students; (3) teachers’ and students’ questionnaires; (4) teachers’ and external observers’ reports; (5) on-line teachers’
Stages | Dates | Research objectives | Instruments/sources | Samples
Stage 1 | 2001 | To examine international literature pertaining to the theoretical domain of assessment in art education | Literature review | –
Table 1: Plan of action
The documents analysed included secondary source documents and primary source documents; the latter included articles and official publications by bodies such as:
(Edexcel); Oxford, Cambridge, and RSA Examination Board (OCR); the Assessment
and Qualifications Alliance (AQA); the Joint Council for General Qualifications; the
Access to documents that were not in the public domain and permission to use them
for the purposes of this research were formally requested from the various assessment
including documents published on the Internet in the web sites of the various
organisations.
The principal documents for analysis were concerned with relevant legislation,
tests from the Portuguese national assessment department, the English Qualifications
and Curriculum Authority (QCA), and the English awarding bodies. Other
documents used included minutes of meetings, working papers, examiners’ and
moderators’ reports, both in Portugal and in England. The documents also enabled
the identification and selection of individuals who contributed to the research. Both
the literature review and the document analysis provided information with which to
3.4.1.2. Observation
key features and strategies for testing in these areas. She suggested that researchers
artworks. The key assessment events, identified through document analysis led to
identification of and contact with the relevant assessment experts who provided
Mansfield, Nottinghamshire in May 2001 and July 2002. The researcher wanted to
understand how moderators were prepared; what kind of processes were used to
exemplars were used and how the marking of such exemplars was explained. The
Edexcel art assessment coordinator. During these meetings the researcher acted as a
conducted with English art examination stakeholders (for details of observations
3.4.1.3. Interviews
in the search for meaning. The purpose of the interviews was to understand practices
from the point of view of stakeholders and to obtain in-depth information about the expectations and problems experienced by participants. The initial plan was to use structured and unstructured interviews in order to allow participants ‘to tell their stories’.
and
lives. The interviews in this research provided a rich array of data about users’
perceptions of art examinations (Appendix IX). It was hoped that they would enable
standardisation and moderation and ways in which results were utilised by pupils,
In stage 1 the purpose of the interviews was to obtain key stakeholders’ perceptions
about the current art examinations. An interview schedule was designed to gather the
most useful information, without spending disproportionate time, by organising the
questions into sections on the validity and reliability aspects of assessment identified
in Chapters 1 and 2: (1) assessment instruments, (2) assessment procedures and, (3)
impact of assessment results (Appendix IX, 1.). The schedule was piloted with an
English art teacher moderator in order to identify potential problems; the pilot sought to test out the interpretation of questions by the respondent and helped to refine them. Forty-seven questions were selected, although not all of them were later used; they
Selecting the sample for interviews was crucial to the research because the
participants' responses were being used to validate or negate the findings from
A letter explaining the purpose of the research and asking if they would collaborate
was sent to potential participants and the arrangements for the meetings were made
through letters and emails. The interview sample for England included six English art
teachers and one art student. Since in Portugal questionnaires were conducted with a
large number of students and art teachers, a decision was taken to conduct a life-history interview with only one art teacher and examination developer who had been
Role | Name | Sex | Age | Region | Teaching experience (years) | Examinations experience (years) | Interview date
Curriculum and test developer (Portugal) | Isabel | F | 69 | Lisboa | 45 | 34 | 7-05-2001, 14.00h–18.00h
The interviews in Stage 1 were conducted in 2001, at the interviewees’ homes, and
varied between one and four hours in length. The interviews took the form of long
conversations and it was found that this strategy helped the collection of data in a
non-prejudiced and non-prescriptive way. The researcher’s role in the interviews was
one of orientation to a given theme; the approach was like a teaching situation in
which respondents taught the interviewer about facts and personal perspectives.
The interviews were tape-recorded and notes were recorded in a research journal at
the time that events and processes were witnessed. Later, all the interviews were
transcribed in the original language. After the first analysis the researcher’s notes and
preliminary findings were shared with the interviewees and feedback and further
Some interviews were quite long and participants referred to many details and examples.
After transcription, the interview data was re-organised to relate to the three main
The interview data was described and summarised by reducing the statements to
those aspects that emerged as being essential through the recurrence of terms,
In stage two it was decided to use both individual and group interviews for students.
Focus group interviews seemed to offer advantages because they enabled interaction
order to focus on essential issues of validity, reliability, impact and practicality of the
sections: (1) instrument and criteria, (2) contents and resources and, (3) ownership of
the assessment (Appendix XI). The schedule was piloted during trial 1 with ten
students after they concluded the portfolios for the new examination and it was found
that the interview schedule did not obtain very rich responses, so a decision was
taken to use open interviews during the main trial (trial 2).
A sample of students was selected in four schools for group interviews; each teacher selected five student volunteers for interviews, trying to represent boys, girls,
(five student group interviews including a total of 31 students). The interviews took
place in the different schools’ art rooms after the portfolios’ moderation procedures;
Figure 4: Sample of students for interviews in stage 2
3.4.2.1. Questionnaires
In stage one a survey questionnaire was designed in order to obtain students’ and teachers’ perceptions of the current art examinations (Appendix VII). The questionnaire sought to ascertain, first, the degrees of validity, reliability,
impact and practicality of the Portuguese art and design examinations and second, to
based on data obtained from the document analysis and through the lens of the
framework for evaluating art and design examinations described in Chapter 2 (p. 76).
instruments, assessment procedures and impact were used to design the questions
developing each one of the conceptual issues. Nine sections were developed for the
students’ questionnaires: (1) general information, (2) syllabuses, (3) criteria, (4)
procedures,
(7) reliability, (8) assessors, (9) impact. Teachers’ questionnaires included three
more sections: (10) assessing, (11) feedback and (12) in-service teacher training.
The questionnaire was piloted with 10 teachers during an art teachers’ congress
(Apecv, Evora; March 2002) and later with 10 students from design courses at the
After piloting the questionnaire some small language revisions were made in order to
clarify the language, for example technical words such as ‘reliability’ were replaced
October-November 2001, in two different versions: (1) for secondary art and design
teachers with 83 questions (Appendix VII, 2.) and, (2) for students who took
national art examinations in subjects related to art and design with 67 questions
teachers and art assessors in Portugal (approximately thirty percent of the 350 art
teachers who teach 12th year art classes). It was also intended to include a
who took national art examinations in 2001) from different art courses in universities
across the country. Questionnaires were posted to: (1) students involved in the 2001
national art examinations, (2) all the teachers in the art departments of secondary
schools identified as having art courses. Questionnaires for students were posted to
those attending art teacher training courses at the main Portuguese universities and
Although pre-stamped envelopes were enclosed, initially it was found difficult to
obtain the questionnaire responses, especially from art teachers, even after sending follow-up letters. So other approaches were devised. The researcher delivered
questionnaires during Portuguese congresses and conferences on art and design held
in 2001 and 2002. Finally, by the end of 2002, a forty per cent response rate was achieved. Forty-four art assessors and one hundred and four students (68 students from general
courses; 36 students from technical courses) replied to the questionnaires. The final
perceptions of the new assessment instrument during the first and second trial. The
questionnaire schedules for students (Appendix XV) and for teachers (Appendix
XVI) were written in accord with the conceptual framework for art examinations (see Chapter 2, p.76); the concepts of the framework were used to develop a series of
questions focusing on the evaluation of: (1) the portfolio – instructions, format, tasks,
schools, and the curriculum. The questions were revised after a pilot because some
questions were repetitive and some terms were replaced in the interests of achieving greater clarity; finally, the questionnaires included 64 questions for students and 93 questions for teachers. All students and all the teachers involved in the trials
assessment instrument.
Statistical information about examinations and appeals was collected from the press
assessment results was sought through a mock examination marking results (MTEP
The different types of analysis used were as follows: (1) Content analysis for
documents, observation notes and interviews (see Appendices IX, X); (2) Descriptive
statistics for questionnaires (see Appendices VIII and XVII); (3) Multi-faceted Rasch
analysis for marks of the current examination students’ MTEP mock examination
scripts and students’ portfolios in the new assessment (see Appendix XXIV).
in Portugal and England in stage one and to understand teachers’ problems with the
Documents are not neutral. In the course of the research, the documents used for describing and
explaining different assessment systems were analysed taking into account the
ideologies underpinning them. First the contents were summarised to permit rapid review; then an analysis of the frequency with which textual items (words, phrases, concepts) appeared in the text was made; and finally interpretation and meaning were constructed (see example
Appendix VI).
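The frequency-count step can be illustrated with a short sketch. This is not the procedure used in the study (which was carried out by hand on the actual documents); the phrases and sample text below are invented for illustration:

```python
from collections import Counter
import re

def frequency_of_items(text, phrases):
    """Count how often each phrase of interest appears in a document."""
    lowered = text.lower()
    return Counter({p: len(re.findall(re.escape(p.lower()), lowered))
                    for p in phrases})

doc = ("Moderation meetings improved consistency. "
       "Visual exemplars supported moderation and standardisation.")
counts = frequency_of_items(doc, ["moderation", "visual exemplars", "standardisation"])
print(counts["moderation"])  # 2
```

Ranking the resulting counts gives the recurring textual items from which interpretation can then be constructed.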
stage one were essentially analysed by crosschecking the information obtained with the
procedures were applied, it was sought to see how many visual exemplars of
students’ works were used in standardisation meetings, the type and format of the
exemplars, how the assessment matrix and criteria were explained to the moderators
In stage two the observations conducted in the five schools of the trial 2 were used
participant teachers’ and students’ interviews; it was sought to understand the school
A decision was taken to adopt an interpretative approach to analysing interview data.
meaning from participants’ words and actions (Cohen & Manion, 1994, p.27).
(2) The meaning of categories was shared and discussed with participants in order to
test the credibility of the research (Lincoln & Guba, 1985; Robinson, 1993).
theory.
The data was transcribed and condensed into analysable units. Labels or tags were
created in the form of codes for assigning meaning to the descriptive or inferential
information compiled during the study. The codes were useful as a way of providing
a means to achieve the generation of concepts (Gough & Scott, 2000, p. 339).
The codes were first defined in accord with the research questions and conceptual
framework (see Chapter 2, p.76); they were used to retrieve and organise words,
setting. However, the codes were not fixed from the start; they developed during the study, in the form of sub-codes or even entirely new ones such as ‘teacher aid’
Codes served to compact and cluster segments of information, in the first stage of
‘assessment procedures’, and ‘impact’. In the second stage they became more
‘effects’. Finally more explanatory codes were defined according to patterns
between judges’, ‘shared language’, ‘teacher aid’, ‘teacher training’, ‘causes of bias’
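The code-and-retrieve logic described above can be sketched as follows. The segments and code names here are invented examples, not data from the study, and the actual coding was performed manually rather than in software:

```python
# Each interview segment is tagged with one or more codes; segments can then
# be retrieved and clustered by code. Segments and codes are hypothetical.
segments = [
    ("The marking criteria were unclear to us", {"assessment instrument", "criteria"}),
    ("Standardisation meetings helped a lot", {"assessment procedures", "teacher aid"}),
    ("Exemplars made the standards visible", {"assessment procedures", "visual exemplars"}),
]

def retrieve(code):
    """Return all segments tagged with the given code."""
    return [text for text, codes in segments if code in codes]

clustered = retrieve("assessment procedures")
print(len(clustered))  # 2
```

New codes (or sub-codes) are added simply by tagging further segments, mirroring the way codes evolved during the study.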
[Figure: Network of stage 1 codes. ‘Assessment instrument’ is linked to instructions (clear instructions, bias, teacher aid), format (validity, authentic, modular, over-assessed, positive effects) and criteria (interpretation of criteria, shared language); ‘assessment procedures’ is linked to standardisation, training, visual exemplars, moderation, reliability, positive effects and consensus between judges.]
Diagrams and other graphical forms were devised to organise the coded segments of
links. Networks are helpful to show complex interactions of variables arising from
statistical and content analysis in order to create a narrative or define a final matrix.
them in boxes. A network can be seen as a map of the set of boxes one has chosen to
Portfolio instructions Teacher aid
bias
time resources non prescriptive tasks
Improve validity
flexibility Independent study Need
tasks previous
preparation
Self-assessment Assessment
criteria for learning
Assessment responsibility
procedures ownership Students’ voice
In-service teacher
Improve training resources Visual dialogue
reliability exemplars
time
ownership
Shared language
Practicality
Need
Less practicality expensive resources
Need time
3.5.4. Descriptive statistics
The questionnaire data was analysed using the statistics package SPSS. The purpose
of the questionnaires was not to establish causal relationships but to describe
perceptions of validity, reliability and impact in assessment. Overall the principal
data analysis used descriptive statistics, although some correlation tables were made
and T-tests were calculated to examine potential relationships, for example between
sex and tasks (Chapter 7, p.267).
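The kind of comparison reported here can be sketched as follows. This is a minimal illustration in Python rather than SPSS, and the marks are hypothetical values on the 0-20 scale, not data from the study.

```python
# Minimal sketch of descriptive statistics and an independent-samples
# t-test of the kind described above (hypothetical marks, 0-20 scale).
from math import sqrt
from statistics import mean, variance

marks_female = [14, 12, 15, 13, 16, 11, 14]
marks_male   = [12, 13, 11, 14, 10, 12, 13]

def t_statistic(a, b):
    """Pooled two-sample t-statistic (equal variances assumed)."""
    na, nb = len(a), len(b)
    sp2 = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    return (mean(a) - mean(b)) / sqrt(sp2 * (1 / na + 1 / nb))

print(f"female: mean={mean(marks_female):.2f}")
print(f"male:   mean={mean(marks_male):.2f}")
print(f"t = {t_statistic(marks_female, marks_male):.2f}")
# Compare |t| with the critical value (about 2.18 for df=12, alpha=.05)
# to judge whether the difference between the groups is significant.
```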
In order to analyse the degree of marker reliability in the current Portuguese art
examinations, a sample of thirty-six MTEP students’ scripts was collected from a
mock MTEP examination conducted with one class at the pilot school on 28
September 2002. During the implementation of the new assessment, thirteen
portfolios were collected during the pilot and twelve during the main trial. The
scripts and portfolios were selected in order to obtain a wide variety of
performances, excluding scripts and portfolios marked zero and twenty, because null
and perfect achievements are not useful for multi-facet analysis. The scripts were
marked by ten art examination assessors selected from the survey questionnaire; the
teachers were selected in order to obtain a sample differing in age, gender and
experience of assessment. Each teacher assessed nine scripts on two different
occasions; all the teachers marked six of them. The scripts were posted to the
teachers, who marked them independently. The sample of portfolios was developed
by the students during the pilot and trial and marked by all the participating pilot
and trial teachers during 2003.
Many logistic models have been developed for estimating the reliability of tests. Of
the known methods of estimation, the Rasch model has provided particularly useful
information for assessment developers (Baker, 1997, p.24). Rasch, who developed
his models in the early 1950s, defined two specific types of parameters (one for
persons and one for items), thereby introducing a new concept of measurement.
According to Baker (1997, p.58), the measurement principle with which Rasch was
primarily concerned was the possibility of comparisons between persons which do
not depend on the particular items used, and comparisons between items which do
not depend on the particular persons tested. Baker pointed out that Rasch’s models
have certain common characteristics: ‘each has two parameters, one identified with
person ability and the other with item (or test) difficulty, and the ability and difficulty
parameters are in each case “separable” in that they can be estimated independently’
(Baker, 1997, p.62). Many further developments of the Rasch model have been used but, in
the context of this study, it was important to determine the effects of raters in
performance-based assessment because the focus of the analysis was marker
reliability. This was possible using multi-faceted Rasch measurement, because the
interactions between the rater and the rating scale could be calibrated, permitting an
individual analysis of rater performance. Since the researcher did not have a strong
background in statistics, specialist guidance was sought. According to Weir (2003),
multi-faceted Rasch offers a ‘…sophisticated way of looking at degree of overlap
between raters but it also provides evidence on level of marking and is a systematic
method for calibrating scores to iron out differences occasioned by inter-marker
differences’.
Multi-faceted Rasch analysis can be conducted with dedicated software (Linacre and
Wright, 1992). In this research a free sample version of the programme (Unifac) was
used to examine the variation associated with the raters. Therefore, three facets or
factors of variation of the assessment setting were investigated: candidates, items
and raters. Each rating can be seen as the outcome of the interaction of three
elements: the ability of the candidate; the difficulty of the item; and the
characteristics of the marker. Data about each facet was introduced into the
programme and, through the use of Rasch formulae, it was possible to bring them
together into a single relationship expressed in terms of the effect they were likely to
have on a candidate’s chance of success. Consequently it was possible to estimate
degrees of inter- and intra-marker reliability. As McNamara (1996, p.143) points out:
‘We can express the difficulty of items according to the likelihood of a candidate of
given ability getting a given score (or better) from a rater of given severity on that
item. Finally we can express the severity of raters in terms of the chances of the rater
awarding a given rating (or better) on a given item to a candidate of given ability.
The facets can thus all be brought together in a single frame of reference’.
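The single frame of reference McNamara describes is usually written as the many-facet Rasch model. The formulation below follows the standard rating-scale form used in the Rasch measurement literature; the symbols are conventional in that literature, not taken from the thesis.

```latex
% Many-facet Rasch model (rating scale form): the log-odds that
% candidate n receives category k rather than k-1 on item i from rater j.
\log\frac{P_{nijk}}{P_{nij(k-1)}} = B_n - D_i - C_j - F_k
% where
%   B_n : ability of candidate n
%   D_i : difficulty of item i
%   C_j : severity of rater j
%   F_k : difficulty of the step from category k-1 to category k
```

Estimating $B_n$, $D_i$ and $C_j$ jointly is what allows candidate ability, item difficulty and rater severity to be placed on one scale and compared.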
During the research it was acknowledged that the bias of an individual researcher
interferes with the reliability of the data (Cohen and Manion, 1994; Creswell,
1998). The researcher’s professional background was particularly pertinent to this.
The researcher, a Portuguese art teacher of twenty years’ experience in secondary
education, had been involved as a test developer of Theory of Art question papers in
the Portuguese national examinations between 1994 and 1997 and had been an
assessor of art examinations in MTEP and History of Art for several years. Her
disappointment with the experience of the Portuguese national examinations, and
more broadly with the current assessment system, was the reason for starting the
research and was also a potential source of bias. Awareness of such bias influenced
the conduct of the researcher during the two stages: she tended to adopt a distanced
role, trying to avoid passions and memories from past experiences, trying to
understand the current systems of external art assessment in England and Portugal in
terms of their historical contexts and others’ perceptions of them, trying to
implement the new suggestions, and trying to evaluate the current systems and the
new one impartially. However, such ideal conditions might not always be respected,
and from the very beginning of the research design this led to the creation of
strategies for data validation.
3.7. Triangulation
Triangulation addresses the main aspects of validation in qualitative research and
was particularly helpful in this study. According to Sullivan (1996), the criteria for
assessing the viability of qualitative findings are not so much a matter of whether
outcomes are statistically significant but of whether they are credible and useful.
Taking into account Sullivan’s (1996) views and the metaphorical description of
Richardson (1994), triangulation procedures were used to verify and validate the
findings, for example by making use of both quantitative and qualitative data.
The data collected through the different instruments was correlated and compared.
Theory in the literature was used for supplemental validation, referencing the
literature to confirm the accuracy of the findings. Verification of the interview and
observation data was conducted by interacting informally with the participants;
these processes mostly confirmed, but in some cases raised doubts about, the data.
Another strategy used for verification purposes during the pilot and trial was
inviting external observers to attend the sessions; their reports reflected upon the
experiences they had witnessed (see Appendices XIII and XXIII). The external
observers, Graça Martins and Leonardo Charréu, were recognised experts in the
field of art education.
The study investigated public documents, and documents which had been authorised
to be used in the research by their owners or were published in the public domain.
Anonymity of participants was guaranteed by the use of fictitious names for persons
and schools. Participants’ permission was given to publish the written accounts of
interviews. The protocols developed for interviews and trials referred to the
following matters:
- Participants’ right to withdraw voluntarily from the study at any time.
- The central purpose of the study and the procedures to be used in data collection.
- A space provided for participants to sign and date the form to confirm their consent.
An important ethical concern is the extent of the potential impact on the lives of the
subjects of the research. This may be particularly true in the case of studying
assessment, where the relationship between researcher and participants can make it
difficult for the former to remain neutral. Torrance’s (1989) writing about his
assessment research reveals a number of important points that were taken into account:
There have been many times when I have attended marking and
moderation meetings and been asked my opinion of pieces of work.
Given that one is often perceived as 'expert' in assessment, or even
'from the board', it is sometimes difficult, though not impossible, to
avoid being drawn into discussion. More acute still however, are the
occasions when one may actually have an impact on the performance
and subsequent grade awarded to candidates – vulnerable minors
placed in an already threatening and stressful situation (p.177).
During the trials of the assessment instrument and visits to schools, the researcher
tried to intervene as little as possible. It was clearly explained to the participating
teachers that the researcher’s role was that of an observer. Nevertheless, on some
occasions there was interaction between researcher and participants, for example
when students and teachers formulated questions and posed them to the researcher
about their projects and about the projects that other students were carrying out in
other places. It is possible, nevertheless, that the researcher did influence events by
giving some indication of how other participants were carrying out their work
projects.
Summary
The research was carried out in two stages. The first, which investigated the current
situation in art examinations in Portugal and England, was carried out in 2001-2002.
The second, which focused on the creation and evaluation of a new assessment
instrument for Portugal, was carried out in 2002-2003. The design of the research
combined qualitative and quantitative methods. A survey questionnaire was used in
stage one to establish art teachers’ and students’ perceptions of the current
examinations. The researcher was a participant observer in the trials of the new
assessment instruments and procedures developed and tested in Portugal in stage
two, and the data collected for the purposes of joint evaluation of the model by the
researcher and participants included students’ and teachers’ questionnaires and
interviews and students’ portfolio marks. Documents and interviews were analysed
qualitatively (content analysis). The data as a whole was coded descriptively first
and then by more interpretative and explanatory categories. Codes were developed
from this base in order to describe and explain issues in assessment, and findings
were shared with participants to arrive at credible conclusions. Descriptive statistics
were applied in the analysis of the survey questionnaire data using SPSS, and the
analysis of marker reliability in the current system and in the new model developed
and trialled during the research applied multi-facet Rasch measurement.
Chapter 4
The first part of this chapter describes the current art and design examinations in
Portugal and evaluates them in terms of quality criteria such as validity, reliability,
impact and practicality. The second part of the chapter presents the analysis of the
current assessment model and the analysis of users’ perceptions of it.
For a long time, assessment in art at pre-university level was exclusively centred on
knowledge of technical drawing. In Portuguese education the first art examinations
for students prior to university entrance date from 1836 and were based on technical
drawing exercises. This tradition was very strong in Portugal and shaped the form
and content of the present day examination in ‘Descriptive Geometry’.
An exploratory interview was conducted with one of the pioneers of art programmes
in Portugal in the 1970s and 80s to understand the history of the art examinations:
There were two periods of external examinations: at the end of the 5th
year (age 15) and at the end of the 7th year (age 17). I remember when I
passed my 5th art examination at the ‘liceu’; we had to draw a pot and we
had another question paper in geometry. The art exams for the 5th year
were usually ornamental drawing or still life and a geometry test. In the
7th year the examination was a single test on geometry (Isabel, Lisboa, 7-
05-2001).
Since the revolution of 1974 official policies have regulated the Portuguese
examinations at pre-university level. Between 1974 and 1980 the country was living
through a period of political change after the fall of the dictatorship. Society was
taking the first steps towards a democratic political regime, and under such
circumstances the previous division of secondary education, between ‘liceu’ courses
designed to train an elite and technological courses whose students had no access to
university entrance, was questioned. This separation between courses and schools
was eliminated after 1975. In 1978 the 10th and 11th years of the ‘complementary
course’ were created (Despacho normativo nº 140-A/78). In 1980, the first 12th year
examinations replaced the ‘ano propedêutico’. The ‘complementary courses’ had
two modes: (1) ‘via de ensino’, designed for students intending to enter university,
and (2) a vocational pathway, but students in mode 2 could also gain access to
university. Assessment comprised internal assessment in each specialism during the
course and external examinations in the core specialisms taken in the 12th year
(Despacho nº 67/81). In the early 1980s, national external examinations did not
determine university entrance; higher education institutions selected their students
using schools’ internal assessments and their own entrance examinations. For
example, Fine Arts schools offering painting, sculpture and architecture courses
required students to pass a day-long test of figure or still life drawing in order to be
admitted to the course. However the secondary school art curriculum did not
prepare students for such tests, as Isabel recalled:
After the ‘liceu’ we had to pass an entrance exam in Fine Arts; and
for that we took extra courses in representational drawing (figure
and still life drawing) with a painter or a sculptor. I remember my
entrance examination in Fine Arts. We had two days of intensive
charcoal drawing from casts (Isabel, Lisboa, 7-05-2001).
The art curriculum in secondary education progressively evolved from a vision of art
centred on technical drawing towards a set of separate specialisms. Drawing and
History of Art were introduced with their own syllabuses:
The History of Art only appeared after the 1970s in the secondary art
curriculum; there was reluctance to accept it for entrance because they
valued more the descriptive geometry (Isabel, Lisboa, 7-05-2001).
Drawing as a specialism was introduced in the curriculum for the last years of
secondary education (former 6th and 7th) in 1977. After the analysis of drawing
syllabuses it was found that the content focused on the formal elements of visual
language, media and technical skills. The approach was essentially formalist and the
key concepts for teaching came from Weimar Bauhaus teachers such as Johannes
Itten, Wassily Kandinsky, and Paul Klee. Art appreciation included consideration of
the formal elements of art and design such as line, tone, form, colour, and texture and
the exercises were driven by these visual elements and the ‘rules’ of composition.
In 1986 national examinations in the three core specialisms of each subject were
introduced (Despacho /SERE/90). The format of these examinations was a timed
test (120 minutes in the case of Geometry and Drawing and 90 minutes for the
History of Art paper). University entrance depended on: (1) the secondary education
internal assessment results, (2) external examinations in the core specialisms, and
(3) the results of a test called the ‘General Access Test’: ‘Prova Geral de Acesso’
(PGA), covering general cultural knowledge. The general access test and core
specialism examinations were externally set and assessed by teachers. The question
papers were related directly to the syllabus content because, as Isabel confirmed,
'the question papers were designed by the same people who wrote the syllabuses’.
These examinations have not changed since 1981 and still preserve their original
format.
From an analysis of 18 question papers from 1981 to 2002 it was evident that the
general structure of the Drawing question paper was preserved unaltered throughout
this time period. The question paper was a short timed test (120 minutes) centred on
understanding the so-called formal qualities of art and design. The visual tasks were
based on visual elements such as line, form, texture, pattern, balance, composition
and rhythm. The recommended medium was pencil, and the question papers
required recognition of formal qualities in reproductions of artworks, usually from
the twentieth century. The weighting was usually 70 percent for visual tasks and 30
percent for written tasks. The criteria for assessment included a list of model
answers for the written questions and ambiguous criteria for the visual tasks related
to abilities for exploring visual elements ‘expressively’. The assessors marked them
using a holistic process for judging technical and expressive drawing skills.
These examinations continued in the early 1990s until the introduction of the reform
of secondary education that followed the general law for the education system. New
regulations for assessment in secondary education were approved in 1993
(Despacho nº 338/93) and the first examinations under these new regulations started
in 1995/96 for students in the 12th year of schooling.
The assessment documents which have regulated secondary education since 1993
derive from the Education System Law (Lei de bases do sistema educativo nº
46/86). The first aim of this assessment model was ‘to promote students’ educational
success’. At the same time, external assessment using more ritualised forms of
control seemed to serve the functions of selection and accountability, based, it is
claimed, on parity of treatment as a guiding democratic principle. And this so-called
parity of treatment required uniformity in the art and design examinations.
It was important that all students should have the same object [to
draw] to put all the students in the same situation. You know, if
students or schools arrange their own models for drawing some
would be more difficult than others and it wouldn’t be fair (Isabel,
Lisboa, 7-05-2001).
But in art and design such prescriptive methods may conflict with the diverse and
creative outcomes that the curriculum objectives for art education claimed to foster
(Eça, 1999).
Internal summative assessment was described as a ‘global judgement’ about student
learning, serving to inform students and parents about achievement. Internal
assessment was more or less continuous, with marks awarded three times per year.
The results were expressed as marks for each specialism and were awarded by
students’ own teachers during teachers’ assessment meetings that took place at
Christmas, Easter and at the end of the course year. The last meeting of the year
established the students’ final marks. External and internal assessment were
combined for certification: the final classification comprised the final internal
assessment results for each discipline and the examination results in the core
specialisms, the latter weighted at 30 percent of the final mark.
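The certification arithmetic implied here can be sketched as follows. The 30 percent examination weighting is stated in the text; the assumption that internal assessment supplies the remaining 70 percent, and the marks themselves, are illustrative.

```python
# Illustrative computation of a final classification on the 0-20 scale,
# assuming the examination counts for 30 percent and internal
# assessment for the remaining 70 percent (hypothetical marks).
def final_mark(internal: float, exam: float, exam_weight: float = 0.30) -> float:
    """Combine internal and examination marks (both on the 0-20 scale)."""
    return round((1 - exam_weight) * internal + exam_weight * exam, 1)

print(final_mark(internal=16, exam=10))  # a strong internal mark pulled down by the exam
print(final_mark(internal=12, exam=18))  # an exam result raising the final mark
```

The sketch makes visible why large internal/examination disparities, discussed later in the chapter, matter: with a 30 percent weighting the examination can shift the final classification by several points.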
The principles of education followed since 1986 had been much in favour of
continuous and formative forms of assessment. However, the need for control and
accountability of the education system and for student selection prevailed
(Machado, 1994, p.53).
4.2.2.1. Instruments
The team responsible for designing the question papers comprised: (1) a director for
all the subject areas, (2) a coordinator for each subject area and for each test, (3) two
authors (usually secondary teachers) for each specialism, (4) two or more specialist
revisers (auditors), (5) two language revisers, (6) two text revisers, and (7) one
specialist adviser (consultor).
The question papers were written in accordance with a statutory assessment
framework, general aims and objectives for the Portuguese National Curriculum and
national syllabuses adopted for each specialism. The examination paper had a list of
questions with a mark scheme totalling 200 points corresponding to the mark scale
(0-20). The design of question papers included some procedures of validation, for
example: verifying the graphic aspect; checking clarity and the form of language;
checking if questions were related to students’ age and related to the syllabus
content. Language and specialist advisors carried out this kind of verification.
The role of the specialist advisers was to provide the expert validation considered
essential for art and design examinations, especially because they were not piloted.
At the same time that the examination papers were finalised, schools received the
list of syllabus contents to be examined, and teachers could focus their lessons on these contents. In January/February they
received a form called ‘matrix for the examination question paper’ from the GAVE,
with the content headings, an indication of the number and type of questions and a
list of the materials to be allowed for the examinations. Schools also received a
model (i.e. exemplar) test, usually during March/April. Teachers could prepare
students for the examination by using the model test either as a mock examination,
or by including items from the model test in their own tests during the year.
4.2.2.2. Procedures
The National Jury of Examinations (JNE) was responsible for the process of
delivering question papers and assessment. The JNE was composed of a President
and Vice-President and technical assessors from the secondary department of the
Ministry of Education. It dealt with special considerations for candidates with
special educational needs (SEN); its regional sections received the scripts sent by
schools, contacted the assessors and arranged the regional meetings where scripts
were distributed. Assessed scripts were returned to the centres, and the results were
sent to the central section of the JNE at Lisbon. The whole process was centrally
controlled.
All examinations were taken at exactly the same time on the same day all over the
country. There were three examination periods. Until 2004 candidates could opt for
the ‘first call’ (1ª chamada, at the end of June or early July) or the ‘second call’ (2ª
chamada, in mid to late July); another call took place in September (2ª fase) for
candidates who did not pass or were not able to attend the previous examination
periods. In 2004 the ‘2ª fase’ was abolished. Detailed regulations guaranteed the
uniformity and the security of the process. In the instructions for the conduct of
examinations (Departamento do Ensino Secundário, 1996), detailed information was
given to schools about timings; how to arrange the rooms; how to invigilate the
examinations; the number and roles of personnel involved; the type of materials that
students were allowed to use; the instructions for checking students’ identity cards;
instructions for filling in the examination response papers; the exact time to open
the sealed packets of question papers (and by which officers); how to attach the
pages of scripts; and the exact time to collect scripts.
During the examinations, government inspectors might visit schools. In each room
there were two invigilators: teachers who were non-specialists in the examination
subject. A specialist teacher remained available in the school to resolve practical
problems that might arise. Students were not permitted to leave the room during an
examination.
Teachers usually had to mark the examination scripts of students from other schools
in the cities and towns within the same geographical region. The process was
coordinated at local level by school centres in each area. Schools had to propose a
number of teachers as markers, in proportion to the number of candidates registered
for each examination, to the school centres. Meetings with all the teachers/markers
attached to school centres were convened to receive students’ scripts and discuss the
criteria for assessment. These meetings were supposed to establish dialogue between
markers about the application of the criteria.
The assessors returned the scripts with completed marking forms to the school
centres. The JNE collected the marks and sent them back to schools by the end of
July. Candidates could ask for a review of their examination marks (appeal) and
were required to explain the reasons for the request in a written report (Despacho
normativo nº 13/2002). Another teacher from the school centre (relator) dealt with
the appeal and re-marked the question paper. The second mark prevailed. If the
candidate was not satisfied with the result, he/she had one more opportunity to
appeal. Large numbers of marks were changed following appeals: 11,776 marks
were changed to a higher grade and 979 to a lower one, and it was acknowledged
(CNEES, 1998) that there was a need to pay more attention to the reliability of the
marking.
In 2000, 72 percent of the candidates asked for appeals. In 2001 the Jornal de
Notícias (25 April) published an article about appeals during the 2000 examinations
based on statistical data which revealed that candidates had a strong chance of
increasing their marks on the second appeal. For example, in one secondary school,
61 out of 72 marks were increased. Particular cases of candidates asking for a second
appeal were highlighted; for example one candidate, who obtained 13 points in the
History of Art paper, appealed twice, maintained the score in the first appeal, but
increased it in the second. Such cases drew public attention to the marking of the
question papers for subjects with large numbers of candidates in the core
specialisms for university entrance. Before 2001, examination results were not
released publicly but political pressures during 2001 forced the government to
publish the examination results of all 625 secondary schools. The league tables
generated huge public debate about the ranking of schools. The left wing deputies in
parliament claimed that publication promoted discrimination between schools; the
right wing deputies claimed that not publishing examination results was a plot to
hide mediocrity and leniency (Público, 26 April 2001). The results raised a
controversy about accountability and the quality of education. The Diário de
Notícias (27 August 2001) and Público (26 August 2001) drew attention to the great
disparity between internal assessment and examination results. The Diário de
Notícias article was illustrated by specific and often extreme cases.
For example one student’s internal mark in Descriptive Geometry was 18 points and
the examination mark was 6.2 points. But there were other examples where students
obtained much better marks during examinations than during internal assessment.
The newspapers’ interpretation of such discrepancies did not question causes related
to the assessment system itself. According to official data (Departamento do Ensino
Secundário, 2001), in 2001 the variation between internal assessment and
examination results was, for example: 3 points in History of Art; 2.5 in MTEP; 2 in
Theory of Design (general courses) and 0.3 (technological courses); and 1 point in
Theory of Art and Design. Such differences may not appear particularly significant
at first glance; however, these figures were not a true illustration of the examination
results, since the Departamento do Ensino Secundário did not publish other
statistical data such as minimum and maximum marks or measures of dispersion.
Art examinations between 1993 and 1996 were internally set and assessed, and
students’ final results for qualifications and university entrance were expressed
through internal marks. External examinations returned after the curricular revision
for secondary education that was implemented in 1996.
example; textiles; ceramics; product design [courses at
specialist art schools such as Antonio Arroio at Lisboa and
Soares dos Reis at Porto]. Art teachers from Antonio Arroio
and Soares dos Reis schools were also invited to help to
design the art curriculum for secondary education.
From an analysis of official documents (Ensino Secundário, 1995) a conclusion was
drawn that the structure of the art and design curriculum separated theoretical
specialisms from the practical specialism of Studio Art/Design and Technologies.
The main learning domains in each specialism were technical and perceptual,
although contextual and critical studies were also included.
The MTEP and Studio Art syllabuses presented a broad range of aims related to
knowledge, understanding and skills of art and design production, including
‘artistic …’, ‘critical awareness’, and ‘research’. However, making such a clear demarcation
between theory and practice could reduce the opportunities for students to develop
contextual and critical studies of art and design in studio art production.
The same demarcation existed in the recommended assessment instruments: written
tests were recommended for Descriptive Geometry, History of Art, Theory of
Design, and for MTEP, whereas in Studio Art/Design and Technologies the
recommended instruments for assessment were very flexible. National examinations
in the arts only existed, in Portugal, for specialisms widely viewed as more objective,
such as Descriptive Geometry, History of Art, Theory of Design, Theory of Art and
Design and MTEP. The national examinations in such specialisms reinforced their
status in the curriculum and at the same time emphasised technical drawing per se
and the history of art and theory of design as encyclopaedic subjects.
MTEP included a broad range of assessment criteria such as: ‘quality; quantity; …
skills; critical skills, inventive and intervention abilities; use of knowledge’
(GETAP, 1992, p.13).
These criteria were recommended for internal assessment, which was considered to
be both formative and summative. Each teacher marked his/her students’ work
according to his/her own interpretation of them. Besides the list of criteria and
objectives there were no other written documents to help students and teachers.
Compared with disciplines like Portuguese language and mathematics, art disciplines
had a small number of candidates. The total art examination entry in 2001 included,
for example, 7,168 candidates for Descriptive Geometry (general and technological
courses) and 8,435 for History of Art (general and technological courses) and
Theory of Design (general and technological courses). However this apparent
ranking could be because such specialisms were considered to be more objective.
The assessment instruments used for national examinations in the arts were uniform
written tests covering the syllabus contents for year 12 in each specialism. The
Descriptive Geometry, Theory of Design, Theory of Art and Design, and History of
Art written tests lasted 120 minutes, and the Materials and Techniques of Plastic
Expression (MTEP) test lasted 180 minutes. The Descriptive Geometry test focused
on representational systems. The History of Art, Theory of Art and Design and
Theory of Design tests required recall and contextual and perceptual analysis of art
and design objects and art and design movements.
The only specialism with question papers offering a more comprehensive range of
art and design tasks was MTEP. Analysis of past question papers revealed that the
structure of the test did not change over the years. The test included opportunities
for written and visual responses. The required tasks were an observational drawing
exercise (first part) and, in the second part, a simulation of a design task including
preparatory studies and sketches as well as a final solution and a written explanation
of the proposed intentions and materials. The criteria and weightings for assessment
in the examination did not vary greatly over the years; for example:
• Part I: Representational drawing (60 points): Ability to represent
the object faithfully (form and proportions); Personal interpretation
(expressiveness and global effect).
• Part II (1) Preparatory studies and sketches (60 points): Capacity to
present different alternative solutions (creativity); capacity to communicate
ideas in visual form; suitability of the techniques and materials used in the
sketches; expressiveness.
• Part II (2) Final solution (60 points): Adequate proposal for the
solution of the problem; Composition and organisation, Space-form; Plastic
qualities.
• Part II (3) Written Report (20 points): Description of the process;
Justification of the options; knowledge of the properties of materials and
techniques.
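The mark scheme above can be laid out as a short computation. The part names and point totals follow the list; the candidate's raw marks are hypothetical, and the conversion to the 0-20 scale by dividing the 200-point total by ten is an assumption consistent with the scheme described earlier in the chapter.

```python
# Weighted parts of the MTEP paper (points), as listed above.
parts = {
    "representational drawing": 60,
    "preparatory studies and sketches": 60,
    "final solution": 60,
    "written report": 20,
}
assert sum(parts.values()) == 200  # the mark scheme totals 200 points

# Hypothetical raw marks awarded to one candidate, per part.
raw = {
    "representational drawing": 42,
    "preparatory studies and sketches": 35,
    "final solution": 48,
    "written report": 12,
}

total_points = sum(raw.values())  # out of 200
final_scale = total_points / 10   # mapped onto the 0-20 reporting scale
print(total_points, final_scale)
```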
Isabel described some of the constraints faced by an examination developer. The
tight general regulations for examinations in Portugal left very little opportunity for
creating authentic assessment situations. The only question paper that included art
production, MTEP, was reduced to basic drawing exercises. According to the
developer of the question paper (Isabel), the test was created as a remedial strategy
to compensate for the lack of drawing skills elsewhere in the curriculum, and its type
and format were constrained by the general regulations for examinations. Isabel
claimed that under such conditions little more was possible:
But, suddenly they said that all the subjects in the specific
areas should have external examinations; it didn’t fit our
syllabuses; and worse, they said that examinations should be
just theoretical pencil and paper tests. We tried to convince
them to give more time for practical question papers; at
least in MTEP, we proposed 15 days for the examination
time, but they didn’t accept. So we had to create these
question papers for MTEP. We decided to focus on drawing
skills since drawing was absent from the art curriculum
(Isabel, Lisboa, 7-05-2001).
In the end they gave 3 hours for the MTEP examination. So,
we invented these question papers for MTEP. We wanted
students to make drawings and we had to think about a
model for drawing which could be easily sent by post for
schools and be the same for everyone. And we did manage
to make the cardboard model (Isabel, Lisboa, 7-05-2001).
The art teachers’ association ‘Associação de Professores de Expressão e
Comunicação Visual’ (APECV) also endorsed the view that MTEP art examinations
were not a valid reflection of the course content in terms of content, face and
construct validity:
In this discipline [MTEP] a pencil and paper test like the current one is
not valid to completely assess the knowledge, understanding and skills
of candidates in the arts (Parecer MTEP1.1.2002, APECV).
Another thing we asked for was a second assessor; but they refused
it. A second assessor would certainly improve the reliability of the
results; you know art examination results are not very reliable
(Isabel, Lisboa, 7-05-2001).
The fact that marking was a solitary task revealed a complete, but misplaced, trust in
individual markers, notwithstanding the percentage of appeals. At the same time
standards were tacit, since there was no explicit public information about them. The
written criteria for the arts provided by syllabuses, instructions for the examinations
and question papers were vague and open to interpretation. There was no formal system
for training assessors, and no formal meetings to debate or standardise the application
of criteria; the only meeting required by the examination instructions was the markers’
meeting convened at each school centre
for reception of students’ completed scripts. Copies of the minutes of these meetings
in the 2001 examinations were requested from five school centres after receiving
formal authorisation from the JNE. The school centres were: Porto Cidade; Coimbra
Centro; Lisboa Central; Faro and Viseu. The minutes’ agenda was defined by the
regulation nº 48/Norma 02/2000: (1) solving the question paper and discussion of
the various processes of solving the problems; (2) debate about the application of the
criteria provided by the GAVE; (3) report of the information required (phone or fax
request of information to GAVE or JNE) in case of doubts about the question paper.
Each school centre provided six sets of minutes corresponding to the different
meetings for MTEP and Drawing (1st and 2nd call and 2nd phase) – 30 in total.
Two minutes reported that there was no meeting because the markers did not all
attend at the same time. In 80% of the minutes the text was: ‘nothing to report’. One
of the few substantive comments was the following:
The assessors consider this test too easy for the 12th year. Some students
use this exam as a core discipline for university entrance [Drawing
question paper].
From the minutes, two serious concerns about the MTEP assessment instrument were
evident: the fact that in the previous year the MTEP examinations instructions were
not in accord with the question paper and the 2001 MTEP question paper was not
clearly legible because of graphic design errors. The minutes concerning the
Drawing question paper were also interesting, especially because the question paper was considered too easy for the 12th year.
Consideration of these minutes raised two different hypotheses: (1) that teachers
discussed the application of criteria but did not record the results of the discussion in
the minutes, or (2) teachers did not discuss the criteria at all. According to the
minutes, it seemed that teachers did not feel the need to ask GAVE or the JNE for any
explanations.
Isabel considered that the reliability of assessment results was a problem, created in part by the refusal to allow a second assessor.
In order to have precise data with which to check the inter- and intra-rater reliability
of the current examination, ten MTEP examination assessors were asked by the researcher
to mark scripts produced by 12th year art students. The assessors, selected from those
who had returned the survey questionnaire, marked the scripts between January and
April; they represented a range of geographical locations, sexes, years of teaching and
experience of art examination assessment. Each teacher assessed nine scripts on two
different occasions; all the teachers marked six scripts in common. The difference
between teachers’ marks was
from five to seven points on a scale of 20 (inter-rater reliability). The marking
difference with the same teacher (intra-rater reliability) varied from zero to six points
on a scale of 20.
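The inter- and intra-rater ranges reported above can be computed in a few lines of code. The sketch below is purely illustrative (the marks matrices are invented, not the study’s data): for each commonly marked script it reports the spread of marks across assessors (inter-rater), and for each assessor the largest change between the two marking occasions (intra-rater).

```python
# Illustrative only: invented marks on a 0-20 scale, not the study's data.
# Rows are assessors, columns are the six commonly marked scripts;
# one matrix per marking occasion.
occasion_1 = [
    [12, 8, 15, 10, 14, 9],
    [16, 11, 18, 13, 17, 12],
    [11, 7, 13, 9, 12, 8],
]
occasion_2 = [
    [14, 9, 16, 11, 13, 10],
    [15, 12, 17, 14, 16, 11],
    [12, 8, 14, 10, 13, 9],
]

def inter_rater_spread(marks):
    """Range of marks (max - min) across assessors, per script."""
    return [max(col) - min(col) for col in zip(*marks)]

def intra_rater_spread(first, second):
    """Per assessor: the largest change in mark for the same script
    between the two marking occasions."""
    return [max(abs(a - b) for a, b in zip(row1, row2))
            for row1, row2 in zip(first, second)]

print(inter_rater_spread(occasion_1))            # spread per script, occasion 1
print(intra_rater_spread(occasion_1, occasion_2))  # spread per assessor
```

Spreads of five to seven marks on a 20-point scale, as found in the study, would mean that a candidate’s grade could change substantially depending on who marked the script.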
[Figure: marking differences in the current examination]
The minutes of some MTEP examination appeals procedures were helpful
in understanding how teachers applied the criteria. Copies of 40 reports from appeals
(2001 examinations) were requested from five school centres (20% of the total
number of centres) after having received formal authorisation from the JNE.
Clearly, the marker was concerned about the student’s technical expertise.
Her marking approach was negative, pointing out what the student was
incapable of doing rather than what he/she could do. This is another example:
In group I there were some errors in the faithfulness of the form and
proportions and problems in expressiveness and global effect. In group
II, a) there were errors in creativity, capacity for visual communication
of ideas; problems in articulating techniques and materials; in the
sketches errors of expression. In group II, b) errors in articulating the
final solution with the problem, weak organisation, composition and
plastic treatment (551 report).
The marker seemed to use terms derived from the criteria in the reports, but phrases
such as ‘errors of expression’ are hard to decipher. In the next example, the same kind
of language appears:
In group II question (a). 2.5 points are awarded to the sketches because
there is no diversity of proposals that indicates a failure in creativity.
Besides the proposed materials are not the best indicated for the
solution, being too heavy, e.g. concrete and iron… (report 060).
All the judgements in the reports were based on academic criteria such as the use of
formal visual elements, techniques and materials. The main justification for awarding
marks was based on concepts of ‘composition’, ‘formal organisation’ and ‘graphic
expression’. In the more creative exercises the main justification was ‘the candidate is unable to communicate ideas
visually’; the reports did not mention if the candidate took creative risks in the
sketches of their proposals. Creative and expressiveness qualities were not described;
creative work was vaguely referred to, usually in relation to the number of sketches
and the appropriateness of chosen materials for the final product. The judgements
depended on overall impressions. But this was not surprising given the inadequate
criteria and the difficulty for teachers of translating their judgements about visual
products into written language.
4.4.1. Respondents
Questionnaires were used to gather opinions about art examinations and assessment in
Portugal (see Appendix VII). Forty-four art assessors and one hundred and four students
replied to the questionnaires. The final sample of teachers/assessors is summarised in
the table below.
[Table: composition of the teacher/assessor sample by age, sex, secondary school location and university course]
The respondents had different views about the value of the various art specialisms in
the curriculum (question 1.8). Sixty-eight respondents, 45.9 percent, considered that
all the art specialisms were important. For the others, Studio Art was considered the
most important specialism (11 respondents).
The existing aims of art specialisms were considered appropriate for art and design
courses (question 2.1), and most respondents considered that the objectives and
contents described in syllabuses were adequate for art and design courses (question
2.2). A further 51.1 percent considered that the art and design syllabus at school
provided a good basis for further studies (question 2.3).
There was no significant variation of responses between students (SR) and teachers
(TR) in the first two questions 2.1 (aims), 2.2 (objectives and contents). In question
2.3 which asked for a response to the statement: ‘The art and design syllabus at
school provides a good basis for students' further studies’, teachers and students
responded differently.
4.4.2.2. Opinions about the validity of the examination (Section 3 and 4)
While some respondents considered that the instructions (rubric) for examinations were
clearly stated (question 4.1), only 48.8 percent thought the questions were clear
(question 4.2). One student commented that the questions were ‘not very clear’ (SR 80),
although 4.3 percent of the respondents strongly agreed that they were clear.
The time allowed for answering the question paper was considered adequate by 57.2
percent of respondents (question 4.3). A majority agreed that the questions in the
examination paper were similar in their form and content to the tasks set in courses
(question 4.4); however, 9.4 percent strongly disagreed with this.
Of the respondents, 56.1 percent disagreed and 29.5 percent strongly disagreed with
the statement in question 4.5: ‘students are given a good opportunity in the
examination to show well what they know, understand and can do in and about art’.
The tasks set out in the question papers were not considered adequate: 58.7 percent
of the respondents disagreed and 27.5 percent strongly disagreed with the statement
in question 4.6: ‘The examination allows the most able students to demonstrate their
abilities’. A majority also considered that the examination did not allow students to
display the knowledge, understanding and skills set out in the syllabuses (question
4.8). And 55.7 percent of the respondents
disagreed and 17.1 percent strongly disagreed with the statement (question 4.9):
‘The examination assesses the knowledge, understanding and skills needed by future artists and designers’. Furthermore the current
criteria in the examinations were not considered clear by the majority of the
respondents, 55.8 percent disagreed and 18.1 percent strongly disagreed with the
statement in question 4.7 that: 'The assessment criteria in the examinations are
clearly stated’.
Teachers gave varied responses to the statement in question 4.4: ‘The questions
in the examination paper are similar in their form and content to the tasks set in
the classroom’. The same was the case for the statement in question 4.8:
‘The examination allows students to display the knowledge, understanding and skills
set out in the syllabuses’.
According to one student’s comment in section 10, the examinations did not cover
all the contents of the course and students could take advantage of this:
…registration and pass the exams as external students; this
has advantages for students because we only need to study
the contents of the exam question paper and not the whole
course syllabus (SR 18).
Table 11: Responses to question 4.13
Responses to questions about assessment (section 3) revealed that both teachers and
students considered it was important that students know and understand the criteria
being used to assess their work; 96.6 percent agreed with the statement ‘ It is
essential that students know exactly what qualities the teacher is looking for in
students’ work’ (question 3.1), and 97.8 percent agreed with the statement: ‘It is
important that students understand the criteria used to assess their work’ (question
3.2). Respondents also strongly agreed with the statement that ‘students should have
some say about the criteria used to assess their works’ (question 3.3).
4.4.2.3. The ideal form and structure of art and design examinations
(section 5)
Only 5.7 percent of respondents thought that art examinations should consist only of
questions requiring a written response (question 5.1); 29.3 percent were in favour of
examinations combining written and practical components (questions 5.2 and 5.3).
However some differences were found between teachers’ and students’ views
Question 5.1     1-Strongly agree   2-Agree   3-Disagree   4-Strongly disagree   Median
Teachers (TR)    –                  5%        52.5%        42.5%                 3.00
Students (SR)    2%                 4%        42%          52%                   4.00
Table 12: Responses to question 5.1
Portfolios were favoured as an assessment instrument for art examinations: 75.9 percent
believed that individual portfolios provide the best evidence for assessing the work of
art and design students, and a majority agreed with the statement ‘that final art and
design assessment should be based entirely on coursework’ (question 5.6).
Most respondents also agreed that students’ self-assessment should be taken into
account (question 5.7). However there were some differences in the responses of
teachers and students.
80 percent of teachers agreed with the statement ‘students’ self-assessment should
be taken into account for the final examination’ whereas only 65.7 percent of
students did.
[Figure: percentage distribution of teachers’ and students’ responses to the self-assessment statement]
(7): Notes of group criticisms: 72.8 percent.
Of the respondents 66.9 percent agreed with the statement that question papers and
criteria should be internally set by teachers and externally approved (question 5.9).
The frequency of responses to questions about the duration of art and design
examinations was inconclusive. The reason for this could be that the questions were
not sufficiently well constructed and the respondents thought that they should answer
each of the sub-questions in question 5.10 separately rather than choose just one
option.
Some comments added at the end of the questionnaires (section 10) revealed that
individual students had strong opinions about how their work should be assessed.
Of the respondents, 59.4 percent stated that students usually read the instructions for
marking (question 6.1), 44 percent thought that the assessment matrix and instructions
about how their work would be marked were clearly explained to students before the
examination (question 6.2), and 38.2 percent considered existing mark schemes entirely
appropriate for judging their artworks (question 6.3). But 53.8 percent of the teachers
considered that guidance about how to mark students’ work was insufficient. Teachers
and students responded differently to the statement in question 6.2 that ‘the
assessment matrix and how work will be marked is clearly explained to students before
the examination’: the majority of students disagreed while the majority of teachers
agreed with this statement.
The majority of the respondents did not believe that assessors always marked students’
work consistently (question 7.1); 90.6 percent of them thought that interpretations of
assessment objectives and criteria varied (question 7.2), and 94.2 percent thought
that the same student work could be marked differently by two different assessors
(question 7.3). Furthermore, the answers to question 7.4 confirmed this scepticism:
many agreed that the mark awarded was only a matter of luck or personal preference.
Respondents also considered that individual assessors were not always consistent and
might mark the same work differently at different periods of time (question 7.5).
Different tendencies appeared in the views of teachers and students in the responses to
question 7.1: ‘Assessors always mark students’ work consistently’; question 7.2:
‘Different assessors have different interpretations of assessment objectives and
criteria’; and question 7.3: ‘The same student work can be marked differently by two
different assessors’. From these results it seemed that the students were more
concerned than the teachers with the issue of reliability.
In section 8, the questions asked about ‘ideal’ assessment procedures: 73.2 percent of
respondents agreed that examination work should be assessed by the students’ own
teacher and at least one external judge (question 8.4). Less agreement was found with
the statement in question 8.3. Of the respondents, 79.7 percent considered that
students should have a voice in assessment procedures:
I think that students should have access to assessment criteria and their
exam work should be assessed by their own teachers (SR 83).
From students’ comments it was evident that they considered that the current procedures needed to change.
Some teachers in Section 13 also wrote comments about assessment procedures,
although these tended to be vague; one teacher also wrote about the need for reform of
the system. The questions in section 9 of the questionnaire asked for opinions about
the impact of art examinations, and one response stressed their effects upon students.
Opinions about the value of the results of the examination in society revealed that
both teachers and students had low expectations: only 36.1 percent of the respondents
thought that employers valued the results of the art examinations (question 9.5), while
67.4 percent thought that universities and polytechnics highly valued them (question
9.6). Of the respondents, 60.4 percent considered that school administrations valued
them (question 9.8), 69.6 percent thought that parents valued the results of the art
examinations (question 9.9), and 81.8 percent of the respondents considered that
students themselves valued them.
It was evident that students valued their results in the art examination because they
depended on them for university entrance. However the responses from teachers revealed
low trust in their value; teachers may distrust the results as a consequence of the
unreliability of the marking process.
Currently, in Portugal, examination results are used to select students for university
entrance, although one teacher commented: ‘The universities should provide their own
process or exams for student selection’ (TR-29). The responses to questions about the
use of the art examination results showed that 79.5 percent considered that
examinations were used to select students for university, and many respondents
considered that the examination results were the principal determining factor in
selection.
The responses to the statements about the effects of the examination upon curriculum
practice revealed that 57.7 percent of the respondents agreed that the examinations do
influence the work set during courses (question 9.1); more teachers than students
believed this was the case (see table 20). Although 52.2 percent of respondents
considered that a lot of work carried out during courses was not related to
examinations (question 9.2), the majority of respondents (64.9 percent) thought that
the examinations nevertheless shaped classroom practice.
Teachers and students considered the correlation between internal assessment and
external assessment results weak, since in question 9.13 only 39.6 percent of
respondents agreed with the statement ‘The marks received by students during
internal assessment in the secondary art course (10th, 11th, 12th year) will be a good
indicator of their results in the national examinations’. In particular, students did
not consider that internal results were related to external examination results.
Responses to questions about predictive validity revealed that only 22.1 percent of
respondents considered that the results of national art examinations (12th year) were a
good indicator of students’ results in their university courses (question 9.11). But
44.8 percent of respondents expressed the opinion that the internal assessment marks
obtained in secondary art courses (10th, 11th and 12th years) were a better indicator
(question 9.12). Students’ and teachers’ responses to the statements in questions 9.11
and 9.12 varied; it seemed that students were less confident than teachers about the
correlations between school and university results.
Questions 1.10-1.14 in section 1 of the questionnaire for students were designed
to obtain data to crosscheck with the questions about the predictive validity of art
examinations: students were asked to provide the marks awarded in the first year of
their university art studies (58 respondents). The small number of responses did not
yield reliable data.
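The strength of association behind such claims of predictive validity can be summarised with a Pearson correlation coefficient. The sketch below is purely illustrative: the paired marks are invented, not the responses of the 58 students, and it simply shows how such a coefficient could be computed.

```python
import math

# Illustrative only: invented pairs of marks on a 0-20 scale,
# not the marks reported by the 58 student respondents.
secondary = [14, 16, 12, 18, 15, 13, 17, 11]   # final secondary results
university = [13, 15, 14, 16, 14, 12, 15, 13]  # first-year university marks

def pearson_r(xs, ys):
    """Pearson product-moment correlation coefficient of two mark lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

print(round(pearson_r(secondary, university), 2))
```

With only 58 self-selected pairs, any such coefficient carries a wide margin of error, which is one reason the correlations discussed here can only be read as tentative.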
Therefore the correlations described in Appendix VIII (section 1.1, pp. 47-48) between
the final results of secondary art education (internal + external results) and the
results obtained in the first year of university can only be viewed as interesting but
very tentative, as can the correlations between examination results alone and
first-year university results.
Overall the questionnaire results about the impact of art examinations tended to be
negative:
• Art examination results were highly valued by students, but were not highly
valued by teachers.
• The respondents did not recognise a positive impact of art examinations upon
teaching, and they considered the status of the examinations to be low.
Some of the teacher respondents said that they assessed students’ examination
works using holistic or overall judgements; 56.4 percent said that they usually
used detailed criteria; and other respondents said that they usually assessed students’
examination works using both approaches.
Notwithstanding the above responses, detailed criteria were not considered sufficient
on their own: a majority of 94.9 percent of the teachers thought that it was useful to
discuss the interpretation of assessment criteria with their colleagues, and only 5.1
percent strongly disagreed.
Of the teachers, 71.8 percent said that they usually met with their colleagues and
tried to reach a common interpretation of how to apply the criteria, and 63.2 percent
considered it was not difficult to reach consensus about how to apply criteria.
These responses contradicted previous data derived from the reports of assessors and
the minutes of the markers’ meetings.
Sections 11 and 12 of the questionnaire for teachers were about ideal assessment
procedures and training: 74.4 percent agreed with the need to receive notes of guidance
and visual exemplars to help teachers and assessors (question 11.2), and 89.7 percent
stated that there should be regular in-service training courses for teachers and
assessors based on assessment. The majority of teachers who participated in the
questionnaire favoured special training: they agreed with the need for special teacher
in-service training on assessment, 89.7 percent agreed with the need for special
assessment training for assessors, and 94.9 percent agreed with the statement that
current students’ work should be used for the training of assessors. The statement ‘The
aim of in-service training should be to reach consensual agreement about the
interpretation of criteria’ also received a high level of agreement.
Summary
From an analysis of the questionnaire the following evidence emerged:
• Although respondents found parts of the examination clear, they believed that it
focused on a narrow part of art and design learning and that the question papers did
not allow students to properly reveal their knowledge, understanding and skills in art
and design.
• Respondents did not consider the procedures used for assessment reliable.
• The impact of art examinations upon users was evident; the respondents considered
that the examinations influenced teaching practice by focusing it on a narrow range of
art and design learning and on examination results.
• Respondents’ opinions about how to reform the system suggested that examinations
with both written and practical tasks, including an element of self-assessment, were
desirable, and that assessment by the student’s own teachers and at least one external
assessor would be the best procedure.
The analysis of the history of Portuguese art examinations and of current practice
revealed a tension between internal assessment, which valued teachers’ judgements but
had no external controls, and external assessment. The procedures used for external
assessment seemed to lack quality assurance mechanisms and consistency of marking. The
underlying model of the art examination treated components as either theoretical or
practical, and the core domains of art education were fragmented because of this
division, narrowing what could be a broad and rich art education. Problems with the
reliability of the assessment procedures were detected: the existing assessment
criteria were vague and the process of marking extremely dubious, and the absence of
moderation procedures further undermined confidence in the system.
Chapter 5
Art and design examinations in England
This chapter sets out to describe and discuss the strengths and weaknesses of
English art and design examinations at 17+ level through the analysis of their
documentation and of stakeholders’ perceptions. This stage of the research was intended
to study the policies and practices of art and design examinations in England and to
inform Portuguese practice by analysing the concepts and theories that underpinned art
and design examinations.
5.1. Introduction
From the review of literature a number of art and design examinations had been
identified. From this list, the English GCE A level examinations appeared to fit most
closely with the preferences expressed by Portuguese students and art teachers about an
ideal examination (4.4.2.3, p.137 and 4.4.2.6, p.142), since the assessment instrument
combined written and practical evidence, including coursework.
References to the English system of art and design examinations at AS and A levels
were included in this research because it was found during the Portuguese survey of
art teachers and art students that the English model appeared closely related to their
views on an ideal model for art examinations in Portugal. It is acknowledged that this
section of the thesis provides only a partial view of the English system of art and
design examinations, based on a small-scale study. However, analysis of some key
aspects of English art and design examinations, such as awarding body standardisation
processes, was useful for the purposes of collecting insights into moderation.
The documents used to inform this description included a review of literature about
the history of art and design examinations in England, articles in the press and
official publications (for details see 3.4.1.1, p.84). Since the English model of art
and design examinations relied heavily on moderation, it was important to explore how
moderators were prepared, the processes used to reach a common
understanding and application of criteria, what kind of visual exemplars were used
and how the assessment and marking of such exemplars was explained to the
moderators.
This stage of the research also set out to understand stakeholders’ perceptions of
English art and design examination practices. For that purpose, data collected
through interviews (Appendix IX) was analysed in order to determine the strengths
and weaknesses of the model in terms of validity and reliability and to develop a list
of tentative suggestions for new assessment instruments and procedures. The purpose
of the interviews was to understand examination practices from the point of view of
different stakeholders and to obtain deep information about their expectations and
experiences. The interviewees were a principal examiner (Richard), a team leader
(Elisabeth), a moderator (Ian), two art teachers (Cindy and Sally) and one student
(Annie). Ian and Elisabeth were interviewed in 2001, and Cindy during an art teachers’
workshop (NSEAD, 2001). Cindy, Elisabeth and Ian lived and taught in different regions
of England and, through informal contacts, revealed long experience and different
points of view about the examination. Richard was a chief examiner (AQA). Sally was
included because of her special situation as an inexperienced art teacher. Annie was
selected to introduce a student’s view of the model. The sample was not intended to
be representative.
Role                 Name       Sex  Age  Region              Experience (years teaching / examining)  Awarding body
Principal examiner   Richard    M    57   SouthEast England   12                                       AQA
Team leader          Elisabeth  F    67   London region       30 / 20                                  Edexcel
Moderator            Ian        M    48   London region       18 / 11                                  Edexcel
Art teacher          Sally      F    46   London region       1                                        Edexcel
Art teacher          Cindy      F    42   NorthWest England   18                                       OCR and Edexcel
Student              Annie      F    16   South England       –                                        Edexcel
Table 23: Interviews with English art and design examination stakeholders
5.2. Art and design examinations before the 1988 Reform Act
Art education in England in the 1960s was influenced by rationales such as
self-expression, and art subjects in the curriculum were strongly influenced by the
modernist tradition in art itself and by the ‘progressive’ ideas of John Dewey and
Herbert Read.
These approaches to art education promoted changes in the way that candidates’
work was presented and assessed. According to Carline (1968), personal expression came to be highly valued.
The predominance of observational tasks changed with the new conceptions about
the value of artistic work. For example, working from living models, humans and plants,
was considered more important than copying: ‘…such work, even though it may […] figure
executed without character and culled from constant copying of fashion […]’ (Carline,
1968).
Imagination, personal response and problem solving were all rated highly in the
examinations. Art appreciation was mainly concerned with the history of art. According
to Carline it
was the ‘least popular’ of the art papers, although it was taken by ‘some thirty per
cent’ of the Advanced level candidates in art, mainly by ‘candidates who plan a
career in which art may be useful, and are likely to be granted extra time at school’.
Carline also cited the difficulty ‘in combining art history with practice in the
limited time allowed’.
Carline’s (1968) description of art examinations in the 1960s was written from the
point of view of an examiner. His writing makes clear that examinations influenced the
teaching of art subjects: for example, in the chapter about ‘craft’ he stated: ‘…the
inclusion of craftwork in the art examinations has given enormous […]’. Carline used
quotes from examining board reports to reinforce the claim that art examinations shaped
classroom practice.
GCE A level art and design examinations progressively evolved over time (Steers,
1989). Before the implementation of the Education Reform Act in 1988, which gave
new statutory powers to the Secretary of State and central Government, 'Local
Education Authorities, schools, and even the individual teacher [had] almost total
autonomy' (Steers, 1989, p. 8). In fact, the only regulatory device for schools was
the examinations system (White, 1998, p.5). There was no national curriculum for
art and teachers had considerable control of the form and content of art education.
At that time, in England and Wales, secondary art teachers taught what they, as
professionals, considered appropriate. Although subject to scrutiny by the Schools
Council, the number of options for examinations and the flexible procedures ‘led the
emphasis upon teacher-control of examinations’
An example of teacher ownership of external assessment was provided by the range
of options offered to teachers, not only through the choice of an examining board but
also through the choice of examinations structure. According to Earle (1986, p.51) in
the 1970s and 1980s there were three possible modes of examinations for 16+ and 18+ candidates.
Within each of the eight regional bodies it was possible for a school to devise its own
examination if required (Earle, 1986, p.51). This freedom of choice was still
remembered by the English art teachers interviewed in this research (see interview
Cindy, Appendix IX, p. 101) as being a model for innovative and creative teaching.
Mode 3 was the mode that allowed most teacher ownership, according to Earle (1986,
p.51), but it was not always preferred, perhaps because teachers did not have
sufficient time or inclination to devise their own examinations.
A finding of the review of literature about the history of art examinations in England was
that the model of the examinations had progressively evolved from a test of several
hours or a day to the inclusion of preliminary work and coursework in the form of a
portfolio or an exhibition. Candidates usually received the question papers before the
examination in order to prepare the themes and develop ideas in visual and written
forms. In 1977, a survey of the syllabuses of the GCE A level Art examinations
conducted by Earle, found that eight boards conducted a variety of art examinations
in different combinations and ways (art, art and craft, art history), providing a wide range of syllabuses.
Earle (1986) pointed out there were problems of comparability among the
examination boards:
It could be argued at one level that one board expects candidates to undertake a
broad course of study while other boards demand a greater depth of study in specific
areas of art. The problem comes in equating one board's work with another, which
used to be the function of the A level sub-committee of the art committee of the
schools council. In its absence there is no indication of how standards among boards…
‘In 1978 no examination board required the submission of coursework but in 1988 […] on
how it should be marked or graded’ (Steers, 2003). Similarly, Earle (1986, p.123)
pointed out: ‘…there is not much stated in writing to assist the teacher with the […]
who require coursework and preliminary works are interested in the way which candidates
carry out their work as well as the final product’. Although art examinations
increasingly valued process as well as final achievement, instructions for this were
stated in vague terms and no explicit criteria were published.
Earle (1986) stated that GCE A level art examiners and teachers became more and more
aware of the pressures for standardisation:
The comparative wealth of examinations available to teachers has been a
luxury. While the curriculum was the 'secret garden' idiosyncratic
practices in art could continue. If teachers have to re-think their practices
in the light of the new examinations, it could become an advantage to the
profession to have a new and more standardised series of art
examinations (Earle, 1986, p126).
Earle was a chief examiner and, maybe because of this, his account of art and design
examinations was sympathetic to standardisation. His vision was in line with a strong
move towards criterion-referenced assessment, which emerged during the 1980s, probably
because of the external pressures upon education. In the cause of a more reliable
system of examinations the awarding bodies made criteria more explicit, and
criterion-referenced assessment was seen by teachers and examiners (Davies, 1987) as a
way to introduce greater consistency.
5.3. The consequences of the Education Reform Act (1988) in England and
Wales
The Education Reform Act introduced changes in English education that were not always
popular. The national curriculum was often seen as marking the teachers’ loss of
control of the education system (White, 1998).
According to Ross (1989, p. 17): 'Authority in education passes swiftly from the
professional to the politician, from the researcher to the bureaucrat’. English teachers
were not used to such forms of policy-making – until then the only form of control
they had known being external examinations within a flexible choice provided by
several examination boards. However, at this time small boards were subsumed in
larger awarding bodies. For example AEB, NEAB and City and Guilds became the
Assessment and Qualifications Alliance (AQA) awarding body; ULEAC and BTEC
became Edexcel; Oxford and Cambridge and RSA examining boards became
Oxford, Cambridge, and RSA awarding body (OCR). The model of examinations
became more and more rigid with the amalgamation of examining boards. In the late
1980s and 1990s the trend was towards removing teacher ownership of education
and reinforcing more centralised control of the curriculum and external assessment.
Teachers were aware of the negative implications of such changes upon the
profession as Steers pointed out (1994, p. 289): ‘In recent years this professionalism
has been under threat from a government that has made known its mistrust of teacher
assessment’.
According to Steers (2003), A level art examinations were usually based on a portfolio
or exhibition of selected works from the two-year course of study.
Current examinations have less focus on drawing or craft skills per se,
but place a higher value on knowledge and understanding where both
process and product are assessed. The syllabuses are designed to allow
candidates to provide evidence of their achievement throughout a course
of study rather than test their ability to perform within set and very short
time limits. It can be argued that, in general, the 1998 examinations
provide more demanding but potentially more rewarding opportunities
for both candidates and their teachers (Steers, 2003).
During the 1990s 'process' was progressively more emphasised in relation to 'end
products', and assessment focused more and more on students' evidence of developing
ideas and investigative work.
This may have been a consequence of tendencies in art education theory at the time
to focus more on the process than on final products and by the development of more
external assessment at age 17+. There were general qualifications, such as GCE AS
and A level, and vocational qualifications, which were linked to 'national
occupational standards, which set out real-world job skills as defined by
employers' (QCA, 2000). Students wishing to progress to higher education or
employment usually took A Level examinations. A Level
courses normally lasted two years and the examinations could be taken in a linear or
modular fashion. AS and Advanced GCE examinations in Art and Design were
suitable for students who wished to study art, craft and design at a higher level.
It was the Qualifications and Curriculum Authority (QCA) that defined the main examination
procedures. The general procedures for all examinations were stated by the QCA in
the Code of Practice (QCA, 2000). The three awarding bodies in England set their
syllabuses, question papers and assessment procedures only with the approval of the
QCA, which also monitored and evaluated the awarding bodies through procedures
intended to guarantee 'parity in each subject and qualification from year to year,
across different syllabuses, and with other awarding bodies' and 'compliance with the requirements
At the time of writing, English schools and examination centres could choose one of
the three English awarding bodies or the syllabuses offered by the Welsh Joint
Examinations Committee (WJEC). This choice could be made by the head of school,
the head of the art department, or through agreement between
teachers in the centre. Awarding bodies managed ‘all stages of the examining
process’ ensuring that the procedures were ‘carried out in accordance with the Code
of Practice and with the awarding body's policy procedures' (QCA, 2000). Figure 5
outlined the roles of the examining personnel:

Chief examiner
• Supervises the construction of the question papers, mark schemes and the criteria
for coursework assessment.
• Responsible for monitoring the standards of principal examiners and principal
moderators.
• Writes an evaluation report of the examination at the end of each examination period.

Team leaders (assistant principal examiners and moderators)
• Coordinate and supervise the team of moderators.

Assistant moderators
• Moderate centres' assessment of candidates in accordance with the agreed marking
criteria and the awarding body's procedures.
From September 2000, a modular structure established the specifications for GCE
AS and A level examinations, with the modules corresponding to units. At the time
of writing, most A level qualifications were made up of six units that were
broadly equivalent in size (QCA, 2000). Three of these units constituted the
Advanced Subsidiary (AS) qualification, representing the first half of the full A
level. The other three constituted the second half of the full A level and were called
A2. A level qualifications included two distinct examinations, AS and A2, each one
covering three units of the course. The units were separately set and assessed.
The modular structures of courses and assessment, according to the Code of Practice
(QCA, 2000), were intended to allow flexibility. But they appeared to be a potential
threat to the theory and practice of art and design education. In the interview with
Richard he said: 'We were all bound by…the modular structure for the AS and A2' (Richard,
SEE, 4-02-2001). One effect of structuring the courses in separate units appeared to be
increased numbers of students taking art and design qualifications at AS level; another was a
distortion of the art and design courses. In theory, modularity was supposed to foster the
ideas and skills considered important in art and design learning. According to
Elisabeth:
Art is on-going, it is a developmental thing and it goes on, you can get
back to a piece, you can use it as a starting point for something else in
the future, whereas with this modular structure you can’t do that.
Now everything they do is assessed, they don’t have time to explore
and experiment which is what art is about. They always have a
deadline for something and that is in their heads, I think it stops
things…. I think if you are all the time focusing on deadlines things
have to be finished off by rushing through, I think that it does interfere
with the creative process (Elisabeth, L, 7-02-2001).
For Cindy:
‘I don’t think that’s how we learn. I think its based on the wrong
theory… to assess in a way like this, by putting things into chunks of
experience, is not how you work, I don’t think it is a good practice
too’ (Cindy, NEW, 10-02-2001).
The introduction of assessment by units to replace the single final assessment at the
end of the course, as was the case with the old A level, was explained by Cindy:
‘I know why they have done it, to an extent. Well I think it is so the
students don’t have the pressure of the work at the end, they can
assess as they go along’ (Cindy, NEW, 10-02-2001).
But in practice, in Cindy’s view, it had put more pressure upon students because of
strict deadlines and overloaded courses. Elisabeth thought that the modular structure
had changed the status of the qualifications: she thought this could be the reason
for an increasing number of students opting for
art and design at AS level. However a head of art at the North London Collegiate
School considered this unfair and he was quoted as saying: ‘Art students need elbow
room for the pursuit of a personal aesthetic and a modular straitjacket does them no
This practice seemed to contradict the course rationales. For example the OCR
specifications stated:
The National Society for Education in Art and Design documentation showed that
one year after the implementation of curriculum 2000, the modular course, despite
the intention to present 'a developmental structure', was often seen as a threat to art and
design. The newsletter of Summer 2001 included an article arguing that the new A and AS
structure, intended to allow more options for students, was problematic in terms of
fairness. An explanation for the adoption of such measures could be the government's
desire to increase rigour and accuracy. Cindy said that '…particularly in arts, in the 70s and
early 80s assessment was quite lax' (Cindy, NEW, 10-02-2001). And this was also the view
of the assessment coordinator (IPB; L, 12-06-2000). However, the fact that students were
assessed three times per year did not provide proof of fairness or accuracy.
According to the interviewees, students and teachers were hard pressed by deadlines,
at the expense of artistic development.
AS examinations required two units of coursework, each weighted at 30 percent of the
total AS marks and 15 percent of the total A level marks, and another unit, the
'controlled test'. A2 examinations required two further units of coursework, each
weighted at 15 percent of the total A level marks, and a further unit, a controlled
period test weighted at 20 percent of the total A level marks. The scheme of
assessment was the same for all the boards; the differences were mainly in the names
given to coursework units, the time allocated for the controlled test, the question
papers for externally set assignments, the guides for marking and the reports of examiners.
Each specification set out the structure of the course and established the scheme of
assessment through description of the qualities sought in students' work in each of
the specification titles or areas of study.
The evaluation took into account clarity of language, graphic design, content
relevance, bias and discrimination issues, and consistency across the years (QCA,
2000).
The specifications also had to address wider requirements involving (1) spiritual,
moral, ethical, social and cultural issues; (2) environmental education, the European
dimension and health and safety issues; and (3) key skills such as improving own
learning and performance and problem solving (QCA, 2000).
The rationales as stated by the three awarding bodies emphasised the need for breadth
and depth in the art courses. For example, teachers were supposed 'to encourage…
an adventurous and enquiring approach to art and design’ and ‘focus on art and
(OCR Specifications 2000) and [the specification is designed to] ‘form the basis of a
experiences, environment and culture in both their practical and theoretical activities
External assessment in England was conceived as an integral part of the art and
design curriculum. The assessment objectives set by QCA and the specifications of the
different awarding boards were broad enough to allow varied courses:
The specifications leave a certain freedom for teachers to set out the
course (Ian and Sally, L, 7-07-2001).
‘They state things quite clearly and they give good direction for
teachers and the more mature students’ (Elisabeth, L, 7-02-2002).
The courses allowed flexibility in students' choice from a number of areas and media.
From an analysis of QCA documents about art and design examinations and
awarding boards' specifications, the assessment objectives set out by QCA appeared to
provide the framework for the assessment criteria. This framework emphasised
investigation of art and design products. In this
model there was a focus on art and design process, more weight was placed on the
development of ideas and critical and analytical capacities through the understanding
of context, meanings and purposes of art and design. It seemed to encourage analysis
and critical processes more than the final products. This appeared to be adequate for
the cross-disciplinary nature of contemporary art and design, providing broad
assessment objectives that seemed appropriate for all the areas of study (Appendix
IV). However, the breadth of this model could restrict experimentation and
specialisation: Annie, for example, said she had not developed specific technical
skills during her course (Annie, SE, 6-05-2002). By reducing art to component
parts for assessment purposes, the underlying model seemed to have been divorced
from established art and design teaching and practice, minimising the importance of
final products and specialisation in art and design.
The clarity of instructions provided for the users is one essential aspect of validity. In
subjects that deal with creative outcomes, such as art and design, achieving a balance
between prescription and openness in the curriculum is critical. And this was
particularly crucial in the case of English art
and design external assessment where visual exemplars were often provided.
The interviewees expressed contrasting opinions about the degree of prescription in
the instructions.
The interviewees thought that instructions were not sufficiently clear for students (or
at least for all students), especially in the question papers, and that the role of
the teacher was therefore crucial. Ian and Sally wanted the instructions to be less
ambiguous; in their experience the instructions for teachers from Edexcel were not
clear, and some teachers had
misunderstood the requirements of the examination in 2001. The reasons for this
could include the newness of the examination format, teachers not paying enough
attention to instructions, or a lack of training.
Cindy and Elisabeth said that the instructions needed to be broad and not too
prescriptive:
According to Elisabeth there was a danger that the instructions could act as a
constraint:
‘…because, these units are very prescribed, the way they are
explained, in some centres, not all, make it very tight in the way that
all the candidates do similar work rather than the more individual art
that we would expect at that level’(Elisabeth, L, 7-02-2002).
She concluded that there was a need for instructions that allowed for teachers'
interpretation; otherwise there was a danger of uniform courses and loss of freedom
to produce personal responses. Annie's work was
influenced by examples of ‘good’ work displayed on the Edexcel video sent to her
school (researcher diary, St.X School, 7-06-2002). The samples of students’ work
observed during standardisation meetings at Edexcel during 2001 and 2002 did not
generally demonstrate '…a wide variety of outcomes and media'. Teachers and
students may have been encouraged to 'play safe' and reproduce the best
examples in order to ensure good results and consequential benefits for all
concerned, at the expense of the risk taking and independent thinking which should
be fostered in art and design. There was a tension between prescriptive
instructions for the art and design assessment instrument and art and design
practice. Nevertheless, some of the teachers
interviewed were in favour of such visual guidance (Ian and Sally, L, 7-07-2001).
But the literature showed that it could have pervasive effects upon schools, as
Hardy (2002) observed:
My fears were confirmed when I saw extracts from the induction video.
Although peppered with talk of ‘building on good practice’; prescription
and prohibition had resulted in exemplar material from pilot schools that
was so dull and devoid of wit it took my breath away. My idea of a good
sketchbook is an open ended tool for research and exploration; evidence
of an individual path of discovery, not the series of formulaic and
dedicated exercises exemplified therein. I wonder how many examiners
thought they would scream if they saw another colour wheel (Hardy,
2002).
A dilemma emerged from this part of the research: on the one hand, if instructions are too prescriptive, even
though they are clear for students and other users, they may limit the breadth of the
instrument. On the other hand, if the instructions are broad enough to allow for
diverse outcomes, teachers need to interpret them. In the case of these examinations
where the assessment instrument was delivered by the students' own teachers, broad
instructions depended upon interpretation by professionals. There may also be issues
of equality of opportunity for those students who do not have the support of a good
teacher. Interpreting and delivering the instrument was supported by training courses
at which attendance was not compulsory. The nature of these training courses was
questionable. In one interview Cindy noted that there was a need to provide evidence
for each module. This could take the form of a portfolio (structured sequence of
work); an exhibition; multimedia presentations; written work such as essays; and
written and visual work such as work journals. Two units of coursework were required at both
the GCE AS and A2 examinations, accounting for 60 percent of the final A level
mark. The specifications, students’ guides and instructions for teachers set down
explicit parameters and instructions for setting coursework tasks and defined
marking criteria. Every unit should provide evidence of all the assessment objectives
being met.
The question papers described a theme and a list of starting points for developing a
unit of work to be carried out during a controlled period. According to Edexcel,
students received the controlled test papers in advance and had a preparatory period
for the controlled work. During this
period students could consult with staff and be supplied with supporting guidance
and materials. The period of the controlled test was fixed at a total of 20 hours for AS and A2 combined (Figure 11).
Awarding body   Preparatory period   AS controlled test (Unit 3)   A2 controlled test (Unit 6)
Edexcel         6 weeks              8 h                           12 h
AQA             4 weeks              5 h                           15 h
OCR             4 weeks              5 h                           15 h
Figure 11: Controlled tests
Students should submit unaided work produced under examination conditions for
assessment, but could make use of supporting work in the form of research or
preliminary studies and work journals developed during the preparatory weeks.
Whilst the controlled period work in Unit 3 was not expected to be a finished piece
of work, at Unit 6 the test required the production of a final piece using the
preliminary studies and investigation carried out during the preparatory time.
Instructions for the conduct of examinations stated the types of work candidates
should submit. The three awarding bodies usually required preliminary studies,
research work in visual and written forms, preparatory and annotated visual studies,
and final products. At the time the research was carried out, Edexcel was the only
A conclusion drawn from the analysis of documents about art and design
examinations was that the assessment instruments were related to art and design
methods of inquiry. The evidence required from candidates through written and
visual work allowed wide variety of responses. Coursework and controlled period
work in the form of portfolios permitted the gathering of a wide range of evidence
of knowledge, understanding and skills through both process and product. According
to Elisabeth, a large amount of evidence was valuable:
‘…you need it, the more you have, the more you can see, the more you
relate to it, it benefits the students’ (Elisabeth, L, 7-02-2002).
This may have advantages in terms of reliability of the results because it allows
judgements to be confirmed through corroboration.
The main problems detected with the type and format of the instrument were the
excess of tasks created by the modular system of assessment and requirement of one
synoptic assessment unit per year. The regulations set out by QCA seemed inflexible
in terms of time allocations and number of units per year. Such measures tended to
increase prescription. Hardy (2002) considered that the imposition of short timed
tests (units 3 and 6) was inappropriate for art, especially because English art and
design external assessment had previously showed a tendency to increase the length
of such examinations over time. The controlled period tests were imposed apparently
to align art and design assessment with other subjects, with little consideration
for established art and design assessment practices. Hardy (2002) suggested that the
timed tests reflected a lack of trust of teachers and students:
This lack of trust of teachers and students alike to keep their noses to the
grindstone is most blatantly illustrated by the extraordinary belt and
braces approach to assessment. If all units are to be examined why the
synoptic requirement for the timed test which, once again requires
students to back track and showcase skills already in evidence? (Hardy,
2002, p.57)
A conclusion was drawn that the type and format of tasks required in the assessment
suited individual students and generally had the characteristics of disclosure and
fidelity. The submitted work (research studies, preliminary studies with annotations,
work journals and final products produced over long periods) provided a set of
written and visual evidence of attainment that was disclosed and faithfully recorded
using methods of work related to art and design practice, and it was regarded as good
preparation for students' future careers by all the interviewees. A large amount of
written work was evident at standardisation meetings. In fact the need for written
evidence was emphasised through the instructions and mark schemes; one key issue of
validity was the degree to which writing skills were relevant to art and design
courses. This requirement could be seen as a consequence of the QCA requiring
general skills for all courses, a change that had been introduced gradually. Earlier
examinations had focused on final
products and little by little process had gained more emphasis. The emphasis on
providing evidence about process significantly changed the way teachers set courses.
According to Cindy:
However she believed that there was an imbalance of process and product in the new
examinations. Annie thought the emphasis on process was helpful for students, but
students found it 'boring' and 'time consuming' (Annie,
SE, 6-05-2002).
This emphasis on process parallels tendencies in contemporary art, for example in
the conceptual art movement, which puts more value upon the development of ideas
than upon final products. It is compatible with approaches that prioritise
investigation and the development of ideas. The move towards an instrument based to a larger extent
upon process could signify a loss of content validity. For example Ian and Sally still
believed that final products were more important than process, and favoured
traditional methods based exclusively on assessment of final products (Ian and Sally,
L, 7-07-2001). It was evident that these changes were quite radical and, as a
consequence, older teachers had to reframe their own view of the purposes of art and
design education, while the demands upon candidates increased. It was noted during
the observations and interviews that the quantity of written work had increased with
the new post-Curriculum 2000 examinations. As Cindy put it:
‘This again is all words, there are so many words!’(Cindy, NEW, 10-
02-2001)
And Elisabeth said:
5.4.8.5. Bias
At first glance the instrument did appear equally fair for all the students. During
the interviews, Elisabeth commented:
I think the exams cover everybody from any country, there is a scope
there for everybody to find something of interest and as a teacher, again
you often pick up the students own interests and it is dependent on the
quality of the teachers (Elisabeth, L, 7-02-2001).
However, the interviews revealed some problems of bias. For example students from
big cities with good resources in their schools and access to cultural events could be
advantaged; the instruments could advantage girls rather than boys; students from
different ethnic backgrounds could be discriminated against; and finally students
from low-income families could be disadvantaged.
According to the 2002 art and design examination results (Joint Council for General
Qualifications, 2002) it appeared that for several reasons girls performed better than
boys. Elisabeth thought that this was because girls tended to develop superior writing
skills; ‘…they are more competitive; more mature than boys’ (Elisabeth, L, 7-02-
2001), and Cindy thought that it was '…because they are more systematic…'. Bowden
discussed boys' underachievement in art and design examinations, claiming that the
emphasis on planning and use of sketchbooks, while valuable, inhibits art making
using media in direct ways that do not, by their nature, '…involve prior
investigation' or 'gathering systematic evidence or imagery'. He suggested that boys
might respond better to more direct, practical projects.
The instrument could also disadvantage students from ethnic minorities because of
the issue of English language: they might need to have the instructions explained in
more detail. Annie said that the instrument was unfair for non-English speakers,
because the examinations required 'informed answers' and the question papers
proposed starting points based on research into exhibitions, books and the Internet.
Whereas Cindy believed that
teachers could help students by supplying resources, she pointed out that if teachers
were not specially committed then this raised problems of equal opportunities.
Differences in local resources were not the only problem. The compulsory themes for
the question papers might also be a source of bias. For example, the theme for the AS
controlled period test at Edexcel, in 2001, was ‘Cities’. According to Ian and Sally
this was a very fashionable theme in that year because of the current major exhibition
at Tate Modern, in London. Ian commented that his students had no opportunities to
‘know a city’ or to see the exhibition referred to in the question paper and he
believed that this inevitably would influence their performances negatively (Ian, L,
7-07-2001). Sally and Ian also said that the examination required a lot of extra work
outside school from students and this also was a bias because it disadvantaged
students from low-income families who had part time jobs (Ian and Sally, L, 7-07-
2001).
Overall, the English model of external assessment had permitted a great degree of teachers'
ownership of assessment in the past, but it was losing this strength. The teachers
who were interviewed felt they had less and less autonomy within the increasing
prescription imposed by the awarding bodies and the QCA. However, the role of
the teacher was still considered essential within the English model of external
assessment. It was obvious that students with access to good teachers would have
more opportunities to develop their knowledge, understanding and skills in art and
design. This topic arose frequently in the interviews. Cindy said that ‘…teachers’
work’ could help students to perform better (Cindy, NEW, 10-02-2001). It may be
that the examination instructions were not sufficiently clear or accessible to all
students. It seemed that teachers often had to remedy the ambiguity and complexity
of the instructions so as to help avoid bias, for example, by providing resources for
5.5. Assessment criteria
In the English art and design examinations reviewed for this study, the assessment
criteria were developed from general requirements set out by the QCA. The four
assessment objectives established by the QCA for art and design were:
The interviewees considered the assessment objectives published by QCA clear and
appropriate for art and design courses. In Cindy’s words they were ‘excellent’;
‘broad’ and respected the nature of art and design. The assessment objectives that
seemed to provide the frame within which the criteria were established by the
awarding boards acted as ‘windows’ for teachers to focus on particular work in the
It’s up to the student to decide what they want to present. The key to this
is how they are answering the assessment objectives, and criteria. We
don’t actually specify exactly what they should do, we give them
instruction material [instructions for examinations] – the assessment
objectives are the key to that (Richard, SEE, 4-02-2001).
However, a conclusion was drawn that the assessment objectives were not as broad
as they appeared at first sight. Some problems of content and face validity were
identified in the assessment objectives. First of all, the term 'objective' suggests a rigid, compulsory
and universal concept, which is not in accord with the postmodern emphasis on
fostering plurality and 'little narratives' rather than universal 'truths'. Secondly,
the objectives took little account of particular contexts. So, on the one hand, the
assessment objectives seemed to be broad and
include essential aspects of knowledge, understanding and skills, but on the other
hand, the fact that they were imposed may restrict the nature and range of art and
design work. For example, Elisabeth (7-02-2001) stated that not all students needed
to look at the work of other artists in order to develop their own ideas. Cindy (10-02-
2001) pointed out that not all students had to make connections between their own
works with the work of others; some students developed strong ideas and produced
good work without reference to the work of others. Annie (6-05-2002) particularly questioned the need to link
students’ work to work of other artists. However the assessment objectives required
all students to follow the same method and submit evidence for all the assessment
objectives. The assessment criteria in England were intended to guide teachers and
moderators to focus on the aspects to be assessed. As Elisabeth put it:
…because it’s focusing, it’s making teachers and students focus
on those specific things…, teachers were looking at the work in a
different way, they were confronted with this new matrix, and
they really have to look at the work that they were marking… It
clearly guided teachers and moderators (Elisabeth, L, 7-02-2001).
Drawing on Williams' (2001) ideas, a conclusion was drawn that the assessment
objectives raised problems of construct and face validity. The instrument placed an
emphasis on activities that were
not necessarily the core of art and design because of the search for objectivity. In an
article entitled ‘Forced in the same mould’ published in The Times Educational
Supplement (November 2001), Williams suggested that the search for objectivity in
examinations was distorting art and design learning and assessment. The article
reported several teachers' concerns with the new GCSE and GCE AS art and design
examinations. They claimed that students who were creative, but not good at
recording, were being penalised and, vice versa, students who were not particularly
outstanding with only the ability to record could achieve very good grades. Williams
concluded that the obligation to fulfil all the criteria in each unit of work was ‘…
making art into a hoop-jumping exercise, and creating a topsy-turvy world whereby a
candidate with poor creative and technical ability can nevertheless gain a decent
grade if all the boxes can be ticked’. Teachers were not against the introduction of
criteria related to historical context and contemporary artwork but did not agree with
the requirement to meet such criteria in every piece of work, which in their view
prevented students developing ‘exciting work’. Steers was quoted in this article as
saying: ‘We have stopped assessing creative and technical skills and the imagination
of students’.
In the same issue of the Times Educational Supplement there was an article by the
head of art at North London Collegiate School entitled 'Farewell to the Wow factor',
which argued that the scope for exploration had been reduced to a minimum and that
'…it seemed as though the course
had been deliberately designed to reward mediocrity’. Hardy strongly disagreed with
the unit structure of the courses and questioned the advantages of a common
structure. He argued that art should act as a '…foil to the convergent curricular
core', and that one of the most essential aspects of
art ‘…that joy at seeing students transcend the shackles of the syllabus’ was being
lost. He argued that ‘ The equal weighting for all objectives has led to the reward of
the spurious over the essential’. The analysis of the assessment objectives and the
detailed mark schemes of all awarding bodies confirmed that creativity, imagination
and researching sources were heavily emphasised. This may be a consequence of the
objectives valuing understanding of contexts more than creativity and innovative
studio production. However, it may also
represent a quest for more objective ways of marking, as Elisabeth suggested it was
‘…much easier to assess the ability to record and write rather than to assess
genuinely creative work and risk taking’ (Elisabeth, L, 7-02-2001). A conclusion was
drawn therefore that the content validity of the instrument was being threatened by
the statements of achievement and assessment objectives. Richard expressed the views
of examiners who look for uniform judgements in order to obtain accuracy. For such
examiners, marking risks becoming an aggregation of numbers. Judging an artwork
involves more than counting the parts; it
is through holistic appreciation the overall harmony between the parts is recognised.
According to Gestalt theory, a work has to be judged as a whole because the whole is
more than the sum of its parts (Boughton, 1996; Eisner, 1985; Wiggins, 1993). Rigid
criteria may prevent a work from being judged as a whole, and analysis of the
recommendations for marking during the 2001 Edexcel AS examination raised similar
concerns.
Elisabeth, Cindy, Ian and Sally objected to the marking scheme, which they
considered inappropriate for judging the quality of art works. In Elisabeth’s words:
…it needs something else, it is that peculiar thing for the art subjects, I
don’t know what to call it…. it is not necessarily the creativity … it is
something that makes it different…that extra…I don’t think this is
included. I think that it should be something else (Elisabeth, L, 7-02-
2001).
Elisabeth recognised that judging art works was not a rigid and immutable process,
she understood that holistic marking processes should be taken into account. Overall
judgement is widely advocated in the literature on assessing art and creative work
(Hennessy, 1994; Eisner, 1986; Armstrong, 1994).
Boughton (1999) reported that in the 1999 revision of the International Baccalaureate
Overall judgement or holistic assessment was not mentioned in the GCE assessment
matrices published in 2000, and according to previous examination reports this was a
radical change from the traditional processes of assessment in art and design.
A report on art and design examinations published in 1999 by the AQA awarding
body noted that teachers and moderators usually assessed students' artworks by
making overall holistic judgements. However, the regulations for GCE mark schemes in
England introduced in 2001 were based on discrete criteria, with the aim of making
assessment in all subjects more 'objective' and uniform.
This raised the question of whether mark schemes exclusively based on discrete
criteria were appropriate for art subjects. Elisabeth and Cindy viewed criterion-based
marking as too narrow for the subject. When these criteria are compared with
Boughton's (1996) concept of criteria as 'windows', they may be too small for art and
design, which may explain Elisabeth's view.
But Richard, who was chief examiner, expressed the view that the majority of
teachers were satisfied with the mark scheme:
‘There is an agreement about the marking; we have less than ten re-
marking situations in about 100 centres. Teachers generally were very
positive about it; they were not negative about the marking scheme.
That mark scheme actually works quite well’ (Richard, SEE, 4-02-
2001).
This suggests that the mark schemes were clear and easy to apply, despite the
complaints of the teachers interviewed.
Training teachers and moderators was at the core of the English model of
assessment. In fact the breadth of submitted evidence and the nature of art and design
Although the awarding bodies had established procedures in place for teacher
training in the form of INSET courses, attendance was not compulsory; Cindy, Sally
and Ian found it difficult to attend because of the great number of candidates and
while the government was especially interested in teacher improvement, there were opportunities at regional level to share and discuss assessment, a most important strategy to ensure consensus about standards and criteria. Training, as understood by Cindy, not only
depended on centralised courses but was part of a long-standing tradition of art and
(Cindy, NEW, 10-02-2001). However, Cindy said that such practices, once common,
had tended to disappear. The courses promoted by the awarding bodies at the time of the research were, in Cindy’s view, designed to impose policies. However, they must have
worked, to some extent, because the instructions for external assessment were
The analysis of documents revealed that the assessment procedures were the same
for all the awarding bodies, because the QCA Code of Practice defined them.
Students’ own teachers assessed candidates’ work in ‘centres’ (usually a single school), subject to internal moderation. Teachers were required to write down marks for all units of
work, for all candidates, using the assessment matrix and then transfer a final mark
for each unit for every candidate onto specially designed forms; they had to arrange
a display of folders of work from candidates in the moderation sample, ensuring that
all pieces of work within each unit were clearly labelled and to send a copy of
Cindy said that in her school internal moderation took place as required by the Code
of Practice and awarding body instructions, but in Sally’s school this was not the
case, possibly due to pressures of time, extra work and an uncommitted head of
department. Sally said that she experienced significant problems with a lack of
information exacerbated by the fact that it was her first year of teaching and she got
little support from her head of department. Cindy’s description of the process was
very different. As an experienced teacher, she had clearly acquired appropriate tacit
knowledge and contacts with other schools and the art department at her school
worked cohesively.
According to Richard, who worked at the OCR awarding body, there was a high
significant changes had been made to internal marks in the Edexcel AS examinations
In Richard’s view there was consistency of results, which suggests that internal assessment was generally reliable.
5.6.3. Standardisation
At the time this research was carried out, the awarding bodies in England had
involving: chief examiner(s); principal examiners; assistant principal moderators;
body procedures, time schedules, documentation and contact points; (2) explanation
process; and a briefing by the principal examiner on relevant points arising from
candidates’ work. This agenda was strictly applied during the two observations of the
Edexcel awarding body carried out as part of this research (Appendix X).
At the Edexcel awarding body there was evidence of formal training at the standardisation meetings held in May 2001 and May 2002. Training and simulation of marking candidates’ work was observed.
In standardisation, the awarding body used exemplar work from its archives and
‘live’ work collected from centres in the year of examination to explain how
book form and videos should be available to centres. According to the QCA Code of Practice, archives of exemplar material should be provided by the awarding bodies containing assessed work that shows ‘clearly how credit is to be awarded’. At the Edexcel standardisation meetings attended on 3rd May 2001 and 4th May 2002 there was evidence of some exemplar work, but a great part of the examples displayed at
The work was displayed by unit. Near each display were the title of
the unit and a copy of the assessment matrix. Six sets of works were
already marked, accompanied by the completed assessment matrix
with underlying objectives and marks attributed for each objective
(familiarisation exercises). Seven sets of works were not marked
(blind marking exercises); they were marked during the second part
of the meeting by the moderators (researcher diary: observation AS
standardisation meeting, Mansfield, 3rd May 2001).
Only two samples presented the original student work. A set of ten
samples of photocopied students’ work was displayed for
familiarisation; an exemplar of the assessment objectives and marks
attributed to the work was attached to each sample. A set of twelve
samples of photocopied students’ work was displayed for blind
marking exercises. An exemplar of the assessment objectives and
marks proposed for the work by centres was attached to each
sample. In the same fashion as the blind marking samples, a set of seven
samples of photocopied students’ work was displayed as ‘RES’, or
reserved work, to be used if the moderators could not attain
accuracy during the blind marking exercises (researcher diary:
observation A2 standardisation meeting, Mansfield 4th May 2002).
In these meetings, assistant principal moderators and team leaders explained the assessment criteria to
the moderators using samples of student work during the ‘Familiarisation Exercises’.
Although the samples were not very varied, the exercise seemed to help moderators understand how to apply the assessment objectives and trained them in the application of the criteria:
All the moderators received photocopies with the assessment matrix
for each one of the examples with the marks and the justification of
the marks attributed to each one of the assessment objectives
(which were underlined). The Team Leader asked the moderators to
look carefully at the examples and the attributed marks and to
discuss them. Afterwards, the Team Leader explained the work and
the attributed marks showing the evidence related to the assessment
matrix and why it was marked in that particular way (researcher
diary, observation AS standardisation meeting, Mansfield 3rd May
2001).
The team leader explained the marks using the language of the
assessment matrix and adding comments such as ‘the work revealed
a weak area of response to other artists’; ‘the student revealed
confidence in some areas’; ‘the student spent too much time on
research’; ‘the degree of analysis lacked depth and breadth’
(researcher diary: observation A2 standardisation meeting,
Mansfield, 4th May 2002).
The moderators appeared to accept that this was training, although Ian (7-07-2001) expressed some doubts; moderators were ‘calibrated’ to apply the mark schemes without having an active opportunity to discuss the criteria. The researcher did not perceive standardisation
as negotiated interpretation but rather as the imposition of the chief examiner’s
interpretation of the criteria. From the events that took place during the Edexcel A2 standardisation meeting:
…the group had to mark the blind marking samples and verify if
the marks attributed by the centres to the works were consistent. It
was an individual exercise; moderators had 10 minutes to mark
each sample; they marked in silence; the team leader was constantly
advising the group to read the sketchbooks because ‘everything or
almost a great part of the evidence is in there’ (researcher diary,
observation A2 standardisation meeting, Mansfield, 4th May 2002).
However, problems of time and a lack of diversity of samples were apparent. It was not
clear in the time available whether or not the standardisation would be accurate.
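The consistency check at the heart of the blind marking exercise, comparing centre marks against moderators’ marks, can be illustrated with a short sketch. This is a hypothetical illustration only: the function name, the tolerance of two points and the sample data are assumptions, not Edexcel’s actual rule.

```python
def flag_inconsistent_samples(centre_marks, moderator_marks, tolerance=2):
    # Flag every sample where the centre's mark differs from the
    # moderator's by more than `tolerance` points (hypothetical rule).
    flagged = {}
    for sample_id, centre in centre_marks.items():
        moderator = moderator_marks[sample_id]
        if abs(centre - moderator) > tolerance:
            flagged[sample_id] = (centre, moderator)
    return flagged

# Hypothetical marks for three blind-marking samples
centre = {"S1": 54, "S2": 40, "S3": 66}
moderator = {"S1": 52, "S2": 35, "S3": 65}
print(flag_inconsistent_samples(centre, moderator))  # {'S2': (40, 35)}
```

On this sketch, only centres whose marks fall outside the tolerance would trigger re-marking, which is consistent with Richard’s report of fewer than ten re-marking situations in about 100 centres.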
A conclusion was drawn that the large scale of the standardisation meetings could
work against the achievement of consistent marking. The most positive aspect was that the process standardised assistant principal moderators and team leaders who, in turn, standardised moderators through additional training meetings or individual training for new moderators by team leaders.
5.6.4. Moderation
According to the Code of Practice (QCA, 2000) moderation was designed to verify
the consistency of teachers’ marks in centres. In the English model it played a very
important role in ensuring the reliability of teachers’ (and moderators’) internal and
external assessment, not only in ensuring that the same standards were applied but
also the fairness of the system. Students’ work submitted for examination was
guarded from the personal idiosyncrasies of the teachers through this process.
The fairness of results was ensured, and teachers could use moderators’ feedback to improve their practice.
Not all the candidates’ work was submitted for moderation. The moderation sample
comprised work randomly selected across the mark spectrum by the awarding body,
together with the candidates achieving the highest and lowest marks at a centre. Centres
had two options for moderation: (1) to send a moderation sample to the awarding
body or (2) to have the sample moderated at the centre by a visiting moderator.
Option (2) was most commonly used for art and design examinations.
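The sampling rule described above, a random spread across the mark range plus the highest- and lowest-marked candidates, can be sketched in a few lines of Python. This is an illustrative reconstruction, not the awarding bodies’ published procedure; the function name, sample size and candidate marks are hypothetical.

```python
import random

def select_moderation_sample(marks, sample_size=10, seed=None):
    # `marks` maps candidate id -> mark. Always include the lowest- and
    # highest-marked candidates, then fill the rest of the sample at
    # random from the remaining candidates (hypothetical procedure).
    rng = random.Random(seed)
    ranked = sorted(marks, key=marks.get)          # candidates by mark, ascending
    lowest, highest = ranked[0], ranked[-1]
    middle = ranked[1:-1]
    n_random = max(0, min(sample_size - 2, len(middle)))
    sample = {lowest, highest} | set(rng.sample(middle, n_random))
    return sorted(sample, key=marks.get)

# Hypothetical marks for one unit
marks = {f"cand{i:02d}": m for i, m in enumerate(
    [12, 25, 31, 38, 40, 44, 47, 52, 55, 58, 61, 64, 70], start=1)}
sample = select_moderation_sample(marks, sample_size=6, seed=1)
assert "cand01" in sample and "cand13" in sample   # lowest and highest included
assert len(sample) == 6
```

Fixing the random seed merely makes the illustration reproducible; in practice any such selection would be made independently for each centre.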
Moderation consisted basically of ensuring that standards were maintained from year
to year and were consistently applied. Moderators arranged the visits to centres
within fixed time schedules set up by the awarding body. A centre might be visited
by moderators, assistant principal moderators or the chief examiner. At the start of a visit the head of
department, or the teacher representing him/her, described and discussed the course content and organisation, any problems experienced and the timetable (Edexcel, 2000). According to the Instructions for moderators (Edexcel, 2000) the primary function of a moderator was
5.7.1. Weaknesses
The main weaknesses of the model were related to the modular structure of courses
and assessment, which were understood to threaten validity. This presented practical
problems for teachers and students because of the enormous amount of evidence
required and pressures of time. The negative consequences were that it distorted teaching and learning. The following weaknesses were identified:
• Some problems of response validity were detected in the question papers for
the controlled test because of the written nature of the papers and the
information required.
• Sources of potential bias included gender bias, social class and geographical
background bias, and ethnic bias. At the same time, some bias may have been
mitigated because candidates were assessed by their own teachers.
• Process and product were not equally weighted in the assessment criteria.
• Core domains of art and design and essential knowledge, understanding and
• A major problem was caused by the mark schemes, which were entirely based
on discrete criteria.
not pay sufficient care in the design and development stage of the instrument
to the reality of art and design practices through empirical research and
haste.
5.7.2. Strengths
should be able to submit for the examination as final products, sketchbooks,
was graded.
5.8.1. Weaknesses
The archives of exemplary work, as required by the QCA Code of Practice, were
vital to the success of teachers’ and moderators’ training. However, the lack of varied samples for training at Edexcel seemed to threaten the validity and reliability of the standardisation based on these samples.
5.8.2. Strengths
The assessment procedures in the English model had significant advantages over the
• The type and format of the instrument, which was based on multiple evidence
• External moderation provided a way to check the consistency of results in
5.9. Practicality
I think it is true to say that all the art and design examinations involve
internal assessment and external moderation. Most of those who have
written [letters from members of NSEAD] welcome the involvement of
teachers in the assessment process but question the ‘unpaid’ extra work
this involves… An additional problem is that the marking load is all
concentrated around Easter and cannot be moved forward without further
disadvantaging students whose courses in art and design are already
effectively shorter than for most other subjects (Steers, 2001).
examination. For example, problems with logistics – there was not always room in
the schools to exhibit and store the amount of evidence produced for examinations;
assessment within schools; with regard to teachers’ timetables and the unpaid extra
work that they carried out. This suggested that the examination was not sufficiently
adapted to available resources. The fact that three separate units of work were
assessed in each year increased the problems of practicality, especially because this
required a great amount of extra work for students and teachers, which was seldom paid.
5.10. Impact
The English model of external assessment was complex and fulfilled several purposes: checking whether assessment objectives had been attained, providing feedback for teachers on the quality of their professional work, and ranking schools. It was not used only for certifying achievement, but also as a way to monitor centres’ and teachers’ performance and methods of delivering the curriculum. Assessment results were important for all stakeholders.
According to Cindy (10-02-2001): ‘It [art curriculum] is very much assessment driven
and criteria driven’. The impact of external assessment upon schools was considered
to be potentially negative:
The more good results the schools have in examinations the more money comes in. I
am not sure if you are a parent you want to send your son or daughter to a school – if
a school has poor grades you won’t look for that school, you look for another school
(Elisabeth, L, 7-02-2001).
policies, and restricted core domains of art and design and limited learning time.
The examinations could also have a very negative impact upon the type of work that
During the standardisation meeting the team leader warned moderators
about the similarity of students’ works in centres. I thought that it was
kind of revealing that in the centres the students followed strict
orientations and the work should be very similar, at least in the processes
used and organisation; does this mean that art and design in schools is
very prescriptive and there is no great variations in students outcomes? In
this school [St. X School] looking at the art students’ works I have
exactly this feeling: all the student work seems to follow strict
prescriptions even in the kind of sources they used (research diary; 8th
May 2002).
The interviewees considered that accountability in the form of league tables was
excessive. It was considered to have a major negative impact upon schools and teachers:
Grades are public. Teachers’ unions have been claiming that it is not a
way to judge a school, there are some very good schools doing amazing
things but not getting good grades, they are doing a lot of social skills,
lots of activities. Things in the community – none of these are assessed.
It is very unfair (Elisabeth, L, 7-02-2001).
The practice of not entering students for the examination because schools did not want to get bad examination results was reported by interviewees. Similar situations were reported by the press (Halpin, 2003): some schools did not enter less able students for examinations so as not to damage their ranking in the league tables.
A conclusion was drawn that the examiners’ reports published at the end of each
examination had a positive impact because they provided feedback for teachers and
schools. These included an analysis of work submitted for examination and an
evaluation of art and design practice in schools. Awarding bodies also published
the examination procedures, analysing strengths and weaknesses and providing guidance. The feedback provided by awarding bodies to teachers and moderators was a very positive consequence of the model.
The following key ideas for developing art and design examinations emerged at this stage of the research:
authentic tasks, related to art and design methods of learning and making,
3. It should allow both written and visual language; enable students to decide
4. It should not fragment art and design teaching and learning into modules of
coursework evidence.
7. Teachers should be trusted to deliver the examination under normal
assessment.
need to be developed.
12. Provision for examination feedback for schools and teachers would be
necessary.
Summary
Art and design examinations in England were traditionally characterised by
flexibility and diversity, but the 1988 Education Reform Act brought about a chain of reforms that tended to centralise the art education curriculum and assessment with greater prescription and
some positive and negative aspects were noted which provided insights for the development of a new model of external assessment in Portugal. The English instrument for external assessment was considered generally
relevant and appropriate to art and design methods and courses. The tasks addressed
the need to research artworks and design products. Requiring written evidence as
well as visual evidence provided an opportunity to assess the same knowledge,
understanding and skills on separate occasions. However, the instructions were not always considered clear to users, and the instrument relied on teachers’ particular interpretations.
The underlying model of the instrument raised some problems of content and
construct validity, especially in the ways that the mark schemes appeared to
minimise core aspects of art and design such as production, exploration, independent
thinking, risk-taking and more genuinely creative artistic work. The modular structure compounded these problems.
The instrument was judged as not equally fair for all students, and problems of gender, social class and ethnic bias were found. The model offered considerable reliability of assessment results, and suggestions were made about the reasons for such reliability, including the type and format of the instrument and the traditional procedures of moderation.
Chapter 6
Design and piloting of a new external assessment instrument and procedures in
Portugal
This chapter reports a new conceptual framework for art and design
examinations at pre-university level that was developed and tested in this
stage of the research. The content framework for the art and design
6.1. Introduction
Some suggestions and draft materials for the new assessment instrument and
procedures were devised during the first stage of the research taking into account
ideas in the research literature, and analysis of the data from the study of Portuguese art examinations.
After analysing the data and conclusions of the study of English and Portuguese art examinations, specifications and constraints were drawn up for a new model of external assessment in Portugal.
Although the English examinations had some of the strengths sought by Portuguese teachers, their weaknesses needed to be addressed in order to avoid future problems of validity and practicality, for example, sources of bias and teacher and student overload (see Appendix XXVIII).
6.2. Designing a new assessment instrument and procedures for art and design
The design of the assessment instrument and procedures was developed in three
stages:
reasonable sample, evaluate and refine the draft specifications for the
assessment and evaluate the specifications for the assessment instrument and
6.2.1. Rationales
The first stage of the design concerned the specifications for the proposed art and design examination, drawing on ideas encountered in the research literature, the views of teachers and students (Chapter 4), and an analysis of Portuguese art and design syllabuses. The specification
of the instrument followed the rationale for art and design education described by
Swift & Steers (1999) who argued that art education should address three
creative and critical thinking, and the skills to interrogate dominant ideologies.
1. Conventionalisation: an awareness and ability to use the
conventions of art forms.
2. Appropriation: embracing for personal use, the available
expressive form.
3. Transformation: the creative search for personal knowledge.
4. Publication: the placing of the outcome in the public domain.
These dimensions of aesthetic learning operated as four basic quadrants from which
to systematise the contents and skills of art and design for assessment purposes.
The review of literature (Chapter 1/2) provided some insights into what is
or what should be assessed in art and design. Eisner (1972, pp. 212-216)
claims that the qualities that could be evaluated in the arts are: 'Technical
that these are in constant interaction. According to Dorn (2002) there are
The design of the new assessment instrument was influenced by the idea that
1992; Amabile, 1987). The knowledge base should include theory, for example
concepts and techniques of art and design, as well as knowledge of past and
contemporary art, design and media works from different cultures. Knowledge and
understanding of art and design works (and, more broadly, visual culture) using both
formal and contextual analysis was considered very important. But the new
instrument should not only provide a test of memory and analytical skills; it was
essential also to include critical skills. Creative work in art and design depends on
knowledge of art and design works, understanding the principles and elements of
visual language, and the acquisition of skills, technologies and materials (Dobbs,
1992; Best, 1996). Creative art and design work needs the contextual knowledge of
art and design discovered through research and critical analysis of wide-ranging sources, and what Eisner (1972, pp. 217-222) called ‘boundary pushing’: the ability to attain the possible by
Like other creative processes, artistic work involves memory and researching
Through understanding the symbolic rules and conventions the student can combine
The student receives stimulus in the form of external inputs from their social
milieu and also initial impetus from within the individual (Amabile, 1990;
Csikszentmihalyi, 1990). A decision was taken, therefore, that the new assessment
instrument should include the personal and social milieu inputs through a record of
Problem-solving and decision-making skills were also considered important to allow students to develop personal visions of the world and break boundaries.
and limitations in present theories and proceed to develop new premises, which
contain their own limits, they must be able to establish an order and structure
between the gaps they have 'seen' and the ideas they have generated’ (Eisner, 1972,
p. 217-222). In the new instrument, skills of creative thinking and making involve
personal style, signature or voice (Ross et al, 1993, p.53). These skills, considered in
Evaluation skills play a significant role in creative thinking (Feldhusen & Goh,
the design of the new instrument. Self-evaluation skills were interpreted as the
the full (MacKinnon, 1962, p. 485). Self-evaluation, in what Ross (1993) calls the
publication stage, was extremely important in the design of the instrument. By this
means students could reveal their ability to persuade others, including the audience,
viewers or assessors who would later validate the work, of the value of their work.
6.2.3. Defining a content framework for the new art and design external
assessment
The aims and general objectives described in the Portuguese Studio Art and Drawing
syllabuses were broad enough to enable the construction of a content framework that
included knowledge of art and design, independent thinking, critical and creative
skills and self-evaluation. Although technical skills in art and design formed a large
part of the content, space was left for creative process and contextual studies of art
and design. The current Portuguese syllabuses allowed the construction of a core
framework in the form of a blueprint (see Figure 11), which attempted to foster a
holistic vision of art and design, promoting risk-taking, personal enquiry and critical
skills.
[Figure 11. Content framework blueprint, organised by Components, Contents and Activities, including communication skills and evaluation skills.]
6.2.3.1. Kinds of evidence
According to the majority of Portuguese art teachers and students who responded to
the survey (Chapter 4, p. 139), ideally art and design assessment instruments should allow both written and practical responses. They wanted the portfolio to include: a record of students’
intentions and how these are realised; preliminary investigation and research work;
assessment notes or reports; and group criticism notes. These tasks should provide
students with an opportunity to reveal what they can do and what they know.
6.2.3.2. Portfolio
After having considered the views of Portuguese teachers and students and the research literature, a portfolio was adopted, through which students choose a theme, plan, elaborate, present and evaluate their work (Lindström, 1998).
Some degree of portfolio assessment has been tested out and used in a variety of contexts for formative and external assessment (Gardner, 1992; Beattie, 1994; Hernandez, 1997).
The portfolio was devised as an open-ended project to explore a theme that should
The required evidence needed to be flexible enough to allow students to choose
themes and tasks that best suited their individual learning styles, gender, ethnic group
and social background. The intention was to give them opportunities to choose
themes and tasks in order to allow fairness and equity in assessment, so that those
who were disadvantaged in one task could have an opportunity to offer evidence in another, so that their voices could be heard and their intentions made clear. Portfolio evidence was organised into three
main components: (1) Process; (2) Product and (3) Self-evaluation. However process
and product were not seen as discrete units; the framework used both concepts with
the necessary flexibility to respect different ways of working in art and design.
In light of the Portuguese tradition of one single examination period at the end of the
art and design course, it was considered appropriate to develop one single portfolio
project as the final art and design assessment instrument for Portugal. A decision was
taken that the external assessment should be conducted in the last term of the
academic year during normal art and design class time. Students and teachers would
receive all the instructions at least one month before the examination to allow for preparation.
The design of the instrument was limited by the nature of assessment itself: it is an
exercise of power, involving some kind of judgement over others. It was also limited by interpretation by users. The five criteria suggested by the researcher for negotiation and refinement during the pilot were devised in accord with the content framework:
From the results of the survey in Portugal and research literature it was suggested
that the new examination should have assessment procedures based on appropriate
procedures for standardisation of moderators and moderation contributed to the
Csikszentmihalyi & Gardner, 1994). A plan for in-service teacher training and a procedure for external verification of the internal marks were also designed. Students’ portfolios (or a significant sample) would be made available to one or more external moderators.
The design of the assessment instrument needed a priori validation through a pilot
examination. The pilot (Trial 1) was intended to ensure that negotiation took place
between users in order to determine final formats, tasks, media, criteria, mark
weightings, grade descriptors, time, and assessment procedures (see Appendix XIV,
pp. 170-188). The piloting of the initial instrument was intended to: (1) test the validity and reliability aspects of the instrument, and (2) propose strategies
During the pilot, data were collected through researcher observation notes,
questionnaires, student group interviews and reports. A research diary was kept in
order to collect information about the experiment, where notes about the pilot
sessions were recorded and transcriptions made of informal interviews with the
participants. One external observer evaluated the pilot or first trial: Graça Martins,
a recognised Portuguese expert in art and design education (see Appendix XIII).
Three student group interviews were conducted with a sample of twelve students between 12 and 16 January 2003; the schedule for interviews included three sections
with questions about the validity of the assessment instrument (see Appendix XI).
A questionnaire for students and teachers involved in the trial and a checklist for
teachers were also used to collect data (see Appendix XII, pp. 141-166). The
questionnaire for students was conducted between 11 and 17 January 2003. The
checklist and questionnaire for teachers were conducted during the fourth pilot session; some questions were adapted from those used in the initial survey questionnaire (sections 3 and 4). However, in the trial, it was
found that the questionnaire and checklist schedules were not very effective and they
needed further revision before they could be used in the main trial. Some questions
were repeated (2.6; 5.3; 5.4); others were irrelevant (3.10; 3.11; 3.12) or too
technical for students (4.5). So a decision was taken to remove repeated questions,
add new detailed questions about instructions, criteria, weightings and tasks, add
questions of the checklist for teachers to the questionnaire for teachers and remove
the questionnaire.
A list of teachers willing to participate in the trial of a new instrument was compiled
from the survey questionnaire. From that list, ten teachers from the central region of
Portugal were contacted to participate in the pilot. Initially the ten teachers and their
students agreed to participate, but later three of the teachers withdrew because they
could not attend the time-consuming meetings that were necessary.
Participating teachers:
Age: 30-40 (4); 31-50 (2); 51-62 (1)
Gender: M (2); F (5)
Background: Painting (3); Sculpture (1); Design (3)
Subject taught: Studio Art (OA, 4); Studio Design (OD, 1); Descriptive Geometry (DGD, 2)
Role in the pilot: Teacher (5); Moderator (2)
The pilot was held at Escola Secundária Alves Martins in the city of Viseu. An
auditorium was used during the first session to allow exemplar portfolio documents
to be projected. The other sessions took place in a normal classroom. The schedule was as follows:

Session 1, 28/9/02 (8.30h-13.30h):
• Explanation of the aims and purposes of the pilot
• Explanation of the assessment instrument: portfolios and assessment procedures based on moderation
• Explanation of the proposed underlying model and content framework
• Examples of portfolio assessment instructions were displayed from Finland, Arts Propel, the International Baccalaureate and Edexcel (UK)
• Reproductions of visual exemplars of students’ portfolios (AS and A2) from Edexcel’s art and design examinations were also displayed
• Discussion
Activities: distribution of roles. Teachers were asked to experiment with a portfolio with their students; only IL and AC could not do this because they were teaching descriptive geometry at the time. However, they were interested in participating as external assessors.

Session 2, 12/10/02 (8.30h-13.30h):
Production of instructions for the new examination: a draft document including general guidelines for the format, tasks, criteria, weightings and grade descriptors was discussed. Negotiation with students.
Activities: experimental trial project using the portfolio with students (22 hours), October-December.

Session 3, 16/11/02 (8.30h-13.30h):
Evaluate progress, experiment and discuss marking procedures and mark schemes (familiarisation exercises).
The pilot was an essential part of the design of the new instrument since teachers and
students’ opinions and suggestions were constantly sought in order to obtain their views. It was necessary to explain to the volunteers the nature of portfolio-based assessment and moderation, using current examples, because for them it was very new in terms of external assessment.
Teachers reflected upon the proposed model, content framework and draft
specifications and agreed that the proposal was generally in accord with their own
models of teaching art and design and perception of content. The time for the
external assessment was set by common agreement: (1) preparation work: 25 hours
or 3 weeks extra class time; (2) developmental studies and final products: 15-20
hours during the three hour class-time sessions; (3) 2 hours for the self-assessment
volunteers marked a sample of thirteen portfolios and the results showed that the
difference between markers was 1-3 points out of a total of 20 (for details of the
Portuguese marking system see p. 114). Internal consistency was 1-2 points’
difference between first and second marking of the same assessor. Although two
markers obtained less consistency, the moderation marking results for inter-rater reliability were considered acceptable. The volunteers considered that the instrument had strong validity and provided valuable
results. They suggested final revisions of the instructions and helped to inform the
The pilot exercise was centred on negotiation. ‘The design was based on teacher
ownership: sharing power and constructing knowledge instead of an elitist, top down
XIII). Participants and the external observer considered this collaborative approach
to the design of the instrument a novelty. Teacher ownership of the assessment was
ensured and emphasised, and students' opinions were also taken into account during the pilot.
The participating teachers and external observer considered the underlying model
appropriate for the subject (See Trial 1 External observer report, 2003-03-08/
Appendix XIII). Generally teachers said that it was in accord with their own views
about what should be taught in art and design and with the learning objectives for art
and design.
6.3.3.2. Format
instrument and teachers, students and the external observer welcomed the flexibility
of portfolios, which allowed for differentiated responses and respected students' own interests. The questionnaire results showed agreement with the validity of the external assessment in terms of contents; clarity of
judged the instructions and criteria to be less clearly stated than teachers did. Students said that they had worked in an authentic way, realising tasks that were closely related to art and design learning practices:
We had to experiment with materials, techniques, formal elements
and composition, like colour, shadow, textures, etc. We had to
show technical abilities; knowledge about art forms and art making,
we had to show how creative we were (SR A, 15/01/2003).
Teachers and students also commented on the instrument format. Although teachers said that they normally used coursework for assessment, they considered that the format of the portfolio was more demanding:
Things like the work journal are not commonly used, however
Studio Art syllabuses talk about it. But students are not used to it
(TR C; 18/01/2003).
In fact students had not previously been required to make notes or comments about their work:
I was not used to being critical towards the work of others and my
own work. It was quite difficult to make the written comments (SR
F; 16/01/2003).
But, for F and some others, the difficulty was not the writing skills but the evaluative
skills:
… it is not the writing which is difficult; I prefer to write because I
have time to think during the writing. The difficulty was in being
critical, analysing the works and expressing personal opinions.
We were not required to do so before the portfolio. Last year,
they [teachers] were more interested in our drawing and painting
abilities (SR F; 16/01/2003).
The new instrument introduced unfamiliar approaches into Portuguese art education. It was a challenge for teachers
and students used to art and design assessment focused on craft-skills and self-
expression. The pilot experiment revealed that students, more than teachers, felt the difference in the approach to art and design assessment. Although 92.2 percent of students agreed that they were well prepared to develop portfolios (questionnaire statement 6.1) and 100 percent of teachers agreed that they were well prepared to conduct the examination, students found the development of critical skills difficult, even though they considered critical skills important. Teachers suggested that to overcome this problem there was
an urgent need to develop such skills. Some strategies were proposed, such as
promoting more group discussions in which the teacher could act as a mediator,
introducing facilitating questions to help students to reflect upon their work and the
work of others. Students considered group discussions useful during the development of their portfolios:
It was fine when we had the group discussions about our work, we
helped each other. Well, sometimes my friends told me I was doing
wrong and I didn’t agree with them; but it was an useful exercise,
in the way we had to explain what we were doing and making
statements about the quality; what is good and why…using the
criteria words to explain it. But I am not sure about its importance
for assessment (SR M; 14/01/2003).
According to the questionnaire results, students and teachers used group discussion
during the preparation of portfolios, but held different views about the teachers' influence. Teachers' responses showed that they believed that their influence was less important than students
perceived it. This may be because teachers thought that their influence was not very
strong since the portfolio was produced with a minimum of their help, whereas
6.7. Students had freedom of choice to make their project briefs according to their interests and motivations. (Students: 94.1%; teachers: 100%)
6.8. It is true that I influenced students in their choice of works and approach. (Students: 58.8%; teachers: 20%)
I think the teacher should take it into account. But some students
are more individualistic; they need to be left alone. So it is not fair
for them, if they don’t like to talk, it is their right to be silent during
group discussions (SR F; 16/01/2003).
Students agreed that for some of them, group discussions might benefit assessment.
Teachers agreed that the best way to include them in the final assessment would be
through the recommendations of the students' own teacher. They agreed that making group discussions compulsory might disadvantage students. The use of work journals was another strategy designed
to foster self-reflection. Some teachers observed that the students who developed
their projects with work journals were more critically aware than the others.
Work journals including visual, oral or written comments were considered equally valid. Therefore, it was agreed to strongly recommend the use of work journals in the external assessment.
The questionnaire results showed that 68.8 percent of students and 85.7 percent of
teachers agreed with statement 6.13: ‘It was important to have the option of
providing visual and oral (taped) comments about the intentions, quality and progress
of the work in the portfolio’. However in the pilot all the students chose to express
their notes and comments in writing but this should not eliminate the possibility of
using other means of expression. Whether to use verbal or non-verbal languages was
debated during teachers’ meetings and there was consensus that students should have
the freedom to use the kind of language that best suited their knowledge and
personality. Poor writing skills should not disadvantage students and there was a
shared opinion that visual and oral commentaries were acceptable. It was agreed that students could submit notes and comments by audio, video and digital means and
such records should be afforded the same value as a work journal or a written report.
6.3.3.3. Bias
The questionnaire results showed that no significant bias was detected, although some students perceived some bias related to social class.
After analysis of teachers’ comments during the pilot training sessions and
students’ group interviews, evidence showed that the instrument increased teachers’
responsibilities. The fact that the teachers' role was so important in supporting the students was identified as a potential source of bias. The teachers discussed this point during the last pilot training session and
recommended that teachers should have training meetings and mutual support before
and during the examination. They thought meetings should act as a vehicle for
training and also for monitoring. However it was agreed that students should not be
left out of the dialogue and the researcher proposed that for future trials an Internet
website for the examination might be published, with a teachers’ and students’
discussion forum available to all the users. A students’ forum might reduce bias in
the instrument because those students with a less committed teacher could ask other teachers and students for support.
6.3.3.4. Tasks
Students, teachers and the external observer considered the portfolio tasks
appropriate and well organised, presenting a balance between process and product
(for details about tasks see Appendix XIV, p. 171 and p.173).
It was the first time students had been required to submit a considerable amount of
evidence for assessment. Although they complained about lack of time, they agreed
that the portfolio provided a way of structuring their work:
I think the tasks are fine; and the way portfolio is organised helped
us to develop the project, gave us a kind of method of work (SR M;
15/01/2003).
Some students found the investigative studies or research tasks, which were new to them, boring at first:
…the investigative work was boring; we are not used to doing this
in Studio Art. [R: Do you think it will be better to eliminate
investigative work?] No, not at all, we learned through searching
the work of artists, I think it is helpful to develop our own ideas, in
the sense that we see other things; we discover ways, techniques,
materials, approaches to ideas and it is not like in history of art, we
can chose our sources and study works we like and which are
useful for our projects (SR T; 16/01/2003).
Previously students had written essays about artists and art movements but had not
been required to connect their practical work with what they learned from theory.
In a great majority of the portfolios, the researcher observed that students failed to
connect the analysis of sources with their own projects. Interviews with students
showed they were aware of this. They said that it was very new in terms of
exploration of ideas. The teachers thought that this might be a consequence of the
breadth of the instructions. They recommended that the instructions for students should state what was expected more clearly and in simpler language.
All the teachers were strongly in favour of self-evaluation tasks, but self-evaluation
requirements were not considered so important by all students; some boys in particular disliked them. Generally boys did not like self-assessment, and it was apparent from the interviews
that this was partly because it implied extra effort. However the majority of girls
agreed that the inclusion of self-evaluation tasks improved their learning and
performance.
The responses from both students and teachers to questions about timing in the
questionnaires suggested that this was a main constraint, but after reflecting on this
during interviews they agreed that time was not the real problem – rather it was lack
of a disciplined work method. Because it was the first time they had developed a
portfolio with such requirements, they did not plan the various phases of the work
sufficiently.
We had little time for the final product; I think the final product is
very important. But I agree with the sketches, we need to
communicate the ideas before the final product… but I didn’t plan
it well (SR F; 16/01/2003).
According to the teachers, planning was not the only problem; students were not
motivated to carry out extra schoolwork, especially because Studio Art was not highly valued in the curriculum.
But when students were asked if they thought the preparatory studies should be
eliminated from the model, surprisingly all seemed to be in favour of them. It was
clear from their responses in interviews that they had not realised how demanding this examination was.
It was evident that the new assessment instrument challenged old habits of work in the discipline, and all students were quite aware that standards had been raised.
The physical conditions and resources for the examination (classrooms space and
conditions for storage of materials) were inadequate in the school where the pilot
took place; however the school had a good library and all the students had access to it. The volunteers agreed that the instrument was well adapted to the curriculum and underlying teaching models, and that the tasks were fair and balanced for art and design students. The emphasis on process, critical
reflection and self-evaluation, was approved as essential to the nature of art and
design learning and practice. However the unfamiliarity of some tasks was a problem
because of the inherent risk of failure. Teachers agreed that these risks should be
taken, at least in the pilot. They were in favour of the underlying conceptual model, even though they had not previously required students to reveal their capacities so fully.
Teachers and students agreed with the assessment criteria and weightings (see Appendix XIV, pp. 183-184); teachers thought they were a good basis for
establishing common ground for judging students’ performances but they identified a
need to simplify language in the assessment matrix (see Appendix XII, p. 136).
Although students agreed with the criteria in the questionnaire, some of them
complained they were unsure of what qualities their art and design teacher and
external assessors were looking for in their portfolios. According to the teachers AC
and IL, students did not understand the statement in question 3.4 (Appendix XII,
p.147): ‘Students are aware of what qualities their art and design teacher and
external assessors are looking for in their portfolios’, and tended to misinterpret the
term ‘qualities’ as ‘taste’. A revision of the criteria in the instructions for students
was needed to explain further the concept of 'qualities' and how this was intrinsic to the assessment.
6.3.3.7. Impact
Students understood the portfolio not only as an assessment instrument but also as
a learning experience. They agreed that the portfolio was an important tool for learning:
5.1. I agree that putting together the art and design portfolio has been a worthwhile learning experience for the students. (Students, n=51: 90.2%; teachers, n=7: 100%)
5.2. Students' art and design portfolios are a good foundation for further study in art and design. (Students: 84.3%; teachers: 100%)
Table 33: Responses to questions about impact.
The new instrument influenced the way art and design was taught and perceived. In the questionnaire, 85.7 percent of teachers agreed with statement 5.5, concerning a changed perspective on art and design education. According to the results of the teachers'
checklist, the main effects of the instrument on teaching were related to the need to
foster investigative studies and self-evaluation tasks in their teaching practice.
Students, teachers and the external observer considered the assessment procedures to be reliable:
7.3. It was useful to have visual exemplars to understand the criteria and standards. (Teachers: 100%)
7.4. I had adequate training to assess students' portfolios. (Teachers: 100%)
Table 34: Responses to questions about assessment procedures.
In the last pilot session (18-01-2003) teachers said that they had reached a common
interpretation of criteria and that the meetings had proved very important for training
them to help mark students' work consistently. Teachers considered that using real student portfolios for standardisation was valuable:
I was afraid about external marking; because it was the first time I
encountered it. But in the end it worked quite well and this was a
sort of checking out my standards. The exercise with portfolios was
helpful to share a common interpretation of criteria, so I am not
surprised with the results of external assessment; we shared the
same views about the levels of achievement. (TR C; 18/01/2003).
And this process of external marking was fair for everyone; at the
same time I feel safer because someone is checking the accuracy of
my marking (TR P; 18/01/2003).
The proposed assessment procedures were considered more reliable than the existing
examination procedures. Clarity of assessment criteria and the matrix was identified
as a key factor enabling consistency of results, but the major reason given for
improved accuracy of marking was the quality of teacher training and the use of real
student portfolios for standardisation. However the training sessions were time-
consuming, partly because all the instructions had to be negotiated through long discussions.
The time spent in training sessions for teachers was not realistic given that the
teachers were already overloaded: the withdrawal of three participants made this
clear. This aspect of the model needs further thought. Shortening the training
sessions would be practical, but it would probably have negative effects on the
procedures. A possible solution to the lack of time for training could be to provide training meetings in different regions of the country in order to avoid the problems of teachers undergoing long distance travel. However it is reasonable to expect that the time spent might be reduced in future administrations, once teachers are familiar with the instrument.
Fostering the active involvement of teachers and improving the clarity of instructions
for students (see Appendix XIV, pp. 176-181) might overcome the problem of the
unfamiliar format of assessment. Portfolios demand more work, and more organised
work, from students in terms of motivation, structure and presentation. In the new
assessment instrument, the relationship between teacher and student changed when
assessment was integrated into learning. The instrument was considered to have the capacity to elicit knowledge, understanding and skills in art and design – and offered a rich range of
art and design activities that were considered inspirational. The instrument was
judged to be flexible in its range of options. But points of fragility were detected
such as unfamiliarity with the format and potential bias resulting from the role of the teacher. Addressing these would require clear instructions and assessment matrix, effective in-service teacher training and purposeful dialogue
between teachers and students. A possible strategy for attaining such goals within
time and space constraints might be: (1) to conduct only one standardisation meeting
with the volunteer teachers, (2) to create a permanent support structure during the
examination in order to establish dialogue between teachers and students – possibly through an Internet discussion forum.
Summary
In this chapter the design of the new instrument was described and the external
assessment draft materials were explained. The assessment materials were discussed
and negotiated during the pilot in order to investigate issues of validity, practicality
and reliability. Problems were identified and were used to re-design the examination
instructions (see Appendix XIV). It was concluded that the new instrument and procedures required: (1) effective in-service teacher training, (2) a change of attitudes towards studio based art disciplines, increasing the value of Studio Art in the curriculum and encouraging students to engage with the subject with greater commitment and, (3) the development of critical and evaluative skills.
Chapter 7
Main Study: Trial of the new external assessment instrument and
procedures in five schools in Portugal
In this chapter the trial of the new instrument and external assessment procedures is described. The evaluation was based on data collected during the trial through a research diary, interviews, questionnaires and teachers' reports.
The sample was determined from a list of nineteen teachers who were willing to participate in the trial of the new instrument, obtained from the questionnaire survey respondents (Appendix VII) and from the list of teachers involved in the pilot.
The teachers were contacted by telephone, the trial activities and timetable were
explained to them and they were asked if they were still willing to participate. Ten of them accepted the invitation. The teachers in the sample were aged from thirty to fifty, had six to twenty-five years of teaching experience, and came from
different regions of the country (North, Centre and South). They included five
teachers who had collaborated in the pilot, three of whom continued to use portfolios
in their studio art internal assessment throughout the year while the other two were
experienced teachers who had acted as moderators during the pilot study and were invited to act as moderators again.
However these teachers were clearly not representative of all secondary studio art
teachers in Portugal since their availability for the trial suggested a particular interest in assessment.
Name  Age  Teaching experience (years)  Gender  Background           School     Role in the trial  Number of students
J     42   20                           Male    Master (Multimedia)  P (south)  Teacher            5
AM    39   16                           Female  BA (Design)          B (south)  Teacher            15
The schools and studio art classes were very different and included: (1) two schools
located in suburban regions in the South region of Portugal with students from
middle and working class families and ethnic minorities (Schools P and K); (2) one
school located in the biggest city of Portugal with students from middle and high
social status families (School B); (3) one school from a medium size city in the
Central region with students from middle class families (School V) and, (4) one
school from a small city in the North West region with students from rural and working class families (School A).
The volunteers asked their 12th year Studio Art students if they wanted to participate
in the trial; it was explained that the new instrument was demanding and they would
need to submit more work than usual but, at the same time, it would allow them more
freedom and opportunities to reveal their knowledge, understanding and skills in art
and design. With the exception of those at School P, the students were very
motivated to try out the new instrument. One hundred and seventeen students aged
between seventeen and twenty years of age were involved in the trial (seventy-two
A research diary was kept in order to collect information about the trial, where notes
about visits to schools and standardisation meetings were recorded and transcriptions
made of informal interviews with the participants. One external observer evaluated
the trial: Leonardo Charréu, a recognised Portuguese expert in art and
design education (see Appendix XXIII). Five group interviews were conducted with
a sample of 31 students between 2 and 26 June 2003 (for details see Chapter 3, p.89).
A questionnaire for students and teachers involved in the trial was also used to
collect data (for details see Chapter 3, p.93). The questionnaires were conducted
during July 2003. Other data was collected through teachers’ reports.
The main trial started at the end of the second school term in 2003, before the Easter
holidays and finished by the end of June 2003. The schedule of activities comprised standardisation meetings; visits to schools and interviews with teachers and students; internal marking of students' portfolios; moderation by the moderators; and on-line meetings with the volunteers (moderators and teachers).
March/April 2003
  Teacher and moderator activities: standardisation meetings: (1) 7th March 2003 (School K); (2) 10th March 2003 (School V).
  Teacher and student activities: reading the assessment booklets in class; explaining the instructions; aided work: start preparatory work (investigative studies and first developmental studies).
June/July 2003
  Teacher and moderator activities: on-line meetings (web page); post-trial meeting (18 July at School V for final evaluation and marking samples).
  Teacher and student activities: internal marking (10-15 June); moderation (16-30 June); questionnaires, teachers' reports, interviews.
Table 36: Trial 2 schedule
The external observer characterised the trial activities as ‘… a critical and innovative
dialogue about assessment, where art assessment problems were discussed from
practice and solutions were tested in order to achieve a model of assessment adapted
to the nature of art with more validity and reliability' (Trial 2, external observer report).
Two standardisation meetings were held. The first meeting with teachers took place
on the 7th March 2003, in the ceramic room at School K, from 9.00h to 19.00h with
teachers AM, E, I, J and M, and moderators IL and A. The second meeting took place on the 10th March 2003,
in the studio art room at School V, from 9.00h to 19.00h with teachers AP, R and C and moderators IL and A. The agenda for the meetings followed a common format.
The volunteers received a copy of the assessment booklet that had been developed for Trial 1 (Appendix XIV) two weeks before the standardisation
meetings. At the School K meeting the booklet was new for all the teachers involved,
since it was the first time they had participated in the experiment. However their
responses to question 15 in the questionnaire had shown that all of the teachers found the booklet easy to understand and agreed with the statement 'The instructions for the examination were clearly stated'. AM, I and M said that they
adopted similar strategies in their teaching, J said that although he was motivated by
the trial and agreed with the rationale, he expected some resistance from his students
because they were not very motivated and were unfamiliar with the tasks. Since the
moderators had participated in the pilot they described the pilot experience and
showed the fifteen marked portfolios, explaining how the criteria had been applied.
The group spent two hours looking at the marked portfolios asking questions about
specific issues related to the type of evidence and how it was marked. After the
familiarisation exercise, the volunteers were asked to mark individually the ten
portfolios displayed for the blind-marking exercises. Afterwards the group discussed
the marks.
At the meeting in School V the standardisation was easier to conduct because all the
teachers involved had participated in the pilot. The first and second parts of the
agenda were covered very quickly. The volunteers described other portfolios they
had developed with students during the second term (January-March) and showed
student work. Since they had already marked the exemplar portfolios during the pilot,
the group also decided to use the new ones for the familiarisation exercises.
The results of the blind-marking exercises in both meetings revealed that the
volunteers were reasonably consistent in their marking, with only one to four points' difference between markers:
J IL AP AM C M E I A R diff
Portfolio 1 6 5 6 6 6 6 6 6 6 5 1
Portfolio 2 10 12 11 12 12 12 13 13 12 12 3
Portfolio 3 8 8 8 7 8 7 6 8 8 7 2
Portfolio 4 9 9 8 12 9 9 10 9 8 10 4
Portfolio 5 18 18 17 19 17 16 17 18 17 19 3
Portfolio 6 11 11 10 11 10 10 9 10 12 11 3
Portfolio 7 10 10 10 11 10 10 10 9 10 11 2
Portfolio 8 20 20 19 20 19 18 19 20 18 20 2
Portfolio 9 16 17 16 17 16 16 16 17 16 17 1
Portfolio 10 8 8 7 8 7 7 7 7 7 9 3
Table 37: Trial 2: standardisation blind marking results.
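The 'diff' column in Table 37 records, for each portfolio, the spread of the ten markers' scores, that is, the highest mark minus the lowest. As a minimal illustrative sketch (the function name and data layout here are the editor's own, not part of the study's procedures), the spread for the first five portfolios can be computed as:

```python
# Illustrative sketch: computing the mark spread ("diff") for each
# portfolio from the ten markers' scores, as reported in Table 37.
# The data reproduces the first five rows of the table.

def mark_spread(marks):
    """Return the difference between the highest and lowest mark."""
    return max(marks) - min(marks)

# Marks awarded by J, IL, AP, AM, C, M, E, I, A and R (out of 20)
portfolios = {
    1: [6, 5, 6, 6, 6, 6, 6, 6, 6, 5],
    2: [10, 12, 11, 12, 12, 12, 13, 13, 12, 12],
    3: [8, 8, 8, 7, 8, 7, 6, 8, 8, 7],
    4: [9, 9, 8, 12, 9, 9, 10, 9, 8, 10],
    5: [18, 18, 17, 19, 17, 16, 17, 18, 17, 19],
}

for number, marks in portfolios.items():
    # prints the portfolio number followed by its mark spread
    print(number, mark_spread(marks))
```

Under this reading, the five rows above give spreads of 1, 3, 2, 4 and 3 points, consistent with the reported consistency of one to four points between markers.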
Under the designation 'Art assessment' a web page was constructed on the Internet to support distance in-service teacher training. The contents of the site were: (1) explanation of
the rationale and conceptual framework; (2) explanation of the portfolio and
evidence required for assessment submission; (3) instructions for teachers; (4)
instructions for students; (5) digital reproductions of marked portfolios; (6) teachers’
and moderators’ forum; (7) students’ forum; (8) on-line chat programme (#af31
channel at IRC). The web page, forums and on-line channel aimed to provide a space
for continuous dialogue between the users of the new instrument. On-line meetings
were conducted with some of the volunteers and the researcher on 5/5/03; 21/5/03;
16/6/03; and 25/6/03 between 18.30-21.30h. However they were not as useful as
expected because of a lack of teacher and student participation. Only four students
sent messages to the student forum asking questions about the appropriateness of
their project briefs. Teachers' and moderators' participation in the forum and on-line meetings, although important, was very limited. According to the external observer's report: 'It seems obvious [in this case]
that inter-personal communication via IRC does not have the same qualities as
to think about their replies allowing a richer dialogue between the participants’
The portfolio assessment was administered during the last term of the year.
The teachers explained the portfolio requirements to the class and the types of evidence accepted; most advised students to opt for a project brief and media that could be easily realised taking into account
available resources and class time. R and AM gave their students complete freedom
of choice. However a significant number of students chose the proposed theme:
‘Excluded’ or themes related to it (for details see Appendix XIV, pp. 190-192). A
great diversity of media was evident in R's art class (multimedia; web pages;
technical drawing). Different media were also chosen by some students at Schools K
(for details see Appendix XX) and B (installation; illustration; advertising; painting; performance; video; sculpture and photography; for details see Appendix XIX). At
School A all the students chose themes related to the theme given by the teacher, 'Human figure'. They developed expressive drawings and prints (for details see
Appendix XXII). AP and C suggested the theme ‘Moods’ to their students who
The art class at School P only had five students and the teacher viewed this as
negative. The teacher described the school location as 'a dormitory town' and the students' economic background as 'very poor' (J interview, 2 June 2003).
The school did not provide students with access to any other kinds of cultural
activities; the school environment lacked stimulus. As J pointed out ‘in the school
nothing really happens’. In his view students’ poor attitudes and lack of motivation
All the students rejected the new instrument because they did not see it as appropriate to the subject:
I didn’t like the portfolio; Studio art is for drawing and experimenting
with new techniques, not for investigative studies (SR Sergio, 2nd June
2003).
All the P students expressed the same opinions in section 3 of the questionnaire.
They were unfamiliar with the new requirements, their previous learning in art was
underpinned by a formalist approach, and because of that they could not perform well in the unfamiliar tasks:
[During the course] It was not the same thing; the preparation/search was
just finding an image and after we did the work using the image; for
example an impressionist or abstract work. We are used to making
drawings (MTEP; Studio Art) to experiment with specific techniques the
teacher ask us to experiment… The work is the same for everyone; for
example a face and then we work the form; light; colour, etc (SR Sergio,
2nd June 2003).
According to the teacher, the educational practices in the school followed a different approach. Students found the new instrument difficult and inappropriate for them because they had not been prepared to develop independent thinking and critical skills at school:
to experiment. They limited themselves because they
were afraid to experiment or explore (TR J interview,
2nd June 2003)
Overall students considered the new instrument too time-consuming and too broad.
For these students the instructions were not detailed enough and the criteria focused
on knowledge, understanding and skills in art and design which they considered
unimportant; they wanted to be assessed on short and prescriptive tasks and not have to produce critical reviews. Unfamiliarity with such tasks and a lack of previous training in critical
thinking were the main reasons why this trial failed in that school. These students
understood studio art merely as technical skills. In the two months available for the
trial, the teacher was not able to modify habits and foster the development of independent thinking. Since it had been an error to use the new instrument with these students, the teacher did not use the resulting marks for his internal assessment, so participating in the trial did not disadvantage them.
School B was a contrast. Located in the centre of Lisbon, it was a ‘top’ school.
According to the teacher, the students’ social and cultural background was rich, they
came from elite basic education schools and their parents were involved with school
activities. They frequently visited exhibitions, concerts, and theatre and dance
festivals.
According to the teacher they were also very competitive and motivated.
The researcher observed that the relationship between the teacher and students was very good.
AM had no problems applying the new instrument with the students because she
already assessed students in a similar way and fostered the kind of knowledge,
understanding and skills required by the new assessment instrument in her classes.
So, the tasks were not unfamiliar and students had the necessary training to perform them. This seemed to be a key factor for the success of the new assessment instrument in this
case.
In general, students considered that the new assessment instrument was valid. The success of the new instrument at School B was related to the students' characteristics, notably their autonomy:
Autonomy; we learned it little by little during the course (SR Diana, 2nd
June 2003).
However other important factors were the students' attitudes and behaviours, notably their independence in learning:
We learn with what the teacher gives us, but we also learn by ourselves
(SR Ana, 2nd June 2003)
School A was a small school in a small town in the north-east of Portugal, far from
the big cities on the coast. The town had a museum and a library but few cultural
events. The school lacked appropriate physical resources for the arts, and the teacher
said that he tried to compensate for the lack of library and cultural activities by using
his own books and promoting school trips to exhibitions and museums.
The school receives students from the region. We have students from
different socio-economic backgrounds. The students are very
motivated but general standards in all disciplines are low. Students are
used to a school system based on memorisation and they often have
difficulties in independent thinking skills; this is not only my opinion,
the same opinion was expressed by the Philosophy teacher at the last
teachers’ meeting (M’s report, 27 June 2003).
However the teacher achieved a positive implementation of the portfolio, and there
was evidence of good quality in the students’ outcomes. This probably happened
because the teacher had gradually prepared his students to develop independent and
critical thinking during the last years of the course, so they were able to perform
confidently in the majority of the assessment tasks. Besides, M knew that his students
had difficulties in decision making and, to overcome this, he narrowed the options
by choosing two themes for the project briefs, ‘Human Figure’ and ‘Self-portrait’,
providing books for investigative studies and carrying out human figure drawing
exercises during the art classes. These students received a great deal of support from
their teacher; consequently their portfolios were not very diverse in terms of media
and themes. By using such strategies M avoided failures like the one at School P.
Nevertheless this trial showed that it is possible to use the new assessment instrument
with students who are unfamiliar with the tasks if the teacher adopts a directive and
supportive approach:
This kind of examination is fair, but I think that students should have
better training for this type of exam. We should be taught for example
how to make investigative studies according to our themes; I think we
also should experiment with more techniques and media in order to
improve our technical skills and ability to evaluate our problems (SR
Mafalda, June 2003)
The experience was not wholly positive, because problems of time and lack of
preparation remained:
Overall the portfolio was a good experience for me and for the
students. But I think that time should be more flexible. The main
advantage of the portfolio is fostering students’ sense of
responsibility, it has a very good structure but it should be developed
during the year. Students should be prepared for portfolio tasks during
the first year of the course. At this moment, I think it is very difficult
for students, at least for the average, because they are not prepared for
this kind of task (M’s report, 27 June 2003).
School K was located in a suburban region of Lisbon, well known for its baroque and
neo-classical monuments. It lacked physical and cultural resources but during the
year the staff promoted several cultural activities. According to information given by
the school’s administrative service, the students were from different socio-economic
backgrounds: they came from middle-class and low-income families. There was a
small minority of Black and Indian students but the majority were White. According
to the teachers the students were highly motivated: ‘They work hard to get good
grades; and they work hard in extra-curricular activities’ (TR I, 9 June 2003); ‘They
like to do things’ (TR I, 9 June 2003); ‘They are looking forward to making …’.
The implementation of the new instrument in this school revealed surprising results.
The two teachers, I and E, had different conceptions of art education and different
teaching methods; one of them promoted independent work and group discussion in
her classes. However, during the portfolio implementation the two teachers adopted a
similarly less prominent role: after they had given the instructions to the students and
provided some reference books related to the starting points for the model project
briefs, they left the students alone.
Students were quite excited about the new instrument; they seemed to understand the
tasks as authentic and liked the more personal way of working in the arts.
I think it was well done; well thought. I liked this work because it is
related to my motivation. I am not very good in examinations. Usually
I had bad results in tests; I know I can do things; but I am not able to
demonstrate it in test or examination conditions. The portfolio was
good because we were assessed when we were working under normal
conditions (SR Inês, 9 June 2003).
The portfolio was more real than the usual work and tests we do in the
art classes; it is about our life; not school life or tasks that school thinks
are important for us (SR Ruben, 9 June 2003).
Students seemed very committed to developing their portfolios, and it could be said
that this instrument corresponded to their expectations of ‘real’ art and design
practice. It was a surprise for the teachers, although they knew their students were
already motivated:
They were so involved in the portfolio; we could say that they were
engaging themselves in a very unusual way; it was more than the
normal motivation. I could tell that they were living it intensively (TR
E, 26 June 2003).
They worked hard; and the other teachers complained that students
were more interested in Studio Art and were not paying the same
attention to other disciplines’ homework (TR I, 26 June 2003).
According to the teachers, students were not usually required to do much homework
for Studio Art – the coursework was developed in class time. It seemed that during
the portfolio experience students reversed the usual status of the disciplines, paying
more attention to Studio Art than to other subjects. These students appeared well
prepared in independent and critical thinking; the fact that they did not ask for
additional support during preparation time suggested that they were quite comfortable
with the portfolio requirements.
It is possible that the students’ previous experience in Studio Art classes and other
disciplines enabled them to perform well; the school’s dynamic tradition could also
explain the students’ independent work habits. Nevertheless the portfolio was not
considered to be an easy examination option. For example, Tânia wrote in her
questionnaire: ‘I had a lot of difficulties especially in organising the work’ (SR Tânia).
School V was located in a city in the centre of the country. It was a large school, but
the town had noticeably limited cultural resources.
C defined the school as very dynamic, promoting all kinds of extra-curricular
activities open to the entire community. Art students were fortunate because their
teachers organised school visits to sites where students could see contemporary art:
However it is far away from other cities, and the art department usually
organises school visits to museums and art festivals in Lisbon, Porto,
Barcelona or Madrid, at least one per year (TR AP, 25 June 2003).
There are many art classes in the general courses and in the
technological courses. There are twelve art teachers in the art
department. The students are highly motivated and the art department
teachers are fine; we work very well together (TR R, 25 June 2003).
I think students like the school; and also the kind of relationship we have
with them; generally they work very hard (TR R, 25 June 2003).
However we don’t reduce the school to this building and the art lessons
are not always in the classrooms. The school is everywhere: in the park,
in the museum, in the streets. We usually have outdoor lessons.
Especially in Studio Art, where students can develop different media, we
try to make connections with other places; for example with factories,
design and craft studios in the town (TR R, 25 June 2003).
The new instrument had been piloted at this school, so the students and teachers were
familiar with it. The three teachers involved in the main trial continued to use it
during the year, so conditions at this school were quite different from those at the
other participating schools. The teachers involved had adopted the new assessment
instrument; for teacher R, for example, it simply gave structure to her existing
practice:
You know, I have worked like this for several years. What you gave me
with the portfolio experience was a more structured instrument,
because the booklet was very organised and things were systematised
… (TR R, 25 June 2003).
The other two teachers were given time to prepare students for the trial during
the year:
It was really interesting to see how much students evolved during the
year. In the first portfolio they were not really good at making
investigative studies and developmental studies; but towards the
middle of the year they started to understand how it was good for
them and in the end they made portfolios with very good personal
responses (TR C, 25 June 2003).
However for some students the new assessment instrument was still problematic:
The most difficult thing about the portfolio was to justify our options;
but it also depended on the media. I developed a photography project.
It was easy to explain the techniques and to evaluate the experiments;
but it was less easy to explain the decisions (SR Joaquim, 17 June
2003).
For me the difficulty was in dividing the tasks; planning. Because I left
a lot of things to the last hour… it was difficult to organise the ideas; to
sort them out from the confusion (SR Manuela, 17 June 2003).
7.2.4. Moderation
Moderation of a sample of students’ portfolios was conducted between 16 and 30
June 2003, after the internal marking. The sample of portfolios followed the rules
established in the guidance booklet and contained examples with a range of low,
middle and high marks. At Schools B and A the sample also included four portfolios
that the teachers had considered problematic to assess during internal moderation.
A was the first moderator and visited all the schools to moderate the samples.
Her reports confirmed that the marks awarded were generally in line with the
standards agreed during the standardisation meetings. In some cases there was a
difference of one or two points between her own marks and the teachers’ marks.
After the moderation A discussed the results with the teachers explaining why she
had found their judgements too harsh or too lenient. In the majority of cases the
teachers had been too harsh, perhaps because they had been influenced negatively by
students’ behaviour and attitudes throughout the whole year. After the moderation
teachers revised the marks in line with A’s suggestions. In two schools A found
portfolios that had been severely under-marked (School K: Joana’s portfolio, see
Appendix XX, p. 270; School P: Luis’ portfolio, see Appendix XVIII, p. 248).
The second moderator, IL, visited Schools P and K. IL agreed with A’s
recommended awards and informed the teachers J (School P, see Appendix XVIII,
p.249) and E (School K, see Appendix XX, p. 273). They appeared to understand
why they had under-marked the portfolios and agreed with IL and A’s marks.
Joana’s and Luis’ marks were finally agreed and IL moderated all students’
portfolios in order to see if there were similar cases, but none were found.
Overall the results of the moderation procedure showed consistency in marking.
Teachers and moderators shared a common interpretation of the criteria, but the small
number of students and teachers involved in the trial allows only tentative
judgements.
I was very happy to see that the marks I awarded to the portfolios are
recognised by the moderator; it gives me confidence in my assessment
skills, and besides I feel less alone in my profession because of the
opportunity to establish dialogue with other art teachers outside my
school (AM’s report, 18 June, 2003).
What was also good in the portfolio assessment was the external
assessment. I think that it was very good to have my marks confirmed
by the moderator. So she could reassure me and my students that we
were using the established standards (M’s report, 27 June 2003).
An analysis of the questionnaire responses showed that the new instrument was
compatible with teachers’ Studio Art curricula. Seven teachers strongly agreed and
three teachers agreed with the statement: ‘ The examination respects my own
opinions about art and design learning’ (question 65). Six teachers strongly agreed
and four teachers agreed with ‘The examination respects my teaching practice’
(question 66).
7.3.2. Preparation
Question | Agreement
54. To be well prepared to conduct the examination. | 73%
57. The contents of the portfolio were regularly discussed with teachers. | 84.3%
58. It was important to talk about and regularly explain the work in their portfolios with the teacher during preparation time. | 94.8%
59. The teacher was very interested to hear students’ views about the work in their portfolios. | 91.4%
Table 38: Responses to questions about preparation (percentage of the 117 students agreeing).
An analysis of the questionnaire responses showed that most students considered they
were well prepared to take the examination. Collaborative working was one characteristic
of the new assessment instrument and the researcher observed constant dialogue between
teachers and students in the form of advice or providing motivation. Although 51.3
percent of the students thought that their teachers influenced their work, only four
teachers recognised this (questionnaire trial, question 59). This influence was mainly in
the form of verbal advice and helping students to develop a project brief appropriate for
the time and resources available. There was interaction between students, and ‘crits’
were held.
7.3.3. Format
Both students and teachers considered the assessment instrument compatible with art
and design practice, related to the contents of the specialism, and capable of
accommodating personal responses:
The portfolio was real work; I mean because it was like the work by
artists and designers (SR Carla, 17 June 2003).
In the end it [the portfolio] was also a way to explore ourselves (SR
Sandra, 2nd June 2003).
7.3.4. Instructions
All the teachers and the majority of students found instructions clear.
‘The instructions for students were clear and helped us to organise
the sequence of tasks’ (School V, SR António, 17 June 2003).
In general the instructions for teachers and students were considered sufficiently
clear and detailed.
7.3.5. Time
During the pilot, 78.4 percent of students had commented that the time designated for
portfolio development was not adequate. However, the opinion of students who
collaborated in both the pilot and the trial changed for the better. This supports the
suggestion that the perceived lack of time was related to the degree of familiarity with
the tasks. In the questionnaire responses to the trial, all the teachers and 70.9 percent
of students agreed with the statement ‘students have plenty of time to complete their
portfolios’.
Scheduling strict periods for developmental studies and for realising final products
might not be a good idea because some kinds of work take more time than others.
The problem of time is very relative; especially the time for production. It depends
on the kind of final work we develop. Some students made the final products very
quickly like Inês. She painted the canvas in two hours; but for me and others it took
lots of time to make the final product. We could have chosen quick final products but
it was our work; we wanted to make our best performance (School K, SR Isabel, 9
June 2003).
On the other hand, more flexibility with time might introduce an element of bias,
because students with more free time might be advantaged. Some students did not
want to spend extra time on their portfolios, so they chose project briefs that could fit
the available time. However, other students felt that it was not fair to narrow their
options because of time limits; reducing time would prevent them from fully
developing their ideas.
The question of time was discussed at length during teachers’ on-line meetings
without consensus. For example, teacher E wrote: ‘All the students must have the
same period of time for the portfolio; it is their problem if they cannot adapt their
project briefs to their available time’ (on-line meeting, 21 May 2003); teacher R
argued: ‘We need to allow students to work on their final products after school,
otherwise they will not have time to complete it; we cannot ask students to narrow
their projects just because of limits of time’ (on-line meeting, 21 May 2003); and
teacher C said: ‘It is not fair if some students have more time than others; all
students should have the same time for completion’ (on-line meeting, 21 May 2003).
During the main trial, teachers treated the question of time differently. However,
students who spent more time on their final products were not advantaged in this
assessment, because the teachers and moderators took time into account when
marking the work. As well as including the written time frame for their portfolio’s
development, students dated the start and finish of all work in their portfolios.
In the end a flexible timeframe was agreed, depending on the school context,
although in its application the ‘set’ requirement was the same for everyone. Students
were free to continue their work beyond examination time limits provided this was
recorded, and this seemed to improve the authenticity of the assessment instrument.
Although in the researcher’s view only School B had suitable conditions for the
examination, a majority of the students said that the physical conditions
(accommodation, equipment and materials) were adequate. Teachers AM and R
allowed their students to develop final products outside the classroom when they
needed to use media or equipment that the school could not provide. The other
teachers advised students to work with the media and equipment that were available
in the classroom; however, this advice narrowed students’ options.
The problem of inadequate physical conditions was more or less overcome in the five
schools, but other conditions remained potential sources of bias, especially access to
books, museums and exhibitions. In the main trial, the majority of teachers brought
their own books into the class, organised visits to museums and exhibitions, and
helped students to find specific tools and equipment.
7.3.7. Tasks
The responses to the questions about tasks showed that the examination tasks were
considered by all the teachers and by 90.5 percent of the students to be appropriate
for assessing art and design. The required assessment tasks were not familiar to 68.7
percent of students, but all the teachers and a great majority of students considered
the tasks appropriate for art and design.
In the pilot it was hypothesised that self-assessment tasks were perhaps more
attractive to girls than to boys (Chapter 6, p. 232). In order to see if this could be
verified in the main trial, male and female students’ responses to the questions about
the appropriateness of the tasks (questions 21-27) were compared, and small but
significant differences were found. Table 43 shows the results of the t-test obtained
with the gender variable and the task variables, beginning with question 21 (record
ideas).
One-sample t-test, Test Value = 0

Variable | t | df | Sig. (2-tailed) | Mean Difference | 95% CI Lower | 95% CI Upper
sex | 30.653 | 116 | .000 | 1.38 | 1.30 | 1.47
record ideas | 27.609 | 116 | .000 | .9316 | .8648 | .9985
investigative work | 18.242 | 115 | .000 | .8621 | .7685 | .9557
developmental studies | 18.419 | 116 | .000 | .8632 | .7704 | .9561
final product | 13.328 | 116 | .000 | .7778 | .6622 | .8934
self-assessment | 9.414 | 116 | .000 | .6581 | .5197 | .7966
work journals | 14.109 | 116 | .000 | .7949 | .6833 | .9065
annotations | 14.109 | 116 | .000 | .7949 | .6833 | .9065
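The figures in the table above are standard one-sample t statistics against a test value of zero, i.e. t = (mean − test value) / (s / √n) with df = n − 1. As a minimal sketch of the calculation, using hypothetical coded responses rather than the trial data:

```python
import math

def one_sample_t(values, test_value=0.0):
    """One-sample t statistic against a fixed test value,
    as reported in the SPSS-style output above."""
    n = len(values)
    mean = sum(values) / n
    # Sample standard deviation (n - 1 in the denominator).
    s = math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))
    t = (mean - test_value) / (s / math.sqrt(n))
    return t, n - 1  # t statistic and degrees of freedom

# Hypothetical responses coded 1 = agree, 0 = disagree (illustrative only).
responses = [1, 1, 1, 0, 1, 1, 0, 1, 1, 1]
t, df = one_sample_t(responses)  # t = 6.0, df = 9
```

With 117 respondents, as in the trial questionnaire, the degrees of freedom would be 116, matching most rows of the table.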
In talking about the assessment tasks, portfolios were viewed by both students and
teachers as a complete process of inquiry and making in art. The perception was that
capacities such as independent study and student autonomy had improved, as Julia
and other students noted:
I think that with this last portfolio people can see what we learned during
the three years of the course; it was a final work. I can see how much I
evolved since the 10th year; but this year was more important; because
we had to think alone (School V, SR Carla, 17 June 2003).
The search part was also very helpful.... because it gives us things that
others have done and we can use it as starting points, I mean
interpreting it and going further according to our own personality
(School K, student Inês, 9 June 2003).
While students appeared to enjoy the opportunity to include evidence of process,
they also valued the final products:

I think that the portfolio is not complete without the real final product.
We need to finish the final product (School K, SR David, 9 June 2003).

The final product was the culminating point… (School B, SR Sandra, 2nd
June 2003).

Other students emphasised the evidence of process:
I am not very happy with the final product; because just seeing it
people will not understand what capoeira is [a Brazilian war-dance]
but, if they look at the complete portfolio it will be easier; I can tell
them what capoeira is …the final product is just a small part of my
work (School B, SR Susana, 2nd June 2003).
I think that for some students it was more than just a school work to be
assessed, it is about them … I think the portfolio is a whole, combining
process and products; a person has to see it all, to read everything and
not just see the final products, to fully understand our achievement
(School V, SR Carla, 17 June 2003).
There was a sense of great achievement to be derived from the portfolio. Students
expressed this in comments such as Tiago’s:

… designer. But it was my personal pride. I needed to go until the end, to
see that I was capable of doing it (School V, SR Tiago, 17 June 2003).
From the data from the on-line meetings and the questionnaire responses it was clear
that teachers considered the criteria clear and well balanced, and that these provided
the necessary common ground to assess students’ work consistently.
For us the portfolio experience was really good, because it made things
easier. I mean all the decisions about assessment and marking; the
criteria were clear and easy to apply to each portfolio whatever media
or techniques the students were using (School V, TR AP, 25 June
2003).
For the students the criteria were also appropriate, as can be seen from their
comments:
The criteria were fine; everything which is important was there (SR,
Manuela, 17 June 2003).
It was balanced, including the process and the products (SR, Joaquim,
17 June 2003).
It was good to know the criteria; before we knew more or less what the
teacher was looking for in our works; but it was quite vague. In the
portfolio everything was written down; we knew what to do and how
the work would be marked (SR, Inês, 9 June 2003).
Similarly, all teachers and a great majority of students in the questionnaire agreed
with the specific criteria and respective weightings (questions 31-45). The responses
to questions 46 and 48 showed that, for all teachers and 84.5 percent of students, all
five criteria were equally important and no other criteria were needed. The students
who
disagreed with the weightings for the assessment criteria made suggestions about a
fair weighting. However their opinions varied, revealing personal preference for
certain kinds of skills and tasks. In the responses to question 48 of the questionnaire
some students suggested other criteria, such as ‘the student is able to plan the work’.
Such suggestions were similar to the original criteria, but the students did not appear
to recognise them in the assessment matrix, maybe because they were integral to
broader criteria.
7.3.9. Bias
Responses to questions 11-13 revealed that none of the teachers and relatively few
students (28.7 percent) believed that students from small cities or rural areas were
disadvantaged, although in fact limited access to resources like good art libraries
may be a real source of disadvantage. Although the majority of respondents did not
detect gender bias in the examination, some of its features may have favoured female
more than male students. The researcher observed that the portfolios of male students
included fewer notes and often exclusively visual work journals with brief
annotations; in contrast, girls included long prose passages and both visual and
written material. Students from remote areas could also be disadvantaged in terms of
access to libraries and museums, and students from low-income families could have
problems in acquiring expensive materials, as seen in the following work-journal
entries:
I have no money to buy another film and photographic paper and the
last photographs are not good enough; I have to present it as it is (22
May, Jocelyne’s work journal).
I spent all my money with the coloured photocopies… I would like to
go to the exhibition at Culturgest but the ticket is too expensive (12
May, Joana’s work journal).
This problem was acute at School K where teachers were aware of their students’
constraints in terms of materials. It was agreed from the first standardisation meeting
that the types of materials used by students should not influence teachers and
moderators and, in fact, some students with very high marks used recycled materials.
But the problem of bias related to students’ financial opportunities was not fully
overcome, because certain students could not achieve what they wanted owing to the
cost of materials. Teaching methods also varied: R favoured tutorials, and almost all
the teachers fostered independent student learning in their methods. This diversity of
methods was possible because of the flexibility of the new instrument. However a
serious bias was detected, related to the degree of aid and advice students received
from their teachers:
I think that if you are carrying on this experiment with portfolios you
should be careful because it is only successful with a certain kind of
teacher; it demands involved and well-prepared teachers; it requires
good school conditions…(School V, TR R, 25 June 2003).
The progressive autonomy of student learning was perhaps only possible because the
majority of teachers who participated in the main trial were well motivated. They
worked hard to help their students without over-influencing their students’ outcomes.
Although the new assessment instrument was considered valid and to provide
authentic tasks, considerable problems of bias were found related to flexibility of
time, available resources and teacher support, as seen in the case of Tiago’s portfolio
at School V:
Even if you say that you are not impressed by this type of final work, it
is difficult not to be impressed by a student who actually makes a real
bicycle as a prototype instead of a small-scale maquette. And in this case
the student made the bicycle because he had more time, he made the
final product outside school. And this is not fair for the others who can’t
afford extra-time or a teacher who has friends owning a bicycle factory.
So I think, for a question of fairness that students should have the same
resources and time conditions (teacher AP, 25 June 2003).
The teachers expressed positive views about the appropriateness of the materials:
seven teachers strongly agreed with question 75 and six with question 76. That the
others opted only for ‘agree’ may mean they were not entirely satisfied with the
assessment matrix and the degree of consistency in standards. Comparing these
answers with the multi-faceted analysis of the trial marking (p. 285), it is interesting
to note that four teachers were slightly inconsistent in marking, which could be a
consequence of this partial dissatisfaction.
7.3.11. Consequences/Impact
7.3.11.1. Results
For 74.3 percent of the students, portfolio marks were close to those obtained during
internal assessment (question 53). At School V the marks were similar, but in the
other schools teachers said that the portfolio marks were lower, probably as a
consequence of the breadth of the instrument, which was ‘more demanding in terms
of knowledge and skills than the usual coursework’. In any case it is difficult to
compare the portfolio marks with coursework marks, because the criteria and
standards normally used in internal assessment may not be the same as those used in
the new examination. This was confirmed by teacher J, who stated in the interview
that he used to mark student work leniently in internal assessment in order to help his
students.
The portfolio was a way to improve our knowledge and at the same time
it was preparing us for the future (SR Mafalda, questionnaire, June
2003).
The new assessment instrument increased students’ workload, but the majority of the
students were motivated and did not see this effect as negative. However students
firmly stated that with the portfolio requirement they needed to be more committed
and pay more attention to the subject. The questionnaire revealed that all the teachers
and a great majority of students saw the new instrument as a good learning tool and a
sound foundation for university. Ninety-four percent of students agreed with the
statement ‘putting together the art and design portfolio has been a worthwhile
learning experience’ (question 51) and 95.7 percent agreed with ‘Students’ art and
design portfolios are a good foundation for further study in art and design’
(question …).
In common with the English examination system it was evident that teachers felt
overloaded by the portfolio experience. However, in the context of this study all the
teachers were volunteers and did not see it as negative. On the contrary, teachers
appreciated the opportunity to learn and to improve their relationships with their
students and their assessment practices.
But, you know the portfolio is a real challenge for us, as teachers. We
had so different student outcomes…. It was amazing and I found it
difficult to organise it alone; it demanded extra-work; which I don’t
mind because I am used to it. You can’t imagine the quantity of phone
calls and emails I received per day from students. I was as much
involved in the projects as the students; so we lived it together
intensively…(TR R, 25 June 2003)
Now; I think that I can assess students with more consistency and I can
give more valuable assessment feedback to students (TR C,
questionnaire, June 2003).
After the trial at least nine teachers appeared convinced of the positive qualities of
the new instrument, and their views of assessment in art education had changed.
They were convinced of the advantages of portfolios, as Table 49 shows.
Statement | Str. agree | Agree | Disagree | Str. disagree
77. The examination influenced my teaching practice by using portfolios as a learning strategy | 8 | 2 | – | –
78. The examination influenced my teaching practice by using portfolios for summative assessment | 9 | 1 | – | –
79. The examination influenced my perception of required knowledge, understanding and skills in art and design education | 5 | 4 | 1 | –
80. The examination changed my views about the importance of process in art and design assessment | 4 | 3 | 3 | –
81. The examination changed my views about the importance of product in art and design assessment | 1 | 6 | 3 | –
82. The examination changed my views about the importance of investigative studies in art and design assessment | 4 | 3 | 3 | –
83. The examination changed my views about the importance of self-evaluation in art and design assessment | 2 | 5 | 3 | –
84. The examination changed my views about the importance of developmental studies in art and design assessment | 3 | 3 | 4 | –
85. The examination changed my views about the importance of technical skills in art and design assessment | 2 | 7 | 1 | –
86. The examination changed my views about the importance of thinking skills in art and design assessment | 2 | 4 | 4 | –
87. The examination changed my views about the importance of problem finding skills in art and design assessment | 1 | 4 | 5 | –
88. The examination changed my views about the importance of problem solving skills in art and design assessment | 2 | 8 | – | –
89. The examination changed my views about the importance of independent thinking in art and design assessment | 3 | 5 | 2 | –
90. The examination changed my views about the importance of creativity in art and design assessment | 1 | 2 | 7 | –
91. The exam did not influence my teaching practice | 1 | 1 | 5 | –
92. The exam influenced positively my teaching practice | 7 | 2 | – | –
Table 49: Responses about impact (number of teachers).
However some teachers disagreed with the propositions in questions 84, 85, 86, 87
and 90. They did not consider developmental studies, technical skills, thinking skills,
problem finding skills and creativity as new, maybe because they already fostered
these in their teaching. Teachers saw the portfolio as a valid instrument for learning
and assessment with a positive impact on their practice.
The teachers agreed that the new assessment instrument could function as an agent
for change and for plurality of approach, because it was a form of assessment that
was integrated with the learning process and focused on students as active subjects
rather than objects of assessment:
That’s why I think this is not just about assessment; portfolio is more
than an assessment instrument; it is a way of teaching and learning (TR
R, 25 June 2003).
The teachers who participated in the trial had an opportunity to review and reflect on
their concepts of the art and design curriculum and to re-think their practices in the
light of the proposed assessment framework. It is possible that some of them will
continue to use this model for designing portfolios and become agents of change.
7.3.12. Reliability of results
In order to check the inter and intra-reliability of assessors (raters) within the new
assessment procedures during the main trial, the teachers marked ten portfolios on
two different occasions (April 2003 and July 2003). The difference between teachers
was from one to three points (inter-rater reliability). The marking difference between
the same teacher (intra-rater reliability, i.e. performance at different times) also varied little.
The high consistency of the results was probably a consequence of: (1) the ease of
application of the assessment matrix; (2) the use of a common shared language in the
criteria statements; (3) the training of teachers and moderators at the standardisation
meetings.
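The two checks described above can be sketched in a few lines of Python. The marks below are hypothetical and purely illustrative (the rater labels and values are assumptions, not the study's data):

```python
# Illustrative sketch (hypothetical marks on a 0-20 scale) of the two
# reliability checks described above: the spread between raters on each
# portfolio (inter-rater) and one rater's drift between two occasions
# (intra-rater).

def inter_rater_spread(marks_by_rater):
    """Per-portfolio difference between the highest and lowest mark given."""
    n = len(next(iter(marks_by_rater.values())))
    return [max(m[i] for m in marks_by_rater.values()) -
            min(m[i] for m in marks_by_rater.values()) for i in range(n)]

def intra_rater_drift(first_occasion, second_occasion):
    """Absolute change in one rater's marks between two marking occasions."""
    return [abs(a - b) for a, b in zip(first_occasion, second_occasion)]

# Hypothetical April marks from three raters on four portfolios
april = {"R": [14, 11, 16, 9], "C": [13, 12, 15, 10], "AP": [15, 11, 17, 9]}
july_r = [14, 12, 16, 10]          # the same rater "R" marking again in July

print(inter_rater_spread(april))              # [2, 1, 2, 1]
print(intra_rater_drift(april["R"], july_r))  # [0, 1, 0, 1]
```

A difference of one to three points on every portfolio, as reported above, would appear here as small values in both lists.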
7.4.1. Validities
The existing Portuguese examination in arts did not present many problems of bias
(see Chapter 4). The time and materials prescribed were strictly controlled and
uniformly applied, the tasks did not require materials or equipment other than pencils
and paper, the tasks were short and did not require long periods of development and
production, the questions did not require long, and possibly expensive, investigative
studies. In contrast the new external assessment instrument showed some bias
problems. However it was considered by the users to be more valid and authentic in
terms of content, face and response validities, qualities that were not evident in the
existing examination.
7.4.2. Reliability
In the mock MTEP examination the ten raters had no training for marking; they
were asked to mark the scripts in the same way they usually did during the national
examinations – in other words they only had the official instructions used for
marking in the current examination. They marked students’ completed scripts as they
usually did, at home during the period January-April 2003, and they sent the marked scripts back to the researcher.
During the fourth pilot training session (11 January 2003) the seven teachers who attended were asked to mark thirteen portfolios; they had received training during the preceding sessions.
During the main trial, the researcher collected twelve portfolios, and the teachers were asked to mark them on two different occasions. The results of the new assessment procedures are discussed below.
FACETS (Linacre, 1999) was used to analyse the data obtained from marking current examination scripts, the pilot (Trial 1) and main trial (Trial 2) portfolios. FACETS is a program for many-facet Rasch analysis that models several facets as potential sources of measurement error (such as assessment items, raters, and rating scales). In the many-facet Rasch model (Linacre, 1989), each element of each facet of the assessment
(such as the candidate, the marker, item, scale, etc) is represented by one parameter
that represents proficiency (for students), severity (for markers), difficulty (for items)
or challenge (for rating scale categories). For each element of each facet in the analysis, FACETS reports a measure (a logit calibration), a standard error (information about the precision of that logit estimate),
and fit statistics (information about how well the data fit the expectations of the
measurement model).
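In standard notation (the symbols below are supplied here, not taken from the text), the many-facet Rasch model the paragraph refers to can be written as:

```latex
% Many-facet Rasch model (Linacre, 1989): the log-odds that candidate n,
% rated by marker j on item i, receives category k rather than k-1.
% B_n = candidate proficiency, C_j = marker severity,
% D_i = item difficulty, F_k = challenge of rating-scale category k.
\log \frac{P_{nijk}}{P_{nij(k-1)}} = B_n - C_j - D_i - F_k
```

Each parameter corresponds to one element of one facet, which is why the output reports a proficiency for each candidate, a severity for each marker, a difficulty for each item and a challenge for each rating-scale category.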
The specific questions explored with the FACETS output concerned the two assessment situations, in order to compare the old examination and the new assessment in terms of reliability.
Tables 51, 52 and 53 display all facets of the analysis. The markers, candidates and rating scales are calibrated so that all facets are positioned on the same scale, creating a single frame of reference for interpreting the results. Column 4 displays the 20-point rating scale used to score students' work. The bottom rows provide the mean and standard deviation of the distribution of estimates for candidates and markers. The first column in the tables displays the measurement scale in logits. In the current examination (Table 51) the range of the measurement scale used was very narrow, while the new assessment (Table 52 for Trial 1 and Table 53 for Trial 2) considerably enlarged the scale of measurement. The difference between the current examination and the new assessment measurement scales suggests that the new assessment provided more reliable information than the current examination.
Higher scoring students appear at the top of the column, while lower scoring students
appear towards the bottom of the column (Candidate ability measure). The third
column compares the markers in terms of their level of severity or leniency. Because more than one rater marked each script (six scripts in the old exam) or portfolio (13 portfolios in the pilot and 12 in the trial), markers' tendencies to rate severely or leniently could be compared. More severe markers appear higher in the column, while more lenient raters appear lower.
In the pilot and in the trial there was less variation of severity between the markers.
This could be because during the pilot and trial the markers were able to achieve
consensus about the interpretation of criteria; and also because the criteria were more
helpful for marking the new instrument than those for the current examination.
299
Table 51: Measurable data summary: Current examination. Table 52: Measurable data summary: Trial 1. Table 53: Measurable data summary: Trial 2.
[Facet maps with columns for measure (logits), candidates, raters and the rating scale (S.1): the current examination occupies roughly -1 to +1 logits, Trial 1 roughly -5 to +5, and Trial 2 roughly -10 to +10.]
In addition to the summary map, FACETS provides detailed information about the extent to which the assessment instrument defines different levels of ability – its capacity to distinguish between candidates – in the form of a reliability index. The index runs from zero to one, with values close to one suggesting good reliability. This reliability does not indicate the degree of agreement between raters; it is more like the indices of reliability associated with assessment instruments or tests (e.g. Cronbach's alpha). The Analysis revealed that the current examination provided less consistency than the new assessment instrument (0.7 for the current examination).
FACETS produces two indices of the consistency of agreement across markers. The
indices are reported as fit statistics weighted and unweighted, standardised and
non-standardised. In this analysis the infit mean-square was used to estimate markers'
reliability. The expectation for this index is 1; the range is 0 to infinity. The higher
the infit mean-square index, the more variability can be expected. When markers are
fairly similar in the degree of severity they exercise, an infit mean-square index less
than 1 indicates little variation in the pattern of scoring, while an infit mean-square
index greater than 1 indicates more than typical variation in the ratings. According to
Myford and Wolfe (2000) there are no hard-and-fast rules for setting upper and lower
control limits for the infit mean-square index. An acceptable interval for infit mean-square can be between 0.5 and 1.5. Raters below 0.5 show less than acceptable variation in scoring, while those above 1.5 show inconsistent or erratic scoring.
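In the standard Rasch formulation (the symbols below are supplied here, not taken from the text), the infit mean-square for a marker is an information-weighted average of squared standardised residuals:

```latex
% x = observed rating, E = expected rating under the model,
% W = model variance of the observation; the sums run over all
% candidate-item observations rated by marker j.
\mathrm{Infit}_j
  = \frac{\sum_{n,i} W_{nij}\, z_{nij}^{2}}{\sum_{n,i} W_{nij}}
  = \frac{\sum_{n,i} \left( x_{nij} - E_{nij} \right)^{2}}{\sum_{n,i} W_{nij}},
\qquad
z_{nij} = \frac{x_{nij} - E_{nij}}{\sqrt{W_{nij}}}
```

Ratings that match the model's expectations give values near 1; muted patterns (a rater giving nearly the same score throughout) push the index well below 0.5, while erratic patterns push it above 1.5.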
The fixed (all same) chi-square is a test of the ‘fixed effect’ hypothesis:
‘Can this set of elements be regarded as sharing the same measure after allowing for
measurement error?’ The chi-square value and degrees of freedom (d.f.) are shown.
The significance is the probability that this ‘fixed’ hypothesis is the case. This tests
the hypothesis: ‘Can these raters be thought of as equally severe?’ The final line
contains the Random (normal) chi-square test. This is a test of the ‘random effects’
hypothesis: ‘Can this set of elements be regarded as a random sample from a normal
distribution?’ The significance is the probability that this ‘random’ hypothesis is the
case. This tests the hypothesis: 'Can these persons be thought of as sampled at random from a normal distribution?'
The analysis of the mock examination marking revealed serious problems with four raters. Raters 9, 6 and 5 tended to give the same scores to every script with no deviation, especially marker 9. Marker 10 showed excessive variation (an infit mean square above 3).
In Trial 1 (Table 55) one marker presented an infit mean square of 0.3 (poor deviation).
In Trial 2 (Table 56) four markers showed poor deviation (infit mean squares 0.2, 0.1, 0.1 and 0.3), but these four markers did not participate in the pilot, so it was reasonable to expect less consistency because they had less training than the others.
|Total Count Obsvd  Fair | Measure S.E.| Infit     Outfit   | Rater
|Score       Average Avge|             | MnSq ZStd MnSq ZStd|
| 75     6   12.5  12.48 |  -.24   .22 |  .6   0    .6   0  |  3
| 54     6    9.0   8.95 |   .71   .21 |  .9   0    .8   0  |  4
| 72     6   12.0  12.01 |  -.09   .22 |  .2  -1    .3  -1  |  5
| 70     6   11.7  11.70 |   .00   .22 |  .3  -1    .4  -1  |  6
| 77     6   12.8  12.81 |  -.34   .23 | 1.0   0   1.0   0  |  7
| 80     6   13.3  13.32 |  -.49   .23 | 1.4   0   1.4   0  |  8
| 68     6   11.3  11.40 |   .10   .22 |  .0  -3    .0  -3  |  9
| 64     6   10.7  10.76 |   .28   .21 | 3.2   2   3.3   2  | 10
| 70.8   6.0 11.8  11.81 |  -.05   .22 |  .9  -.5   .9  -.5 | Mean (Count: 10)
|  7.1    .0  1.2   1.17 |   .33   .01 |  .8  1.5   .9  1.4 | S.D.
RMSE (Model) .22  Adj S.D. .24  Separation 1.09  Reliability .54
Fixed (all same) chi-square: 22.5  d.f.: 9  significance: .01
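The summary statistics on the last two lines are linked: separation is the ratio of the adjusted ('true') standard deviation to the root-mean-square error, and the reliability index follows from it. A minimal sketch using the printed values:

```python
# Rasch separation and reliability, computed from the summary line above
# (RMSE .22, adjusted S.D. .24): separation G = Adj.SD / RMSE and
# reliability R = G^2 / (1 + G^2).

def separation(adj_sd, rmse):
    return adj_sd / rmse

def reliability(adj_sd, rmse):
    g = separation(adj_sd, rmse)
    return g * g / (1.0 + g * g)

g = separation(0.24, 0.22)
r = reliability(0.24, 0.22)
print(round(g, 2), round(r, 2))  # 1.09 0.54, matching the printed values
```

The same relation holds for the Trial 2 summary (.43 / .32 gives a separation of about 1.32 and a reliability of about .63).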
|Total Count Obsvd  Fair | Measure S.E.| Infit     Outfit   | Rater
|Score       Average Avge|             | MnSq ZStd MnSq ZStd|
| 143   12   11.9  12.03 |  -.08   .33 |  .9   0   1.0   0  |  1
| 148   12   12.3  12.64 |  -.60   .33 |  .2  -2    .2  -2  |  2
| 137   12   11.4  11.21 |   .54   .32 |  .1  -3    .1  -3  |  3
| 153   12   12.8  13.19 | -1.13   .33 |  .7   0    .6  -1  |  4
| 140   12   11.7  11.62 |   .23   .32 |  .1  -3    .1  -3  |  5
| 138   12   11.5  11.34 |   .44   .32 |  .5  -1    .6  -1  |  6
| 144   12   12.0  12.16 |  -.18   .33 |  .5  -1    .4  -1  |  7
| 147   12   12.3  12.53 |  -.50   .33 |  .3  -2    .3  -2  |  8
| 138   12   11.5  11.34 |   .44   .32 |  .8   0    .8   0  |  9
| 143.1 12.0 11.9  12.01 |  -.09   .32 |  .5 -1.9   .4 -2.0 | Mean (Count: 9)
|  5.1    .0   .4    .65 |   .54   .00 |  .3  1.3   .3  1.3 | S.D.
RMSE (Model) .32  Adj S.D. .43  Separation 1.32  Reliability .63
Fixed (all same) chi-square: 24.6  d.f.: 8  significance: .00
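The fixed (all same) chi-square can be approximated from the rounded measures and standard errors printed in the rater table, using a standard precision-weighted homogeneity statistic (FACETS' exact computation may differ slightly, and rounding means the result only approximates the reported 24.6):

```python
# Homogeneity test for rater severities ('can these raters be thought of
# as equally severe?'). Measures and model S.E.s are read from the Trial 2
# rater table above (rounded), so the chi-square only approximates the
# printed value of 24.6.

measures = [-0.08, -0.60, 0.54, -1.13, 0.23, 0.44, -0.18, -0.50, 0.44]
ses      = [ 0.33,  0.33, 0.32,  0.33, 0.32, 0.32,  0.33,  0.33, 0.32]

weights = [1.0 / se ** 2 for se in ses]   # precision weights (1 / SE^2)
wmean = sum(w * m for w, m in zip(weights, measures)) / sum(weights)
chi2 = sum(w * (m - wmean) ** 2 for w, m in zip(weights, measures))
df = len(measures) - 1

print(round(chi2, 1), df)  # close to the reported chi-square 24.6, d.f. 8
```

A significant result, as here, means the hypothesis that all raters share the same severity must be rejected.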
It is important to note that three of the Trial 2 teachers (R, C and AP) also
participated in Trial 1 (pilot), so they had more experience with the new marking
procedures than the others who had just attended the standardisation meeting and this
was reflected in the Trial 2 teachers' highly consistent marking (see in Table 56 the infit mean squares for raters 6, 7 and 4). Failure to use the full scale for marking can be observed among the teachers new to the trial who had not attended all the on-line meetings (see Table 56, infit mean squares for raters 3, 2, 5 and 8). It is possible that these raters were not confident in applying the grade descriptors because of their lack of training with visual exemplars. Previous training appears to have improved marker reliability, while the lack of it reduced reliability.
FACETS also provides probability curve graphics. The curves present the probability
of occurrence for each category. The probability of the extreme categories always approaches 1.0 for corresponding extreme measures. Most scale developers intend
this to look like a series of hills (O’Sullivan, 2002, p. 21). From the graphics in tables
57, 58 and 59 it is evident that the current Portuguese examination probability curve
is not at all as might be expected. It appears that in the current examination (table 57)
between 7 and 16 on the scale there is never a clear probability that any particular
score will be achieved, suggesting very serious problems in the scale. The middle of the scale appears to be unable to distinguish between different levels of ability or achievement.
In Trial 1 (Table 58) the range of performance only extends from 10-18, and there
were problems with the scale points 13 and 16. At these points on the scale there is
never a clear possibility that this score will be awarded, i.e. this ‘hill’ is not
distinguishable from the other ‘hills’ and this could be caused by lack of clarity in the
draft assessment matrix and grade descriptors. However the revision of such tools
seemed to help, because in Trial 2 (Table 59) the probability curve was closer to the expected shape, though there are still some residual problems with the scale.
In conclusion, it is evident that more work needs to be done to improve the new examination instructions at these points on the scale, but the trial scale is clearly an improvement on that of the current examination.
[Probability curve graphics: category probability plotted against measure, from -4.0 to +4.0 logits for the current examination and from -12.0 to +12.0 logits for the trial.]
The expected score ogive (a linear graph in which expected scores should rise in similar steps) shows the average rating value expected for any measure relative to the item, judge, etc. The ogive also indicates the category 'zones', the areas between average ratings (O'Sullivan, 2002, p. 21). Comparing the expected ogives from the current Portuguese examination (Table 60), Trial 1 (Table 61) and Trial 2 (Table 62), it is evident that the new assessment showed much more consistency between points on the scales because the 'zones' between average ratings are more equally distributed.
In the current Portuguese examination only a small portion of the scale is used; it is impossible to determine distinct steps on the scale, and at the top and at the bottom the scores overlap. There is almost no increase in score across the large area of ability between 2 and 7. Between 7 and 15 the scores increase for similar levels of ability. But between 16 and the top of the scale there is almost no improvement in score for a given gain in ability: the scale is unable to differentiate between those levels. In Trial 1 (Table 61), and especially in Trial 2 (Table 62), the increase in score is more in line with the increase in ability or achievement.
[Expected score ogives: expected score plotted against measure from -4.0 to +4.0 logits; the current examination's ogive (scores 6-17) is flat at the extremes, while the trial ogive (scores 4-13 shown) rises in more even steps.]
7.4.3. Consequences/impact
As established earlier (see Chapter 4), the current examination narrowed the range of the discipline and
distorted curriculum practice; the new assessment instrument enlarged the range and
promoted a broad vision of art and design education which might have some positive
impact upon the curriculum. The current examination does not require specific in-
service training for teachers; the new assessment procedures facilitated useful
professional training and dialogue between teachers. And, finally, the new
assessment instrument based on portfolio work was seen as a good foundation for
further studies in higher education, which was not the case with the current
examination.
7.4.4. Practicalities
The current examination is practical in terms of the limited resources, time, equipment and environment required. The new assessment instrument and procedures require extra work from students and teachers, an improved work environment and possibly more expensive resources. The cost of the current examination is low because it is conducted in normal classroom situations. The price to pay for the enhanced reliability of marking results is also higher, because teachers and moderators need training, standardisation and moderation meetings.
Summary
Through its implementation in five different schools the new assessment instrument was evaluated. From the data collected for evaluation purposes the following conclusions were drawn.
Main advantages:
1. The new assessment instrument was more closely related to the nature of art and design; the inclusion of multiple evidence of knowledge, understanding and skills in art and design in the portfolio increased validity.
3. The new assessment instrument included tasks that enabled students to learn from the portfolio, and they felt motivated to learn. In addition the new instrument was seen as a good learning experience and a good foundation for further study.
One key aspect of the new assessment instrument was that it facilitated consistent interpretation of criteria.
10. Finally, it provided more reliable results than the existing examination.
Main disadvantages:
1. It requires hard-working and committed teachers. During the pilot and trial, teachers spent long periods of their free time in meetings. Collaborative work is demanding, and the teachers who participated in the trial and in the pilot on a voluntary basis were particularly committed.
2. Problems were revealed resulting from unfamiliarity with the new instrument. Familiarity with the portfolio seemed to be a key factor for the success of the new assessment instrument. Unfamiliarity with the tasks and the lack of previous training in critical thinking were the main causes of failure. An important issue was that the relative success of the portfolio was linked to the students' attitudes and performance.
3. Potentially serious problems of bias were revealed, for example bias related
to access to resources and teacher support. The bias related to the degree of
help and advice students could obtain from their teachers was particularly
problematic.
4. The proposed new instrument is more expensive to operate than the current
Portuguese examination.
Chapter 8: Conclusions
The three main sections of this chapter focus on: (1) the research questions identified in the Introduction Chapter (p. 15), which were revisited in order to draw conclusions from the research; (2) a review and critique of the research methodology; and (3) the recommendations made.
The finding that the current system of art and design examinations in Portugal lacked
the validity and reliability deemed necessary elsewhere was not a surprise outcome
(see Chapter 4). Eisner’s views (1979) were confirmed in that the old-fashioned
system of tests to assess art and design provided: ‘a biased, and distorted picture of
the reality that we are attempting to understand'. Portuguese art and design examinations were strongly embedded in tradition and in regulations that conflicted with many essential art and design concepts. In the name of 'objectivity', art and design students were examined only in easily assessable
activities by means of pencil and paper tests. Consequently art and design students
were selected for higher education on the basis of assessments that did not reflect the
real complexity of art and design learning. The examinations focused on a narrow
area of the curriculum; studio art was eliminated from the external assessment (MTEP), which consisted of a short, timed drawing test. Studio art's status in the curriculum was weak, although it was considered by a great majority of art students and teachers to be the most important specialism for preparing students for further studies in art and design.
The Portuguese system of art examinations tended to deny important rationales and educational objectives for the arts, including the contemporary demand for transferable skills. Students were not given an opportunity in the examination to show how well they knew, understood and could make art (Chapter 4, p.130).
Debate about new forms of assessment in Portugal was mainly limited to formative
assessment, mostly in science subjects and teacher education (Valadares & Graça,
1998; Boavida & Barreira, 1993; Sá Chaves, 2000). Although they were dissatisfied
with the current system of art examinations, teachers and students generally were not aware of possible alternatives. As Broadfoot (1998, p. 473) argued, tests: '… with their emphasis on scientific
rationality are so pervasive in modern societies that they blind us to the potential for
alternative forms of assessment'. This study found evidence for the need to reform.
The great majority of participants in the Portuguese survey (see Chapter 4) proposed new external assessment instruments to improve the system, instruments linked to those identified in the literature (see Chapter 2). Conservatism was not an issue among the participants.
A finding of this research was that Portuguese art and design examinations did not offer a clear set of assessment materials and guidance for teachers and students. Apart from regulations for the conduct of examinations, assessment criteria were only vaguely defined, tacitly used but not publicly discussed. The ritual of the examinations mattered more than its substance.
It could be argued that the system embodied Eisner’s connoisseurship model (1985)
in the conviction that teachers and assessors should be qualified artists and teachers
who know how to interpret the visual terms used in the criteria: art critics and
connoisseurs. This may be an important factor but is not the only necessary
condition. As the literature has established, valid and reliable assessment depends not on connoisseurship alone but also on defining and sharing values. It is true that it is difficult to define criteria and standards in art, but assessment cannot function efficiently and effectively without common agreement about the qualities thought to be desirable in the work. Not only are standards and criteria difficult to establish,
it is also very expensive to prepare teachers to interpret them accurately and provide
places and time for dialogue. The research found that in-service opportunities for
training in assessment were not on the agenda of the Portuguese art examination
system, and the negative consequence of this was the examinations' poor reliability of results.
In Portugal the independent judgements of each assessor were trusted by the system, although teachers and students were dissatisfied with this situation (Chapter 4, pp. 136-138). The lack of control of the assessment procedures was another possible cause of the unreliability of results.
The current system of art and design examinations in England was found to have some advantages when compared with the Portuguese system in terms of validity and reliability, but it was not free of problems. For example, one problem was the ever-increasing workload for teachers, while the other was 'more philosophical and concerned with the effect of the examination on teaching and learning', with negative effects on students and the curriculum (Steers, 2003, p.25). In the GCE examination students were required to
submit every unit from the start of the course fully realised and of the required
standard, potentially exposing them to over-assessment. Students and teachers lacked
time for experimentation and creative risk-taking. According to Steers (2003, p.25)
the examination's hidden message for teachers and students was: '…don't bother…'.
The modular system was believed by some to threaten the very nature of art and
design learning which, in theory, should foster a climate of inquiry, risk-taking and
creative opportunities (Hardy, 2002). It was also in conflict with the current emphasis on creativity for all students, as expressed, for example, in the report of the National Advisory Committee on Creative and Cultural Education. At the Edexcel standardisation meetings the set of exemplar art and design students' work displayed was rather 'safe', devoid of obvious risk-taking. Moreover, the
‘exemplars of students’ art works’ were used not for discussion or to negotiate
standards but rather to illustrate the competencies students needed to reveal in order
to meet assessment objectives. These visual examples were important tools for
understanding and interpreting the language of the criteria and assessment matrix,
but there was no ‘debate about what counts as good work in the community’
(Boughton, 1997). Instead it was found that in practice there was little teacher ownership of assessment. It was observed that students' voices were also neglected.
At the end of 2003 several signs of dissatisfaction with the English examination
system were evident and debate about reform was ongoing. Tattersall, a former head
of the Assessment and Qualifications Alliance (AQA), reported in The Guardian (September 30, 2003) that teachers should be trusted more to assess their own students. Ronnie Lane, in the National Society for Education in Art & Design newsletter A'N'D (Nº 10, Winter 2003/4), expressed teachers' concerns, as voiced in the Liverpool Art and Design Teachers Forum meeting (2003), about increasing workload:
‘The forum was of the opinion that art teachers cannot sustain the levels of workload
which is being demanded by the examination boards' (Lane, 2003, p. 3). At the time of writing, the Department for Education and Skills was working on new proposals (through the Working Group on 14-19 Reform), building on proposals outlined in the Green Paper '14–19: Extending Opportunities, Raising Standards'. Their extensive consultation confirmed the need to create a clearer and more appropriate curriculum and qualifications framework for the 14–19 phase – '…one that develops and stretches all our young people to achieve their full potential, and prepares them for life and work in the 21st century'.
Among the proposals were increasing formative assessment and reinforcing the professional judgement of teachers.
8.3. Impact of the English and Portuguese art and design examinations
From the study of English and Portuguese art examinations it was evident that examinations acted as instruments for the discipline and regulation of students' art work. As Atkinson pointed out, assessment logic is reductionist and by implication it omits '…those things, those forms of life…'. Examinations influenced curriculum materials, but summative assessment and examinations were not viewed as having a positive impact upon the curriculum. Tests and examinations strongly influenced teaching practice: teachers taught to the tests and examinations.
In England, art and design examinations, as well as providing a means for selection
and certification, also seemed to be a way of controlling the curriculum. The QCA
and awarding bodies exercise great power over students, teachers and schools. The examinations were found to have negative effects on the curriculum, teachers and students, not only in terms of work overload but also because they did not encourage teachers to foster experimentation and risk-taking during their courses. Therefore the need for accountability in England also generated negative effects upon schools and teachers, who were judged partly by the grades that their students obtained in the examinations.
8.4. Increasing the validity of Portuguese art and design external assessment
The great majority of participants in the Portuguese survey (p. 134) suggested external assessment instruments, such as portfolio work, that they believed might improve the system. However, assessing portfolios requires changes in teaching, learning and assessment practices (Klenowski, 2002). In the context of this research it was impossible to change the established curriculum for art and design in Portugal, thus it was necessary to find a valid framework for defining new criteria, new tasks and the evidence required.
8.5. Increasing the reliability of Portuguese art and design external
assessment results
It was not surprising to discover that the great majority of participants in the
Portuguese survey (p. 139) suggested teacher training and assessment juries as ways
to improve the system. The key reforms identified for increasing the reliability of results were in-service teacher training, standardisation meetings and moderation procedures.
The proposed framework was designed to accommodate postmodern views of art and design education and to address students' experience 'of the visual culture they encounter every day' (Freedman, 2003, p.11). It was partly influenced by Swift & Steers' (1999) manifesto for the arts in schools and was intended to change practice by moving towards a more inclusive approach to teaching and learning in art, one that engages with social issues.
During the trials of the new assessment (Chapters 6 and 7) it was found that in certain cases students integrated everyday life experiences and reflected on social issues, including interviewing members of the community and undertaking independent searches for source material. But this was not the case for all students. In school A, for
example the project briefs for the portfolios did not ask pupils to explore social
issues and the investigation was limited to examples of modernist art. The role of the
teachers and their beliefs significantly influenced students’ choices. In some cases
teachers were not able to go beyond the formalist art concepts they had been taught
and they did not encourage critical analyses of visual culture. Such teachers, for
example M, AP and C, were nervous about letting students investigate themes they
were not comfortable with themselves, and convinced their students to engage with more familiar themes.
The on-line meetings, while intended to provide support for teachers and more space for dialogue, did not completely attain these goals. Using Internet resources for dialogue about art and design learning and assessment may be more useful in the future; for the present, both teachers and their students found it difficult. Digital reproductions of students' studio art works are not adequate for fully appreciating and discussing the visual characteristics of portfolios, except perhaps when the work itself is digital.
After the trials of the proposed instrument (Chapters 6 and 7) it was evident that the
role of the teacher was crucial when students challenged conventions or methods of
work through creative proposals or when they attained unexpected outcomes that
revealed their own intentions, personal experience and awareness of social and visual
culture issues. A, I and R acted as facilitators through the substantive
dialogue that took place between the student and the teacher, using teaching methods
based on collaborative learning and partnership (see Appendices XIX, XX and XXI).
This enabled assessment to be integrated into the teaching and learning cycle.
This confirms Klenowski's (2002, p.115) view that the teacher's role is fundamental in portfolio assessment. A consequence of the importance of the teacher's role was the evidence that the instrument was strongly biased by the degree of aid afforded by the teacher.
The new assessment instrument was found to respect students' voice and personal style, as Wiggins (1993) and Ross et al. (1993) have claimed it should.
Student performance was very variable according to their school context, although a
wide range of levels of achievement was obtained in each school with the exception
of School P where the portfolio experience verged on complete failure for the
reasons described in Chapter 7, for example the lack of student motivation and
unfamiliarity with the format. In the new assessment instrument students were given responsibility for selecting work for their portfolios, and the reasons for inclusion were negotiated with their teachers.
However, for some students, such a level of autonomy was difficult, perhaps because
they did not clearly understand the aims and assessment criteria. While they were
aware that they were expected to be thoughtful about the selection of work for
inclusion, this presented problems because they had not developed a capacity for
reflection in previous years. From the research it was also apparent that some
students found self-evaluation difficult. In the new instrument students need to take
responsibility for self-directed learning and for developing and maintaining their
portfolios. The student interviews confirmed that this requirement was understood, but nevertheless they were not always able to evaluate their own progress. Discussion groups and peer critique were found to be helpful in this regard.
According to the teachers involved in the main trial, some students succeeded in presenting portfolios of exceptional quality even in cases where resources were poor.
The students’ commitment and motivation were evident and contributed to their
success. For example, Joana fought to pursue her own goals despite a lack of the resources to realise them and the teacher’s disagreement with her chosen theme, ‘capoeira’; Ruben presented very good developmental studies
using recycled materials. In the case of these students the opportunity to make a
project relevant to their own lives appeared to give them the strength to work hard
and seek out aid elsewhere. Students felt that with the new assessment instrument
they gained more ownership of the learning and assessment process. The conclusion
was drawn that students need to have decision-making powers about themes for
project briefs and about which work is chosen for inclusion in portfolios if they are to
feel ownership of, and real commitment to, the art making. It increases the content validity of the assessment instrument and presents more opportunities for students.
Unlike the current examination in Portugal, which limits the range of the curriculum,
the new assessment instrument enlarged the range of the subject and raised the status
of studio art in some students’ perceptions. It required more evidence and more
committed work from them and this was only possible because they came to believe
that studio art had some direct relevance for their lives. The proposed instrument introduced different visions of art and design processes and making which challenged established practice, and it requires teachers who are able to help students develop independent critical and self-reflective skills. It also demands dialogue between teachers and students, mutual understanding
of aims and intentions and negotiation of tasks and approaches. From the trials it was
concluded that the new assessment instrument provided useful feedback for students
and teachers. Assessment was seen in Wolf ‘s terms as 'an episode of learning'
(Wolf et al, 1991, p.183), providing qualitative information in the course of portfolio development. Problems may arise when teachers do not understand the implications of such an approach: E, for example, initially undervalued Joana’s visual work in her portfolio because she did not clearly realise that the assessment had expanded art practice to the world of visual culture (see Appendices
XX, p.264 and XXVI, p.319). E did not recognise that the teachers needed to acquire
a deep respect for the students’ artwork and this challenged the limits of her
preconceptions of what students in the art and design disciplines should learn and be
taught. Teachers need to reconceptualise their pedagogy before they can integrate the new approach into their practice. At the same time, the breadth and flexibility of the assessment instrument presented both opportunities and challenges.
Students and teachers in the trials recognised that ‘taking risks’ was positively
encouraged, and diverse creative outcomes and approaches were expected. In the
current Portuguese art examination system tests were easily administered under the
same conditions for all schools and all students. The only resources required were
pencil and paper, and the short duration of the test allowed effective standardisation of
assessment conditions. The new instrument, which aims to provide a more valid
assessment, presented problems related to diversity of outcomes and the need for
increased resources. In the English art and design examination system this problem
was generally less evident because the work submitted for examination was often
very similar in character, maybe because teachers and students felt safe within established conventions. Perhaps they felt that they could not risk different approaches because the results of
examinations had major implications for public perceptions of schools and teachers,
as well as students.
It is recognised that if the new assessment instrument were to be used over a period
of several years, there is a danger of losing flexibility. Just as appears to have happened in the English art and design examinations, a new orthodoxy could develop around safe approaches that achieve good results. One possible way to prevent such an occurrence is to stress the
rationales for the assessment instrument and emphasise the importance of diverse
creative outcomes during in-service teacher training sessions. Students also need to understand that they will be rewarded for innovative and personal responses.
In addition, there is a need for a wide variety of visual experience and debate within assessment communities, allowing ‘…for the creation of knowledge, and for the unexpected outcome that surpasses planned objectives’. Teachers should take into account the most up-to-date student and professional art being produced at any given time in their debates about what should be valued, and such debates ‘…must include continual challenge so that student art…’.
8.6.3. The assessment procedures
Using the new assessment procedures required considerable time for training and for reports to be written – all of which contributed to teachers’ workload. This was only
possible in the trials because the teachers who participated in them were convinced
that it was important for their own professional development and for the fairness of
art and design students’ assessment. It would be unrealistic to expect all Portuguese
art and design teachers to engage in the new assessment procedures with the same
commitment as the volunteers in the trials without considerable preparation and in-
service training. However this experience showed that it is possible to change old
established and unreliable habits of arts assessment in Portugal if the reasons for
change are properly understood, and change is understood to be in the best interests of students (Schönau, 1996). The concept of a ‘community of judges’ (Boughton, 1997) and the
example of the English moderation procedures was used to design the assessment
procedures. It was clearly established that reliability of results achieved in the trials
increased in comparison with the current Portuguese system of a single assessor. As others have shown (see Steers, 1996; Blaikie, 1996), assessors can reach a considerable degree of consensus through the use of standardisation and moderation procedures. The new procedure
was found to be more demanding than the current assessment system in Portugal, but
it was also found to present real benefits in terms of validity, reliability and teachers’
professional development. One feature of the experiment’s success was that teachers
had enhanced confidence in their powers to make assessment decisions. In the new procedures, reliance on teachers’ connoisseurship was maintained, but the shift was made to shared judgement within a community of assessors.
8.6.4. Impact
From the research (see Chapter 7, pp. 275-279) it appeared that the effects of the new
framework for assessment were significant for both the teachers and the curriculum. These effects included the raising of the status of studio activity and the raising of art and design educational standards by redefining rationales, domains of learning, understanding and skills in art and design. The teachers in the pilot and the trial reported that they changed the way that they taught and assessed.
These changes happened because the teachers who were already competent
practitioners reflected on the proposed rationales for the experience and decided,
with the support of their peers, that they had to make changes, because they agreed
with the proposals and believed that their students had the right to be involved in and to benefit from them.
8.6.5. Strengths and Weaknesses
Throughout this chapter strengths and weaknesses of a new framework for external
assessment in the arts have been discussed and these are summarised as follows:
8.6.5.1. Weaknesses
c. The work requires considerable time and resources that are not always available to students, and some students may receive different degrees of teacher aid.
e. The assessment criteria require skills that might or might not have been developed previously; this might affect students whose critical reflective capacities have not been fostered.
f. The assessment instrument may not be applied evenly for all students and schools, depending on the conditions of implementation.
8.6.5.2. Strengths
a. The assessment instrument provides valid and authentic tasks related to the processes, media, and domains of art and design, allowing students to develop personal projects in which they can personalise social issues and reveal important capacities such as self-assessment.
The approach to designing assessment instruments for art and design presented in
Chapter 6 was influenced by Weir’s (2004) framework for validating tests. It was
adapted for art and design contexts and might also be of use for developing similar
exercises in other countries. However, it is recognised that this present proposal for
a new system of art examinations needs debate and further revision. The conceptual
framework developed in this thesis was specifically designed for the Portuguese
context. Nevertheless some of the research findings could be of use for others considering (1) changes in art and design assessment, and (2) dialogue, collaborative work and moderation, at least with small-scale groups. Each of the negative and positive findings of this study points to recurring themes: the teacher’s role, fundamental in preparing students’ critical and reflective skills; mutual understanding of aims between teachers and students; and the need for a wide variety of visual experience and debate within the art and design community.
These issues may be useful as a starting point to generate further discussion and
experimentation for those who are interested in developing portfolios and assessment
by juries. As Eisner (2003, p.55) pointed out ‘…our resolutions will generate other
situations that will need further resolution’. It would have been unrealistic to try to
resolve all the problems of external assessment in art and design in this research.
The idea established in the literature, that using alternative assessments such as portfolios offers more valid assessment in art, was supported by the empirical studies. It is hoped that the conclusions of this
research might be of use for the Portuguese government, art teachers, art education
researchers, and researchers into evaluation and assessment everywhere who are
searching for more valid and reliable assessment instruments and procedures for art
and design.
This research study was not intended to resolve all the problems of validity and reliability, and the problems identified in Chapters 1 and 2 were not fully resolved. The identified strengths and weaknesses of
the English system of art and design examinations proved to be a useful starting
point for designing a new framework for external assessment in the subject in
Portugal. However biases identified in the English examinations were also found in
the new model for external assessment proposed during the research. As Beattie
(1995, pp. 52-53) pointed out, the emphasis on written descriptions of learning may disadvantage some boys and non-standard language speakers. To prevent bias it will be necessary to reconsider issues concerning teacher and student ownership of the assessment.
The English external assessment system was able to combine formative and
summative assessment (see Chapter 5), and the framework developed for a new Portuguese system attempted to do the same.
Contemporary trends are to foster ‘assessment for learning’, assessment for which
the first priority is to serve the purpose of promoting students’ learning (Black et al,
2003) and a combined approach to assessment (Eisner, 2002). However Gipps and
Stobart (1993) claimed that different purposes require different models of assessment
and whereas they admit that it might be possible to design one assessment system
that measures performance for accountability and selection purposes, whilst at the
same time supporting the teaching and learning process, ‘… no one has yet done so’.
Klenowski (2002, p. 10), for example, believes that ‘…a portfolio of work can fulfil the full […] teaching and learning processes’. The research did not fully confirm Klenowski’s
views, because it would be necessary to study the long-term consequences of the new system.
Other aspects of assessment that could be explored more fully in the future are alternative methods of assessment. Assessment systems, with their bias towards standardisation, can work against such alternatives (2003, p. 183). The disjointed nature of the Portuguese art and design curriculum
might be one reason for the relative invalidity of the assessment system, because
contemporary art and design work often links several disciplines or specialisms.
However it is hard to believe that it is the only reason, and the question remains open. The new framework increased validity, but whether or not more equality and fairness was obtained is questionable.
Some students were disadvantaged because of gender bias and access to resources
and teacher aid. These problems may be related to the form of examinations in
and efficiency’ (Broadfoot, 2000, p. ix). As Foucault (1977, p.183) points out, the examination is a normalising form of power. Even when competencies and criteria are established through dialogue and agreement, allowing
some degree of teachers’ and learners’ ownership, assessment still denies diversity
and plurality of approach for some groups of students and schools. More fairness might be achieved through negotiated assessment. Methods of negotiated assessment were not used in this research; however the possibility of
negotiated assessment as an element in external assessment could be explored in future research, in the search for ‘valid and reliable evaluation procedures which can perform both formative and summative functions’.
The research method was constantly refined and readjusted during the period of the
study. Its hybridity provided flexibility for readjustments and allowed a balance to be struck between approaches. The mix of qualitative and quantitative methods provided considerable data for analysis (see for example Appendices VIII, IX, X, XVII, XVIII). Some of the evidence proved superficial. The questionnaires provided data with which to understand current examinations and that directly informed the design of the new instrument, but interviews and observation generally were more helpful for these purposes. Furthermore the
response rate to the questionnaires on the current Portuguese art examinations was
relatively poor (44 teachers and 104 students). Every effort was made to obtain a better response.
Overall the research method was ambitious in trying to access views of participants
in two countries. However the programme as planned was followed thanks to the
generous help of various individuals and organisations. The fact that the research was
carried out by one single researcher limited the number of case studies and the scale of the trials.
8.10.1. Students
In the trials of portfolio assessment the students were expected to think, to assess their own work and to take responsibility for their learning. From the sample included in the research it was apparent that the impact of the new instrument included gains in autonomy and independent learning, awareness of their own learning strategies, motivation and
interest in their own achievements and performance. These benefits are compatible
with the findings of the Arts Propel project, which also used portfolios as alternative assessment instruments. However, the demands are considerable, and students who have grown used to being passive observers might resent having to
work harder. By comparison the current Portuguese tests do not require so much
work and evidence to be submitted. Nevertheless students felt there were benefits in
the new assessment system; it seemed more interesting and relevant.
As Black et al (2003, p. 59) stress, learning cannot be done for the student; it has to be done by the student.
However, in the light of the problems detected during the research, if portfolios and similar forms of assessment are to be used, then the coursework must be planned carefully in advance. Planning should take into account available resources and the assessment criteria, and students must be supported to keep to deadlines.
As Klenowski (2002) pointed out, it is not only teachers who need to learn about the new forms of assessment; students also need specific teaching and support to develop the necessary cognitive processes. It was evident from the research that students need skills in independent study, group work and self-evaluation. Some students were not prepared to develop their portfolios because they lacked such background skills. The teachers saw that the source of the problem was that their students lacked the
necessary skills both to judge specific problems in understanding and to set realistic
targets to remedy problems within reasonable time frames. However, where teachers encouraged students to reflect on the teacher’s comments about their work, then peer and self-assessment provided the
training that students needed to judge their own learning and to begin to take action
to improve their performance. It was evident that changes are needed to the role of
students, but the period during which the role of the student changes needs to be handled carefully, and the students have to be supported as they learn to become more independent.
It was evident that to increase its validity the new assessment requires a long period
of time for students to become familiar with a culture of self-reflection that is only acquired gradually. Black et al (2003, p. 59) argue that the teachers also needed to train their students to take responsibility for their own learning and assessment. The teacher’s role is to help students set realistic targets and to give support throughout the task of attaining them. Implementation of
changes in classroom assessment would call for profound changes in both teachers’
perceptions of their own role in relation to their students and in their classroom practice, so that students become active rather than passive. Cognitive processes used by students, such as self-regulation and self-evaluation, need to be fostered.
8.10.3. Schools
If the new assessment instrument is to be further developed with a larger sample then decisions about the uniformity of conditions will have to be made, bearing in mind that less uniformity will not necessarily reduce the reliability of results but will increase the potential for bias. These problems will have to be addressed by providing more equal resources for schools and by promoting pedagogies that emphasise
cognitive growth and inquiry methods. The learning milieu is a fundamental resource for the new assessment.
Another important consideration for schools is the allocation of time for teachers’ in-service training and preparation for change. They need time to plan, discuss and mark students’ work; they also need time to participate in a community of assessors.
8.10.4. Universities
The research found that the new assessment instrument had clear advantages in terms
of preparation of students for further studies in art and design because the evidence
submitted required a broader range of knowledge, understanding and skills in art and
design (see p.275). In Portugal the majority of art and design faculties and schools
select their students on the basis of the current examination system. The proposed
new framework allows for wider and more appropriate evidence of students’ artistic achievement and could therefore be of more value for accurate selection of art and design students in higher education.
8.10.5. Government
It is fully recognised that the proposed framework and assessment instrument is less
than perfect but it does offer some clear advantages over existing practices in
Portugal. It would need further revision, development and re-trials to be ready for wider implementation. As a first step, rationales, concepts and models of art and design education need to be radically re-thought in order to align Portuguese art and design education at secondary level with contemporary practice. The use of portfolios, which includes a range of learning domains and skills in art education,
offers more validity than the current Portuguese examinations and can provide
positive outcomes for teachers and students. In-service teacher training and moderation procedures proved workable in the small-scale trials reported here. However the instrument and assessment procedures offered less utility than the current examinations, requiring more resources and more work from teachers and students. Nevertheless the advantages justify the effort. For Portuguese art and design, undoubtedly, this will require a long overdue, comprehensive and radical reform.
Glossary
Ability
A mental trait or the capacity or power to do something (ALTE, 1998).
Achievement
Ability to demonstrate accomplishment of some outcome for which learning
experiences were designed (Armstrong, 1994).
Achievement levels
Behaviours along a continuum that represent degrees of attainment of a criterion
(Armstrong, 1994).
Alternative Assessment
These are ways of measuring student achievement using methodologies other than pencil-and-paper tests, e.g. observation, checklists, portfolios, interviews, etc.
Art Education
The process of learning to understand, make and adequately respond to the
complexities of the arts.
Art examinations
In the context of this research the term ‘art examinations’ is used to refer to examinations covering visual art subjects.
Art criticism
Describing and evaluating the media, processes and meanings of works of visual art,
and making comparative judgements (NAEA, 1994).
Assess
To analyse and determine the nature and quality of achievement through means
appropriate to the subject (NAEA, 1994).
Assessment
The process of obtaining information that is used to give feedback about students’
progress, strengths and weaknesses i.e. make educational decisions about students
(Weir, 2004). The process of judging student behaviour or product in terms of some
criteria (Clark,1975).
Assessor
Someone who assigns a score to a candidate’s performance in a test, using subjective
judgement to do so. Assessors are normally qualified in the relevant field, and are
required to undergo a process of training and standardisation. Also referred to as
examiner, marker or rater (ALTE, 1998).
Assessment objectives
The Assessment objectives are means by which the formal elements, processes and
practices can be defined and assessed to ensure that a coherent and meaningful
course has been followed.
Attainment targets
Descriptions of expected standards: a kind of performance matrix in which the different levels of achievement to be expected in students’ work are described, based on a set of agreed criteria. Attainment targets act as descriptors of performance, knowledge, skills and understanding.
Authentic assessment
A characteristic of assessments that have a high degree of similarity to tasks
performed in the real world, a view that assessment should include task types which
resemble real life activities as closely as possible.
Bias
A test or item can be considered to be biased if one particular section of the
candidate population is advantaged or disadvantaged by some feature of the test or
item which is not relevant to what is being measured. Sources of bias may be
connected with gender, age, culture, etc. (ALTE, 1998)
Candidate
A test/examination taker. Also referred to as examinee.
Cognitive model
A theory concerning the way in which a person’s knowledge, in the sense of both
concepts and processes, is structured. This is important in examinations because
such a theory may have an effect on choice of instrument or examination content.
Competence
The knowledge or ability to do something. (ALTE, 1998)
Connoisseurship
The art of appreciation, the ability to define the quality of an object or environment
(Armstrong, 1994).
Construct
Definitions of abilities that permit us to state specific hypotheses about how these abilities are or are not related to other abilities, and about the relationship between these abilities and observed behaviour.
Context
A set of interrelated conditions that influence and give meaning to the development
and reception of thoughts, ideas or concepts and that define specific cultures and eras
(NAEA, 1994).
Criterion
A distinguishing property or characteristic of anything, by which its quality can be
judged or estimated, or by which a decision or classification can be made
(Sadler, 1987). A behaviour, characteristic, or quality of a product or performance about which some judgement is made (Clark, 1975).
Curricular materials
Documents such as curriculum guidelines and syllabuses, and lists of objectives, criteria and grade descriptors.
Descriptors
A brief description accompanying a band on a rating scale, which summarises the
degree of proficiency or type of performance expected for a candidate to achieve that
particular score (ALTE, 1998).
Discrimination
The power of an item to discriminate between weaker and stronger candidates.
Various indices of discrimination are used. Some (e.g. point-biserial, biserial) are
based on a correlation between the score on the item and a criterion, such as total
score on the test or some external measure of proficiency. Others are based on the
difference in the item’s difficulty for low and high ability groups (ALTE 1998).
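For a dichotomously scored item, the point-biserial index mentioned above is the correlation between the 0/1 item score and the total test score. A minimal sketch (the function name and data are illustrative, not drawn from the sources cited):

```python
import statistics

def point_biserial(item_scores, total_scores):
    """Point-biserial discrimination index for a 0/1-scored item.

    item_scores:  1 if the candidate answered the item correctly, else 0.
    total_scores: each candidate's total score on the whole test.
    """
    correct = [t for i, t in zip(item_scores, total_scores) if i == 1]
    wrong = [t for i, t in zip(item_scores, total_scores) if i == 0]
    p = len(correct) / len(item_scores)        # item facility (proportion correct)
    spread = statistics.pstdev(total_scores)   # population SD of total scores
    return (statistics.mean(correct) - statistics.mean(wrong)) / spread * (p * (1 - p)) ** 0.5

# An item that high scorers tend to get right discriminates strongly:
r = point_biserial([1, 1, 1, 0, 0], [10, 9, 8, 3, 2])
```

A value near +1 indicates the item separates stronger from weaker candidates well; values near zero or negative flag items worth reviewing.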
Evaluation
A judgement of merit based on various measurements, notable events and subjective
impressions (Armstrong, 1994).
Examinations
Measure the attainment of a candidate at the end of a course of study. They may be
external or school based, or a combination of both and, usually conducted in a formal
manner. Examinations may include a variety of instruments such as practical tests,
written essays, coursework and portfolios.
Examiner
The person who is responsible for judging a candidate’s performance in a test or
examination.
Expressiveness
Expressive features. Elements evoking affects such as joy, sadness, or anger (NAEA,
1994).
Expression
A process of conveying ideas, feelings, and meanings through selective use of the
communicative possibilities of the visual arts (NAEA, 1994).
Feedback
Comments of people involved in the testing process (examinees, administrators, etc.)
which provide a basis for evaluating that process. Feedback may be gathered
informally, or using specially-designed questionnaires (ALTE, 1998).
Formalism
A conception of aesthetics developed in the late 19th and early 20th centuries; it focuses on the analysis of physical and perceptual characteristics of art objects and involves the reduction of form to elements such as line, shape and colour, and principles of design such as rhythm, balance and unity (Freedman, 2003, p. 27).
Formative assessment
Testing which takes place during, rather than at the end of, a course or programme of
instruction. The results may enable the teacher to give remedial help at an early
stage, or change the emphasis of a course if required. Results may also help a
student to identify and focus on areas of weakness (ALTE, 1998).
Formative Evaluation
Ongoing evaluation of a process, which allows for that process to be adapted and
improved as it continues. It can refer to a programme of instruction (ALTE, 1998).
Global assessment
A method of scoring. The assessor gives a single mark according to the general
impression made by the work or performance produced, rather than by breaking it
down into a number of marks.
Grade
A score may be reported to the candidate as a grade, for example on a scale of A to
E, where A is the highest grade available, B is a good pass, C a pass and D and E are
failing grades (ALTE, 1998).
Grading
A process of assigning a symbol for some judgement of quality relative to some
criterion.
Holistic
Holistic, whether describing learning or assessment, refers to the conscious
awareness of the way parts interact and influence, looking globally rather than
analytically (Armstrong, 1994).
Holistic Scoring
Scoring based upon an overall impression (as opposed to traditional test scoring
which counts up specific errors and subtracts points on the basis of them). In holistic
scoring the rater matches his or her overall impression to the point scale to see how
the portfolio, product or performance should be scored.
Impact
The effects created by an assessment instrument, both in terms of influence on
general educational processes, and in terms of the individuals who are affected by
test results.
Instrument of assessment
Assessment tool. A method of gathering data about student performance
(Armstrong, 1994).
Instructions
General directions given to candidates, for example on the front page of the answer
paper or booklet, giving information about such things as how long the test lasts,
how many tasks to attempt and where to record their responses (ALTE 1998).
Internal consistency
A feature of a test, represented by the degree to which candidates’ scores on the
individual items in a test are consistent with their total score. Estimates of internal
consistency can be used as indices of test reliability; various indices can be
computed, for example, KR-20, Cronbach alpha (ALTE, 1998).
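Cronbach’s alpha, named in the entry above, can be estimated from a candidates-by-items score matrix. A minimal sketch (function and variable names are illustrative):

```python
import statistics

def cronbach_alpha(scores):
    """Cronbach's alpha from a list of candidates' item-score rows."""
    k = len(scores[0])                          # number of items
    columns = list(zip(*scores))                # transpose: per-item score lists
    item_variance = sum(statistics.pvariance(col) for col in columns)
    total_variance = statistics.pvariance([sum(row) for row in scores])
    return k / (k - 1) * (1 - item_variance / total_variance)

# Perfectly parallel items give alpha of about 1.0:
alpha = cronbach_alpha([[1, 1], [2, 2], [3, 3]])
```

Higher values indicate that the items hang together as measures of the same underlying construct.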
Inter-rater reliability
An estimate of test reliability based on the degree to which different assessors agree
in their assessment of candidates’ performance (ALTE, 1998).
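One common index of the inter-rater agreement defined above is Cohen’s kappa, which corrects raw agreement for agreement expected by chance. A minimal sketch (names are illustrative; kappa is offered as an example statistic, not one prescribed by ALTE):

```python
from collections import Counter

def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters grading the same candidates."""
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    # Chance agreement: probability both raters assign the same grade at random.
    chance = sum(freq_a[g] * freq_b[g] for g in freq_a) / (n * n)
    return (observed - chance) / (1 - chance)

# Three agreements out of four gives moderate chance-corrected agreement:
kappa = cohens_kappa(['A', 'B', 'B', 'C'], ['A', 'B', 'C', 'C'])
```

Kappa of 1 means perfect agreement; 0 means agreement no better than chance.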
Intra-rater agreement
The degree of agreement between two assessments of the same sample of
performance made at different times by the same assessor (ALTE 1998).
Level
The degree of proficiency required for a student to be in a certain class or represented
by a particular test is often referred to in terms of a series of levels. These are
commonly given names such as ‘elementary’, ‘intermediate’, ‘advanced’, etc.
(ALTE, 1998).
Mark
A symbolic number which represents the achievement level of the students' work.
Marker
Someone who assigns a score to a candidate’s responses to a written test. This may
involve the use of expert judgement, or, in the case of a clerical marker, the relatively
unskilled application of a mark scheme (ALTE, 1998).
Marking
Assigning a mark to a candidate’s responses to a test. This may involve professional
judgement, or the application of a mark scheme which lists all acceptable responses
(ALTE, 1998).
Measurement
Generally, the process of finding the amount of something by comparison with a
fixed unit, e.g. using a ruler to measure length. In the social sciences, measurement
often refers to the quantification of characteristics of persons (ALTE, 1998).
Method of assessment
Instruments and procedures used to measure student performance in meeting the
standards for a learning outcome. These assessments must relate to a learning
outcome, identify a particular kind of evidence to be evaluated, define tasks that
elicit that evidence and describe systematic scoring procedures.
Moderation
Moderation is the means by which the marks of different teachers in different
centres/schools are equated with one another.
Norm-referenced
If a test is norm-referenced it aims to place candidates on some sort of ordered scale,
so that they can be compared with one another (Alderson et al 1995).
Objective testing
Objective testing refers to items such as multiple-choice, true-false, and error-
recognition, amongst others, where the candidate is required to produce a response
which can be marked as either ‘correct’ or ‘incorrect’. In objective marking the
examiner compares the candidate’s response to the response or range of responses
that the item writer has determined is correct (Alderson et al 1995).
Perception
Visual and sensory awareness, discrimination, and integration of impressions,
conditions, and relationships with regard to objects, images and feelings (NAEA,
1994).
Portfolio
A portfolio is defined as a purposeful collection of selected student work, exhibiting
effort, progress and achievements in more than one area, and including student
participation in selecting the contents and self-reflection (Lindström, 1998).
Practicality
A practical assessment instrument is easy to administer and to score without
wasting too much time or effort.
Question
Sometimes used to refer to a task or item (ALTE, 1998).
Rating scale
A scale consisting of several ranked categories used for making subjective
judgements such as mark schemes or assessment matrices. Rating scales for
assessing performance are typically accompanied by grade descriptors which make
their interpretation clear.
Reliability
Reliability is the extent to which test scores are consistent: if candidates took the test
again tomorrow after taking it today, would they get the same result (assuming no change in their ability)? There are three aspects to reliability: the circumstance in
which the test is taken, the way in which it is marked and the uniformity of the
assessment it makes. There are several ways of measuring the reliability of
‘objective’ tests (test-retest, parallel form, split-half, KR-20, KR-21, etc.). The
reliability of subjective tests is measured by calculating the reliability of the marking.
This is done in several ways (inter-rater reliability, intra-rater reliability etc.)
Reliability coefficient
A measure of reliability, in the range 0 to 1. Reliability estimates can be based on
repeated administrations of a test (which should produce similar results) or where
this is not practicable, on some form of internal consistency measure. Sometimes
known as reliability index (ALTE, 1998).
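As an illustration of the definition above, a test-retest reliability coefficient can be estimated as the correlation between two administrations of the same test to the same candidates. The sketch below is not from the thesis: the marks are hypothetical and the function name is for illustration only.

```python
# Illustrative sketch: test-retest reliability as the Pearson correlation
# between two sittings of the same test (hypothetical marks out of 20).

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length mark lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5

first_sitting = [12, 15, 9, 18, 14, 11]    # marks awarded at the first sitting
second_sitting = [13, 16, 10, 17, 15, 10]  # same candidates, retested

r = pearson(first_sitting, second_sitting)
print(round(r, 2))  # close to 1, indicating consistent scores
```

A coefficient near 1 indicates that the two administrations rank the candidates in much the same way; a coefficient near 0 would indicate inconsistent scoring.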
Rubric
The instructions given to candidates to guide their responses to a particular test task
(ALTE, 1998).
Scale
A set of numbers or categories for measuring something. Four types of measurement
scale are distinguished – nominal, ordinal, interval and ratio (ALTE, 1998).
Score
The result obtained by a student on an assessment, expressed as a number.
Self-Assessment
Students reflect on their own abilities and performance, in relation to specified
content and skills and to their effectiveness as learners, using specific performance
criteria, assessment standards and personal goal setting.
Skill
Ability to do something, expertness.
Specifications
Specifications provide the official statement about the method of assessment and
how it assesses what it intends to assess.
Standardisation
The process of ensuring that assessors adhere to an agreed procedure and apply
rating scales in an appropriate way (ALTE, 1998).
Summative assessment
Testing which takes place at the end of a course or programme of instruction (ALTE,
1998)
Syllabus
A list of headings outlining a curriculum (Steers, 1979). A detailed document which
lists all the areas covered in a particular programme of study, and the order in which
content is presented (ALTE, 1998).
Task
A combination of rubric, input and response.
Techniques
Specific methods or approaches used in a larger process; for example, gradation of
value or hue in painting or conveying linear perspective through overlapping,
shading or varying size or colour (NAEA, 1994).
Test
An assessment instrument based on objectivity, usually pencil-and-paper tasks
requiring short answers through, for example, matching, multiple-choice or
alternative items, or essays.
Testing
Testing is one procedure used to obtain data for purposes of forming descriptions or
judgments about one or more human behaviours. Tests can be used to obtain
summative assessments and in external examinations, '…the typical testing situation
puts students in an artificial situation in the sense that their performance will be
appraised and rewarded accordingly' (Eisner, 1972, p. 205).
Validity
The degree to which an assessment instrument actually measures what it is supposed
to measure. Validity occurs when an assessment procedure measures the
performance, described in the objective, that it claims to measure (Armstrong, 1994).
Validity is the extent to which a test measures what it is intended to measure: it
relates to the uses made of test scores and the ways in which test scores are
interpreted, and it is therefore always relative to test purpose (Alderson et al 1995).
Validation
The process of gathering evidence to support the inferences made from assessment
results. The extent to which assessment scores enable inferences to be made which
are appropriate, meaningful and useful, given the purpose of the assessment
instrument. Different aspects of validity are identified, such as content, criterion and
construct validity; these provide different kinds of evidence for judging the overall
validity of the assessment for a given purpose.
Value-added measures
Value-added measures record the progress made by individual pupils in an
institution, rather than raw outcome scores.
Weighting
The assignment of a different number of maximum points to an item, task or
component in order to change its relative contribution in relation to other parts of the
same assessment instrument.
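The definition above can be illustrated with a small sketch. The component names, raw marks and weights below are hypothetical (the self-assessment weight in particular is not specified anywhere in this document); the point is only that each component's share of the final mark out of 20 is set by its weight, not by its raw maximum.

```python
# Illustrative sketch of weighting (all values hypothetical): each component
# mark is scaled to its weight, so the weights, not the raw maxima,
# determine the component's contribution to the final mark out of 20.

components = {
    # component: (raw mark, raw maximum, weight)
    "process":         (26, 30, 0.55),
    "product":         (14, 20, 0.35),
    "self_assessment": (8, 10, 0.10),
}

final = sum(raw / maximum * weight
            for raw, maximum, weight in components.values()) * 20
print(round(final, 1))  # final mark out of 20
```

Changing a weight (for example, increasing the weight of the process component) changes that component's relative contribution without altering the marks recorded against it.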
Bibliography
ALLISON, B. (1982). Identifying the Core in Art and Design, Journal of Art and
Design Education, 1(1), pp.59-66.
ALLISON, B. (1986). Some Aspects of Assessment in Art and Design Education. In:
Ross, M. (Ed.). Assessment in Arts Education. Oxford: Pergamon Press.
ADDISON, N. & BURGESS, L. (2003). (Eds.) Issues in art and design teaching.
London: RoutledgeFalmer.
ANDERSON, T. (1994). The International Baccalaureate Model of Content-based
Art Education. Art Education 47 (2), pp.19-24.
APECV (2002). Parecer da Apecv sobre a prova de MTEP, 1ª chamada 2002. Porto:
Apecv Reports.
BAKER, R. (1997). Classical Test Theory and Item Response Theory in Test
Analysis. Preston, Lancashire: SDS Supplies Limited.
BARRETT, M. (1990). Guidelines for Evaluation and Assessment in Art and Design
Education 5-18 Years. Journal of Art and Design Education, 9 (3), pp. 233-313.
BELL, J. (1989). Doing Your Research Project. Milton Keynes: Open University
Press.
BENAVENTE, A. (1990). Escola, Professoras e Processos de Mudança. Lisboa:
Livros Horizonte.
BEST, D. (1982). Objectivity and Feeling in the Arts. Journal of Art & Design
Education, 1 (3), pp.373-390.
BINCH, N. (1994). The Implications of the National Curriculum Orders for Art.
Journal of Art & Design Education, 13 (2).
BLACK, P.; HARRISON, C.; LEE, C.; MARSHALL, B.; WILIAM, D. (2003).
Assessment for Learning: Putting it into practice. Maidenhead: Open University
Press.
BLISS, J., MONK, M. & OGBORN, J. (1983). Qualitative Data Analysis for
Educational Research. London: Croom Helm.
BOUGHTON, D. (1999). Framing Art Curriculum and Assessment Policies in
Diverse Cultural Settings. In: Mason, R, and Boughton, D. (Eds.) Beyond
Multicultural Art Education: International Perspectives. New York: Waxmann,
pp. 331-348.
BTEC. (1986). Assessment and Grading: General Guide. London: Business and
Technician Education Council.
BURGESS, R.G. (1993). (Ed.) Educational Research & Evaluation for Policy and
Practice? London: Falmer Press.
BURGESS, L. & Addison, N. (2000). Contemporary art in schools: why bother? In:
Hickman, R. (Ed) Art education : meaning, purpose and direction. London:
Continuum.
BURKHART, R. (1965). Evaluation of Learning in Art. Art Education, 19 (4),
pp.3-5.
CLARK, G. (1995). Clark's Drawing Abilities Test. Bloomington: Arts Publishing Co.
Inc.
CRONBACH, L.J. (1971). Test validation. In: Thorndike, R.L. (Ed.) Educational
Measurement (2nd edition). Washington, DC: The American Council on Education
and The National Council on Measurement in Education.
CSIKSZENTMIHALYI, M. (1990). The domain of creativity. In: M.A. Runco and
R.S. Albert (Eds.) Theories of Creativity. Newbury Park, CA: Sage, pp. 190-212.
DASH, P. (1999). Thoughts on a Relevant Curriculum for the 21st Century. Journal
of Art and Design Education, 18 (1), pp. 123-127.
DAVIES, I. K. (1987). Art and Design in the GCSE: Objectives, Criteria and
Assessment. Journal of Art and Design Education, 6 (1), pp.51- 65.
D'HAINAUT, L. (1977). Des Fins aux Objectifs de l'Éducation. Bruxelles:
Éditions Labor.
DORN, C. M. (2002). The teacher as stakeholder in student art assessment and art
program evaluation. Art Education 55 (4), pp. 40-45.
EDEXCEL (2000). GCE A level Art & Design: Guide for Teachers and Moderators.
London: Edexcel.
EDEXCEL (2000). A Student's Guide to the AS and Advanced GCE in Art & Design.
London: Edexcel.
EISNER, E. (1991). The Enlightened Eye: Qualitative Inquiry and the Enhancement
of Educational Practice. New York: Macmillan.
EISNER, E. (1992). The Misunderstood Role of the Arts in Human Development. Phi
Delta Kappan, April 1992.
EISNER, E. (2003). The Arts and the Creation of Mind. New Haven & London:
Yale University Press.
EISNER, E. (2003). Qualitative research in the new millennium. In: Addison, N. and
Burgess, L. (2003). (Eds.) Issues in art and design teaching. London:
RoutledgeFalmer, pp. 52-60.
EMERY, L. (1996). Heuristic Inquiry. In: Australian Art Education 19 (3) p.29.
EFLAND, A, FREEDMAN, K. & STUHR, P. (1996). Postmodern Art Education:
An Approach To Curriculum. Reston, Virginia: NAEA.
FELDHUSEN, J.F. & GOH, B.E. (1995). Assessing and Accessing Creativity: An
Integrative Review of Theory, Research, and Development. Creativity Research
Journal 8 (3), pp. 231-247.
FILER, A. (2000). (Ed.) Assessment: Social Practice and Social Product. London:
RoutledgeFalmer.
FLINDERS, D. & MILLS, G.E. (1993). (Eds.) Theory and Concepts in Qualitative
Research. New York: Teachers College Press.
GENTILE, R., & MURNYACK, N. (1989). How shall Students be Graded in
Discipline-Based Art Education? Art Education 42 (6), pp.33-41.
HALPIN, T. (2003). Teachers caught cheating to help pupils make grade. In: The
Times, October 11, 2003, p. 1.
HARDY, T. (2001) Farewell to the Wow. Times Educational Supplement, 16th
November 2001, p. 7.
HARDY, T. (2002). AS Level Art: Farewell to the ‘Wow’ Factor. The International
Journal of Art and Design Education 21 (1), pp.52- 59.
HASSELGREN, A. (1998). Small words and valid testing. PhD thesis. Bergen:
University of Bergen, Department of English.
HAUSMAN, J. (1994). Standards and Assessment. Art Education 47 (2), pp. 9-13.
HENRY, C. (1990). Grading Student Artwork: A Plan for Effective Assessment. In:
Little, B.E. (Ed.) Secondary Art Education: An Anthology of Issues. Reston, Virginia:
NAEA, pp. 61-68.
HERMAN, J.L.; KLEIN, D.C.D. AND WAKAI, S.T. (1997). American Students'
Perspectives on Alternative Assessment: Do they Know It's Different? Assessment in
Education 4 (3), pp. 339-352.
HERMANS, P. (1996). Why Assess? In: Art & Fact: Learning Effects of Arts
Education, International Conference 27/28 March 1995. Rotterdam, The Netherlands:
World Trade Center, pp. 97-107.
HERNANDEZ, F.; VALLS, M.R.; RODRIGUEZ, J.M.B. (1998). Pedagogia De
L'Art: Identitat de l' Artista, Context i Ensenyament de les Arts. Barcelona:
Universitat de Barcelona.
HICKMAN, R. (2000). (Ed.) Art Education 11-18: Meaning, Purpose and Direction.
London: Continuum.
HITCHCOCK, G. & HUGHES, D. (1995). Research and the teacher. New York:
Routledge.
KINNEAR, T.C. & TAYLOR, J.R. (1996). Marketing Research. (5th Ed.) New
York: McGraw-Hill, Inc.
LINDSTRÖM, L. (1999). The Multiple Uses of Portfolio Assessment. In: L.
Piironen (Ed.) Portfolio Assessment in Secondary Art Education and Final
Examination. University of Art and Design Helsinki (Finland), Department of Art
Education, 1999, pp. 7-16. Report of EU Comenius 3.1 project.
LEARY, A. (1986). Art, Assessment and Ethnocentrism. Journal of Art & Design
Education 5 (1 & 2).
MACGREGOR, R.N. (1996). What You See and What You Get: Assessment and
Its Constituents. In: Art & Fact: Learning Effects of Arts Education, International
Conference 27/28 March 1995. Rotterdam, The Netherlands: World Trade Center.
MACKINNON, D. & STATHAM, J. (1999). Education in the UK. Facts & Figures.
(3rd Ed.) London: Hodder & Stoughton/The Open University.
MARTINS, G. (1999). Práticas Avaliativas, Momentos Formativos. MA thesis.
Lisboa: Universidade Católica de Lisboa.
MESSICK, S. (1989). Meaning and Values in Test Validation: The Science and
Ethics of Assessment. Educational Researcher 18 (5), pp. 10-11.
MESSICK, S. (1992). Validity of Test Interpretation and Use. In: Alkin, M.C. (Ed.)
Encyclopedia of Educational Research (6th edition). New York: Macmillan.
MICHAEL, J. (1980). Studio Art Experience: The Heart of Art Education. Art
Education 33 (2), pp.15-19.
MINKIN (1998). Inviting a Response to the Agenda. Draft for discussion in the
Consultative Group on Creative Development. Unpublished paper.
MUMFORD, M.D.; BAUGHMAN, W.; MAHER, M.A.; COSTANZA, P. &
SUPINSKI, E.P. (1997). Process-Based Measures of Creative Problem-Solving
Skills: IV. Category Combination. Creativity Research Journal 10 (1), pp. 59-71.
NACCCE (1999). All Our Futures: Creativity, Culture and Education. London: DCMS
and DfEE.
NEAB (1999). Art and Design (Advanced): Syllabus number 4191. Northern
Examinations and Assessment Board.
[Available January 2000: http://www.neab.ac.uk/syllabus/arts/artdes/ss992191.htm]
NCTE (2000). Developing a Test Taker's Bill of Rights. Milwaukee: NCTE Annual
Meeting.
NORUSIS, M.J. (2000). SPSS 10.0: Guide to Data Analysis. New Jersey: Prentice
Hall.
PATEMAN, T. (1988). Key Concepts: A Guide to Aesthetics, Criticism and the Arts in
Education. London: The Falmer Press.
PIIRONEN, L. (1999). (Ed.). Portfolio Assessment in Secondary Art Education and
Final Examination. University of Art and Design Helsinki (Finland), Department of
Art Education.
PISA (2000). Mesurer les connaissances et les compétences des élèves: Lecture,
Mathématiques et Sciences: l'évaluation de PISA 2000. Programme international
pour le suivi des acquis des élèves (PISA).
PRENTICE, R. (1995) (Ed.). Teaching Art and Design: Addressing Issues and
Identifying Directions. London: Continuum.
PRICE, E. (1982). National Criteria for a Single System of Examining at 16+ with
reference to Art and Design. Journal of Art and Design Education, 1 (3),
pp. 397-407.
QCA (2000). England: Context and Principles of Education. [Available July 2000:
http://www.inca.org.uk/.]
QCA (2000). GCSE and GCE A/AS Code of Practice. London: QCA.
REID, L.A. (1986). Assessment in Art Education. In: Ross, M. (Ed.) "Art" and the
Arts. Oxford: Pergamon.
RIBEIRO, C. (1993). Educação e Arte. Revista Portuguesa da Educação, 6 (1),
pp. 103-108.
SCHÖNAU, D.W. (1999). The Future of Nation-Wide Testing in the Visual Arts.
Journal of Art and Design Education, 18 (2), pp. 183-187.
SCHÖNAU, D.W. (1994). Final Examinations in the Visual Arts in the Netherlands.
Art Education, 47 (2), pp.35- 39.
SCHÖNAU, D.W. (1999). A Survey on Visual Art Exams in Europe. In: Piironen, L.
(Ed.) Portfolio assessment in secondary art education and final examination. Report
of EU Comenius 3.1/Socrates & YOUTH project 40966-CP-1-97-1-FI-C31, June
1999. Helsinki: University of Art and Design (UIAH), Department of Art Education.
STEERS, J. (1987). Art, Craft And Design Education In Great Britain: A Summary.
Canadian Review of Art Education, 15 (1), pp.15-20.
STEERS, J. (1988). Art And Design In The National Curriculum. Journal of Art and
Design Education 7(3), pp.303-323.
STEERS, J. (1994). Art and Design Assessment and Public Examinations. Journal of
Art and Design Education, 13 (3), pp.287-297.
STEERS, J. (2001). New A and AS levels. A’N’D: The Newsletter of The National
Society for Education in Art, nº1: Summer 2001, p.11.
STEERS, J. (2003). Art and design in the UK: the theory gap. In: Addison, N. and
Burgess, L. (Eds.). Issues in art and design teaching. London: RoutledgeFalmer, pp.
19-31.
STEERS, J. (2003b). Art and Design. In: White, J. (Ed.) Rethinking The
School Curriculum: Values, Aims and Purposes. London: RoutledgeFalmer.
STOKROCKI, M. (1997). Qualitative Forms of Research Methods. In: Lapierre,
S.D. & Zimmerman, E. (Eds.) Research Methods and Methodologies for Art
Education. USA: NAEA, Indiana University.
SWIFT, J. & STEERS, J. (1999). A Manifesto for Art in Schools. Journal of Art &
Design Education, 18 (1), pp. 7-13.
TGAT (1988). Task Group on Assessment and Testing, a Report. London: DES.
TORRANCE, H. (1989). Ethics and Politics in the Study of Assessment In: Burgess,
R.G. (Ed.) The Ethics of Educational Research. Philadelphia: Falmer Press.
VOS, A.J. & BRITS, V.M. (1987). Comparative and International Education for
Student Teachers. London: Butterworth.
VYGOTSKY, L.S. (1972). Thought and Language. Cambridge, MA: MIT Press.
WEIR, C.J. (1993). Understanding and developing language tests. New York:
Prentice Hall.
WHITE, J. (2003). (Ed.) Rethinking The School Curriculum: Values, Aims and
Purposes. London: RoutledgeFalmer.
WILLIAMS (2001). Forced in the same mould. The Times Educational
Supplement, 16th November 2001.
WOLF, D. (1988). Artistic Learning: What and Where is it? Journal of Aesthetic
Education 22 (1).
WOLF, D.; BIXBY, J.; GLEN, J. & GARDNER, H. (1991). To Use Their Minds Well:
Investigating New Forms of Assessment. In: Grant, G. (Ed.) Review of Research in
Education. Washington, D.C.: American Educational Research Association.
YOUNG, J. O. (1995). Relativism and the Evaluation of Art. Journal of Aesthetic
Education 29 (4).
Appendix XIV
Booklet (Instructions for an experimental examination in art and design)
Index:
A. General Instructions for teachers
B. Instructions for students
C. Assessment Matrix
D. Grade descriptors
E. Model Project Brief
General Instructions for teachers
Teachers are responsible for the delivery of the examination and for the first marking
of portfolios.
Training sessions will be conducted during March.
The final exam in art and design consists of the submission of a portfolio by
candidates.
Schools are responsible for organising the exam. Teachers are responsible for the
development of the unit for external assessment. Students should have access to
examples of project briefs, assessment criteria and mark schemes one month before
the exam (before the Easter holidays) in order to start the preparatory investigation
through independent study.
Project Brief
There are three options for project briefs:
1. The theme and optional final products provided.
2. Theme and final products designed by the students and the class
teacher.
3. Theme and final product designed by the individual student.
The project brief can be designed by the students and class teacher, taking into
account students' motivations. The class teacher and/or the students will develop a
project brief according to a chosen theme and starting points, to be realised during
class time, taking into account the discipline(s)' aims, objectives and contents.
Materials and techniques should be specified in task definitions (according to school
conditions); however, the choice can also be left open to students. The project
brief can be transdisciplinary, involving several disciplines (for example:
Technologies; Studio Art and Design; MTEP; History of Art). The project brief
designed by the teachers and students must include the following:
2. Portfolio
Students construct their portfolios around one task/theme or starting point. The
portfolio contains a selected collection of investigation work, preparatory and
developmental studies (annotations, sketches, experiments, visual studies), the final
product and a self-assessment report (written, oral, visual). The separate items in the
portfolio should be connected and depict the entire process. The portfolio should also
reflect the development of the project in time; all the evidence should be dated and
numbered as a part of the process. The portfolio should include comprehensive
documentation (models, sources).
The portfolio can be, for example, an expanding file, document case, box, album,
web page, CD-ROM, video, work journal or notebook. The appearance of the
portfolio can be a part of the artistic production and reflect the character of the
project and the personality of the author. The appearance can contribute to the
general visual expression and help to form a positive evaluation.
3. Time:
Portfolio tasks should be developed during class time, except for investigative work,
which can be developed outside class time.
Teachers should be available for tutorials with students during the preparation time.
When portfolios are evaluated, the time available should be taken into account, as
well as the student's ability to plan the work beforehand and carry out his/her plans
in the time given.
4. Special considerations
Teachers should ensure that students with special needs can conduct their work
under appropriate conditions. Students with physical disabilities should have
special arrangements or special equipment according to their different needs.
Non-native speakers should have additional support in order to understand the
assessment instructions and task requirements (for example, access to
translation).
proposals. Portfolio assessment requires great effort and commitment from the
students in terms of time and persistence; students must see studio art or studio
design as important disciplines; they must understand why the portfolio will be
beneficial for them in terms of fair assessment and preparation for further studies.
The criteria are not strict guidelines; they must be used with the necessary flexibility.
Teachers and students may develop them, adding rubrics for each. However, the
general structure and the mark schemes provided in the assessment matrix must not
be changed, in order to achieve a common language for assessment.
8. Assessment procedures
The quality of the work and student achievement are estimated holistically, taking
into account overall judgement and the criteria.
The teacher must mark the portfolios of all candidates. Portfolios are assessed
in accordance with the negotiated assessment instructions (assessment criteria,
assessment matrix). The class teacher must provide the external assessors with a
copy of the portfolio tasks and mark scheme used in the project unit, and explain
the methodology and strategies used in the project unit to the moderator(s).
Internal Marking
Internal marking should be conducted after students submit their portfolios.
Students' work and achievement are assessed by their own teacher before the
moderator's visit to the centre. Teachers will take into account the quality of the
portfolios and their own observations of students' behaviour during classroom time.
Teachers must use the assessment criteria and the language of the assessment
matrix. A total mark out of 20 points must be awarded for each portfolio. Marks will
be recorded in the assessment matrix and assessment form for each candidate.
Teachers should take into account their own observations of students' performance
during the course. Evidence for some rubrics, related to 'crits' for example, can only
be observed by the student's own teacher. Teachers should use such observations to
justify their award.
Art departments are reminded that it is their responsibility to ensure that where more
than one teacher has marked the work in a centre, effective standardisation (group
assessment) has been carried out across all teaching groups. This procedure ensures
that the work of all candidates at the centre is marked to the same standard.
The following documentation must be available to the visiting moderator at the start
of the moderation visit:
• Copies of the project brief for each candidate.
• Copies of the assessment matrix for each candidate (with awarded marks).
• Awarding forms for each candidate.
The teacher should meet the visiting moderator(s) at the beginning of the visit, to
introduce them to the work. They should be readily available throughout the visit in
case they are required. Visiting moderators will review the portfolios in the
moderation sample in order to ensure that the centre’s marking is:
• In accordance with the marking criteria stipulated;
• In conformity with the overall standards of the examination.
The moderator(s) check the marks of the sample. Moderation is not a re-marking but
rather a confirmation, or otherwise, of the teachers' marks. Re-marking is only
recommended if a serious discrepancy of marks (a difference of 5 points) occurs
between moderators and teachers, and will be conducted on a second moderation
visit, which can be requested by the teacher or by the moderator. At the end of the
visit the teacher(s) will be informed of the moderators' findings.
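The discrepancy rule above can be sketched in code. This is an illustration only: the candidate identifiers and marks are invented, the helper function is hypothetical, and the sketch assumes that "5 points difference" means a difference of 5 points or more out of 20.

```python
# Illustrative sketch (hypothetical names and marks): flag candidates whose
# teacher and moderator marks, out of 20, differ by the serious-discrepancy
# threshold of 5 points or more, so that re-marking can be recommended.

DISCREPANCY = 5

def flag_for_remarking(teacher_marks, moderator_marks):
    """Return ids of candidates whose marks differ by DISCREPANCY or more."""
    return [cid for cid in teacher_marks
            if abs(teacher_marks[cid] - moderator_marks[cid]) >= DISCREPANCY]

teacher = {"A01": 17, "A02": 12, "A03": 9}    # internal (teacher) marks
moderator = {"A01": 16, "A02": 6, "A03": 10}  # moderator marks for the sample

print(flag_for_remarking(teacher, moderator))  # only A02 exceeds the threshold
```

Candidates within the threshold keep the teacher's mark; only flagged candidates would be considered at a second moderation visit.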
The visiting moderator may request a further visit by a second moderator if there are
any aspects of the work or the moderation which the visiting moderator believes
should be subject to further consideration.
In this case candidates' portfolios must be retained in the same condition as viewed
by the original visiting moderator. As a result of the second visit, the second
moderator may, in certain circumstances, find it necessary to recommend that the
original moderator's marks be amended upwards or downwards. Should this
be the case, the recommendations of the second moderator will stand.
A third session will be conducted with external assessors in order to prepare them for
moderation; this will include the marking of students’ portfolios and is intended to
verify the accuracy of moderators’ marking.
9. Reports about the examination
Teachers will receive a report on the examination (in September).
B. Instructions for Students
1. Components
The work to be submitted will take the form of a portfolio. The portfolio includes
three components: Process, Product and Self-assessment.
Components and weightings:
• Process (50%-60%): investigation notes or studies; developmental studies;
preparatory notes and studies in the work journals.
• Product (30%-40%): final products.
In other words, the qualities that teachers and external assessors will look for in your
portfolio will be:
AC1: Visibility of the intentions: Show that you are able to express your ideas,
motivations, opinions and purposes through words and images. Explain your
intentions at the beginning of the project and, as it develops, make annotations about
your progress. If you re-formulate ideas, explain why; if something influences you,
explain it; explain the importance of the examples of visual culture that you have
selected and used in your work. Show that you are able to plan your project and
respect deadlines; if for some reason you cannot fully realise what you have
proposed, explain why it was not possible.
AC2: Searching abilities; abilities to interpret and use examples of visual culture:
Show that you can collect information about the work of others in the world of
visual culture (art, design, media, etc.) and that you can interpret its meanings,
purposes and functions. But be careful: do not spend too much time collecting
information that is not relevant to your project. You are not required to copy and
paste information, but rather to reflect critically on it, expressing personal opinions
and informing your work through what you have learned in your search. Show that
your sources were useful for developing your own ideas and experiments with
techniques, materials, etc.
AC4: The portfolio as a whole and the final product: The teacher will look for
your knowledge, understanding and skills in art and design, and whether your
personal response was informed by what you have learned through the process and
was adequate for realising your intentions. The teacher will consider how you used
the formal elements of visual language; art and design concepts and conventions;
your technical skills and your personal style.
AC5: Knowing the strengths and weaknesses of your work; explaining and justifying
the meaning and purposes of your work: The teacher will look for your skills of
evaluation: whether you are able to reflect upon the process and product you have
developed, and whether you can justify the purposes, meaning and function of your
final outcome by using specific vocabulary. Show that you are aware of the
difficulties you encountered and that you can point out what you achieved in terms
of knowledge, understanding and skills in art and design.
2. Portfolio
Students construct their portfolios according to the chosen project brief. The
portfolio contains a selected collection of investigation work, preparatory and
developmental studies (annotations, sketches, experiments and visual studies), the
final product and a self-assessment report (written, oral, visual). The separate items
in the portfolio should be connected and depict the entire process. The portfolio
should also reflect the development of the project in time; all the evidence should be
dated and numbered as a part of the process. Your portfolio should include
comprehensive documentation (models, sources).
The portfolio can be, for example, an expanding file, document case, box, album,
web page, CD-ROM, video, work journal or notebook. The appearance of the
portfolio can be a part of the artistic production and reflect the character of the
project and your personality. The appearance can contribute to the general visual
expression and lead to a positive evaluation.
Project Brief
You have three options for your project brief:
1. The theme and optional final products provided.
2. Theme and final products designed by the students and the class teacher.
3. Develop your own theme and final products.
The project brief may involve one discipline or the contents of several disciplines.
You must describe the project brief. In the description you must state:
1. The theme.
2. The tasks: what you are proposing to do and what kind of final product you will
develop. Explain why you want to develop it; submit diagrams of development
and methodology.
3. Timetable: submit a planning diagram or schema showing a timeline to
completion of your project.
4. What type of knowledge, understanding and skills will you develop in the
project? What kind of help and resources do you need?
During March and April you must choose and develop the project brief. You can use
the example provided, examples provided by your teacher, or design your own
project brief in negotiation with your teacher. Your teacher will help you to design
your project brief through individual tutorials at your request.
You must choose a theme for your work; the portfolio is built around one task or
final product. You have one month to study the tasks, gather necessary background
information and make your choice before the final exam period.
2.2. What kind of evidence to submit in the portfolios
The following diagram describes some types of evidence to include in portfolios:
[Diagram: types of evidence to include in portfolios, for example experiments and
explorations with techniques and materials (what kind of experiments do you want to
try and why?), and final products.]
The portfolio provides evidence about how you have tackled your chosen task and
contains material produced during the process. You should choose related sketches,
studies and experiments that, together with the final product, best illustrate your
project and end results.
The portfolio should illustrate the origins of your chosen content and mode of
expression. It should also reveal pivotal decisions and their consequences: how you
started and developed your form and content, mode of expression, techniques and
use of materials. You can also include annotated supporting materials that you have
used and analysed, such as texts, newspaper clips, material samples, reproductions
of images or media files. You may include the work journal in your portfolio.
The portfolio should show that you have made an art and design study of your
subject. This means that making sketches, studies and decisions on alternative
solutions should be a part of the process. The portfolio should also show how your
project developed in time. Date, sign and number all your developmental studies and
other material during the work period. The work journal may be useful to help you
document and critically reflect on the progress and purposes of the project; it helps
you to record and justify your decisions. Notes, visual or textual, also help you to
develop your ideas and reflect on problems encountered and your decision-making
process.
You may include the original final product; but if it is impossible to include it
because of size, media or other reasons, include a video, photographic or digital
reproduction of it. The final product(s), or the record of it, must be labelled as
Product.
3. Self-assessment report guidelines
This part should be completed after the work is finished (during the specified
time). Try to answer the following questions using, for example, oral, visual or
written media.
Describe how you developed the form and content of your project: your progress,
how you overcame the difficulties, and the sequence of the work (leaps of
imagination, how new ideas appeared and why). If you changed your point of
departure, why and how was that? Did you use any sources or models; were they
useful, and why? How well did you attain your purposes, aims and goals; how well
did you achieve the qualities required by the assessment criteria and mark schemes?
Did you achieve your purposes; why and how? Does your work reveal a personal
reflection about an issue; why and how? Is your personal reflection important for
others? Is your work meaningful for others? Why? An honest explanation of partial
failure may be as valuable as an exaggerated claim that all went smoothly.
C. Assessment matrix
New Examination Instructions
CA1: Record personal ideas, intentions, experiences, information and opinions in
visual and other forms. (30 pts)
Level 1-4 (1-6 pts): Limited or inappropriate records. The student has no conscious
intention for what she/he is doing.
Level 5-9 (7-14 pts): Small amount of records, not always persistent. The student
knows what she/he wants to achieve, but her/his intention is not explicit.
Level 10-14 (15-21 pts): Reasonable amount of records; the intentions are sometimes
clear. The student shows persistence and curiosity.
Level 15-17 (22-26 pts): The student's intention is obvious. The student shows
persistence and combines some information with the work according to the intentions.
Level 18-20 (27-30 pts): Considerable amount of records with personal reflection.
Intentions are clearly stated. The student approaches themes and problems from
several angles and develops them through a series of drafts and sketches.

CA2: Critical analysis of sources from visual culture showing understanding of
purposes, meanings and contexts. (30 pts)
Level 1-4 (1-6 pts): The student only uses the sources indicated by the teacher;
the analysis is limited to collecting information.
Level 5-9 (7-14 pts): The student uses the sources indicated by the teacher and
others, but limits the search to collecting and organising information.
Level 10-14 (15-21 pts): The student actively searches for sources to get ideas for
her/his own work; collects, organises and selects information according to
intentions.
Level 15-17 (22-26 pts): The student actively searches for various sources in time
and space and uses them in a well-integrated way in her/his own work (collect,
organise, select, combine).
Level 18-20 (27-30 pts): The student actively searches for and critically reflects
on various sources in time and space and uses them in a versatile, independent and
well-integrated way in her/his own work (collect, organise, select, combine,
critique and re-organise).

CA3: Develop ideas through experimentation, exploration and evaluation. (60 pts)
Level 1-4 (1-12 pts): Limited or inappropriate development of obvious ideas, a
small amount of experimentation and no signs of critical reflection about the
experiments and decisions made.
Level 5-9 (13-27 pts): The student shows a small amount of exploration of ideas,
lacking a sense of order and technical abilities; a tendency to repeat ideas and
experiments; no critical reflection about the experiments and decisions made.
Level 10-14 (28-42 pts): The student can use pre-established problems. Reasonable
and safe exploration of ideas (without risk-taking). Shows some understanding of
visual language, concepts and techniques, but reduced reflection about the
experiments and decisions made.
Level 15-17 (43-51 pts): The student can re-formulate problems. Comprehensive
exploration of appropriate ideas taking risks, revealing understanding of visual
language, concepts and techniques and critical reflection about the experiments and
decisions made.
Level 18-20 (52-60 pts): The student can find and formulate problems in an
independent way. Constantly experiments and explores possibilities, taking risks
and finding unexpected responses. Shows critical reflection about the experiments
and decisions made.

CA4: Present a coherent and organised sample of works and final product revealing a
personal and informed response that realises their intentions. (60 pts)
Level 1-4 (1-12 pts): Inappropriately small amount of works and final product,
showing lack of understanding of visual language, concepts and techniques.
Level 5-9 (13-27 pts): The amount of works and final product shows little
understanding of visual language, concepts and techniques.
Level 10-14 (28-42 pts): The considerable amount of works and final product shows
reasonable understanding of visual language, concepts and techniques.
Level 15-17 (43-51 pts): The amount of works and final product shows understanding
of visual language, concepts and techniques.
Level 18-20 (52-60 pts): A set of works was selected and organised; the works and
final product show excellent understanding of visual language, concepts and
techniques.

CA5: Evaluate and justify the qualities of the work. (20 pts)
Level 1-4 (1-4 pts): Unable to explain the reasons for her/his work. The student
cannot point out the strengths and weaknesses of her/his own work or distinguish
between works that are successful and those that are less successful.
Level 5-9 (5-9 pts): The student explains the merits of her/his work using vague
terms; can refer to the intentions and the sources used, but is unable to justify
the quality and purpose of the work.
Level 10-14 (10-14 pts): The student evaluates the characteristics and merit of
her/his work using specific terms; can describe the progress referring to
intentions, sources and problems encountered; justifies the purpose and meaning of
her/his work using vague terms.
Level 15-17 (15-17 pts): The student evaluates the characteristics and merit of
her/his work using specific terms; can explain the progress referring to
intentions, sources and problems encountered; justifies the purpose and meaning of
her/his work in cultural and social contexts.
Level 18-20 (18-20 pts): The student fluently evaluates the characteristics and
merit of her/his work using specific terms; can explain and justify the progress
referring to intentions, sources and problems encountered; justifies in fluent
terms the purpose and meaning of her/his work in cultural and social contexts.
D. Grade Descriptors
1. Draft criteria
A list of draft criteria was established in order to be used as ‘windows’ to help
the evaluation.
Students should:
AC1: Record personal ideas, intentions, experiences, information and opinions in
visual and other forms.
AC2: Critically analyse sources from visual culture showing understanding of
purposes, meanings and contexts.
AC3: Develop ideas through experimentation, exploration and evaluation.
AC4: Present a coherent and organised sample of works and final product
revealing a personal and informed response realising their intentions.
AC5: Evaluate and justify the qualities of their work.
2. Global Descriptors
The following descriptors or band scales might be useful for holistic assessment.
1-4. Reduced
A limited or inappropriate amount of work has been produced, including a few
records of obvious or literal ideas; some collected but not analysed sources from
visual culture; few developmental studies and a final product which does not
realise the intentions. Personal expression is weak and stated in vague terms, and
the student is unable to justify the strengths and weaknesses of their work.
Overall the work lacks a sense of order, basic technical skills and relevant
understanding of visual language.
5-9. Limited
A small amount of work has been produced, including some record of personal ideas
and intentions and some organisation of sources, but the student does not use them
to explore his/her own ideas. There is some sense of order and structure in the way
ideas are developed, but ideas are abandoned too early, and ideas and experiments
are repeated without any changes in form and content. The student sometimes
critically reflects on his/her work and the work of others.
Overall the work demonstrates a limited understanding and use of art and design
principles/visual language.
10-14. Basic/Reasonable/Appropriate
An appropriate sample of work has been organised, including sources appropriate to
the project but with superficial critical reflection. Some sources seem to be used
during the development of ideas but are not fully explored. The original ideas may
be consolidated too early; experiments are not fully developed and the outcomes are
predictable; however, there is some sense of order and method. The student
sometimes critically reflects on his/her work and the work of others.
15-17: Good
A good sample of work has been organised, including a record of some unusual ideas;
the majority of development studies show a personal style and reasonable technical
skill. The student understands and critically reflects on the contexts, meanings
and purposes of visual culture production, and explores possibilities taking risks
and sometimes evaluating the decisions. Uses art and design specific vocabulary to
reflect on and evaluate his/her work and the work of others. Explains the
importance of the work presented within social and cultural contexts.
18-20: Excellent
An excellent sample of work has been organised, including a record of unusual
ideas; development studies show a personal style and technical expertise. The
student understands and critically reflects on the contexts, meanings and purposes
of visual culture production. Constantly experiments and explores possibilities
taking risks, always evaluating the decisions made. Uses art and design specific
vocabulary to reflect on and evaluate his/her work and the work of others.
Persuades the audience (viewer) of the importance of the work presented within
social and cultural contexts. Overall the work demonstrates excellent command of
media and technical expression through developmental studies and final products.
1. General instructions
This paper and the instructions for the examination are given to you in advance of
the examination so that you can make sufficient preparation.
The portfolios for external assessment will be developed according to a project
brief and theme. You may choose to develop:
1. The model theme and project brief presented in this paper
2. A theme and project brief developed by the class and the teacher
3. Your own theme and project brief.
Excluded people are those who are rejected by society, or who are discriminated
against by the community because they are different from the majority of people in
the community. People can be rejected because they are physically or mentally
different; others can be excluded because they have different ethnic origins or are
from a different race. Some people are discriminated against because of sexual
characteristics; others are excluded because they have different beliefs or
different social backgrounds, or simply because their ways of living are different
from the majority's.
You are asked to make a visual study reflecting upon the people who are excluded
from the community, who are discriminated against: a personal reflection about
‘the other’.
Dwarfs, beggars, crippled and crazy people have been depicted in works of art
through history, for example by Velázquez, Murillo and Géricault, and in picaresque
paintings. Contemporary artists have been working on projects about homeless
people. Similarly, in film and photography some artists have approached
discriminated or excluded people. Photographs by Sebastião Salgado show a vision of
the other, and some of his works are about minorities. The photograph by Gabriel
Orozco, ‘The Island in the Island’, is a reflection about closed and discriminating
spaces. The photographer Diane Arbus presented in her work a personal vision of
different people, often called monsters. Monsters are often present in comics and
science fiction films. Super-heroes, androgynes, cyborgs and mutants are, in a way,
a vision of different people and their place in society.
Some theatre and dance companies and film producers work with artists with
disabilities; do you know of any? In advertising and publicity different people are
not commonly represented: do you often see fat, disabled or poor people in
publicity? Why? However, there is some advertising funded by humanitarian
organisations about different people. Designers have invented products to make
daily life easier for disabled people, for example cutlery and scissors for
left-handed people.
We exclude the other when we do not invite him to share our lives. We integrate
the other when we invite him to share our daily lives with no constraints.
• Me as the other
I can be the other, the different one. I can be discriminated against for some
reason. How will I feel within the community? What will be my sensations and
feelings? How shall I live; who are my friends? How do other people look at me in
the street? Why do they insist on looking at me?
What is my relationship with the objects made for ‘normal’ people? Are the
entrances to buildings accessible by wheelchair? What about the size of urban
equipment? If I am blind, are the streets safe for me? If I am locked in a
psychiatric hospital or a prison cell, how do I feel in that space?
• Fine Arts
Fine arts could be defined as a means of expressing personal experiences, feelings
and particular visions; they do not aim to solve practical or utilitarian
questions, but fine arts media can transmit strong messages about reality and the
world, including social questions. If you choose this medium you can make a
painting, a drawing, a sculpture, a print, an installation, a performance, a
site-specific work, etc. You can also choose ceramics, textiles or other crafts.
Several artists have approached this theme; research them and try to understand
their visions of different people. Develop your ideas from this research and from
your own opinions about the issue.
• Photography and media
Research photography and media products in which different people are presented,
and try to understand the underlying visions of the issue. As an outcome you can
produce one or several finished photographs, a film or a video in order to
present your own views and opinions about the theme. You can also make a web page,
video clip or other multimedia presentation.
• Comics
Comics are a narrative medium using images and text. You can make a comic strip or
a graphic novel to tell a story related to the issue. The graphic novel need not be
a finished product, but you must include the complete storyboard, studies for
scenes and characters, drafts for page layouts and some finished plates
(‘pranchas’).
• Graphic design
The aim of graphic design production is the communication of messages using the
expressive, symbolic, formal and functional aspects of the image. Messages can be
informative and/or persuasive.
• Product design
Try to detect needs related to different and disabled people in your environment
before thinking about the final product (you can use brainstorming strategies to
help you). You should show evidence of an understanding of the appropriateness of
the medium to its function and of fitness for purpose. Use your knowledge of the
design process to develop your ideas (concept, formulation of brief, research,
experimentation, realisation and evaluation). You can re-design a utilitarian
object or a space, or invent a new one.