
Developing a new conceptual framework for pre-university art examinations in Portugal

Maria Teresa Torres Pereira de Eça

Submitted in partial fulfilment
of the requirements for the degree of
Doctor of Philosophy

School of Education Studies, University of Surrey Roehampton
University of Surrey
2004

Abstract

The research topic was external assessment for art and design at pre-university level (age 17+). The research field was formal assessment in art education, especially in the visual arts at the end of secondary schooling (ages 17-18). In the literature, terms such as art, arts, and art and design are used in the context of art education; throughout the study such terms were used interchangeably according to the context. The principal audience for this study included key stakeholders in the national examinations in Portugal - government, government agencies, university admissions systems, and art teachers. It may also be relevant to the international community of researchers in art and arts education and related subjects, and to specialists in arts assessment in general.

The study was undertaken in two stages: (1) analysis of current art examinations at pre-university level in Portugal and England; and (2) the development of a conceptual framework for external assessment in art education in general, and a pilot and trial of a new instrument and assessment procedures for art examinations in Portugal. The first stage included three sections. (1) The literature on assessment was reviewed and current problems in assessing the arts were identified, including issues of quality such as reliability, validity, impact and practicality. It was established that valid and reliable assessment in art education required assessment instruments with clear instructions and criteria that allow multiple forms of evidence to be considered, and that such assessment procedures should include provision for in-service teacher training, standardisation and moderation. A conceptual framework was developed for evaluating art examinations. (2) The current Portuguese system of art examinations (age 18) was critiqued. Analysis of relevant documentation and questionnaires conducted with Portuguese art teachers and art students enabled identification of key issues, such as the lack of validity and reliability in the system, as well as the collection of proposals for alternative, improved forms of assessment. (3) The English system of art and design examinations (age 17+) was partially reviewed through direct observation of one example of examination procedures, analysis of documents and the recording of the views of selected stakeholders. The second stage included two sections that described and evaluated: (1) the design and piloting of a new framework for external assessment of art education in Portugal based on an extended portfolio assessment task, in-service teacher training and standardisation procedures; and (2) the trial of a new external assessment instrument and procedures in five schools in Portugal.

The new assessment instruments and procedures were compared with the current Portuguese art examination, establishing that greater validity and reliability were achievable. The study concluded with a discussion of the negative and positive aspects of the proposed conceptual framework and of the implications for assessment stakeholders in Portugal, including the need for reform in Portuguese art education. Further research into alternative forms and methods of assessment was also recommended.

Keywords: Art Education; Arts; Art and Design; Art Examinations; Assessment and
Evaluation; Authentic Assessment; Portfolio Assessment.

List of Contents

Pages

Abstract i
List of Contents ii
List of Figures ix
List of Tables x
Acknowledgements xii

Introduction 1

0.1. The problem area 1


0.1.1. Multiple viewpoints about art education 3
0.1.2. Assessment considerations 5
0.1.3. Key difficulties in art assessment 6
0.2. Contrasting systems of art examinations: Portugal and England 11
0.3. Summary problem statement 13
0.4. Research aims 13
0.5. Research questions 14
0.6. Outline of chapters 15

Chapter 1: Assessment in Art Education 18


1.1. Assessment 18
1.1.1. Roles of assessment 20
1.1.1.1. Feedback, motivation 22
1.1.1.2. Surveillance, control, accountability 24
1.1.2. Forms of assessment: formative/summative 25
1.1.2.1. Formative assessment 26
1.1.2.2. Summative assessment 26
1.1.2.3. Ipsative assessment and self-assessment 27
1.1.2.4. Diagnostic assessment 27
1.1.2.5. Negotiated assessment 28
1.1.2.6. External assessment 29
1.1.3. Assessment as a social product 29
1.1.4. External assessment and examinations 30
1.2. Assessment and the arts 31
1.2.1. Why assess the arts? 32
1.2.2. Assessment instruments and objects of assessment 33
1.2.2.1. Moving away from tests 34
1.2.2.2. New approaches for external assessment 37
1.2.2.3. Records of achievement 38
1.2.2.4. Portfolio 39
1.2.3. Standards, criteria, attainment targets 41
1.3. Assessing the arts: current problems 46
1.3.1. The problem of standards and criteria 46
1.3.2. The case for a consensual view of criteria 47
1.3.3. The problem of teacher in-service training 48

Summary 49

Chapter 2: Reliability, Validity, Impact, Practicality 51

2.1. Reliability 51
2.1.1. Problems of reliability in art examinations 52
2.1.1.1. Instrument 53
2.1.1.2. Assessor and examiners training 54
2.1.1.3. Judgements and moderation 54
2.1.1.4. Threats to reliability summarised 55
2.2. Validity 55
2.2.1. Some types of validity 56
2.2.1.1. Content validation 57
2.2.1.2. Face validation 57
2.2.1.3. Response validation 58
2.2.1.4. Washback validation 58
2.2.1.5. Criterion-related validation 59
2.2.1.6. Validation related to examination bias 60
2.2.1.7. Construct validation 61
2.2.2. Problems of validity in art examinations 62
2.2.2.1. Assessment instrument 62
2.2.2.2. The examination underlying model 63
2.3. Impact 65
2.3.1. Effects on society and individuals 66
2.3.2. Effects on students 67
2.3.3. Effects on teachers and teaching methods 67
2.3.4. Effects on schools 68
2.3.5. Effects on standards and curriculum 69
2.3.6. Effects on higher education 69
2.3.7. Ethical considerations 70
2.3.8. Evaluating the impact of examinations 70
2.4. Practicality 71
2.5. Conceptual framework for evaluating art examinations 72
Summary 75

Chapter 3: Design of the research 77

3.1. Research questions 77


3.2. Choice of method 77
3.3. Stages of research 79
3.3.1. Stage 1 80
3.3.2. Stage 2 80
3.3.2.1. Trial 1 80
3.3.2.2. Trial 2 81
3.4. Instruments and data collection 83
3.4.1. Qualitative data collection 83
3.4.1.1. Document sources 83
3.4.1.2. Observation 84
3.4.1.3. Interviews 85

3.4.2. Quantitative data 90
3.4.2.1. Questionnaires 90
3.4.2.2. Other data 93
3.5. Data analysis 93
3.5.1. Document analysis 93
3.5.2. Analysis of observation notes 94
3.5.3. Analysis of interviews 94
3.5.4. Descriptive statistics 98
3.5.5. Multi-faceted analysis 98
3.6. Reliability of data 101
3.7. Triangulation 102
3.8. Ethical considerations 103
Summary 105

Chapter 4: Portuguese art examinations 107

4.1. Historical background overview 107


4.2. Assessment in secondary education after 1996 111
4.2.1. General regulations for assessment 111
4.2.2. National examinations 113
4.2.2.1. Instruments 113
4.2.2.2. Procedures 115
4.3. The model of Portuguese secondary art curriculum 119
4.3.1. Origins of current art curriculum 119
4.3.2. Assessment instruments in Portuguese art examinations 122
4.3.3. Assessment procedures in Portuguese art examinations 125
4.4. Users’ perceptions of Portuguese art examination practices 131
4.4.1. Respondents 131
4.4.2. Description and analysis of questionnaire results 132
4.4.2.1. Syllabuses (Section 2) 132
4.4.2.2. Opinions about the validity of the examination (Sections 3 and 4) 133
4.4.2.3. The ideal form and structure of art and design examinations (Section 5) 137
4.4.2.4. Opinions about current procedures (Section 6) 140
4.4.2.5. Reliability (Section 7) 141
4.4.2.6. Opinions about ideal assessment procedures (Section 8) 142
4.4.2.7. Impact: consequences of the examination (Section 9) 144
4.4.2.8. Teachers' views about assessment procedures (Sections 10-12) 148
Summary 150

Chapter 5: Art and design examinations in England 153

5.1. Introduction 153


5.2. Art and design examinations before the 1988 Reform Act 156
5.3. The consequences of the Education Reform Act (1988) in England and Wales 161
5.4. AS and A level art examinations after curriculum 2000 163

5.4.1. The role of the QCA 164
5.4.2. The structure of awarding bodies 164
5.4.3. A modular structure of courses and examinations 166
5.4.4. Scheme of assessment 169
5.4.5. The specifications 169
5.4.6. Rationales for the specifications 170
5.4.7. Instructions for the conduct of examinations 172
5.4.8. Assessment instruments 176
5.4.8.1. Coursework 176
5.4.8.2. The ‘controlled test’ and question papers 176
5.4.8.3. Type and format of the assessment instruments 177
5.4.8.4. Process orientated instrument 179
5.4.8.5. Bias 181
5.4.8.6. Gender bias 181
5.4.8.7. Bias related to non-English students 182
5.4.8.8. Bias related to geographical location and economic background 182
5.4.8.9. Teachers’ influence upon students’ achievement 183
5.5. Assessment criteria 184
5.5.1. Mark schemes 186
5.5.2. Criterion referenced marking 188
5.6. Assessment procedures 191
5.6.1. In-service teacher training 191
5.6.2. Internal marking and internal moderation 192
5.6.3. Standardisation 194
5.6.3.1. Observation of standardisation procedures 194
5.6.4. Moderation 198
5.7. Conclusions in terms of validity summarised 199
5.7.1. Weaknesses 199
5.7.2. Strengths 200
5.8. Conclusions in terms of reliability summarised 201
5.8.1. Weaknesses 201
5.8.2. Strengths 201
5.9. Practicality 202
5.10. Impact 203
5.11. Suggestions for art and design examinations 205
Summary 206

Chapter 6: Design and piloting of a new external assessment instrument and procedures in Portugal 208

6.1. Introduction 208


6.2. Designing a new assessment instrument and procedures for art and design external assessment in Portugal 209
6.2.1. Rationales 209
6.2.2. Skills in art and design 210
6.2.2.1. Knowledge base 210
6.2.2.2. Thinking skills and metacognitive skills 211

6.2.2.3. Self-evaluation skills 212

6.2.3. Defining a content framework for the new art and design external assessment 213
6.2.3.1. Kinds of evidence 215
6.2.3.2. Portfolio 215
6.2.3.3. Portfolio evidence 215
6.2.3.4. Assessment criteria 217
6.2.3.5. Assessment procedures 217
6.3. Piloting the new assessment instrument 218
6.3.1. Pilot sample (trial 1) 219
6.3.2. Place, time and programme 220
6.3.3. Feedback from the pilot 221
6.3.3.1. Underlying model 223
6.3.3.2. Format 223
6.3.3.3. Bias 229
6.3.3.4. Tasks 230
6.3.3.5. Time and resources constraints 232
6.3.3.6. Criteria and weightings 234
6.3.3.7. Impact 235
6.3.3.8. Assessment procedures 236
6.4. Suggestions for improvement 238
Summary 239

Chapter 7: Main Study: Trial of the new external assessment instrument and procedures in five schools in Portugal 240

7.1. Trial sample 240


7.1.1. Data collection 242
7.1.2. Location, time and programme 242
7.2. Description of the activities 243
7.2.1. Standardisation meetings 243
7.2.2. Web page and on-line meetings 245
7.2.3. Implementing the portfolio with students 246
7.2.3.1. School P 247
7.2.3.2. School B 249
7.2.3.3. School A 251
7.2.3.4. School K 253
7.2.3.5. School V 256
7.2.4. Moderation 258
7.3. Feedback from Trial 2 260
7.3.1. Underlying concept 260
7.3.2. Preparation 260
7.3.3. Format 261
7.3.4. Instructions 262
7.3.5. Time 263
7.3.6. School resources 265
7.3.7. Tasks 266

7.3.8. Criteria and weightings 268
7.3.9. Bias 270
7.3.10. Assessment procedures 273
7.3.11. Consequences/Impact 274
7.3.11.1. Results 274
7.3.11.2. Effects upon students 275
7.3.11.3. Effects upon teachers 275
7.3.11.4. Effects upon schools and curriculum 277
7.3.12. Reliability of results 278
7.4. Comparing the current MTEP examination and new model 279
7.4.1. Validities 279
7.4.2. Reliability 279
7.4.2.1. Summary of the FACETS 281
7.4.2.2. Reliability index 284
7.4.2.3. Raters measurement reports 284
7.4.2.4. Probability curves 288
7.4.2.5. Expected ogives 290
7.4.3. Consequences/impact 292
7.4.4. Practicalities 292
Summary 293

Chapter 8: Conclusions 296


8.1. Current system of art examinations in Portugal (2000-2003) 296
8.2. Current system of art and design examinations in England (2000-2003) 299
8.3. Impact of the English and Portuguese art and design examinations 302
8.4. Increasing the validity of Portuguese art and design external assessment 303
8.5. Increasing the reliability of Portuguese art and design external assessment results 304
8.6. The proposed model for art examinations 304
8.6.1. The underlying model 304
8.6.2. The proposed instrument 305
8.6.3. The assessment procedures 311
8.6.4. Impact 312
8.6.5. Strengths and Weaknesses 313
8.6.5.1. Weaknesses 313
8.6.5.2. Strengths 314
8.7. Wider Implications 314
8.8. Further research 316
8.9. Reflections on the research methodology 319
8.10. Recommendations for assessment stakeholders 320
8.10.1. Students 320
8.10.2. Teachers and Assessors 322
8.10.3. Schools 322
8.10.4. Universities 323
8.10.5. Government 323

Glossary of terms 325
Bibliography 334

Appendices (Volume 2)

I: Portuguese education system 1


II: English education system 10
III: Scheme of Assessment England 19
IV: GCE AS and A level art and design areas of study in England 22
V: MTEP test 24
VI: Report 060 27
VII: Survey questionnaires 29
VIII: SPSS output survey Portugal 47
IX: Interviews in England 76
X: Observation of English art and design standardisation practices at Edexcel awarding body 116
XI: Trial 1 – Schedule group interviews 132
XII: Trial 1 – Sessions 133
XIII: Trial 1 – External observer report 167
XIV: Booklet (Instructions for an experimental examination in art and design) 169
XV: Trial 2 – Questionnaire students 193
XVI: Trial 2 – Questionnaire teachers 201
XVII: Trial 2 – SPSS questionnaires outputs 211
XVIII: Trial 2 – School P 240
XIX: Trial 2 – School B 250
XX: Trial 2 – School K 262
XXI: Trial 2 – School V 274
XXII: Trial 2 – School A 288
XXIII: Trial 2 – External observer report 292
XXIV: Facets output (current examination; trial 1 and trial 2) 297
XXV: Research journal, 4th May 2002 313
XXVI: On-line meeting (25-6-2003) 318
XXVII: Consent form 323
XXVIII: Diagram two countries 324

List of Figures

Figures Page

Figure 1: Conceptual framework for evaluation of art and design external assessment 76
Figure 2: Stages in research 79
Figure 3: Sample of art and design examination stakeholders for interviews in stage 1 87
Figure 4: Sample of students for interviews in stage 2 89
Figure 5: Response rates to the survey in stage 1 92
Figure 6: Diagram analysis of interviews in England stage 1 96
Figure 7: Diagram analysis of interviews in stage 2 97
Figure 8: MTEP question paper cardboard model (question paper September 2002) 125
Figure 9: Responses to question 5.7 139
Figure 10: Structure of awarding bodies 165
Figure 11: Controlled tests 177
Figure 12: Blueprint for the design of a new art examination 214
Figure 13: Types of evidence for portfolio 216
Figure 14: Assessment criteria 217
Figure 15: Differences between marks in Joana's and Luis' portfolios 259

List of Tables

Tables Page

Table 1: Plan of action 82


Table 2: Current examination differences of marks (MTEP) 128
Table 3: Questionnaire survey sample of students 131
Table 4: Questionnaire survey sample of teachers 131
Table 5: Responses to question 2.3 133
Table 6: Responses to question 4.4 135
Table 7: Responses to question 4.8 135
Table 8: Responses to question 4.10 136
Table 9: Responses to question 4.11 136
Table 10: Responses to question 4.12 136
Table 11: Responses to question 4.13 136
Table 12: Responses to question 5.1 137
Table 13: Responses to question 5.3 138
Table 14: Responses to question 6.2 141
Table 15: Responses to question 7.1 142
Table 16: Responses to question 7.2 142
Table 17: Responses to question 7.3 142
Table 18: Responses to question 8.3 143
Table 19: Responses to question 9.7 144
Table 20: Responses to question 9.1 146
Table 21: Responses to question 9.11 147
Table 22: Responses to question 9.12 147
Table 23: Interviews with English art and design examination stakeholders 156
Table 24: Trial 1 sample teachers 220
Table 25: Trial 1 sample students 220
Table 26: Trial 1 schedule 221
Table 27: Responses to questions about format 224
Table 28: Responses to questions about preparation 227
Table 29: Responses to questions about bias 229
Table 30: Responses to questions about tasks 230
Table 31: Responses to questions about time and resources 232
Table 32: Responses to questions about criteria 235
Table 33: Responses to questions about impact 235
Table 34: Responses to questions about assessment procedures 236
Table 35: Trial 2 sample 241
Table 36: Trial 2 schedule 243
Table 37: Trial 2: standardisation blind marking results 245
Table 38: Responses to questions about preparation 260
Table 39: Responses to questions about format 261
Table 40: Responses to questions about content 262
Table 41: Responses about instructions 262
Table 42: Responses to questions about tasks 266
Table 43: Questionnaire trial 2: one-sample T Test: sex/tasks 267
Table 44: Responses to questions about criteria 269

Table 45: Responses to questions about weightings 270
Table 46: Responses to questions about bias 271
Table 47: Responses about assessment procedures 273
Table 48: Responses about consistency of marking 274
Table 49: Responses about impact 277
Table 50: Comparing inter- and intra-rater reliability between the current examination and the new assessment instrument and procedures 280
Table 51: Measurable data summary: current examination 283
Table 52: Measurable data summary: trial 1 283
Table 53: Measurable data summary: trial 2 283
Table 54: Current examination raters measurement report 286
Table 55: Trial 1 raters measurement report 286
Table 56: Trial 2 raters measurement report 287
Table 57: Current examination probability curves 289
Table 58: Trial 1 probability curves 289
Table 59: Trial 2 probability curves 290
Table 60: Current examination expected score ogive 291
Table 61: Trial 1 expected score ogive 291
Table 62: Trial 2 expected score ogive 292

Acknowledgements

I wish to extend my grateful thanks to:

Dr John Steers, Director of Studies, for his knowledgeable advice, encouragement and support, and for having introduced me to the English system of examinations; Professor Cyril Weir, first co-supervisor, and Professor Rachel Mason, second co-supervisor, for their expert advice, guidance and support; Dr Barry O'Sullivan for his help with minifac files; the Portuguese government for their financial support, especially the Ministry of Education and the Fundação para a Ciência e a Tecnologia; the English awarding body, Edexcel Foundation, for letting me participate in the GCE/AS/A2 standardisation meetings in 2001 and 2002; the English teachers and students who generously gave me their hospitality and time; and the Centro de Formação Penalva e Azurara and the Associação de Professores de Expressão e Comunicação Visual for giving me the physical resources for the pilot and trial.

Particular thanks are due to the Portuguese art teachers and students who took part in
the research and to my fellow students in the Centre for Art Education and
International Research and the Centre for Research in Testing, Evaluation and
Curriculum at the University of Surrey Roehampton for their lively discussions and
companionship.

And finally my thanks are due to my family and friends for their support and
encouragement.

Introduction

This chapter establishes the research problem by defining issues and current trends in art and design curriculum and assessment.

0.1. The problem area

The field of 'art education' can be elusive to define with precision because a variety of terms are used in the literature, often interchangeably. For example, in the English National Curriculum the term 'Art' was used until 2000, when after much lobbying the subject title was changed to 'Art and Design'. Despite this change, the National Curriculum sub-text always stated that the subject title included 'Art, Craft and Design'. The English awarding bodies have also used 'Art' and 'Art and Design' inconsistently. North American literature on the subject usually refers to 'Art Education', although the term 'Visual Culture Education' is becoming more common; it is debatable whether these terms are synonymous. In Portugal the terms 'arte', 'artes', 'artes visuais', 'arte e design' and 'educação artística' are used, and these have been translated in this thesis as 'art'. When the term 'Arts Education' is used it covers a range of visual and performing arts including, for example, dance, drama and music education. Where any of these terms are used within quotations in this thesis they are the choice of the original author. To avoid confusion, discussion of these quotations uses the same terms as the original unless there is an obvious reason not to do so.

Assessment is a broad concept with multiple potential purposes. According to Brown (1994, p. 271), assessment should be closely integrated with the curriculum. Its principal functions include diagnosing the causes of the learner's success or failure, providing motivation for the learner, providing valid and meaningful accounts of what has been achieved, and providing a means of evaluating teaching programmes. Assessment can be used to provide feedback and motivation for teachers, parents and students; as an instrument of surveillance, control and accountability; and as an instrument for ranking and selecting students (Eisner, 1998; Robinson, 1982; Gipps & Stobart, 1997; Wiggins, 1993).

In the case of examinations at the end of secondary education, assessment (and especially external assessment in the form of formal examinations) has serious consequences for students because it is one of the indicators used to select them for further study or employment. The various procedures used in external examinations are frequently determined by the rationales that underpin different educational systems. Different procedures and instruments are used in different countries, often related to specific emphases in art education policy and practice.

Assessment in the context of this research refers to the process of arriving at a view or judgement about individual student achievement. It should be distinguished from reports or statements which result from that process, and from broader evaluation, which is taken to be the process of making judgemental statements about educational activities such as courses, programmes or some other larger aspect of the educational process (Eisner, 1985). Examinations and tests are techniques used to assess students' competencies through various forms of external assessment. External assessment refers to procedures developed and implemented by assessment experts from national, regional or other agencies outside the candidates' own schools.

0.1.1. Multiple viewpoints about art education

In art education it is usual to find that multiple viewpoints and value positions are celebrated (Boughton, 1996, p. 77). Boughton (1996) raised several problems for educators who have to judge student art works, particularly at the senior school level. For example:

How do we accommodate emerging postmodern forms within traditional criteria derived from modernism?
...Should the balance in emphasis of socially, critically theoretical analysis and studio practice be reconsidered?
...To what extent should formalist and modernist concepts be valued as artistic learning? (pp. 77-78).

Modernist and postmodernist views are two different underlying paradigms, which may problematise the achievement of consensus about the quality of art works. In modernist discourse, art is understood as a highly personal, individual expression, and formal criteria tend to be applied to student outcomes as measures of quality. In postmodern discourse, art works are viewed as social constructions; interdisciplinary and instrumentalist justifications are proposed for art; criteria emphasising critical analysis and questioning are used; and pluralist narratives of art are accepted (Efland, Freedman, & Stuhr, 1996).

Different viewpoints about art education present problems for assessment. For example, the criterion of 'originality' is not as important in a postmodernist as in a modernist perspective, and copies and reinterpretations of art and design exemplars by students are valued as much as 'original creations'. Finally, well-established criteria for judging drawing performance, such as faithful copying of reality, may not be valued so highly in postmodern conceptions of art and design assessment criteria.

In a modernist paradigm, the judging of artworks was often limited to formalist criteria. In the postmodern paradigm judgements are more complex; they always need moderation and a certain degree of consensus between teachers (Boughton, 1999; Freedman, 2003). The tension between modernist and postmodernist paradigms in art education was not resolved at the time of this research; however, it is important to consider its effects on assessment. Assessing the arts raises particular problems.

On the one hand, dynamic socio-cultural changes affect artistic expression of all kinds; debates about, for example, reconstructing or maintaining the cultural identity of minority groups or issues of national identity, rapid changes in technology, and the advent of postmodern philosophy are reshaping fundamental assumptions about art and education. On the other hand, the curriculum reforms currently being undertaken in countries that are members of the Organisation for Economic Cooperation and Development (OECD) seem to be demanding more prescriptive outcomes and clearer specifications of achievement standards for the arts. At a time when long-standing beliefs about the nature of art are under challenge, teachers are being asked to account for their students' learning in clearer and more precise ways (Boughton, 1999).

In Europe the subject of art education is not well defined (Schönau, 1999). The impact of postmodern thinking has created tensions between the expectations of centralised assessment and some of the traditional ways of defining art, raising the need to debate the impact of distinctions between modern and postmodern ways of defining art and art learning in assessment. Selecting the type of evidence of students' knowledge, understanding and skills in art assessment is a problematic task in the sense that political issues, the ideologies of curriculum planners and the practical constraints of art examinations always influence the design and implementation of assessment instruments and procedures. What is actually assessed in art and design examinations may not be congruent with what is actually learned in schools. It may be limited to a narrow domain of knowledge, raising questions about the validity of the system.

0.1.2. Assessment considerations

According to Broadfoot (1996, p. 175), different forms of assessment influence what is taught and how it is taught, and what and how students learn. Examinations organise and legitimate knowledge in 'testable form', always on the assumption that appropriate knowledge can be organised and tested. The choice of what is tested or examined, and the criteria used, influences curriculum practice. Large-scale external assessment often constrains the range of knowledge or competencies to be examined; in order to try to ensure reliability of examination results, certain competencies may be omitted from examination papers because they are difficult to assess or because they might require prohibitively expensive assessment procedures. Broadfoot (1996) also points out that '…systems of assessment have been criticised for putting a premium on the reproduction of knowledge and passivity of mind at the expense of critical judgement and creative thinking'. External assessment procedures such as examinations should have validity; in other words, assessment should serve its prescribed purposes, reflecting the aims and objectives of the courses and units taught (Eisner, 1998). Examination results should be reliable, minimising the effects of measurement error (Bachman, 1990, p. 161). Reliable assessment concerns the consistency of scores obtained by the same persons when re-examined with the same test on different occasions, or with different sets of equivalent items, or under variable examining conditions (Anastasi, 1988, p. 188). For example, two or three qualified judges (assessors or examiners) rating the same works and using the same criteria should rate 'answers' similarly, including the products or outcomes of practical work.
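The degree of agreement between judges described here can be quantified. The main study reports multi-faceted Rasch (FACETS) analyses; purely as an illustrative sketch, using invented judges and grades rather than any data from this research, chance-corrected agreement between two judges marking the same portfolios could be computed as Cohen's kappa in Python:

from collections import Counter

def cohens_kappa(rater_a, rater_b):
    """Chance-corrected agreement between two raters marking the same items."""
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance if each rater kept their own grade frequencies
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    expected = sum(freq_a[g] * freq_b[g] for g in freq_a) / (n * n)
    return (observed - expected) / (1 - expected)

# Hypothetical grades awarded by two judges to the same ten portfolios
judge_1 = ["A", "B", "B", "C", "A", "D", "C", "B", "A", "C"]
judge_2 = ["A", "B", "C", "C", "A", "D", "C", "B", "B", "C"]
print(f"kappa = {cohens_kappa(judge_1, judge_2):.2f}")  # prints kappa = 0.72

A kappa close to 1 would indicate that the two judges can almost be treated as a single judge; a value near 0 would indicate agreement no better than chance.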

The main assessment characteristics of reliability, impact, validity and utility or practicality raise difficult problems for external assessment, since these qualities may be in conflict or tension. When constructing examination papers, examiners and test developers must make a choice of content, competencies and the specific knowledge to be assessed. These choices will be determined by their concept of what is most relevant and important in art education. They must define the criteria expressing the qualities to be valued in student artwork and provide guidelines for making judgements.

0.1.3. Key difficulties in art assessment

Assessing the arts has been considered an especially problematic field, and doubts have often been expressed about whether such assessment can be properly objective (Boughton, 1996). Key difficulties include the extraordinarily varied nature of art works themselves and the consequent limitations of expressing common assessment criteria in words (Boughton, 1996; Eisner, 1985). Criteria express the qualities thought to be desirable in students' works and are considered essential to reliable assessment (Boughton, 1996; Eisner, 1985; Blaikie, 1996; Schönau, 1996). The concepts that underpin the criteria need to be shared by the whole art education community that is subject to a particular model of assessment. However, the expression of a criterion in words can be reductionist. Often the most important qualities in art students' achievements are tacit in nature, and are difficult to express adequately in words. According to Boughton (1996, p. 72), the problem with the use of criteria is that they express inexact human values and are subject to various interpretations by judges who may hold different views about the significance, or precise meaning, of any given criterion. He raised important questions for assessment in art education, such as:

1. How specifically should criteria be written? Should sub-criteria (rubrics) be used to illustrate expected standards? Should visual exemplars be used?
2. How many criteria are required to describe the qualities expected in student work?
3. Should criteria be weighted, i.e. are some more significant than others?

Even if examination papers are well constructed in terms of validity, and the criteria are well defined, assessment of students' examination work may still lack reliability. Agreement between judges, assessors or examiners, involving some form of moderation, is a necessary condition of obtaining reliability of art examination results (Beattie, 1994; Blaikie, 1996; Boughton, 1996; Steers, 1996). An important aspect of examinations is the method or procedures used to judge students' artworks, and these procedures vary according to different systems of external assessment. In Portugal, art and design external assessment occurs at the end of secondary education in the 12th year of school (age 17-18), and a single judge assesses examination works. In England, students of the same age are usually assessed by at least two judges and their marks moderated by an external examiner or moderator.

A system of moderation is essential to the reliability of an art and design examination (Schönau, 1996; Steers, 1994). Moderation is the means by which the marks of different teachers in different centres/schools are equated with one another and through which the validity and reliability of assessment is confirmed. According to Blaikie (1996), moderation is a fundamental aspect of reliability in qualitative assessment in that it ensures the fairness of judgements through combining inter-subjective and informed opinions. Boughton (1997) refers to the importance of group discussion in assessing tasks. Beattie (1994) notes the necessity of a second judging for evaluation of the product, and she argues: 'This procedure is extremely important when reliability of scores is critical, as in the case where a norm-referenced interpretation is mandated (i.e., comparisons of pupils and schools)'. The common assumption is that if judges can agree about the merit of student work then the judgement process has demonstrated objectivity. If different judges, using the same criteria, can agree about the standard of work over a variety of cases, then a high measure of reliability has been achieved, and confidence can be placed in the judgements. This assumption is derived from the natural sciences and is based on the belief that a 'true' score for student work exists and may be approximated through the agreement of experts. Most effort in centralised systems is invested in developing efficient strategies to obtain consensus on judgements. In that sense, the practice of training moderators and examiners by English examining boards aims to create agreement between judges and, if they share the same assumptions about the value of the students' art works, high reliability is obtained. Moderation meetings and the exhibition and publication of exemplars for each level of achievement may serve the purpose of calibrating assessors.
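The statistical face of mark equating can also be illustrated with a minimal sketch, assuming a simple linear rescaling of marks; this is one common technique shown for illustration only, not a claim about any awarding body's actual method, and all marks below are invented:

from statistics import mean, stdev

def moderate_marks(all_marks, teacher_sample, moderator_sample):
    """Linearly rescale a school's marks so the sampled scripts match the
    moderator's mean and spread (one simple form of statistical moderation)."""
    scale = stdev(moderator_sample) / stdev(teacher_sample)
    centre = mean(teacher_sample)
    target = mean(moderator_sample)
    return [round(target + scale * (m - centre)) for m in all_marks]

# Invented data: a moderator re-marks five sampled scripts from one school
teacher_sample = [14, 16, 11, 18, 12]      # teacher's marks for the sample
moderator_sample = [12, 15, 10, 16, 11]    # moderator's marks, same scripts
school_marks = [14, 16, 11, 18, 12, 17, 13, 15]
print(moderate_marks(school_marks, teacher_sample, moderator_sample))

In practice, as the surrounding text notes, English moderation relies on trained judgement and the re-marking of samples rather than purely statistical adjustment; the sketch merely shows how a sample of scripts can anchor a whole school's marks to a common standard.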

The English system of moderation appeared to have advantages over the Portuguese system because classroom teachers are the principal assessors of students' work. Teachers mark students' work and those marks are 'moderated' by external assessors who verify the marking of a sample of candidates' work at each school. Moderation may take the form of a sample of all the candidates' work being sent to a moderator at an awarding body, or moderators may visit schools when, for example, the product or the process is bulky or ephemeral. Moderators are trained to assess to agreed common standards and their judgements are usually subject to various checks and balances.

According to Wiggins (1993, p. 344), '…if assessment occurs without debate, values cannot be examined, change cannot be institutionalised, and awareness of alternatives is generally not discussed. It is all too easy for teachers whose judgements are not challenged to resist change and judge their students' work inappropriately'. In examinations it is assumed that assessors or judges have the necessary knowledge, or 'connoisseurship' in Eisner's (1985) terms, to make judgements about the quality of students' work. However, questions need to be asked about the role and competence of judges. Boughton (1996, p. 300) suggests:

1. What experience and philosophy should the judges have?
2. How should they be selected?
3. How should they be initiated?

These questions are particularly important in Portugal, where assessors of national examinations at the end of secondary school (age 18) are teachers without any kind of assessment training (Martins, 1999, p. 152). By comparison, in England assessors are supposed to receive some measure of specific training for their work from examination awarding bodies. The adequacy of this training will be investigated in this study (see Chapter 5). The literature suggests that a major source of unreliability in art examinations concerns the interpretation of assessment criteria and assessment objectives. In theory, if assessors share the same views and values they can act as one single judge, thus improving the reliability of the examination results.

It is usual for the procedures used in external assessment to be related to the prevalent local or national ideology of art education, and especially to concepts associated with the measurement of art students' achievement. In countries such as Portugal, external art assessment at the end of secondary school (age 18) is limited to tests administered during a short period of controlled time. The question papers are not available to the students before the date of the examination; students are not allowed to use preliminary sketches or preparatory research; and there are limited guidelines available for assessors and teachers. In England, the assessment objectives, criteria and question papers are made available to students, teachers and assessors before a long, controlled period of external assessment. Students are expected to use preparatory research during an examination. These external examinations do not take the form of pencil-and-paper tests as in Portugal, but rather of practical art and design course tasks with, in some cases, a written component.

The assessment procedures used in art examinations raise specific problems. Ideally the techniques used to assess students' achievement should have the necessary characteristics of reliability, validity and practicality or utility, and should not have a negative impact on curriculum practice. But this is rarely the case (Shohamy, 2001). In some systems reliability might be the main requirement; in others it might be validity; in others practicality (Wiggins, 1993). It is necessary to analyse the impact or consequences of such procedures to facilitate further review and to bring about improvement of the system. In Portugal there are a significant number of publications on assessment in general (Boavida & Barreira, 1993; Estrela, 1994; Fernandes, 1992; Lemos, 1993; Pacheco, 1995; Ribeiro, 1997; Valadares & Graça, 1998), but very little research has been completed on assessment in art. In England and the United States some research in art education assessment has been carried out; however, the American art educator Zimmerman (1992) claims that there is a particular need for research and development on the interface between teachers and assessment procedures. The nature of the concepts involved in studio art makes the assessment task especially problematic. According to Boughton (1997): 'An enormous amount of research remains to be done with these forms of assessment. Although some research has already been undertaken, perhaps most notably by the Dutch Institute for Educational Measurement (Schönau, 1996), little in the way of systematic investigation underpins widely established practices of assessment of studio art learning' (p. 211).

0.2. Contrasting systems of art examinations: Portugal and England

During 2001, approximately 4,000 students in Portugal took national art examinations in general art courses and approximately 1,500 students took national art examinations in technological art courses at the end of secondary school (source: Departamento do Ensino Secundário, 2001). The most important Portuguese art examinations are Drawing and Descriptive Geometry; Materials and Techniques of Plastic Expression; Theory of Design and History of Art. The format is pencil-and-paper tests requiring short essays and drawing exercises. Despite the considerable number of hours of study allocated to them, Studio Art and Technologies are the only specialisms in the Portuguese art curriculum which are not assessed externally. This fact leads students to consider Studio Art and Technologies as less important in the curriculum (Eça, 1999). In the Portuguese system, therefore, students are not given an opportunity to reveal their true knowledge and ability, and this reduces the validity of the system. Important skills and capacities in students' learning and achievement, which are specific to art production, are left out of external assessment, raising doubts about the content validity of the assessment system. In the assessment guidelines for Portuguese Studio Art (Departamento do Ensino Secundário, 1992), assessment objectives and criteria are not clearly defined. Assessment by jury or moderation is seldom used in Portuguese art courses; teachers are left free to judge students' coursework and examination work according to their own, often very different, interpretations. In previous research (Eça, 1999), the techniques used in Portuguese art and design external assessment were judged to be inappropriate, and serious doubts about the validity and reliability of the system were raised. The empirical evidence showed that a significant number of Portuguese art educators were clearly dissatisfied with the national system of external assessment for art disciplines. It was concluded that there was an urgent need to reform and improve external assessment in art and design courses at pre-university level in Portugal.

In England, 37,380 students took Advanced Subsidiary General Certificate of Education (GCE) Art and Design examinations in 2001 (source: Joint Council for General Qualifications). Art and design examinations, in both studio art and associated critical and contextual studies, are examined through a combination of coursework and work completed during controlled time, known as externally set assignments, at the end of the course. The components of the examinations allow for the inclusion of process and product in coursework and externally set assignments, and students can use preparatory research during the examination. The external examinations are not pencil-and-paper tests: written essays, work journals, coursework/portfolios, a finished piece of work as part of the externally set assignment, tape/slides, video and web pages can be accepted as evidence.

0.3. Summary problem statement

It is widely recognised that there are fundamental difficulties in most forms of external assessment related to the validity, reliability, impact and practicality of examinations (Valadares & Graça, 1998; Anastasi, 1988; Broadfoot, 1996; Cardinet, 1986). These are particularly problematic in art and design, where a consensus about the interpretation and use of criteria is difficult to obtain: the works being judged are wide-ranging in nature, differentiation by outcome is a method frequently employed, there are no right or wrong answers, and examiners typically seek divergent and 'original' responses.

There is a need to reform Portuguese art and design examinations at age 18, since evidence exists that the current system is deficient in most aspects of validity (Afonso, 1998). At the beginning of the research it was hypothesised that a critical evaluation of art and design examination systems in Portugal and England should provide valuable insights into current problems in Portuguese art examinations. An analysis of the weaknesses and strengths of both systems might facilitate analysis of key issues and suggest possible alternative assessment instruments.

0.4. Research aims

Therefore, the aims of the research reported in this thesis were to understand the theory, policy and practice of art and design examinations in England and in Portugal by comparing, contrasting and evaluating the principles, processes and relative reliability, validity, impact and practicality or utility of the particular national art and design examinations at age 17/18. The principal audience for this study included key stakeholders in the national examinations in Portugal - government, government agencies, university admissions systems, and art teachers. It may also be relevant to the international community of researchers in art and arts education and related subjects, and to specialists in arts assessment in general.

The intention was to carry out a number of small empirical studies of assessment practices in Portugal and England, in order to establish a conceptual framework for external art assessment. It was intended that the final conclusions and recommendations would: (1) establish a firm basis for making proposals to the Portuguese Government about the development of new forms of art and design external assessment; and (2) contribute to the international debate on external assessment in art education. The specific aims of the research were:

1. To examine theories of assessment in the international literature on art education and evaluate their significance for art and design examinations in England and Portugal.
2. To analyse documents concerned with art and design examinations in Portugal and England and identify the critical issues for assessment practice.
3. To investigate Portuguese students', teachers' and assessors' views about the nature and problems of art and design examinations in Portugal and their views on possible means of improvement.
4. To compare the policy and practice of art and design examinations in England and Portugal and determine their relative strengths and weaknesses.
5. To develop a new conceptual framework for external art assessment at pre-university level.
6. To develop and trial an experimental assessment instrument and assessment procedures.

0.5. Research questions

Taking into account the problem statement and research aims, the following broad research questions were formulated:

1. To what extent do English and Portuguese art and design external assessment
systems at age 17+ have the necessary validity, reliability and practicability to
assess art students’ achievement fairly and accurately?
2. What is the impact of the English and Portuguese art and design examinations at
age 17+, on teaching and learning?
3. How can the validity and reliability of Portuguese art and design external
assessments at pre-university level be improved?
4. What kind of conceptual framework would attain more valid and reliable art and
design external assessment instruments and procedures at pre-university level in
Portugal?
5. How might the conceptual framework for such assessment be applicable in more
general international contexts of art external assessment at pre-university level?

In order to compare and contrast the two systems of external assessment, the following more specific questions were identified:

1. What types and models of art assessment instruments are currently in use
in England and Portugal at age 17+?
2. What are the similarities and differences between them?
3. What concepts and theories underpin art and design external assessment
in Portugal and England?
4. How are assessors and examiners trained in Portugal and England?
5. How is students' examination work judged in Portugal and England?
What criteria are applied? How is agreement reached about defining
criteria?

0.6. Outline of chapters

The structure of the thesis is as follows:

Chapter 1: presents a literature review of assessment issues and broad problems related to art examinations at pre-university level. Key terms are defined, and key problems associated with assessment, and with assessment within art, are identified and discussed.

Chapter 2: consists largely of detailed consideration of particular qualities of examinations, including reliability, validity, impact and practicality. The concept of reliability and different types of validation are described, and specific sources of invalidity are identified. The chapter ends with a proposal for an evaluative framework addressing a list of questions concerning potential sources of invalidity, unreliability and negative impact in relation to art and design examinations.

Chapter 3: describes the design of the research. The first part concerns the choice of a combined qualitative-quantitative method and the outcomes of the two research stages. The second part describes the data collection instruments, data sources and methods of data analysis.

Chapter 4: the first part is an overview of the history of art and design examinations in Portugal and considers the current examinations in terms of their validity, reliability, impact and practicality. The second part reports the findings of the analysis of the current assessment model and of users' perceptions of what might be an 'ideal' art and design examination, in order to present a list of key features that might improve the current model.

Chapter 5: reports the findings of an analysis of the strengths and weaknesses of the English art and design examinations at 17+ level, through analysis of documents relating to their evolution, underlying concepts, and instruments and procedures of assessment, and through analysis of the researcher's observations and interview data with key stakeholders.

Chapter 6: (1) proposes a possible new framework for art and design examinations together with a rationale; (2) explains the content framework for the experimental art and design examinations; and (3) reports on and evaluates a pilot experiment (Trial 1) at School Alves Martins.

Chapter 7: reports on and evaluates the larger-scale trial of the new instrument and external assessment procedures in five schools in Portugal.

Chapter 8: in the final chapter, findings from all the strands of the research are synthesised to answer the research questions, and conclusions are drawn about possible ways to improve the Portuguese art assessment system. Recommendations are made to Portuguese art assessment stakeholders, and international implications for further research are identified.

Chapter 1

Assessment in Art Education

This chapter reports on a literature review of assessment issues and broad problems related to art examinations at pre-university level. Assessment in art and design is often seen as especially problematic because of the nature of the work and knowledge to be judged, and also because of the difficulty of obtaining a consensual view of criteria and of how to interpret them. Key terms are defined and significant issues related to assessment, and to assessment within art education, are identified and discussed.

1.1. Assessment

Assessment in education is the process of gathering, interpreting, recording and using information about pupils' responses to an educational task. At one end of a dimension of formality, the task may be normal classroom work and the process of gathering information would be the teacher reading a pupil's work or listening to what he or she has to say. At the other end of the dimension of formality, the task may be a written, timed examination, which is read and marked according to certain rules and regulations. Thus assessment encompasses responses to regular work as well to specially devised tasks (Harlen et al., 1994, p. 273).

Assessment might be considered an integral aspect of the teaching and learning cycle and can include a range of methods for monitoring and evaluating student performance and attainment, from formal testing and examinations, through performance assessments (including practical and oral presentations) and teacher or classroom-based assessment, to portfolios (Klenowski, 2002, p. 42).

The word 'assess' is derived from the Latin ad + sedere, meaning 'to sit down together'. Wiggins (1993) points out that the etymology of the word alerts us to the clinical aspect, as when applied to psychological assessment, that it is a 'client-centred' act:

…in an assessment, one 'sits with' the learner. It is something we do with and for the student, not something we do to the student; this person who 'sits beside' is one who 'shares another's rank or dignity' and who is 'skilled to advise' on technical points (p. 14).

This implies that assessment is an interactive relationship between at least two persons: the student (or his/her work) and the assessor. MacGregor (1996, p. 29) points out: 'The teacher who undertakes assessment is inevitably committed to making decisions about values'. The assessor is viewed as someone who has expertise and competence and can be trusted as an authority to guide students' learning and establish the value of students' performance, through comments or by grading the performance. This implies a power relationship between the assessor and the person being assessed. Assessment can be constructive when it provides guidance for the student and self-reflection for teachers and schools. However, it can also be reduced merely to an act of quality measurement and classification of students' performance.

Wiggins' (1993) ideas about assessment give an important overview of the phenomenon because he views it as a contractual relationship with consequences for students and teachers. From this viewpoint, ethical and social dimensions have to be taken into account. Assessment, and especially assessment in art and design, cannot be reduced to a simple act of measurement: it is judgement based.

1.1.1. Roles of assessment

Assessment is a broad concept with the capacity to fulfil multiple purposes. It should be closely integrated with the curriculum (Broadfoot, 2000). Its principal functions include diagnosis of the causes of young people's success or failure, motivation for the learner and teacher, and the provision of valid and meaningful accounts of what has been achieved, and it may form part of the evaluation of teaching programmes (Brown, S., 1994, p. 271).

Sutherland (1996) argues that three uses of assessment have dominated its history in Britain so far: as a device for raising standards, as a device for measuring deviation or abnormality, and as a device for securing equitable treatment. Assessment used as a form of quality control is commonly seen as a way to improve educational standards. According to Gipps and Stobart (1997, p. vi), professional assessment and managerial assessment are two different functions that need to be taken into consideration. These are defined as: (1) professional assessment, where assessment primarily helps the teacher in the process of educating the pupil; and (2) managerial assessment, which uses test and assessment results to help manage the education system. Assessment for learning can be equated with the professional purpose: it takes place at any time, but is most useful during a period of teaching when the information produced is being used by the classroom teacher, and perhaps by other teachers in a school, to improve teaching-learning classroom strategies. Assessment for learning may also mean assessment of pupils' understanding and mental processes, rather than assessment of their performance. This is often referred to as 'formative assessment' (Ash et al., 2000).

According to Robinson (1982, p. 82), 'The principal function of assessment in schools is to provide information about pupils' abilities and levels of attainment'. However, assessment has other functions and needs to be viewed in the light of the specific educational purposes it is intended to perform. Eisner (1998) has identified and described five important functions of assessment as follows:

One function of assessment is a kind of educational temperature taking...
A second function of assessment is a gate keeping function...
A third function of assessment is to determine whether course objectives have been attained…
A fourth function of assessment is to provide feedback to teachers on the quality of their professional work. ...
A fifth function of assessment focuses on the quality of the programme that is being provided (p. 139).

Educational theorists recommend that the above functions of assessment should be treated separately (Cardinet, 1993; Rosales, 1992; Gipps and Stobart, 1993). The functions of assessment in providing feedback for the student or teacher, providing motivation for the student or teacher, providing a basis for selection, monitoring national or local standards, and controlling the curriculum may conflict (Lambert & Lines, 2000; Broadfoot, 2000). Although the same assessment instrument can provide data that is useful in different situations, there is a need to view assessment in the context of its utility. Gipps and Stobart (1993) argue:

...different purposes require different models of assessment, and different relationships between pupils and teacher. It may be possible to design one assessment system which measures performance for accountability and selection purposes, whilst at the same time supporting the teaching and learning process, but no one has yet done so (p. 3).

For the purposes of the research reported in this thesis, the main function of assessment is summative and concerns the judgement of students' performance or achievement for selection and certification purposes at age 17 or 18.

1.1.1.1. Feedback and motivation

Robinson (1982) argues that assessment and evaluation have important roles in

education because they can provide feedback for teachers and schools. He states:

Assessments of pupils may be used to decide on the course their work is


to take, or to summarise their achievements. Feedback and
encouragement are key elements in education. Providing these should be
a function of assessment. This should pervade the process of their
education and be as familiar to pupils as their lessons. This implies
something much broader and more integral to the work than periodic
testing and grading (p.84).

Feedback is information that provides performers with direct, usable insights, based on tangible differences between current and hoped-for performance. Few teachers, and even fewer tests, provide what students most need, namely information designed to enable them to self-assess and self-correct accurately, so that assessment, in Wolf's terms (Wolf et al, 1991, p.183), becomes 'an episode of learning'. Wiggins (1993)

argues that many classroom teachers believe that a grade and a short series of

comments constitute feedback. To provide effective feedback there is a need for

user-friendly information on how the student is doing and how, specifically, he or

she might improve his or her performance. Grades and scores are symbols, or

encoded information, and not always easy to understand. Wiggins (1993, p.195) recommends that feedback for students should be qualitative. Assessment is not a question of assigning symbols. Students must understand and share the meaning of assessment in order to improve their performances, so assessment is much more a constructive dialogue than a matter of marking work with numbers or letters.

Looking at assessment as a 'situational interaction' (Wiggins, 1993, p.195), communication between participants is essential, and a system based on codes externally defined by experts may not be effective. Feedback for students and teachers is complex and needs to be conveyed through sophisticated forms of communication (Wiggins, 1993, p.196). Detailed visual, verbal and/or written

information related to the assessment context should be used in order to comment on

students’ achievement. Strategies such as ‘conversation’ may be more appropriate for

assessment procedures than is usually thought:

…we now feel able to suggest ways forward. A pupil’s self-esteem and
confidence need not be at risk in a genuine conversation designed to
support her own reflection and help her make informed and constructive
judgements while at the same time hearing the opinions and advice of her
partner (Ross et al., 1993, p. 164).

According to Wiggins (1993, pp. 198-199) traditional assessment (such as tests)

provides ‘praise or blame, and non-situation specific advice or exhortation’, uses

‘language only experts can decode’ and ‘measures only easy-to-score variables’. This

is a kind of feedback that is ineffective. To obtain ‘authentic’ feedback teachers

should rely on several assessment strategies to assess students’ accomplishment,

specifying the degree of conformance with an exemplar, goal or standard.

Ross et al. (1993, p. 164) argued that grading in the arts 'must be criterion related and not based upon normative, positivistic structures'. As Vygotsky (1972) pointed out, grading should be accompanied by a rich descriptive profile of each student, based upon what they are able to achieve in a supportive environment.

A system of assessment that relies only upon quantitative data may not provide adequate information. To provide feedback and motivation for students and teachers, assessment should be communicated not only through statistical or numerical symbols but also through detailed verbal reports about performance. Furthermore, there is a risk of misuse of data and incorrect interpretation of results if the only information available is revealed through grades and examination or test scores, as, for example, in the English school 'league tables'.

1.1.1.2. Surveillance, control, accountability

Assessment may be used as a mechanism of control and, for governments, it may provide a form of accountability. Boughton (1999) criticises the

pervasive consequences:

The problem is that what both industry and governments want is


predictability, uniformity, and reliability of outcome. A quick fix for the
problems created by cultural change and economic leadership. What has
emerged from the frenzy of national curriculum reforms seen around the
world in the last decade has been uniformity in curriculum 'standards' or
'profiles' for each of the important curriculum subjects. There is nothing
wrong with the idea of standards, indeed it would be foolish to deny
attempts to define high quality benchmarks, but the manner of their
development and expression is critical if they are to be of constructive
use to any field. The problem is that uniformity and predictability are the
key elements of evaluation strategies useful to governments as a
surveillance and control mechanism to ensure schools and teachers
produce the goods (p. 337).

According to Wiggins (1993, p.289) '…accountability exists when the service

provider is obliged to respond to criticism from those the provider serves. The ability

to hold the service provider responsible depends upon a moral-legal-economic

framework in which the client has formal power to demand a response from the

provider, to influence the providing, and to change service providers if the responses

are deemed unacceptable'. In the case of education, the client may be a student,

parents, a community, employers, the government or a nation. The dictionary

definition of accountability, which includes a formal, legal element, makes it clearer.

The Oxford English Dictionary defines accountable as 'Liable to be called to

account, responsible, amenable'. Accountability is a moral (and sometimes, by

extension, contractual) obligation to be responsive to those with whom one has a

formal relationship. And as the word's roots in European legal history suggest,

accountability presumes moral equality: no one is above the law. Wiggins (1993)

argues that teachers and administrators are obliged to answer questions about how

their students are doing, irrespective of what they choose to report (p. 257).

However, students and teachers are ill served by having complex performances reduced to a single, arbitrary number (p. 260). Furthermore,

The absence of credible tests makes summary judgements about school


performance almost impossible. And whatever tests we use, if union
contracts and custom prevent teachers from being 'traded' or acting as
'free agents' it is far less likely that schools can be changed. And if they
cannot be changed, they are not accountable (Wiggins 1993, p.261).

Accountability may also be framed as 'quality control'. Wiggins (1993) states that 'if the schools set clear, public targets; if the parents have a clear and present voice; if the students, former students, and institutional customers have a voice then we will have accountability that will improve schools' (p.277). Accountability is a political issue, and Hermans (1996) makes the point that:

As art educators, we should not only question the educational purposes of


assessment in the arts, it is also vital to consider some of the political
issues relating to the position of the arts in education, the quality of art
education and the use of data and results. The results will provide a
partial and distorted view of the educational reality, if test scores are used
to paint a picture of the quality of the curriculum, art teaching or art
education in general. Many large-scale assessments serve only one
purpose: ranking students. Improving education might be an important
effect but it is not the prime goal (p.106).

1.1.2. Forms of assessment: formative/summative

Assessment may take many forms. Informally, teachers are assessing pupils all the

time, as indeed pupils are assessing them. For Ash et al. (2000):

Assessment is a general term embracing all methods customarily used to


appraise the performance of an individual or a group. It may refer to a

broad appraisal including many sources of evidence and aspects of
pupils' knowledge, understanding, skills and attitudes, or a particular
occasion or instrument. An assessment instrument may be any method or
procedure, formal or informal, for providing information about pupils'
learning: individual discussion, homework, project outcomes, mock
examination work, group critiques (p.135).

1.1.2.1. Formative assessment

Assessment is formative ‘when the evidence is used to adapt the teaching work to

meet learning needs’ (Black et al, 2003, p. 2). Formative assessment can occur many

times in every lesson. It can involve several different methods to encourage students

to express what they are thinking and several ways of acting on such evidence (Black

et al, 2003, p. 2). The results may enable a teacher to give remedial help at an early

stage of learning, or change the emphasis of a course if required. They may also help

a student to identify and focus on areas of weakness. Ash et al (2000) point out that

formative assessment is usually continuous throughout the process of a particular

learning activity. ‘It has been recognised by most art and design educators that

formative assessment is particularly constructive’. It is concerned with providing

ongoing feedback during the process of making, rather than assessing and grading an

isolated, finished product. ‘Formative assessment is an integral part in the

development of projects, which support learning; it encourages and guides pupils’

work forward. If pupils do not receive regular feedback on their work, they quickly

lose motivation and become unsure of their own assessment of success or failure’.

1.1.2.2. Summative assessment

Summative assessment can be a culmination of evaluations made over a period of time. Usually summative assessment occurs at the end of a project, scheme of work or course of study and therefore focuses on finished outcomes or, more holistically, on a body of work.

1.1.2.3. Ipsative assessment and self-assessment

Another type of assessment is described as ipsative (Ash et al., 2000): ‘Ipsative

assessment gauges the development of an individual from one moment in time to

another (usually the present). It is concerned with the evaluation of personal

achievement rather than an individual's relationship to national or local norms’. It

often takes the form of pupil self-assessment providing an opportunity for pupils to

appraise themselves in a non-competitive climate. Additionally pupils' self-

assessment provides teachers with insights into pupils' understanding of their own

progress. This can be used for diagnostic assessment.

In theory at least, self-assessment is gaining more and more importance in education,

not only as a form of assessment but also as a form of learning. As Ross et al (1993,

p.164) point out: ‘Pupils must first know what they have done before they are in any

position to judge [and understand] how they have done’. Furthermore, the process of

self-assessment can motivate students. The intent of self-assessment is to teach students to monitor their own learning continuously and to help them reflect on their abilities and performance in relation to specified content and skills, and on their effectiveness as learners, using specific performance criteria, assessment standards and personal goal setting.

1.1.2.4. Diagnostic assessment

Diagnostic assessment approaches pupils' work and behaviour as evidence for the

analysis of their ability in a given field (it is often used to discover learning needs). It

can be used constructively as a vehicle for discussion between teacher and pupil

where both parties consider progress and set targets for future development (Ash et

al., 2000, p. 136). Diagnostic assessment can be a powerful motivating factor for

pupils.

1.1.2.5. Negotiated assessment

It is widely agreed that assessment should be negotiated (Wiggins, 1993; Boughton,

1996; Hannan, 1985). For Ash et al. (2000) negotiated assessment in the form of constructive criticism promotes learning and a degree of pupil ownership in the assessment process. Two main aspects of negotiated assessment can be identified: (1) in the relationship between student and teacher/assessor, and (2) in the relationship between the teacher and the system.

Students and teachers can work on formative assessment together by sharing criteria,

discussing levels of achievement and looking for strategies to improve learning

(Shoamy, 2000). 'Assessment done properly should begin with conversations about

performance’ (Wiggins, 1993, p.13) and this only happens by transparent means of

communication; in other words, it depends on the clarity of instructions, directives,

format, criteria and procedures for marking.

Teacher ownership of assessment is also important (Steers, 1994, p.289). Negotiation

occurs when all the users of the system understand and agree with the applied

procedures. In that sense, teachers should not be marginalised from the development

and application of external assessment. Debate and consensus should form part of the

process of standard setting in the assessment of student art products (Boughton,

1997). Assessment procedures in the arts should involve co-operation:

Judgements in art are fundamentally holistic, and it is the total patterning


of a work, which exhibits criteria appropriate for its evaluation ...
Assessment in art, at least in the context of its production, must
encourage co-operation and the active participation between teacher and
taught in negotiations about meaning and quality of artwork in exemplary
ways (Heyfron, 1986, p. 203).

1.1.2.6. External assessment

This research focused on external assessment, which is often associated with external

examinations. Assessment is external when experts working for an external

organisation, e.g. an awarding body or the Ministry of Education, develop the

instrument of assessment and it is applied in a large number of schools in a country

or a region. In external assessments, examiners appointed by the organisation that designs the instrument mark the work submitted by students. External

assessment in the form of national examinations is not always synonymous with

summative forms of assessment. For example, English art and design examinations

such as the GCSE taken at age 16+ can be considered both formative and summative

in nature (Steers, 1996).

1.1.3. Assessment as a social product

From its beginnings in the universities of the eighteenth century and the school

systems of the nineteenth century, educational assessment has developed rapidly to

become the unquestioned arbitrator of value, whether of pupils' achievements,

institutional quality or national educational competitiveness (Broadfoot, 2000, p. ix).

Educational assessment can be seen as the representation of the desire to discipline

an irrational social world in order to establish rationality and efficiency. According to

Broadfoot (2000):

…the engine that drove [assessment’s] rapid development was the


aspiration that merit and competence should define access to power and
privilege; that investment in education could be tailored to identified
potential; that value for money could be convincingly demonstrated. All
these are worthy aspirations, and they remain the dominant agenda of
examinations and test agencies around the world who even today
continue what is often a heroic struggle to provide equitable and
defensible accreditation and selection mechanisms, to hold back the tide
of corruption and nepotism that threatens to engulf the whole enterprise
(p. x).

Assessment and examinations are often viewed as an equitable method of selecting

individuals for preferment. However, this is not always the case since education and

assessment systems tend to favour specific social classes, genders and races.

Assessment systems do not necessarily introduce more equality and fairness into the system; they may offer less, with more streaming and stratification by family income, wealth and race (Hilliard, 1991; Pullin, 1994; Madaus, 1994; Darling-Hammond, 1994; Miller, 1995).

1.1.4. External assessment and examinations

Examinations are a social construct rather than a neutral technological instrument

and may function to maintain social stability. Foucault (1977) argues that

examinations contribute to the accomplishment of social control, particularly with

respect to understanding and acknowledgement of authority and self-discipline. For

Atkinson (2002) assessment of art practice invokes processes of surveillance,

normalisation, discipline and regulation of students’ artwork. With regard to social

control Torrance (2000) argues:

Educational assessment can no longer claim to be naive about such
matters or uninterested in them. Future theory and practice must be more
self-conscious about these implications of the process; in particular,
research and development in assessment must take more seriously the
ethical issues of what counts as acting in the 'best interests' of the
candidate, and take more trouble to ascertain candidates' views of the
process and its outcomes (p. 178).

Foucault (1977) documents the move from the direct, explicit, feudal exercise of

power through physical coercion, to the modern social practices of self-control and

professional management whereby discourses of care, training, rehabilitation and

treatment construe and construct discipline/individual behaviour in relation to

normative expectations. The examination is the exemplary manifestation of

normative disciplinary power:

Power is not possessed as a thing, or transferred as a property; it


functions like a piece of machinery... it is the apparatus as a whole that
produces 'power' and distributes individuals in this permanent and
continuous field... this network 'holds' the whole together and traverses it
in its entirety with effects of power that derive from one another:
supervisors perpetually supervised (Foucault, 1977, p. 176).

Foucault's argument is that the power of examinations derives from the use of

comparison and differentiation tools:

The art of punishing, in the regime of disciplinary power, is aimed


neither at expiation, nor even precisely at repression... it refers
individuals to a whole that is at once a field of comparison, a space of
differentiation... the constraint of a conformity that must be achieved
(pp. 182-183).

For Foucault the school becomes a sort of apparatus of uninterrupted examination:

... [examinations] become increasingly a perpetual comparison of each


and all that made it possible both to measure and to judge... a constantly
repeated ritual of power... it is the examination which, by combining
hierarchical surveillance and normalising judgement, assumes the great
disciplinary functions of distribution and classification (pp. 186-192).

1.2. Assessment and the arts

According to Barrett (1990), assessment refers to the judgement of a process or

product, in terms of a spectrum of explicitly stated criteria. Specifically, the degree

of successful achievement is assessed by these criteria. In other words studio art

assessment is the appreciation of the qualities of students' achievement through the

judgement of the qualities of the work (performance, product or process) submitted to the judges (teachers, assessors or examiners) by the student in response to an assessment task (examination question papers, for example). The judgement is based on a list of known and previously stated criteria and is expressed by qualitative and/or

quantitative means. In the case of studio art examinations, the judgement is

dependent on appreciation and interpretation of the qualities that are anticipated

(attainment targets or descriptors of achievement) by the judges. Usually assessors

express their judgements using scores or marks. Using mark schemes they attribute a

grade to a student's examination work. The grade symbolises a level of attainment in

terms of the variable balance of knowledge, skills and understanding about art

revealed in the student's examination work.

1.2.1. Why assess the arts?

Armstrong (1994) states that arguments for assessing art emanate from three arenas:

(a) professional education, (b) the government, and (c) the local community.

Professional education maintains that a sound education links instruction and

feedback about the effect of that instruction. According to Armstrong (1994)

governments maintain that a common, general education is in keeping with the

democratic tradition, and evidence of progress toward that end is expected.

Assessment and especially examinations in the arts subjects may also be viewed as a

form of legitimisation of their role in the curriculum. 'More and more art teachers see

examinations as guaranteeing survival in hard times' (Vickerman, 1986). Ross (1978)

argues:

If we do participate in public examinations we run the risk of allowing


our work to be wrested from its legitimate roots; and if we do not, we
seem to push the arts even farther out on a limb. The more the arts
become exceptions to the rules of schooling the less relevant they are
likely to appear (p. 261).

Armstrong (1994) points out:

Art educators cherish their uniqueness. At the same time, that uniqueness
contributes to a precarious position in which art teachers too frequently
find themselves. Few persons understand and value the uniqueness of art
education as an important part of general education for all students. Art
educators need to demonstrate that the learning unique to art contributes
to the education of all students and that students do produce evidence of
learning in art. Without assessing what students learn in art, art education
will remain in a peripheral value position in formal education.
Assessment provides an objective basis for the claims that arise from art
teachers' hunches and chance, casual observations made daily at the
scene of action in the art classroom (p.9).

1.2.2. Assessment instruments and objects of assessment

According to Cronbach, assessment '…involves the use of a variety of techniques,

has a primary reliance on observations [of performance], and it involves an

integration of [diverse] information in a summary judgement'. Wiggins (1993) argues

that assessment in the arts must be flexible and include several outcomes:

No assessment system worthy of the liberal arts assesses mere knowledge


or skill; it assesses intellectual character, good judgement – the wise use
of know-how in context. We are therefore derelict if we construe
assessment as only the act of finding out whether students learned what
we taught. On the contrary, one could argue that assessment ought to
determine whether students have effectively reconsidered their recently
learned abstract knowledge in light of their experience (p. 38).

In the case of the arts, the objects of assessment often include the products and processes

of art making. Beattie (1996) recommends that assessment components in the arts

should always include products:

Current philosophical theories of postmodernism of art education present


a vision of the art object and its expansive context as inextricably linked.
Art criticism, art history, and aesthetics cannot be understood without the
art object that informs them. Personalised dimensions of these three
disciplines, therefore, require creation of an art object. A holistic
understanding of art cannot be evaluated without the product (p. 48).

But she also states that the process of making is an essential part of learning and

should be considered in assessment: the objects of assessment can be either process

or product. Beattie (1995, p. 54) makes the point that recent research in cognitive

psychology, in educational and teaching theories, and the current educational reform

initiatives provide a strong basis for a paradigm shift from emphasis on assessing

product exclusively to assessing process and product in tandem. Assessing the

process is an important aspect of education and is central to what Resnick and Resnick (1992) call the 'thinking curriculum': a curriculum that emphasises 'higher-order abilities' such as problem solving, reasoning, inference, constructing personal knowledge, and exercising personal judgement.

1.2.2.1. Moving away from tests

Traditional assessment instruments used to judge students' knowledge, skills and understanding about art are often based on tests requiring capacities of recall and

technical ability. Tests might not assess thinking and critical skills validly. In Eisner's

(1979) words, such methods provide a biased, even distorted, picture of the reality

that we are attempting to understand. Performance-based tasks tend to be highly

context-specific and do not easily allow direct testing (Beattie, 1996, p.53). Steers

(1994) points out:

A valid assessment instrument will assess, in an appropriate manner, the


specific curriculum/syllabus to which it relates. So for example, a course,
which principally consists of practical studio-based art and design
activity, cannot be adequately or validly assessed by means of a multiple
choice, pencil and paper test (p.290).

Gipps & Stobart (1997) also refer to the need for appropriate use of assessment

instruments according to the purposes for which they are designed. They argue:

Tests are useful instruments of assessment, but, even well-constructed


tests are not valid if the results are used inappropriately: to use a maths
test to select students for Fine Art courses or a music test to select
engineers, even if the tests are technically superb, provide extreme
examples of how it is the use of test results which determines validity
(p.42).

Eisner (1998) makes the point that:

The multiple-choice test, true and false test, was considered reliable and
objective. Though, single correct answers and single correct solutions are
not the norm of intellectual life, nor are they the norm of daily life. The
problems that people confront in their intellectual endeavours, like the
problems they confront in ordinary living, are replete with reasonable
alternative solutions (p.144).

In the body of research that has been carried out in the assessment field there is a

move away from tests (Amabile, 1983; Gardner and Grunbaum, 1986). Tests have

been criticised for encouraging teachers to set tasks that promote de-contextualised

rote learning and that narrow the curriculum to basic skills with low cognitive

demands (Kellaghan and Madaus, 1991; Darling-Hammond, 1994; Herman et al,

1997). According to Wiggins (1993, pp. 54-67) the use of 'secret tests' is a vestige of

the medieval past, a legacy of tests used as mere ‘gate keeping’ or as

punishment/reward systems, not as empowering and educational experiences

designed for displaying all that a student knows and understands.

Wood (1987) and Gibbs (1994) argue that there has already been a paradigm shift in

assessment from an essentially psychometric model (measuring individual

differences) to a more educational one (involving description and feedback to aid

learning). The focus of assessment has shifted from accumulated knowledge to students' behaviours and capabilities and to the meaningful integration of learning of all kinds. The

proliferation of service industries and the changing character of work have created

demands for transferable skills such as those of communication, information

retrieval, problem solving, critical analysis, self-monitoring and self-assessment. As

a result a fast-growing interest can be seen in more formative, holistic,

contextualised forms of assessment, often described as 'authentic' or 'performance'

assessment. In vocational training these are often referred to as 'competence-based’

(Wolf, 1995). It remains the case, however, that traditional forms of assessment are

not easily replaced, embedded as they are in complex histories, cultures and power

relations of societies. Furthermore, as Broadfoot (1998, p. 473) argues: '…these

traditional forms with their emphasis on scientific rationality are so pervasive in

modern societies that they blind us to the potential for alternative forms of

assessment'.

For Torrance (2000) the way forward requires assessment practices that accommodate divergent experiences, pupil intentions, wider dialogue and

curriculum knowledge that can be local. He states that what justifies and legitimates

educational assessment is a meta-narrative of social and economic progress, giving

rise to a method for identifying individuals worthy of accreditation and selection

within a meritocratic system. However, in the context of an increasingly

multicultural society with global forms of communication, it is difficult to state that

one kind of selection of knowledge is more relevant than another. Torrance also raises doubts about the possibility of obtaining a 'truer' score, asking whether '…a true score can ever be produced for and adhere to an individual?' For Torrance

(2000, p. 179) knowledge selections must be locally contingent, and assessment

results must be a function of the interplay of task, context, individual response and

assessor judgement. 'Thus provisional descriptions of multi-faceted achievements

would appear to be the appropriate goal for a system of educational assessment that

took such a challenge seriously’.

1.2.2.2. New approaches for external assessment

According to Armstrong (1994, p.110) ‘non-traditional methods of assessment

promote the value of art education by revealing a wider spectrum of student

achievements'. Wiggins (1993, pp. 54-67) claims that an authentic assessment system

has to be based on known, clear, public, non-arbitrary standards and criteria. He

considers that assessment of thoughtful ‘mastery’ should ask students to justify their

understanding and craft. Furthermore he states that an authentic education makes

self-assessment central.

Instead of asking students to respond to a test in order to determine, for example, their

art learning, Gardner (1989) has proposed that learning ought to be organised around

meaningful projects carried out over significant periods of time, and that evidence of

student thinking ought to be collected throughout that period, in what he has called a

‘process-folio’. His process-folios include concept sketches and early drafts of ideas,

along with final products. Reflection sheets, interview forms, and journals are used to

help students, teachers, parents, and administrators trace the evolution of particular

ideas as well as general progress over time. Assessment of students' learning is

based more upon the process than the final product. In this way the assessment more

successfully captures the way in which learning occurs within the discipline. The

term, which has been increasingly used in the United States in recent years to

describe this kind of approach, is 'authentic assessment' (Zimmerman, 1992). It is an

expression that has arisen in response to a growing recognition that standardised tests

capture only a small dimension of human intellectual functioning and it reflects the

kinds of changes to assessment practice suggested by Gardner (1996).

The impact of Gardner’s ideas on assessment has been more radical in the American

context, where a strong tradition of standardised tests derived from IQ testing

practice exists. But in American art education, assessment has not been a particular

concern of the recent discipline-based education movement. In that context, Harvard

University’s Project Zero and the Arts Propel (Blaikie, 1994; Gardner, 1996)

encouraged new approaches to performance-based testing where assessment is

central to curriculum planning. However, in Europe, and especially in England, a different tradition of formal assessment exists in art education, where grading judgements are based on the evidence provided by students' coursework, including both process and products (TGAT, 1988).

1.2.2.3. Records of achievement

Torrance (2000) argues that the use of records of achievement as an assessment

instrument has changed the field of assessment in general. He states:

Perhaps more intriguingly, the Record of Achievement movement, which


has ebbed and flowed over the past twenty years or so, has at times
sought to elevate the voice of the pupil over and above simply providing

employers and other third parties with a more comprehensive teacher-
written report on pupils' achievement (see Broadfoot et al, 1988). The
movement has also given significance to achievements outside the
mainstream academic curriculum and indeed, on occasions, to
achievements outside school. In this we can certainly see evidence of
attempts to accentuate the importance of the local and allow the voice of
the pupil to be heard.... (p.185).

Records of achievement have been criticised for potentially exposing pupils to even

more scrutiny than traditional examinations, since every aspect of a pupil's life, in

and out of school, becomes potential material for inclusion (Hargreaves, 1989,

quoted in Torrance, 2000, p.185). Despite such potential disadvantages, Torrance

(2000, p.185) concludes: 'I am minded to suggest that the Record of Achievement

movement is long overdue, as long as due attention is paid to identifying a wide

range of accomplishments, as witnessed by a wide range of adults in and out of

school, and is accompanied by a much more self-conscious commitment to

privileging the pupil'.

1.2.2.4. Portfolio

Closely related to records of achievement are portfolio tasks. According to

Klenowski (2002, p.26) a portfolio is a collection of work that can include a diverse

record of an individual’s achievements, such as results from authentic tasks,

performance assessments, conventional tests or work samples. A portfolio documents

achievements over an extended period. The development of the portfolio involves

documentation of achievements, self-evaluations, process artefacts and analyses of

learning experiences, strategies and dispositions. Generally, the individual identifies

work from an accumulated collection to illustrate achievement and to demonstrate

learning for a particular purpose, such as certification, summative or formative

assessment. Careful critical self-evaluation is an integral process and involves

judging the quality of one’s performance and the learning strategies involved. The

individual’s understanding of what constitutes quality in a particular context and the

learning process involved is facilitated by discussion and reflection with peers,

teachers, or others during substantive conversations, exhibition or presentation of

learning. Using portfolios for the assessment of students may present several advantages.

As Zimmerman (1992, p.17) points out, through portfolios students can be observed

taking risks, solving problems creatively, and learning to judge their own

performance and that of others.

According to Eisner (1998), the tasks used to assess students should reveal how

students go about solving a problem, not only the solutions they formulate. He points

out:

The new assessment practices will need to provide tasks that resemble in
significant ways the challenges that people encounter in the course of
ordinary living. This will require an entirely different frame of reference
for the construction of assessment tasks than the frame we now employ
… the challenge to assessment is to somehow create tasks that give
students opportunities to display their understanding of the vital and
connected features of the ideas, concepts, and images they have explored.

Portfolios and records of achievement seem to present advantages for art assessment

because they can be used as assessment instruments providing different sources of

information through several tasks. Portfolios include process and product, together with evidence of self-evaluation, substantive conversation, and reflective thinking and practice.

The strength of using portfolios derives from students’ ownership of the learning

process. This contrasts with other forms of assessment, where the teacher or examiner is in full charge of the process. The student in portfolio assessment is actively engaged

in thinking about his or her own work, the learning involved and the progress he or

she is making (Klenowski, 2002, p. 110). Use of portfolios encourages students to be

active creators rather than passive receivers of knowledge, giving them opportunities

to examine and analyse their ideas, beliefs, and values in collaboration with others.

Portfolio use supports this process and helps to develop increased awareness of their

learning (Klenowski, 2002, p. 111). When students manage their portfolios they are

also developing important organising skills which serve to extend their sense of

responsibility and ownership of their work. Students are challenged to be creative

and inventive as they strive to develop a portfolio of work that captures their unique,

personal achievements.

However, some difficulties with implementing portfolio-based assessment have been

raised: extended tasks that encourage thinking and reasoning activities take time and

are open to different types of valid response. Klenowski (2002) points out some

obvious practical problems that need to be taken into account such as the in-service

training of teachers, raters or assessors; and the amount of assistance teachers

provide for students. Problems associated with portfolio implementation include the

significant resources required, ranging from the in-service training of teachers to

mark or grade the portfolios, support for students to understand the portfolio process

and increased workload for teachers and students.

1.2.3. Standards, criteria, attainment targets

Reliable assessment can be improved and facilitated by written criteria and defined

standards or attainment targets (Steers, 1996). Curriculum writers and examiners

frequently employ criteria to express the qualities valued in student artwork and to

guide the judgement of arts educators. The simple expression of the criteria used for

assessment in the arts does not necessarily provide an indication of standards.

Criteria express the qualities we might value in an object or performance, and

standards express the degree to which they exist. A criterion for judgement is not

quite the same thing as the specification of an achievement standard or attainment

target. Judgement of a work against pre-specified criteria does not necessarily mean

a good standard has been achieved, even if all the criteria have been evidenced in the

work.

Boughton (1999) argues that different cultures emphasise different criteria. Even

within a single cultural tradition, the criteria used for judgement may demand

different emphases according to the genre of the work, especially for perceived

‘originality’. For example, contemporary work using new technologies and recycled

imagery may raise different issues for judgement than work undertaken within

traditional styles and using older media. Art works and art students' performances are

not predictable, so comparisons of student outcomes with pre-ordained 'standards' or

'profiles' can be inadequate, not only because words do not do justice to visual

phenomena, but also because it is inappropriate to prescribe the precise qualities of a

potentially uncertain outcome.

Standards imply the existence of uniform exemplar answers. However in art and

design education such 'answers' are seldom the aim; rather, diversity of response

and outcome is sought. Boughton (1997) considers that: ' … the judgement task is

far more complex because the products being judged are expected to reflect a degree

of original thought, and some sense of the idiosyncratic [personal] characteristic of

their creators' (p.203). According to Boughton (1999) it is not possible, nor would it

be appropriate in a postmodern context, to describe a priori an ideal performance or

icon against which students' work is matched to determine a level of achievement.

He points out: '…what is much more appropriate is the use of criteria which express

conceptions of qualities thought to be desirable in the work' (p.343).

However, assessors cannot exercise judgement or use any assessment tools unless

they are clear about the criteria and standards to be used when making their

judgements. The use of criteria and holistic judgements, rather than standards

frameworks, may provide the degree of flexibility necessary to accommodate

variations of outcome in the students' work. Criteria provide guidelines for making

judgements. Any single criterion is able to indicate desirable qualities which may or

may not be relevant to a given work, or body of works. According to Boughton

(1999) criteria singly, or in combination, are constitutive elements, which serve to

provide an array of windows through which a judge may look. They do not provide a

definitive assessment matrix. In Boughton's view:

These windows may sharpen a judge's focus, but it still remains for the
judge to value the qualities of the work, to understand its genre and
context in which it was produced. An important element in the
construction of assessment policy in studio art that is most likely to be
responsive to variation of genre and cultural values is the employment of
criteria rather than standards for the guidance of examiners and a single,
holistic judgement following attention directed by the criteria (p.343).

The critical element in the definition of standards is the establishment of a (socially)

'recognised measure' which comes into being through some kind of imposed act of

authority, by accepted custom, or by simple agreement. Assessment is a crucial part

of arts education, particularly if educational managers and the community are to be

convinced that arts educators are able to reliably determine the quality of student

achievements. While national curricula focus on standards and prescriptions of

predictable content outcomes, the real issue is the problem of cultural relevance and

validity in assessment.

Eisner (1985) argues that criteria are essential for assessment:

In visual arts education, and particularly in studio work, the judgement


task is complex since the work being judged is normally expected to
show originality of thought, and convey some sense of the idiosyncratic
nature of the maker. It is not possible, nor would be appropriate to pre-
define exactly the ideal performance, or icon, against which students'
work is matched to determine level of achievement. It is much more
appropriate to use criteria which express conceptions of the qualities
thought to be desirable in the work.

However, it must be borne in mind that the expression of a criterion in words can be

reductionist. Often the most important qualities in art students' achievement are tacit

or elusive in nature, and they are very difficult to express in words. The nature of

'aesthetic response' is one of these. One temptation is to write criteria that can be

easily interpreted, rather than to attempt to write those that are more appropriate, but

are difficult to define, or are controversial.

For Boughton (1996) the problem with the use of criteria is that they express inexact

human values and are subject to various interpretations by judges who may hold

different views about the significance, or precise meaning of any given criterion.

The problem with the notion of art is that it too is a social construction
subject to various interpretations in different social, educational,
historical contexts. It is no wonder that education bureaucrats who value secure and predictable worlds, who lust after precision and reliability, become anxious about the validity and reliability of qualitative
judgements. In the age of educational reform, characterised by an
obsession with standards, reporting frameworks, and accountability
measures, the value of qualitative assessment information becomes more

and more difficult to justify to those who are preoccupied with statistics
and management strategies. At the same time as the industrialists and
politicians are demanding from educators increased precision in their
judgements, clear standards, and higher levels of accountability, the
impact of postmodern ideas is having the opposite effect in the world of
art (p.72).

According to Walling (2000): 'To be effective without being restrictive, standards

must be broadly, even loosely, cast to allow for diversity, for plural visions of what

art is and how art may be created' (p. 47). Localising standards is a necessary

component of assessment that is ends-driven. Walling points out:

Teachers must have in mind some goals that are measurable and then
design instruction that will move students toward the achievement of
those goals. This is a highly useful way of thinking about instruction, but
this approach also has limitations and, I would argue, should not be used
exclusively. In the visual arts, open-ended exploration and
experimentation – true creativity – are valid ends as well (p. 63).

For Walling a constructivist and postmodern approach to education can be in conflict

with traditional means of assessment, '…educators will need to consider carefully

how a balance can be struck between the need to define clear standards in advance of instruction [previously established] and the need for standards to emerge as a result of

constructing new knowledge and understanding' (p. 64).

In the case of studio art examinations, where practical skills are assessed, the

question of whether studio work should be compared to standards must be

considered. Schönau (1999, p.32) answers: 'Yes and no. No, it is not, as no two art

works are the same. Yes, in practice art teachers do use standards. They compare

what they see with what they have seen before…. Art teachers set their own criteria

and standards'. Schönau points out that

…the procedures used by the teacher or the standards themselves are


almost never public …studio work is judged on a series of criteria, but

how much does each criterion relate to the final score? What is the
maximum level of competency? It is well known practice that teachers
never give a maximum score to their students, because things can always
be more perfect (Schönau, 1999, p.32).

The nature of an artwork creates difficulties for assessment by criteria; an artwork

can be seen as an example of Gestalt – an integrated perceptual structure conceived

as functionally more than the sum of its parts. According to Schönau (1999, p.32) '…

judging studio work using criteria is not fair when it ends with adding up separate

scores. For how do the separate scores relate to the overall score?' However, Schönau argues for the use of national standards or minimum levels of competency when assessing at a national level. The problem is not whether standards, levels of competence, attainment targets or grade descriptors should be used in art and design examinations, but how they might be expressed in order to be commonly understood and applied. In

other words: how do assessors reach consensus?

1.3. Assessing the arts: current problems

1.3.1. The problem of standards and criteria

The difficulty of defining domains of knowledge in art and design education is not

the only problem in external examinations. Another important question is about the

possibility (or impossibility) of defining standards in art and design examinations and

the difficulty of defining unambiguous criteria in verbal terms. In a summative

assessment context different judges’ views present a problem for system managers.

According to MacGregor (1992) the criteria used by teachers to judge studio work show little variation in countries such as the United Kingdom, Canada and the Netherlands. Mostly, teachers employ some variation of three categories of criteria:

competencies to develop and interpret a theme, level of technical expertise, and

competencies to achieve sensitive personal expression through the use of a variety of

techniques and processes. However, it seems that criteria are sometimes confused with standards, attainment targets or marking schemes (Schönau, 1999). Furthermore, standards may not be written down. The marking schemes used in art and design

examinations are not always defined in explicit terms:

The examination committees, or the Inspectorate in France for that


matter, generate their own standards, based on comparisons through the
years. Each single assignment, criterion or question leaves open [sic]
when a student has failed. In the written exams for most subjects the
summation of scores on separate questions leads to a final score. This
score lays somewhere between nil and the maximum score. The cut-off
score – the score that is decisive for pass or fail – in most cases lays
somewhere halfway. This cut-off score one could call a standard, but it is
a mathematical standard that used in this way lacks a transparent basis on
what is considered essential. So the word ' standard' in this case has a
dubious meaning (Schönau, 1999, p.32).
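Schönau's point about the 'mathematical standard' can be restated formally. The notation below is introduced here purely to make the arithmetic of the passage explicit; it appears in none of the examination regulations discussed in this thesis. With n separately marked questions, the final score is a simple summation and the conventional cut-off sits halfway along the total mark range:

    % Final score: the sum of per-question scores s_i, each bounded by
    % that question's maximum mark m_i. The cut-off places the pass mark
    % halfway along the total mark range, irrespective of which questions
    % contributed the marks.
    S = \sum_{i=1}^{n} s_i, \qquad 0 \le s_i \le m_i, \qquad
    \text{pass} \iff S \ge \frac{1}{2} \sum_{i=1}^{n} m_i

Written this way, it is clear that the threshold is defined arithmetically rather than by reference to any particular pattern of demonstrated competence, which is precisely why Schönau describes the meaning of the word 'standard' here as dubious.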

1.3.2. The case for a consensual view of criteria

It is tempting to suggest that where coherent programmes of study are


supported by an agreed-upon range of appropriate attainment targets, it is
probable that appropriate assessment procedures to meet specific
perceived needs, rather than assessment for assessment's sake, will
emerge without the need for undue anxiety (Steers, 1996, p. 189).

The necessity of a well-designed assessment instrument and scoring process

establishing clear and unambiguous criteria and attainment targets increases in the

field of art and design where teachers use different underlying concepts. Beattie

(1997, p. 229) points out that '…establishing criteria, objectives, and standards can

help a state or school determine what it values and seeks to accomplish, an authentic

reform initiative'. But the real problem in art and design examinations is not the type

of criteria but how they are written and applied in order to facilitate validity and

reliability. A consensual interpretation of criteria is a necessary condition in art

examinations and this may be possible through the development of assessment

communities (Boughton, 1997; Freedman, 2003). Boughton (1997, p.207) argues

that there is a need to reach a shared understanding of cultural contexts, reciprocal

understanding of competing ideologies, and agreement about the meaning of criteria

and objectives.

…instead of placing our faith in written standards we should,


instead, employ a process of judgement, based upon the notion of
community [judges/ examiners] as arbiter of quality, to determine
standards in studio art education. Such a procedure is consistent with
arts practice in the broader community and responsive to the
dynamic nature of the discipline (Boughton, 1997, p. 209).

In a study of teachers’ assessments of primary children’s classroom work in the

creative arts, carried out in 13 primary schools in Leicestershire and Leicester,

Hargreaves et al (1996) conclude that teachers may reach agreement about the

quality of students’ artwork if they achieve a common understanding about criteria

and constructs:

This study, although on a relatively small scale, has some important


implications for future work on arts assessment. First, it has
demonstrated that when teachers are given the opportunity to clarify their
ideas and the ambiguities in the language used to describe children’s
work, they are capable of substantial agreement about the quality of
different pieces of work from different pupils, and apparently make these
assessments in unidimensional evaluative terms. Second, the more
explicitly teachers define the end product of the activity which they set,
the more rigorous they seem to be in assessing the quality of this work
(Hargreaves et al, 1996, p.210).

1.3.3. The problem of teacher in-service training

Another current problem in assessing the arts concerns teacher in-service training.

Eisner recommends a more qualitative orientation to educational evaluation that

focuses on educational connoisseurship and educational criticism. Connoisseurship

is taken as the art of appreciation, the ability to define the quality of an object or

environment (Armstrong, 1994). Eisner (1985) observes that assessors should be

qualified artists and teachers who know how to interpret the visual terms used in the

criteria. He claims that assessors need enough experience and knowledge

(connoisseurship) to recognise creative work in order to make a holistic judgement.

Therefore, by this account, assessors should act like art critics.

Steers (1994, p.296) also considers that art and design education in schools requires

teachers with a high degree of professionalism and a real understanding of their

subject, supported by adequate resources of time and materials. The teachers' role in

the assessment process is essential and without their collaboration and agreement

there can be no reform or innovation. Teacher ownership or commitment to the

assessment depends on the quality of teacher in-service training. According to Steers

(1996): ‘This ownership only can be effective if the system provides an efficient and

continuous teacher training, ensuring connoisseurship and professionalism of

classroom teachers and assessors’. Assessor training is an important aspect of

connoisseurship, but not all systems provide such in-service training (Schönau, 1999,

p.33).

Processes of teacher in-service training and strategies for reaching agreement about

criteria and standards, such as moderation and standardisation, are further discussed in

the next chapter.

Summary
In this chapter, general and specific art and design assessment literature was located

and discussed. Several key roles for assessment were identified: as a way to provide feedback and motivation for teachers and students; as a mechanism of control; and as a

device for students' selection and certification. Different functions of assessment

including formative and summative were described. Assessment instruments such as

tests and alternative forms of assessment, for example, records of achievement and

portfolios were briefly analysed in terms of their advantages for art assessment and it

was evident that tests are not normally the best instrument for assessment in the arts.

Specific assessment problems of the arts were described and the following issues are

considered crucial for this study:

• Assessment instruments in the arts should allow students to reveal a wide

spectrum of achievements, including evidence of process and products.

'Alternative' forms of assessment, such as portfolios and records of achievement, present advantages for art assessment.

• The nature of art raises difficulties in defining a shared language. There are

no single 'correct answers' and it is difficult to define criteria, standards and attainment targets precisely. However, there is a need to accommodate a

wide range of often unanticipated outcomes.

• Developing a consensual view of criteria and standards provides a possible

way to overcome such difficulties. The creation of a community of judges,

through teacher in-service training, might provide a way to improve the

reliability of assessment.

Chapter 2

Reliability, Validity, Impact, Practicality

This chapter analyses some particular qualities of examinations identified as

essential for the design and evaluation of assessment, namely: reliability,

validity, impact and practicality. The chapter ends with a conceptual

framework for addressing a list of questions concerning potential sources of

invalidity, unreliability and negative impact in relation to art and design

examinations.

2.1. Reliability

An instrument for assessing art learning by tests or other means must demonstrate

reliability or consistency. Reliable assessment should enable students with similar

abilities in different geographical regions to obtain the same result. In test literature

the traditional definitions of reliability focus on the consistency of results: a test is

reliable if two identical students obtain the same results (Cardinet, 1993; Ribeiro,

1997; Valadares & Graça, 1998). However in reality it is unlikely that there will be

identical candidates (on two or more occasions) and in order to verify consistency

between test results some proxy measures are used such as: (a) test-retest reliability,

which is based on gaining the same marks if the test is repeated; (b) mark-remark

reliability which looks at the agreement between markers; and (c) parallel

forms/split-half reliability in which similar tests give similar marks (Gipps & Stobart,

1997). Reliability is also concerned with evaluation of the marking or rating process

in terms of the consistency of assessors' judgements: inter-rater reliability (an

estimate of reliability based on the degree to which different assessors agree in their

assessment of candidates’ performance) and intra-rater reliability (the degree of

agreement between two assessments of the same sample of performance made at

different times by the same assessor).
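These proxy measures can be made concrete with a small computation. The following sketch (in Python, with hypothetical marks on a 0-20 scale) estimates inter-rater reliability as the correlation between two assessors' marks for the same ten portfolios, together with their exact agreement rate; intra-rater reliability would be computed in the same way from one assessor's marks on two occasions. The data and scale are illustrative assumptions only.

def pearson(xs, ys):
    # Pearson correlation between two equal-length lists of marks.
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x ** 0.5 * var_y ** 0.5)

# Hypothetical marks awarded by two assessors to the same ten portfolios.
assessor_a = [14, 12, 17, 9, 15, 11, 18, 13, 10, 16]
assessor_b = [13, 12, 16, 10, 15, 12, 17, 14, 9, 16]

r = pearson(assessor_a, assessor_b)
agreement = sum(a == b for a, b in zip(assessor_a, assessor_b)) / len(assessor_a)
print(f"inter-rater correlation: {r:.2f}")   # values near 1.0 indicate consistency
print(f"exact agreement rate: {agreement:.0%}")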

Reliability as understood by test developers may not be as useful for teachers, and

especially for art teachers who usually assess variable outcomes of coursework or

portfolios. A broader conception of reliability might be useful. For example, William

(1992) argues that reliability is only a small part of the dependability of an

assessment instrument in providing accurate information about the performance of

individual students. He states that it is more appropriate to work with the concepts of

disclosure and fidelity. The disclosure of an assessment is the extent to which it

produces evidence of attainment from an individual in the area being assessed. In

general,

there should be reluctance to conclude that failure to elicit evidence of attainment

means that the student has not attained: very often, all it means is that the question

has not been asked in the right way for that individual, and the verdict cannot be

proven. The fidelity of an assessment is the extent to which evidence of attainment

that has been disclosed is observed, correctly interpreted and accurately recorded.

The concept of fidelity will be developed later through the discussion of validity.

2.1.1. Problems of reliability in art examinations

The major problems associated with reliability in art examinations can be attributed

to: (1) at the a priori stage, the construction of the examination instrument and, (2) at the a posteriori stage, the implementation of the instrument, where aspects such as teachers', assessors' and examiners' training, and the judgement of students' work through interpretation of criteria, all need to be taken into account.

2.1.1.1. Instrument

In art and design examinations tests are seldom the sole instrument. Several

instruments can be used and complement each other. This may present advantages

for the reliability of results because it provides several assessments of the same trait.

Eisner (1991) advocates techniques such as systematic sampling in assessment

procedures as a process to obtain consistency through corroboration: 'In seeking

structural corroboration we look for recurrent behaviours or actions... that inspire

confidence that the events interpreted and appraised are not aberrant or exceptional,

but rather characteristic of the situation'. Systematic sampling allows a teacher to

regularly observe those behaviours that occur consistently or those that suggest

change with the intervention of teacher-guided experience. The teacher can

repeatedly assess the same thinking process (Armstrong, 1994). Gipps & Stobart

(1997) also refer to systematic sampling as a way to improve assessment reliability:

Curiously, another form of reliability – repeated measures – is often


neglected. This is unfortunate because, in discussing the reliability of
coursework and of teacher assessment, the fact that performance is
assessed on a number of occasions makes the measurement in this respect
more reliable than a 'one-off' test. Indeed, the perception of examinations
as reliable and coursework as unreliable usually has more to do with
opinion than with evidence. It is often the case that reliability of
examinations is over-rated and that of coursework under-rated (p.42).

Here, Gipps & Stobart (1997) raise an important aspect of reliability that is

particularly acute in art and design. In fact coursework or portfolios usually provide

more reliability than a single test because they allow repeated measures of the same

knowledge, skills, understanding and art making over a period of time.
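The gain from repeated measures can be quantified with the Spearman-Brown prophecy formula, a standard result of classical test theory rather than anything specific to the studies cited here. A minimal sketch, assuming a single timed task with reliability 0.55 and portfolios that aggregate several comparable pieces of coursework:

def spearman_brown(r_single, k):
    # Reliability of a composite of k comparable assessments, each with
    # reliability r_single (classical test theory).
    return (k * r_single) / (1 + (k - 1) * r_single)

for k in (1, 2, 4, 6):
    print(f"{k} assessment(s): composite reliability = {spearman_brown(0.55, k):.2f}")
# Prints 0.55, 0.71, 0.83 and 0.88: repeated measures of the same skills
# yield a markedly more dependable composite judgement than a one-off test.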

2.1.1.2. Assessor and examiner training

Sources of unreliability can be found not only in the design of an examination

instrument but also in the use of methods and procedures that are unclear or weakly

defined and which result in inconsistency of the whole process. For example the use

of vague scoring instruments such as performance and grade descriptors, criteria and

standards and the consequential problem of different interpretations of such

instruments to judge students works. Boughton (1999, p. 72) considered that the arts

do not easily reveal their quality or character through measurement. Standardised

measurements of learning in the arts describe very little about highly personal

qualities of performance. Boughton (1997) argued that agreement between judges is

essential for reliability in art assessment. He states:

A more appropriate source of ideas to determine standards in arts


education is to look to the arts in the broader community for guidance.
One important cue is the nature of institutionalised debate about what
counts as good work in the community. Similarly, challenge and debate
among teachers has useful institutional potential in arts assessment
contexts. In other words the community becomes the arbiter of quality.

2.1.1.3. Judgements and moderation

Another important procedure in large-scale qualitative assessment methodology

in the arts is the use of moderation (Askin, 1985; Chalmers, 1981; Steers, 1987).

This procedure is extremely important where reliability of scores is critical.

Moderation is a system of assessment using several judges to assess students' works.

According to Steers (1996, p.185): ‘Moderation is the means by which the marks of

different teachers in different centres are equated with one another and through

which the validity and reliability of assessment are confirmed’. Moderators calibrate

teachers’ marks in relation to an agreed standard. Moderation improves reliability of

assessment results and reduces the variations of the individual assessor's

interpretation of criteria (Schönau, 1996; Steers, 1994; Blaikie, 1996; Heyfron, 1986).

Moderation occurs in assessment of art on a national scale, including in England

(Steers, 1988), Australia (Boughton, 1994) and the Netherlands (Schönau, 1996), in

international assessments that take place for the International Baccalaureate

examination (Chalmers, 1981) and for the Advanced Placement examinations used in

the United States of America to select some students for progression to higher

education (Askin, 1985). This procedure of marking students' examinations requires

expensive assessor training and resources; however, it is important to consider

the advantages of such systems in order to minimise the unreliability of examination

results.
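To illustrate what calibrating teachers' marks against an agreed standard can involve, the sketch below shows one generic form of statistical moderation: a moderator re-marks a sample of a centre's portfolios, and all of the centre's marks are then linearly rescaled so that the sample matches the moderator's mean and spread. This is an illustration with hypothetical marks, not the procedure of any particular awarding body; much moderation in art and design is judgemental rather than statistical.

from statistics import mean, stdev

def moderate(all_marks, teacher_sample, moderator_sample):
    # Rescale a centre's marks so that the moderated sample agrees with
    # the moderator's mean and standard deviation.
    scale = stdev(moderator_sample) / stdev(teacher_sample)
    m_teacher, m_moderator = mean(teacher_sample), mean(moderator_sample)
    return [round(m_moderator + (m - m_teacher) * scale, 1) for m in all_marks]

# Hypothetical data: the moderator is slightly more severe than the teacher.
teacher_sample = [16, 12, 18, 10, 14]
moderator_sample = [15, 11, 17, 10, 13]
centre_marks = [16, 12, 18, 10, 14, 17, 9, 13]
print(moderate(centre_marks, teacher_sample, moderator_sample))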

2.1.1.4. Threats to reliability summarised

To sum up, threats to reliability in art and design are usually a consequence of a

deficient assessment instrument, ineffective assessment procedures and lack of

assessor training. Standardisation meetings for assessors, when carefully planned and

implemented, can improve examination reliability because they serve as assessor

training and help to foster a common understanding of standards. Moderation and

post moderation procedures might also reduce sources of unreliability by

encouraging informed debate about standards.

2.2. Validity

Validity is used in the context of this research as a unitary concept: unless the type of

validity is specified in the text the term validity is used to include all types of

validities (content, face, response, construct, theory-based, criterion-related, etc.).

Validity concerns the appropriateness and meaningfulness of an examination in a

specific educational context and the specific inferences made from examination

results. The validation process is the process of accumulating evidence to support

such inferences. Whether an examination is valid hinges on the question of whether

or not the examination assesses what it is supposed to assess, usually in relation to

what has been taught. Assessment of either process or product requires critical

scrutiny of the soundness of the interpretations that will be made from the derived

score. Questions and issues surrounding the appropriateness, meaningfulness, and

usefulness of inferences from scores are concerns of validity. Validation can be

sought through two main stages: a priori validation, involving scrutiny of the examination paper or examination instructions before they are put into use, and a posteriori validation, involving investigation of the way the examination appears to have worked after the event through both the analysis of scoring data and qualitative investigation.

Only as a form of shorthand is it legitimate to speak of the validity of a


test [or examination]. We need first to look at validity in the design and
development stage where the test is constructed based on an empirically
derived theoretical specification. Next the implementation stage when the
test has been administered, we are able to look at the data generated and
apply statistical analyses to these to tell us the degree to which any
inferences are well founded. Finally we can collect data on events after
the test to shed further light on the well foundedness of those inferences
and value for end users of the information provided (Cronbach, 1990,
p.150).

2.2.1. Some types of validity

Validity is often discussed in test literature. Beattie (1996) makes the point that '…

although validity is currently viewed as a unitary concept, the scope of questions and

issues is broad subsuming all of the historically discrete validity types (e.g., content,

criterion-oriented, construct, face, and the like)'. In assessment and testing literature

several types of validity are defined (Messick, 1971; Cronbach, 1971; Anastasi,

1988). For the purposes of this research seven different types of validity have been

selected, which are discussed in the following sections.

2.2.1.1. Content validation

Content fidelity is concerned with the faithfulness of the task to the


domain or object that it purports to assess. If procedural knowledge is to
be evaluated, then the selected task must cover the dimensions of the
processes adequately. On a broader level, if holistic knowledge of an art
discipline is the object of assessment, then examined processes and
products per se must sufficiently represent the art discipline's methods of
inquiry or content, respectively. A review of the literature in the field and
discussions with peers are means for determining most valuable
knowledge (Beattie, 1996, p. 52).

Content validity concerns the coverage of appropriate and necessary content

(whether the assessment instrument covers the skills necessary for good performance, or all the aspects of the subject taught). Content validation is also about ensuring the authenticity of the tasks or, in other words, whether the assessment instrument is

representative of some domain of artistic competencies used in real life.

Usually, content validation is carried out through qualitative experts' judgments of

examination instruments. Sources of invalidity can be: assessment instruments that

are constructed using a poor sample of artistic competencies or domains; tasks and

procedures that do not enable students to reveal their competencies in an authentic

way; failure to restrict inferences from examinations results to what an examinee can

do in the specified content domain.

2.2.1.2. Face validation

Face validation is also about checking if the assessment instrument measures what it

is supposed to measure, but, as opposed to content validity, which is judged by

experts, face validity is judged by various groups of non-experts who come into

contact with the instrument, taking into account feedback from test takers and other examination stakeholders. Threats to face validity include unfamiliarity of

format and lack of authenticity in the assessment instrument.

2.2.1.3. Response validation

Response validation is the extent to which examinees respond in the manner

expected by the examination developers. It is a process of inquiry about the clarity of

the examination paper, for example, whether or not students understand the

instructions and if the tasks motivate them. Using the designation 'straightforwardness', Beattie (1996) describes aspects of response validity:

Straightforwardness, as it relates to either assessment


of process or product, is important to consider. Every
aspect of the task (format, directives, standards,
scoring criteria) should be transparent enough for
students to understand what is expected of them. Peer
review and pilot testing of tasks help to uncover
problems of transparency.

To minimise sources of response invalidation it is important to evaluate the

assessment instrument by conducting a trial version of it with

a selected sample.

2.2.1.4. Washback validation

Washback validation is concerned with the influence of the examination process

on the teaching and learning situation and checking if the instrument and

underlying model of artistic skills and knowledge are related to the learning goals.

Washback validity is also concerned with the tasks and scoring process, verifying if

they clearly reflect the specified model and domains – in other words, if the tasks

enable examinees to reveal their artistic skills and knowledge in a reasonably

authentic way.

Washback validity can also be referred to as backwash (ALTE, 1998), which is

also related to the impact of an assessment instrument on classroom teaching.

Teachers may be influenced by the knowledge that their students are planning to take

a certain test, and adapt their methodology and the content of lessons accordingly to

reflect the demands of the test. The result may be positive or negative. Sources of

washback invalidity include: basing the assessment instrument on an underlying

model of artistic skills and knowledge which are divorced from learning goals; tasks

and scoring procedures that do not fully and clearly reflect the specified underlying

model and domains of artistic skills and knowledge; tasks and scoring procedures

that do not enable the examinees to actively engage their artistic skills and

knowledge in a reasonable and authentic way; and tasks and scoring procedures that

encourage and reward aspects of performance that draw on irrelevant competencies

(Hasselgren, 1998).

2.2.1.5. Criterion-related validation

Criterion-related validation is an a posteriori evaluation of the examination that can

be checked by two main processes: predictive validity and concurrent validity.

Predictive validity relates to whether the test accurately predicts some future

performance. Concurrent validity is concerned with whether the assessment

instrument correlates with, or gives substantially the same results as, another

assessment instrument for the same skill.
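Both checks reduce to a correlation between paired scores. A minimal sketch, assuming hypothetical paired data were available (statistics.correlation requires Python 3.10 or later); concurrent validity would be estimated in the same way against another contemporaneous assessment of the same skill:

from statistics import correlation  # Python 3.10+

# Hypothetical paired data for ten students: age-18 examination marks and
# the same students' first-year grades in higher education.
exam_marks = [18, 11, 15, 9, 16, 13, 12, 17, 10, 14]
first_year_grades = [17, 12, 14, 10, 15, 12, 13, 16, 11, 13]

r = correlation(exam_marks, first_year_grades)
print(f"predictive validity coefficient: {r:.2f}")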

When designing an assessment instrument it is important to verify whether or not the

test is measuring unrelated abilities. Criterion-related validation is threatened by the

use of external criteria that measure different competencies from the assessment

instrument in question and by failing to look for evidence to ensure that the

assessment instrument is not measuring irrelevant competencies. Messick (1989,

p.10) has argued:

We fail to serve the demands of validity if we try only to correlate


simplistic tests with other tests (even other performance tests). We cannot
use content validity procedures alone, if the aim is to capture the essential
'doing' of the ultimate performance and the most valid, contextual
discriminators for assessing 'doing'.

In describing the typical validation procedures that involve mere correlation between

test scores and criterion scores, Messick (1989, p.10) notes that 'criterion scores are

measures to be evaluated like all measures. They too may be deficient in capturing

the criterion domain of interest’. The solution, he argues, is to evaluate the criterion

measures, as well as the tests, in relation to construct theories of the criterion domain.

2.2.1.6. Validation related to examination bias

Validation related to examination bias or equity of an examination is concerned with

the verification of the assessment instrument, checking whether the tasks and procedures

place one or other group of examinees at an advantage. Sources of bias can include:

cultural background, background knowledge, cognitive characteristics, native

language, ethnicity, age and the gender of the examinees.

A powerful case needs to be made that the task for assessing process or
product is clearly unbiased and fair for all students involved. Aspects of
content and format might favour some group over another. The emphasis
on written or oral descriptions of learning processes in reflective journals
may be unfair to disadvantaged students, minorities, or non-standard
language speakers. A gender bias might also occur. Females tend to be
more reflective than males; therefore, reflecting on their processes might
be easier for them. An example of a contextual bias that might occur with
respect to the evaluation of an art product stems from different emphases
art educators place on product assessment criteria. In one context, the
teacher may consistently emphasise formal criteria, while in another
context, value is placed on conceptual and affective criteria.
Critical to the assessment of procedural knowledge is the opportunity
afforded all students to learn and practice methods of inquiry
(Beattie, 1995, pp. 52-53).

In order to identify points of bias it is important to ask experts to review the task prior to the assessment and also to analyse the responses after assessment.
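A crude statistical screen can complement such expert review, as sketched below with hypothetical marks: a large standardised gap between two groups on the same task flags it for closer scrutiny, although a gap does not by itself prove bias, since groups may also differ in preparation or opportunity to learn.

from statistics import mean, stdev

# Hypothetical marks on the same task for two groups of examinees.
group_1 = [14, 16, 12, 15, 13, 17, 11, 15]
group_2 = [11, 13, 10, 12, 14, 9, 12, 11]

# Cohen's d: the standardised difference between the two group means.
pooled_sd = ((stdev(group_1) ** 2 + stdev(group_2) ** 2) / 2) ** 0.5
d = (mean(group_1) - mean(group_2)) / pooled_sd
print(f"standardised mean difference (Cohen's d): {d:.2f}")
# A |d| well above 0.5 would merit expert review of the task's content and format.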

2.2.1.7. Construct validation

Construct validity relates to whether the examination is an adequate measure of the

construct – that is, the underlying skill, knowledge or understanding being assessed.

Important to the development of an assessment is a clear and detailed definition of

the construct. The term construct can be viewed as definitions of abilities that permit

us to state specific hypotheses about how these abilities are, or are not, related to

other abilities, and about the relationship between these abilities and observed

behaviour. Two aspects of a construct validation must be considered: (1) a posteriori

empirical and (2) a priori theoretical (Weir, 2004). Anastasi (1988) describes

specific empirical techniques that can contribute to construct validation related to test

validation. For example: correlations with other tests; factor analysis; internal

consistency measures; and convergent and discriminant validation. According to

Anastasi it is only through empirical investigation of the relationship of test scores to

other external data that we can discover what a test measures.
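One of these techniques, the internal consistency coefficient Cronbach's alpha, is easy to illustrate. In the sketch below the rows are candidates and the columns are hypothetical marks on four criteria intended to tap the same underlying construct:

from statistics import variance

scores = [
    [4, 5, 4, 3],
    [2, 3, 2, 2],
    [5, 5, 4, 5],
    [3, 2, 3, 3],
    [4, 4, 5, 4],
]

k = len(scores[0])  # number of criteria
item_variances = [variance([row[j] for row in scores]) for j in range(k)]
total_variance = variance([sum(row) for row in scores])
alpha = (k / (k - 1)) * (1 - sum(item_variances) / total_variance)
print(f"Cronbach's alpha: {alpha:.2f}")  # values near 1 suggest the criteria
                                         # cohere as measures of one construct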

Construct validation can be viewed as the unifying concept of validation integrating

considerations of content, criteria and consequences (Messick, 1992). According to

Messick, two major threats to construct validity are construct under-representation

and construct irrelevancies. The more fully examination developers are able to

describe the construct that the examination instrument is attempting to measure at the a

priori stage, the more meaningful might be the statistical procedures contributing to

construct validation that can subsequently be applied to the results of the

examination. According to Weir (2004) construct validation of an assessment

instrument can be checked using primary or secondary evidence to ensure that

constructs, descriptors of performance and the scoring system are adequate. A trial

version of the assessment instrument is recommended in the a priori stage of

examination design.

Examination developers should posit a theory of the construct of artistic skills,

knowledge and understanding as the basis for designing examination instruments.

Furthermore, a methodology for exploring the validity of examinations is essential

in order to provide tools and empirical evidence to support the instruments and

processes. Construct validity can be threatened by faulty or incomplete description

of the abilities or traits to be assessed and lack of empirical evidence supporting

the creation of descriptors of performance and division of constructs in the

scoring system.

2.2.2. Problems of validity in art examinations

The major problems of validity in art and design examinations are related to: (1) the

design of the assessment instrument, (2) the examination’s underlying conceptual

model and domains of competencies and, (3) a consensual concept of the most

important aspects of art and design.

2.2.2.1. Assessment instrument

In art and design, one single event such as a short period examination seldom

allows the valid assessment of students' artistic skills, knowledge, and

understanding (Burkhart, 1965; Chalmers, 1990; Zimmerman & Zurmuehlen,

1987; Dorn, 1988; Efland, Koroscik & Parsons, 1991). A range of tasks and

students' works are necessary. Beattie (1996) has argued that:

A single process or product is not exhaustive of an art discipline's


methods of inquiry or content, and to infer anything about broad
knowledge in a discipline from one process or one product is
problematic. Neither is a single score on a process a strong indicator of
knowledge about the process. Several measures increase the chance of
determining an accurate score and making an appropriate interpretation
(p. 52).

The choice of the assessment instrument is crucial to attain validity and also

reliability of examinations. In the case of art and design, the complexity of the

subject raises problems about the use of traditional assessment instruments structured

principally on basic learning strategies such as repeating, recalling and ordering.

2.2.2.2. The examination underlying model

Examination developers need to be explicit about the nature of the examination, while acknowledging concepts such as artistic skills, knowledge, understanding and

judgement. For the purpose of this research it might be helpful to have a sharper

definition of whether the intention in art and design examinations is to assess student

performance or student ability. In the English report of the Task Group on

Assessment and Testing (TGAT, 1988) it was stated:

There has been some misunderstanding about the assessment of


'ability'…. We had intended to confine our proposals to the assessment of
'performance' or attainment and were not recommending any attempt to
assess separately the problematic notion underlying 'ability'. If 'ability'
were to be assessed, its meaning would have to be carefully defined; and
the problem with defining it without making it merely a measure of a
particular type of performance is hard to solve (TGAT, 1988, p.2)

Beattie (1996) cited Glaser’s (1991) dimensions of competence. Glaser has

identified four dimensions of developing performance that can be useful concepts for

assessment:

- Coherence of knowledge (i.e., experts are better able to structure their


knowledge in meaningful, interrelated ‘chunks’ [p. 26] rather than
fragmented, isolated bits of information, and can access this knowledge more
readily);
- Principled problem solving (i.e., experts recognize underlying principles and
patterns in a problem);
- Usable knowledge (i.e., experts progress beyond content or declarative
knowledge to procedural to goal-oriented knowledge);
- Self-regulatory skills (i.e., experts monitor their own performances).

Glaser’s description of these four dimensions is particularly relevant to the

establishment of criteria, assessment objectives, and consequently to the design of

appropriate assessment instruments for art and design. The dimensions are

appropriate for the specialist field of art and design where knowledge and

understanding, problem-solving skills and aesthetic appreciation all need to be

evidenced.

When an assessment instrument is designed, a coherent theoretical model of the art and design curriculum should underpin it for the purposes of theory-based validity. The

assessment instrument should cover the artistic skills, knowledge and understanding

necessary for good performance, or all the important aspects of the subject taught

within that model. But it is widely agreed that art and design is a subject where there

are no single or ‘correct’ answers. Boughton (1997) notes that the problem with

judgements of artistic outcomes is that there is no single exemplar or icon against

which to judge a given student's performance. Originality, expressiveness, personal

interpretation and creative solutions, which are usually highlighted in attainment

targets and criteria, are influenced by trends and fashions in art. Moreover in the light

of postmodern theories, the nature of art defies attempts at a tidy definition, and

competing definitions of art may need to be brought to bear in judgements of quality.

Art and design examinations are concerned with the judgement of


performance through the appreciation of students' artistic knowledge,
understanding and skills. 'Two judges viewing the same work may hold
different opinions about its value. A judge with modernist values, for
example, may reject eclecticism as unoriginal, and regard copying as
evidence of technical and imaginative ineptitude; whereas a judge
holding postmodern values may reward eclecticism and copying, because
both are appropriate and consistent with contemporary art practice. At the
cutting edge of artistic exploration these dilemmas will always exist in
judgment about student work. Because they are matters of value, and not
fact, they are difficult to resolve' (Boughton, 1996, p. 83).

However the problem of different visions of art and design can be overcome by

agreement and consensus between the users. The scoring methods should be

constructed around agreed concepts of art and design students’ performance.

The underlying model of skills, knowledge and understanding being assessed should

be carefully defined in advance, seeking the empirical evidence supporting such a

model (Weir, 2004). The agreed aims and objectives of art education are essential

elements to take into account in the development of art and design examinations.

One necessary condition for the success of the examination is a ‘high level of

consensus about the aims of art education’ (Steers, 1996, p. 189), rather than

imposed policies underpinning its constructs and contents.

2.3. Impact

Assessment shapes institutional learning. Different forms of assessment


influence what is taught and how it is taught, what and how students
learn. Through the twentieth century, many educationists have deplored
the effects of public examinations on the quality of learning in English
schools. Systems of assessment have been criticised for putting a
premium on the reproduction of knowledge and passivity of mind at the
expense of critical judgement and creative thinking (Broadfoot, 1996,
p.175).

The impact of an assessment process is concerned with the effect created by that

assessment, both in terms of influence on general educational processes, and in terms

of the individuals who are affected by test results. The impact of an assessment

instrument and procedures includes all sorts of consequential validities and washback.

Examination formats and selections of syllabus content are not value-free; their effects

are both educational and societal. The former refers to the changes as a result of

examinations on curriculum, teaching methods, learning strategies, materials,

assessment practices and knowledge examined, while societal effects are concerned

with the impact of examinations on gate keeping, ideology, ethicality, morality and

fairness.

2.3.1. Effects on society and individuals

Examinations as features of power cause detrimental effects on those users who,

affected by the results, are forced to change their behaviour and comply with the

examination’s demands in order to maximize and gain the benefits associated with

high scores. Examinations are used ‘to gather evidence upon which inferences are

drawn about the unobservable’ (McNamara, 1999), they might be imperfect tools, but

their power and use has strong consequences on society and on individuals.

Examinations are not used solely by educators and students – they can also serve

political agendas. Shoamy’s (2001, p. xxii) comments about tests are also valid in

the examination context:

… policy makers and test developers had different agendas regarding


tests; they each viewed tests through very different eyes. Testers believe
that the main criterion for good testing is that the tests they construct
possess features that provide an indication that they can measure
accurately. Policy makers, on the other hand, view tests as a means of
promoting educational agendas…I observed the many ways that tests
were used in education and society, not only to force teachers to teach
and students to learn but also to impose policies, define knowledge and,
worse, to punish, exclude, gate keeping and perpetuate existing powers.

2.3.2. Effects on students

For the examinees, examinations are threatening rituals detached from reality with

long-term effects (Shoamy, 2001, p.14). The uses of examination results have

detrimental effects for examinees since such uses can create ‘winners and losers,

successes and failures, rejections and acceptances’ (Shoamy, 2001, p. 15).

Examination results are used as indicators for placing students in class levels, for

granting certificates, for determining whether a person will be allowed to continue to

further study, for deciding on a profession, or for obtaining a job.

2.3.3. Effects on teachers and teaching methods

Art and design examinations influence teaching practice. To predict intended and

unintended results there is a need to study the impact of similar previous assessments

on curriculum and instruction. Beattie (1996, p.53) makes the point that:

The art curriculum might be changed to align more closely with the
emphasised object of assessment. Moreover, the methods by which
each is assessed could bring about a distortion of teaching with positive
or negative results. Task format might discourage other viable methods
of teaching concepts and skills.

Similarly Spolsky (1995, p.56) argues that the power of examinations inevitably

narrows the educational process. This is because once teachers have familiarised

themselves with the content of an examination they recognise that:

… it becomes more or less the precise specification of


what knowledge or behaviour will be rewarded (or
will avoid punishment). No reasonable teacher will do
other than focus his or her pupils’ efforts on the
specific items that are to be tested; no bright pupil will
want to spend the time on anything but preparation for
what is to be on the examination. The control of the
instructional process then is transferred from those
most immediately concerned (the teacher and the
pupil) to the examination itself.

The examinations act as a curriculum model for art teachers because of the

importance of results for students and schools. The competence of a teacher may be

judged by his or her students’ examination results. In order to improve students’

results teachers tend to pay a great deal of attention to exemplary work graded as

‘good’ in a previous examination and to allocate more time to the learning prioritised in them. Examinations usually stimulate teachers to produce teaching materials and strategies in accordance with the examination guidelines (published or unpublished). In the case of art examinations, where ‘originality’ is often an assessment criterion, this

effect can be pervasive. Steers claims: ‘When original work is seen to be rewarded

by the examination system it is rapidly imitated by students and teachers alike and by

the time of the next annual round of examinations an innovative approach is reduced

to a cliché’ (Steers, 1994b). On the one hand examinations offer models of ‘good

answers’ but on the other hand such models are contrary to the very nature of art

making and learning. Consequently examinations may introduce orthodoxy to a field

characterized by a plurality of personal viewpoints and innovative answers, causing a

threat to washback validity on teaching and learning.

2.3.4. Effects on schools

Examinations are powerful methods of control and a means for monitoring the

performance of schools. The examination results are the basis for accountability:

schools are compared and differentiated by the ranking of their examination results

in league tables. The impact of an examination is concerned with the extent to which

the examination results are used in the way intended, and are successful in bringing

about the aims of the examination. A misinterpretation of the results threatens the

examination process. Results based solely on quantitative data with no other

specifications are not recommended by Torrance (2000):

When accountability is based only on quantitative data with no regard for


the contexts in which the results were obtained, the credibility of such
results may be questionable. The validity of national scoring results or
inspection reports may be criticised. The results on a national scale may
not represent the reality, since reality is plural. But results in a small scale
may have more accuracy if they are focused in what, in fact, happens in
schools. The codified results of students' achievement as they appear in
league tables are not easily decipherable, except for assessment experts.
And furthermore they do not represent the quality of learning in schools
unless the 'value-added ' is described in such tables. Value-added
measures measure the progress made by individual pupils in an
institution rather than raw outcome scores.
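The value-added idea Torrance refers to can be sketched with a toy computation: each pupil's final mark is compared with the mark predicted from prior attainment by a simple least-squares line, so that progress rather than raw outcome is credited. The data below are hypothetical.

# Hypothetical marks for seven pupils on entry to, and at the end of, a course.
prior = [10, 12, 8, 14, 11, 9, 13]
outcome = [12, 13, 11, 15, 11, 12, 14]

# Fit outcome = intercept + slope * prior by ordinary least squares.
n = len(prior)
mean_p, mean_o = sum(prior) / n, sum(outcome) / n
slope = (sum((p - mean_p) * (o - mean_o) for p, o in zip(prior, outcome))
         / sum((p - mean_p) ** 2 for p in prior))
intercept = mean_o - slope * mean_p

# A pupil's value added is the residual: actual outcome minus predicted outcome.
for p, o in zip(prior, outcome):
    print(f"prior {p:2d}  outcome {o:2d}  value added {o - (intercept + slope * p):+.1f}")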

2.3.5. Effects on standards and curriculum

Examinations are perceived as an expression of authority. Because they establish the

measure of students’ achievement using the language of numbers and

standardisation, they are used for purposes of monitoring and controlling the

education systems (Shoamy, 2001, p.50). Examination results are the recognised

measure of school, local or national standards. The curriculum will be affected by

such views.

Steers (1994) points out: ‘… while governments might have a legitimate interest in

assessment as an instrument to monitor local and national standards, assessment also

can be used as a potent means of controlling the curriculum’.

2.3.6. Effects on higher education

Examinations act as a disciplinary tool, punishing and rewarding, and also as

a means of controlling the number of students entering higher education. The study

of predictive validity in the case of art and design GCE/A level examinations might

provide valuable insights to evaluate the effectiveness of the examination.

The following questions may be worthy of further study:

- Do higher education courses in art value GCE/A level art and design

examinations results?

- Is there any correlation between students’ pre-university art and design

examination results and their grades or marks obtained in higher education?

2.3.7. Ethical considerations

For students, examinations are not synonymous with fairness; often they perceive

examinations as biased, unjust, unfair and depending on luck (Shoamy, 2001, p.14).

Examinations can have detrimental effects on students; they are used as disciplinary

tools, providing both rewards and sanctions (Foucault, 1979). Student voices are

often neglected by current examination systems, although some improvements seem

to appear through the establishment of codes of practice (e.g. the Code of Ethics of the

International Language Testing Association; Test taker’s bill of rights by the

American National Council of Teachers of English; Wiggins, 1993). In this study it

may prove relevant to analyse the views of examination users about the ethical

aspects of examinations.

2.3.8. Evaluating the impact of examinations

It is important to be able to monitor and investigate the educational impact that

examinations have within the contexts in which they are used. Examination

developers should operate with the aim that their examinations do not have a

negative impact and, as far as possible, strive to achieve positive impact. This

might be accomplished through collaboration with stakeholders, in relation to:

• Development and presentation of examination specifications and

syllabus;

• Professional support programmes for the users (teachers, assessors,

students) and training of experts (examination developers, assessors,

moderators);

• Evaluation of the examination by collecting examination results

(marks) and teachers’/students’ opinions.

The impact of the examination is closely linked with consequential validity,

washback validity and reliability. The study of the impact of an examination

might provide evidence to demonstrate that the examination is or is not

sufficiently valid and reliable for the context in which it is used.

2.4. Practicality

Practicality, sometimes called utility, is the extent to which an examination is

practicable in terms of the resources necessary to produce and administer it in its

intended context of use (development, administration and validation of

examinations). The practicality of an examination is its convenience, flexibility and

cost effectiveness. According to Steers (1994) those factors are important because

awarding bodies and school centres have to operate within the overall time and

resources available. Aspects such as the type and quantity of students' work to be

included in an art examination, the number of assessors, and the period of the

examination are all important considerations in the context of finite fiscal constraints.

The usefulness of an examination is related to its practicality, and appropriate

procedures must be implemented for managing all aspects of the examination

process. The following factors must be considered:

• Design and production of question papers;

• A priori validation of question papers;

• In-service training for teachers and moderators;

• Availability of the examination in terms of dates, frequency, location and

number of centres or schools;

• Fees to be paid by examination takers;

• Central and local costs in terms of: production of question papers, marking

and scoring, meetings, visits, post-examination evaluation;

• Security, transport and storage of materials;

• Special circumstances, arrangements for candidates with special needs;

• A posteriori validation of examinations.

2.5. Conceptual Framework for evaluating art examinations

The examination constraints described in this chapter are used here as the basis of a

framework (see Figure 1 below) for evaluating the validity, reliability, impact and

practicality of art and design examinations, taking into account the identified major

sources of potential invalidity, unreliability and negative impact.

• Validity

Aspects concerning: (1) the extent to which the design of the examination

provides tasks that contain representative and relevant evidence of knowledge,

understanding and skills in art and design, (2) the constructs used, (3) the

authenticity of the assessment instrument and procedures, (4) the appropriateness

of format, directives, standards and scoring criteria, (5) the relationship with

other assessments of the same performance, (6) examination bias.

• Threats to validity in art and design examinations summarised

- Poor sampling of artistic knowledge, understanding and skills in art and

design. Unclear, irrelevant or reduced versions of the main aspects of

knowledge, understanding and skills in art and design. (Under-representation

and construct irrelevancies). Using an underlying model of artistic

knowledge, understanding and skills, which is divorced from learning goals.

- Using methods and procedures that may prevent candidates from performing

in the way intended. Using tasks and marking procedures that do not fully

and clearly reflect the specified model of artistic knowledge, understanding

and skills. Using tasks and marking procedures that encourage and reward

aspects of performance that draw on irrelevant abilities. Lack of authenticity

in the examination tasks.

- Criteria, assessment objectives, attainment targets and/or grade descriptors

which are not supported by empirical evidence.

- Lack of evaluation in the three stages of the examination: design, delivery

and outcomes.

- Examination bias: format, directives or tasks which discriminate against candidates on the basis of cultural background, background knowledge, native language, ethnicity, age or gender.

• Reliability

Aspects concerning the generalisation of results, consistency of marking and the

marking procedures used to reach such consistency will need consideration. In

particular, the processes used to reduce variability of scores produced by

different concepts and distinct philosophies in art and design education, for

example in the design of the instrument, standardisation, and moderation and post

moderation processes.

• Threats to reliability summarised

- Examination methods and procedures that are unclear or poorly defined.

- The influence of a weak or dominant partner in the assessment team.

- Instructions and procedures for marking that are unclear or weakly defined.

- Directives and marking instruments, for example: assessment objectives,

criteria, attainment targets and/or grade descriptors using vague terms.

- Lack of teacher and assessor training.

- Lack of consensus. Lack of standardisation.

• Impact

Analysis of the uses of an examination as well as an analysis of the way

examination results are interpreted and acted upon, together with the after-effects

of the examination on teaching and learning.

• Sources of negative impact

- Inappropriate profiling of individual strengths and weaknesses (e.g. unclear

grade descriptors/assessment matrix)

- Criteria, assessment objectives, attainment targets and/or grade descriptors

(referring to art and design knowledge, understanding and skills) which are

expressed in vague or negative terms, unable to clearly express what learners

can and need to be able to do.

- Unclear instructions to users on how (and how not) to interpret examination

results.

- Basing the examination on an underlying domain model of artistic knowledge

that is divorced from learning goals. A narrow selection or a selection of

irrelevant knowledge, understanding and skills for assessment.

- Inauthentic assessment instrument: tasks that do not enable the candidates to

actively engage their knowledge, understanding and skills in a reasonable and

authentic way.

Summary

In this chapter the reliability, validity, impact and practicality of art and design

examinations have been discussed in order to establish the key problems and issues

related to assessment processes. Threats to the reliability of the examinations were

established in relation to: inadequate design of the assessment and scoring

instruments; lack of marking consistency due to inadequate training of assessors and

moderation procedures. Sources of invalidity of examinations were identified

through inadequate content fidelity; ambiguous and unshared concepts of domains

and descriptors of performance; assessment instruments covering a poor sample of

artistic skills, knowledge and understanding; lack of authenticity in the assessment

instrument; unfamiliarity of format; use of external criteria that measure different

competencies from the assessment instrument in question; and assessment instruments using tasks and criteria that place one or another group of examinees at an advantage.

The need to evaluate the consequences or impact of the examinations was

established and, finally, questions about the convenience, flexibility and cost-

effectiveness of the examination procedures were identified in order to evaluate the

practicality of the system. The chapter ends with a proposed conceptual framework

to evaluate essential aspects of an art and design examination through document

analysis, observation and questionnaire data analysis.

Validities: content validity; face validity; response validity; bias; construct/theory-based validity; predictive validity; concurrent validity.

Reliabilities: inter-rater reliability; intra-rater reliability.

Impact (contexts of use): effects on society and individuals; effects on schools; effects on curriculum; effects on teachers; effects on higher education; effects on students; feedback; ethics of examinations.

Assessment instrument: type, format, directives; content, question papers, tasks; assessment objectives, criteria, weightings; constructs, underlying models.

Assessment procedures: mark schemes, training, consensus, internal marking, standardisation, moderation, post moderation, consistency in marking and grading.

Practicality.

Figure 1: Conceptual framework for evaluation of art and design external assessment

Chapter 3
Design of the research

The first part of this chapter reports on the choice of a combined methodology, used both to study and evaluate aspects of validity and reliability in current art examinations in Portugal and England, and to search for and trial new procedures to improve art examinations in Portugal.
The second part of the chapter describes the selected qualitative and
quantitative instruments and methods of data collection.

3.1. Research questions


Taking into account the research questions stated in the Introduction Chapter (p. 15)

three main areas of investigation were established corresponding to the overall aims

of this research:

(a) An investigation and evaluation of some current art examinations in Portugal

and England (research questions 1, 2).

(b) A proposal for improvement of Portuguese art and design examinations

(research questions 3, 4).

(c) Application and evaluation of the proposal (research questions 4, 5).

3.2. Choice of method


For this study, a research method was needed that would give access not just to what

goes on in each assessment culture but, more importantly, to the meanings behind the

practices. Initially this led to the choice of a qualitative research method. According

to Creswell (1998) qualitative inquiry represents a legitimate mode of social and

human science exploration. He defined it as:

…qualitative research is an inquiry process of understanding based on


distinct methodological traditions of inquiry that explore a social or
human problem. The researcher builds a complex, holistic picture,
analyses words, reports detailed views of informants, and conducts the
study in a natural setting (p.15).

However, it was recognised that using a qualitative approach alone would limit the

study in terms of its ability to generalise. Furthermore triangulation of data from

different data collection instruments was considered necessary to obtain reliable

findings. Such considerations led to the choice of a combination of qualitative and

quantitative methods. As Miles & Huberman (1994, pp. 253-254) point out,

quantitative analysis provides the means to verify a hypothesis derived from

qualitative research data and it helps to avoid bias.

Taking into account the previous considerations, the research design was mixed-

method, combining both qualitative and quantitative data in case studies. The main

instruments used to collect data were documents, observation, interviews and

questionnaires. The combination of qualitative and quantitative instruments in this

research, although time consuming, was particularly useful for testing out the validity

(i.e. authenticity) and reliability (i.e. consistency) of data gathering techniques.

3.3. Stages of research

The research was undertaken in two stages.

Stage 1: review of literature; analysing aspects of validity, reliability, impact and practicality in current art examinations (Portugal, England); suggesting improvements.

Stage 2: defining a new model of external assessment for Portugal; Trial 1: piloting in one school; evaluating the model; redefining it; Trial 2: trial in five schools; evaluating the model; examining the implications.

Figure 2: Stages in research

3.3.1. Stage 1

Stage 1, which concerned the study of current art examinations in Portugal and

England, was mainly conducted during 2001-2002. This stage was intended to

analyse aspects of validity, reliability, impact and practicality in both systems and

also to develop suggestions for improving the Portuguese art examinations.

3.3.2. Stage 2

Stage 2 was intended to create a new assessment instrument and assessment

procedures for art examinations in Portugal; it was conducted in 2002-2003. Based

on the analysis and findings obtained during Stage 1 and an additional literature

review,

a blueprint for a new external assessment was designed (see p. 214). Rationales, draft

instruments and procedures were established and verified through a priori validation

taking into account the views and advice of experienced Portuguese and English art

and design teachers who provided useful comments on how to redefine the blueprint.

3.3.2.1. Trial 1

A first trial or pilot of the new assessment instrument was conducted in a Portuguese

secondary school with seven art teachers and 51 students who volunteered to

participate, in late 2002 and early 2003. Five meetings were convened with the

teachers between 29th September 2002 and 18th January 2003 in order to discuss

assessment objectives, criteria and grade descriptors (see p. 221). The meetings acted

both as training and standardisation and were also used to provide feedback and

evaluation of the instrument. The draft assessment instrument was administered to

students during the period October – December 2002.

3.3.2.2. Trial 2

After the first trial evaluation and subsequent revision of the instruments and

procedures a second trial was conducted in five Portuguese secondary schools during

the period April-July 2003. A list of 15 art teachers potentially willing to participate

in the experiment was identified from the questionnaire responses (Appendix VII-2,

question 14.9.) These were contacted and the purpose of the trial, activities and

schedule were explained. Ten teachers agreed to participate with their students. All

but one of their students accepted, making a total of 117 students. The trial included

standardisation meetings, administering the new assessment instrument to students,

on-line meetings, moderation and post moderation meetings (see p. 242).

After the trials, the experimental assessment instrument and procedures were

analysed. Data was collected through: (1) the researcher’s notes of school visits,

(2) interviews with teachers and a sample of students; (3) teachers’ and students’

questionnaires; (4) teachers’ and external observers’ reports; (5) on-line teachers’

meetings; (6) students’ portfolios; (7) students’ portfolio marking results.

Stage 1
- 2001. Objective: to examine international literature pertaining to the theoretical domain of assessment in art education. Instruments/sources: literature review.
- 2001-2002. Objective: to analyse official documents concerned with art and design examinations in Portugal and England. Instruments/sources: document analysis (documents, statistical sources).
- May 2001-July 2002. Objective: to investigate Portuguese students', teachers' and assessors' views of the nature and problems of art and design examinations in Portugal and identify possible forms of improvement. Instruments/sources: interview (1 test developer); questionnaires (104 students; 44 teachers).
- May 2001-July 2002. Objective: to compare the practices and policies of art and design examinations in England and Portugal and evaluate their relative strengths and weaknesses. Instruments/sources: interviews (5: 4 teachers + 1 student); observation of Edexcel standardisation meetings (2).

Stage 2
- Objective: to develop a conceptual framework for external art assessment at pre-university level. Instruments/sources: literature review; documental analysis (syllabuses, curriculum materials).
- September 2002-January 2003. Objective: to develop and trial an experimental assessment instrument and procedures (with the aim of improving reliability and validity in Portuguese art and design examinations). Instruments/sources: Trial 1 (pilot) – interviews, questionnaires, reports, observation, portfolios' marks. Samples: 1 school; 7 teachers; 51 students; 1 external observer.
- April 2003-July 2003. Instruments/sources: Trial 2 (main trial) – interviews, questionnaires, reports, observation, portfolios' marks. Samples: 5 schools; 10 teachers; 117 students; 1 external observer.
- 2004. Objective: to explore the broader implications of the study.

Table 1: Plan of action

3.4. Instruments and data collection

3.4.1. Qualitative data collection

3.4.1.1. Document sources

The documents analysed included secondary source documents and primary source

documents; the latter included articles and official publications by bodies such as:

the Qualifications and Curriculum Authority (QCA); the Edexcel Foundation

(Edexcel); Oxford, Cambridge, and RSA Examination Board (OCR); the Assessment

and Qualifications Alliance (AQA); the Joint Council for General Qualifications; the

Portuguese Department of Secondary Education; the Portuguese Board of National

Examinations (GAVE) and the Portuguese Jury of National Examinations (JNE).

Access to documents that were not in the public domain and permission to use them

for the purposes of this research was formally requested from the various assessment

organisations. Other documents, found in the public domain, were analysed,

including documents published on the Internet in the web sites of the various

organisations.

The principal documents for analysis were concerned with relevant legislation,

assessment information such as codes of practice, syllabuses, examination papers and

tests from the Portuguese national assessment department, the English Qualifications

and Curriculum Authority (QCA), and the English awarding bodies. Other

documents used included minutes of meetings, working papers, examiners’ and

moderators’ reports, both in Portugal and in England. The documents also enabled

the identification and selection of individuals who contribute to the research. Both

the literature review and the document analysis provided information with which to

frame questions about specific areas of assessment.

3.4.1.2. Observation

Burgess (1993) recommended observational studies in order to identify some of the

key features and strategies for testing in these areas. She suggested that researchers

should work alongside advisers, teachers and students – a recommendation that

proved particularly important in the context of this research. Observations focused on

assessment procedures such as standardisation meetings to witness at first hand how

teachers applied assessment procedures and made judgements about students'

artworks. The key assessment events, identified through document analysis led to

identification of and contact with the relevant assessment experts who provided

authorisation to observe assessment practices in England. The Edexcel Foundation

gave permission to observe the standardisation meetings held at their premises in

Mansfield, Nottinghamshire in May 2001 and July 2002. The researcher wanted to

understand how moderators were prepared; what kind of processes were used to

reach a common understanding and application of criteria; what kind of visual

exemplars were used and how the marking of such exemplars was explained. The

observer was introduced to examiners as a Portuguese research student by the

Edexcel art assessment coordinator. During these meetings the researcher acted as a

moderator, while taking notes and informally interviewing the participants.

Subsequently these notes were complemented by semi-structured interviews

conducted with English art examination stakeholders (For details of observations

conducted in England see Appendix X).

3.4.1.3. Interviews

Interviewing is often described as a straightforward method of trying to discover

what people think (Robson, 1993, p.228). It is essentially a way of collecting

subjective information. As a consequence, its use offers a most important technique

in the search for meaning. The purpose of the interviews was to understand practices

from the point of view of stakeholders and to obtain in-depth information about the

expectations and problems experienced by participants. The initial plan was to use

structured and unstructured interviews in order to allow participants ‘to tell their stories’.

According to Stewart (1996) an interview-based methodology attempts to construct and reconstruct personal and socio-cultural stories about particular concepts in people's

lives. The interviews in this research provided a rich array of data about users’

perceptions of art examinations (Appendix IX). It was hoped that they would enable

an evaluation of ways in which teachers used the examinations, processes of

standardisation and moderation and ways in which results were utilised by pupils,

parents and other teachers.

In stage 1 the purpose of the interviews was to obtain key stakeholders’ perceptions

about the current art examinations. An interview schedule was designed to gather the

most useful information, without spending disproportionate time, by organising the

questions into sections on the validity and reliability aspects of assessment identified

in Chapters 1 and 2: (1) assessment instruments, (2) assessment procedures and, (3)

impact of assessment results (Appendix IX, 1.). The schedule was piloted with an

English art teacher moderator in order to identify potential problems; the pilot tested out the respondent's interpretation of the questions and helped to refine them. Forty-seven questions were selected; although not all of them were later used, they were useful for guiding the conversations.

Selecting the sample for interviews was crucial to the research because the

participants' responses were being used to validate or negate the findings from

documentary sources. Interviewees were identified using purposeful sampling

procedures from information gathered from advisers and documentary sources.

A letter explaining the purpose of the research and asking for collaboration was sent to potential participants, and the arrangements for the meetings were made

through letters and emails. The interview sample for England included six English art

examination stakeholders: a chief examiner, two moderators, two art

teachers and one art student. Since in Portugal questionnaires were conducted with a large number of students and art teachers, a decision was taken to conduct a life-history interview with only one art teacher and examination developer, who had been identified as having extensive experience of art examinations. The researcher intended to complete the information about the historical background of the Portuguese

art examination by interviewing one of the pioneers of art programmes in Portugal in

the 1970s and 1980s.

Role                   Name       Sex  Age  Region                Experience (years:         Interview date
                                                                  teaching / examinations)

Curriculum and test    Isabel     F    69   Lisboa, Portugal      45 / 34                    7-05-2001, 14.00h-18.00h
developer
Principal examiner     Richard    M    57   SouthEast England     12 / –                     4-02-2001, 16.00h-16.30h
Team-leader            Elisabeth  F    67   London region,        30 / 20                    7-02-2001, 11.00h-13.00h
                                            England
Moderator              Ian        M    48   London region         18 / 11                    7-07-2001, 18.00h-22.30h
Art teacher            Sally      F    46   London region         1 / –                      7-07-2001, 18.00h-22.30h
Art teacher            Cindy      F    42   NorthWest England     18 / –                     10-02-2001, 11.00h-14.00h
Student                Annie      F    16   South England         – / –                      6-05-2002, 14.30h-16.00h

Figure 3: Sample of art and design examination stakeholders for interviews in stage 1

The interviews in Stage 1 were conducted in 2001, at the interviewees’ homes, and

varied between one and four hours in length. The interviews took the form of long

conversations and it was found that this strategy helped the collection of data in a

non-prejudiced and non-prescriptive way. The researcher’s role in the interviews was

one of orientation to a given theme; the approach was like a teaching situation in

which respondents taught the interviewer about facts and personal perspectives.

The interviews were tape-recorded and notes were recorded in a research journal at

the time that events and processes were witnessed. Later, all the interviews were

transcribed in the original language. After the first analysis the researcher’s notes and

preliminary findings were shared with the interviewees and feedback and further

advice was obtained through correspondence. Interviewees’ comments on important

points of assessment helped to develop an understanding of practice.

Some interviews were quite long and participants referred to many details and examples.

After transcription, the interview data was re-organised to relate to the three main

areas of concern: the assessment instrument, assessment procedures and impact.

The interview data was described and summarised by reducing the statements to

those aspects that emerged as being essential through the recurrence of terms,

similar or contrasting opinions, through extensive or repeated descriptions and by

discarding parts where participants expressed opinions about unrelated themes

(Appendix IX, 2.).

In stage two it was decided to use both individual and group interviews for students.

Focus group interviews seemed to offer advantages because they enabled interaction

among interviewees. Initially it was planned to use semi-structured interviews in

order to focus on essential issues of validity, reliability, impact and practicality of the

experiment. The interview schedule contained sixteen questions organised in three

sections: (1) instrument and criteria, (2) contents and resources and, (3) ownership of

the assessment (Appendix XI). The schedule was piloted during trial 1 with ten students after they had concluded their portfolios for the new examination, and it was found that the interview schedule did not elicit very rich responses, so a decision was

taken to use open interviews during the main trial (trial 2).

A sample of students was selected in four schools for group interviews: each teacher selected five student volunteers, trying wherever possible to represent boys and girls of different ages, socio-economic backgrounds and races (five group interviews, comprising a total of 31 students). The interviews took place in the different schools' art rooms after the portfolio moderation procedures; their length varied between 30 minutes and one hour.

School  Date and time       Student     Age  Sex  Race  Parents' education

P       2-06-2003,          Luis        18   M    B     Basic
        14.30h-15.30h       Sergio      17   M    W     Basic
                            Sandra      20   F    W     Basic
                            Raquel      18   F    W     Basic
                            Susana      19   F    W     Basic

B       2-06-2003,          Telma       17   F    W     Higher
        17.30h-18.30h       Sara        17   F    W     Higher
                            Ana         18   F    W     Higher
                            Sandra      17   F    W     Higher
                            Susana      17   F    W     Basic
                            Diana       17   F    W     Basic

K       26-06-2003,         David       18   M    B     Basic
        13.00h-14.00h       Júlia       20   F    W     Secondary
                            Joana       18   F    W     Basic
                            Inês        19   F    W     Secondary
                            Isabel      18   F    W     Basic

K       26-06-2003,         Zé          18   M    W     Secondary
        14.00h-15.00h       Ruben       20   M    W     Higher
                            Jocelyne    17   F    B     Basic
                            Marta       18   F    W     Secondary
                            Inês A.     19   F    W     Secondary

V       17-06-2003,         António     18   M    W     Secondary
        10.00h-10.30h       Sandra      18   F    W     Higher
                            Carla       17   F    W     Higher
                            Joaquim     18   M    W     Secondary
                            Manuela     17   F    W     Secondary
                            Tiago       18   M    W     Secondary

Figure 4: Sample of students for interviews in stage 2

3.4.2. Quantitative data

3.4.2.1. Questionnaires

In stage one a survey questionnaire was designed in order to obtain students' and teachers'/assessors' perceptions of external art assessment in Portugal (see Appendix VII). The questionnaire sought, first, to ascertain the degrees of validity, reliability, impact and practicality of the Portuguese art and design examinations and, second, to identify suggestions for improvement. The questionnaire schedule was designed

based on data obtained from the document analysis and through the lens of the

framework for evaluating art and design examinations described in Chapter 2 (p. 76).

The main concepts of the framework such as validity, reliability, assessment

instruments, assessment procedures and impact were used to design the questions

developing each one of the conceptual issues. Nine sections were developed for the

students’ questionnaires: (1) general information, (2) syllabuses, (3) criteria, (4)

external assessment, (5) form and structure of examinations, (6) assessment

procedures,

(7) reliability, (8) assessors, (9) impact. Teachers’ questionnaires included three

more sections: (10) assessing, (11) feedback and (12) in-service teacher training.

The questionnaires provided space for respondents to add comments or suggestions

(Section 10 - students and Section 13 - teachers).

The questionnaire was piloted with 10 teachers during an art teachers’ congress

(Apecv, Evora; March 2002) and later with 10 students from design courses at the

Instituto Superior Politecnico das Caldas da Rainha and Universidade de Aveiro.

After piloting the questionnaire some small revisions were made in order to clarify the language; for example, technical words such as ‘reliability’ were replaced

by terms like ‘consistency of results’. The questionnaire was administered during

October-November 2001, in two different versions: (1) for secondary art and design

teachers with 83 questions (Appendix VII, 2.) and, (2) for students who took

national art examinations in subjects related to art and design with 67 questions

(Appendix VII, 1.).

The questionnaire sample was intended to include a significant number of art

teachers and art assessors in Portugal (approximately thirty percent of the 350 art

teachers who teach 12th year art classes). It was also intended to include a

considerable number of students (approximately three hundred of the 4000 students

who took national art examinations in 2001) from different art courses in universities

across the country. Questionnaires were posted to: (1) students involved in the 2001

national art examinations, (2) all the teachers in the art departments of secondary

schools identified as having art courses. Questionnaires for students were posted to

those attending art teacher training courses at the main Portuguese universities and

Institutes of education such as: Universidade do Minho; Escola Superior de

Educação de Portalegre; Universidade de Évora . They were also sent to students

attending art, design, multimedia and architecture courses at: Universidade de

Aveiro; Faculdade de Arquitectura do Porto; Faculdade de Belas Artes do Porto,

Instituto Politécnico de Viseu; Instituto Politécnico das Caldas da Rainha.

Although pre-stamped envelopes were enclosed, it initially proved difficult to obtain questionnaire responses, especially from art teachers, even after sending follow-up letters. Other approaches were therefore devised: the researcher delivered questionnaires during Portuguese congresses and conferences on art and design held in 2001 and 2002. Finally, by the end of 2002, a forty per cent response rate was achieved, thus allowing some confidence in the generalisation of the findings.

Forty-four art assessors and one hundred and four students (68 students from general

courses; 36 students from technical courses) replied to the questionnaires. The final

sample included a wide variety of geographical locations, ages, teachers’ experience

of assessment and students’ art courses (for details see p. 131).

                Total   Sample   Responses

Art teachers    350     100      44
Art students    4000    300      104

Figure 5: Response rates to the survey in stage 1

In Stage 2, questionnaires were used to obtain data about the participants’

perceptions of the new assessment instrument during the first and second trial. The

questionnaire schedules for students (Appendix XV) and for teachers (Appendix

XVI) were written in accord with the conceptual framework for art examinations (see

Chapter 2, p.76); the concepts of the framework were used to develop a series of questions focusing on the evaluation of: (1) the portfolio – instructions, format, tasks, assessment criteria, weightings; (2) assessment procedures – such as standardisation, moderation and consistency in marking; and (3) impact – effects on students, teachers,

schools, and the curriculum. The questions were revised after a pilot because some questions were repetitive, and some terms were replaced in the interests of achieving greater clarity; finally the questionnaires included 64 questions for students and 93

questions for teachers. All the students and all the teachers involved in the trials

responded to the questionnaires during the last sessions of the experimental

assessment instrument.

3.4.2.2. Other data

Statistical information about examinations and appeals was collected from the press

and in response to requests to the Portuguese Ministry of Education. Other data on

assessment results was obtained from the marking of a mock examination (MTEP question paper – 36 scripts) and from the marking of portfolios developed by

students during Trial 1 (13 portfolios) and Trial 2 (12 portfolios).

3.5. Data Analysis

The different types of analysis used were as follows: (1) Content analysis for

documents, observation notes and interviews (see Appendices IX, X); (2) Descriptive

statistics for questionnaires (see Appendices VIII and XVII); (3) Multi-faceted Rasch analysis for the marks awarded to students' MTEP mock examination scripts (current examination) and to students' portfolios (new assessment) (see Appendix XXIV).

3.5.1. Document analysis

Documents were analysed in order to understand the current systems of art examinations in Portugal and England in stage one, and to understand teachers' problems with the implementation of the new examination in stage two (teachers' reports). Documents are not neutral. In the course of the research, the documents used for describing and

explaining different assessment systems were analysed taking into account the

ideologies underpinning them. First the contents were summarised to permit rapid retrieval when needed, categorised by type (primary and secondary sources) and classified according to their apparent authenticity and credibility. Then the frequency with which textual items (words, phrases, concepts) appeared in the text was enumerated, and finally interpretations and meanings were constructed (see example

Appendix VI).
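As an illustration of the enumeration step, a minimal sketch in Python (the term list and file name are hypothetical assumptions, not items from the study):

    from collections import Counter
    import re

    # A minimal sketch of the frequency-enumeration step described above.
    # The term list and file name are illustrative assumptions only.
    TERMS = ["validity", "reliability", "moderation", "criteria"]

    def term_frequencies(text, terms):
        """Count how often each term of interest appears in a text."""
        words = re.findall(r"[a-z]+", text.lower())
        counts = Counter(words)
        return {term: counts[term] for term in terms}

    with open("syllabus.txt", encoding="utf-8") as f:  # hypothetical document
        print(term_frequencies(f.read(), TERMS))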

3.5.2. Analysis of observation notes

The notes collected during the observation of Edexcel standardisation meetings in stage one were analysed by cross-checking the information obtained against the documents about examinations in England, in order to understand how the assessment procedures were applied. The aim was to see how many visual exemplars of students' works were used in standardisation meetings, the type and format of the exemplars, how the assessment matrix and criteria were explained to the moderators and how moderators were trained to ensure consistency.

In stage two the observations conducted in the five schools of trial 2 were used to cross-check information with teachers' reports, informal interviews with the participant teachers and student interviews; the aim was to understand the schools' resources and the styles of teaching and learning strategies.

3.5.3. Analysis of interviews

A decision was taken to adopt an interpretative approach to analysing interview data.

Interpretation is the process whereby a researcher seeks to understand and construct

meaning from participants’ words and actions (Cohen & Manion, 1994, p.27).

The steps of data analysis were as follows:

(1) Segmenting information; organising; managing and retrieving information; and

using code-and-retrieve techniques.

(2) The meaning of categories was shared and discussed with participants in order to

test the credibility of the research (Lincoln & Guba, 1985; Robson, 1993).

(3) Generating concepts. The information was crosschecked with established

theory.

The data was transcribed and condensed into analysable units. Labels or tags were

created in the form of codes for assigning meaning to the descriptive or inferential

information compiled during the study. The codes provided a means of generating concepts (Gough & Scott, 2000, p.339).

The codes were first defined in accord with the research questions and conceptual framework (see Chapter 2, p.76); they were used to retrieve and organise words, phrases, sentences or whole paragraphs, connected or unconnected to a specific setting. However, the codes were not fixed from the start: they developed during the study, in the form of sub-codes or even entirely new ones, such as ‘teacher aid’ or ‘overloaded’, which emerged from documents and from participants’ voices.

Codes served to compact and cluster segments of information. In the first stage of analysis codes were mainly descriptive, for example: ‘assessment instrument’, ‘assessment procedures’ and ‘impact’. In the second stage they became more interpretative, for example: ‘clear instructions’, ‘standardisation’, ‘moderation’ and ‘effects’. Finally, more explanatory codes were defined according to patterns discerned during the data interpretation and the establishment of relationships between different segments of information, such as: ‘interpretation of criteria’, ‘consensus between judges’, ‘shared language’, ‘teacher aid’, ‘teacher training’, ‘causes of bias’ and ‘negative/positive effects’.
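A minimal sketch of the code-and-retrieve technique in Python, with hypothetical codes and transcript segments:

    from collections import defaultdict

    # A minimal sketch of the code-and-retrieve technique described above.
    # The codes and transcript segments are invented for illustration.
    codebook = defaultdict(list)

    def code_segment(code, segment):
        """Attach a code (tag) to a transcript segment."""
        codebook[code].append(segment)

    def retrieve(code):
        """Retrieve every segment tagged with a given code."""
        return codebook[code]

    code_segment("shared language", "We all used the same words for the criteria.")
    code_segment("teacher aid", "The exemplars helped me to explain the marks.")
    print(retrieve("shared language"))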

[Figure 6 showed a network diagram linking the stage 1 interview codes for England: ‘assessment instrument’ (instructions, format, criteria) to sub-codes such as ‘clear instructions’, ‘bias’, ‘teacher aid’, ‘validity’, ‘authentic’, ‘modular’, ‘over-assessed’, ‘interpretation of criteria’ and ‘shared language’; ‘assessment procedures’ (standardisation, moderation) to ‘training’, ‘visual exemplars’, ‘consensus between judges’ and ‘reliability’; and ‘impact’ to ‘curriculum’, ‘teaching practice’, ‘practicality’, ‘overload’ and positive/negative effects.]

Figure 6: Diagram analysis of interviews in England, stage 1

Diagrams and other graphical forms were devised to organise the coded segments of

information (see Appendix XXVIII), in order to establish causal and consequential

links. Networks are helpful for showing complex interactions of variables arising from statistical and content analysis, in order to create a narrative or define a final matrix.

Bliss et al. (1983, p.8) described networks as follows:

Networks can usefully be regarded as an extension of the familiar business of putting

things into categories. To categorise is to attach a label to things, in effect to place

them in boxes. A network can be seen as a map of the set of boxes one has chosen to

use, which shows how they relate to one another.

[Figure 7 showed a network diagram linking the stage 2 interview codes: ‘portfolio’ (instructions, non-prescriptive tasks, flexibility, independent study tasks, self-assessment, assessment criteria, teacher aid, bias, time, resources, need for previous preparation) to ‘improve validity’ and ‘assessment for learning’; ‘assessment procedures’ (in-service teacher training, visual exemplars, shared language, ownership, responsibility, students’ voice, dialogue, time, resources) to ‘improve reliability’; ‘impact’ to effects on students (learning styles), teachers (teaching practice), schools (learning environment) and the curriculum (reform), with potential negative and positive effects; and ‘practicality’ to ‘less practicality’, ‘need expensive resources’ and ‘need time’.]

Figure 7: Diagram analysis of interviews in stage 2

3.5.4. Descriptive statistics

The questionnaire data was analysed using the statistical software package SPSS. The purpose of the questionnaires was not to find causal-consequential relationships but rather to detect problems related to the validity, reliability and impact of assessment. Overall, the principal data analysis used descriptive statistics such as frequencies, which established percentages of agreement or disagreement with statements included in the questionnaire. The variables entered corresponded to the recoded questions. In addition to frequencies and graphical analysis, some correlation tables were produced and t-tests were calculated to measure some potential relationships, for example between sex and tasks (Chapter 7, p.267).
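A minimal Python sketch of the operations just described (the study itself used SPSS, so this fragment is purely illustrative, with an invented data frame):

    import pandas as pd
    from scipy import stats

    # A minimal sketch of the analyses described above (the study itself
    # used SPSS). The data frame and its values are hypothetical.
    df = pd.DataFrame({
        "sex": ["F", "F", "M", "M", "F", "M"],
        "task_rating": [4, 5, 3, 2, 4, 3],  # agreement on a 1-5 scale
    })

    # Frequencies: percentages of agreement/disagreement with a statement.
    print(df["task_rating"].value_counts(normalize=True) * 100)

    # Independent-samples t-test for a potential relationship (sex vs tasks).
    girls = df.loc[df["sex"] == "F", "task_rating"]
    boys = df.loc[df["sex"] == "M", "task_rating"]
    print(stats.ttest_ind(girls, boys, equal_var=False))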

3.5.5. Multi-faceted analysis

In order to analyse the degree of marker reliability in the current Portuguese art examinations and in the new assessment instrument and procedures, a sample of thirty-six MTEP examination students' scripts was collected from a mock MTEP

examination conducted with one class at the pilot school on the 28th September

2002. During the implementation of the new assessment thirteen portfolios were

collected during the pilot and twelve during the main trial. The scripts and portfolios were selected in order to obtain a wide variety

of different performances, excluding scripts and portfolios marked zero and twenty

because null achievements and perfect achievements are not useful for multi-facet

measurement analysis (FACETS). The MTEP mock examination scripts were

marked by ten art examination assessors selected from the survey questionnaire; the teachers were selected in order to obtain a sample of different ages, genders, geographical locations, years of teaching and levels of experience in art examination assessment. Each teacher assessed nine scripts on two different occasions, and six of the scripts were marked by all the teachers. The scripts were posted to the teachers who marked

them between January and April 2003.
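The marking design just described can be represented as a set of three-facet rating records (candidate, rater, occasion, score). The sketch below, in Python with invented values, illustrates the kind of data layout that multi-facet programs analyse; it is not the actual FACETS input file used in the study.

    from dataclasses import dataclass

    # A sketch of the three-facet rating records produced by the marking
    # design described above; identifiers and marks are invented.
    @dataclass
    class Rating:
        candidate: str  # anonymised script identifier
        rater: str      # anonymised marker identifier
        occasion: int   # first or second marking occasion
        score: int      # mark on the 0-20 scale

    ratings = [
        Rating("script01", "raterA", 1, 12),
        Rating("script01", "raterA", 2, 13),  # basis for intra-rater reliability
        Rating("script01", "raterB", 1, 15),  # basis for inter-rater reliability
    ]

    # The six scripts marked by all ten raters link every rater to a common
    # frame of reference, which is what allows a multi-facet model to
    # separate rater severity from candidate ability.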

The sampled portfolios were developed by the students during the pilot and trial and were marked by all the participating pilot and trial teachers during 2003.

Many logistic models have been developed for estimating the reliability of tests. Among the known methods of estimation, the Rasch model has provided

particularly useful information for assessment developers (Baker, 1997, p.24). Rasch,

who developed his models in the early 1950s, defined two specific types of

parameters (one for persons and one for items) thereby introducing a new concept of

measurement and individual centred statistical techniques. According to Baker

(1997, p.58) the measurement principle with which Rasch was primarily concerned

was that of specific objectivity in comparisons, i.e. comparisons of persons which do

not depend on the particular items used, and comparisons of items which do not

depend on the particular persons tested. Baker pointed out that Rasch's models

have certain common characteristics: ‘each has two parameters, one identified with

person ability and the other with item (or test) difficulty, and the ability and difficulty

parameters are in each case ‘separable’ in that they can be estimated independently,

in a manner which Lord and Novick describe as analogous to the estimation of

parameters in a two-way factorial analysis of variance with no interaction terms’

(Baker, 1997, p.62). Many further developments of the Rasch model have been used but, in

the context of this study, it was important to determine the effects of raters in

performance-based assessment because the focus of the analysis was the markers’

reliability. This was possible using multi-faceted Rasch measurement because types

of interaction between the rater and the scale could be calibrated. An individual

markers’ reliability could be estimated based on the degree to which he or she, or

different markers, were consistent in their assessment of candidates’ performance.
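To make the model concrete, its standard formulations can be sketched as follows (these equations follow the general Rasch literature rather than any formula reproduced from the sources cited above). In the basic dichotomous model the probability that candidate n succeeds on item i is

\[ P(X_{ni} = 1) = \frac{e^{B_n - D_i}}{1 + e^{B_n - D_i}} \]

and the multi-faceted extension adds a severity term for each rater, so that for adjacent categories k-1 and k of a rating scale

\[ \ln\left(\frac{P_{nijk}}{P_{nij(k-1)}}\right) = B_n - D_i - C_j - F_k \]

where B_n is the ability of candidate n, D_i the difficulty of item i, C_j the severity of rater j and F_k the difficulty of the step from category k-1 to category k.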

According to Clapham & Corson (1997), user-friendly Rasch analysis computer programs have led many researchers to apply Rasch multi-faceted measurement to the analysis of rater performance. Since the researcher did not have a strong background in statistics, such computer programmes appeared to be a good solution to the problem of how to analyse data in the form of examination results. According to

Weir (2003) multi faceted Rasch offers a ‘…sophisticated way of looking at degree

of overlap between raters but it also provides evidence on level of marking and is a

systematic method for calibrating scores to iron out differences occasioned by inter

marker differences’.

Multi-facet measurement is available in a computer program known as FACETS (Linacre and Wright, 1992). In this research a free version of the programme (Minifac) was downloaded from the Internet (www.winsteps.com/minifac.htm). Through FACETS multi-faceted measurement it was possible to isolate and examine interactions between variation in candidates' abilities, variation in task difficulty and variation associated with the raters. Therefore, three facets or factors of variation in the assessment setting were investigated: candidates, items and raters. Each rating can be understood as a function of the interaction of the three facets: the ability of the

candidate, the difficulty of the item and the characteristics of the marker. Data about each facet was entered into the programme and, through the use of Rasch formulae,

it was possible to bring them together into a single relationship expressed in terms of

the effect they were likely to have on a candidate’s chance of success. Consequently

it was possible also to estimate overall predictions or probabilities, and precise

degrees of inter and intra-marker reliability. As McNamara (1996, p.143) points out:

‘We can express the difficulty of items according to the likelihood of a candidate of

given ability getting a given score (or better) from a rater of given severity on that

item. Finally we can express the severity of raters in terms of the chances of the rater

awarding a given rating (or better) on a given item to a candidate of given ability.

The facets can thus all be brought together in a single frame of reference’.

3.6. Reliability of data

During the research it was acknowledged that the bias of an individual researcher interferes with the reliability of the data (Cohen and Manion, 1994; Creswell, 1998). The researcher's professional background was particularly pertinent to this. The researcher, a Portuguese art teacher of twenty years' experience in secondary education, had been involved as a test developer of Theory of Art question papers in the Portuguese national examinations between 1994 and 1997 and had been an assessor of art examinations in MTEP and History of Art for several years. Her disappointment with the experience of the Portuguese national examinations and, more broadly, with the current assessment system was the reason for starting the research and was also a potential source of bias. Awareness of such bias influenced the conduct of the researcher during the two stages: she tended to adopt a distant role, trying to avoid passions and memories from past experiences; trying to understand the current systems of external art assessment in England and Portugal in terms of their historical contexts and others' perceptions of them; trying to implement the new assessment instrument as if it had been generated solely by the participants and the suggestions in the literature; and trying to evaluate the current systems and the new one impartially. However, such ideal conditions might not always be respected, and this led, from the very beginning of the research design, to the necessity of creating strategies for data validation to minimise the bias.

3.7. Triangulation

Richardson (1994) provided a metaphorical description of validity, which illustrates

the main aspects of validation in qualitative research. This was particularly helpful in

the context of this research:

The central image is the crystal, which combines symmetry and substances,


transmutations, multidimensionalities, and angles of approach. Crystals
grow, change, alter, but are not amorphous. Crystals are prisms that
reflect externalities and refract within themselves, creating different
colours, patterns, arrays, casting off in different directions. What we see
depends on our angle of repose.... Crystallisation, without losing
structure, deconstructs the traditional idea of "validity" (we feel how
there is no single truth, we see how texts validate themselves); and
crystallisation provides us with a deepened, complex, thoroughly partial
understanding of the topic. Paradoxically, we know more and doubt what
we know (p.522).

According to Sullivan (1996) the criteria for assessing the viability of qualitative findings are not so much a matter of whether outcomes are statistically significant but

whether they are meaningful. He pointed out that:

The emphasis on discovery requires one to maintain a special vigilant


pose in dealing with issues of validity and reliability. This involves sound
reasoning, systematic analysis and sustained focusing, along with the
process of subjecting emerging findings to continual empirical challenge
as new observations are brought to bear on existing insights (p.17).

Taking into account Sullivan's (1996) views and Richardson's (1994) metaphorical description, triangulation procedures were used to verify and validate the findings, for example by making use of both quantitative and qualitative data. Verification procedures were implemented at different stages of the research. The data collected through the different instruments was correlated, and cause-effect relationships were sought by making contrasts and comparisons across the data. Theory from the literature was used for supplemental validation of the accuracy of the findings. Verification of the

interview and observation data was conducted by interacting informally with the interviewees afterwards, in order to ‘pick up’ additional information. These processes confirmed, but in some cases raised doubts about, the data. Another strategy was used for verification purposes during the pilot and trial, when external observers were invited to attend the sessions and their reports reflected upon the experiences they had witnessed (see Appendices XIII and XXIII). The external observers, Graça Martins and Leonardo Charréu, were recognised experts in the field of art education.

3.8. Ethical considerations

The study investigated public documents and documents whose use in the research had been authorised by their owners or which were published in the public domain. Participants' permission was obtained using consent forms designed in accordance with the University of Surrey Roehampton guidelines (Appendix XXVII). Anonymity of participants was guaranteed by the use of fictitious names for persons and schools. Participants gave permission to publish the written accounts of interviews. The protocols developed for interviews and trials referred to the following matters:

- Participants' right to withdraw voluntarily from the study at any time.

- The central purpose of the study and the procedures to be used in data collection.

- A statement about protecting the confidentiality of the respondents.

- A statement about known risks associated with participation in the study.

- The benefits expected to accrue to participants in the study.

- A space for participants to sign and date the form, confirming that they understood the permission conditions (see Appendix XXVII).

An important ethical concern is the extent of the potential impact on the lives of the

subjects of the research. This may be particularly true in the case of studying

assessment, where the relationship between researcher and participants can make it

difficult for the former to remain neutral. Torrance’s (1989) writing about his

assessment research reveals a number of important points that were taken into

account in this study:

There have been many times when I have attended marking and
moderation meetings and been asked my opinion of pieces of work.
Given that one is often perceived as 'expert' in assessment, or even
'from the board', it is sometimes difficult, though not impossible, to
avoid being drawn into discussion. More acute still however, are the
occasions when one may actually have an impact on the performance
and subsequent grade awarded to candidates – vulnerable minors
placed in an already threatening and stressful situation (p.177).

During the trials of the assessment instrument or visits to schools, the researcher

adopted the role of a non-participant observer and attempted to influence proceedings

as little as possible. It was clearly explained to the participating teachers that the

researcher would not intervene in lessons or in the marking process. However, on some occasions there was interaction between researcher and participants: for example, students and teachers formulated questions and posed them to the researcher about their projects and about the projects that other students were carrying out in other places. It is possible, nevertheless, that the researcher did influence events by giving some indication of how other participants were carrying out their work projects.

Summary

The research was carried out in two stages. The first, which investigated the current situation in art examinations in Portugal and England, was carried out in 2001-2002. The second, which focused on the creation and evaluation of a new assessment instrument for Portugal, was carried out in 2002-2003. The design of the research was complex and combined both quantitative and qualitative methodologies.

A survey questionnaire was used in stage one to establish art teachers’ and students’

evaluations of art examinations and needs in Portugal. In the investigation and

evaluation of the strengths and weaknesses of equivalent art examinations in

England data was collected through documents, observation and interviews.

The researcher was a participant observer in the trials of the new assessment instruments and procedures developed and tested out in Portugal in stage 2. The data collected for the purposes of joint evaluation of the model by the researcher and participants was extremely diverse, including teachers' reports; moderators' reports; external observers' reports; the researcher's observations; students' and teachers' questionnaires and interviews; and students' portfolio marks.

Documents and interviews were analysed qualitatively (content analysis). The data

as a whole was coded descriptively first and then by more interpretative and

explanatory categories. Codes were developed from this base in order to describe,

relate, explain and evaluate conditions, strategies and consequences of the

assessment. Then they were shared with participants to arrive at credible findings

and to establish conclusions.

Descriptive statistics were applied in the analysis of the survey questionnaire data using SPSS. Degrees of marker reliability in both the current Portuguese examination system and the new model developed and trialled during the research were analysed by applying Rasch multi-faceted measurement using the computer program FACETS.

Chapter 4

Portuguese art examinations

The first part of this chapter describes the current art and design examinations in Portugal in terms of their history, underlying model, validity, reliability, impact and practicality. The second part presents an analysis of the current assessment model and of users' perceptions of what an ideal art and design examination should be, leading to a list of possible means of improving the current model.

4.1. Historical background overview

Throughout the period 1836-1974 the Portuguese art curriculum at a pre-university

level was exclusively centred on knowledge of technical drawing. Drawing was often

identified with geometry and perspective. Following the establishment of public

education the first art examinations for students prior to university entrance date

from 1836 and were based on technical drawing exercises. This tradition was very

strong in Portugal and shaped the form and content of the present day underlying

concept of art education that overemphasises the discipline of ‘Descriptive

Geometry’.

An exploratory interview was conducted with one of the pioneers of art programmes

in Portugal in the 1970s and 80s to understand the history of the art examinations:

Isabel, an art teacher aged 69 years, stated:

There were two periods of external examinations: at the end of the 5th
year (age 15) and at the end of the 7th year (age 17). I remember when I

passed my 5th art examination at the ‘liceu’; we had to draw a pot and we
had another question paper in geometry. The art exams for the 5th year
were usually ornamental drawing or still life and a geometry test. In the
7th year the examination was a single test on geometry (Isabel, Lisboa, 7-
05-2001).

Since the revolution of 1974 official policies have regulated the Portuguese

examinations at pre-university level. Between 1974 and 1980 the country was living

through a period of political change after the fall of the dictatorship. Society was

taking the first steps towards a democratic political regime and under such

circumstances education experienced radical changes. Before 1975 education

differentiated students through general courses, which were intended to train an elite, and technological courses, whose students had no access to university entrance. This separation between courses and schools was eliminated after 1975.

Secondary schooling increased to twelve years in 1977, including the foundation

year: ‘ano propedêutico’ (Decreto Lei nº 491/77 de 23 de Novembro) and in 1978

the 10th and 11th years of the ‘complementary course’ were created (Despacho

normativo nº 140-A/78). In 1980, the first 12th year examinations replaced the ‘ano

propedêutico’ examinations (Decreto-lei nº 240 de 19 de Julho de 1980). The

‘complementary courses’ had two modes: (1) ‘via de ensino’ designed for students

pursuing studies in higher education and (2) ‘via profissionalizante’ designed as a

vocational pathway, but students in mode 2 could also gain access to university

courses. The qualifications obtained were based on continuous internal assessment of

each specialism during the course and external examinations in the core specialisms

taken in the 12th year (Despacho nº 67/81). In the early 1980s, national external

examinations were abolished (Despacho nº 23/ME/83) in state schools. Universities

selected their students using schools' internal assessments and their own entrance examinations. For example, Fine Arts schools offering painting, sculpture and architecture courses required students to pass a day-long test of figure or still-life drawing in order to be admitted to the course. However the secondary school art

curriculum did not prepare students for such examinations.

After the ‘liceu’ we had to pass an entrance exam in Fine Arts; and
for that we took extra courses in representational drawing (figure
and still life drawing) with a painter or a sculptor. I remember my
entrance examination in Fine Arts. We had two days of intensive
charcoal drawing from casts (Isabel, Lisboa, 7-05-2001).

The art curriculum in secondary education progressively evolved from a vision of art

education as technical expertise in geometrical drawing to a broader vision with

separate specialisms. Drawing and History of Art were introduced with syllabuses

shaped by formalist rationales. However, Descriptive Geometry was still considered

the most important specialism in the curriculum:

The History of Art only appeared after the 1970s in the secondary art
curriculum; there was reluctance to accept it for entrance because they
valued more the descriptive geometry (Isabel, Lisboa, 7-05-2001).

Drawing as a specialism was introduced in the curriculum for the last years of

secondary education (former 6th and 7th) in 1977. After the analysis of drawing

syllabuses it was found that the content focused on the formal elements of visual

language, media and technical skills. The approach was essentially formalist and the

key concepts for teaching came from Weimar Bauhaus teachers such as Johannes

Itten, Wassily Kandinsky, and Paul Klee. Art appreciation included consideration of

the formal elements of art and design such as line, tone, form, colour, and texture and

the exercises were driven by these visual elements and the ‘rules’ of composition.

In 1986 national examinations in the three core specialisms of each subject were

introduced. In the arts, the examinations included: History of Art; Descriptive

Geometry and Drawing (Despacho nº 10/EBS/86 and Despacho 43/SERE/88-

/SERE/90). The format of these examinations was a test (120 minutes in the case of

Geometry and Drawing and 90 minutes for the History of Art paper) conducted

under controlled conditions. Entrance to university depended on three factors: (1)

the secondary education internal assessment results, (2) external examinations in the

core specialisms, and (3) results of a test called the ‘General Access Test’: ‘Prova

Geral de Acesso’ (PGA) covering general cultural knowledge. The general access

test and core specialism examinations were externally set and assessed by teachers

appointed by the Departamento do Ensino Secundário. The question papers related

directly to the syllabus content because, as Isabel confirmed 'the question papers

were designed by the same people who wrote the syllabuses’. These examinations

have not changed since 1981 and still preserve their original format.

From an analysis of 18 question papers from 1981 to 2002 it was evident that the general structure of the Drawing question paper was preserved unaltered throughout this period. The question paper required a short test (120 minutes), and knowledge, understanding and skills were understood to reside exclusively in understanding the so-called formal qualities of art and design. The visual tasks comprised abstract exercises exploring formal elements of visual language such as, for example: line, form, texture, pattern, balance, composition and rhythm. The recommended medium was pencil, and the question papers required recognition of formal elements in reproductions of Western art works (painting or sculpture),

usually from the twentieth century. The weighting was usually 70 percent for visual

tasks and 30 percent for written tasks. The criteria for assessment included a list of

model answers for the written questions and ambiguous criteria for the visual tasks

related to abilities for exploring visual elements ‘expressively’. The assessors marked

them using a holistic process for judging technical and expressive drawing skills.

These examinations continued into the early 1990s until the introduction of the reform of secondary education following the implementation of a general law for the education system (Lei nº 286/89 de 29 de Agosto). The regulations for assessment in secondary education were approved in 1993 (Despacho nº 338/93) and the first examinations under these new regulations were held in 1995/96 for students in the 12th year of schooling.

4.2. Assessment in secondary education after 1996

4.2.1. General regulations for assessment

The assessment documents, which have regulated secondary education since 1993

(Despacho Normativo nº 338/93), expressed the general democratic intentions of the

Education System Law (Lei de bases do sistema educativo nº 46/86). The first

aim of this assessment model was ‘to promote students’ educational success’.

However, the documents presented a dichotomy between internal and external

assessment. This dichotomy served two different purposes: an internal assessment

system (without any form of moderation or control) seemed to subscribe to the

principles of ‘assessment for learning’, or formative assessment, whereas the external

assessment using more ritualised forms of control seemed to serve the functions of

accountability and selection. Internal assessment offered a measure of freedom for

teachers and schools. External assessment, or national examinations, were based, it is claimed, on parity of treatment as a guiding democratic principle, and this so-called parity of treatment required uniformity in the art and design examinations.

In the interview Isabel stated:

It was important that all students should have the same object [to
draw] to put all the students in the same situation. You know, if
students or schools arrange their own models for drawing some
would be more difficult than others and it wouldn’t be fair (Isabel,
Lisboa, 7-05-2001).

But in art and design such prescriptive methods may conflict with the diverse and

creative outcomes that the curriculum objectives for art education claimed to foster

(Eça, 1999).

Internal summative assessment was described as ‘the global judgement about the

level of students’ knowledge, competencies, capacities and attitudes’ in the official

documents (ME, Desp. Norm. nº 338/93). Assessment outcomes were quantitatively

expressed on a scale from 0 to 20 points as a means of informing students and

parents about student achievement. Internal assessment was more or less continuous,

occurring three times per year. The results were expressed as marks for each

specialism and were awarded by students’ own teachers during teachers’ assessment

meetings that took place at Christmas, Easter and at the end of the course year.

The last meeting of the year established the students’ final marks. External

summative assessment included externally set examinations.

The examinations consisted of written tests devised by the Gabinete de Avaliação

Educacional (GAVE) and the National Jury of Examinations (Juri Nacional de

Exames) implemented the assessment procedures. The final results of secondary

education for the school-leaving diploma were calculated through a combination of

the final internal assessment results for each discipline and the examination results in

core specialisms, which were weighted at 30 percent of the final mark. Students had to achieve a final score of ten in each specialism to be awarded certificates of secondary education. Such weightings gave a significant value to internal assessment in the students' final award and consequently a significant value to teacher assessment.
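As a worked illustration of these weightings (the marks are hypothetical, and it is assumed that internal assessment carried the remaining 70 percent of the weight): a student with an internal assessment mark of 14 and an examination mark of 10 in a core specialism would obtain a final mark of

\[ 0.7 \times 14 + 0.3 \times 10 = 9.8 + 3.0 = 12.8 \]

on the 0-20 scale, comfortably above the pass threshold of ten points.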

4.2.2. National examinations

The principles of education followed since 1986 had been much in favour of

continuous and formative forms of assessment. However, the need for control and accountability of the education system and for student selection (Machado, 1994, p.53) brought about the re-introduction of national examinations, which contradicted the

formative model of assessment favoured in the General Law of Education (Lei de

bases do sistema educativo nº 46/86).

4.2.2.1. Instruments

Members of the Gabinete de Avaliação Educacional (GAVE), a department of the

Ministry of Education, developed the examination papers. The responsibilities of

GAVE included planning, coordinating, designing and controlling external

assessment instruments (Decreto-lei nº 229/97 de 30 de Agosto). The team at GAVE

comprised: (1) a Director for all the subject areas, (2) a coordinator for each

subject area and for each test, (3) two authors (usually secondary teachers) for each

specialism, (4) two or more specialist revisers (auditors), (5) two language revisers,

(6) two text revisers, (7) and one specialist adviser (consultor).

The question papers were written in accordance with a statutory assessment

framework, general aims and objectives for the Portuguese National Curriculum and

national syllabuses adopted for each specialism. The examination paper had a list of

questions with a mark scheme totalling 200 points corresponding to the mark scale (0-20). The design of question papers included some validation procedures, for example: verifying the graphic presentation; checking the clarity and form of the language; and checking that questions were appropriate to students' age and related to the syllabus content. Language and specialist advisers carried out this kind of verification.

The role of the specialist advisers was to provide the expert validation considered

essential for art and design examinations, especially because they were not piloted.

We could not pilot our question papers; we based our evaluation on


the last year examination results. That’s why it is so important to
have the advisers (‘auditores’); they are all art teachers in
secondary schools and they know how it works (Isabel, Lisboa, 7-
05-2001).

At the same time that the examination papers were finalised, schools received the

instructions about the examination. The Departamento do Ensino Secundário

published a list of specific contents to be tested in the examination question papers

and teachers could focus their lessons on these contents. In January/February they

received a form called ‘matrix for the examination question paper’ from the GAVE,

with the content headings, an indication of the number and type of questions and a

list of the materials to be allowed for the examinations. Schools also received a

model (i.e. exemplar) test, usually during March/April. Teachers could prepare

students for the examination by using the model test either as a mock examination,

or by including items from the model test in their own tests during the year.

4.2.2.2. Procedures

The National Jury of Examinations (JNE) was responsible for the process of

delivering question papers and assessment. The JNE was composed of a President

and Vice-President and technical assessors from the secondary department of the

Ministry of Education, coordinators of regions and coordinators of the school centres

(‘agrupamentos’). The JNE responsibilities included the special arrangements and

special considerations for candidates with special educational needs (SEN); the

management of assessment procedures and appeals against examination results.

There were 30 school centres in Portugal, each comprising a cluster of schools in a region. They selected assessors, received students' examination scripts sent by schools, contacted the assessors and arranged the regional meetings

where scripts were distributed. Assessed scripts were returned to the centres, and

the results were sent to the central section of the JNE at Lisbon. This process was

designed to preserve the anonymity of all the candidates.

All examinations were taken at exactly the same time and on the same day all over the country. There were three examination periods. Until 2004 candidates could opt for the ‘first call’ (1ª chamada, at the end of June or early July) or the ‘second call’ (2ª chamada, in mid-to-late July); another call took place in September (2ª fase) for candidates who did not pass or were not able to attend the previous examination periods. In 2004 the ‘2ª fase’ was abolished. This meant that for each subject at least three question papers of a similar level of difficulty were designed.

Examinations were administered under highly controlled conditions to ensure

uniformity and the security of the process. In the instructions for the conduct of

examination (Departamento do Ensino Secundário, 1996), detailed information was

given to schools about timings; how to arrange the rooms; how to invigilate the

examinations; the number and roles of personnel involved; the type of materials that

students were allowed to use; the instructions for checking students’ identity cards;

instructions for filling in the examination response papers; the exact time to open the

envelopes containing the question papers (which were transported by police

officers); how to attach the pages of scripts; the exact time to collect scripts;

the counting of scripts; and how to deal with SEN students' requirements.

During the examinations, government inspectors might visit schools. In each room,

there were two invigilators: teachers who were non-specialists in the examination

subject. In a separate room, a specialist teacher should be available to resolve

practical problems that might arise. Students were not permitted to leave the room

during an examination.

Teachers usually had to mark the examination scripts of students from other schools in the cities and towns within their geographical region. The process was coordinated at local level by the school centres in each area. Schools had to propose to the school centres a list of teachers/markers proportional to the number of students registered for examinations. For example, a school with 100 students registered for an MTEP examination was expected to nominate two markers (correctores). Teachers could not refuse to undertake this task.

Meetings with all the teachers/markers attached to school centres were convened to

receive students' scripts and discuss the criteria for assessment. These meetings were supposed to establish dialogue between markers about the application of the criteria. Each student's script was marked by a single assessor using criterion-referenced procedures.

The assessors returned the scripts with completed marking forms to the school

centres. The JNE collected the marks and sent them back to schools by end of July.

Candidates could ask for a review of their examination marks (appeal) and were required to explain the reasons for the request in a written report (Despacho normativo nº 13/2002). Another teacher from the school centre (relator) dealt with the appeal and re-marked the script. The second mark prevailed. If the candidate was not satisfied with the result, he/she had one more opportunity to appeal, directly to the Ministry of Education.

According to the Ministry of Education database (Departamento do Ensino

Secundário), 12,755 examination scripts in all subjects were revised in 1998

following appeals: 11,776 marks were changed to a higher grade and 979 to a lower

grade. The Portuguese Council of National Examinations for Secondary Education

(CNEES, 1998) acknowledged that there was a need to pay more attention to the

variations in the interpretation of criteria in assessing examinations.

In 2000, 72 percent of the candidates asked for appeals. In 2001 the Jornal de

Notícias (25/April) published an article about appeals during the 2000 examinations

based on statistical data which revealed that candidates had a strong chance of

increasing their marks on the second appeal. For example, in one secondary school,

61 out of 72 marks were increased. Particular cases of candidates asking for a second

appeal were highlighted; for example one candidate, who obtained 13 points in the

History of Art paper, appealed twice, and maintained the score in the first appeal, but

after a second appeal her mark was changed to 17.8 points.

Year 12 examinations were covered extensively in the press. Newspapers published

the question papers for subjects with large numbers of candidates in the core

specialisms for university entrance. Before 2001, examinations results were not

released publicly but political pressures during 2001 forced the government to

publish the examination results of all 625 secondary schools. The league tables

generated huge public debate about the ranking of schools. The left wing deputies in

parliament argued that the ranking of schools was undemocratic, increasing

discrimination between schools; the right wing deputies claimed that not publishing

examination results was a plot to hide mediocrity and leniency (Público, 26 April

2001). The results raised a controversy about accountability and the quality of

education. The Diário de Notícias (27 August 2001) and Público (26 August 2001) drew attention to the great disparity between internal assessment and examination results, arguing that:

Although such results express two different kinds of assessment: one continuous and formative [internal results] and the other punctual and eliminatory [examinations], the continuous interval pattern between the two assessment results doesn't seem normal…. (Diário de Notícias, 27 August 2001, Educação III).

The Diário de Notícias article was illustrated with specific and often extreme cases.

For example one student’s internal mark in Descriptive Geometry was 18 points and

the examination mark was 6.2 points. But there were other examples where students

obtained much better marks during examinations than during internal assessment.

The newspaper’s interpretation of such discrepancies did not question causes related

to the system of examinations but rather focused on possible grade inflation by

teachers and schools that over-marked or under-marked students.

According to statistics published by the government (Departamento do Ensino Secundário, 2001), in 2001 there were differences between internal assessment and examination results in the art specialisms, notably: (a) in Descriptive Geometry, 3 points (general courses) and 4.3 points (technological courses); (b) in History of Art, 3 points; (c) in MTEP, 2.5 points; (d) in Theory of Design, 2 points (general courses) and 0.3 points (technological courses); and (e) in Theory of Art and Design, 1 point. Such differences may not appear particularly significant at first glance; however, these figures gave an incomplete picture of the examination results, since the Departamento do Ensino Secundário did not publish other statistics such as minima, maxima or standard deviations.

4.3. The model of Portuguese secondary art curriculum

4.3.1. Origins of current art curriculum

Art examinations between 1993 and 1996 were internally set and assessed. Students'

final results for qualifications and university entrance were expressed through

internal marks (Despacho 16/SEEBS/93). The current art examinations appeared

after the curricular revision for secondary education that was implemented in 1996.

Isabel was actively engaged in this curriculum revision. She said:

I was involved in the last curricular revision as a syllabus


developer; it was a very interesting programme for the arts.
They [the Ministry] introduced Theory of Design in the
secondary art curriculum; they thought it was important
because design was present in all sort of art courses, for

example; textiles; ceramics; product design [courses at
specialist art schools such as Antonio Arroio at Lisboa and
Soares dos Reis at Porto]. Art teachers from Antonio Arroio
and Soares dos Reis schools were also invited to help to
design the art curriculum for secondary education.

Other content considered to be part of all specialisms


was materials and techniques. So they decided to make
a subject called Materials and Techniques. We said
that there was nothing in the curriculum specialisms
related to art production, and they said that Materials
and Techniques should be directed towards art
production; so we decided to replace the Drawing
specialism with Materials and Techniques. We also
decided to call it Materials and Techniques of Plastic
Expression rather than just Materials and Techniques.

We had to create theoretical subjects. It was their idea.


They said we had to design theoretical specialisms.
When we did the syllabuses we didn’t think about
forms of external assessment; there were no external
examinations in the beginning of the ‘nineties, just
formative and summative internal assessment (Isabel,
Lisboa, 7-05-2001).

From an analysis of the syllabuses of the various art specialisms (Departamento do Ensino Secundário, 1995) a conclusion was drawn that the structure of the art and design curriculum was fragmented by rigid boundaries between specialisms and by two main approaches: theoretical, as in Descriptive Geometry, History of Art and Theory of Design; and theoretical-practical or studio production, as in MTEP and Studio Art/Design and Technologies. The main learning domains in each specialism

were technical and perceptual, although contextual and critical studies were also

expected to be a component of Theory of Design, History of Art and Studio Art.

The MTEP and Studio Art syllabuses presented a broad range of aims related to

knowledge, understanding and skills of art and design production including ‘artistic

and aesthetic values’, ‘problem-solving skills’, ‘creativity and critical skills’,

‘critical awareness’, and ‘research’. However making such a clear demarcation

between theory and practice could reduce the opportunities for students to develop

contextual and critical studies of art and design in studio art production.

The same demarcation existed in the recommended assessment instruments for art

specialisms. Tests were recommended for internal assessment in Descriptive

Geometry, History of Art, Theory of Design, and for MTEP. In Studio Art/ Design

and Technologies the recommended instruments for assessment were very flexible

and mainly based on coursework components. External assessment or national examinations in the arts existed, in Portugal, only for specialisms widely viewed as more 'objective': Descriptive Geometry, History of Art, Theory of Design, Theory of Art and Design and MTEP. The national examinations in these specialisms reinforced their status in the curriculum and at the same time emphasised technical drawing per se, and the history of art and the theory of design as encyclopaedic knowledge separate from art production.

MTEP included a broad range of assessment criteria such as: ‘quality; quantity;

organisation; ability to faithfully copy reality; ability to communicate ideas;

expressiveness; creativity; problem solving’. The assessment criteria specified for

Studio Art were as follows:

    Capacities for observation; involvement in research; investigation and collection of data; curiosity about artistic phenomena and personal investigation; capacities for image analysis; capacities for organisation of data; technical abilities in expressive and graphic media; creativity; imagination; experimentation; formulation of relevant questions; involvement and integration in individual and team work; persistence; representational skills; critical skills, inventive and intervention abilities; use of knowledge (GETAP, 1992, p.13).

These criteria were recommended for internal assessment, which was considered to be both formative and summative. Each teacher marked his/her students' work according to his/her own interpretation of the criteria. Besides the list of criteria and objectives there were no other written documents to help students and teachers.

In practice, internal assessment depended on teachers' various, often idiosyncratic,

interpretations of criteria without any sort of procedures to ensure consistency of

standards or inter-marker reliability.

4.3.2. Assessment instruments in Portuguese art examinations

Compared with disciplines like Portuguese language and mathematics, the art disciplines had small numbers of candidates. The total art examination entry in 2001 was as follows: MTEP (general courses), 3,627 candidates; Descriptive Geometry (general and technological courses), 7,168 candidates; History of Art (general and technological courses), 6,008 candidates; Theory of Art and Design (technological courses) and Theory of Design (general and technological courses), 8,435 candidates (source: Departamento do Ensino Secundário, 2001).

The fact that History of Art and Descriptive Geometry attracted the largest numbers of examination candidates was evidence of the comparative status of these specialisms. However, this apparent ranking could be because they were considered to be more 'objectively' assessable, especially Descriptive Geometry. As Isabel pointed out:

…and besides descriptive geometry was very easy to assess (Isabel,


Lisboa, 7-05-2001).

The assessment instruments used for national examinations in the arts were uniform

written tests covering the syllabus contents for year 12 in each specialism.

The Descriptive Geometry, Theory of Design, Theory of Art and Design, and History of Art written tests lasted 120 minutes, while the Materials and Techniques of Plastic Expression examination was considered a theoretical and practical test and lasted 210 minutes (Despacho normativo nº 13/2002). The Descriptive

Geometry examination paper included a set of exercises about conventional

representational systems. The History of Art, Theory of Art and Design and

Theory of Design question papers were a set of open-ended questions requiring

recall, contextual and perceptual analysis of art and design objects and art and

design movements.

The only specialism with question papers offering a more comprehensive range of art

and design production opportunities was Materials and Techniques of Plastic

Expression (MTEP). An analysis of 20 MTEP question papers, from 1996 to 2002,

revealed that the structure of the test did not change over the years. The test included

opportunities for written and visual responses. The required tasks were an observational drawing exercise (first part) and, in the second part, a simulated design problem-solving exercise requiring annotated sketches, a more finished sketch as the final solution, and a written explanation of the proposed intentions and materials (see example in Appendix V).

The criteria and weightings for assessment in the examination did not vary greatly over time. In the 2002 question paper they were as follows (a sketch of how the weighted components aggregate is given after the list):

• Part I: Representational drawing (60 points): Ability to represent
the object faithfully (form and proportions); Personal interpretation
(expressiveness and global effect).
• Part II (1) Preparatory studies and sketches (60 points): Capacity to
present alternative different solutions (creativity). Capacity to communicate
ideas in visual form. Adequateness of the techniques and materials used in
the sketches. Expressiveness.
• Part II (2) Final solution (60 points): Adequate proposal for the
solution of the problem; Composition and organisation, Space-form; Plastic
qualities.
• Part II (3) Written Report (20 points): Description of the process;
Justification of the options; knowledge of the properties of materials and
techniques.
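These component maxima sum to 200 points, which corresponds to the 0-20 scale used for reporting Portuguese secondary marks. The following minimal sketch (in Python; the component names, the clamping behaviour and the rescaling to 0-20 are illustrative assumptions, not part of the official marking documentation) shows how a total could be aggregated from the four components:

# Minimal sketch of aggregating a 2002 MTEP mark from its components.
# Component maxima follow the published weightings; the rescaling to
# the 0-20 reporting scale is an assumption made for illustration.

COMPONENT_MAXIMA = {
    "part_I_representational_drawing": 60,
    "part_II_1_preparatory_studies": 60,
    "part_II_2_final_solution": 60,
    "part_II_3_written_report": 20,
}  # total: 200 points

def aggregate_mtep(marks):
    """Sum component marks (clamped to their maxima) and rescale to 0-20."""
    total = 0.0
    for component, maximum in COMPONENT_MAXIMA.items():
        awarded = marks.get(component, 0.0)
        total += min(max(awarded, 0.0), maximum)  # clamp to [0, maximum]
    return round(total / sum(COMPONENT_MAXIMA.values()) * 20, 1)

# Example: a middling script scoring 103 of 200 raw points -> 10.3 of 20.
print(aggregate_mtep({
    "part_I_representational_drawing": 35,
    "part_II_1_preparatory_studies": 30,
    "part_II_2_final_solution": 28,
    "part_II_3_written_report": 10,
}))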

Isabel described some of the constraints faced by an examination developer. The tight

structure of a curriculum based on separate specialisms and examination regulations

in Portugal left very little opportunity for creating authentic assessment situations.

The only question paper that included art production, MTEP, was reduced to basic

technical skills of representational drawing and some problem solving skills.

According to the coordinator of the developers of the MTEP examination question paper (Isabel), the test was created as a remedial strategy to compensate for the lack of drawing skills elsewhere, and its type and format were constrained by the general regulations for examinations in the art curriculum. Isabel claimed that under such constraints it was impossible to design examinations that exhibited content and theory-based aspects of validity:

But, suddenly they said that all the subjects in the specific
areas should have external examinations; it didn’t fit our
syllabuses; and worse, they said that examinations should be
just theoretical pencil and paper tests. We tried to convince
them to give more time for practical question papers; at
least in MTEP, we proposed 15 days for the examination
time, but they didn’t accept. So we had to create these
question papers for MTEP. We decided to focus on drawing
skills since drawing was absent from the art curriculum
(Isabel, Lisboa, 7-05-2001).

In the end they gave 3 hours for the MTEP examination. So,
we invented these question papers for MTEP. We wanted
students to make drawings and we had to think about a
model for drawing which could be easily sent by post for
schools and be the same for everyone. And we did manage
to make the cardboard model (Isabel, Lisboa, 7-05-2001).

Figure 8: MTEP question paper cardboard model (question paper, September 2002)

But we were bound by the syllabus, which is oriented


towards problem solving; then we included the second part
of the question paper as a problem solving exercise; of
course it is not correct; but we were bound by the time of
the examination (Isabel, Lisboa, 7-05-2001).

The Portuguese art teachers' association, the 'Associação de Professores de Expressão e Comunicação Visual' (APECV), also endorsed the view that the MTEP art examinations were not a valid reflection of the course in terms of content, face, construct and theory-based validity:

In this discipline [MTEP] a pencil and paper test like the current one is
not valid to completely assess the knowledge, understanding and skills
of candidates in the arts (Parecer MTEP1.1.2002, APECV).

4.3.3. Assessment procedures in Portuguese art examinations

Another thing we asked for was a second assessor; but they refused
it. A second assessor would certainly improve the reliability of the
results; you know art examination results are not very reliable
(Isabel, Lisboa, 7-05-2001).

Analysis of the Portuguese documents suggested that assessment relied on teachers' judgements in both internal and external assessment. There were no regulations or procedures to verify the consistency of marking: controlling and monitoring assessment procedures seemed to have little importance in the documentation.

The fact that marking was a solitary task revealed a complete but misplaced trust in

teachers’ understanding of standards and criteria as evidenced by the very high

percentage of appeals. At the same time standards were tacit, since there was no

explicit public information about them. The written criteria for arts provided by

syllabuses, instructions for the examinations and question papers were vague and appeared to be no more than a repetition of the objectives in each specialism.

There was no formal system for training assessors or formal meetings to debate or

develop common understanding of standards and the application of criteria.

The only process to 'standardise' assessors, established by the JNE through the examination instructions, was the markers' meeting convened at each school centre for the reception of students' completed scripts. Copies of the minutes of these meetings

in the 2001 examinations were requested from five school centres after receiving

formal authorisation from the JNE. The school centres were: Porto Cidade; Coimbra

Centro; Lisboa Central; Faro and Viseu. The agenda for these meetings was defined by regulation nº 48/Norma 02/2000: (1) working through the question paper and discussing the various processes of solving the problems; (2) debating the application of the criteria provided by GAVE; and (3) reporting any information requested (by phone or fax from GAVE or the JNE) in cases of doubt about the question paper or the application of criteria.

Each school centre provided six sets of minutes corresponding to the different

meetings for MTEP and Drawing (1st and 2nd call and 2nd phase) – 30 in total.

Two sets of minutes reported that there was no meeting because the markers did not all attend at the same time. In 80% of the minutes the text was simply 'nothing to report'. The text of the other minutes was translated as follows:

    We are very pleased because this question paper is in accordance with the model test and matrix sent to schools at the beginning of the year, which did not happen last year [MTEP question paper].

    There were omissions in the text of the question paper; the missing words are in … [MTEP question paper].

    Information requested from GAVE: 'how to mark the candidates who used colour pencils, ink and other materials in Group III?' [the question required exclusively the use of graphite]. Answer from GAVE: 'the student must be penalised under the criterion graphic quality of the composition' [Drawing question paper].

    The assessors consider this test too easy for the 12th year. Some students use this exam as a core discipline for university entrance [Drawing question paper].

From the minutes, two serious concerns about the MTEP assessment instrument were evident: in the previous year the MTEP examination instructions had not been in accord with the question paper, and the 2001 MTEP question paper was not clearly legible because of graphic design errors. The minutes concerning the Drawing question paper were also interesting, especially because the question paper was considered 'too easy' for the level by one centre.

Consideration of these minutes raised two different hypotheses: (1) that teachers

discussed the application of criteria but did not record the results of the discussion in

the minutes, or (2) teachers did not discuss the criteria at all. According to the

minutes, it seemed that teachers did not feel the need to ask for any explanations

about marking from the JNE and GAVE.

Isabel considered that the reliability of assessment results was a problem created by

assessment procedures based on single assessors. Another explanation for unreliability could be that assessment instruments were designed by one organisation (GAVE) while a separate body, the National Jury of Examinations (JNE), managed assessment procedures, resulting in a lack of coherence between the examination instrument and the examination procedures.

In order to have precise data with which to check the inter- and intra-rater reliability of the current examination, the researcher asked ten MTEP examination assessors to mark thirty-six scripts from a mock MTEP examination taken in September 2002 by 12th year art students. The assessors, selected from those who had returned the survey questionnaire, marked the scripts between January and April 2003. The sample of teachers varied in age, sex, geographical location, years of teaching and experience of art examination assessment. Each teacher assessed nine scripts on two different occasions, and all the teachers marked six scripts in common. The differences between teachers' marks ranged from five to seven points on a scale of 20 (inter-rater reliability), and the differences between marks awarded by the same teacher on different occasions (intra-rater reliability) varied from zero to six points on the same scale.

Current examination (MTEP)
Differences between markers (inter-rater), over 36 scripts:        5-7 points
Differences for the same marker (intra-rater), over 9 scripts:     0-6 points
Table 2: Current examination differences of marks (MTEP)
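The figures in Table 2 rest on two simple computations: for each commonly marked script, the inter-rater spread is the gap between the highest and lowest marks awarded, and for each marker the intra-rater difference is the gap between their marks for the same script on the two occasions. A minimal sketch follows (Python; the marks shown are invented placeholders, not the marks actually collected in the study):

# Sketch of the inter- and intra-rater difference calculations.
# Marks are on the Portuguese 0-20 scale; the values below are
# invented for illustration, not the study's actual data.

from collections import defaultdict

# marks[(marker, script, occasion)] = mark awarded
marks = {
    ("T1", "S1", 1): 12.0, ("T2", "S1", 1): 17.5, ("T3", "S1", 1): 14.0,
    ("T1", "S1", 2): 13.5,  # same marker, same script, second occasion
}

# Inter-rater: spread of first-occasion marks per commonly marked script.
by_script = defaultdict(list)
for (marker, script, occasion), mark in marks.items():
    if occasion == 1:
        by_script[script].append(mark)
for script, ms in by_script.items():
    print(script, "inter-rater spread:", max(ms) - min(ms))  # S1: 5.5

# Intra-rater: |first occasion - second occasion| per marker and script.
for (marker, script, occasion), mark in marks.items():
    if occasion == 1 and (marker, script, 2) in marks:
        print(marker, script, "intra-rater difference:",
              abs(mark - marks[(marker, script, 2)]))  # T1, S1: 1.5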

Another important aspect of reliability is the consistency of markers’ interpretations

of criteria. The minutes of some MTEP examination appeals procedures were helpful

in understanding how teachers applied the criteria. Copies of 40 reports from appeals

(2001 examinations) were requested from five school centres (20% of the total

number of centres) after having received formal authorisation from the JNE. The

following statement was quoted from one report:

    … in the sketches there is an evident failure in the capacity to communicate visually the ideas; there is no correct representation of space and no correct framing of the presented solutions in the space for the scenario, particularly in the first and second sketches. There is no representation of the perspectives and principal plans (face and floor); the chosen materials for the solutions are not the most adequate for a scenario, for example the stone. The used media are not well controlled. In the final solution, there is no representation or perspective of the scenario plans; space representation is incorrect … The final drawing is not elaborated in terms of use of media (colour pencil); there is no care in making a finished and expressive drawing. The overall impression is that the work reveals problems of organisation and composition and an unsatisfactory plastic treatment (061 minute report).

Clearly, the marker was concerned with the student's technical expertise in representing space and with his or her use of techniques and materials. The marker's approach was negative, pointing out what the student was incapable of doing rather than what he/she could do. This is another example:

    In group I there were some errors in the faithfulness of the form and proportions and problems in expressiveness and global effect. In group II, a) there were errors in creativity and in the capacity for visual communication of ideas; problems in articulating techniques and materials; in the sketches, errors of expression. In group II, b) errors in articulating the final solution with the problem, weak organisation, composition and plastic treatment (551 report).

In group I, 3 points were awarded because of errors in the correct


representation of form (faithfulness to the model). In group II question
a) 2.8 points were awarded because of errors in creativity; difficulties
in communicating the ideas visually; problems in using materials and
techniques (462 report).

The marker seemed to use terms derived from the criteria in the reports, and marked by penalising the candidate, in other words by subtracting points for the observed technical difficulties. Some explanations, such as 'errors in creativity' or 'errors of expression', are hard to decipher. In the next example the same kind of unclear discourse occurs, based on terms like 'failures':

In Group I, 2.5 points were awarded because although the form


represented respected the proportions of the object there were evident
failures in the use of value, there is no representation of the projected
shadow of the object over the plan of the table and no accurate
definitions of the edges of the object. There were too many dark zones
in the representation of plans and in general the drawing is too dark.

In group II question (a). 2.5 points are awarded to the sketches because
there is no diversity of proposals that indicates a failure in creativity.
Besides the proposed materials are not the best indicated for the
solution, being too heavy, e.g. concrete and iron… (report 060).

All the judgements in the reports were based on academic criteria such as the use of

formal visual elements, techniques and materials. The main justification for awarding

marks was based on concepts of ‘composition’, ‘formal organisation’, and ‘graphic

adequacy’. Personal interpretation or style was not mentioned. In problem-solving

exercises the main justification was ‘the candidate is unable to communicate ideas

visually'; the reports did not mention whether the candidate took creative risks in the sketches of their proposals. Creative and expressive qualities were not described;

creative work was vaguely referred to, usually in relation to the number of sketches

and the appropriateness of chosen materials for the final product. The judgements

depended on overall impressions. But this was not surprising given the inadequate

assessment criteria. Nevertheless, these transcriptions of the minutes should be seen

as an imperfect indicator of the process of marking because it might be difficult for

teachers to translate their judgements about visual products into written language.

4.4. Users’ perceptions of Portuguese art examination practices

4.4.1. Respondents

As described in Chapter 3 (3.4.2.1, p. 90), a survey questionnaire was designed to obtain the general perceptions of teachers/assessors and students about external art assessment in Portugal (see Appendix VII). Forty-four art assessors and one hundred

and four students replied to the questionnaires. The final sample of teachers/assessors

and students included a wide variety of geographical locations, ages, teachers’

experience of assessment and students’ art courses. Respondents’ ages, sex,

geographical location, and other characteristics were distributed as follows:

Age rank   Number   Sex   Number   Region    Number   University course   Number
18-21      72       M     36       North     40       Architecture        5
22-25      24       F     68       Centre    31       Fine arts           23
26-29      8                       South     29       Design              26
                                   Islands   4        Education           50
Table 3: Questionnaire survey sample of students

Age rank   Number   Sex   Number   Region    Number   Years teaching   Number   Years as assessor   Number
27-35      5        F     26       North     14       6-10             4        1-3                 13
36-40      14       M     18       Centre    14       11-15            17       4-6                 26
41-45      17                      South     12       16-20            12       7-10                4
46-55      8                       Islands   4        21-25            9        11-15               1
                                                      26-31            2
Table 4: Questionnaire survey sample of teachers/assessors

4.4.2. Description and analysis of questionnaire results

4.4.2.1. Syllabuses (Section 2)

The respondents had different views about the value of the various art specialisms in

the curriculum (question 1.8). Sixty-eight respondents, 45.9 percent, considered that

all the art specialisms were important. For the others, Studio Art was considered the most important specialism (45 respondents), followed by History of Art (25 respondents), MTEP (12 respondents), and Descriptive Geometry and Theory of Design (11 respondents).

The existing aims of art specialisms were considered appropriate for art and design

courses by 68.6 percent of respondents (question 2.1), while 65.4 percent of

respondents considered that the objectives and contents described in syllabuses were

adequate for art and design courses (question 2.2). A further 51.1 percent considered

that art and design syllabus at school provided a good basis for further studies

(question 2.3).

There was no significant variation of responses between students (SR) and teachers

(TR) in the first two questions 2.1 (aims), 2.2 (objectives and contents). In question

2.3 which asked for a response to the statement: ‘The art and design syllabus at

school provides a good basis for students' further studies’, teachers and students

responded differently.

Question 2.3      1-Strongly agree   2-Agree   3-Disagree   4-Strongly disagree   Median
Teachers (TR)     2.5%               62.5%     25%          10%                   2.00
Students (SR)     5.1%               40.4%     46.5%        8.1%                  3.00
Table 5: Responses to question 2.3
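The medians in these tables follow from coding each response 1 (strongly agree) to 4 (strongly disagree) and taking the middle value of the coded responses for each group. A minimal sketch of that computation (Python; the response counts are invented for illustration, not the survey's raw data):

# Sketch of how the reported medians arise from Likert codes
# (1 = strongly agree ... 4 = strongly disagree). Counts are invented.

import statistics

def expand(counts):
    """Turn {code: count} into a flat list of coded responses."""
    return [code for code, n in sorted(counts.items()) for _ in range(n)]

teachers = expand({1: 1, 2: 25, 3: 10, 4: 4})    # 40 responses
students = expand({1: 5, 2: 40, 3: 46, 4: 8})    # 99 responses

print(statistics.median(teachers))  # 2.0 -> reported as 2.00
print(statistics.median(students))  # 3   -> reported as 3.00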

The tendency of students to disagree was illustrated by the following student’s

comment at the end of the questionnaire:

History of Art and Descriptive Geometry specialisms are generally well


taught and the examination question papers are adequate for the courses.
But they are not really relevant in university. Secondary courses do not
prepare students for the core disciplines of the university at least in
plastic art courses (SR 16, section 10)

4.4.2.2. Opinions about the validity of the examination (Section 3 and 4)

Respondents’ opinions about the validity of art examinations suggested considerable

dissatisfaction with the current model. Although 58 percent of respondents

considered that instructions (rubric) for examinations were clearly stated (question

4.1), only 48.8 percent thought the questions were clear (question 4.2). One student's comment in section 10 of the questionnaire was: 'The language of question papers is not very clear' (SR 80). But 4.3 percent of the respondents strongly agreed that the questions were clear and 7.1 percent strongly disagreed.

The time allowed for answering the question paper was considered adequate by 57.2

percent of respondents (question 4.3). And 50.4 percent of respondents considered

that the questions in the examination paper were similar in their form and content to

tasks set in courses (question 4.4), however 9.4 percent strongly disagreed with this.

Of the respondents, 56.1 percent disagreed and 29.5 percent strongly disagreed with

the statement in question 4.5: ‘students are given a good opportunity in the

examination to show well what they know, understand and can do in and about art’.

The tasks set out in the question papers were not considered adequate: 58.7 percent

of the respondents disagreed and 27.5 percent strongly disagreed with the statement

in question 4.6: ‘The examination allows the most able students to demonstrate their

true ability’. Furthermore 52.2 percent of respondents considered that the

examination did not allow students to display the knowledge, understanding and

skills set out in the syllabuses (question 4.8). And 55.7 percent of the respondents

disagreed and 17.1 percent strongly disagreed with the statement (question 4.9):

‘The examination questions or tasks focus on highly relevant knowledge,

understanding and skills for future artists and designers’. Furthermore the current

criteria in the examinations were not considered clear by the majority of the

respondents, 55.8 percent disagreed and 18.1 percent strongly disagreed with the

statement in question 4.7 that: 'The assessment criteria in the examinations are

clearly stated’.

Teachers gave varied responses to the statement in question 4.4: 'The questions

in the examination paper are similar in their form and content to the tasks set in

the classroom’. The same was the case for the statement in question 4.8:

‘The examination allows students to display the knowledge, understanding and skills

set out in the syllabuses’.

Question 4.4      1-Strongly agree   2-Agree   3-Disagree   4-Strongly disagree   Median
Teachers (TR)     –                  45%       40%          15%                   3.00
Students (SR)     4%                 48.5%     40.4%        7.1%                  2.00
Table 6: Responses to question 4.4

Question 4.8      1-Strongly agree   2-Agree   3-Disagree   4-Strongly disagree   Median
Teachers (TR)     –                  37.5%     55%          7.5%                  3.00
Students (SR)     2%                 50%       45%          3%                    2.00
Table 7: Responses to question 4.8

According to one student’s comment in section 10, the examinations did not cover

all the contents of the course and students could take advantage of this:

We do not need to be good during internal assessment of


secondary courses. In fact we can choose to cancel our

registration and pass the exams as external students; this
has advantages for students because we only need to study
the contents of the exam question paper and not the whole
course syllabus (SR 18).

The majority of the respondents thought that the examinations were unbiased: 89.2 percent considered that they were free from gender bias (question 4.10); 70.7 percent considered that the examinations were free from bias related to where students live (question 4.11); 84.8 percent agreed with the statement 'The examinations are free from bias related to racial/ethnic background of the students' (question 4.12); and 97.8 percent considered that examinations were free from social class bias (question 4.13). However, students' and teachers' responses to the questions related to examination bias (4.10-4.13) varied: while students seemed unaware of bias, teachers recognised the possibility of gender and geographical bias.

Question 4.10 (no gender bias)   1-Strongly agree   2-Agree   3-Disagree   4-Strongly disagree   Median
Teachers (TR)                    5%                 35%       47.5%        12.5%                 3.00
Students (SR)                    48%                35%       16%          1%                    1.00
Table 8: Responses to question 4.10

Question 4.11 (no local bias)    1-Strongly agree   2-Agree   3-Disagree   4-Strongly disagree   Median
Teachers (TR)                    5%                 35%       47.5%        12.5%                 3.00
Students (SR)                    48%                35%       16%          1%                    1.00
Table 9: Responses to question 4.11

Question 4.12 (no racial bias)   1-Strongly agree   2-Agree   3-Disagree   4-Strongly disagree   Median
Teachers (TR)                    20%                42.5%     30%          7.5%                  2.00
Students (SR)                    55.1%              38.8%     4.1%         2%                    1.00
Table 10: Responses to question 4.12

Question 4.13 (no social class bias)   1-Strongly agree   2-Agree   3-Disagree   4-Strongly disagree   Median
Teachers (TR)                          22.5%              40%       30%          7.5%                  2.00
Students (SR)                          51%                41.8%     7.15%        –                     1.00
Table 11: Responses to question 4.13

Responses to questions about assessment (section 3) revealed that both teachers and

students considered it was important that students know and understand the criteria

being used to assess their work; 96.6 percent agreed with the statement 'It is

essential that students know exactly what qualities the teacher is looking for in

students’ work’ (question 3.1), and 97.8 percent agreed with the statement: ‘It is

important that students understand the criteria used to assess their work’ (question

3.2). Respondents also strongly agreed with the statement that students should

participate in the development of criteria; 91.3 percent considered that 'students

should have some say about the criteria used to assess their works’ (question 3.3).

4.4.2.3. The ideal form and structure of art and design examinations

(section 5)

The respondents considered art and design examinations important: only 28.6 percent of them agreed with the statement that there should be no final examinations in art and design courses (question 5.4). It was also evident that they preferred assessment instruments other than the current pencil and paper tests to be used in examinations.

Only 5.7 percent of respondents thought that art examinations should consist only of

questions requiring a written response (question 5.1); 29.3 percent were in favour of

art examinations consisting only of tasks requiring practical studio-based responses

(question 5.2) and 89.3 percent favoured art examinations consisting of

questions/tasks requiring a combination of written and practical responses (question

5.3). However some differences were found between teachers’ and students’ views

in response to the questions about the nature of artworks to be submitted for

examinations. Table 12 shows the responses to the statement in question 5.1:

‘Art examinations should consist only of questions requiring a written response’.

Question 5.1      1-Strongly agree   2-Agree   3-Disagree   4-Strongly disagree   Median
Teachers (TR)     –                  5%        52.5%        42.5%                 3.00
Students (SR)     2%                 4%        42%          52%                   4.00
Table 12: Responses to question 5.1

Table 13 below shows responses to the statement in question 5.3:

‘Art examinations should consist of questions/tasks requiring a combination of

written and practical responses’.

Question 5.3      1-Strongly agree   2-Agree   3-Disagree   4-Strongly disagree   Median
Teachers (TR)     55%                37.5%     5%           1%                    1.00
Students (SR)     48%                40%       8%           4%                    2.00
Table 13: Responses to question 5.3

Respondents considered the introduction of portfolios to be a valid assessment

instrument for art examinations; 75.9 percent believed that individual portfolios

provide the best evidence for assessing the work of art and design students (question

5.5). Similarly coursework was understood to provide sound evidence for

examinations; 80 percent of the respondents agreed with the statement: ‘I believe

that final art and design assessment should be based entirely on coursework’

(question 5.6).

Self-assessment was considered important by 69.8 percent of respondents (question 5.7). However there were some differences in the responses of teachers and students. As shown in Figure 9, teachers seemed to favour self-assessment as evidence for examinations more than students did: 80 percent of teachers agreed with the statement 'students' self-assessment should be taken into account for the final examination', whereas only 65.7 percent of students were in favour.

[Bar chart comparing the percentage distributions of teachers' and students' responses, coded 1 (strongly agree) to 4 (strongly disagree) plus missing, to the self-assessment statement.]
Figure 9: Responses to question 5.7

Respondents stated their preferences for the types of evidence of student

achievement to be submitted in art examinations (group of questions 5.8).

They prioritised as follows:

(1): Final products: 92.8 percent


(2): Developmental studies: 90.6 percent.
(3): Records of students’ intentions and how these are realised: 86.9 percent
(4): Preliminary investigation and research work: 77.9 percent.
(5): Work journals/ annotated sketchbooks: 77.1 percent.
(6): Self-assessment notes or reports: 73.4 percent.

(7): Notes of group criticisms: 72.8 percent.

Of the respondents 66.9 percent agreed with the statement that question papers and

criteria should be internally set by teachers and externally approved (question 5.9).

The frequency of responses to questions about the duration of art and design

examinations was inconclusive. The reason for this could be that the questions were

not sufficiently well constructed, and that the respondents thought they should answer each of the sub-questions in question 5.10 separately rather than choose just one option.

Some comments added at the end of the questionnaires (section 10) revealed that

individual students had strong opinions about how their work should be assessed.

For example:

The ideal procedure should be continuous assessment that


includes student’s achievement and progress through
evidence like preliminary studies showing the processes
used and constant dialogue between teacher and student.
The student must know how to justify her work; and this
must be also assessed. It is absurd to have examinations
based on one single piece of work because it is impossible
for the student to reveal all her knowledge, understanding
and skills in art in a single piece of work (SR 104).

The exam should consist of the presentation of a project


developed during the year of study; the project should be
marked by the student’s own teacher and by an external
assessor (SR 102).

4.4.2.4. Opinions about current procedures (section 6)

Of the respondents, 59.4 percent stated that students usually read the instructions for marking (question 6.1); 44 percent thought that the assessment matrix and the instructions about how work would be marked were clearly explained to students before the examination (question 6.2); and 38.2 percent considered existing mark schemes entirely appropriate for judging their artworks (question 6.3). But 53.8 percent of the teachers considered that guidance about how to mark students' work was unclear (question 6.4).

A difference of opinions between teachers and students was evident in responses to

the statement in question 6.2 that ‘The assessment matrix and how work will be

marked is explained clearly to students before the examination’. The majority of

students disagreed while the majority of teachers agreed with this statement.

Question 6.2      1-Strongly agree   2-Agree   3-Disagree   4-Strongly disagree   Median
Teachers (TR)     –                  53.8%     33.3%        12.8%                 2.00
Students (SR)     7.4%               32.6%     42.1%        17.9%                 3.00
Table 14: Responses to question 6.2

4.4.2.5. Reliability (section 7)


Responses to the questions about inter-rater reliability revealed that 61.8 percent of the respondents did not believe that assessors always marked students' work

consistently (question 7.1); 90.6 percent of them thought that their interpretations of

assessment objectives and criteria varied (question 7.2), and 94.2 percent thought

that the same student work could be marked differently by two different assessors

(question 7.3). Furthermore the answers to the question 7.4 were confirmation of

general distrust in the reliability of examinations: 79.1 percent of the respondents

agreed with the statement: ‘In my opinion it is only a matter of luck or personal

preference whether an assessor likes a particular work’. Intra-rater reliability was

also considered problematic: 72.3 percent of respondents considered that assessors

were not always consistent and might mark the same work differently at different

periods of time (question 7.5). Different tendencies in the views of teachers and students were evident in responses to the statements in questions 7.1 ('Assessors always mark students' work consistently'), 7.2 ('Different assessors have different interpretations of assessment objectives and criteria') and 7.3 ('The same student work is often marked differently by two different assessors'). These results suggested that students were more concerned about the reliability of assessment procedures than teachers were.

Question 7.1 ('Assessors always mark students' work consistently')
                  1-Strongly agree   2-Agree   3-Disagree   4-Strongly disagree   Median
Teachers (TR)     10.3%              41%       38.5%        10.3%                 2.00
Students (SR)     1%                 32%       48.5%        18.6%                 3.00
Table 15: Responses to question 7.1

Question 7.2 ('Different assessors have different interpretations of assessment objectives and criteria')
                  1-Strongly agree   2-Agree   3-Disagree   4-Strongly disagree   Median
Teachers (TR)     35%                47.5%     15%          2.5%                  2.00
Students (SR)     52.5%              41.4%     4%           2%                    1.00
Table 16: Responses to question 7.2

Question 7.3 ('The same student work is often marked differently by two different assessors')
                  1-Strongly agree   2-Agree   3-Disagree   4-Strongly disagree   Median
Teachers (TR)     32.5%              52.5%     15%          –                     2.00
Students (SR)     60.6%              37.4%     1%           1%                    1.00
Table 17: Responses to question 7.3

4.4.2.6. Opinions about ideal assessment procedures (section 8)

In section 8, the questions asked about 'ideal' assessment procedures: 73.2 percent of respondents agreed with the statement 'Students' examination work should be assessed by the students' own teacher and at least one external judge' (question 8.4). Less agreement was found with the statement in question 8.3, 'Students' examination work should be assessed by two external judges' (Table 18). Of the respondents, 79.7 percent considered that students should have a voice in the assessment process (question 8.5).

Question 8.3 (two external judges assessing students' examination work)
                  1-Strongly agree   2-Agree   3-Disagree   4-Strongly disagree   Median
Teachers (TR)     20.5%              28.2%     35.9%        15.4%                 3.00
Students (SR)     13.7%              46.3%     29.5%        10.5%                 2.00
Table 18: Responses to question 8.3

Several students' comments in section 10 stressed the importance of reliable assessment procedures:

Examination scripts should be assessed by art critics, in other words


competent people who respect art (SR 60).

I think that students should have access to assessment criteria and their
exam work should be assessed by their own teachers (SR 83).

It is very important to have national exams; students’ scripts must be


assessed by external assessors (SR 80).

From students’ comments it was evident that they considered that the current

assessment procedures needed to be improved. For example:

I think that students’ work for examination should be


marked by more than one teacher (SR101).

Some teachers in Section 13 also wrote comments about assessment procedures,

although these tended to be vague. One teacher also wrote about the need for reform

and possible consequences:

I think that assessment in the arts needs to be improved, especially


because there are many assessors who judge students art works as a
matter of taste, which is not a real criterion. However to do that, we
need to revise not only the examinations but also the syllabuses and
teaching practice (TR 49).

4.4.2.7. Impact: consequences of the examination (section 9)

The questions in section 9 of the questionnaire asked for opinions about the impact

of art examinations and one response stressed their effects upon students:

Examination results are not related to results obtained by


students during internal assessment. Exam results are
crucial for our lives because we depend on them to go to
university but exams narrow our performances (SR 102)

Opinions about the value of the examination results in society revealed that both teachers and students had low expectations: only 36.1 percent of the respondents thought that employers valued the results of the art examinations (question 9.5), while

67.4 percent thought that universities and polytechnics highly valued them (question

9.6). Of the respondents, 60.4 percent considered that school administration valued

them (question 9.8), 69.6 percent thought that parents valued the results of the art

examinations (question 9.9), and 81.8 percent of the respondents considered that

students valued them (question 9.10).

It was evident that students valued their results in the art examination because they

depended on them for university entrance. However the responses from teachers

revealed low trust in their value; teachers may distrust the results as a consequence of the examinations' perceived lack of validity.

Question 9.7 ('teachers do highly value the results of the art examinations')
                  1-Strongly agree   2-Agree   3-Disagree   4-Strongly disagree   Median
Teachers (TR)     7.5%               22.5%     57.5%        12.5%                 3.00
Students (SR)     9.2%               56.1%     30.6%        4.1%                  2.00
Table 19: Responses to question 9.7

Currently, in Portugal, examination results are used to select students for university

entrance. However one teacher questioned this function in the survey:

‘The universities should provide their own process or exams for student selection’

(TR- 29). The responses to questions about the use of the art examination results

showed that 79.5 percent considered that examinations were used to select students

for progression to higher education and employment. And 72.8 percent of

respondents considered that the examination results were the principal determining

factor with regard to students' choice of university courses.

Teachers responded as follows to statements about other uses of examinations:

• To control and monitor the curriculum (71.8 percent agreed);
• To check if schools and teachers are teaching the curriculum (56.4 percent agreed);
• To monitor school standards (56.4 percent agreed);
• To rank school and teacher performance (51.3 percent agreed);
• To provide feedback to teachers (64.1 percent agreed);
• To revise next year's examinations (53.8 percent agreed);
• To impose curriculum policies (64.1 percent agreed).

The responses to the statements about the effects of the examination upon curriculum practice revealed that 57.7 percent of the respondents agreed that the examinations do influence the work set during courses (question 9.1); more teachers than students believed this was the case (see Table 20). Although 52.2 percent of respondents

considered that a lot of work carried out during courses was not related to

examinations (question 9.2), the majority of respondents (64.9 percent) thought that

examinations influenced teaching practice negatively by focusing on a narrow range

of art and design knowledge, understanding and skills (question 9.3).

Question 9.1      1-Strongly agree   2-Agree   3-Disagree   4-Strongly disagree   Median
Teachers (TR)     20%                65%       10%          5%                    2.00
Students (SR)     5.2%               41.2%     44.35%       9.35%                 3.00
Table 20: Responses to question 9.1

Teachers and students considered the correlation between internal and external assessment results weak, since in question 9.13 only 39.6 percent of

respondents agreed with the statement ‘The marks received by students during

internal assessment in the secondary art course (10th, 11th, 12th year) will be a good

indicator of results obtained in national art examinations’. And the majority of

students did not consider that internal results were related to external examination

results.

Responses to the statements referring to predictive validity of art examination results

revealed that only 22.1 percent of respondents considered that the results of national

art examinations (12th year) were a good indicator of students’ results in their

university courses (question 9.11). But 44.8 percent of respondents expressed the

opinion that the internal assessment marks obtained in secondary art courses (10th,

11th, 12th year) were a good indicator of university course results.

Students' and teachers' responses to the statements in questions 9.11 and 9.12 varied.

It seemed that students were less confident than teachers about the correlations

between examinations and internal assessment results and university results.

Question 9.11 ('The results of national art examinations (12th year) will be a good indicator of the results obtained by students in their university courses')
                  1-Strongly agree   2-Agree   3-Disagree   4-Strongly disagree   Median
Teachers (TR)     5.1%               53.8%     35.9%        5.1%                  2.00
Students (SR)     3.1%               31.6%     55.8%        9.5%                  3.00
Table 21: Responses to question 9.11

Question 9.12 ('The internal assessment marks obtained in the secondary art course (10th, 11th, 12th year) will be a good indicator of the results obtained by students during their university courses')
                  1-Strongly agree   2-Agree   3-Disagree   4-Strongly disagree   Median
Teachers (TR)     –                  51.3%     43.6%        5.1%                  2.00
Students (SR)     3.2%               31.6%     55.8%        9.5%                  3.00
Table 22: Responses to question 9.12

Questions 1.10-1.14 in section 1 of the questionnaire for students were designed to obtain data with which to crosscheck the questions about the predictive validity of art examination results. However, only a small number of respondents (58) were able to provide the marks awarded in the first year of their university art studies, too few to yield reliable data. Therefore the correlations described in Appendix VIII (section 1.1, pp. 47-48) between the marks obtained by student respondents in their certificate of secondary education (internal + external results) and the results obtained in the first year of university can only be viewed as interesting but very tentative. Similarly, the correlations between examination results and the results obtained in the first year of university need to be read with caution.
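For transparency, the kind of correlation referred to here can be computed as in the sketch below (Python; the paired marks are invented placeholders, and Pearson's r is assumed as the correlation statistic for illustration). With only 58 usable pairs, any such coefficient carries a wide margin of error, which is one reason the reported correlations must be read tentatively.

# Sketch of correlating certificate results with first-year university
# marks. The paired marks below are invented; in the study only 58
# usable pairs were available, which limits the reliability of r.

import math

secondary = [14.2, 15.0, 12.5, 16.1, 13.0, 17.4, 11.8, 14.9]
university = [12.0, 14.5, 11.0, 15.2, 13.5, 16.0, 10.5, 13.8]

def pearson_r(xs, ys):
    """Pearson product-moment correlation of two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Prints the sample correlation for the invented pairs.
print(round(pearson_r(secondary, university), 2))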

Overall the questionnaire results about the impact of art examinations tended to be

contradictory and inconclusive. However the following conclusions were drawn:

• The results of art examinations were generally considered important for student selection and certification purposes.

• Art examination results were highly valued by students, but were not

generally accepted as important by teachers.

• The respondents did not recognise the impact of art examinations upon

curriculum practice as important.

• The great majority of respondents considered the predictive validity of art

examinations to be low.

4.4.2.8. Teachers views about assessment procedures (Sections 10-12)

In responses to questions in section 10 of the questionnaire for teachers, only 35.9 percent of the teacher respondents said that they assessed students' examination work using holistic or overall judgements, while 56.4 percent said that they usually assessed it by aggregating marks based on detailed criteria. The majority of respondents said that they usually assessed students' examination work using both holistic judgements and judgements based on detailed criteria.

Notwithstanding the above responses, detailed criteria were not considered

necessary for art assessment by 74.4 percent of teacher respondents. A significant

majority of 94.9 percent of them thought that it was useful to discuss interpretation of

assessment criteria with their colleagues and only 5.1 percent strongly disagreed.

Of the teacher respondents, 71.8 percent said that they usually met with their colleagues and tried to reach a common interpretation of how to apply the criteria, and 63.2 percent considered it was not difficult to reach consensus about how to apply the criteria. These responses contradicted previous data derived from the reports of assessors' meetings (see section 4.3.3, pp. 126-128).

Sections 11 and 12 of the questionnaire for teachers were about ideal assessment procedures: 79.5 percent of the teacher respondents thought that evaluation reports of examinations should be sent to schools; 74.4 percent agreed with the need to receive notes of guidance and visual exemplars to help teachers and assessors (question 11.2); and 89.7 percent stated that there should be regular in-service training courses for teachers and assessors based on previous examination experience. Only 59 percent agreed with the national publication of examination results in the form of league tables.

The majority of teachers who participated in the questionnaire favoured special training for assessment: 87.2 percent of respondents suggested the need for special in-service teacher training on assessment, and 89.7 percent agreed with the need for special assessment training for assessors, while 94.9 percent agreed with the statement that current students' work should be used for the training of assessors. The statement 'The aim of in-service training should be to reach consensual agreement about the interpretation of criteria' had the agreement of 89.7 percent of respondents.

Teachers’ open comments in section 13 of the questionnaire confirmed the general

opinions expressed in these questions. For example:

We need valid in-service teacher training (TR 26)

Teachers need to be well informed and have access to in-service


teacher training in order to prepare students for the future (TR 23)

Summary
From an analysis of the questionnaire the following evidence emerged:

• A large proportion of Portuguese teachers and students expressed dissatisfaction, in various ways, with the current external art assessment instruments; although some believed that the instructions and instruments were clear, they also believed that the examination focused on a narrow part of art and design learning and that the question papers did not allow students properly to reveal their knowledge, understanding and skills in art and design, and consequently lacked content validity.

• Respondents did not consider the procedures used for assessment reliable, and students especially distrusted the inter-rater reliability of markers.

• The impact of art examinations upon users was evident; the respondents considered that students valued examination results. Teacher respondents

identified major outcomes of the art examination such as student selection,

monitoring and controlling the curriculum and standards and imposing

curriculum policies. Some respondents believed that art examinations

influenced teaching practice by focusing on a narrow range of art and design

knowledge, understanding and skills. Some respondents and especially the

great majority of students did not believe in the predictive validity of

examination results.

• Respondents’ opinions about how to reform the system suggested that it was

important to reconsider the nature of the examination, devising instruments

with both written and practical tasks, and including an element of self-

assessment. For example portfolios comprising coursework evidence such as:

preliminary investigation and research work; work journals; developmental

studies; final products; self-assessment notes; a record of student intentions

and how these are realised; records of group criticism discussions.

• The teachers’ responses to questions about how to improve assessment

procedures strongly supported the implementation of a system of teachers’

in-service training in order to develop assessment skills through consensual

interpretation of criteria and by the use of visual exemplars of student works

for training purposes. The majority of respondents considered that assessment by the student's own teacher and at least one external assessor would be the best way to improve the examination procedures.

The analysis of the history of Portuguese art examinations and current practice

revealed an emphasis on Descriptive Geometry as a core specialism of the art

curriculum. Assessment in Portugal was fragmented into internal summative

assessment which valued teachers’ judgments but had no external controls, and

external summative assessment or national examinations based on written tests.

The procedures used for external assessment seemed to lack quality assurance

mechanisms and consistency of marking. The underlying model of the art

curriculum and assessment was based on separate specialisms labelled as theoretical

or practical, and the core domains of art education were fragmented because of this

separation. Consequently assessment in art seemed to include only narrow aspects of

what could be a broad and rich art education. Problems with reliability of the

assessment procedures were detected. The existing assessment criteria were vague

and the process of marking extremely dubious; the absence of moderation procedures

and lack of consensual interpretation of criteria were identified as major problems of

the system.

Chapter 5
Art and design examinations in England

This chapter sets out to describe and discuss the strengths and weaknesses of

English art and design examinations at 17 + level through the analysis of their

evolution, underlying model, assessment instruments and procedures. This part

of the research was intended: to study the policies and practices of art and

design examinations in England; to identify critical issues for assessment

practice by analysing concepts and theories that underpinned art and design

external assessment in England; and to understand how assessors and

examiners were trained in England.

5.1. Introduction

From the review of literature a number of art and design examinations had been

identified where portfolios or coursework constituted the principal assessment

instrument and where the assessment procedures relied on different judges or

assessors. For example, Arts Propel; International Baccalaureate; Advanced

Placement; national examinations in The Netherlands, Finland and England.

From this list, the English GCE A level examinations appeared to fit most closely

with the preferences expressed by Portuguese students and art teachers about an ideal

examination (4.4.2.3; p.137 and 4.4.2.6, p.142), since the assessment instrument

included coursework and assessment procedures requiring teacher in-service training

and moderation. A study of the English model presented an opportunity to

investigate established procedures and strategies and to anticipate problems before

the design of a new assessment instrument and procedures for Portugal.

Reference to the English system of art and design examinations at AS and A levels

was included in this research because the survey of Portuguese art teachers and art

students indicated that it was closely related to their views on an ideal

model for art examinations in Portugal. It is acknowledged that this section of the

thesis provides only a partial view of the English system of art and design

examinations and there was no intention to make a detailed comparative

study. However analysis of some key aspects of English art and design examinations

through documents, interviews and observation of the Edexcel Foundation awarding

body standardisation processes was useful for the purposes of collecting insights

about possible strengths and weaknesses of an examination system based partly

on coursework portfolios and assessment procedures reliant on standardisation and

moderation.

The documents used to inform this description included a review of literature about

the history of art and design examinations in England, articles in the press and

official publications (for details see 3.4.1.1, p.84). Since the English model of art and

design examinations was based on a system of standardisation, data was collected

through two observation periods conducted at the Edexcel Foundation awarding

body standardisation meetings (Appendix X). The observations were intended to

explore how moderators were prepared, the processes used to reach a common

understanding and application of criteria, what kind of visual exemplars were used

and how the assessment and marking of such exemplars was explained to the

moderators.

This stage of the research also set out to understand stakeholders’ perceptions of

English art and design examination practices. For that purpose, data collected

through interviews (Appendix IX) was analysed in order to determine the strengths

and weaknesses of the model in terms of validity and reliability and to develop a list

of tentative suggestions for new assessment instruments and procedures. The purpose

of the interviews was to understand examination practices from the point of view of

different stakeholders and to obtain in-depth information about the expectations and

problems experienced by the participants. A group of six interviewees was selected

including one principal examiner (Richard), one team-leader (Elisabeth), one

moderator (Ian), two art teachers (Cindy and Sally) and one student (Annie). Ian and

Elisabeth were first contacted during an Edexcel GCE AS standardisation meeting in

2001, and Cindy during an art teachers’ workshop (NSEAD, 2001). Cindy, Elisabeth

and Ian lived and taught in different regions of England and, through informal

conversations with the researcher in previous meetings, revealed extensive

experience and different points of view about the examination. Richard was a chief

examiner with considerable experience of developing examinations. Sally was

included because of her special situation as an inexperienced art teacher. Annie was

selected to introduce a student’s view of the model. The sample was not intended to

be representative of English students, teachers or moderators. The interviews were

expected to provide individual oral histories about how these examinations worked in

micro-contexts rather than a general account of them.

Role                Name       Sex   Age   Region               Experience (years)   Awarding body
                                                                teaching / examining
Principal examiner  Richard    M     57    South East England   12 / -               AQA
Team leader         Elisabeth  F     67    London region        30 / 20              Edexcel
Moderator           Ian        M     48    London region        18 / 11              Edexcel
Art teacher         Sally      F     46    London region        1 / -                Edexcel
Art teacher         Cindy      F     42    North West England   18 / -               OCR and Edexcel
Student             Annie      F     16    South England        - / -                Edexcel

Table 23: Interviews with English art and design examination stakeholders

5.2. Art and design examinations before the 1988 Reform Act

Art education in England in the 1960s was influenced by rationales such as self-

expression and by formalist approaches (Ross, 1989). Practices in separate arts

subjects in the curriculum were strongly influenced by the modernist tradition in art

itself and the ‘progressive’ ideas of John Dewey and Herbert Read.

The leading exponents were singular charismatic individuals – artistic
teachers whose concern was with the child as artist – and untroubled by
any other considerations (Ross, 1989, p. 17).

These approaches to art education promoted changes in the way that candidates’

work was presented and assessed. According to Carline (1968), personal expression

came to be more valued than the previous tradition of accurate representation:

A quarter of a century ago, inaccuracy in representing objects,
mistakes in perspective and untidiness generally were almost the
only, and certainly the main, grounds for criticism. More recently,
examiner's reports have emphasized that a personal interpretation
of the group [still life] is much more important than accuracy of
representation. The Oxford report for 1958 goes further, in
reminding schools that 'good candidates' show that they have
been excited by some aspect of the form, the pattern or the
colour'. To make a record of what is before one is manifestly
insufficient (Carline, 1968, p. 204).

172
The predominance of observational tasks changed with the new conceptions about

the value of artistic work. For example, living models, humans and plants, were

introduced as drawing subjects and personal interpretation of the object or model

was considered more important than copying: ‘…such work, even though it may

bear little resemblance to the model, is much to be preferred to a facile drawing of a

figure executed without character and culled from constant copying of fashion

drawings in magazines' (Carline, 1968, p. 214). The inclusion of 'imaginative

composition’ and its subsequent development into ‘pictorial composition’ or ‘visual

composition’ introduced another important element derived from new concepts

where imagination was valued as an essential cognitive aspect of art education.

Imagination, personal response and problem solving were all rated highly in the

examination requirements. Later a formalist emphasis on abstract works was

gradually recognised in the examination syllabuses and question papers. Art

appreciation was mainly concerned with the history of art. According to Carline it

was the ‘least popular’ of the art papers, although it was taken by ‘some thirty per

cent’ of the Advanced level candidates in art, mainly by ‘candidates who plan a

career in which art may be useful, and are likely to be granted extra time at school

for preparation’. However according to Carline (1968, p.255) teachers experienced

difficulty ‘in combining art history with practice in the limited time allowed’ and it

was ‘unsatisfactory from an educational point of view’.

Carline’s (1968) description of art examinations in the 1960s was written from the

point of view of an examiner. His writing makes clear that examinations influenced

teaching practices. On several occasions he mentions how the inclusion of certain

subjects in the examinations affected teaching: for example, in the chapter about 'craft' he

stated: ‘…the inclusion of craftwork in the art examinations has given enormous

encouragement to the practice of crafts as a branch of art’ (Carline, 1968, p.246).

Carline used quotes from examining board reports to reinforce the claim that art

teachers should align their teaching to qualities of artworks valued by examiners.

GCE A level art and design examinations evolved progressively. In the

1970s education in England was characterized by a decentralised system (Ross,

1989). Before the implementation of the Education Reform Act in 1988, which gave

new statutory powers to the Secretary of State and central Government, 'Local

Education Authorities, schools, and even the individual teacher [had] almost total

autonomy' (Steers, 1989, p. 8). In fact, the only regulatory device for schools was

the examinations system (White, 1998, p.5). There was no national curriculum for

art and teachers had considerable control of the form and content of art education.

At that time, in England and Wales, secondary art teachers taught what they, as

individuals, considered important, within the context of a wide choice of fairly

flexible examination syllabuses (Steers, 1989, p. 9). Although examinations were

subject to scrutiny by the Schools Council, the numbers of options for examinations

and flexible procedures 'led the emphasis upon teacher-control of examinations'

(Earle, 1986, p.145). Teacher 'ownership' of assessment systems was considered a

most important factor for examination improvement. In Steers’ words:

In my view, one consequence of this 'ownership' has been an
unprecedented, sustained national effort in which the
professionalism of classroom teachers has played a particular
significant role: by professionalism I mean both the 'effectiveness
of implementation' of the examination procedures, in this case by
the overwhelming majority of art and design teachers and, in turn,
the way in which this serves… the intrinsic values and interests of
the profession (Steers, 1994, p. 289).

An example of teacher ownership of external assessment was provided by the range

of options offered to teachers, not only through the choice of an examining board but

also through the choice of examination structure. According to Earle (1986, p.51) in

the 1970s and 1980s there were three possible modes of examinations for 16+ and

18+ levels: Mode 1 – an external examination based on syllabuses provided by the

regional subject panel; Mode 2 – an external examination on syllabuses provided by

a school or a group of schools; and Mode 3 – an examination set and marked

internally in a school or group of schools but moderated by the regional board.

Within each of the eight regional bodies it was possible for a school to devise its own

examination if required (Earle, 1986, p.51). This freedom of choice was still

remembered by the English art teachers interviewed in this research (see interview

Cindy, Appendix IX, p. 101) as being a model for innovative and creative teaching

within an atmosphere of professionalism and consensus. Mode 3 was the structure

that allowed most teacher ownership, according to Earle (1986, p.51), but it was not

always preferred, perhaps because teachers did not have sufficient time or inclination

to produce individual syllabuses and because it was very ‘expensive to administer’.

A finding of the review of literature about the history of art examinations in England was

that the model of the examinations had progressively evolved from a test of several

hours or a day to the inclusion of preliminary work and coursework in the form of a

portfolio or an exhibition. Candidates usually received the question papers before the

examination in order to prepare the themes and develop ideas in visual and written

forms. In 1977 a survey of the syllabuses of the GCE A level Art examinations

conducted by Earle found that eight boards offered a variety of art examinations

in different combinations (art, art and craft, art history), providing a total of

sixty-two subjects; thirty-two of these were in art history.

Earle (1986) pointed out that there were problems of comparability among the

examination boards:

It could be argued at one level that one board expects candidates to undertake a
broad course of study while other boards demand a greater depth of study in specific
areas of art. The problem comes in equating one board's work with another, which
used to be the function of the A level sub-committee of the art committee of the
schools council. In its absence there is no indication of how standards among boards
will be moderated at this level (Earle, 1986).

‘In 1978 no examination board required the submission of coursework but in 1988

OCR allowed some coursework to be presented, in the beginning with no guidance

on how it should be marked or graded' (Steers, 2003). Similarly, Earle (1986, p.123)

pointed out: '…there is not much stated in writing to assist the teacher with the

presentation of coursework, the amount required varies considerably...the boards

who require coursework and preliminary works are interested in the way which

candidates carry out their work as well as the final product'. Although art

examinations had evolved to include coursework as evidence for marking candidates'

achievement, instructions for this were stated in vague terms and no explicit

guidance for teachers was given.

Earle (1986) stated that GCE A level art examiners and teachers became more and

more concerned with the problem of comparability of results:

The comparative wealth of examinations available to teachers has been a
luxury. While the curriculum was the 'secret garden' idiosyncratic
practices in art could continue. If teachers have to re-think their practices
in the light of the new examinations, it could become an advantage to the
profession to have a new and more standardised series of art
examinations (Earle, 1986, p.126).

Earle was a chief examiner and, maybe because of this, his account of art and design

examinations argued for a more centralised model of external assessment.

He believed that the reliability of results was dependent on standardisation systems.

His vision was in line with a strong move towards criterion-referenced assessment,

which emerged during the 1980s, probably because of the external pressures upon

education driven by politicians (White, 1998). According to Steers (1989):

The policies of the present Government are having a profound influence
on all sectors of education. At all levels there are demands for greater
accountability and for the curriculum to be more 'relevant' – relevant to
the Government's perception of national economic health rather than the
educational health of the nation (Steers, 1989, p. 7).

In pursuit of a more reliable system of examinations, the awarding bodies made

developments in mark schemes. The growing interest in defining mark schemes

and establishing domains of knowledge, grade criteria and criterion-referenced

assessment was seen by teachers and examiners (Davies, 1987) as a way to introduce

more rigour and reliability into the examination.

5.3. The consequences of the Education Reform Act (1988) in England and

Wales

The introduction of the national curriculum in 1988 resulted in radical changes in

English education that were not always popular. The national curriculum was often

seen as removing teachers' control of the education system. According to White

(1998):

After 1988 one form of sectionalism – teacher control of the
curriculum – was replaced by another. Power shifted to the
government. This meant in practice that those in control of
education policy were able to institute the national curriculum
they preferred with minimum consultation and regardless of
obvious difficulties with it (p.5).

According to Ross (1989, p. 17): 'Authority in education passes swiftly from the

professional to the politician, from the researcher to the bureaucrat’. English teachers

were not used to such forms of policy-making – until then the only form of control

they had known was external examinations within the flexible choice provided by

several examination boards. However, at this time small boards were subsumed into

larger awarding bodies. For example AEB, NEAB and City and Guilds became the

Assessment and Qualifications Alliance (AQA) awarding body; ULEAC and BTEC

became Edexcel; and the Oxford and Cambridge and RSA examining boards became

the Oxford, Cambridge and RSA awarding body (OCR). The model of examinations

became more and more rigid with the amalgamation of examining boards. In the late

1980s and 1990s the trend was towards removing teacher ownership of education

and reinforcing more centralised control of the curriculum and external assessment.

Teachers were aware of the negative implications of such changes upon the

profession as Steers pointed out (1994, p. 289): ‘In recent years this professionalism

has been under threat from a government that has made known its mistrust of teacher

assessment’.

According to Steers (2003), A level art examinations were usually based on the

progression of a student's work. The evidence for marking was submitted as a

portfolio or exhibition of selected works from the two-year course of study.

Current examinations have less focus on drawing or craft skills per se,
but place a higher value on knowledge and understanding where both
process and product are assessed. The syllabuses are designed to allow
candidates to provide evidence of their achievement throughout a course
of study rather than test their ability to perform within set and very short
time limits. It can be argued that, in general, the 1998 examinations
provide more demanding but potentially more rewarding opportunities
for both candidates and their teachers (Steers, 2003).

During the 1990s 'process' was progressively emphasised over 'end

products' in examination values. Gradually the criteria for assessment focused

more and more on students’ evidence of developing ideas and investigative work.

This may have been a consequence of tendencies in art education theory at the time

to focus more on the process than on final products, and of the development of more

conceptual contemporary art and design forms.

5.4. AS and A level art examinations after Curriculum 2000


At the time of writing there were several qualifications in the English system for

external assessment at age 17+. There were general qualifications such as GCE AS

(Advanced Subsidiary) and GCE A levels (Advanced GCE); vocationally related

qualifications such as Advanced GNVQs; and occupational qualifications such as

NVQs (National Vocational Qualifications). Vocationally related and occupational

qualifications were linked to ‘national occupational standards, which set out real-

world job skills as defined by employers' (QCA, 2000). In the General Certificate of

Education (GCE) at Advanced Level in any particular subject candidates could

achieve pass grades of A to E. Students who wanted to follow an academic route to

higher education or employment usually took A Level examinations. A Level

courses normally lasted two years and the examinations could be taken in a linear or

modular fashion. AS and Advanced GCE examinations in Art and Design were

suitable for students who wished to study art, craft and design at a higher level

or to take up careers for which an art background was relevant.

5.4.1. The role of the QCA

English art courses at 17+ level were regulated by a government agency, the

Qualifications and Curriculum Authority (QCA) that defined the main examination

specifications, including assessment objectives, grade descriptors, and assessment

procedures. The general procedures for all examinations were stated by the QCA in

the Code of Practice (QCA, 2000). The three awarding bodies in England set their

syllabuses, question papers and assessment procedures only with the approval of the

QCA, which also monitored and evaluated the awarding bodies through a

programme of ‘scrutinies’. QCA scrutineers checked if the awarding bodies could

guarantee ‘ parity in each subject and qualification from year to year, across different

syllabuses, and with other awarding bodies’ and ‘ compliance with the requirements

of Code of Practice and other relevant regulations’ (QCA, 2000).

5.4.2. The structure of awarding bodies

At the time of writing, English schools and examination centres could choose one of

the three English awarding bodies to deliver the examination or an examination

offered by the Welsh Joint Examinations Committee (WJEC). This choice could be

made by the head of school, the head of the art department, or through agreement between

teachers in the centre. Awarding bodies managed ‘all stages of the examining

process’ ensuring that the procedures were ‘carried out in accordance with the Code

of Practice and with the awarding body's policy procedures' (QCA, 2000). Figure 10

(below) illustrates the typical structure of an awarding body.

Chief examiner
• Supervises the construction of the question papers, mark schemes and the criteria
for coursework assessment.
• Responsible for monitoring the standards of principal examiners and principal
moderators.
• Writes an evaluation report of the examination at the end of each examination
period.

Assessor
• Second expert in the design of the question papers; responsible for checking the
final drafts of all question papers.

Reviser
• Expert required to evaluate question papers and provisional mark schemes.

Principal examiner
• Responsible for setting question papers and the standardisation of their marking.

Principal moderators
• Compile exemplar coursework tasks, annotated to show how the marking criteria
in the syllabus are to be applied; ensure the coordination and monitoring of
moderators.

Assistant principal examiners
• Responsible for the coordination and supervision of the group of team leaders.

Team leaders
• Coordinate and supervise the team of moderators.

Moderators
• Check whether the teachers' marks in the centres are in accordance with the
established standards and make adjustments if necessary.

Assistant moderators
• Moderate centres' assessment of candidates in accordance with the agreed
marking criteria and the awarding body's procedures.

Figure 10: Structure of awarding bodies

5.4.3. A modular structure of courses and examinations

From September 2000, a modular structure established the specifications for GCE

AS and A level examinations, with the modules corresponding to units. At the time

of this writing most A level qualifications were made up of six units that were

broadly equivalent in size (QCA, 2000). Three of these units constituted the

Advanced Subsidiary (AS) qualification, representing the first half of the full A

level. The other three constituted the second half of the full A level and were called

A2. A level qualifications included two distinct examinations, AS and A2, each one

covering three units of the course. The units were separately set and assessed.

The modular structure of courses and assessment, according to the Code of Practice

(QCA, 2000), was intended to allow flexibility. But it appeared to be a potential

source of validity problems as well as posing a serious threat to the established

theory and practice of art and design education. In the interview with Richard he

said: ‘We were all bound by…the modular structure for the AS and A2’ (Richard,

SEE, 4-02-2001). The structure of the courses in separate units tended to make the

system uniform. One consequence of the modular structure was an increasing

number of students taking art and design qualifications at AS level; another was a

distortion of the art and design courses. In theory, modularity was supposed to

allow course flexibility. However, according to the interviewees it had resulted in

over-assessment of students and prevented teachers from experimenting in their

courses. This was understood to be a particular threat to sequential development of

ideas and skills considered important in art and design learning. According to

Elisabeth:

Art is on-going, it is a developmental thing and it goes on, you can get
back to a piece, you can use it as a starting point for something else in
the future, whereas with this modular structure you can’t do that.

Now everything they do is assessed, they don’t have time to explore
and experiment which is what art is about. They always have a
deadline for something and that is in their heads, I think it stops
things…. I think if you are all the time focusing on deadlines things
have to be finished off by rushing through, I think that it does interfere
with the creative process (Elisabeth, L, 7-02-2001).

For Cindy:

‘I don’t think that’s how we learn. I think its based on the wrong
theory… to assess in a way like this, by putting things into chunks of
experience, is not how you work, I don’t think it is a good practice
too’ (Cindy, NEW, 10-02-2001).

The introduction of assessment by units to replace the single final assessment at the

end of the course, as was the case with the old A level, was explained by Cindy:

‘I know why they have done it, to an extent. Well I think it is so the
students don’t have the pressure of the work at the end, they can
assess as they go along’ (Cindy, NEW, 10-02-2001).

But in practice, in Cindy’s view, it had put more pressure upon students because of

strict deadlines and overloaded courses. Elisabeth thought that the modular structure

was intended to homogenise the system and also to create intermediate

qualifications:

'…this is really to be in line with other subjects and the government
as well wanted something that students could gain at the end of the
first year [of the course]' (Elisabeth, L, 7-02-2001).

She thought this could be the reason for an increasing number of students opting for

art and design at AS level. However a head of art at the North London Collegiate

School considered this unfair and he was quoted as saying: ‘Art students need elbow

room for the pursuit of a personal aesthetic and a modular straitjacket does them no

favours at all’ (Hardy, 2001).

This practice seemed to contradict the course rationales. For example, the OCR

specifications stated:

'Progression from breadth to depth, from AS to A2 is achieved by the
various units emphasising particular assessment objectives. This ensures
that the specifications have a developmental nature' (OCR Specifications
2000).

The National Society for Education in Art and Design documentation showed that

one year after the implementation of Curriculum 2000, the modular course, despite

the intention to present 'a developmental structure', was often seen as a threat to art and

design. The newsletter of Summer 2001 included an article about the new A and AS

level examinations, based on considerable correspondence from members,

expressing concern. A principal complaint was:

The new essentially modular examination places pressure on candidates
to produce fully realised projects from the beginning of the course.
Consequently, some teachers and students feel that there is no time for
the proper experimentation appropriate to the early part of A/AS level
study with an inherent possibility of 'failure'. This is likely to lead to
increased prescription by teachers and students who feel inhibited from
taking risks – from being creative (Steers, 2001, p.11).

It appeared that the introduction of modular assessment, which was intended

to allow more options for students, was problematic in terms of fairness. One

explanation for the adoption of such measures could be the government's desire to

increase rigour and accuracy. Cindy said that ‘…particularly in arts, in the 70s and

early 80s assessment was quite lax' (Cindy, NEW, 10-02-2001). This was also

an opinion expressed in an exploratory interview with a former Edexcel art

assessment coordinator (IPB; L, 12-06-2000). However the fact that students were

assessed three times per year did not provide proof of fairness or accuracy.

Furthermore, the new reform brought in a serious problem of over-assessment.

According to the interviewees students and teachers were being hard pressed by

assessment and losing time to develop exploration, experimentation, risk taking,

and reflection on progress, all of which are typically understood as fundamental to

artistic development.

5.4.4. Scheme of assessment


As noted previously, at the time of writing, art and design examinations were based

on coursework completed in each unit. At AS level the scheme of assessment

required two units of coursework, each weighted at 30 percent of the total AS

marks and 15 percent of the total A level marks, plus another unit, the 'controlled

test'. For A2 examinations, two further units of coursework, each weighted at 15

percent of the total A level marks, were required, together with a controlled period

test weighted at 20 percent of the total A level marks. The scheme of assessment was the

same for all the boards; the differences were mainly in the names given to

coursework units and the time allocated for the controlled test or externally set

assignment (Appendix III).
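
The sketch below (Python; purely illustrative and not taken from any awarding
body's documentation) cross-checks this arithmetic. The 20 percent weighting of
the AS controlled test is an inferred remainder needed to reach 100 percent of
the A level total, rather than a figure stated in the scheme described above:

    # Illustrative unit weightings as percentages of the total A level mark.
    A_LEVEL_WEIGHTS = {
        "Unit 1 (AS coursework)": 15,
        "Unit 2 (AS coursework)": 15,
        "Unit 3 (AS controlled test)": 20,  # inferred remainder, an assumption
        "Unit 4 (A2 coursework)": 15,
        "Unit 5 (A2 coursework)": 15,
        "Unit 6 (A2 controlled test)": 20,
    }
    assert sum(A_LEVEL_WEIGHTS.values()) == 100

    def a_level_mark(unit_marks):
        """Combine per-unit percentage marks into a final A level percentage."""
        return sum(A_LEVEL_WEIGHTS[u] * m / 100 for u, m in unit_marks.items())

    # A candidate scoring 70 percent on every unit finishes on 70 percent overall.
    assert a_level_mark({u: 70 for u in A_LEVEL_WEIGHTS}) == 70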

5.4.5. The specifications


A considerable quantity of documentation was published by awarding bodies,

for example: specifications, instructions for the conduct of examinations (for

teachers, moderators, team leaders and principal assistant moderators); question

papers for externally set assignments; guides for marking and reports of examiners.

The specifications, formerly known as 'syllabuses', defined approaches to the

structure of the course and established the scheme of assessment through description

of assessment objectives, assessment criteria and grade descriptors in general, and

the qualities sought in students' work in each of the specification titles or areas

of study.

According to the Code of Practice, chief examiners usually devised or wrote

examination papers, specifications and mark schemes. The validity of the

examination instructions and question papers was estimated a priori through

evaluation processes involving review of question papers and provisional mark

schemes by revisers, assessors and a committee chaired by the chair of examiners.

The evaluation took into account clarity of language, graphic design, content

relevance, bias, consistency across years, and discrimination issues (QCA,

2000).

5.4.6. Rationales for the specifications


Specifications were developed according to the QCA Code of Practice general

requirements involving (1) spiritual, moral, ethical, social and cultural issues; (2)

environmental education, the European dimension and health and safety issues, and

(3) the general skills of communication; information technology; working with

others; improving own learning and performance; and problem solving (QCA, 2000).

The rationales as stated by the three awarding bodies emphasised the need for breadth

and depth in the art courses. For example, teachers were supposed: ‘to encourage…

an adventurous and enquiring approach to art and design’ and ‘focus on art and

design practice and the integration of theory, knowledge and understanding’

(Edexcel Specifications, 2000); ‘…the specifications have a developmental nature’

(OCR Specifications 2000) and [the specification is designed to] ‘form the basis of a

course enabling candidates to develop their own personal responses to their

experiences, environment and culture in both their practical and theoretical activities

…and aims to achieve a sense of progression’ (AQA Specifications 2000).

External assessment in England was conceived as an integral part of the art and

design curriculum. According to the interviewed teachers, the guidelines established

by QCA and specifications of the different awarding boards were broad enough to

allow some diversity of approach to curriculum.

'The specifications leave a certain freedom for teachers to set out the
course' (Ian and Sally, L, 7-07-2001).

'[Provide] different ways of working and opportunities for personal
development and independent work through the course' (Richard, SEE,
4-02-2001).

'They state things quite clearly and they give good direction for
teachers and the more mature students' (Elisabeth, L, 7-02-2002).

'The way this is written can be encouraging for teachers. If teachers
are creative and interpret it creatively they provide a good framework'
(Cindy, NEW, 10-02-2002).

As described in the documents the areas of study seemed to encourage a broad

approach to art and design allowing students to develop a sound grounding in a

number of areas and media. The courses allowed flexibility in students’ choice of

the areas of study.

From an analysis of QCA documents about art and design examinations and

awarding boards' specifications, the assessment objectives set out by QCA appeared to

provide a common framework for developing specifications, grade descriptors and

criteria. This framework emphasised investigation of art and design products. In this

model there was a focus on art and design process: more weight was placed on the

development of ideas and critical and analytical capacities through the understanding

of context, meanings and purposes of art and design. It seemed to encourage analysis

and critical processes more than the final products. This appeared to be adequate for

the cross-disciplinary nature of contemporary art and design, providing broad

assessment objectives that seemed appropriate for all the areas of study (Appendix

IV). However the breadth of this model could restrict areas of experimentation and

production. For example, Annie complained about the lack of development of

specific technical skills during her course (Annie, SE, 6-05-2002). According to the

interviewed teachers ‘Curriculum 2000’ brought considerable changes to art and

design. It had replaced ‘traditional’ ways of teaching, which tended to emphasise

final products and specialisation in art and design. By reducing art to component

parts for assessment purposes the underlying model seemed to have been divorced

from established art and design teaching and practice, minimising the importance of

quality production. Ian said:

'It is all about proof and documentation rather than finished
product' (Ian, L, 7-07-2001).

5.4.7. Instructions for the conduct of examinations

The clarity of instructions provided for the users is one essential aspect of validity. In

subjects that deal with creative outcomes, such as art and design, achieving a balance

between unambiguous instructions and direct, prescriptive effects upon the

curriculum is critical. And this was particularly crucial in the case of English art

and design external assessment where visual exemplars were often provided.

The interviewees expressed contrasting opinions about the degree of prescription

inherent in the instructions for assessment and their consequences.

The interviewees thought that instructions were not sufficiently clear for students (or

at least for all students) especially in the question papers, and the role of the teacher

in explaining the instructions to students was considered essential. Elisabeth said:

'…there were parts of it that were quite difficult …with those
students who haven't had it explained to them by the teachers'
(Elisabeth, L, 7-02-2002).

Ian and Sally wanted the instructions to be less ambiguous, at least in terms of

numbers of works to be submitted for examinations. According to them the

instructions for teachers from Edexcel were not clear, and some teachers had

misunderstood the requirements of the examination in 2001. The reasons for this

could include the novelty of the examination format, teachers not paying enough

attention to instructions, their need for training, a lack of face validity in the

instrument, or a combination of these factors.

Cindy and Elisabeth said that the instructions needed to be broad and not too

prescriptive:

'I think its up to teachers to understand and interpret it in a very
creative way. If the teacher is creative, he then interacts with the
students' (Cindy, NEE, 10-02-2001).

According to Elisabeth there was a danger that the instructions could act as a

limitation and as a way to impose specific teaching approaches. Too much

prescription narrows courses, she said:

‘…because, these units are very prescribed, the way they are
explained, in some centres, not all, make it very tight in the way that
all the candidates do similar work rather than the more individual art
that we would expect at that level’(Elisabeth, L, 7-02-2002).

She concluded that there was a need for instructions that allow for teachers’

professionalism and inventiveness. The instructions for examinations and visual

exemplars provided by awarding bodies for schools, like an Edexcel video

illustrating units of work, could be a factor contributing to increased prescription of

courses and loss of freedom to produce personal responses. Annie’s work was

influenced by examples of ‘good’ work displayed on the Edexcel video sent to her

school (researcher diary, St.X School, 7-06-2002). The samples of students’ work

observed during standardisation meetings at Edexcel during 2001 and 2002 did not

generally demonstrate ‘…a wide variety of outcomes and media; specially the

sketchbooks’ (researcher diary, Mansfield, 4-05-2001. This may have been a

consequence of the impact of examinations. Whether deliberately or not, teachers

and students may have been encouraged to ‘play safe’ and reproduce the best

examples in order to ensure good results and consequential benefits for all

concerned. The examination could be preventing some risk-taking and independent

thinking, which should be fostered in art and design. There was a tension between

instructions for the art and design assessment instrument and art and design

curriculum practice. Should the instructions be broad and encourage diversity or

should they be prescriptive? Should visual exemplars be used? Two teachers

interviewed were in favour of such visual guidance (Ian and Sally, L, 7-07-2001).

But the literature showed that it could have pervasive effects upon schools and that

exemplars themselves are controversial. Hardy commented on the Edexcel visual

instructions for AS in 2002 as follows:

My fears were confirmed when I saw extracts from the induction video.
Although peppered with talk of ‘building on good practice’; prescription
and prohibition had resulted in exemplar material from pilot schools that
was so dull and devoid of wit it took my breath away. My idea of a good
sketchbook is an open ended tool for research and exploration; evidence
of an individual path of discovery, not the series of formulaic and
dedicated exercises exemplified therein. I wonder how many examiners
thought they would scream if they saw another colour wheel (Hardy,
2002).

A particular problem in designing instructions for assessment was detected during

this part of the research: on the one hand, if instructions are too prescriptive, even

though they are clear for students and other users, they may limit the breadth of the

instrument. On the other hand, if the instructions are broad enough to allow for

diverse outcomes, teachers need to interpret them. In the case of these examinations

where the assessment instrument was delivered by the students’ own teachers, broad

instructions had advantages, but only if these teachers were well-trained

professionals. There may also be issues of equality of opportunity for those students

who do not have the support of a good teacher. Interpreting and delivering

instructions necessitated teacher training, which was provided through in-service

training courses at which attendance was not compulsory. The nature of these

training courses was questionable. In one interview Cindy noted that there was a need

to foster the professionalism and creativity of teachers, but sometimes in-service teacher

courses tended to be too prescriptive (Cindy, NEE, 10-02-2002).

5.4.8. Assessment Instruments


5.4.8.1. Coursework
At the time of this writing, A-level candidates submitted selected coursework from

each module. This could take the form of a portfolio (structured sequence of

annotated drawings, paintings, photographs or three-dimensional objects); an

exhibition; multimedia presentations; written work such as essays; and combined written and

visual work such as work journals. Two units of coursework were required at both

the GCE AS and A2 examinations, accounting for 60 percent of the final A level

mark. The specifications, students’ guides and instructions for teachers set down

explicit parameters and instructions for setting coursework tasks and defined

marking criteria. Every unit should provide evidence of all the assessment objectives

being met.

5.4.8.2. The ‘controlled test’ and question papers


The ‘controlled test’ was not a test in the traditional sense. The question paper

described a theme and a list of starting points for developing a unit of work to be

carried out during a controlled period. According to Edexcel the controlled test

represented the ‘culmination’ of the course (Edexcel, Specification 2000), and

therefore was considered as synoptic assessment. Students received question papers

in advance and they had a preparatory period for the controlled work. During this

period students could consult with staff and be supplied with supporting guidance

and materials. The period of the controlled test was fixed for AS and A2 at 20 hours

distributed as follows by the different awarding bodies:

          Preparatory period   AS controlled test   A2 controlled test
                               (Unit 3)             (Unit 6)
Edexcel   6 weeks              8 h                  12 h
AQA       4 weeks              5 h                  15 h
OCR       4 weeks              5 h                  15 h

Figure 11: Controlled tests

Students should submit unaided work produced under examination conditions for

assessment, but could make use of supporting work in the form of research or

preliminary studies and work journals developed during the preparatory weeks.

Whilst the controlled period work in Unit 3 was not expected to be a finished piece

of work, at Unit 6 the test required the production of a final piece using the

preliminary studies and investigation carried out during the preparatory time.

5.4.8.3. Type and format of the assessment instruments

Instructions for the conduct of examinations stated the types of work candidates

should submit. The three awarding bodies usually required preliminary studies,

research work in visual and written forms, preparatory and annotated visual studies,

and final products. At the time the research was carried out, Edexcel was the only

awarding body requiring a compulsory work journal for each unit.

A conclusion drawn from the analysis of documents about art and design

examinations was that the assessment instruments were related to art and design

methods of inquiry. The evidence required from candidates through written and

visual work allowed a wide variety of responses. Coursework and controlled period

work in the form of portfolios permitted the gathering of a wide range of evidence

through several tasks, allowing students to reveal knowledge, understanding and

skills through both process and product. According to Elisabeth the large amount of

evidence to be marked was beneficial for students. She said:

‘…you need it, the more you have, the more you can see, the more you
relate to it, it benefits the students’ (Elisabeth, L, 7-02-2002).

This may have had advantages in terms of the reliability of the results because it

enabled several assessments of the same trait, allowing systematic sampling or

repeated measures, thus enhancing consistency in the assessment procedures

through corroboration.
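
One standard psychometric illustration of this point, offered here as an aside
rather than as part of the thesis's own evidence, is the Spearman-Brown prophecy
formula, which estimates the reliability of an aggregate of k comparable pieces
of evidence from the reliability of a single piece:

    \rho_k = \frac{k\rho}{1 + (k - 1)\rho}

Under these assumptions, if a single timed piece were marked with reliability
0.5, aggregating four comparable pieces would raise the reliability of the
combined judgement to (4 × 0.5) / (1 + 3 × 0.5) = 0.8, which is the sense in
which corroboration across multiple sources of evidence enhances consistency.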

The main problems detected with the type and format of the instrument were the

excess of tasks created by the modular system of assessment and the requirement of one

synoptic assessment unit per year. The regulations set out by QCA seemed inflexible

in terms of time allocations and number of units per year. Such measures tended to

increase prescription. Hardy (2002) considered that the imposition of short timed

tests (units 3 and 6) was inappropriate for art, especially because English art and

design external assessment had previously shown a tendency to increase the length

of such examinations over time. The controlled period tests were imposed apparently

to align art and design assessment with other subjects, with little consideration

for established art and design assessment practices. Hardy (2002) suggested that

the mandatory controlled period tests could be interpreted as distrust of teachers

and students:

This lack of trust of teachers and students alike to keep their noses to the
grindstone is most blatantly illustrated by the extraordinary belt and
braces approach to assessment. If all units are to be examined why the
synoptic requirement for the timed test which, once again requires
students to back track and showcase skills already in evidence? (Hardy,
2002, p.57)

A conclusion was drawn that the type and format of tasks required in the assessment

in England provided reasonably accurate information about the performance of

individual students and generally had the characteristics of disclosure and fidelity

described by William (1992). The work submitted for examinations (research

studies, preliminary studies with annotations, work journals and final products

produced over long periods) provided a set of written and visual evidence of

attainment that was disclosed and faithfully recorded using methods of work related

to appropriate patterns of artistic learning and production. Cindy described the

instrument as ‘evidenced-based’ and suggested it could be used ‘creatively’ (Cindy,

NEW, 10-02-2001). Furthermore the instrument was also considered to be a good

preparation for students’ future careers by all the interviewees. The majority

described it as being writing-dependent; this was observed in the samples at the

standardisation meetings. In fact the need for written evidence was emphasised

through the instructions and mark schemes; one key issue of validity was the degree

to which writing skills were relevant to art and design courses. This requirement

could be seen as a consequence of the QCA requiring general skills for all courses

and also as a consequence of an underlying model of art education which valued

process more than (or at least as much as) products.

5.4.8.4. Process orientated instrument

The interviewees considered art and design examinations process orientated – a

change that had been introduced gradually. Earlier examinations had focused on final

products and little by little process had gained more emphasis. The emphasis on

providing evidence about process significantly changed the way teachers set courses.

According to Cindy:

'…this is helpful, this is coursework oriented so, process oriented. It
opens up more possibilities by looking at how you are doing it, how
you think' (Cindy, NEW, 10-02-2001).

However she believed that there was an imbalance of process and product in the

assessment and the instrument tended ‘…to focus unnecessarily on processes’

(Cindy, NEW, 10-02-2001). According to Annie, focusing on process and reflection

was helpful for students, but students found it ‘boring’ and ‘time consuming’ (Annie,

SE, 6-05-2002).

It is possible that the increased emphasis on process can be explained by trends in

contemporary art, for example in the conceptual art movement, which puts more

value upon the development of ideas than upon final products. It is compatible with

postmodern theories of art education, which value critical understanding of contexts

and development of ideas. The move towards an instrument based to a larger extent

upon process could signify a loss of content validity. For example Ian and Sally still

believed that final products were more important than process, which, according to

their explanation, was a consequence of their training and experience using

traditional methods based exclusively on assessment of final products (Ian and Sally,

L, 7-07-2001). It was evident that these changes were quite radical and, as a

consequence, older teachers had to reframe their own view of the purposes of art and

art education after the Curriculum 2000 reform.

Within process-oriented examinations the amount of required written work from

candidates increased. It was noted during the observations and interviews that the

quantity of written work had increased with the new post-Curriculum 2000

assessment instrument in use at the time. Cindy stated:

'This again is all words, there are so many words!' (Cindy, NEW, 10-
02-2001)

And Elisabeth said:

[The instruments] 'rely a lot on writing, writing skills' (Elisabeth, L,
7-02-2001).

5.4.8.5. Bias

At first glance the instrument appeared equally fair to all students. During the

interview Elisabeth said:

I think the exams cover everybody from any country, there is a scope
there for everybody to find something of interest and as a teacher, again
you often pick up the students own interests and it is dependent on the
quality of the teachers (Elisabeth, L, 7-02-2001).

However, the interviews revealed some problems of bias. For example, students from

big cities with good resources in their schools and access to cultural events could be

advantaged; the instruments could advantage girls rather than boys; students from

different ethnic backgrounds could be discriminated against; and finally students who

had access to committed teachers or involved parents had clear advantages.

5.4.8.6. Gender bias

According to the 2002 art and design examination results (Joint Council for General

Qualifications, 2002) it appeared that for several reasons girls performed better than

boys. Elisabeth thought that this was because girls tended to develop superior writing

skills; ‘…they are more competitive; more mature than boys’ (Elisabeth, L, 7-02-

2001), and Cindy thought that it was ‘…because they are more systematic, more

responsive’ (Cindy, NEW, 10-02-2001). Bowden (2001) reported on a study of

boys’ underachievement in art and design examinations. Bowden claimed that the

emphasis on planning and use of sketchbooks, while valuable, inhibits art making

using media in a direct ways that do not, by their nature ‘…involve prior

investigation’ or ‘gathering systematic evidence or imagery’. He suggested that boys

in particular were resistant to preparatory work in sketchbooks before doing final

projects.

5.4.8.7. Bias related to non-English students


According to Elisabeth, Cindy and Sally, the examination was problematic for some

students from ethnic minorities because of the issue of English language. They

helped these students to understand the English language by explaining the

instructions in more detail. Annie said that the instrument was unfair for non-English

students because of the increasing emphasis on written responses in the examination

(Annie, SE, 6-05-2002).

5.4.8.8. Bias related to geographical location and economic background


According to Cindy it was very difficult for students in the town where she taught to

have access to books or major exhibitions. However the instrument required

‘informed answers’ and the question papers proposed starting points based on

research into exhibitions, books and the Internet. Whereas Cindy believed that

teachers could help students by supplying resources, she pointed out that if teachers

were not especially committed then this raised problems of equal opportunities

(Cindy, NEW, 10-02-2001).

Differences in local resources were not the only problem. The compulsory themes for

the question papers might also be a source of bias. For example, the theme for the AS

controlled period test at Edexcel, in 2001, was ‘Cities’. According to Ian and Sally

this was a very fashionable theme in that year because of the current major exhibition

at Tate Modern, in London. Ian commented that his students had no opportunities to

‘know a city’ or to see the exhibition referred to in the question paper and he

believed that this inevitably would influence their performances negatively (Ian, L,

7-07-2001). Sally and Ian also said that the examination required a lot of extra work

outside school from students, and this was also a source of bias because it disadvantaged

students from low-income families who had part time jobs (Ian and Sally, L, 7-07-

2001).

5.4.8.9. Teachers’ influence upon students’ achievement

According to the review of literature on art and design examinations in England,

the English model of external assessment had permitted a great degree of teacher

ownership of assessment in the past, but it was losing this strength. The teachers

who were interviewed felt they had less and less autonomy within the increasing

prescription imposed by the awarding bodies and the QCA. However, the role of

the teacher was still considered essential within the English model of external

assessment. It was obvious that students with access to good teachers would have

more opportunities to develop their knowledge, understanding and skills in art and

design. This topic arose frequently in the interviews. Cindy said that ‘…teachers’

effective explanation of the instructions for examinations and orientation of the

work’ could help students to perform better (Cindy, NEW, 10-02-2001). It may be

that the examination instructions were not sufficiently clear or accessible to all

students. It seemed that teachers often had to remedy the ambiguity and complexity

of the instructions so as to help avoid bias, for example, by providing resources for

students or translating instructions more simply, particularly for those with

inadequate English language skills.

5.5. Assessment criteria
In the English art and design examinations reviewed for this study, the assessment

criteria were developed from general requirements set out by the QCA. The four

assessment objectives established by the QCA for art and design were:

• AO1: Record observations, experiences, ideas, information and insights in
visual and other forms, appropriate to intentions.
• AO2: Analyse and evaluate critically sources such as images, objects,
artefacts and texts, showing understanding of purposes, meanings and
contexts.
• AO3: Develop ideas through sustained investigations and exploration,
selecting and using materials, processes and resources, identifying and
interpreting relationships and analysing methods and outcomes.
• AO4: Present a personal, coherent and informed response, realising intentions,
and articulating and explaining connections with the work of others (QCA,
2000).
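
To make the structure of such a scheme concrete, the hypothetical sketch below
(Python; the 20-mark scale and all mark values are invented for illustration and
are not taken from any actual mark scheme) shows how equal weighting across the
four objectives in every unit can produce the pattern discussed in 5.5.1 below,
where a uniformly adequate candidate outscores one who is outstanding in
development and outcome but weak at recording:

    # Hypothetical equal-weighted marking across the four assessment
    # objectives (illustrative scale: 0-20 marks per objective).
    AOS = ("AO1 record", "AO2 analyse", "AO3 develop", "AO4 present")

    def unit_mark(ao_marks):
        """Sum equally weighted objective marks into a unit total out of 80."""
        assert set(ao_marks) == set(AOS)
        return sum(ao_marks.values())

    steady = dict(zip(AOS, (13, 13, 13, 13)))   # adequate on every objective
    creative = dict(zip(AOS, (6, 8, 19, 18)))   # strong AO3/AO4, weak AO1/AO2
    print(unit_mark(steady), unit_mark(creative))  # 52 51: 'box-ticking' wins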

The interviewees considered the assessment objectives published by QCA clear and

appropriate for art and design courses. In Cindy’s words they were ‘excellent’;

‘broad’ and respected the nature of art and design. The assessment objectives that

seemed to provide the frame within which the criteria were established by the

awarding boards acted as ‘windows’ for teachers to focus on particular work in the

way Boughton (1996) has proposed. In effect, the assessment objectives

corresponded to broad criteria. In Richard’s opinion assessment objectives provided

essential guidelines for teachers and students:

It’s up to the student to decide what they want to present. The key to this
is how they are answering the assessment objectives, and criteria. We
don’t actually specify exactly what they should do, we give them
instruction material [instructions for examinations] – the assessment
objectives are the key to that (Richard, SEE, 4-02-2001).

However, a conclusion was drawn that the assessment objectives were not as broad

as they appeared at first sight. Some problems of content and face validity in the

assessment matrix described by the interviewees could be a consequence of such

assessment objectives. First of all, the term ‘objective’ suggests a rigid, compulsory

and universal concept, which is not in accord with the postmodern emphasis on

fostering plurality and ‘little narratives’ rather than universal ‘truths’. Secondly,

objectives were centrally imposed and not negotiable or adaptable to different

contexts. So, on the one hand, the assessment objectives seemed to be broad and

include essential aspects of knowledge, understanding and skills, but on the other

hand, the fact that they were imposed may restrict the nature and range of art and

design work. For example, Elisabeth (7-02-2001) stated that not all students needed

to look at the work of other artists in order to develop their own ideas. Cindy (10-02-

2001) pointed out that not all students had to make connections between their own

work and the work of others; some students developed strong ideas and produced

very creative outcomes without systematically using references or connections with

the work of others. Annie (6-05-2002) particularly questioned the need to link

students’ work to work of other artists. However the assessment objectives required

all students to follow the same method and submit evidence for all the assessment

objectives in each and every unit.

5.5.1. Mark schemes

The assessment criteria or assessment matrix published by the awarding bodies in

England were intended to guide teachers and moderators to focus on those aspects to

be developed by students in their work; in doing so, a common language of assessment was established. Elisabeth said:

…because it’s focusing, it’s making teachers and students focus
on those specific things, …teachers were looking at the work in a
different way, they were confronted with this new matrix, and
they really have to look at the work that they were marking… It
clearly guided teachers and moderators (Elisabeth, L, 7-02-2001).

Drawing on Williams' (2001) ideas, a conclusion was drawn that the assessment

objectives and subsequent assessment criteria presented problems of content,

construct and face validity. The instrument placed an emphasis on activities that were

not necessarily the core of art and design because of the search for objectivity. In an

article entitled ‘Forced in the same mould’ published in The Times Educational

Supplement (November 2001), Williams suggested that the search for objectivity in

examinations was distorting art and design learning and assessment. The article

reported several teachers' concerns with the new GCSE and GCE AS art and design examinations. They claimed that students who were creative, but not good at recording, were being penalised while, conversely, students who were not particularly outstanding, with only the ability to record, could achieve very good grades. Williams

concluded that the obligation to fulfil all the criteria in each unit of work was ‘…

making art into a hoop-jumping exercise, and creating a topsy-turvy world whereby a

candidate with poor creative and technical ability can nevertheless gain a decent

grade if all the boxes can be ticked’. Teachers were not against the introduction of

criteria related to historical context and contemporary artwork but did not agree with

the requirement to meet such criteria in every piece of work, which in their view

prevented students from developing 'exciting work'. Steers was quoted in this article as

saying: ‘We have stopped assessing creative and technical skills and the imagination

of students’.

In the same issue of the Times Educational Supplement there was an article by the

head of art at North London Collegiate School entitled: ‘Farewell to the Wow factor’

(Hardy, 2001). Hardy asserted that risky experimentation and open-ended

exploration had been reduced to a minimum and ‘…it seemed as though the course

had been deliberately designed to reward mediocrity’. Hardy strongly disagreed with

the unit structure of the courses and questioned the advantages of a common

structure across subjects, claiming that '…art is important as a necessary divergent

foil to the convergent curricular core’, and that one of the most essential aspects of

art ‘…that joy at seeing students transcend the shackles of the syllabus’ was being

lost. He argued that ‘ The equal weighting for all objectives has led to the reward of

the spurious over the essential’. The analysis of the assessment objectives and the

detailed mark schemes of all awarding bodies confirmed that creativity, imagination

and experimentation were seldom explicitly mentioned but, in contrast, recording

and researching sources were heavily emphasised. This may be a consequence of the

underlying conceptual model, which emphasised understanding and criticism of

contexts more than creativity and innovative studio production. However, it may also

represent a quest for more objective ways of marking; as Elisabeth suggested, it was

‘…much easier to assess the ability to record and write rather than to assess

genuinely creative work and risk taking’ (Elisabeth, L, 7-02-2001). A conclusion was

drawn therefore that the content validity of the instrument was being threatened by

the search for greater ‘objectivity’.

5.5.2. Criterion referenced marking

According to Richard, publication of detailed mark schemes was intended to

reinforce criterion-referenced marking, setting out clear demarcations between

statements of achievement and assessment objectives. Richard expressed the views

of an examiner when he argued in the interview that:

It has to be mathematical, this criteria, … I think it is very


straightforward, because it is based on the idea, if you got 3 marks in
each box and you know it is basically the idea of marking through the
criteria, it gives flexibility (Richard, SEE, 4-02-2001).

For examiners who look for uniform judgements in order to obtain accuracy of

assessment, the ‘objectivity’ of numbers may be tempting. However, as the literature

on art assessment shows, assessing artwork is not possible through a simple

aggregation of numbers. Judging an artwork involves more than counting the parts; it

is through holistic appreciation that the overall harmony between the parts is recognised.

According to Gestalt theory, a work has to be judged as a whole because the whole is

more than the sum of its parts (Boughton, 1996; Eisner, 1985; Wiggins, 1993). Rigid

methods of criterion-referenced marking might not be appropriate therefore to judge

a whole. Analysis of the recommendations for marking during the 2001 Edexcel AS

standardisation meeting showed they were driven by a rigid criterion-referenced

process of marking of each assessment objective followed by aggregation of the

marks. Interestingly, one Assistant Principal Moderator at the 2002 Edexcel A2

standardisation meeting (4-05-2002) recognised this when he advised moderators to

start looking holistically before marking by criteria. In the interview (10-02-2001)

Cindy admitted to assessing students’ work by combining criterion-referenced and

holistic methods. A conclusion was drawn that, in practice, methods of assessment were not exclusively criterion-referenced.
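
The contrast between the two marking methods may be sketched schematically. The following sketch is purely illustrative and forms no part of any awarding body's documentation; the marks, the 0-20 scale per objective and the tolerance value are hypothetical assumptions:

    # Illustrative sketch only: the marks, the 0-20 scale per objective and
    # the tolerance are hypothetical, not any awarding body's actual scheme.

    def criterion_referenced_total(marks_per_objective):
        """Purely criterion-referenced marking: score each assessment
        objective in its own 'box', then aggregate the numbers."""
        return sum(marks_per_objective.values())

    def holistic_then_criteria(holistic_mark, marks_per_objective, tolerance=5):
        """Form an overall judgement of the work first, then use the
        criterion marks only as a cross-check on that judgement."""
        aggregated = sum(marks_per_objective.values())
        if abs(holistic_mark - aggregated) <= tolerance:
            return holistic_mark  # the parts confirm the whole
        # whole and parts disagree: a real assessor would re-examine the
        # work; averaging merely stands in for that step here
        return (holistic_mark + aggregated) / 2

    marks = {"AO1": 14, "AO2": 11, "AO3": 15, "AO4": 12}  # hypothetical candidate
    print(criterion_referenced_total(marks))   # 52: the sum of the parts
    print(holistic_then_criteria(58, marks))   # 55.0: whole and sum disagreed

The first function mirrors the aggregation Richard described; the second approximates the practice the Assistant Principal Moderator recommended and Cindy admitted to, in which the criteria serve to check, rather than to produce, the overall judgement.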

Elisabeth, Cindy, Ian and Sally objected to the marking scheme, which they

considered inappropriate for judging the quality of art works. In Elisabeth’s words:

…it needs something else, it is that peculiar thing for the art subjects, I
don’t know what to call it…. it is not necessarily the creativity … it is
something that makes it different…that extra…I don’t think this is
included. I think that it should be something else (Elisabeth, L, 7-02-
2001).

Elisabeth recognised that judging art works was not a rigid and immutable process; she understood that holistic marking processes should be taken into account. Overall

judgement or holistic marking is widely viewed as a legitimate procedure for

assessing art and creative work (Hennessy, 1994; Eisner, 1986; Armstrong, 1994).

Boughton (1999) reported that in the 1999 revision of the International Baccalaureate

examinations, overall judgement was included in the list of criteria in order to

improve the model.

Overall judgement or holistic assessment was not mentioned in the GCE assessment

matrices published in 2000, and according to previous examination reports this was a

radical change from the traditional processes of assessment in art and design.

A report on art and design examinations published in 1999 by the AQA awarding

body noted that teachers and moderators usually assessed students’ artworks by

holistic judgements. So even in 1999 it was recognised that criterion-referenced

assessment was in conflict with the nature of learning in the arts:

There are inherent dangers in devising coursework experiences,


which present the objectives as discrete elements to be covered in a
mechanistic and often self-contained manner. Similarly, a
distinction has to be made between the assembling of information,
as an end in itself, and the use of this information in subsequent
developments. This issue is referenced in one report and amplified
in others. It states, ‘Generally, a major concern is this – the
assessment objectives reflect intentions and responses within the
holistic nature of making Art. This is an art course and the
candidates’ artwork – whether progressive, developmental or
outcome – is what we reward’ (AQA, 1999 Art and Design, Report
on the examination)

However, the regulations for GCE mark schemes in England introduced in 2001

for new modular A/AS examinations used criterion-referenced marking exclusively

with the aim of making assessment in all subjects more ‘objective’ and uniform.

When interviewed, Elisabeth objected to this saying:

I think this is difficult to understand by non-art people. This is what we


have been told we must do, we have been told for a long time that we
must move to be in line with other subjects, and this has been terribly
difficult. It is very difficult because you have non-art people telling you
to do things when they don’t really understand the art process. That’s
why I think that in the assessment there should be that little extra, there
should be another objective somewhere to cover (I know that it sounds
terrible arrogant) [the fact] that art is different. I think the process is very
different; it is a very different process to any other subjects. I think that
there are other things which go into it, although very difficult to assess,
but need acknowledging, you need to acknowledge students who are
particularly creative (Elisabeth, L, 7-02-2001).

This raised the question of whether mark schemes exclusively based on discrete

criteria were appropriate for art subjects. Elisabeth and Cindy viewed criterion

referencing as an imposition introduced mainly ‘to align it with other exams’.

When these criteria are compared with Boughton’s concept of criteria as ‘windows’

(1996), they may be too small for art and design, which explains why Elisabeth thought that they 'just need opening out a little bit more'.

But Richard, who was a chief examiner, expressed the view that the majority of teachers agreed with existing mark schemes and that considerable consistency of

marking was achieved:

‘There is an agreement about the marking; we have less than ten re-
marking situations in about 100 centres. Teachers generally were very
positive about it; they were not negative about the marking scheme.

That mark scheme actually works quite well’ (Richard, SEE, 4-02-
2001).

This suggests that mark schemes were clear and easy to apply despite the complaints

that they were inappropriate. However, consistency of results could also be a result

teachers’ tacit knowledge. It could be a demonstration of connoisseurship, which is

an essential quality for achieving reliability of results (Eisner, 1985).

5.6. Assessment procedures

Any instrument ultimately depends on the success of assessment procedures.

Training teachers and moderators was at the core of the English model of

assessment. In fact, the breadth of submitted evidence and the nature of art and design

criteria in this assessment model made in-service training a condition of success.

5.6.1. In-service teacher training

Well, the training is based on this [assessment criteria] and based


on samples, so we have standardisation samples for teachers.
When we look at these criteria [sic], this is before they mark,
alongside particular samples of work, and so they can see what
the numbers are, and also leave that meeting with a booklet with
samples alongside the marks and commentaries, so that is used by
everybody. Nothing is hidden in how it is marked; everybody
knows exactly what is to be marked, and all hinges on the
interpretation of criteria (Richard, SEE, 4-02-2001).

Although the awarding bodies had established procedures in place for teacher

training in the form of INSET courses, attendance was not compulsory; Cindy, Sally

and Ian found it difficult to attend because of the great number of candidates and

because of the demanding workload at school. Cindy (10-02-2001) claimed that,

while the government was especially interested in teacher improvement, there were

insufficient facilities for ‘guaranteed continuous professional development’.

The teacher interviewees regarded teamwork and regular teachers' meetings at

regional level to share and discuss assessment as a most important strategy to ensure

consensus about standards and criteria. Training, as understood by Cindy, not only

depended on centralised courses but was part of a long-standing tradition of art and

design teachers in England using dialogue to arrive at consensus about assessment

(Cindy, NEW, 10-02-2001). However, Cindy said that such practices, once common,

had tended to disappear. The courses promoted by the awarding bodies at the time of

the research were not intended to promote negotiated assessment procedures.

In Cindy’s view they were designed to impose policies. However, they must have

worked, to some extent, because the instructions for external assessment were

explained to teachers. A conclusion was drawn that an efficient system of training

can foster a common interpretation of criteria.

5.6.2. Internal marking and internal moderation

The analysis of documents revealed that the assessment procedures were the same

for all the awarding bodies, because the QCA Code of Practice defined them.

Students' own teachers assessed candidates' work in 'centres', which might be a single school or college, or a group of institutions. Typically the internal marking included internal moderation. Teachers were required to record marks for all units of work, for all candidates, using the assessment matrix, and then to transfer a final mark for each unit for every candidate onto specially designed forms. They also had to arrange a display of folders of work from candidates in the moderation sample, ensuring that all pieces of work within each unit were clearly labelled, and to send a copy of the relevant marking forms to the awarding body.

Cindy said that in her school internal moderation took place as required by the Code

of Practice and awarding body instructions, but in Sally’s school this was not the

case, possibly due to pressures of time, extra work and an uncommitted head of

department. Sally said that she experienced significant problems with a lack of

information exacerbated by the fact that it was her first year of teaching and she got

little support from her head of department. Cindy’s description of the process was

very different. As an experienced teacher, she had clearly acquired appropriate tacit

knowledge and contacts with other schools and the art department at her school

worked cohesively.

According to Richard, who worked at the OCR awarding body, there was a high

consistency of results in the 2001 examinations and Elisabeth commented that no

significant changes had been made to internal marks in the Edexcel AS examinations

in the schools she visited as a team-leader in 2001. Therefore in Elisabeth’s and

Richard's views there was consistency of results, which suggests that internal marking and moderation were completed according to expected standards.

5.6.3. Standardisation

At the time this research was carried out, the awarding bodies in England had

established a team structure for standardisation and moderation procedures

involving: chief examiner(s); principal examiners; assistant principal moderators;

team leaders and moderators. Standardisation procedures consisted of meeting(s) to

secure consistent application of mark schemes by all examiners. According to the

Code of Practice, standardisation involved the following phases: (1) an

administrative briefing, normally by an awarding body officer to explain awarding

body procedures, time schedules, documentation and contact points; (2) explanation

by the principal examiner of the nature and significance of the standardisation

process; and a briefing by the principal examiner on relevant points arising from

previous examinations, drawing as necessary, on relevant statistical data and points

made in reports about examinations; (3) training and simulation of marking

candidates’ work. This agenda was strictly applied during the two observations of the

Edexcel awarding body carried out as part of this research (Appendix X).

5.6.3.1. Observation of standardisation procedures

At the Edexcel awarding body there was evidence of formal training and

standardisation for moderators in one-day meetings for a large group consisting

mainly of teachers. Fifty-three moderators attended an AS standardisation meeting

on 3rd May 2001 and eighty moderators attended an A2 standardisation meeting on 4th

May 2002. Training and simulation of marking candidates’ work was observed

taking place during familiarisation and blind-marking exercises.

In standardisation, the awarding body used exemplar work from its archives and

‘live’ work collected from centres in the year of examination to explain how

candidates’ work should be marked in accordance with assessment criteria. In

addition, exemplar portfolios for teachers and moderators were supposed to be published in book form, with videos made available to centres. According to the QCA Code of

Practice, examples of coursework assessment and moderation must be prepared by

the awarding bodies containing assessed work that shows 'clearly how credit is to be

assigned to particular assessment objectives’ (QCA, 2000, p. 17). At the AS and A2

Edexcel standardisation meetings attended on 3rd May 2001 and 4th May 2002 there

was evidence of some exemplar work, but a large part of the examples displayed at

the latter consisted of photocopies.

The work was displayed by unit. Near each display were the title of
the unit and a copy of the assessment matrix. Six sets of works were
already marked, accompanied by the completed assessment matrix
with underlying objectives and marks attributed for each objective
(familiarisation exercises). Seven sets of works were not marked
(blind marking exercises), they were marked during the second part
of the meeting by the moderators (researcher diary: observation AS
standardisation meeting, Mansfield, 3rd May 2001).

Only two samples presented the original student work. A set of ten
samples of photocopied students’ work was displayed for
familiarisation; an exemplar of the assessment objectives and marks
attributed to the work was attached to each sample. A set of twelve
samples of photocopied students' work was displayed for blind
marking exercises. An exemplar of the assessment objectives and
marks proposed for the work by centres was attached to each
sample. In the same fashion as the blind marking samples, a set of seven
samples of photocopied students’ work was displayed as ‘RES’, or
reserved work, to be used if the moderators could not attain
accuracy during the blind marking exercises (researcher diary:
observation A2 standardisation meeting, Mansfield 4th May 2002).

During the observations of Edexcel standardisation meetings, the chief examiners,

assistant principal moderators and team leaders explained the assessment criteria to

the moderators using samples of student work during the ‘Familiarisation Exercises’.

Although the samples were not very varied, the exercise seemed to help moderators

understand how to apply the assessment objectives and trained them in the

moderation procedures (See Appendix X).

All the moderators received photocopies with the assessment matrix
for each one of the examples with the marks and the justification of
the marks attributed to each one of the assessment objectives
(which were underlined). The Team Leader asked the moderators to
look carefully at the examples and at the attributed marks and to
discuss them. Afterwards, the Team Leader explained the work and
the attributed marks showing the evidence related to the assessment
matrix and why it was marked in that particular way (researcher
diary, observation AS standardisation meeting, Mansfield 3rd May
2001).

The team leader explained the marks using the language of the
assessment matrix and adding comments such as ‘the work revealed
a weak area of response to other artists’; ‘the student revealed
confidence in some areas’; ‘the student spent too much time on
research’; ‘the degree of analysis lacked depth and breadth’
(researcher diary: observation A2 standardisation meeting,
Mansfield, 4th May 2002).

The 'Blind Marking Exercises' consisted of a simulation of marking candidates'

work.

According to one of the chief examiners the purpose of the blind


marking was the training of moderators through the marking of
examples of students' work that illustrated the range of performance
likely to be demonstrated by the candidates in the centres and
helped moderators to consolidate a common understanding of the
application of the mark scheme. The purpose of the standardisation
was not to discuss the assessment matrix. In the words of one
assessment coordinator: ‘the moderator has to accept examples not
to discuss them’ (researcher diary: observation AS standardisation
meeting, Mansfield, 3rd May 2001).

Standardisation seemed to be a very rigid procedure. Richard (4-02-2001) explained

that this was training, although Ian (7-07-2001) expressed some doubts about this

training. During a GCE AS Edexcel standardisation meeting in 2001, standardisation

was observed by the researcher to be a strategy to ensure uniform interpretation of

criteria; moderators were ‘calibrated’ to apply the mark schemes without having an

active opportunity to discuss them. The researcher did not perceive standardisation

as negotiated interpretation but rather as the imposition of the chief examiner’s

interpretation of the criteria. From the events that took place during the Edexcel A2

standardisation meeting in 2002, a conclusion was drawn that standardisation was

also a strategy to evaluate moderators’ marking accuracy.

….the group had to mark the blind marking samples; and verify if
the marks attributed by the centres to the works were consistent. It
was an individual exercise; moderators had 10 minutes to mark
each sample; they marked in silence; the team leader was constantly
advising the group to read the sketchbooks because ‘everything or
almost a great part of the evidence is in there’ (researcher diary,
observation A2 standardisation meeting, Mansfield, 4th May 2002).
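
The consistency check at the heart of the blind marking exercise may be expressed as a simple comparison between the marks proposed by the centres and those awarded by the moderator. The sketch below is hypothetical: the tolerance band, the function names and the marks are assumptions made for illustration, not Edexcel rules:

    # Hypothetical sketch of the blind-marking consistency check; the
    # tolerance band and all marks are assumptions, not Edexcel rules.

    def marks_agree(centre_mark, moderator_mark, tolerance=3):
        """A moderator's mark is treated as consistent with the centre's
        if the two fall within an agreed tolerance band of each other."""
        return abs(centre_mark - moderator_mark) <= tolerance

    def moderator_accuracy(centre_marks, moderator_marks, tolerance=3):
        """Proportion of blind-marked samples on which the moderator's
        marks were consistent with the marks proposed by the centres."""
        agreements = [marks_agree(c, m, tolerance)
                      for c, m in zip(centre_marks, moderator_marks)]
        return sum(agreements) / len(agreements)

    # Twelve blind-marking samples, as at the A2 meeting (marks invented):
    centres   = [55, 48, 62, 40, 70, 52, 58, 45, 66, 50, 60, 44]
    moderator = [54, 50, 61, 47, 69, 53, 55, 44, 65, 52, 58, 45]
    print(f"accuracy: {moderator_accuracy(centres, moderator):.0%}")  # 92%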

However, problems of time and a lack of diversity of samples were apparent. It was not

clear in the time available whether or not the standardisation would be accurate.

During the exercises a speaker was constantly reminding the group


about the schedule; moderators had to move very fast (researcher
diary, observation A2 standardisation meeting, Mansfield, 4th May
2002).

A conclusion was drawn that the large scale of the standardisation meetings could

work against the achievement of consistent marking. The most positive aspect of

standardisation appeared to be the cascade strategy through which chief examiners

standardised assistant principal moderators and team leaders and, in turn, team

leaders standardised a group of moderators. Familiarisation exercises usefully exemplified the application of assessment objectives in students' work and assisted a common interpretation of criteria. The blind marking exercises provided a useful mechanism for evaluating moderators' accuracy in applying criteria and consistency of marking. It was concluded that standardisation meetings were also a means of enhancing moderators' performance through team leaders' feedback, with additional training meetings or individual training for new moderators provided by team leaders where necessary.

5.6.4. Moderation
According to the Code of Practice (QCA, 2000) moderation was designed to verify

the consistency of teachers’ marks in centres. In the English model it played a very

important role in ensuring the reliability of teachers’ (and moderators’) internal and

external assessment, not only in ensuring that the same standards were applied but

also the fairness of the system. Students’ work submitted for examination was

guarded from the personal idiosyncrasies of the teachers through this process.

The fairness of results was ensured and teachers could use moderators' feedback

positively to improve their performance.

Not all the candidates’ work was submitted for moderation. The moderation sample

comprised work randomly selected across the mark spectrum by the awarding body, together with the work of the candidates achieving the highest and the lowest marks at a centre. Centres

had two options for moderation: (1) to send a moderation sample to the awarding

body or (2) to have the sample moderated at the centre by a visiting moderator.

Option (2) was most commonly used for art and design examinations.
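
The sampling rule described above lends itself to a simple illustration. The sketch below is a hypothetical rendering of that rule, assuming that candidates' marks for a centre are available as a list of pairs; the sample size and the banding of the mark spectrum are assumptions made for the example:

    import random

    # Hypothetical sketch of the sampling rule described in the text: the
    # highest- and lowest-marked candidates are always included, and the
    # remainder is drawn at random across the mark spectrum. The sample
    # size and the banding are assumptions for illustration.

    def moderation_sample(candidates, size=6):
        """candidates: list of (candidate_id, mark) pairs for one centre."""
        ranked = sorted(candidates, key=lambda pair: pair[1])
        sample = [ranked[0], ranked[-1]]          # lowest and highest marks
        middle, n_bands = ranked[1:-1], size - 2
        if len(middle) <= n_bands:                # small centre: take everyone
            return sample + middle
        for band in range(n_bands):
            # one candidate drawn at random from each band of the spectrum
            lo = band * len(middle) // n_bands
            hi = (band + 1) * len(middle) // n_bands
            sample.append(random.choice(middle[lo:hi]))
        return sample

    centre = [("C01", 34), ("C02", 71), ("C03", 55), ("C04", 62),
              ("C05", 48), ("C06", 80), ("C07", 59), ("C08", 41)]
    for candidate_id, mark in moderation_sample(centre):
        print(candidate_id, mark)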

Moderation consisted basically of ensuring that standards were maintained from year

to year and were consistently applied. Moderators arranged the visits to centres

within fixed time schedules set up by the awarding body. A centre might be visited

by a single moderator or by several, for example accompanied by team leaders,

assistant principal moderators and the chief examiner. At the start of a visit the head of department, or the teacher representing him or her, described and discussed the course

content and organisation, any problems experienced and the timetable (Edexcel,

2000, Instructions for Teachers). According to Edexcel's Instructions for Moderators (Edexcel, 2000), the primary function of a moderator was

to decide whether marks awarded by a teacher were in accord with published

marking criteria and conformed with overall national examination standards.

5.7. Conclusions in terms of validity summarised

5.7.1. Weaknesses

The main weaknesses of the model were related to the modular structure of courses

and assessment, which was understood to threaten validity. This presented practical

problems for teachers and students because of the enormous amount of evidence

required and pressures of time. The negative consequences were that it distorted the

curriculum. The following particular weaknesses in terms of validity were

identified:

• Some problems of response validity were detected in the question papers for

the controlled test because of the written nature of the papers and

information required.

• Sources of potential bias included: gender bias, social class and geographical

background bias, and ethnic bias. But, at the same time, teachers appeared to have strategies for minimising these potential disadvantages by giving

additional help to students. These strategies could be adopted because

students’ submissions for external assessment were guided by the students’

own teachers.

• Process and product were not equally weighted in the assessment criteria.

• Core domains of art and design and essential knowledge, understanding and

skills in art and design were minimised by the mark schemes.

• A major problem was caused by the mark schemes that were entirely based

on criterion referenced marking, and which rejected art and design

judgements based on holistic methods.

• The examination imposed strong policies without sufficient respect for

traditional beliefs and practices in English art education. A great

contradiction was found between teachers’ perceptions of art and design

assessment and the centrally imposed model of criterion referenced mark

schemes.

• It was apparent that examination developers at the Edexcel awarding body did not pay sufficient attention, at the design and development stage of the instrument, to the reality of art and design practices through empirical research and needs analysis, because government imperatives were implemented in haste.

5.7.2. Strengths

The following strengths in terms of validity were identified:

• The model had well-articulated criteria, illustrated by visual exemplars

provided by the awarding bodies, which was a great advantage in terms of

understanding and applying the criteria.

• The instructions for the examination included a framework for identifying

learning needs and for interpreting examination performance; a broad

explanation of what candidates should be able to do in the different domains

using statements of achievement; broad explanations of what candidates

should be able to submit for the examination as final products, sketchbooks,

preliminary studies, etc., and clear explanations of how examination work

was graded.

5.8. Conclusions in terms of reliability summarised

5.8.1. Weaknesses

The archives of exemplary work, as required by the QCA Code of Practice, were

vital to the success of teachers' and moderators' training. However, the lack of

samples for training at Edexcel seemed to threaten the validity and reliability of the

examination because the clarity of criteria and standards depended essentially on

these samples.

5.8.2. Strengths

The assessment procedures in the English model had significant advantages over the

Portuguese model in terms of reliability. The following strengths were identified:

• The type and format of the instrument, which was based on multiple evidence

of students' achievement produced during coursework and controlled tests,

provided several assessments of the same trait. Systematic sampling was

possible in order to obtain consistency through multiple measures.

• The awarding bodies' system of standardisation of teachers and moderators

was well designed to develop a uniform understanding of the standards.

• This was an advantage in terms of reliability of the procedures because there

was a common understanding of criteria and also because teachers,

moderators and awarding body officers were constantly monitored.

• External moderation provided a way to check the consistency of results in

relation to agreed national standards determined by awarding bodies.

5.9. Practicality

I think it is true to say that all the art and design examinations involve
internal assessment and external moderation. Most of those who have
written [letters from members of NSEAD] welcome the involvement of
teachers in the assessment process but question the ‘unpaid’ extra work
this involves… An additional problem is that the marking load is all
concentrated around Easter and cannot be moved forward without further
disadvantaging students whose courses in art and design are already
effectively shorter than for the most other subjects (Steers, 2001).

The interviewees constantly referred to problems with the practicality of the

examination. There were, for example, problems with logistics: there was not always room in the schools to exhibit and store the amount of evidence produced for examinations. There was also an unreasonable teacher workload: there were difficulties in setting schedules for assessment within schools, given teachers' timetables and the unpaid extra work that they carried out. This suggested that the examination was not sufficiently

adapted to available resources. The fact that three separate units of work were

assessed in each year increased the problems of practicality, especially because this

required a great amount of extra work for students and teachers, which was seldom

provided for in school timetables.

5.10. Impact

The English model of external assessment was complex and fulfilled several

functions. These included temperature taking, gate keeping, determining whether

assessment objectives had been attained, providing feedback for teachers on the

quality of their professional work, and ranking schools. It was not used only for

purposes of students’ selection for higher education or for judging student

achievement, but also as a way to monitor centres and teachers’ performance and

methods of delivering the curriculum. Assessment results were important for all

kinds of users, including parents, students, teachers, employers, local educational

authorities, and government and higher education institutions.

According to Cindy (10-02-2001): 'It [art curriculum] is very much assessment driven

and criteria driven’. The impact of external assessment upon schools was considered

to be potentially negative:

The more good results the schools have in examinations the more money
comes in. I am not sure if you are a parent you want to send your son or
daughter to a school – if a school has poor grades you won't look for
that school, you look for another school (Elisabeth, L, 7-02-2001).

The interviewees constantly referred to concerns about the consequences of the

examinations upon teachers, curriculum and schools. They suggested that

examinations were intended to influence practice, distorted the curriculum, imposed

policies, restricted core domains of art and design, and limited learning time.

The examinations could also have a very negative impact upon the type of work that

students developed during the courses because of strict instructions or strict

interpretation of the examination instructions and exemplar material.

During the standardisation meeting the team leader warned moderators
about the similarity of students’ works in centres. I thought that it was
kind of revealing that in the centres the students followed strict
orientations and the work should be very similar, at least in the processes
used and organisation; does this mean that art and design in schools is
very prescriptive and there are no great variations in students' outcomes? In
this school [St. X School] looking at the art students’ works I have
exactly this feeling: all the student work seems to follow strict
prescriptions even in the kind of sources they used (research diary; 8th
May 2002).

The interviewees considered that accountability in the form of league tables was

excessive. It was considered to have a major negative impact upon schools and

students; it was seen as unfair:

Grades are public. Teachers’ unions have been claiming that it is not a
way to judge a school, there are some very good schools doing amazing
things but not getting good grades, they are doing a lot of social skills,
lots of activities. Things in the community – none of these are assessed.
It is very unfair (Elisabeth, L, 7-02-2001).

A dramatic situation in which students were advised to withdraw from an

examination because schools did not want to get bad examination results was described by Sally (7-07-2001) as an extreme example of the negative impact of examination results. Similar situations were reported by the press (Halpin, 2003) in

which teachers ‘cheat’ by helping students with coursework submissions or refuse to

enter less able students for examinations so as not to damage their ranking in the

league tables.

A conclusion was drawn that the examiners’ reports published at the end of each

examination had a positive impact because they provided feedback for teachers and

schools. These included an analysis of work submitted for examinations and

an evaluation of art and design practice in schools. Awarding bodies also published

general reports about examinations, which constituted a form of evaluation of all

the examination procedures, analysing strengths and weaknesses and providing

recommendations for teachers and moderators. This feedback from the awarding bodies was a very positive consequence of the model, constituting in itself a form of training for teachers and moderators. As Elisabeth said:

We all report as well, as moderators. We pick up lots of things and


then put it into the reports; there is actually quite a lot of feedback
from the centre by the moderators. It is important that teachers have
this feedback. The booklet, reports that the exam board sends to
everyone, it is quite useful, quite a lot of comments in there about
the tasks, highlights things that have gone wrong and again is up to
them to take that [advice] (Elisabeth, L, 7-02-2001).

5.11. Suggestions for art and design examinations

The following key ideas for developing art and design examinations emerged at this

stage of the research:

1. It would be necessary to develop an instrument of assessment that provided

authentic tasks, related to art and design methods of learning and making,

such as portfolios or selected coursework with no terminal test.

2. It should weight process and product equally, requiring evidence of both.

3. It should allow both written and visual language, enabling students to decide individually which kinds of communication best suit them.

4. It should not fragment art and design teaching and learning into modules of

assessment. A single final ‘examination’ should be developed to confirm

coursework evidence.

5. It should have flexible criteria allowing a combination of criterion referenced

and holistic assessment.

6. Clear unbiased instructions for the examination will need to be produced.

7. Teachers should be trusted to deliver the examination under normal

classroom conditions and to explain the tasks and criteria to students

according to their different individual contexts and situations.

8. The instrument and procedures should be based on teachers’ ownership of the

assessment.

9. It should be adequately piloted.

10. Regional programmes of compulsory teacher and moderator training should

be integrated to develop a common understanding of criteria through

dialogue and consensus.

11. Processes of checking teachers' and moderators' assessment performance

need to be developed.

12. Provision for examination feedback for schools and teachers would be

necessary.

Summary
Art and design examinations in England were traditionally characterised by

flexibility and diversity, but the 1988 Education Reform Act brought about a chain of

considerable changes culminating in the Curriculum 2000 reform, which tended to

centralise art education curriculum and assessment with greater prescription and

monitoring. Without claiming to offer a general evaluation of the whole system,

some positive and negative aspects were noted which provided insights for the

development of a reformed assessment instrument and assessment procedures in

Portugal. The English instrument for external assessment was considered generally

relevant and appropriate to art and design methods and courses. The tasks addressed

context and authentic situations. It was a process-orientated instrument, emphasising

the need to research artworks and design products. Requiring written evidence as

well as visual evidence provided an opportunity to assess the same knowledge,

understanding and skills on separate occasions. However, the instructions were not always considered clear to users, and the instrument relied on teachers' particular approaches, which could be a threat to the response validity of the instrument.

The underlying model of the instrument raised some problems of content and

construct validity, especially in the ways that the mark schemes appeared to

minimise core aspects of art and design such as production, exploration, independent

thinking, risk-taking and more genuinely creative artistic work. The modular

structure of the assessment raised problems of validity, threatening the established

developmental process of art and design learning and making.

The instrument was judged as not equally fair to all students, and problems of

gender, social class and ethnic bias were found. The model offered considerable

reliability of assessment results and suggestions were made about the reasons for

such reliability, including the type and format of the instrument, the traditional

professionalism of teachers, the strong system of in-service training and the

structured system of control of assessors and moderation. Finally a list of

consequences of the assessment was provided illustrating positive and negative

effects upon students, teachers, schools and curriculum.

Chapter 6
Design and piloting of a new external assessment instrument and procedures in
Portugal

This chapter is a report of a new conceptual framework for art and design

examinations at pre-university level that was developed and tested out in this

stage of the research. The content framework for the art and design

examinations is discussed in order to explain the materials developed for the

pilot experiment, which is described and analysed.

6.1. Introduction
Some suggestions and draft materials for the new assessment instrument and

procedures were devised during the first stage of the research taking into account

ideas in the research literature and analysis of the data from the study of current Portuguese and English art and design examinations.

After analysing the data and conclusions of the study of English and Portuguese art

and design examinations described in Chapters 4 and 5, some principal guidelines

and constraints were drawn up for a new model of external assessment in Portugal.

Although the English examinations had some of the strengths sought by Portuguese

participants, problems had been identified by the researcher which needed to be

addressed in order to avoid future problems of validity and practicality, for example,

sources of bias and teacher and student overload (See Appendix XXVIII).

6.2. Designing a new assessment instrument and procedures for art and

design external assessment in Portugal

The design of the assessment instrument and procedures was developed in three

stages:

• Stage 1: Development: rationales and draft specifications of the assessment instrument and assessment procedures.

• Stage 2: Trial 1 – Piloting: negotiation with participants; trial with a reasonable sample; evaluation and refinement of the draft specifications for the assessment instrument and assessment procedures.

• Stage 3: Trial 2 – Implementation: trial of a larger-scale experimental assessment and evaluation of the specifications for the assessment instrument and assessment procedures (see Chapter 7).

6.2.1. Rationales

The first stage of the design concerned the specifications for the proposed art and

design examinations. A rationale and a content framework were developed from

ideas encountered in research literature; the views of teachers and students (Chapter

4); and through an analysis of Portuguese art and design syllabuses. The specification

of the instrument followed the rationale for art and design education described by

Swift & Steers (1999) who argued that art education should address three

fundamental principles: difference, plurality and independent thought.

These principles are embodied in the promotion of risk-taking, personal enquiry

and challenging established orthodoxies, especially those associated with cultural

hierarchies. Thus, learning in the arts should develop disciplinary knowledge,

creative and critical thinking, and the skills to interrogate dominant ideologies.

The rationale was further inspired by four dimensions of student

understanding in the arts proposed by Ross et al. (1993, p. 51):

1. Conventionalisation: an awareness and ability to use the
conventions of art forms.
2. Appropriation: embracing for personal use, the available
expressive form.
3. Transformation: the creative search for personal knowledge.
4. Publication: the placing of the outcome in the public domain.

These dimensions of aesthetic learning operated as four basic quadrants from which

to systematise the contents and skills of art and design for assessment purposes.

6.2.2. Skills in art and design

The review of literature (Chapter 1/2) provided some insights into what is

or what should be assessed in art and design. Eisner (1972, pp. 212-216)

claims that the qualities that could be evaluated in the arts are: 'Technical

skills, aesthetic-expressive aspects, creative imagination', and he notes

that these are in constant interaction. According to Dorn (2002) there are

three important elements that need to be assessed in art education:

expression, knowledge and skill, and concept formation.

6.2.2.1. Knowledge base

The design of the new assessment instrument was influenced by the idea that

creativity or creative thinking requires a substantial knowledge base, (Alexander,

1992; Amabile, 1987). The knowledge base should include theory, for example

concepts and techniques of art and design, as well as knowledge of past and

contemporary art, design and media works from different cultures. Knowledge and

understanding of art and design works (and, more broadly, visual culture) using both

formal and contextual analysis was considered very important. But the new

instrument should not provide only a test of memory and analytical skills; it was essential also to include critical skills. Creative work in art and design depends on knowledge of art and design works, understanding of the principles and elements of visual language, and the acquisition of skills, technologies and materials (Dobbs, 1992; Best, 1996). Creative art and design work needs the contextual knowledge of

art and design discovered through research and critical analysis of wide-ranging

material from visual culture. An understanding of disciplinary knowledge may

enable students to extend boundaries through personal appropriation; what Eisner

(1972, pp. 217-222) called 'boundary pushing': the ability to attain the possible by

extending the given.

Like other creative processes, artistic work involves memory and researching

relevant resources, response generation and response evaluation (Amabile, 1990).

Through understanding the symbolic rules and conventions the student can combine

and reorganise existing knowledge structures or conceptual categories. These new

linkages or rearrangements can provide new ways of understanding a problem

situation, providing a source of new ideas.

6.2.2.2. Thinking skills and metacognitive skills


Intentions and motivations are seen as intrinsic parts of the creative process.

The student receives stimulus in the form of external inputs from their social

milieu and also initial impetus from within the individual (Amabile, 1990;

Csikszentmihalyi, 1990). A decision was taken, therefore, that the new assessment

instrument should include the personal and social milieu inputs through a record of

intentions, experiences and motivations. Creative skills involving convergent and

divergent thinking abilities such as critical thinking, problem-finding, problem

solving and decision making skills were also considered important to allow students

to challenge established ideas, concepts and ways of making; to resist stereotyped

visions of the world and break boundaries. In boundary breaking, ‘students see gaps

and limitations in present theories and proceed to develop new premises, which

contain their own limits, they must be able to establish an order and structure

between the gaps they have 'seen' and the ideas they have generated’ (Eisner, 1972,

pp. 217-222). In the new instrument, skills of creative thinking and making involve

appropriation and transformation strategies concerned with understanding

conventions within cultural contexts to enable students to convert them into a

personal style, signature or voice (Ross et al, 1993, p.53). These skills, considered in

the process, could be described by terms such as: ‘experimentation’, ‘exploration’,

‘discovery’, ‘problem-finding’, ‘analysis’, ‘synthesis’, ‘evaluation’, ‘risk-taking’,

‘decision-making’, ‘problem solving’ and ‘communication’.

6.2.2.3. Self-evaluation skills

Evaluation skills play a significant role in creative thinking (Feldhusen & Goh,

1995). Together with self-evaluation they were considered an essential component of

the design of the new instrument. Self-evaluation skills were interpreted as the

capacities to justify an outcome; to explain how to solve or to define a problem; they

involve sustaining the original insight, evaluating, elaborating and developing it to

the full (MacKinnon, 1962, p. 485). Self-evaluation, in what Ross (1993) calls the

publication stage, was extremely important in the design of the instrument. By this

means students could reveal their ability to persuade others of the value of their work, including the audience, viewers or assessors who would later validate it.

6.2.3. Defining a content framework for the new art and design external

assessment

The aims and general objectives described in the Portuguese Studio Art and Drawing

syllabuses were broad enough to enable the construction of a content framework that

included knowledge of art and design, independent thinking, critical and creative

skills and self-evaluation. Although technical skills in art and design formed a large

part of the content, space was left for creative process and contextual studies of art

and design. The current Portuguese syllabuses allowed the construction of a core

framework in the form of a blueprint (see Figure 11), which attempted to foster a

holistic vision of art and design, promoting risk-taking, personal enquiry and critical

skills.

Component 1: Process (Conventionalisation). Public: social culture/visual culture; students' social environment. Private: situations, experiences, motivations, dispositions, intentions.
Contents: knowledge base; understanding symbolic rules (the history, meaning, purposes and conventions of visual products); understanding visual language; visual perception; visual media; methods of visual representation; process of inquiry and making.
Activities: search, collect and organise information, mapping, listing, analysis, synthesis, select, combine, associate, articulate, connect, re-organise, problem-finding, critique, evaluate.
Evidence: interpret; record personal ideas, intentions, experiences and information; critical analysis of sources; reveal search/critical skills.

Component 2: Process (Appropriation).
Contents: independent thinking; critical thinking; metaphorical thinking; convergent-divergent thinking.
Activities: experiment, explore, discover, problem-finding, analysis, synthesis, evaluate, combine, generate ideas, explore possibilities, risk-taking, regression, boundary pushing, decision-making; problem-solving abilities, evaluative abilities, flexibility, endurance, persistence, resistance.
Evidence: develop ideas; reveal technical expertise (process of inquiry and making), imagination and persistence; craft skills, technical skills, media skills, communication skills, evaluation skills.

Component 3: Process/Product (Transformation).
Contents: making, invention, imagination, intervention, flexibility, communication, evaluation.
Activities: generation, boundary breaking, problem solving, decision making, create, communicate, evaluate.
Evidence: produce outcomes showing contextual understanding and technical expertise; find a new order/personal aesthetic response or intervention fitting the purposes and conditions; craft skills, technical skills, media skills, communication skills, evaluation skills.

Component 4: Product (Publication). The public validates the work/idea/concept.
Contents: self-evaluation; display; explain, justify; persuade.
Activities: communication, evaluation, explanation, justification, persuasion, validation.
Evidence: reveal evaluation skills, showing understanding of art and design theory, techniques and purposes; communication skills, evaluation skills.

Figure 11: Blueprint for the design of a new art examination

6.2.3.1. Kinds of evidence

According to the majority of Portuguese art teachers and students who responded to

the survey (Chapter 4, p. 139), ideally art and design assessment instruments should

be developed using evidence from portfolio tasks requiring a combination of written

and practical responses. They wanted the portfolio to include: a record of students’

intentions and how these are realised; preliminary investigation and research work;

work journals/ annotated sketchbooks; developmental studies; final products; self-

assessment notes or reports; and group criticism notes. These tasks should provide

students with an opportunity to reveal what they can do and what they know.

6.2.3.2. Portfolio

After having considered the views of Portuguese teachers and students and the

experience of English examinations, the portfolio was adopted as the assessment

instrument for the new examination. In compiling a portfolio, students explore a

theme, plan, elaborate, present and evaluate their work (Lindström, 1998).

Some degree of portfolio assessment has been tested out and used in a variety of

contexts (International Baccalaureate; Arts Propel; Project Zero; GCSE Coursework;

The Netherlands; Finland). Research on portfolio assessment had already

demonstrated the positive potential of portfolios as evidence for assessment in both

formative and external assessment (Gardner, 1992; Beattie, 1994; Hernandez, 1997).

6.2.3.3. Portfolio evidence

The portfolio was devised as an open-ended project to explore a theme that should

include selected evidence of students’ intentions, motivations, critical inquiry,

explorative and developmental studies, final products and records of self-assessment.

The required evidence needed to be flexible enough to allow students to choose

themes and tasks that best suited their individual learning styles, gender, ethnic group

and social background. The intention was to give them opportunities to choose

themes and tasks in order to allow fairness and equity in assessment, so that those

who were disadvantaged in one task could have an opportunity to offer evidence

of alternative expertise in another. It was considered also important to permit a wide

range of media and techniques in order to enable differentiated outcomes.

Self-assessment evidence was included in order to ensure students’ voices were

heard and their intentions made clear. Portfolio evidence was organised into three

main components: (1) Process; (2) Product and (3) Self-evaluation. However process

and product were not seen as discrete units; the framework used both concepts with

the necessary flexibility to respect different ways of working in art and design. The

evidence to be selected for assessment by students may be schematised as follows:

The student portfolio, presented as a folder, exhibition, work journals, CD, web page, etc., could include the following types of evidence:

• Reports or notes about previous experiences, interests, etc. (visual/written);
• Preliminary studies and developmental records (visual/written);
• Investigation reports and data from critical inquiry (written and visual);
• Final products (visual): paintings, drawings, sculptures, prints, graphic design, product design, multimedia, photographs, films, video records of performances, installations, exhibitions, etc.;
• Self-assessment reports and interviews (written or oral: tape, video or digital records) about the student's intentions, progress, investigation, achievement, presentations and evaluation; records of self-assessment and 'crits'.

Figure 12: Types of evidence for portfolio
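
To show how such evidence might be catalogued under the three components described above, one student's submission can be represented as a simple data structure. The sketch below is hypothetical: the field names and example entries are inventions for illustration, and the grouping follows the process/product/self-evaluation framework:

    # Hypothetical sketch: one portfolio submission grouped under the three
    # components of the framework; field names and entries are illustrative.
    from dataclasses import dataclass

    @dataclass
    class Portfolio:
        student_id: str
        theme: str
        process: list          # records of intentions, research, developmental studies
        product: list          # final outcomes
        self_evaluation: list  # self-assessment reports, 'crit' notes

    portfolio = Portfolio(
        student_id="P-017",        # invented identifier
        theme="Memory and place",  # invented theme
        process=["notes on intentions", "investigation report",
                 "annotated sketchbook"],
        product=["final painting", "photographic series"],
        self_evaluation=["self-assessment report", "group criticism notes"],
    )
    print(portfolio.theme, len(portfolio.process), "process items")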

In light of the Portuguese tradition of one single examination period at the end of the

art and design course, it was considered appropriate to develop one single portfolio

project as the final art and design assessment instrument for Portugal. A decision was

taken that the external assessment should be conducted in the last term of the

academic year during normal art and design class time. Students and teachers would

receive all the instructions at least one month before the examination to allow for

preparation of the project.

6.2.3.4. Assessment criteria

The design of the instrument was limited by the nature of assessment itself: it is an

exercise of power, involving some kind of judgement over others. It was limited, in

particular, by the need to establish criteria that would be subject to common

interpretation by users. The five criteria suggested by the researcher for negotiation

and refinement during the pilot were devised in accord with the content framework:

AC1: Record personal ideas, intentions, experiences, information and opinions in visual and other forms.
AC2: Critical analysis of sources from visual culture showing understanding of purposes, meanings and contexts.
AC3: Develop ideas through purposeful experimentation, exploration and evaluation.
AC4: Present a coherent and organised sample of works and final product revealing a personal and informed response that realises their intentions.
AC5: Evaluate and justify the qualities of the work.

Figure 13: Assessment criteria
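
To make the criteria concrete during negotiation, a marker's record for a single portfolio might be sketched as below. This is a hypothetical illustration only: the 0-20 scale, the equal default weightings and the optional holistic adjustment are placeholders for values that were to be negotiated and refined during the pilot:

    # Hypothetical sketch of a marker's record under the five criteria; the
    # scale, weightings and holistic adjustment are placeholders for values
    # to be negotiated during the pilot, not fixed by the framework.

    CRITERIA = ["AC1", "AC2", "AC3", "AC4", "AC5"]

    def portfolio_mark(criterion_marks, weightings=None, holistic_adjustment=0):
        """criterion_marks: dict mapping AC1..AC5 to a mark out of 20.
        weightings default to equal weighting; holistic_adjustment is a
        small signed adjustment reflecting an overall judgement of the
        portfolio as a whole, allowing the combination of criterion-
        referenced and holistic assessment sought in this framework."""
        if weightings is None:
            weightings = {criterion: 1 for criterion in CRITERIA}
        weighted = sum(criterion_marks[c] * weightings[c] for c in CRITERIA)
        return weighted / sum(weightings.values()) + holistic_adjustment

    marks = {"AC1": 15, "AC2": 12, "AC3": 16, "AC4": 14, "AC5": 13}
    print(portfolio_mark(marks, holistic_adjustment=+1))  # 15.0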

6.2.3.5. Assessment procedures

From the results of the survey in Portugal and research literature it was suggested

that the new examination should have assessment procedures based on appropriate

in-service teacher training and a process of moderation. The example of Edexcel

procedures for standardisation of moderators and moderation contributed to the

development of the new assessment procedures.

In-service teacher training was viewed as particularly important in order to achieve

the consensus of interpretation of criteria that is necessary for consistency of results

(Hennessy, 1994; Amabile, 1983; Csikszentmihalyi, 1988; Feldman,

Csikszentmihalyi & Gardner, 1994). A plan for in-service teacher training and

moderation was developed by the researcher to include explanation of instructions

and simulation of marking using examples of students’ portfolios all designed to

achieve consensus when interpreting criteria. In order to achieve consistent results a

procedure for external verification of the internal marks was also designed. Students’

portfolios (or a significant sample) would be made available to one or more external

assessors for checking the fairness and consistency of internal results.
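To make the verification step concrete, the sampling rule can be sketched in code. The following is a minimal illustration only (the study does not prescribe an algorithm; the function name, sample size and marks below are hypothetical): it selects portfolios spread across the internal mark range, so that the external assessor sees work at every level of achievement.

    def moderation_sample(marks, size=5):
        """Select a sample of portfolios for external verification.

        marks: dict mapping portfolio id -> internal mark (0-20 scale).
        Returns ids spread across the mark range, from lowest to highest,
        so external assessors can check standards at every level.
        Illustrative only; the sampling rule is not prescribed by the study.
        """
        ranked = sorted(marks, key=marks.get)      # ids ordered from lowest to highest mark
        if len(ranked) <= size:
            return ranked                          # small cohorts are verified in full
        step = (len(ranked) - 1) / (size - 1)      # even spacing across the rank order
        return [ranked[round(i * step)] for i in range(size)]

    # Hypothetical class of ten internally marked portfolios (0-20 scale):
    internal_marks = {"s1": 8, "s2": 12, "s3": 15, "s4": 9, "s5": 18,
                      "s6": 6, "s7": 11, "s8": 14, "s9": 10, "s10": 17}
    print(moderation_sample(internal_marks))       # ['s6', 's4', 's7', 's3', 's5']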

6.3. Piloting the new assessment instrument

The design of the assessment instrument needed a priori validation through a pilot

examination. The pilot (Trial 1) was intended to ensure that negotiation took place

between users in order to determine final formats, tasks, media, criteria, mark

weightings, grade descriptors, time, and assessment procedures (see Appendix XIV,

pp. 170-188). The objectives of piloting the initial instrument were: (1) to test the
validity and reliability of the instrument, and (2) to propose strategies to overcome
any problems thus detected.

During the pilot, data were collected through researcher observation notes,

questionnaires, student group interviews and reports. A research diary was kept in

order to collect information about the experiment, where notes about the pilot

sessions were recorded and transcriptions made of informal interviews with the

participants. One external observer evaluated the pilot or first trial: Graça Martins,

a recognised Portuguese expert in art and design education (see Appendix XIII).

Three student group interviews were conducted with a sample of twelve students
between 12 and 16 January 2003. The interview schedule included three sections

with questions about the validity of the assessment instrument (see Appendix XI).

A questionnaire for students and teachers involved in the trial and a checklist for

teachers were also used to collect data (see Appendix XII, pp. 141-166). The

questionnaire for students was conducted between 11 and 17 January 2003. The

checklist and questionnaire for teachers were conducted during the fourth pilot

training meeting (session 4, 11-01-2003). The pilot questionnaire included questions
used in the initial survey questionnaire (sections 3 and 4). However, in the trial, it
was found that the questionnaire and checklist schedules were not very effective and
needed further revision before they could be used in the main trial. Some questions
were repeated (2.6; 5.3; 5.4); others were irrelevant (3.10; 3.11; 3.12) or too
technical for students (4.5). A decision was therefore taken to remove the repeated
questions, add new detailed questions about instructions, criteria, weightings and
tasks, merge the teachers' checklist questions into the questionnaire for teachers,
remove the checklist, and provide more space for observations in the questionnaire.

6.3.1. Pilot Sample (Trial 1)

A list of teachers willing to participate in the trial of a new instrument was compiled

from the survey questionnaire. From that list, ten teachers from the central region of

Portugal were contacted to participate in the pilot. Initially the ten teachers and their

students agreed to participate, but later three of the teachers withdrew because they

could not attend the time consuming meetings that were necessary.

Teachers (n = 7)
Age: 30-40 (4); 41-50 (2); 51-62 (1)
Gender: male (2); female (5)
Background: painting (3); sculpture (1); design (3)
School: School V
Subject taught: Studio Art (OA) (4); Studio Design (OD) (1); Design geometry (DGD) (2)
Role in the pilot: teacher (5); moderator (2)

Table 24: Trial 1 sample teachers

Students (n = 51)
Age: 16 (3); 17 (6); 18 (9); 19 (5); 20 (1)
Gender: male (24); female (27)
Course: Technological (12); General (39)

Table 25: Trial 1 sample students

6.3.2. Place, time and programme

The pilot was held at Escola Secundária Alves Martins in the city of Viseu. An

auditorium was used during the first session to allow exemplar portfolio documents

to be projected. The other sessions took place in a normal classroom. The schedule

for activities was as follows:

Session 1 (28/9/02, 8.30h-13.30h):
• Explanation of the aims and purposes of the pilot
• Explanation of the assessment instrument: portfolios and assessment procedures based on moderation
• Explanation of the proposed underlying model and content framework
• Display of examples of portfolio assessment instructions from Finland, Arts Propel, the International Baccalaureate and Edexcel (UK)
• Display of reproductions of visual exemplars of students' portfolios (AS and A2) from Edexcel's art and design examinations
• Discussion
• Distribution of roles: teachers were asked to experiment with a portfolio with their students; only IL and AC could not do this because they were teaching descriptive geometry at the time, but they were interested in participating as external assessors

Session 2 (12/10/02, 8.30h-13.30h):
• Production of instructions for the new examination: a draft document including general guidelines for the format, tasks, criteria, weightings and grade descriptors was discussed

Parallel activity (October-December): experimental trial project using portfolios with students (22 hours); negotiation with students

Session 3 (16/11/02, 8.30h-13.30h):
• Evaluation of progress; experimenting with and discussing marking procedures and mark schemes (familiarisation exercises)

Session 4 (11/1/03, 8.30h-13.30h):
• Standardisation: assessing samples; revising criteria and mark schemes. Fourteen completed portfolios were marked collectively in order to establish the standards
• Parallel activities: first marking by the teacher; moderation by one external assessor; discussion of the assessment with students (questionnaires and interviews with teachers and students)

Session 5 (18/1/03, 8.30h-13.30h):
• Evaluation of the pilot; revision of the instrument and procedures; verification of the accuracy of marking and evaluation of the experiment
• Parallel activities: analysis of teachers' questionnaires, students' questionnaires, checklists and interviews; multi-facet Rasch analysis of the assessment results

Table 26: Trial 1 schedule

6.3.3. Feedback from the pilot

The pilot was an essential part of the design of the new instrument since teachers' and
students' opinions and suggestions were constantly sought in order to obtain their

expert judgments and to negotiate processes. It was important to explain to the

volunteers the nature of portfolio-based assessment and moderation using current

examples because, for them, it was very new in terms of external assessment.

Teachers reflected upon the proposed model, content framework and draft

specifications and agreed that the proposal was generally in accord with their own

models of teaching art and design and perception of content. The time for the

external assessment was set by common agreement: (1) preparation work: 25 hours

or 3 weeks extra class time; (2) developmental studies and final products: 15-20

hours during the three-hour class-time sessions; (3) 2 hours for the self-assessment

report during the last class session.

The participants considered standardisation of teachers and moderators a useful

exercise in order to establish a common interpretation of the criteria. All the

volunteers marked a sample of thirteen portfolios and the results showed that the

difference between markers was 1-3 points out of a total of 20 (for details of the

Portuguese marking system see p. 114). Internal consistency was 1-2 points’

difference between the first and second marking by the same assessor. Although two

markers obtained less consistency, the moderation marking results for inter-rater

reliability showed some degree of consistency (See Chapter 7, p.286). The

volunteers considered that the instrument had strong validity and provided valuable

results. They suggested final revisions of the instructions and helped to inform the

definitive instructions for the trial (see Appendix XIV).

The pilot exercise was centred on negotiation. ‘The design was based on teacher

ownership: sharing power and constructing knowledge instead of an elitist, top down

vision of the assessment’ (Trial 1 External observer report, 2003-03-08/Appendix

XIII). Participants and the external observer considered this collaborative approach

to the design of the instrument a novelty. Teacher ownership of the assessment was

ensured and emphasised, and students' opinions were also taken into account during

the pilot exercise.

Examinations used to be imposed by the government; the question


papers only express the view of the people in the GAVE about
what they value in art and design or about what they think should
be assessed. With this experiment I felt that things can be different;
we can also express our views about what must be assessed and we
ended up by enlarging the vision … (TR: AP; 18/01/2003)

It was quite odd during the examination preparation, because the


teacher was always asking us if we agreed with the instructions and
so on; I think it was a sign of respect for us; it was helpful not just
because we could express our opinions but also because by talking
about it we better understood the instructions (SR: M; 15/01/2003).

6.3.3.1. Underlying model

The participating teachers and external observer considered the underlying model

appropriate for the subject (See Trial 1 External observer report, 2003-03-08/

Appendix XIII). Generally teachers said that it was in accord with their own views

about what should be taught in art and design and with the learning objectives for art

and design.

I think this has much to do with my own ideologies and it is very


close to my way of teaching. …all this is about [the] art and design
process of inquiry and making (TR R, 12/10/02).

6.3.3.2. Format

In general, the portfolio-based examination was considered a valid assessment

instrument and teachers, students and the external observer welcomed the flexibility

of portfolios, which allowed for differentiated responses and respected students' own

knowledge, opinions and interests. The questionnaire results showed a general

agreement with the validity of the external assessment in terms of contents, clarity of
instructions and criteria, and appropriateness of tasks and weightings. However students

judged the instructions and criteria to be less clearly stated than teachers did, possibly

because of the novelty of both tasks and criteria.

Questionnaire results: percentage of agreement (Students, n = 51; Teachers, n = 7):
2.1. The exam allows a considerable range of options. (Students 94.1%; Teachers 100%)
2.2. The exam respects diversity of learning styles. (Students 94.1%; Teachers 100%)
2.3. The exam respects students' motivations and intentions. (Students 86.3%; Teachers 100%)
2.4. The exam respects students' opinions and interests. (Students 86.3%; Teachers 100%)
2.5. I think that the examination allows students to display the knowledge, understanding and skills set out in the syllabuses. (Students 96.1%; Teachers 100%)
2.7. The exam components (Process, Product and Self-evaluation) are appropriate for the disciplines of art and design. (Students 92.2%; Teachers 100%)
2.8. The assessment criteria are appropriate for the disciplines of art and design. (Students 94.1%; Teachers 100%)
2.9. The assessment matrix is appropriate for the disciplines of art and design. (Students 94.1%; Teachers 100%)
2.10. The examination questions or tasks focus on highly relevant knowledge, understanding and skills for future artists and designers. (Students 96%; Teachers 100%)
3.1. The instructions for the examinations are clearly stated. (Students 80.4%; Teachers 100%)
3.3. The assessment criteria in the examination are clearly stated. (Students 82.4%; Teachers 100%)

Table 27: Responses to questions about format.
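For clarity: the percentages in this and the following tables are simple proportions of respondents endorsing each statement, rounded to one decimal place. A minimal sketch of the arithmetic (the counts here are illustrative, not the study's raw data):

    def percent_agreement(agree_count, respondents):
        # Percentage of respondents agreeing with a questionnaire statement,
        # rounded to one decimal place as in Tables 27-34.
        return round(100 * agree_count / respondents, 1)

    # Illustrative counts only: 48 of the 51 students agreeing with a
    # statement yields the 94.1% reported for items such as 2.1 above.
    print(percent_agreement(48, 51))   # 94.1
    print(percent_agreement(7, 7))     # 100.0 (the teacher figures)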

Students considered the instrument valid, in their own words:

Yes, we could make whatever we wanted to, in terms


of choosing artists and developing ideas. And also
with final products, it was very open (SR S,
15/01/2003).

It was useful to be properly assessed, because we


could show a variety of sketches and experiments (SR
G, 17/01/2003).

Students considered that the external assessment required them to perform in an

authentic way, realising tasks that were closely related to art and design learning

practices:

…we had to search the work of others, finding


problems in artistic works, in other works like media
and our daily lives, we had to sketch a lot, represent
the ideas, transform them, solving a problem through
the final product and we had to evaluate our work (SR
F, 15/01/2003).

We had to experiment with materials, techniques, formal elements
and composition, like colour, shadow, textures, etc. We had to
show technical abilities; knowledge about art forms and art making,
we had to show how creative we were (SR A, 15/01/2003).

We had to show that we could investigate sources like design and


art works; we had to show expressive works and technical abilities
and capacities to communicate like expressing ideas and judgments
(SR T, 17/01/2003).

We had to show persistence; effort, imagination, critical abilities


and personal responses (SR S; 17/01/2003).

However some problems resulted from the unfamiliarity of the assessment

instrument format. Although teachers said that they normally used coursework for

assessment, they considered that the format of the portfolio was more demanding in

terms of the working methods and types of evidence submitted.

…students are not used to justifying their work; explaining the


reasons and the processes (TR AP; 18/01/2003 ).

Students are not used to writing comments. I never asked for it


before, I only used conversations with students (TR C; 18/01/2003)

Things like the work journal are not commonly used, however
Studio Art syllabuses talk about it. But students are not used to it
(TR C; 18/01/2003).

In fact students had not previously been required to make notes or comments about

their working processes, progress or achievement. So providing this type of evidence

was new for them:

I was not used to being critical towards the work of others and my
own work. It was quite difficult to make the written comments (SR
F; 16/01/2003).

But, for F and some others, the difficulty was not the writing skills but the evaluative

skills:

… it is not the writing which is difficult; I prefer to write because I
have time to think during the writing. The difficulty was in being
critical, analysing the works and expressing personal opinions.
We were not required to do so before the portfolio. Last year,
they [teachers] were more interested in our drawing and painting
abilities (SR F; 16/01/2003).

The new assessment instrument, although designed in accord with curriculum

materials (Trial 1 External observer report, 2003-03-08/Appendix XIII), introduced

unfamiliar approaches into Portuguese art education. It was a challenge for teachers

and students used to art and design assessment focused on craft-skills and self-

expression. The pilot experiment revealed that students, more than teachers, felt the
change of approach to art and design assessment. Although 92.2 percent of students
agreed that they were well prepared to develop portfolios (questionnaire statement
6.1) and 100 percent of teachers agreed that they were well prepared to conduct the

examination (questionnaire statement 6.1), they complained about a lack of

familiarity with tasks requiring evidence of critical thinking. According to Graça

Martins, the assessment instrument promoted a culture of self-reflection and

expected students to become reflective-practitioners (Trial 1 External observer

report, 2003-03-08/Appendix XIII). However, teachers and students alike considered

critical skills important. Teachers suggested that to overcome this problem there was

an urgent need to develop such skills. Some strategies were proposed, such as

promoting more group discussions in which the teacher could act as a mediator,

introducing facilitating questions to help students to reflect upon their work and the

work of others. Students considered group discussions useful during the development

of portfolios. For example, M said:

It was fine when we had the group discussions about our work, we
helped each other. Well, sometimes my friends told me I was doing
wrong and I didn’t agree with them; but it was an useful exercise,

in the way we had to explain what we were doing and making
statements about the quality; what is good and why…using the
criteria words to explain it. But I am not sure about its importance
for assessment (SR M; 14/01/2003).

According to the questionnaire results, students and teachers used group discussion

during the preparation of portfolios, but students and teachers had different views

about the importance of teachers' influence upon students. Teachers' responses
showed that they believed their influence to be less important than students
perceived it to be. This may be because teachers thought that their influence was not very

strong since the portfolio was produced with a minimum of their help, whereas

students believed that teachers’ opinions were influential.

Questionnaire results: percentage of agreement (Students, n = 51; Teachers, n = 7):
6.2. It was important to discuss the contents of the portfolios. (Students 88.2%; Teachers 100%)
6.3. It was important to have regular group criticism or 'crits' with students. (Students 96.2%; Teachers 100%)
6.4. The contents of the portfolio were regularly discussed with the teacher. (Students 82.4%; Teachers 100%)
6.5. It was important that students have the chance to talk about and explain the work in their portfolios regularly with the teacher during preparation time. (Students 88.2%; Teachers 100%)
6.6. It is true that I am very interested in hearing students' views about the work in their portfolios. (Students 92.2%; Teachers 100%)
6.7. Students had freedom of choice to make their project briefs according to their interests and motivations. (Students 94.1%; Teachers 100%)
6.8. It is true that I influenced students in their choice of works and approach. (Students 58.8%; Teachers 20%)

Table 28: Responses to questions about preparation.

However, including evidence of students' performance during group discussions in
the external assessment was regarded with caution:

I think the teacher should take it into account. But some students
are more individualistic; they need to be left alone. So it is not fair
for them, if they don’t like to talk, it is their right to be silent during
group discussions (SR F; 16/01/2003).

Students agreed that for some of them, group discussions might benefit assessment.

Teachers agreed that the best way to include them in the final assessment would be

through the recommendations of the students’ own teacher. They agreed that

evidence of students’ performance during group discussions should be used as

a value-added, confirmatory element and should not be used in any way to

disadvantage students. The use of work journals was another strategy designed

to foster self-reflection. Some teachers observed that the students who developed

their projects with work journals were more critically aware than the others.

Students agreed that work journals were useful:

I liked the work journal, it helped me a lot to explain things and


also to have ideas, because I wrote about my feelings and made
collages of images that impressed me (SR N; 15/01/2003).

The question of whether work-journals should be compulsory was considered.

It was agreed that, in some cases, students could be disadvantaged by this

requirement and, besides, alternatives such as a PowerPoint presentation or a CD-

Rom including visual, oral or written comments could be equally valid. Therefore, it
was agreed that the external assessment should strongly recommend the use of work
journals, or other evidence of reflection using verbal or non-verbal language.

The questionnaire results showed that 68.8 percent of students and 85.7 percent of

teachers agreed with statement 6.13: ‘It was important to have the option of

providing visual and oral (taped) comments about the intentions, quality and progress

of the work in the portfolio'. However, in the pilot all the students chose to express
their notes and comments in writing; this should not eliminate the possibility of
using other means of expression. Whether to use verbal or non-verbal language was

debated during teachers’ meetings and there was consensus that students should have

the freedom to use the kind of language that best suited their knowledge and

personality. Poor writing skills should not disadvantage students and there was a

shared opinion that visual and oral commentaries were acceptable to express
students' intentions, opinions and evaluations. It was recommended, therefore, that
students could submit notes and comments by audio, video and digital means, and
that such records should be afforded the same value as a work journal or a written report.

6.3.3.3. Bias

The questionnaire results showed that no significant bias was detected related to

gender, geographical location, and racial/ethnic/religious background. However

some students detected bias related to social class; in their view, portfolios
demanded a considerable effort to acquire resources, like drawing materials and
bibliographies, that were not available to all.

Questionnaire results: percentage of agreement (Students, n = 51; Teachers, n = 7):
3.6. The examination is free from gender bias (e.g. equally fair to all students, boys and girls). (Students 94.1%; Teachers 100%)
3.7. The examination is free from bias related to where students live (e.g. equally fair to students from big cities and students from small cities or rural areas). (Students 90.2%; Teachers 85.7%)
3.8. The examination is free from bias related to racial/ethnic/religious background of the students. (Students 100%; Teachers 100%)
3.9. The examination is free from bias related to social class. (Students 76.5%; Teachers 85.7%)

Table 29: Responses to questions about bias.

Analysis of teachers' comments during the pilot training sessions and
students' group interviews showed that the instrument increased teachers'
responsibilities. The fact that the teachers' role was so important in supporting the

preparation of the portfolios raised a problem of equality of opportunity.

The teachers discussed this point during the last pilot training session and

recommended that teachers should have training meetings and mutual support before

and during the examination. They thought meetings should act as a vehicle for

training and also for monitoring. However it was agreed that students should not be

left out of the dialogue and the researcher proposed that for future trials an Internet

website for the examination might be published, with a teachers’ and students’

discussion forum available to all the users. A students’ forum might reduce bias in

the instrument because those students with a less committed teacher could ask other

teachers for additional support during the preparation of the portfolios.

6.3.3.4. Tasks

Students, teachers and the external observer considered the portfolio tasks

appropriate and well organised, presenting a balance between process and product

(for details about tasks see Appendix XIV, p. 171 and p.173).

It was the first time students had been required to submit a considerable amount of

evidence for assessment. Although they complained about lack of time, they agreed

that the portfolio provided a way of structuring their work, because they found the

instructions clear and the tasks inspiring.

Questionnaire results: percentage of agreement (Students, n = 51; Teachers, n = 7):
6.9. It was important to develop investigative and developmental studies during preparation time. (Students 90.2%; Teachers 100%)
6.10. It was fair to develop some developmental studies and final products in examination conditions. (Students 94%; Teachers 100%)
6.11. It was important to make self-evaluation reports and notes. (Students 51%; Teachers 85.7%)
6.12. It was important to have the students' written comments about their intentions, quality and progress of the work in their portfolios. (Students 84.3%; Teachers 100%)

Table 30: Responses to questions about tasks.

I think the tasks are fine; and the way portfolio is organised helped
us to develop the project, gave us a kind of method of work (SR M;
15/01/2003).

Exploring ideas, materials and techniques throughout the time was


quite good. We had time to learn and improve our skills, sometimes
we had to go back, we had to select the best solutions, I think it was
really creative because without doing it, if we just had final
products, we could not go so far (SR N; 15/01/2003).

Some students found the investigative studies or research tasks, which were new to

them, very time consuming.

…the investigative work was boring; we are not used to doing this
in Studio Art . [R: Do you think it will be better to eliminate
investigative work?] No, not at all, we learned through searching
the work of artists, I think it is helpful to develop our own ideas, in
the sense that we see other things; we discover ways, techniques,
materials, approaches to ideas and it is not like in history of art, we
can chose our sources and study works we like and which are
useful for our projects (SR T; 16/01/2003).

Previously students had written essays about artists and art movements but had not

been required to connect their practical work with what they learned from theory.

In a great majority of the portfolios, the researcher observed that students failed to

connect the analysis of sources with their own projects. Interviews with students

showed they were aware of this. They said that it was very new in terms of

requirements and they did not fully understand it at the beginning.

Investigative studies were boring. Well, I think I did them badly


because I collected so many things and I didn’t use them in my
project (SR M; 17/01/2003).

Some portfolios showed little variety in terms of developmental studies and

exploration of ideas. The teachers thought that this might be a consequence of the

breadth of the instructions. They recommended that the instructions should
state what was expected of students more clearly and in simpler language.

All the teachers were strongly in favour of self-evaluation tasks, but self-evaluation

requirements were not considered so important by all students, and some boys in

particular distrusted them:

Self-evaluation, I don’t think it is very important. I don’t like to do


it (SR G; 16/01/2003).

Generally boys did not like self-assessment, and it was apparent from the interviews

that this was partly because it implied extra effort. However the majority of girls

agreed that the inclusion of self-evaluation tasks improved their learning and

performance.

We need to know what we have done to merit the marks. I think we


need to understand the quality, I mean what was wrong and good
and why and self-evaluation is a good way to help me to
understand that. And it is also good for the teacher to know what I
think about, we can discuss it. I don’t think I can influence my
teacher but he can understand and help me to improve (SR S;
17/01/2003).

I think self-evaluation is important. Because the external assessor


does not know about the development of the project, we need to
explain it. And it is good for us, because we have the opportunity to
say what we think, to express our views about progress and
achievements. I think we have lots of benefits by doing self-
evaluation (SR F; 16/01/2003).

6.3.3.5. Time and resources constraints

Questionnaire results: percentage of agreement (Students, n = 51; Teachers, n = 7):
2.11. The physical conditions and resources for the examination (accommodations, equipment and materials) were adequate in my school. (Students 54.9%; Teachers 57.1%)
2.12. I think that students have plenty of time to complete the examination. (Students 21.6%; Teachers 57.1%)

Table 31: Responses to questions about time and resources.

The responses from both students and teachers to questions about timing in the

questionnaires suggested that this was a main constraint, but after reflecting on this

during interviews they agreed that time was not the real problem; rather, it was the lack
of a disciplined working method. Because it was the first time they had developed a

portfolio with such requirements, they did not plan the various phases of the work

sufficiently.

I liked to search the history of alphabets; I liked the searching part


but it took a lot of time, I was not really organised and in the end I
had little time to explore ideas and to make the final product (SR S;
14/01/2003).

We had little time for the final product; I think the final product is
very important. But I agree with the sketches, we need to
communicate the ideas before the final product… but I didn’t plan
it well (SR F; 16/01/2003).

According to the teachers, planning was not the only problem; students were not

motivated to carry out extra schoolwork, especially because Studio Art was not

considered an important specialism.

But this is time-consuming; I had difficulties with the students


because they are not willing to make extra-work; they only work
during class time. They had other subjects they considered more
important, like maths and geometry to study. And they have no time
to fully develop their abilities. You know they are not used to
working hard in Studio Art (TR R; 18/01/2003).

But when students were asked if they thought the preparatory studies should be

eliminated from the model, surprisingly all seemed to be in favour of them. It was

clear from their responses in interviews that they had not realised how demanding this
examination was.

No, it is not that we don’t want to make this extra-school work. I


think it is much better for us if we have that possibility… but, you
know this time we didn’t realise how important it was for
developing the portfolio in class. We were not used to submitting
so many things and usually we had good marks without making any
extra effort. I think we didn’t realise that this time it was different,
the teacher was more demanding and we could not have a good
mark without making extra efforts (SR M; 15/01/2003).

It was evident that the new assessment instrument challenged old habits of work in

the discipline, and all students were quite aware that standards had been raised in

terms of achievement and commitment. The need to improve methods of planning

artwork for portfolios was noted by teachers and students.

The physical conditions and resources for the examination (classroom space and
conditions for storage of materials) were inadequate in the school where the pilot
took place; however, the school had a good library and all the students had access to
the Internet.

6.3.3.6. Criteria and weightings

The proposed assessment criteria and assessment matrix introduced students to a
different approach to the qualities to be judged in their performance. It was agreed
that they were well adapted to the curriculum and underlying teaching models, and
were fair and balanced for art and design students. The emphasis on process, critical
reflection and self-evaluation was approved as essential to the nature of art and

design learning and practice. However the unfamiliarity of some tasks was a problem

because of the inherent risk of failure. Teachers agreed that these risks should be

taken, at least in the pilot. They were in favour of the underlying conceptual model,

even though they had not required students to reveal so fully their capacities for

critical reflection and self-evaluation in the past.

Teachers and students agreed with the assessment criteria and weightings (see
Appendix XIV, pp. 183-184). Teachers thought they were a good basis for
establishing common ground for judging students' performances, but they identified a
need to simplify the language in the assessment matrix (see Appendix XII, p. 136).

Although students agreed with the criteria in the questionnaire, some of them

complained they were unsure of what qualities their art and design teacher and

external assessors were looking for in their portfolios. According to the teachers AC

and IL, students did not understand the statement in question 3.4 (Appendix XII,

p.147): ‘Students are aware of what qualities their art and design teacher and

external assessors are looking for in their portfolios’, and tended to misinterpret the

term ‘qualities’ as ‘taste’. A revision of the criteria in the instructions for students

was needed to explain further the concept of ‘qualities’ and how this was intrinsic to

the assessment criteria.

Questionnaire results: percentage of agreement (Students, n = 51; Teachers, n = 7):
3.4. Students are aware of what qualities their art and design teacher and external assessors are looking for in their portfolios. (Students 70.6%; Teachers 100%)
4.1. In my opinion, the criteria used to assess the portfolios are fair and recognise students' effort and achievement. (Students 96.1%; Teachers 100%)
4.2. In my opinion, the criteria used to assess portfolios are sufficiently flexible to accommodate a wide range of work and student responses. (Students 90%; Teachers 100%)

Table 32: Responses to questions about criteria.

6.3.3.7. Impact

Students understood the portfolio not only as an assessment instrument but also as

a learning experience. They agreed that the portfolio was an important tool for

preparing them for further study in art and design.

Questionnaire results: percentage of agreement (Students, n = 51; Teachers, n = 7):
5.1. I agree that putting together the art and design portfolio has been a worthwhile learning experience for the students. (Students 90.2%; Teachers 100%)
5.2. Students' art and design portfolios are a good foundation for further study in art and design. (Students 84.3%; Teachers 100%)

Table 33: Responses to questions about impact.

The new instrument influenced the way art and design was taught and perceived, not

by narrowing but by increasing the range of contents. In the responses to the

questionnaire, 85.7 percent of teachers agreed with statement 5.5: ‘It is true that the

examination positively influenced my teaching practice’ and 71.4 percent of teachers

agreed with statement 5.7: ‘The examination provided me with a different

perspective on art and design education’. According to the results of the teachers’

checklist, the main effects of the instrument on teaching were related to the need to

foster investigative studies and self-evaluation tasks in their teaching practice (see

Appendix XII, section IV).

6.3.3.8. Assessment procedures

Students, teachers and the external observer considered the assessment procedures to

be fair. According to the questionnaire results, there was a high level of
teacher and student agreement with the assessment procedures.

Questionnaire results: percentage of agreement (Students, n = 51; Teachers, n = 7):
4.3. I think it is important that students' portfolios are assessed by more than one person for the final assessment, to minimize subjective bias. (Students 90.2%; Teachers 100%)
4.4. T: I think that my students' portfolios were fairly marked. / S: I think my portfolio was fairly marked. (Students 98%; Teachers 100%)
4.5. The assessment procedure combining holistic and criterion-referenced marking is appropriate to judge art and design works. (Students 100%; Teachers 100%)
4.6. The standards were consistently applied. (Teachers 100%)
7.1. It was useful to discuss the interpretation of criteria with other teachers. (Teachers 100%)
7.2. During standardisation meetings we reached consensus about interpretation of criteria. (Teachers 100%)
7.3. It was useful to have visual exemplars to understand the criteria and standards. (Teachers 100%)
7.4. I had adequate training to assess students' portfolios. (Teachers 100%)

Table 34: Responses to questions about assessment procedures.

In the last pilot session (18-01-2003) teachers said that they had reached a common

interpretation of criteria and that the meetings had proved very important for training

them to mark students' work consistently. Teachers considered that using

students’ portfolios containing real examples of work, rather than photographic or

digital records, was very helpful in reaching a common understanding of standards.

I was afraid about external marking; because it was the first time I
encountered it. But in the end it worked quite well and this was a
sort of checking out my standards. The exercise with portfolios was
helpful to share a common interpretation of criteria, so I am not
surprised with the results of external assessment; we shared the
same views about the levels of achievement. (TR C; 18/01/2003).

And this process of external marking was fair for everyone; at the
same time I feel safer because someone is checking the accuracy of
my marking (TR P; 18/01/2003).

The proposed assessment procedures were considered more reliable than the existing

examination procedures. Clarity of assessment criteria and the matrix was identified

as a key factor enabling consistency of results, but the major reason given for

improved accuracy of marking was the quality of teacher training and the use of real

student portfolios for standardisation. However the training sessions were time-

consuming, partly because all the instructions had to be negotiated through long

group discussions and feedback from students.

The time spent in training sessions for teachers was not realistic given that the

teachers were already overloaded: the withdrawal of three participants made this

clear. This aspect of the model needs further thought. Shortening the training

sessions would be practical, but it would probably have negative effects on the

procedures. A possible solution to the lack of time for training could be providing

on-line support and arranging training/standardisation sessions in at least two regions

of the country in order to avoid the problems of teachers undergoing long distance

travel. However it is reasonable to expect that the time spent might be reduced in

further trials as familiarity with the assessment model increased.

6.4. Suggestions for improvement

Fostering the active involvement of teachers and improving the clarity of instructions

for students (see Appendix XIV, pp. 176-181) might overcome the problem of the

unfamiliar format of assessment. Portfolios demand more work, and more organised

work, from students in terms of motivation, structure and presentation. In the new

assessment instrument, the relationship between teacher and student changed when

assessment was integrated into learning. The instrument was considered to have the

necessary validity characteristics because it enabled students to display their

knowledge, understanding and skills in art and design, and offered a rich range of

art and design activities that were considered inspirational. The instrument was

judged to be flexible in its range of options. But points of fragility were detected

such as unfamiliarity with the format and potential bias resulting from the role of the

teacher. To overcome such problems there is a need to establish a clearer set of
instructions and assessment matrix, effective in-service teacher training and purposeful dialogue

between teachers and students. A possible strategy for attaining such goals within

time and space constraints might be: (1) to conduct only one standardisation meeting

with the volunteer teachers, (2) to create a permanent support structure during the

examination in order to establish dialogue between teachers and students, possibly
through online support on a dedicated website.

Summary

In this chapter the design of the new instrument was described and the external

assessment draft materials were explained. The assessment materials were discussed

and negotiated during the pilot in order to investigate issues of validity, practicality

and reliability. Problems were identified and were used to re-design the examination

instructions (see Appendix XIV). It was concluded that the new instrument and

procedures for assessment required: (1) development of programmes for in-service
teacher training; (2) a change of attitudes towards studio-based art disciplines,
increasing the value of Studio Art in the curriculum and encouraging students to
engage with the subject with greater commitment; and (3) a major change in the
approach to curriculum and pedagogy to develop critical and self-evaluation skills,
educating students as reflective practitioners.

Chapter 7
Main Study: Trial of the new external assessment instrument and
procedures in five schools in Portugal

In this chapter the trial of the new instrument and external assessment

procedures in five Portuguese schools is described, analysed and evaluated.

The evaluation was based on data collected during the trial through

researcher observation notes, questionnaires, interviews and reports.

7.1. Trial sample

The sample was determined from a list of nineteen teachers who were willing to

participate in the trial of a new instrument obtained from the questionnaire survey

respondents (Appendix VII) and from the list of teachers involved in the pilot.

The teachers were contacted by telephone; the trial activities and timetable were
explained to them and they were asked if they were still willing to participate. Ten of
them accepted the invitation. The resulting sample of teachers was aged
thirty to fifty, with six to twenty-five years of teaching experience, and came from

different regions of the country (North, Centre and South). They included five

teachers who had collaborated in the pilot, three of whom continued to use portfolios

in their studio art internal assessment throughout the year, while the other two were

experienced teachers who had acted as moderators during the pilot study and were

willing to continue as moderators in the main trial.

However these teachers were clearly not representative of all secondary studio art

teachers in Portugal since their availability for the trial suggested a particular interest

in reform and motivation to experiment with new assessment approaches.

Name; Age; Teaching experience (years); Gender; Background; School; Role in the trial; Number of students
J; 42; 20; Male; Master Multimedia; P (south); Teacher; 5
AM; 39; 16; Female; BA Design; B (south); Teacher; 15
C; 30; 6; Female; BA Design; V (centre); Teacher; 23
AP; 32; 8; Female; BA Design; V (centre); Teacher; 10
I; 39; 14; Female; BA Design; K (south); Teacher; 22
E; 42; 16; Female; BA Architecture; K (south); Teacher; 15
M; 50; 25; Male; BA Painting; A (north); Teacher; 11
R; 38; 14; Male; BA Sculpture; V (centre); Teacher; 17
IL; 49; 23; Male; BA Painting; V (centre); Moderator; -
A; 42; 18; Female; BA Design; V (centre); Moderator; -

Table 35: Trial 2 sample

The schools and studio art classes were very different and included: (1) two schools

located in suburban regions in the South region of Portugal with students from

middle and working class families and ethnic minorities (Schools P and K); (2) one

school located in the biggest city of Portugal with students from middle and high

social status families (School B); (3) one school from a medium size city in the

Central region with students from middle class families (School V) and, (4) one

school from a small city in the North West region with students from rural and

middle class families (School A).

The volunteers asked their 12th year Studio Art students if they wanted to participate

in the trial; it was explained that the new instrument was demanding and they would

need to submit more work than usual but, at the same time, it would allow them more

freedom and opportunities to reveal their knowledge, understanding and skills in art

and design. With the exception of those at School P, the students were very

motivated to try out the new instrument. One hundred and seventeen students aged

between seventeen and twenty were involved in the trial (seventy-two

females and forty-five males).

7.1.1. Data collection

A research diary was kept in order to collect information about the trial, where notes

about visits to schools and standardisation meetings were recorded and transcriptions

made of informal interviews with the participants. One external observer evaluated

the trial: Leonardo Charréu, a recognised Portuguese expert in art and

design education (see Appendix XXIII). Five group interviews were conducted with

a sample of 31 students between 2 and 26 June 2003 (for details see Chapter 3, p.89).

A questionnaire for students and teachers involved in the trial was also used to

collect data (for details see Chapter 3, p.93). The questionnaires were conducted

during July 2003. Other data was collected through teachers’ reports.

7.1.2. Location, time and programme

The main trial started at the end of the second school term in 2003, before the Easter

holidays, and finished by the end of June 2003. The schedule of activities comprised

standardisation meetings, implementation of the trial with students; researcher visits

to schools and interviews with teachers and students; internal marking of students’

portfolios; moderation of a sample of portfolios in each school by one or two

moderators; and on-line meetings with the volunteers (moderators and teachers).

The schedule of activities was developed as follows:

March/April 2003
Teacher and moderator activities: standardisation meetings: (1) 7 March 2003 (School K); (2) 10 March 2003 (School V).
Teacher and student activities: reading the assessment booklets in class; explaining the instructions; aided work: start of preparatory work (investigative studies and first developmental studies).

May 2003
Teacher and moderator activities: on-line meetings (web page); researcher visits to schools for observation and for student and teacher interviews in art classrooms.
Teacher and student activities: unaided work: continuation of developmental studies, final products and self-evaluation reports; compiling the portfolio; student questionnaires; interviews.

June/July 2003
Teacher and moderator activities: on-line meetings (web page); post-trial meeting (18 July at School V for final evaluation and marking samples).
Teacher and student activities: internal marking (10-15 June); moderation (16-30 June); teacher questionnaires and reports; interviews.

Table 36: Trial 2 schedule

7.2. Description of the activities

The external observer characterised the trial activities as ‘… a critical and innovative

dialogue about assessment, where art assessment problems were discussed from

practice and solutions were tested in order to achieve a model of assessment adapted

to the nature of art with more validity and reliability’ (Trial 2, external observer

report, 30-07-2003, Appendix XXIII).

7.2.1. Standardisation meetings

Two standardisation meetings were held. The first meeting with teachers took place

on the 7th March 2003, in the ceramic room at School K, from 9.00h to 19.00h, with
AM, E, I, J, M, IL and A. The second meeting took place on the 10th March 2003,

in the studio art room at School V, from 9.00h to 19.00h, with teachers AP, R and C, and

moderators IL and A. The agenda for the meetings followed a common format:

• Part 1: Explanation of the instrument and procedures, discussion of


instructions, criteria, assessment matrix and grade descriptors. Detailed
explanation of the internal standardisation and moderation procedures and
forms to be used for marking.
• Part 2: Familiarisation exercises using 15 portfolios developed by students
during the pilot experience.
• Part 3: Blind-Marking: volunteers were asked to mark 10 other portfolios
developed by students during the pilot experience.
• Part 4: General discussion about standardisation; feedback from blind-
marking; and practical questions such as dates for the researcher visits to
schools, moderation visits, online meetings, and deadlines for sending
questionnaires, marking forms and the final reports.

The volunteers received a copy of the assessment booklet that resulted from
Trial 1 (Appendix XIV) two weeks before the standardisation

meetings. At the School K meeting the booklet was new for all the teachers involved,

since it was the first time they had participated in the experiment. However their

responses to question 15 in the questionnaire later showed that all of the teachers
found it easy to understand, and that they agreed with the statement 'The
instructions for the examination were clearly stated'. AM, I and M said that they

adopted similar strategies in their teaching. J said that, although he was motivated by

the trial and agreed with the rationale, he expected some resistance from his students

because they were not very motivated and were unfamiliar with the tasks. Since the

moderators had participated in the pilot they described the pilot experience and

showed the fifteen marked portfolios, explaining how the criteria had been applied.

The group spent two hours looking at the marked portfolios asking questions about

specific issues related to the type of evidence and how it was marked. After the

familiarisation exercise, the volunteers were asked to mark individually the ten

portfolios displayed for the blind-marking exercises. Afterwards the group discussed

the marks.

At the meeting in School V the standardisation was easier to conduct because all the

teachers involved had participated in the pilot. The first and second parts of the

agenda were covered very quickly. The volunteers described other portfolios they

had developed with students during the second term (January-March) and showed

student work. Since they had already marked the exemplar portfolios during the pilot,
the group decided to use the new ones for the familiarisation exercises.

The results of the blind-marking exercises in both meetings revealed that the

volunteers were reasonably consistent in their marking, with a difference of only one
to four points between markers (see Table 37).

J IL AP AM C M E I A R diff
Portfolio 1 6 5 6 6 6 6 6 6 6 5 1
Portfolio 2 10 12 11 12 12 12 13 13 12 12 3
Portfolio 3 8 8 8 7 8 7 6 8 8 7 2
Portfolio 4 9 9 8 12 9 9 10 9 8 10 4
Portfolio 5 18 18 17 19 17 16 17 18 17 19 3
Portfolio 6 11 11 10 11 10 10 9 10 12 11 3
Portfolio 7 10 10 10 11 10 10 10 9 10 11 2
Portfolio 8 20 20 19 20 19 18 19 20 18 20 2
Portfolio 9 16 17 16 17 16 16 16 17 16 17 1
Portfolio 10 8 8 7 8 7 7 7 7 7 9 3
Table 37: Trial 2: standardisation blind marking results.
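The 'diff' column records the spread between the highest and lowest mark awarded to each portfolio. As an illustration of that computation, the following sketch reproduces it for the first five portfolios in Table 37 (marks copied from the table; 0-20 is the Portuguese marking scale):

    # Marks awarded by the ten markers (J, IL, AP, AM, C, M, E, I, A, R)
    # to the first five blind-marked portfolios in Table 37 (0-20 scale).
    marks = {
        "Portfolio 1": [6, 5, 6, 6, 6, 6, 6, 6, 6, 5],
        "Portfolio 2": [10, 12, 11, 12, 12, 12, 13, 13, 12, 12],
        "Portfolio 3": [8, 8, 8, 7, 8, 7, 6, 8, 8, 7],
        "Portfolio 4": [9, 9, 8, 12, 9, 9, 10, 9, 8, 10],
        "Portfolio 5": [18, 18, 17, 19, 17, 16, 17, 18, 17, 19],
    }

    for portfolio, scores in marks.items():
        spread = max(scores) - min(scores)   # inter-rater difference for this portfolio
        print(portfolio, spread)             # spreads of 1, 3, 2, 4, 3, as in the table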

7.2.2. Web page and on-line meetings

Under the designation ‘Art assessment’ a web page was constructed on the Internet

(www.prof2000.pt), on a site administered by a consortium of schools dedicated to

distance in-service teacher training. The contents of the site were: (1) explanation of

the rationale and conceptual framework; (2) explanation of the portfolio and

evidence required for assessment submission; (3) instructions for teachers; (4)

instructions for students; (5) digital reproductions of marked portfolios; (6) teachers’

and moderators’ forum; (7) students’ forum; (8) on-line chat programme (#af31

channel at IRC). The web page, forums and on-line channel aimed to provide a space

for continuous dialogue between the users of the new instrument. On-line meetings

were conducted with some of the volunteers and the researcher on 5/5/03, 21/5/03,
16/6/03 and 25/6/03, between 18.30h and 21.30h. However, they were not as useful as

expected because of a lack of teacher and student participation. Only four students

sent messages to the student forum asking questions about the appropriateness of

their project briefs. According to the external observer the on-line debates ‘were

interesting and teachers participated in the majority of the dialogue'. However,

teachers’ and moderators’ participation in the forum and on-line meetings, although

important, was very limited. According to his report: 'It seems obvious [in this case]

that inter-personal communication via IRC does not have the same qualities as

physical communication, however it presents advantages because it allows contacts

between people from different geographical locations, furthermore it allows people

to think about their replies allowing a richer dialogue between the participants’

(Trial 2, external observer report, 30-07-2003, Appendix XXIII).

7.2.3. Implementing the portfolio with students

The portfolio assessment was administered during the last term of the year.

The teachers explained the portfolio requirements to the class, the types of evidence

to submit and the assessment criteria. I, E, M, AP, J and C convinced their students to

opt for a project brief and media that could be easily realised taking into account

available resources and class time. R and AM gave their students complete freedom

of choice. However a significant number of students chose the proposed theme:

‘Excluded’ or themes related to it (for details see Appendix XIV, pp. 190-192). A

great diversity of media was evident in R's art class (multimedia; web pages;

product design; fashion design; installation; video; photography, painting, sculpture,

technical drawing). Different media were also chosen by some students at Schools K

(for details see Appendix XX) and B (installation; illustration; advertising; painting;

performance; video; sculpture and photography), for details see Appendix XIX. At

School A all the students chose themes related to the theme given by the teacher,

'Human figure', and developed expressive drawings and prints (for details see

Appendix XXII). AP and C suggested the theme ‘Moods’ to their students who

submitted drawings and paintings (for details see Appendix XXI).

7.2.3.1. School P (Appendix XVIII)

The art class at School P only had five students and the teacher viewed this as

negative. The teacher described the school location (J interview, 2 June 2003),

as ‘a dormitory town’, and the students’ economic background were ‘very poor.’

He characterised their cultural interests as ‘shopping halls and sporting events’.

The school did not provide students with access to any other kinds of cultural

activities; the school environment lacked stimulus. As J pointed out ‘in the school

nothing really happens’. In his view students’ poor attitudes and lack of motivation

were a consequence of negative educational practices:

...they have a passive attitude towards art disciplines;


because they are used and were trained as passive
objects to make pre-determined tasks
(J report, 6 June 2003).

All the students rejected the new instrument because they did not understand it as

related to art and design specialisms:

I didn’t like the portfolio; Studio art is for drawing and experimenting
with new techniques, not for investigative studies (SR Sergio, 2nd June
2003).

All the P students expressed the same opinions in section 3 of the questionnaire.
They were unfamiliar with the new requirements; their previous learning in art was
underpinned by a formalist approach, and because of that they could not perform
the majority of tasks.

[During the course] It was not the same thing; the preparation/search was
just finding an image and after we did the work using the image; for
example an impressionist or abstract work. We are used to making
drawings (MTEP; Studio Art) to experiment with specific techniques the
teacher ask us to experiment… The work is the same for everyone; for
example a face and then we work the form; light; colour, etc (SR Sergio,
2nd June 2003).

According to the teacher, the educational practices in the school followed different

underlying approaches and the new assessment instrument disadvantaged students

because they had not been prepared to develop independent thinking and critical

skills at school:

Practice in the arts has centered on knowledge of


content and technical skills without helping students to
understand and to be able to justify process. And this
was the main cause of the poor student results in the
portfolio. Because in the instructions there was no
detailed task prescription; students felt that anything
could be done and lost direction without any
prescription. They were unable to be creative (J report,
6 June 2003).

Students found the new instrument difficult and inappropriate for them:

Students got lost; they couldn’t define a problem.


They did not know how to create their own ideas; how

to experiment. They limited themselves because they
were afraid to experiment or explore (TR J interview,
2nd June 2003)

We didn’t know what to do; we felt lost because we


didn’t know what kind of final work we should make
(SR Sandra, 2nd June 2003).

I didn’t know what to do; I had no concrete ideas to


make the developmental studies (SR Raquel, 2nd June
2003).

Overall students considered the new instrument too time-consuming and too broad.

For these students the instructions were not detailed enough and the criteria focused

on knowledge, understanding and skills in art and design which they considered

unimportant; they wanted to be assessed on short and prescriptive tasks and not have

to make decisions for themselves, or carry out investigative studies or critical

reviews. Unfamiliarity with such tasks and a lack of previous training in critical

thinking were the main reasons why this trial failed in that school. These students

understood studio art merely as technical skills. In the two months available for the

trial, the teacher was not able to modify students’ habits and foster the development of

independent thinking, critical skills and decision-making. While it was probably an

error to use the new instrument with these students, the teacher did not use the

resulting marks for his internal assessment, so participating in the trial did not

disadvantage the students.

7.2.3.2 School B (Appendix XIX)

School B was a contrast. Located in the centre of Lisbon, it was a ‘top’ school.

According to the teacher, the students’ social and cultural background was rich: they came from elite basic education schools and their parents were involved in school activities. They frequently visited exhibitions, concerts, and theatre and dance

festivals.

My school has very good resources; students are


highly motivated and competitive. Students’ socio-
economic and cultural background is high. The school
life is dynamic promoting various activities and
cultural events (TR AM, 2nd June 2003).

According to the teacher they were also very competitive and motivated.

The researcher observed that the relationship between the teacher and students

was good and built on mutual respect:

There is a friendly relationship teacher/student; a student invites the


teacher to go to a concert; another student brought texts and magazines for
the teacher about graffiti; she says: ‘Teacher; if you want to know more
about the subject there is some material’ The teacher says: ‘Thank you
Sara; I am learning a lot from your projects’. She is also helping students
with additional technical support; one of the students is doing a video
about gipsies, she asked the student: ‘Telma did you contact the video
teacher I told you about? Telma answers: ‘Yes; he was really kind to me;
we fixed the time to mix the clips in the video; next week I will do it
because the lab is available’. Another student says: ‘Teacher; you know the film you recommended to me about the hippies, the Jesus Christ Superstar; I saw it during the weekend with Telma and João; but I think
the Woodstock images are much more helpful to use in my portfolio’
(research diary, Lisboa, 2nd June 2003).

AM had no problems applying the new instrument with the students because she

already assessed students in a similar way and fostered the kind of knowledge,

understanding and skills required by the new assessment instrument in her classes.

So, the tasks were not unfamiliar and students had the necessary training to perform

within the constraints and opportunities it presented. Their previous experience

seemed to be a key factor for the success of the new assessment instrument in this

case.

In general, students considered that the new assessment instrument was valid and

they enjoyed the relative freedom of choice:

The portfolio was good essentially because we had to make different


works; it was not prescriptive; we had a theme and we had to develop
the work… From our heads; not like ‘go and draw, draw a bench with a
monkey’ … I did a portfolio with things I like to do and showing what
I wanted to show (SR Telma, 2nd June 2003)

The portfolio demands more creativity; it is much more enjoyable and


motivating for us; but it is also more difficult because of that (SR Sara, 2nd June
2003)

The success of the new instrument at School B was related to its characteristics, the previous learning of the students and, especially, the teacher’s strategies:

The teacher also helped us; I mean psychologically, by motivating us,


giving us confidence in our work (SR Ana, 2nd June 2003).

The teacher was always telling us to work autonomously and to think


independently (SR Susana, 2nd June 2003).

Autonomy; we learned it little by little during the course (SR Diana, 2nd
June 2003).

However, other important factors were the students’ attitudes and behaviours; their past working practices and their ability to learn independently and collaboratively facilitated their performance.

We learn with what the teacher gives us, but we also learn by ourselves
(SR Ana, 2nd June 2003)

It was very important to have ‘crits’; the opinions of other students is


very important …Yes, it helps us to improve …We had a lot of help
from our colleagues; and it helped us to think better (SR Sara, 2nd June
2003).

7.2.3.3. School A (Appendix XXII)

School A was a small school in a small town in the north-east region of Portugal, far away from the big coastal cities. The town had a museum and a library but few cultural events. The school lacked appropriate physical resources for the arts, and the teacher said that he tried to compensate for the lack of a library and of cultural activities by using his own books and promoting school trips to exhibitions and museums.

He described the students’ context as follows:

The school receives students from the region. We have students from
different socio-economical backgrounds. The students are very
motivated but general standards in all disciplines are low. Students are
used to a school system based on memorisation and they often have
difficulties in independent thinking skills; this is not only my opinion,
the same opinion was expressed by the Philosophy teacher at the last
teachers’ meeting (M’s report, 27 June 2003).

Implementing the new assessment raised several difficulties at this school:

The main difficulties in implementing the portfolio with my art class


were related to the rationale of the portfolio; it was very difficult to
require students to think independently because they are used to
following detailed prescriptions for each task (M’s report, 27 June
2003).

However the teacher ultimately implemented the portfolio successfully and there was evidence of good quality in the students’ outcomes. This was probably because the teacher had gradually prepared his students to develop independent and critical thinking during the final years of the course, so they were able to perform confidently in the majority of the assessment tasks. Moreover, M knew that his students had difficulties in decision-making and, to overcome this, administered the new assessment instrument using a relatively prescriptive approach: choosing two themes for the project briefs (‘Human Figure’ and ‘Self-portrait’), providing books for investigative studies and carrying out human figure drawing exercises during the art classes. These students received a great deal of support from

their teacher; consequently their portfolios were not very diverse in terms of media

and experimentation. However they revealed considerable understanding of ways of

thinking and developing ideas in their notes and work journals.

By using such strategies M avoided failures like the one at School P. Nevertheless this trial showed that it was possible to use the new assessment instrument with students who were unfamiliar with the tasks if the teacher adopted a directive and supportive approach during the preparation stage of the portfolios. However some students still commented negatively on the lack of preparation. Mafalda wrote in her questionnaire:

This kind of examination is fair, but I think that students should have
better training for this type of exam. We should be taught for example
how to make investigative studies according to our themes; I think we
also should experiment with more techniques and media in order to
improve our technical skills and ability to evaluate our problems (SR
Mafalda, June 2003)

The experience was not wholly positive because problems of time and lack of

familiarity with tasks disadvantaged some students. According to M’s report:

Overall the portfolio was a good experience for me and for the
students. But I think that time should be more flexible. The main
advantage of the portfolio is fostering students’ sense of
responsibility, it has a very good structure but it should be developed
during the year. Students should be prepared for portfolio tasks during
the first year of the course. At this moment, I think it is very difficult
for students, at least for the average, because they are not prepared for
this kind of task (M’s report, 27 June 2003).

7.2.3.4. School K (Appendix XX)

School K was located in a suburban region of Lisbon, well known for its baroque and neo-classical monuments. The school lacked physical and cultural resources, but during the year the staff promoted several cultural activities. According to information given by the school’s administrative service, the students were from different socio-economic backgrounds, coming from both middle-class and low-income families. There was a small minority of Black and Indian students but the majority were White.

According to the teachers the students were highly motivated: ‘They work hard to get good grades; and they work hard in extra-curricular activities’ (TR I, 9 June 2003); ‘They like to do things’ (TR I, 9 June 2003); ‘They are looking forward to making things’ (TR E, 9 June 2003).

The implementation of the new instrument in this school revealed surprising results.

The two teachers, I and E, had different conceptions about art education and different

relationships with their students. E could be viewed as a prescriptive teacher, while I promoted independent work and group discussion in her classes. However, during the portfolio implementation the two teachers adopted a similarly unobtrusive role: after they had given the instructions to the students and provided some reference books related to the starting points for the model project briefs, they left the students alone.

I didn’t know about the kind of preparation; what kind of suggestions I


was allowed to make. So I left the students more or less alone after I
explained the portfolio instructions to them. Anyway students didn’t
ask me many things; they seemed to be very confident in what they
were doing (TR E, 26 June 2003).

Students were quite excited about the new instrument; they seemed to understand the

tasks as authentic and liked the more personal way of working in the arts.

I think it was well done; well thought. I liked this work because it is
related to my motivation. I am not very good in examinations. Usually
I had bad results in tests; I know I can do things; but I am not able to
demonstrate it in test or examination conditions. The portfolio was
good because we were assessed when we were working under normal
conditions (SR Inês, 9 June 2003).

With our teachers, before the portfolio, we were given very


prescriptive tasks; when it was to make a drawing, we made a drawing;
when they asked us to paint, we painted. In the portfolio we could do a
lot of other things, we could choose. We had to think what to do…. It
was important to know what we wanted to do (SR Júlia, 9 June 2003).

The portfolio was more real than the usual work and tests we do in the
art classes; it is about our life; not school life or tasks that school thinks
are important for us (SR Ruben, 9 June 2003).

Students seemed very committed to developing their portfolios, and it could be said that this instrument corresponded to their expectations of ‘real’ art and design:

I liked the portfolio experience. It was related to the course. And it


was important for me because…. in the portfolio we were aware of
what is necessary; we know what we are required to do, to make it
well. It makes us think; more than usual. And it was also important to
know that the portfolio would be assessed by other teachers (SR
Isabel, 9 June 2003).

It was a surprise for the teachers; although they knew their students were already motivated, they found that the portfolio motivated them even more:

They were so involved in the portfolio; we could say that they were
engaging themselves in a very unusual way; it was more than the
normal motivation. I could tell that they were living it intensively (TR
E, 26 June 2003).

This motivation also had consequences for the other subjects:

They worked hard; and the other teachers complained that students
were more interested in Studio Art and were not paying the same
attention to other disciplines’ homework (TR I, 26 June 2003).

According to the teachers, students were not usually required to do much homework for Studio Art; the coursework was developed in class time. It seemed that during the portfolio experience students reversed the usual status of the disciplines, devoting more attention and time to Studio Art.

Students at School K were obviously familiar with processes of independent and

critical thinking; the fact that students did not ask for additional support during

preparation time suggested that they were quite comfortable with the portfolio

requirements.

Sometimes they asked me questions, just to confirm if they were


working in the right direction, if the chosen media were all right, how
many developmental studies they should submit. Things like that (TR I,
26 June 2003).

It is possible that the students’ previous experience in Studio Art classes and other

disciplines enabled them to perform well. However, the school’s dynamic tradition of promoting cultural activities and workshops as extra-curricular activities could also explain the students’ independent work habits. Nevertheless the portfolio was not considered to be an easy examination option. For example Tânia wrote in her questionnaire: ‘I had a lot of difficulties especially in organising the work’ (SR Tânia, June 2003).

7.2.3.5. School V (Appendix XXI)

School V was located in a city in the centre of the country. It was a large school but had noticeably limited physical resources.

The school receives students from the surrounding district; it is a


famous school in the region, especially because of the art courses we
have. Some students are from distant places and live at student
residencies during the year. The parents are from very different
economical background, but the great majority are from middle class
families. There are no significant numbers of ethnic minorities at the
school. A very few Black students, the majority are White (TR C, 25
June, 2003)

In this school the arts were valued:

… the arts are recognised and cherished by the entire school


community. There is a strong tradition of good results in the national
art examinations. (TR R, 25 June 2003).

C defined the school as very dynamic, promoting all kinds of extra-curricular

activities open to the entire community. Art students were fortunate because their

teachers organised school visits to sites where students could see contemporary art:

However it is far away from other cities and the art department usually
organises school visits to museums and art festivals at Lisbon; Porto;
Barcelona or Madrid, at least one per year (TR AP, 25 June 2003).

The school had committed teachers and motivated students:

There are many art classes in the general courses and in the
technological courses. There are twelve art teachers in the art
department. The students are highly motivated and the art department
teachers are fine; we work very well together (TR R, 25 June 2003).

I think students like the school; and also the kind of relationship we have
with them; generally they work very hard (TR R, 25 June 2003).

However we don’t reduce the school to this building and the art lessons
are not always in the classrooms. The school is everywhere; in the park,
in the museum, in the streets. We usually have outdoors lessons.
Especially in Studio Art, where students can develop different media, we
try to make connections with other places; for example with factories,
design and craft studios in the town (TR R, 25 June 2003).

The new instrument had been piloted at this school, so the students and teachers were

familiar with it. The three teachers involved in the main trial continued to use

portfolios as before in their internal assessment. Consequently the conditions in this school were quite different from those in the other participating schools. The teachers

involved had adopted the new assessment instrument and, for example, for teacher R

it was not a novel approach to Studio Art:

You know, I have worked like this for several years. What you gave me
with the portfolio experience was a more structured instrument,
because the booklet was very organised and things were systematised
… (TR R, 25 June 2003).

The other two teachers were given time to prepare students for the trial during

the year:

…. since we started the portfolio in September [pilot] students had the


time to be prepared; little by little they learned skills of critical
reflection and self-evaluation. They learn how to plan the different
steps of the work in the time to respond to the deadlines. It was good to
continue to use portfolios after the first experience [pilot] at the
beginning of the year; students acquired so many learning experiences,
they had the opportunity to develop organised ways of thinking and
making; they developed independent skills and little by little they
pushed their own barriers; they learned how to realise their own
intentions and how to make projects that motivated them (TR AP, 25
June 2003).

It was really interesting to see how much students evolved during the
year. In the first portfolio they were not really good at making
investigative studies and developmental studies; but towards the
middle of the year they started to understand how it was good for
them and in the end they made portfolios with very good personal
responses (TR C, 25 June 2003).

However for some students the new assessment instrument was still problematic:

The most difficult thing about the portfolio was to justify our options;
but it also depended on the media. I developed a photography project.
It was easy to explain the techniques and to evaluate the experiments;
but it was less easy to explain the decisions (SR Joaquim, 17 June
2003).

For me the difficulty was in dividing the tasks; planning. Because I left
a lot of things to the last hour… it was difficult to organise the ideas; to
sort them out from the confusion (SR Manuela, 17 June 2003).

7.2.4. Moderation

Moderation of a sample of students’ portfolios was conducted during the period 16-30 June 2003, after the internal marking. The sample of portfolios was selected according to the rules established in the guidance booklet and contained examples from across the range of low, middle and high marks. At Schools B and A, the sample also included four portfolios that the teachers had considered problematic to assess during internal moderation.

A was the first moderator and visited all the schools to moderate the samples.

Her reports confirmed that the marks awarded were generally in line with the

standards agreed during the standardisation meetings. In some cases there was a

difference of one or two points between her own marks and the teachers’ marks.

After the moderation A discussed the results with the teachers explaining why she

had found their judgements too harsh or too lenient. In the majority of cases the

teachers had been too harsh, perhaps because they had been influenced negatively by

students’ behaviour and attitudes throughout the whole year. After the moderation

teachers revised the marks in line with A’s suggestions. In two schools A found portfolios that had been severely under-marked (School K: Joana’s portfolio, see Appendix XX, p.270; and School P: Luis’ portfolio, see Appendix XVIII, p.248).

The second moderator, IL, visited Schools P and K. IL agreed with A’s

recommended awards and informed the teachers J (School P, see Appendix XVIII,

p.249) and E (School K, see Appendix XX, p.273). They appeared to understand why they had under-marked the portfolios and agreed with IL’s and A’s marks.

Joana’s and Luis’ marks were finally agreed and IL moderated all students’

portfolios in order to see if there were similar cases, but none were found.

Luis (School P)
  Teacher’s marks (AC1+AC2+AC3+AC4+AC5): 3+1+3+10+1 = 18; final mark: 2
  1st moderator’s marks: 10+5+20+20+5 = 60; final mark: 6
  2nd moderator’s marks: 10+5+20+20+5 = 60; final mark: 6

Joana (School K)
  Teacher’s marks (AC1+AC2+AC3+AC4+AC5): 26+21+28+28+12 = 105; final mark: 11
  1st moderator’s marks: 30+23+48+40+17 = 158; final mark: 15
  2nd moderator’s marks: 26+28+50+40+15 = 159; final mark: 15

Figure 14: Differences between marks in Joana’s and Luis’ portfolios
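To make the scale of these discrepancies concrete, the following is a minimal sketch of the kind of comparison the moderation involved, using the criterion marks from Figure 14. The 2-point tolerance, the data structure and the variable names are assumptions for illustration only; they are not part of the trial’s actual procedures.

```python
# Hypothetical sketch of a moderation check: flag portfolios whose
# criterion marks (AC1-AC5) diverge between teacher and moderator.
# The tolerance value is an assumption, not taken from the thesis.
teacher_marks = {"Luis": [3, 1, 3, 10, 1], "Joana": [26, 21, 28, 28, 12]}
moderator_marks = {"Luis": [10, 5, 20, 20, 5], "Joana": [30, 23, 48, 40, 17]}

TOLERANCE = 2  # assumed acceptable per-criterion difference, in points

for student, t_marks in teacher_marks.items():
    m_marks = moderator_marks[student]
    gaps = [m - t for t, m in zip(t_marks, m_marks)]
    if any(abs(gap) > TOLERANCE for gap in gaps):
        print(f"{student}: review needed; criterion gaps {gaps}, "
              f"totals {sum(t_marks)} vs {sum(m_marks)}")
```

Run on the Figure 14 data, both portfolios would be flagged, mirroring the two cases that A and IL identified as severely under-marked.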

Overall the results of the moderation procedure showed consistency in marking. Teachers and moderators shared a common interpretation of the criteria, but the small number of students and teachers involved in the trial allows only a tentative conclusion about the possible efficiency of the moderation procedures in other

contexts. The participating teachers viewed moderation very positively as a means to

improve assessment and also as a way to enhance confidence in their own

judgements.

I was very happy to see that the marks I awarded to the portfolios are
recognised by the moderator; it gives me confidence in my assessment
skills, and besides I feel less alone in my profession because of the
opportunity to establish dialogue with other art teachers outside my
school (AM’s report, 18 June, 2003).

What was also good in the portfolio assessment was the external
assessment. I think that it was very good to have my marks confirmed
by the moderator. So she could reassure me and my students that we
were using the established standards (M’s report, 27 June 2003).

7.3. Feedback from Trial 2

7.3.1. Underlying concept

An analysis of the questionnaire responses showed that the new instrument was compatible with the teachers’ Studio Art curricula. Seven teachers strongly agreed and three teachers agreed with the statement ‘The examination respects my own opinions about art and design learning’ (question 65). Six teachers strongly agreed and four teachers agreed with ‘The examination respects my teaching practice’ (question 66).

7.3.2. Preparation

Question  Questionnaire results: % of agreement with:  Students (117)
54  To be well prepared to conduct the examination.  73%
55  It was important to discuss the contents of the portfolios with the teachers.  96.5%
56  It was important to have regular group criticism sessions.  93.8%
57  The contents of the portfolio were regularly discussed with teachers.  84.3%
58  It was important to talk about and regularly explain the work in their portfolios with the teacher during preparation time.  94.8%
59  The teacher was very interested to hear students’ views about the work in their portfolios.  91.4%
Table 38: Responses to questions about preparation.

An analysis of the questionnaire responses showed that most students considered they

were well prepared to take the examination. Collaborative working was one characteristic

of the new assessment instrument, and the researcher observed constant dialogue between teachers and students in the form of advice or motivation. Although 51.3

percent of the students thought that their teachers influenced their work, only four

teachers recognised this (questionnaire trial, question 59). This influence was mainly in

the form of verbal advice and helping students to develop a project brief appropriate for

the time and resources available. There was interaction between students, and ‘crits’ were also considered useful in the development of the portfolios.
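The agreement percentages reported in this and the following tables appear to be simple proportions of respondents endorsing each statement. The following is a minimal sketch of how such figures can be tabulated, assuming hypothetical Likert-coded responses; the 1-4 coding, the data values and the variable names are illustrative assumptions, not the thesis’s actual data.

```python
# Hypothetical tabulation of percentage agreement for questionnaire items.
# Responses are assumed coded 1 = strongly disagree ... 4 = strongly agree.
responses = {
    "q55_discuss_contents": [4, 4, 3, 4, 3, 4, 4, 3, 4, 4],  # illustrative only
    "q56_group_crits":      [3, 4, 4, 2, 4, 3, 4, 4, 3, 4],
}

for question, values in responses.items():
    agreeing = sum(1 for v in values if v >= 3)  # 'agree' or 'strongly agree'
    print(f"{question}: {100 * agreeing / len(values):.1f}% agreement "
          f"(n = {len(values)})")
```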

7.3.3. Format

The students considered the assessment instrument very flexible.

Question  Questionnaire results: % of agreement with:  Students (117)
60  Freedom of choice to develop project briefs that related to interests and motivations.  87.2%
61  The teacher influenced students in their choice of works and approach.  51.3%
62  Useful to have a model project brief.  83.3%
63  Important to have the opportunity to design a class project brief.  81.7%
64  Important to have the opportunity to design an individual project brief.  96.6%
Table 39: Responses to questions about format.

Both students and teachers considered the assessment instrument compatible with art and design practice, related to the contents of the specialism, and capable of allowing for students’ personal interests. According to the interviews:

The portfolio was real work; I mean because it was like the work by
artists and designers (SR Carla, 17 June 2003).

It was related to the course (SR Isabel, 9 June 2003).

In the end it [the portfolio] was also a way to explore ourselves (SR
Sandra, 2nd June 2003).

Question  Questionnaire results: % of agreement with:  Students (117)
1  Students are given a good opportunity in the examination to show well what they know, understand and can do in and about art.  96.6%
2  The examination respects diversity of learning styles.  94.9%
3  The exam respects students’ different motivations and intentions.  88%
4  The exam respects students’ different opinions.  89.7%
5  The examination allowed students to display the knowledge, understanding and skills set out in the syllabuses.  91.5%
10  The examination makes it possible to distinguish between very good students and just good students.  87.9%
Table 40: Responses to questions about content.

7.3.4. Instructions
All the teachers and the majority of students found the instructions clear.

Question  Questionnaire results: % of agreement with:  Students (117)
15  The instructions for the examinations are clearly stated.  92.3%
16  The instructions clearly stated what students are required to submit as preparatory work.  88.9%
17  The instructions clearly stated what students are required to submit as developmental studies.  88.9%
18  The instructions clearly stated what students are required to submit as self-assessment work.  91.5%
19  The instructions clearly stated what students are required to submit as final product.  94.8%
Table 41: Responses about instructions.

In the words of António: ‘The instructions for students were clear and helped us to organise the sequence of tasks’ (School V, SR António, 17 June 2003).

In general the instructions for teachers and students were considered sufficiently

clear, accessible and useful for students and teachers.

7.3.5. Time

During the pilot, 78.4 percent of students had commented that the designated time for

portfolio development was not adequate. However, the opinion of students who collaborated in both the pilot and the trial changed for the better. This supports the suggestion made in the previous chapter that the perceived lack of time was related to the degree of familiarity with the tasks.

In the questionnaire response to the trial, all the teachers and 70.9 percent of

students agreed with the statement ‘students have plenty of time to complete

the examination’. However, time was a controversial and problematic issue.

Scheduling strict periods for developmental studies and for realising final products

might not be a good idea because some kinds of work take more time than others.

The problem of time is very relative; especially the time for production. It depends

on the kind of final work we develop. Some students made the final products very

quickly like Inês. She painted the canvas in two hours; but for me and others it took

lots of time to make the final product. We could have chosen quick final products but

it was our work; we wanted to make our best performance (School K, SR Isabel, 9

June 2003).

On the other hand more flexibility with time might introduce an element of bias

because students with more free time might be advantaged. Some students did not want to spend extra time on their portfolio, so they chose project briefs that could fit the available time. However, other students felt that it was not fair to narrow their options because of time limits; reducing time would prevent them from fully developing their work as they wished.

The question of time was discussed at length during the teachers’ on-line meetings

without consensus. For example teacher E wrote: ‘All the students must have the same period of time for the portfolio; it is their problem if they cannot adapt their project briefs to their available time’ (on-line meeting, 21 May 2003); teacher R argued: ‘We need to allow students to work on their final products after school, otherwise they will not have time to complete it; we cannot ask students to narrow their projects just because of limits of time’ (on-line meeting, 21 May 2003); and teacher C said: ‘It is not fair if some students have more time than others; all students should have the same time for completion’ (on-line meeting, 21 May 2003).

During the main trial, teachers treated the question of time differently. However, students who spent more time on their final products were not advantaged in this assessment because the teachers and moderators marked the works taking time into account. As well as including the written time frame for the portfolio’s development, students recorded starting and finishing dates on all the work in their portfolios.

In the end a flexible timeframe was agreed, depending on the school context and students’ intentions. Although this might be seen as not allowing strict uniformity of application, the ‘set’ requirement was the same for everyone. However, students were free to continue their work beyond the examination time limits provided this was recorded, and this seemed to improve the authenticity of the assessment instrument.

7.3.6. School resources

Although in the researcher’s view only School B had suitable conditions for the application of the assessment instrument, according to the questionnaire results 66.7 percent of the students said that the physical conditions (accommodation, equipment

and materials) were adequate for the examination. Teachers AM and R allowed their

students to develop final products outside the classroom when they needed to use

specific technical equipment, for example video or digital labs.

At a certain point we discussed; if we should narrow students’ choices


in terms of media and techniques because the school did not have
resources for everything which could be developed by students. But in
the end we decide that it was absurd to narrow students experimenting
and exploration just because of the school conditions (School V, TR
AP, 25 June 2003).

The other teachers advised students to work with the media and equipment that were available in the classroom. However, advising students to work this way narrowed students’ freedom of choice, as teacher R pointed out:

Of course it was easier to reduce students choices in terms of media


and scale; but it wouldn’t be fair for them. Because what motivates
them is the freedom of choice; the possibility to go further, to explore
the impossible (School V, TR R, 25 June 2003).

The problem of inadequate physical conditions was more or less overcome in the five schools by advising students to choose ‘realistic projects’ or by providing additional resources. However, access to adequate resources was identified as a possible cause of bias, especially access to books, museums and exhibitions. In the main trial, the majority of teachers brought their own books into the class, organised visits to museums and exhibitions, and helped students to find specific tools and equipment for their projects in and out of school.

7.3.7. Tasks
The results of the questions about tasks in the questionnaire showed that the examination components Process, Product and Self-assessment were considered by all teachers and by 90.5 percent of the students to be appropriate for assessing art and design. The required assessment tasks were not familiar to 68.7 percent of students, but all the teachers and a great majority of students considered the tasks appropriate for art and design assessment, as seen in the following student responses:

Question  Questionnaire results: % of agreement with:  Students (117)
21  Recording ideas, intentions and motivations is an appropriate task to assess thought processes.  96.6%
22  Investigative work is an appropriate task to assess knowledge and understanding in art and design.  93.1%
23  Developmental studies are appropriate tasks to assess knowledge, understanding and skills in art and design.  93.2%
24  The final product is an appropriate task to assess skills in art and design.  88.9%
25  Self-assessment notes or reports are appropriate tasks to assess evaluative skills in art and design.  82.2%
26  Work journals/annotated sketchbooks are an appropriate task to assess thinking skills in art and design.  89.7%
27  Annotations about the intentions, quality and progress of the work in the portfolio are an appropriate task to assess thought processes.  89.7%
Table 42: Responses to questions about tasks.

In the pilot it was hypothesised that self-assessment tasks were perhaps more attractive to girls than to boys (Chapter 6, p.232). In order to see if this could be verified in the main trial, male and female students’ responses to the questions about the appropriateness of the tasks were compared. Small but significant differences were found between male and female students’ responses to these questions (21-27). Table 43 shows the results of the t-test obtained with the gender variable and variables 21 (record ideas), 22 (investigative work), 23 (developmental studies), 24 (final product), 25 (self-assessment), 26 (work journals) and 27 (annotations).

One-Sample Test (Test Value = 0)

Variable            t       df   Sig. (2-tailed)  Mean Difference  95% CI Lower  95% CI Upper
sex                 30.653  116  .000             1.38             1.30          1.47
record ideas        27.609  116  .000             .9316            .8648         .9985
investigative work  18.242  115  .000             .8621            .7685         .9557
devlpmt studies     18.419  116  .000             .8632            .7704         .9561
final product       13.328  116  .000             .7778            .6622         .8934
selfassessment       9.414  116  .000             .6581            .5197         .7966
workjournals        14.109  116  .000             .7949            .6833         .9065
annotations         14.109  116  .000             .7949            .6833         .9065

Table 43: Questionnaire trial 2: one-sample t-test: sex/tasks
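A one-sample t-test of this kind can be reproduced with standard statistical software. The following is a minimal sketch, assuming responses coded numerically; the illustrative data, the 0/1 coding and the use of scipy are assumptions made here for illustration, mirroring the form of the test reported in Table 43.

```python
# Minimal sketch of a one-sample t-test against a test value of 0,
# as in Table 43. The data below are illustrative only.
import numpy as np
from scipy import stats

# Hypothetical agreement scores for question 21 ('record ideas'),
# coded 1 = agree, 0 = disagree, one value per student.
record_ideas = np.array([1, 1, 0, 1, 1, 1, 1, 0, 1, 1])

t_stat, p_value = stats.ttest_1samp(record_ideas, popmean=0)
mean_diff = record_ideas.mean()  # mean difference from the test value 0
df = len(record_ideas) - 1

# 95% confidence interval for the mean difference
ci_low, ci_high = stats.t.interval(0.95, df, loc=mean_diff,
                                   scale=stats.sem(record_ideas))
print(f"t = {t_stat:.3f}, df = {df}, p = {p_value:.3f}, "
      f"mean diff = {mean_diff:.4f}, 95% CI = ({ci_low:.4f}, {ci_high:.4f})")
```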

In talking about the assessment tasks, portfolios were viewed by both students and teachers as a complete process of inquiry and making in art. The perception was that capacities such as independent study and student autonomy improved, as Júlia and Carla made clear:

The portfolio helped us to think alone… I think that in the portfolio


people have time to explore and experiment without rushing towards the
final product… (School K, SR Júlia, 9 June 2003).

I think that with this last portfolio people can see what we learned during
the three years of the course; it was a final work. I can see how much I
evolved since the 10th year; but this year was more important; because
we had to think alone (School V, SR Carla, 17 June 2003).

Investigative studies were also seen as an important part of the portfolio.

The search part was also very helpful.... because it gives us things that
others have done and we can use it as starting points, I mean
interpreting it and going further according to our own personality
(School K, student Inês, 9 June 2003).

While students appeared to enjoy the opportunity to include process and

product in their portfolios, they firmly stated the importance of final

products:

I think that the portfolio is not complete without the real final product.
We need to finish the final product (School K, SR David, 9 June 2003).
The final product was the culminating point…(School B, SR Sandra, 2nd
June 2003).

But students also acknowledged the importance of the developmental

studies, work journals and annotations – the opportunity to include

evidence of process.

I am not very happy with the final product; because just seeing it
people will not understand what capoeira is [a Brazilian war-dance]
but, if they look at the complete portfolio it will be easier; I can tell
them what capoeira is …the final product is just a small part of my
work (School B, SR Susana, 2nd June 2003).

In my case; it is hard for people to understand what I want to say in the


final product; the final product is not enough to explain or show what I
can do in the arts; but with the process I can explain this to people
(School B, SR Inês A., 9 June 2003).

I think that for some students it was more than just a school work to be
assessed, it is about them … I think the portfolio is a whole, combining
process and products; a person has to see it all, to read everything and
not just see the final products, to fully understand our achievement
(School V, SR Carla, 17 June 2003).

There was a sense of great achievement to be derived from the portfolio. Students

felt motivated to learn.

With this last portfolio I could develop a product design project. My


project was to make a bicycle for disabled people; people with one
arm. I enjoyed the investigative studies. I enjoyed thinking about the
small problems, to see how to solve them in the sketches. But what
really satisfied me was the final product. I could make just a small
maquette; I will not have a better mark because I made the real
bicycle and tested the bicycle performance just like a real product

designer. But it was my personal pride. I needed to go until the end, to
see that I was capable of doing it (School V, SR Tiago, 17 June 2003).

7.3.8. Criteria and weightings

The data from the on-line meetings and questionnaire responses made it clear that teachers considered the criteria clear and well balanced, and that the criteria provided them with the necessary common ground to assess students’ work consistently.

According to teacher AP:

For us the portfolio experience was really good, because it made things
easier. I mean all the decisions about assessment and marking; the
criteria were clear and easy to apply to each portfolio whatever media
or techniques the students were using (School V, TR AP, 25 June
2003).

For the students the criteria were also appropriate, as can be seen from their

responses to the questionnaire:

Question  Questionnaire results: % of agreement with:  Students (117)
7  The weightings attributed to the components are appropriate for assessing the disciplines of art and design.  78.6%
28  The criteria for marking are clear to all candidates.  85.5%
29  Students are aware of what qualities their art and design teacher and external assessors are looking for in their portfolios.  77.8%
30  The criteria used to assess the portfolio are fair and recognise students’ effort and achievement.  86.3%
Table 44: Responses to questions about criteria.

And as expressed during student group interviews:

The criteria were fine; everything which is important was there (SR,
Manuela, 17 June 2003).

It was balanced, including the process and the products (SR, Joaquim,
17 June 2003).

It was good to know the criteria; before we knew more or less what the
teacher was looking for in our works; but it was quite vague. In the

portfolio everything was written down; we knew what to do and how
the work would be marked (SR, Inês, 9 June 2003).

I think the criteria are fine; it helps us to be responsible; to acquire


responsibility because we had to think by ourselves; our thinking
process is assessed (SR, Ruben, 9 June 2003).

Similarly all teachers and a great majority of students, in the questionnaire, agreed

with the specific criteria and respective weightings (questions 31-45) as follows:

                                      Criterion 1  Criterion 2  Criterion 3  Criterion 4  Criterion 5
Criteria: % of students’ agreement        97.4         98.3         97.4         98.3         97.4
Weightings: % of students’ agreement      92.2         95.7         90.6         95.7         94.9
Table 45: Responses to questions about weightings.

The responses to questions 46 and 48 showed that for all teachers and 84.5 percent of students all five criteria were equally important, and that for all the teachers and 84.5 percent of the students no other criteria were needed. The students who disagreed with the weightings for the assessment criteria made suggestions about a fair weighting. However their opinions varied, revealing personal preferences for certain kinds of skills and tasks. In the responses to question 48 of the questionnaire some students suggested other criteria such as: ‘the student is able to plan the work according to the time available’; ‘creativity’; ‘student commitment to work’; ‘imagination’; ‘the student is able to justify her work’; ‘technical competence’. Such suggestions were similar to the original criteria, but the students did not appear to recognise them in the assessment matrix, perhaps because they were integral to broader criteria.

7.3.9. Bias

Responses to questions 11-13 revealed that none of the teachers and very few students found the instrument biased; however, a considerable number of students (28.7 percent) believed that students from small cities or rural areas were disadvantaged, and in fact access to resources like good art libraries may be problematic in such areas.

Question  Questionnaire results: % of agreement with:  Students (117)
11  I think that the examination is free from gender bias (i.e. equally fair to all students, boys and girls).  97.4%
12  I think that the examination is free from bias related to where students live (i.e. equally fair to students from big cities and students from small cities or rural areas).  71.3%
13  I think that the examination is free from bias related to racial/ethnic/religious background of the students.  94%
14  I think that the examination is free from bias related to social class.  88%
Table 46: Responses to questions about bias.

Although the majority of respondents did not detect gender bias in the examination, some tasks involving written comments and self-reflection notes favoured female students more than male students. The researcher observed that the portfolios of male students included fewer notes and often exclusively visual work journals with brief annotations. In contrast, girls included long prose passages and both visual and written work in their journals.

According to some questionnaire responses, students from small cities were disadvantaged in terms of access to libraries and museums, and students from low-income families could have problems in acquiring expensive materials, as seen in the cases of Jocelyne and Joana at School K:

I have no money to buy another film and photographic paper and the
last photographs are not good enough; I have to present it as it is (22
May, Jocelyne’s work journal).

I spent all my money with the coloured photocopies… I would like to
go to the exhibition at Culturgest but the ticket is too expensive (12
May, Joana’s work journal).

This problem was acute at School K where teachers were aware of their students’

constraints in terms of materials. It was agreed from the first standardisation meeting

that the types of materials used by students should not influence teachers and

moderators and, in fact, some students with very high marks used recycled materials.

But the problem of bias related to students’ financial opportunities was not fully overcome because certain students could not achieve what they wanted and had to adjust their choice of materials to their budgets.

The new portfolio assessment instrument facilitated collaborative work between students and teachers in an ‘apprentice’-type relationship, as observed with AM and

her students. R favoured tutorials as a teaching method and almost all the teachers

used negotiation of objectives and progressive development of the autonomy of

student learning in their methods. The diversity of methods was possible because of

the flexibility of the instrument, commitment of teachers and motivation of students.

However a serious bias was detected related to the degree of aid and advice students

obtained from their teachers. As teacher R warned:

I think that if you are carrying on this experiment with portfolios you
should be careful because it is only successful with a certain kind of
teacher; it demands involved and well-prepared teachers; it requires
good school conditions…(School V, TR R, 25 June 2003).

Collaborative work between teachers and students, negotiation of objectives and progressive autonomy of student learning were perhaps only possible because the majority of teachers who participated in the main trial were well motivated. They worked hard to help their students without over-influencing their intentions and outcomes.

Although the new assessment instrument was considered valid and its tasks authentic, considerable problems of bias were found relating to flexibility of time, available resources and teacher support, as seen in the case of Tiago’s portfolio at School V:

Even if you say that you are not impressed by this type of final work; it
is difficult not be impressed by a student who actually makes a real
bicycle as a prototype instead of a small-scale maquette. And in this case
the student made the bicycle because he had more time, he made the
final product outside school. And this is not fair for the others who can’t
afford extra-time or a teacher who has friends owning a bicycle factory.
So I think, for a question of fairness that students should have the same
resources and time conditions (teacher AP, 25 June 2003).

7.3.10. Assessment procedures


In the questionnaire all the teachers and the majority of students stated that they found the assessment procedures fair:

Question  Questionnaire results: % of agreement with:  Students (117)
49  Students’ portfolios were fairly marked.  94%
50  I think it is important that students’ portfolios are assessed by more than one person for the final assessment, to minimize subjective bias.  93%
Table 47: Responses about assessment procedures.

The teachers expressed positive views about the appropriateness of the materials and the efficiency of training. However, slight problems of consistency were detected, as only seven teachers strongly agreed with question 75 and six with question 76. That the others opted for ‘agree’ may mean they were not entirely satisfied with the assessment matrix and the degree of consistency in standards. Comparing these answers with the multi-faceted analysis of the trial marking (p. 285), it is interesting to note that four teachers were slightly inconsistent in marking, which could be a consequence of less training or marking experience with the new assessment matrix.

Question  Questionnaire results: number of agreements with (Teachers, 10):  Strongly Agree / Agree
67  It was useful to discuss the interpretation of criteria with other teachers  9 / 1
68  During standardisation meetings we reached consensus about interpretation of criteria  6 / 4
69  It was useful to have visual exemplars to understand the criteria and standards  8 / 2
70  The instructions for assessment were clearly stated  9 / 1
71  I had adequate training to assess students’ portfolios  10
72  The assessment procedure combining holistic and criterion referenced marking is appropriate to judge art and design works  9 / 1
73  The global descriptors include an indication of the nature of knowledge, understanding and skills required in each mark range  8 / 2
74  The assessment matrix includes general instructions on marking  8 / 2
75  The assessment matrix is designed to be easily and consistently applied  7 / 3
76  The standards were consistently applied  6 / 4
Table 48: Responses about consistency of marking.

7.3.11. Consequences/Impact

7.3.11.1. Results

For 74.3 percent of the students, portfolio marks were close to those that students

obtained during internal assessment (question 53). At School V the marks were

similar, but in the other schools teachers said that the marks for the portfolios were lower, probably as a consequence of the breadth of the instrument, which was ‘more demanding in terms of knowledge and skills than the usual coursework exercises’ (M, on-line meeting 25 May 2003). However it may be misleading to

compare the portfolio marks with coursework marks because the criteria and

standards normally used in internal assessment may not be the same as those used in

the new examination. This was confirmed by teacher J who stated in the interview

that he used to mark student work leniently in internal assessment in order to help

them obtain good marks for university entrance purposes.

7.3.11.2. Effects upon students

The portfolio was a way to improve our knowledge and at the same time
it was preparing us for the future (SR Mafalda, questionnaire, June
2003).

The new assessment instrument increased students’ workload, but the majority of the

students were motivated and did not see this effect as negative. However students

firmly stated that with the portfolio requirement they needed to be more committed

and pay more attention to the subject. The questionnaire revealed that all the teachers

and a great majority of students saw the new instrument as a good learning tool and a

sound foundation for university. Ninety-four percent of students agreed with the

statement ‘putting together the art and design portfolio has been a worthwhile

learning experience’ (question 51) and 95.7 percent agreed with ‘Students’ art and

design portfolios are a good foundation for further study in art and design’ (question

52). These responses were confirmed during the interviews:

If someone wants to go further and choose a creative job, for instance as


a designer, the portfolio is a good training (SR Sara, 2nd June 2003)
The portfolio was a good preparation for university because we learned
about the development of methods of work (SR Zé and Marta; 9 June
2003)

7.3.11.3. Effects upon teachers

In common with the English examination system, it was evident that teachers felt overloaded by the portfolio experience. However, in the context of this study, since all the teachers were volunteers, they did not see this as negative. On the contrary, teachers appreciated the opportunity to learn, to improve their relationships with their students and to improve their assessment practices.

But, you know the portfolio is a real challenge for us, as teachers. We
had so different student outcomes…. It was amazing and I found it
difficult to organise it alone; it demanded extra-work; which I don’t
mind because I am used to it. You can’t imagine the quantity of phone
calls and emails I received per day from students. I was as much
involved in the projects as the students; so we lived it together
intensively…(TR R, 25 June 2003)

The experience helped me to understand my students better. Through the


dialogue established during the portfolio, I was able to understand them
as individuals, looking to their particularities and respecting their
differences (TR AP, questionnaire, June 2003)

The experience confirmed my own pedagogical practices in Studio Art


and provided me new tools for art and design assessment (TR AM,
questionnaire, June 2003)

Now; I think that I can assess students with more consistency and I can
give more valuable assessment feedback to students (TR C,
questionnaire, June 2003).

After the trial at least nine teachers appeared convinced about the positive qualities

of portfolio assessment. According to their responses to questions 77-92 in the

questionnaire, their perceptions of the necessary knowledge, understanding and skills

in art education had changed. They were convinced of the advantages of portfolios as

a learning strategy and as an instrument for summative assessment.

Questionnaire results: number of agreements with (Str. agree / Agree / Disagree / Str. disagree):
77. The examination influenced my teaching practice by using portfolios as a learning strategy: 8 / 2
78. The examination influenced my teaching practice by using portfolios for summative assessment: 9 / 1
79. The examination influenced my perception of required knowledge, understanding and skills in art and design education: 5 / 4 / 1
80. The examination changed my views about the importance of process in art and design assessment: 4 / 3 / 3
81. The examination changed my views about the importance of product in art and design assessment: 1 / 6 / 3
82. The examination changed my views about the importance of investigative studies in art and design assessment: 4 / 3 / 3
83. The examination changed my views about the importance of self-evaluation in art and design assessment: 2 / 5 / 3
84. The examination changed my views about the importance of developmental studies in art and design assessment: 3 / 3 / 4
85. The examination changed my views about the importance of technical skills in art and design assessment: 2 / 7 / 1
86. The examination changed my views about the importance of thinking skills in art and design assessment: 2 / 4 / 4
87. The examination changed my views about the importance of problem finding skills in art and design assessment: 1 / 4 / 5
88. The examination changed my views about the importance of problem solving skills in art and design assessment: 2 / 8
89. The examination changed my views about the importance of independent thinking in art and design assessment: 3 / 5 / 2
90. The examination changed my views about the importance of creativity in art and design assessment: 1 / 2 / 7
91. The exam did not influence my teaching practice: 1 / 1 / 5
92. The exam influenced positively my teaching practice: 7 / 2
Table 49: Responses about impact.

However some teachers disagreed with the propositions in questions 84, 85, 86, 87 and 90. They did not consider developmental studies, technical skills, thinking skills, problem-finding skills and creativity as new, perhaps because they already fostered these qualities in their previous teaching and assessment practices.

7.3.11.4. Effects upon schools and curriculum

Teachers saw the portfolio as a valid instrument for learning and assessment with

strong consequences for the educational experience. As stated by J the new

assessment instrument could be used as an agent for curriculum reform:

However, we live in a period of social and technological change and our


students should have access to a different kind of learning, fostering the
generation of personal ideas and the portfolio as an assessment instrument
is a good strategy to introduce new pedagogical practices (J’s report, 6
June 2003).

The teachers agreed that the new assessment instrument could function as an agent for the reform of established educational orthodoxy, and could promote diversity and plurality of approach, because it was a form of assessment that was integrated with the learning process and focused on students as active subjects rather than objects for the passive reception of information:

That’s why I think this is not just about assessment; portfolio is more
than an assessment instrument; it is a way of teaching and learning (TR
R, 25 June 2003).

The teachers who participated in the trial had an opportunity to review and reflect on

their concepts of the art and design curriculum and to re-think their practices in the

light of the proposed assessment framework. It is possible that some of them will

continue to use this model for designing portfolios and become agents of change in

the context of their own schools (on-line meeting, 25 June 2003).

7.3.12. Reliability of results

In order to check the inter- and intra-rater reliability of assessors within the new assessment procedures during the main trial, the teachers marked ten portfolios on two different occasions (April 2003 and July 2003). The difference between teachers was from one to three points (inter-rater reliability). The marking difference for the same teacher (intra-rater reliability, i.e. performance at different times) varied from zero to two points.
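A minimal sketch of how such inter- and intra-rater differences can be computed follows, assuming hypothetical marks on the 0-20 scale; the data values, rater labels and data structure are illustrative assumptions only.

```python
# Hypothetical check of inter- and intra-rater mark differences.
# marks[rater][occasion] lists the marks one rater gave the same portfolios.
import itertools

marks = {
    "T1": {"april": [12, 15, 9, 18, 11], "july": [13, 15, 9, 17, 11]},
    "T2": {"april": [11, 14, 10, 17, 12], "july": [11, 15, 10, 17, 12]},
}

# Intra-rater reliability: differences between a rater's two occasions.
for rater, occasions in marks.items():
    diffs = [abs(a - b) for a, b in zip(occasions["april"], occasions["july"])]
    print(f"{rater} intra-rater differences: {min(diffs)}-{max(diffs)} points")

# Inter-rater reliability: differences between raters on the same occasion.
for r1, r2 in itertools.combinations(marks, 2):
    diffs = [abs(a - b)
             for a, b in zip(marks[r1]["april"], marks[r2]["april"])]
    print(f"{r1} vs {r2} inter-rater differences: "
          f"{min(diffs)}-{max(diffs)} points")
```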

The high consistency of the results was probably a consequence of: (1) the ease of

application of the assessment matrix; (2) the use of a common shared language in the

criteria statements; (3) the training of teachers and moderators at the standardisation

meetings.

7.4. Comparing the current MTEP examination and the new model

7.4.1. Validities

The existing Portuguese examination in the arts did not present many problems of bias (see Chapter 4): the time and materials prescribed were strictly controlled and uniformly applied; the tasks did not require materials or equipment other than pencils and paper; the tasks were short and did not require long periods of development and production; and the questions did not require long, and possibly expensive, investigative studies. In contrast the new external assessment instrument showed some bias problems. However, it was considered by the users to be more valid and authentic in terms of content, face and response validities, qualities that were not evident in the existing examination.

7.4.2. Reliability

In the mock MTEP examination the ten raters had no training for marking; they

were asked to mark the scripts in the same way they usually did during the national

examinations – in other words they only had the official instructions used for

marking in the current examination. They marked students’ completed scripts as they

usually did, at home during the period January-April 2003 and they sent the marked

scripts to the researcher (for details see Chapter 4, p.128).

During the fourth pilot training session (11 January 2003) the seven teachers who

attended it were asked to mark thirteen portfolios; they had received training during

the previous sessions in order to come to a common interpretation of the criteria.

During the main trial, the researcher collected twelve portfolios, and the teachers

were asked to mark them on two different occasions. The new assessment procedures

improved the reliability of marking in terms of inter- and intra-rater reliability, as

seen below:

                                        Current Exam    New External Assessment
                                                        Instrument and Procedures
Differences between markers             5-7 points      1-3 points
(inter-rater)
Differences between the same marker     0-6 points      0-2 points
(intra-rater)

Table 50: Comparing inter- and intra-rater reliability between the current
examination and the new assessment instrument and procedures

A Rasch-based rating scale analysis computer programme called FACETS (Linacre,

1999) was used to analyse the data obtained from marking current examination

scripts, the pilot (Trial 1) and main trial (Trial 2) portfolios. FACETS is a

generalisation of the Rasch (1980) family of measurement models that makes

possible the analysis of examinations that have multiple potential sources of

measurement error (such as assessment items, raters, and rating scales). In the many-

facet Rasch model (Linacre, 1989), each element of each facet of the assessment

(such as the candidate, the marker, item, scale, etc) is represented by one parameter

that represents proficiency (for students), severity (for markers), difficulty (for items)

or challenge (for rating scale categories). For each element of each facet in the

analysis, the computer programme provides a measure (a logit estimate of the

calibration), a standard error (information about the precision of that logit estimate),

and fit statistics (information about how well the data fit the expectations of the

measurement model).
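For reference, a standard formulation of this model (after Linacre, 1989) expresses the log-odds that candidate n is awarded category k rather than category k-1 by marker j on item i as:

\log \left( \frac{P_{nijk}}{P_{nij(k-1)}} \right) = B_n - D_i - C_j - F_k

where B_n is the candidate's proficiency, D_i the item's difficulty, C_j the marker's severity and F_k the challenge of rating scale step k, all calibrated on a common logit scale.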

The specific questions explored with the FACETS output were related to the

consistency of markers: intra- and inter-rater reliability in the three different

situations, in order to compare the reliability of the old examination and of the new

assessment.

7.4.2.1. Summary of the FACETS

Tables 51, 52 and 53 display all facets of the analysis. The markers, candidates and

rating scales are calibrated so that all facets are positioned on the same scale, creating

a single frame of reference for interpreting the results from the analysis. The scale is

in log-odds units, or logits, which constitute an equal-interval scale with respect to

appropriately transformed probabilities of responding in particular rating scale

categories. Column 4 displays the 20-point rating scale used to score

students' work. The bottom rows provide the mean and standard deviation of the

distribution of estimates for candidates and markers. The first column in the tables

displays the measure scale in logits. In the current examination (table 51) the range

of the scale actually used was very narrow, while with the new assessment (Table 52

for Trial 1 and Table 53 for Trial 2) the scale of measurement was considerably enlarged

(column 1). The measurement scale is used to report estimates of probabilities of

candidates’ responses under the various conditions of measurement (ability, rater

severity) that have been entered in the analysis. The difference between the current

examination and the new assessment measurement scales might show that within the

current examination there were fewer opportunities for differentiating performance;

and that the new assessment provided more reliable information than the current

examination. The second column displays estimates of candidates’ proficiency.

Higher scoring students appear at the top of the column, while lower scoring students

appear towards the bottom of the column (Candidate ability measure). The third

column compares the markers in terms of the level of severity or leniency. Because

more than one rater marked each script (six scripts in the old exam) or portfolio (13

portfolios in the pilot and 12 portfolios in the trial) markers’ tendencies to rate

students' work higher or lower on average could be estimated. More severe

markers appear higher in the column, while more lenient raters appear lower. The

markers showed great variations in marking in the current Portuguese examination.

In the pilot and in the trial there was less variation of severity between the markers.

This could be because during the pilot and trial the markers were able to achieve

consensus about the interpretation of criteria; and also because the criteria were more

helpful for marking the new instrument than those for the current examination.

Table 51: Measurable data summary: Current examination
Table 52: Measurable data summary: Trial 1
Table 53: Measurable data summary: Trial 2
[FACETS vertical rulers for the three analyses, each showing the measure scale in
logits (column 1), candidate proficiency estimates (column 2), rater severity
estimates (column 3) and the 20-point rating scale (column 4).]

7.4.2.2. Reliability index


Multi-faceted analysis executed through the FACETS program provides, in addition

to the summary map, detailed information about the extent to which the assessment

instrument defines different levels of ability – its capacity to distinguish between

candidates. This is provided in a summary of the information on candidates in the

form of a reliability index. The index ranges from zero to one, with values close to one

suggesting good reliability. This reliability does not indicate the degree of agreement

between raters; it is more akin to the reliability indices associated with assessment

instruments or tests (e.g. Cronbach's alpha). The analysis revealed that the current

Portuguese examination had a lower degree of reliability in terms of internal

consistency than the new assessment instrument: 0.7 for the current examination,

compared with 0.97 for Trial 1 and 0.99 for Trial 2.
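For reference, the reliability index reported by FACETS is the Rasch 'separation reliability', which relates the spread of the element measures, adjusted for measurement error, to that error:

R = \frac{SD_{adj}^2}{SD_{adj}^2 + RMSE^2}

where SD_{adj} is the error-adjusted standard deviation of the measures and RMSE the root mean-square standard error; like Cronbach's alpha, R approaches 1 as true differences between elements dominate measurement error.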

7.4.2.3. Raters measurement reports

FACETS produces two indices of the consistency of agreement across markers. The

indices are reported as fit statistics: weighted and unweighted, standardised and

non-standardised. In this analysis the infit mean-square was used to estimate markers'

reliability. The expectation for this index is 1; the range is 0 to infinity. The higher

the infit mean-square index, the more variability can be expected. When markers are

fairly similar in the degree of severity they exercise, an infit mean-square index less

than 1 indicates little variation in the pattern of scoring, while an infit mean-square

index greater than 1 indicates more than typical variation in the ratings. According to

Myford and Wolfe (2000) there are no hard-and-fast rules for setting upper and lower

control limits for the infit mean square index. An acceptable interval for infit mean

square can be between 0.5 and 1.5. Below 0.5 the raters show less than acceptable

variation in scoring, while those above 1.5 show inconsistency or

excessive variation in marking.
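These control limits can be applied mechanically; a minimal sketch in Python, using the infit mean squares reported for the current examination in Table 54 below (rater 1 is omitted because its values are garbled in the output):

# Classify raters against the 0.5-1.5 infit mean-square control limits
# suggested by Myford and Wolfe (2000).
def classify(infit_msq):
    if infit_msq < 0.5:
        return "less than acceptable variation in scoring"
    if infit_msq > 1.5:
        return "inconsistent or excessive variation in marking"
    return "acceptable"

# Infit mean squares for raters 2-10 in the current examination (Table 54).
infit_msq = {2: 0.9, 3: 0.6, 4: 0.9, 5: 0.2, 6: 0.3, 7: 1.0, 8: 1.4, 9: 0.0, 10: 3.2}
for rater, msq in sorted(infit_msq.items()):
    print(rater, msq, classify(msq))  # flags raters 5, 6 and 9 (low) and 10 (high)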

The fixed (all same) chi-square is a test of the ‘fixed effect’ hypothesis:

‘Can this set of elements be regarded as sharing the same measure after allowing for

measurement error?’ The chi-square value and degrees of freedom (d.f.) are shown.

The significance is the probability that this ‘fixed’ hypothesis is the case. This tests

the hypothesis: ‘Can these raters be thought of as equally severe?’ The final line

contains the Random (normal) chi-square test. This is a test of the ‘random effects’

hypothesis: ‘Can this set of elements be regarded as a random sample from a normal

distribution?’ The significance is the probability that this ‘random’ hypothesis is the

case. This tests the hypothesis: ‘Can these persons be thought of as sampled at

random from a normally distributed population?’ (O’Sullivan, 2002, p. 17).
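For reference, the fixed-effect test takes the usual form of a homogeneity statistic over the rater severity measures (a sketch of the standard form, not necessarily FACETS's exact internal computation):

\chi^2 = \sum_{j=1}^{J} \left( \frac{d_j - \bar{d}}{SE_j} \right)^2, \qquad d.f. = J - 1

where d_j is rater j's severity measure, SE_j its standard error, \bar{d} the (precision-weighted) mean severity and J the number of raters; applied to the measures printed in Table 54 below this approximately reproduces the reported value of 22.5 with 9 degrees of freedom.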

Looking at the infit mean squares in Table 54 (current Portuguese examination)

revealed serious problems with four raters. Raters 9, 6 and 5 tended to give the

same scores to every script with little or no deviation, especially rater 9. Rater 10

was also very problematic, showing total inconsistency of marking (3.2).

In Trial 1 (Table 55) one marker presented an infit mean square of 0.3 (insufficient

variation) and two markers 1.6 (slightly inconsistent).

In Trial 2 (Table 56) four markers showed insufficient variation (infit mean squares of

0.2, 0.1 and 0.3), but these four markers did not participate in the pilot, so it was

reasonable to expect less consistency because they had less training than the others.

Table 54: Current examination raters measurement report


| Obsvd Obsvd Obsvd   Fair-M | Model        | Infit     Outfit    |
| Score Count Average Avrage | Measure S.E. | MnSq ZStd MnSq ZStd | Nu Raters
|  74     6   1s 0 .7 0      |              |                     |  1  1
|  74     6    12.3   12.32  |  -.19   .22  |  .9   0    .9   0   |  2  2
|  75     6    12.5   12.48  |  -.24   .22  |  .6   0    .6   0   |  3  3
|  54     6     9.0    8.95  |   .71   .21  |  .9   0    .8   0   |  4  4
|  72     6    12.0   12.01  |  -.09   .22  |  .2  -1    .3  -1   |  5  5
|  70     6    11.7   11.70  |   .00   .22  |  .3  -1    .4  -1   |  6  6
|  77     6    12.8   12.81  |  -.34   .23  | 1.0   0   1.0   0   |  7  7
|  80     6    13.3   13.32  |  -.49   .23  | 1.4   0   1.4   0   |  8  8
|  68     6    11.3   11.40  |   .10   .22  |  .0  -3    .0  -3   |  9  9
|  64     6    10.7   10.76  |   .28   .21  | 3.2   2   3.3   2   | 10 10
| 70.8   6.0   11.8   11.81  |  -.05   .22  |  .9  -.5   .9  -.5  | Mean (Count: 10)
|  7.1    .0    1.2    1.17  |   .33   .01  |  .8  1.5   .9  1.4  | S.D.
RMSE (Model) .22  Adj S.D. .24  Separation 1.09  Reliability .54
Fixed (all same) chi-square: 22.5  d.f.: 9  significance: .01
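As a check on the summary line above, the Separation and Reliability figures follow from the printed RMSE and adjusted S.D. by the standard Rasch definitions (equivalent to the reliability formula noted in section 7.4.2.2):

G = \frac{Adj.\,S.D.}{RMSE} = \frac{.24}{.22} \approx 1.09, \qquad R = \frac{G^2}{1 + G^2} \approx .54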

Table 55: Trial 1 raters measurement report


| Obsvd Obsvd Obsvd   Fair-M | Model        | Infit     Outfit    |
| Score Count Average Avrage | Measure S.E. | MnSq ZStd MnSq ZStd | N Raters
|  178   13    13.7   13.87  |   .33   .31  |  .7   0    .7   0   | 1 1
|  176   13    13.5   13.73  |   .51   .31  | 1.6   1   1.7   1   | 2 2
|  172   13    13.2   13.37  |   .91   .32  |  .3  -2    .3  -2   | 3 3
|  175   13    13.5   13.65  |   .61   .31  | 1.1   0   1.0   0   | 4 4
|  168   13    12.9   12.92  |  1.33   .33  |  .5  -1    .5  -1   | 5 5
|  179   13    13.8   13.94  |   .23   .30  | 1.0   0    .9   0   | 6 6
|  173   13    13.3   13.47  |   .81   .32  |  .7   0   1.1   0   | 7 7
| 174.4 13.0   13.4   13.56  |   .68   .32  |  .8  -.5   .9  -.4  | Mean (Count: 7)
|  3.5    .0     .3     .32  |   .35   .01  |  .4  1.1   .4  1.0  | S.D.
RMSE (Model) .32  Adj S.D. .15  Separation .48  Reliability .19
Fixed (all same) chi-square: 8.4  d.f.: 6  significance: .21

Table 56: Trial 2 raters measurement report


| Obsvd Obsvd Obsvd   Fair-M | Model        | Infit     Outfit    |
| Score Count Average Avrage | Measure S.E. | MnSq ZStd MnSq ZStd | N Raters
|  143   12    11.9   12.03  |  -.08   .33  |  .9   0   1.0   0   | 1 1
|  148   12    12.3   12.64  |  -.60   .33  |  .2  -2    .2  -2   | 2 2
|  137   12    11.4   11.21  |   .54   .32  |  .1  -3    .1  -3   | 3 3
|  153   12    12.8   13.19  | -1.13   .33  |  .7   0    .6  -1   | 4 4
|  140   12    11.7   11.62  |   .23   .32  |  .1  -3    .1  -3   | 5 5
|  138   12    11.5   11.34  |   .44   .32  |  .5  -1    .6  -1   | 6 6
|  144   12    12.0   12.16  |  -.18   .33  |  .5  -1    .4  -1   | 7 7
|  147   12    12.3   12.53  |  -.50   .33  |  .3  -2    .3  -2   | 8 8
|  138   12    11.5   11.34  |   .44   .32  |  .8   0    .8   0   | 9 9
| 143.1 12.0   11.9   12.01  |  -.09   .32  |  .5 -1.9   .4 -2.0  | Mean (Count: 9)
|  5.1    .0     .4     .65  |   .54   .00  |  .3  1.3   .3  1.3  | S.D.
RMSE (Model) .32  Adj S.D. .43  Separation 1.32  Reliability .63
Fixed (all same) chi-square: 24.6  d.f.: 8  significance: .00

It is important to note that three of the Trial 2 teachers (R, C and AP) also

participated in Trial 1 (pilot), so they had more experience with the new marking

procedures than the others who had just attended the standardisation meeting and this

was reflected in these Trial 2 teachers' highly consistent marking (see raters 4, 6

and 7 in Table 56 for their infit mean-square values). Failure to use the full

scale for marking can be observed with the new teachers involved in the trial

who had not attended all the on-line meetings (see raters 2, 3, 5 and 8 in Table 56

for their infit mean-square values). It is possible that these raters were not confident in

applying the grade descriptors because of their lack of training with visual exemplars

and participation in group discussions about interpretation of criteria. Therefore it

seems apparent that teachers who experienced previous assessment training

improved their marker reliability, while a lack of previous training reduced

it.

7.4.2.4. Probability curves

FACETS also provides probability curve graphics. The curves present the probability

of occurrence for each category. The probability of the extreme categories always

approaches 1.0 for corresponding extreme measures. Most scale developers intend

this to look like a series of hills (O’Sullivan, 2002, p. 21). From the graphics in tables

57, 58 and 59 it is evident that the current Portuguese examination probability curve

is not at all as might be expected. It appears that in the current examination (table 57)

between 7 and 16 on the scale there is never a clear probability that any particular

score will be achieved suggesting very serious problems in the scale. The middle of

the scale appears to be unable to distinguish between different levels of ability and

this is problematic as this is where critical decisions often have to be made.

In Trial 1 (Table 58) the range of performance only extends from 10-18, and there

were problems with the scale points 13 and 16. At these points on the scale there is

never a clear possibility that this score will be awarded, i.e. this ‘hill’ is not

distinguishable from the other ‘hills’ and this could be caused by lack of clarity in the

draft assessment matrix and grade descriptors. However the revision of such tools

seemed to help because in Trial 2 (Table 59) the probability curve was closer to the

expected probability curves, though there are still some residual problems with the

scale, for example at points 9 and 11.

In conclusion it is evident that still more work needs to be done to improve the new

examination instructions at these points on the scale, but the trial scale is clearly

superior to that used in the current Portuguese examination.

[FACETS category probability curves: probability of each rating scale category
plotted against the measure, -4.0 to +4.0 logits.]

Table 57: Current examination probability curves

[FACETS category probability curves: probability of each rating scale category
plotted against the measure, -6.0 to +6.0 logits.]

Table 58: Trial 1 probability curves

[FACETS category probability curves: probability of each rating scale category
plotted against the measure, -12.0 to +12.0 logits.]

Table 59: Trial 2 probability curves

7.4.2.5. Expected ogives

The expected score ogive (a graph in which expected scores should rise in similar steps)

similar steps) shows the average rating value expected for any measure relative to the

item, judge etc. The ogive also indicates the category ‘zone’ or areas between

average ratings (O’Sullivan, 2002, p. 21). Comparing the expected ogives from the

current Portuguese examination (table 60), trial 1 (table 61) and trial 2 (table 62) it

is evident that the new assessment showed much more consistency between points on

the scale, because the 'zones' between average ratings are more equally distributed.
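For reference, the expected score plotted by the ogive is the probability-weighted mean of the rating categories at each point \theta on the measure scale:

E[X \mid \theta] = \sum_{k=0}^{m} k \, P_k(\theta)

where P_k(\theta) are the category probabilities shown in Tables 57-59 and m is the top category; roughly equal 'zones' between successive expected scores indicate a well-functioning scale.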

In the current Portuguese examination only a small portion of the scale is used; it is

impossible to distinguish the intended steps of the scale, and at the top and bottom

the scores overlap. There is almost no increase in score for large areas of

ability between 2 and 7. Between 7 and 15 the scores increase appropriately with ability.

But between 16 and the top of the scale there is almost no improvement in score for a

similar increase in ability. It appears that the examination was unable to

distinguish between levels of student ability or achievement and raters were

unable to differentiate between those levels. In Trial 1 (Table 61), and especially

in Trial 2 (Table 62), the increase in score is more in line with the increase in ability

or achievement.
[Expected score ogive: expected score (6-17) plotted against the measure,
-4.0 to +4.0 logits.]

Table 60: Current examination expected score ogive

[Expected score ogive: expected score (10-18) plotted against the measure,
-6.0 to +6.0 logits.]
Table 61: Trial 1 expected score ogive

[Expected score ogive: expected score (4-20) plotted against the measure,
-12.0 to +12.0 logits.]

Table 62: Trial 2 expected score ogive

7.4.3. Consequences/impact

According to the questionnaire responses about current art examinations in Portugal

(see Chapter 4) the current examination narrowed the range of the discipline and

distorted curriculum practice; the new assessment instrument enlarged the range and

promoted a broad vision of art and design education which might have some positive

impact upon the curriculum. The current examination did not require specific in-

service training for teachers; the new assessment procedures facilitated useful

professional training and dialogue between teachers. And, finally, the new

assessment instrument based on portfolio work was seen as a good foundation for

further studies in higher education, which was not the case with the current

examination.

7.4.4. Practicalities

The current Portuguese examination might be considered more practical in terms of

the limited resources, time, equipment and environment required. The new

assessment instrument and procedures require extra work from students and teachers,

improved work environment and possibly more expensive resources. The cost of

achieving a degree of authenticity of tasks is high but not prohibitive if the

examination is conducted in normal classroom situations. The price to pay for the

enhanced reliability of marking results is also higher because teachers and

moderators must be properly trained to conduct the external assessment.

Summary
Through its implementation in five different schools the new assessment instrument

was evaluated. From the data collected for evaluation purposes the following

conclusions about its strengths and weaknesses were drawn:

Main advantages:
1. The new assessment instrument was more closely related to the nature of art

and design, providing 'authentic' tasks. It fostered in students the development of

independent thinking, critical skills and decision-making.

2. The necessary evidence required to reveal relevant knowledge, understanding

and skills in art and design in the portfolio increased validity. The assessment

instrument was considered to be compatible with art and design practice,

related to the contents of the discipline and to be capable of allowing for

students’ own interests.

3. The new assessment instrument included tasks that enabled students to realise

their intentions and motivated them.

4. It facilitated assessment for learning by providing useful feedback for

teachers and students. Students experienced a great sense of achievement

from the portfolio and they felt motivated to learn. In addition the new

instrument was seen as a good learning experience and a good foundation for

preparation for art and design at university.

5. It fostered effective dialogue between students and teachers. The new

portfolio assessment instrument also enabled collaborative work between

students and teachers in an ‘apprentice-like’ relationship. Collaboration was

one key aspect of the new assessment instrument that facilitated constant

dialogue between teachers and students, and between students especially in

group discussions, or ‘crits’.

6. It was adaptable to different school contexts.

7. The instructions, criteria, assessment matrix and global descriptors were

clear, increasing the transparency of assessment.

8. It offered a system of teacher and assessor training that increased common

interpretation of criteria.

9. It included a system of moderation that enabled a more uniform application

of standards and provided useful feedback for teachers and students.

10. Finally it provided more reliability of results than the existing examination.

Main disadvantages:

1. It requires hard-working and committed teachers. During the pilot and trial

teachers spent long periods of their free time in meetings. Collaborative work

between teachers and students, negotiation of objectives and progressive

autonomy of student learning was only possible because teachers who

participated in the trial and in the pilot on a voluntary basis were particularly

committed.

2. Problems were revealed resulting from unfamiliarity with the new instrument.

The candidates’ previous experience of tasks similar to those tasks required in

the portfolio seemed to be a key factor for the success of the new assessment

instrument. Unfamiliarity with the tasks and the lack of previous training in

critical thinking were the main causes of failure. An important issue in

the relative success of the portfolio was linked to the students’ attitudes and

behaviours. Their past working practices and experiences of learning

independently and collaboratively all contributed to their overall

performance.

3. Potentially serious problems of bias were revealed, for example bias related

to access to resources and teacher support. The bias related to the degree of

help and advice students could obtain from their teachers was particularly

problematic.

4. The proposed new instrument is more expensive to operate than the current

Portuguese examination.

5. Finally the increased flexibility offered by the new assessment instrument

may inhibit its uniform application.

Chapter 8: Conclusions

The four main sections of this chapter: (1) revisit the research questions

identified in the Introduction Chapter (p. 15) in order to summarise the findings

and draw conclusions, (2) identify some unanswered questions which might be

explored more fully in further research, (3) review and critique the research

methodology, and (4) establish the implications of the study for Portugal and

make recommendations.

8.1. Current system of art examinations in Portugal (2000-2003)

The finding that the current system of art and design examinations in Portugal lacked

the validity and reliability deemed necessary elsewhere was not a surprise outcome

(see Chapter 4). Eisner’s views (1979) were confirmed in that the old-fashioned

system of tests to assess art and design provided: ‘a biased, and distorted picture of

the reality that we are attempting to understand’. Portuguese art educators and

examinations were strongly embedded in tradition and regulations designed for the

benefit of administrators rather than students. Portuguese art examinations were in

conflict with many essential art and design concepts and even with the general

ideology underpinning the Portuguese educational system. In the name of

‘objectivity’ art and design students were examined in only easily assessable

activities by means of pencil and paper tests. Consequently art and design students

were selected for higher education on the basis of assessments that did not reflect the

real complexity of art and design learning. The examinations focused on a narrow

area of the curriculum; studio art was eliminated from the external assessment

mainstream because its assessment was thought to be ‘subjective’ and

to foster a problematic diversity of outcomes. So, instead, students were examined

on less important specialisms such as Materials and Techniques of Plastic Expression

(MTEP) by means of a short, timed drawing test. Studio art’s status in the curriculum

was weak although considered by a great majority of art students and teachers to be

the most important specialism for preparing students for further studies in art and

design and associated fields such as architecture.

The Portuguese system of art examinations tended to deny important rationales and

educational objectives for the arts. The contemporary demand for transferable skills

such as those of communication, information retrieval, problem solving, critical

analysis, self-monitoring and self-assessment did not feature in art examinations.

Students were not given an opportunity in the examination to show how well they

knew, understood and could make art (Chapter 4, p.130). Summative assessment and

examinations were used only for gate-keeping purposes or as a punishment/reward

system, not to empower students or as educational experiences designed to display

students' knowledge, as Wiggins described (1993, pp. 54-67).

Debate about new forms of assessment in Portugal was mainly limited to formative

assessment, mostly in science subjects and teacher education (Valadares & Graça,

1998; Boavida & Barreira, 1993; Sá Chaves, 2000). Although they were dissatisfied

with the current system of art examinations, teachers and students generally were not

aware that these traditional forms of external assessment could be replaced.

As Broadfoot (1998, p. 473) argued, tests: '… with their emphasis on scientific

rationality are so pervasive in modern societies that they blind us to the potential for

alternative forms of assessment'. This study found evidence for the need to reform.

The great majority of participants in the Portuguese survey (see Chapter 4) proposed

new external assessment instruments to improve the system, which were linked to the

alternative or more authentic methods of assessments discussed in the international

literature (see Chapter 2). Conservatism was not an issue among the participants and

it was found that there was a strong will to change practices.

A finding of this research was that Portuguese art and design examinations did not

offer a clear set of assessment materials and guidance for teachers and students

(Chapter 4, pp. 128-130). Curiously, while there was an extraordinary number of

regulations for the conduct of examinations, assessment criteria were only vaguely

defined. There were no established attainment targets or standards – these were

tacitly used but not publicly discussed. The ritual of the examinations was more

important than the evidence of achievement they produced.

It could be argued that the system embodied Eisner’s connoisseurship model (1985)

in the conviction that teachers and assessors should be qualified artists and teachers

who know how to interpret the visual terms used in the criteria: art critics and

connoisseurs. This may be an important factor but is not the only necessary

condition. As the literature has established valid and reliable assessment does not

depend on single teachers working in isolation; it depends enormously on dialogue

and sharing values. It is true that it is difficult to define criteria and standards in art

and design (Boughton, 1999) but it is erroneous to expect assessors to work

efficiently and effectively without common agreement about the qualities thought to

be desirable in the work. Not only are standards and criteria difficult to establish,

it is also very expensive to prepare teachers to interpret them accurately and provide

places and time for dialogue. The research found that in-service opportunities for

training in assessment were not on the agenda of the Portuguese art examination

system, and the negative consequence of this was the examinations' poor reliability of

results.

In Portugal the independent judgements of each assessor were trusted by the system

without any moderation procedures to ensure consistency of marking. Teachers and

students were dissatisfied with this situation (Chapter 4, pp. 136-138). The lack of

control of the assessment procedures was another possible cause for the unreliability

of the examination results and eliminated a possible means of providing effective

feedback for teachers and schools.

8.2. Current system of art and design examinations in England (2000-2003)

The current system of art and design examinations in England was found to have

some advantages when compared with the Portuguese system in terms of validity and

reliability. Overall the assessment procedures revealed some degree of increased

reliability, especially as a consequence of the standardisation and moderation

procedures. However an analysis of key stakeholders’ views revealed some

problems. For example, one was the ever-increasing workload for teachers,

while the other was ‘more philosophical and concerned with the effect of the

examination on teaching and learning’ with negative effects on students and the

curriculum (Steers, 2003, p.25). In the GCE examination students were required to

submit every unit from the start of the course fully realised and of the required

standard, potentially exposing them to over-assessment. Students and teachers lacked

time for experimentation and creative risk-taking. According to Steers (2003, p.25)

the examination’s hidden message for teachers and students was: ‘…don’t bother

being creative, avoid risks, play safe, do what is expected’.

The modular system was believed by some to threaten the very nature of art and

design learning which, in theory, should foster a climate of inquiry, risk-taking and

creative opportunities (Hardy, 2002). It was also in conflict with the current

emphasis in English education on the need to develop the creative potential of

students, as expressed for example, in the report of the National Advisory Committee

on Creative and Cultural Education (NACCCE, 1999). In the researcher’s view, at

the Edexcel standardisation meetings the set of exemplar of art and design students’

work displayed was rather ‘safe’ devoid of obvious risk-taking. Besides the

‘exemplars of students’ art works’ were used not for discussion or to negotiate

standards but rather to illustrate the competencies students needed to reveal in order

to meet assessment objectives. These visual examples were important tools for

understanding and interpreting the language of the criteria and assessment matrix,

but there was no ‘debate about what counts as good work in the community’

(Boughton, 1997). Instead it was found that in practice there was little teacher

ownership of assessment. It was observed that students’ voices were also neglected

by the assessment system, which had little emphasis on self-assessment in the

instructions and criteria for the examinations.

At the end of 2003 several signs of dissatisfaction with the English examination

system were evident and debate about reform was ongoing. Tattersall, a former head

of the Assessment and Qualification Alliance (AQA), argued in The Guardian

(September 30, 2003) that teachers should be trusted more to assess their own

students. She wrote:

‘ …the equation of reliable assessment with externally


set and marked examinations is neither helpful nor
based on reality. It devalues the skills which external
assessment cannot accommodate, it places pressure on
students. Most of all it undermines teachers’
confidence and commitment’.

Ronnie Lane in the National Society for Education in Art & Design newsletter:

A’N’D (Nº 10, Winter 2003/4) expressed teachers’ concerns as voiced in the

Liverpool Art and Design Teachers Forum meeting (2003) about the increasing

workload for marking, internal standardisation and preparing moderation exhibitions:

‘The forum was of the opinion that art teachers cannot sustain the levels of workload

which is being demanded by the examination boards’ (Lane, 2003, p. 3). At the time

of this writing the Department for Education and Skills was working on new

proposals for opportunity and excellence in education for ages 14-19 (Tomlinson's

Working Group on 14-19 Reform), building on proposals outlined in the Green Paper

consultation of February 2002 (www.dfes.gov.uk/14–19) concerning extending

opportunities and raising standards. Their extensive consultation confirmed the need

to create a clearer and more appropriate curriculum and qualifications framework for

the 14–19 phase – ‘…one that develops and stretches all our young people to achieve

their full potential, and prepares them for life and work in the 21st century’

(www.dfes.gov.uk/14–19). By the end of 2003 the Working Group on 14-19 Reform

reached interesting interim conclusions and recommendations emphasising the

importance of matching styles of assessment to styles of learning, using a variety of

assessment methods as well as fostering the idea of assessment for learning,

increasing formative assessment and reinforcing the professional judgement

of teachers.

8.3. Impact of the English and Portuguese art and design examinations

From the study of English and Portuguese art examinations it was evident that

assessment of art practice constitutes a particular instance of what Foucault describes

as power-knowledge which invokes processes of surveillance, normalisation,

discipline and regulation of student’s art work. As Atkinson pointed out assessment

logic is reductionist and by implication it omits ‘…those things, those forms of life,

which do not fit’ (Atkinson, 2002, p. 195).

A tension between the curriculum and summative/external assessment was

evident in Portugal. Formative assessment to reinforce learning was emphasised in

curriculum materials but summative assessment and examinations were not viewed

as having learning potential. Therefore it was not surprising to discover that

internal assessment results were inconsistent with external examination results.

This confirmed the hypothesis that external assessment in Portugal could be

interpreted as a simple gate keeping exercise with an extremely negative washback

upon the curriculum. Tests and examinations strongly influenced teaching practice,

teachers taught to the tests and examinations which apparently were intended to

assess students’ memory, understanding of technical drawing conventions, trivial art

historical information and formalist applications of the elements and principles of

design whilst neglecting critical understanding of art making and creativity.

In England, art and design examinations, as well as providing a means for selection

and certification, also seemed to be a way of controlling the curriculum. The QCA

and awarding bodies exercise great power over students, teachers and schools

imposing a centralised view of art education as defined in the National Curriculum.

As previously stated, the modular system of examination in England was found to

have negative effects on the curriculum, teachers and students, not only in terms of

work overload but also because it did not encourage teachers to foster

experimentation and risk-taking during their courses. Therefore the need for

accountability in England also generated negative effects upon schools and teachers.

For example allocation of school resources by the government depended on the

grades that their students obtained in the examinations and the position of

individual schools in national ‘league tables’.

8.4. Increasing the validity of Portuguese art and design external assessment

The great majority of participants in the Portuguese survey reported in Chapter 4

(p. 134) suggested external assessment instruments such as portfolio work that they

believed might improve the system. However assessing portfolios requires changes

to assessment practices and corresponding changes to the curriculum and pedagogy

(Klenowski, 2002). In the context of this research it was impossible to change the

established curriculum for art and design in Portugal, thus it was necessary to find

strategies to increase the transparency of the assessment procedures; to revisit the

current domains of knowledge, understanding and skills in order to provide a clear

and valid framework for defining new criteria, new tasks and the evidence required

for art examinations.

8.5. Increasing the reliability of Portuguese art and design external

assessment results

It was not surprising to discover that the great majority of participants in the

Portuguese survey (p. 139) suggested teacher training and assessment juries as ways

to improve the system. The key reforms identified for increasing the reliability of

results were in-service training, the introduction of standardisation meetings and

moderation procedures.

8.6. The proposed model for art examinations

8.6.1. The underlying model

The proposed framework was designed to accommodate postmodern views of art and

education and was intended to relocate student experience, and contemporary

realities of the visual arts, by encouraging students to ‘…develop a critical awareness

of the visual culture they encounter every day’ (Freedman, 2003, p.11). It was partly

influenced by Swift and Steers' (1999) manifesto for the arts in schools and was

intended to reduce tensions between assessment discourses and the heterogeneity of

practice by moving towards a more inclusive approach to teaching and learning in art

education. It was also designed to enhance the production of visual forms of

signification in order to explore students’ experiences and their interests in specific

social issues.

During the trials of the new assessment (Chapters 6 and 7) it was found that in

certain cases students integrated everyday life experiences and reflected on the social

meanings of art through exploration of issues such as racism, sexual orientation,

otherness, etc. They used a variety of methods of investigation to understand social

issues including interviewing members of the community and independent searches

for source material. But this was not the case for all students. In school A for

example the project briefs for the portfolios did not ask pupils to explore social

issues and the investigation was limited to examples of modernist art. The role of the

teachers and their beliefs significantly influenced students’ choices. In some cases

teachers were not able to go beyond the formalist art concepts they had been taught

and they did not encourage critical analyses of visual culture. Such teachers, for

example M, AP and C, were nervous about letting students investigate themes they

were not comfortable with themselves, and convinced their students to engage with

‘safer’ and more conventional projects. Short standardisation meetings were

insufficient to encourage such teachers to make profound changes in their attitudes.

The on-line meetings, while intended to provide support for teachers and more

space for dialogue, did not completely attain these goals. Using Internet resources for

dialogue about art and design learning and assessment may be more useful in the

future; at the present time it was found that both teachers and their students found it

difficult to use such tools because of their unfamiliarity. Moreover digital

reproductions of students’ studio art works are not appropriate for fully appreciating

and discussing the visual characteristics of portfolios, except perhaps when the

student’s chosen media is digital.

8.6.2. The proposed instrument

After the trials of the proposed instrument (Chapters 6 and 7) it was evident that the

role of the teacher was crucial when students challenged conventions or methods of

work through creative proposals or when they attained unexpected outcomes that

revealed their own intentions, personal experience and awareness of social and visual

culture issues. A, I and R acted as facilitators through the substantive

dialogue that took place between the student and the teacher, using teaching methods

based on collaborative learning and partnership (see Appendices XIX, XX and XXI).

This enabled assessment to be integrated into the teaching and learning cycle.

This confirms Klenowski's (2002, p.115) view that the teacher's role is fundamental

preparing and managing the learning environment. However, one negative

consequence of the importance of the teacher's role in the assessment was the evidence

that the instrument was strongly biased by the degree of aid afforded by the teacher.

The new assessment instrument was found to respect the student's voice and

personal style as Wiggins (1993) and Ross et al (1993) have claimed it should do.

Student performance was very variable according to their school context, although a

wide range of levels of achievement was obtained in each school with the exception

of School P where the portfolio experience verged on complete failure for the

reasons described in Chapter 7, for example the lack of student motivation and

unfamiliarity with the format. In the new assessment instrument students were given

a considerable degree of decision-making: selection of works for inclusion in

portfolios and the reasons for inclusion were negotiated with their teachers.

However, for some students, such a level of autonomy was difficult, perhaps because

they did not clearly understand the aims and assessment criteria. While they were

aware that they were expected to be thoughtful about the selection of work for

inclusion, this presented problems because they had not developed a capacity for

reflection in previous years. From the research it was also apparent that some

students found self-evaluation difficult. In the new instrument students need to take

responsibility for self-directed learning and for developing and maintaining their

portfolios. The student interviews confirmed that this requirement was understood

but nevertheless they were not always able to evaluate their own progress and

achievement. Interactive learning, tutorials, interviews, one-to-one sessions,

discussion groups, peer critique, were all found to be helpful in this regard.

But developing a capacity to select appropriate evidence of attainment of particular

competencies takes time.

According to the teachers involved in the main trial some students succeeded in

presenting portfolios of exceptional quality even in cases where resources were poor

and teacher aid was not forthcoming.

The students’ commitment and motivation were evident and contributed to their

success. For example Joana fought to pursue her own goals despite a lack of the

resources to realise them and the teacher’s disagreement with the chosen theme

'Apocalypses' and sources (see Appendix XX, p.270). According to their teachers,

Susana presented excellent investigative studies despite the lack of available

information about ‘capoeira’ and Ruben presented very good developmental studies

using recycled materials. In the case of these students the opportunity to make a

project relevant to their own lives appeared to give them the strength to work hard

and seek out aid elsewhere. Students felt that with the new assessment instrument

they gained more ownership of the learning and assessment process. The conclusion

was drawn that students need to have decision-making powers about themes for

project briefs and about which work is chosen for inclusion in portfolios if they are to

feel ownership of, and real commitment to, the art making. It increases the content

validity of the assessment instrument and presents more opportunities for student

motivation and independent learning.

Unlike the current examination in Portugal, which limits the range of the curriculum,

the new assessment instrument enlarged the range of the subject and raised the status

of studio art in some students’ perceptions. It required more evidence and more

committed work from them and this was only possible because they came to believe

that studio art had some direct relevance for their lives. Should the proposed

assessment instrument ever be implemented on a national scale, it might enhance

the status of studio art in the curriculum.

The breadth and flexibility of the assessment instrument accommodated a variety of

different visions of art and design processes and making which challenged the old

habit of setting teacher-directed prescriptive tasks. It does, however, require art

teachers who are able to help students develop independent critical and self-reflective

skills. It also demands dialogue between teachers and students, mutual understanding

of aims and intentions and negotiation of tasks and approaches. From the trials it was

concluded that the new assessment instrument provided useful feedback for students

and teachers. Assessment was seen in Wolf ‘s terms as 'an episode of learning'

(Wolf et al, 1991, p.183) providing qualitative information in the course of portfolio

development (Wiggins, 1993, p.195) in the form of a constructive dialogue between

students and teachers.

Problems may arise when teachers do not understand the implications of such

changes, as happened with teacher E (School K) who did not understand

Joana’s visual work in her portfolio because she did not clearly realise that the

assessment had expanded art practice to the world of visual culture (see Appendices

XX, p.264 and XXVI, p.319). E did not recognise that the teachers needed to acquire

a deep respect for the students’ artwork and this challenged the limits of her

understanding of art practice. It seems

evident therefore that before implementing the proposed assessment instrument

it would be necessary to change teaching and learning practices and to revise

preconceptions of what students in the art and design disciplines should learn and be

taught. Teachers need to reconceptualise their pedagogy before they can integrate the

new instrument into their everyday routines and classroom teaching.

At the same time, the breadth and flexibility of the assessment instrument presented

problems in terms of uniformity of application and therefore its potential reliability.

Students and teachers in the trials recognised that ‘taking risks’ was positively

encouraged, and diverse creative outcomes and approaches were expected. In the

current Portuguese art examination system tests were easily administered under the

same conditions for all schools and all students. The only resources required were

pencil and paper and the short time for the test allows effective standardisation of

assessment conditions. The new instrument, which aims to provide a more valid

assessment, presented problems related to diversity of outcomes and the need for

increased resources. In the English art and design examination system this problem

was generally less evident because the work submitted for examination was often

very similar in character, perhaps because teachers and students felt safe within

orthodoxies established by the school art tradition and examination rubrics.

Perhaps they felt that they could not risk different approaches because the results of

examinations had major implications for public perceptions of schools and teachers,

as well as students.

It is recognised that if the new assessment instrument were to be used over a period

of several years, there is a danger of losing flexibility. Just as appears to

have happened in the English art and design examinations, a new orthodoxy could

develop around sets of exemplars displayed during standardisation meetings and

teachers might encourage students to present very similar portfolios in order to

achieve good results. One possible way to prevent such an occurrence is to stress the

rationales for the assessment instrument and emphasise the importance of diverse

creative outcomes during in-service teacher training sessions. And students need to

understand that they will be rewarded for innovative and personal responses

evidenced through a wide variety of outcomes.

It is widely agreed that creativity needs to be encouraged in the learning community;

there is a need for a wide variety of visual experience and debate within assessment

communities. As Freedman (2003, p. 24) pointed out, surprise should be valued,

educators ‘…should continually hope for surprising crossings of aesthetic levels in

the creation of knowledge, and for the unexpected outcome that surpasses planned

objectives’. Teachers should take into account the most up-to-date student and

professional art being produced at any given time in their debates about what should

be valued, and such debates ‘…must include continual challenge so that student art

[that] goes beyond the hopes stated in instructional objectives, is rewarded’

(Freedman, 2003, p. 168).

8.6.3. The assessment procedures

Using the new assessment procedures required considerable time for training and for

dialogue between teachers. The increased burden on teachers needs to be

acknowledged – the quantity of work to be assessed, the standardisation meetings

and reports to be written – all contributed to teachers’ workload. This was only

possible in the trials because the teachers who participated in them were convinced

that it was important for their own professional development and for the fairness of

art and design students’ assessment. It would be unrealistic to expect all Portuguese

art and design teachers to engage in the new assessment procedures with the same

commitment as the volunteers in the trials without considerable preparation and in-

service training. However this experience showed that it is possible to change old

established and unreliable habits of arts assessment in Portugal if the reasons for

change are properly understood, and change is understood to be in the best interests

of the subject, teachers and students.

The development of communities of assessors for inter-rater reliability purposes can

be used for positive professional development opportunities (Klenowski, 2000;

Schönau, 1996). The concept of a ‘community of judges’ (Boughton, 1997) and the

example of the English moderation procedures were used to design the assessment

procedures. It was clearly established that reliability of results achieved in the trials

increased in comparison with the current Portuguese system of a single assessor (see

Chapter 7, p. 281). As noted by several authors (Boughton, 1997; Beattie, 1997;

Steers, 1996; Blaikie, 1996) assessors can reach a considerable degree of consensus

through the use of standardisation and moderation procedures. The new procedure

was found to be more demanding than the current assessment system in Portugal, but

it was also found to present real benefits in terms of validity, reliability and teachers’

professional development. One feature of the experiment’s success was that teachers

had enhanced confidence in their powers to make assessment decisions. The new

procedures did not undermine teachers’ professionalism. The current Portuguese

reliance on teachers’ connoisseurship was maintained but the shift was made to

dependence on a community of assessors achieving consensus, thus helping to

eliminate some of the idiosyncrasies of the present system.

8.6.4. Impact

From the research (see Chapter 7, pp. 275-279) it appeared that the effects of the new

framework for assessment were significant for both the teachers and the curriculum.

A principal outcome of the new assessment instrument could be an increase in the

status of studio activity and the raising of art and design educational standards by

redefining rationales, domains of learning, understanding and skills in art and design.

The teachers in the pilot and the trial reported that they changed the way that

they thought of their teaching, classroom practice and assessment methods.

These changes happened because the teachers who were already competent

practitioners reflected on the proposed rationales for the experience and decided,

with the support of their peers, that they had to make changes, because they agreed

with the proposals and believed that their students had the right to be involved in and

be responsible for their own learning and assessment.

8.6.5. Strengths and Weaknesses

Throughout this chapter strengths and weaknesses of a new framework for external

assessment in the arts have been discussed and these are summarised as follows:

8.6.5.1. Weaknesses

a. The assessment instrument (portfolio) is extremely demanding in terms of

teachers' and students' workload. Portfolios demand a great amount of

selected evidence to be submitted and marked.

b. The assessment instrument has various potential sources of bias.

c. The work requires considerable time and resources that are not always

available in schools for teachers and students.

d. It requires constant interaction and collaboration between teachers and

students, and some students may receive different degrees of teacher aid.

e. The assessment criteria require skills that might or might not have been

fostered during the previous years of the course; problems of unfamiliarity

might affect students whose critical reflective capacities have not been

developed in the course prior to assessment using the new instrument.

f. The assessment instrument may not be applied evenly for all students and

schools and its inherent flexibility might be a factor in reducing uniformity

of implementation.

g. Standardisation and moderation procedures are not easy to develop and

their success is dependent on expensive in-service teacher training and

professional debate among the community of teachers and examiners.

8.6.5.2. Strengths

a. The assessment instrument provides valid and authentic tasks related to the

art and design curriculum.

b. The assessment instrument enhances learning and student motivation,

enabling students' individual voices to be heard.

c. The assessment instrument integrates a wide range of methods of inquiry,

media, and domains of art and design, allowing students to develop personal

projects in which they can personalise social issues and reveal important

cognitive and metacognitive skills.

d. The assessment instrument enables collaborative learning, peer-group

assessment and self-assessment.

e. The assessment procedures enable a significant degree of improved

reliability of results through common interpretations of criteria and standards

between teachers and assessors, as compared to the present Portuguese art

and design examinations.

8.7. Wider implications

The approach to designing assessment instruments for art and design presented in

Chapter 6 was influenced by Weir’s (2004) framework for validating tests. It was

adapted for art and design contexts and might also be of use for developing similar

exercises in other countries. However, it is recognised that this present proposal for

a new system of art examinations needs debate and further revision. The conceptual

framework developed in this thesis was specifically designed for the Portuguese

context. Nevertheless some of the research findings could be of use for others

elsewhere as these suggested that: (1) it is possible to make worthwhile ‘bottom-up’

changes in art and design assessment, (2) dialogue, collaborative work and some

degree of teachers’ ownership of the assessment is possible in external assessment, at

least with small-scale groups. Each of the negative and positive findings of this study

raises important questions for consideration when designing assessment instruments

based on portfolios or coursework. For example:

a. Portfolio assessment requires change to traditional

assessment practices and corresponding changes to curriculum and pedagogy.

b. Developing portfolios demands a culture of self-

reflection. Before using portfolios as assessment instruments teachers need to

reconceptualise their pedagogy to integrate portfolio processes into their

everyday routines and classroom teaching.

c. In portfolio assessment the teacher’s role is

fundamental in preparing

and managing the learning environment. For portfolio assessment teachers

have to be able to help students develop independent critical and self-

reflective skills.

d. Developing portfolios demands dialogue, mutual

understanding of aims

and intentions, and negotiation of tasks and approaches between teachers

and students.

e. Assessment procedures in the arts require a wide

variety of visual experience and debate within the art and design community.

The development of communities of assessors for inter-rater reliability

purposes can be used as positive professional development opportunities.

These issues may be useful as a starting point to generate further discussion and

experimentation for those who are interested in developing portfolios and assessment

by juries. As Eisner (2003, p.55) pointed out ‘…our resolutions will generate other

situations that will need further resolution’. It would have been unrealistic to try to

resolve all the problems of external assessment in art and design in this research.

The idea established in the literature, that alternative assessments such as

portfolios, combined with assessment procedures involving in-service teacher training

and moderation, can provide some degree of validity and reliability in external

assessment in art, was supported by the empirical studies. It is hoped that the conclusions of this

research might be of use for the Portuguese government, art teachers, art education

researchers, and researchers into evaluation and assessment everywhere who are

searching for more valid and reliable assessment instruments and procedures for art

and design.

8.8. Further research

This research study was not intended to resolve all the problems of validity and

reliability in external art assessment, and the problematic issues identified in

Chapters 1 and 2 were not fully resolved. The identified strengths and weaknesses of

the English system of art and design examinations proved to be a useful starting

point for designing a new framework for external assessment in the subject in

Portugal. However biases identified in the English examinations were also found in

the new model for external assessment proposed during the research. As Beattie

(1995, pp. 52-53) pointed out, the emphasis on written descriptions of learning

processes in reflective journals may be unfair to disadvantaged students, minorities,

some boys and non-standard language speakers. To prevent bias it will be necessary

to reconsider issues concerning teacher and student ownership of the assessment,

school resources and opportunities for learning.

The English external assessment system was able to combine formative and

summative assessment (see Chapter 5) and the framework developed for a new

system of assessment in Portugal also combined them. Nevertheless questions remain

about whether these assessment functions should be separate or combined.

Contemporary trends are to foster ‘assessment for learning’, assessment for which

the first priority is to serve the purpose of promoting students’ learning (Black et al,

2003) and a combined approach to assessment (Eisner, 2002). However Gipps and

Stobart (1993) claimed that different purposes require different models of assessment

and whereas they admit that it might be possible to design one assessment system

that measures performance for accountability and selection purposes, whilst at the

same time supporting the teaching and learning process, ‘… no one has yet done so’.

However this statement might be challenged in the case of portfolio assessment. Klenowski

(2002, p. 10), for example, believes that ‘…a portfolio of work can fulfil the full

range of various assessment purposes: accountability, summative assessment,

certification, selection, promotion, appraisal and formative assessment in support of

teaching and learning processes’. The research did not fully confirm Klenowski’s

views, because it would be necessary to study the long-term consequences of the new

assessment instrument to do this. Nevertheless the issue of combined formative and

summative assessment needs to be addressed in future research.

Other aspects of assessment that could be explored more fully in the future are the

relationship between curriculum and examinations and the development of other

alternative methods of assessment. Assessment systems, with their bias towards

easily testable achievements rather than depth or interconnectedness of

understanding, often reinforce the fragmentation embedded in the curriculum (White,

2003, p. 183). The disjointed nature of the Portuguese art and design curriculum

might be one reason for the relative invalidity of the assessment system, because

contemporary art and design work often links several disciplines or specialisms.

However it is hard to believe that it is the only reason and the question remains

whether the fragmentation of the curriculum is a consequence of an educational

system based on a post-nineteenth century tradition of compartmentalised and

‘objective’ examination methods or the primary cause of it.

The alternative form of assessment described in Chapter 6 achieved greater

validity but whether or not more equality and fairness was obtained is questionable.

Some students were disadvantaged because of gender bias and access to resources

and teacher aid. These problems may be related to the form of examinations in

general, which on a large scale, can be understood as power structures representing

‘…the desire to discipline an irrational social world in order to establish rationality

and efficiency’ (Broadfoot, 2000, p. ix). As Foucault (1977, p. 183) points out

examinations are potentially unfair, because they have ‘…the constraint of a

conformity that must be achieved’. Even if the required knowledge, skills,

competencies and criteria are established through dialogue and agreement, allowing

some degree of teachers’ and learners’ ownership, assessment still denies diversity

and plurality of approach for some groups of students and schools. More fairness

might be obtained by using negotiated assessment (Ross et al, 1993). Forms of

negotiated assessment were not used in this research; however the possibility of

negotiated assessment as an element in external assessment could be explored in

further research. Other methods and forms of assessment need to be investigated; as

Rayment (1999, p.193) suggests, it is necessary ‘…to identify methods of providing

valid and reliable evaluation procedures which can perform both formative and

summative functions’.

8.9. Reflections on the research methodology

The research method was constantly refined and readjusted during the period of the

study. Its hybridity provided flexibility for readjustments and allowed a balance

between statistically significant and meaningful outcomes. The mixture of qualitative

and quantitative methods provided considerable data for analysis (see for example

Appendices VIII, IX, X, XVII, XVIII). Some of the evidence proved superficial and

unnecessarily detailed – and analysing it was consequently time-consuming. On the

other hand the questionnaires, providing statistical information, revealed worthwhile

data for understanding current examinations, which directly informed the

design of the new assessment instrument. However interviews, documents and

observation generally were more helpful for these purposes. Furthermore the

response rate to the questionnaires on the current Portuguese art examinations was

relatively poor (44 teachers and 104 students). Every effort was made to obtain

questionnaire responses during conferences, teachers’ meetings, and through

correspondence. But conducting a successful national survey is difficult when a

single researcher undertakes research with no formal authority; consequently an

indicative sample is the best that can be expected.

Overall the research method was ambitious in trying to access views of participants

in two countries. However the programme as planned was followed thanks to the

generous help of various individuals and organisations. The fact that the research was

carried out by a single researcher limited the number of case studies and

consequently the possible validity of generalised results. The full potential of

methods of analysis such as multi-faceted Rasch measurement could not be realised

because of the restricted number of participants. On reflection, therefore, the research

ideally needed more resources and participants than initially planned.
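
For readers unfamiliar with the technique, a brief sketch may be useful (the notation is the conventional one for many-facet Rasch analysis, not taken from the data of this study). The model expresses the log-odds of a candidate being awarded category k rather than k-1 of a rating scale as an additive function of candidate ability, task difficulty, assessor severity and a category threshold:

\[ \log\left(\frac{P_{nijk}}{P_{nij(k-1)}}\right) = B_n - D_i - C_j - F_k \]

where \(B_n\) is the ability of candidate n, \(D_i\) the difficulty of task i, \(C_j\) the severity of assessor j and \(F_k\) the threshold of category k. Because stable estimates of the assessor-severity parameters require large numbers of candidates and ratings, the small samples available here limited what the method could show.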

8.10. Recommendations for assessment stakeholders

8.10.1. Students

In the trials of portfolio assessment the students were expected to think, to assess

themselves, to accept challenging expectations and be collaborative learners.

From the sample included in the research it was apparent that the impact of the new

assessment instrument based on portfolios increased students' in-depth study, active

and independent learning, awareness of their own learning strategies, motivation and

interest in their own achievements and performance. These benefits are compatible

with the findings of the Arts Propel project that also used portfolios as alternative

assessment instruments (Gardner, 1996). Inevitably this demanded increased work

and students who have grown used to being passive observers might resent having to

work harder. By comparison the current Portuguese tests do not require so much

work and evidence to be submitted. Nevertheless students felt there were benefits in

the new assessment system; it seemed more interesting and relevant. The new

instrument required them to take individual responsibility for developing a portfolio.

As Black et al (2003, p. 59) stress: ‘Learning cannot be done for the student; it has

to be done by the student’.

However in the light of the problems detected during the research concerning

students’ lack of self-monitoring skills, it is evident that if these modes of

assessment are to be used, then the coursework must be planned carefully in advance.

Planning should take into account available resources and the assessment

requirements and students should be encouraged to follow work plans in order to

keep to deadlines.

As Klenowski (2002) pointed out, it is not only teachers who need to learn about the

pedagogical implications of using portfolios for assessment and learning; students

need specific teaching and support to develop the necessary cognitive processes. It

was evident from the research that students need skills in independent study, group

work, self-reflection, self-assessment, self-evaluation and questioning. Some students

were not prepared to develop their portfolios because they lacked background skills

in critical inquiry, in defining problems, self-discipline and self-evaluation. The

teachers saw that the source of the problem was that their students lacked the

necessary skills both to judge specific problems in understanding and to set realistic

targets to remedy problems within reasonable time frames. However, where teachers

created classroom environments in which students worked together to understand

teachers' comments about their work, then peer and self-assessment provided the

training that students needed to judge their own learning and to begin to take action

to improve their performance. It was evident that changes are needed to the role of

students, but the period during which the role of the student changes needs to be

handled carefully and the students have to be supported as they learn to become

active, responsible learners. This requires time.

8.10.2. Teachers and Assessors

It was evident that to increase its validity the new assessment requires a long period

of time for students to become familiar with a culture of self-reflection that is only

established through long years of education. It also became increasingly clear, as

Black et al (2003, p. 59) argue, that the teachers also needed to train their

students to take responsibility for their own learning and assessment. The teacher’s

role is to ‘scaffold’ this process – that is, to provide a framework of appropriate

targets and to give support throughout the task of attaining them. Implementation of

changes in classroom assessment would call for profound changes in both teachers’

perceptions of their own role in relation to their students and in their classroom

practice. Knowledge acquisition must be conceptualised as an active process rather

than passive. Cognitive processes used by students such as self-regulation and self-

monitoring are fundamental to understanding individual development and

achievement and have to be fostered by teachers. This kind of assessment therefore

has to be viewed as an integral part of the curriculum combining summative and

formative forms. Before it can become widespread it would be necessary for

teachers to change prescriptive modes of teaching and probably to revise

fundamentally their rationales for art and design education.

8.10.3. Schools

If the new assessment instrument is to be further developed with a larger sample then

certain problems need to be acknowledged. Choices between validity and uniformity

have to be made bearing in mind that less uniformity will not necessarily reduce the

reliability of results but will increase the potential for bias. These problems will have

to be solved before full implementation of the instrument by providing greater and

more equal resources for schools and by promoting pedagogies that emphasise

cognitive growth and inquiry methods. The learning milieu is a fundamental resource

and schools must recognise its importance.

Another important consideration for schools is the allocation of time for teachers' in-

service training, meetings and marking. Teachers need to be motivated and to

understand the justification if they are to be expected to take on the burdens of

change. They need time to plan, discuss and mark students' work; they also need

opportunities for in-service training, peer-reviews or meetings in order to create a

community of assessors.

8.10.4. Universities

The research found that the new assessment instrument had clear advantages in terms

of preparation of students for further studies in art and design because the evidence

submitted required a broader range of knowledge, understanding and skills in art and

design (see p.275). In Portugal the majority of art and design faculties and schools

select their students on the basis of the current examination system. The proposed

new framework allows for wider and more appropriate evidence of students’ artistic

achievement in terms of working processes and quality of outcomes, which may be

of more value for accurate selection of art and design students in higher education.

8.10.5. Government

It is fully recognised that the proposed framework and assessment instrument are less

than perfect but they do offer some clear advantages over existing practices in

Portugal. It would need further revision, development and re-trials to be ready for

implementation on a regional or a national scale. But, more importantly and as a first

step, rationales, concepts and models of art and design education need to be radically

re-thought in order to align Portuguese art and design education at secondary level

with the needs of contemporary society. It is also necessary to revise present

concepts of examinations and external assessment so as to provide opportunities for

the development of much more valid assessment instruments.

In conclusion, it was a key finding of this research that an instrument based on

portfolios, which includes a range of learning domains and skills in art education,

offers more validity than the current Portuguese examinations and can provide

positive outcomes for teachers and students. In-service teacher training and

moderation procedures combined to increase the reliability of the assessment results

in the small-scale trials reported here. However the instrument and assessment

procedures offered less utility than the current examinations, requiring more

resources and more work from teachers and students. Nevertheless the advantages of

the proposed framework probably outweigh the disadvantages. If the Portuguese

government is minded to improve the system of external assessment in art and

design, undoubtedly, this will require a long overdue, comprehensive and radical

reform of the art and design curriculum and examinations.

Glossary

Ability
A mental trait or the capacity or power to do something (ALTE, 1998).

Achievement
Ability to demonstrate accomplishment of some outcome for which learning
experiences were designed (Armstrong, 1994).

Achievement levels
Behaviours along a continuum that represent degrees of attainment of a criterion
(Armstrong, 1994).

Alternative Assessment
These are ways of measuring student achievement using methodologies other than
pencil-and-paper tests, e.g. observation, checklists, portfolios, interviews, etc.

Art and design works


Art and design works are envisaged as the outcomes of purposeful activity, the
capacity to bring into being something meaningful, useful or valuable (Minkin, 1998),
which could be original or a re-formulation of a concept or artwork. The outcomes
should be of value in relation to the pre-established intentions and to a particular
audience or viewer.

Art and design


Art and design is taken in England to mean the National Curriculum subject so
defined in English legislation for general education, examined by the recognised
awarding bodies under the associated specifications provided for students in state
maintained and independent schools. The subject encompasses a range of practical
activities such as drawing, painting, printmaking, graphics, and crafts including
textiles, ceramics, metalwork and Information & Communication Technology (ICT),
in which the emphasis is on developing creativity, understood as ‘innovative
application of knowledge and skills' (NACCCE, 1999, p. 30). It may also include art,
craft and design history and critical and contextual studies in which the emphasis is
on developing critical thinking and knowledge and understanding of art and design
theory and practice.

Art Education
The process of learning to understand, make and adequately respond to the
complexities of the arts.

Art examinations
In the context of this research the term ‘art examinations' is used to refer to
examinations covering visual arts subjects.

Art criticism
Describing and evaluating the media, processes and meanings of works of visual art,
and making comparative judgements (NAEA, 1994).

Assess
To analyse and determine the nature and quality of achievement through means
appropriate to the subject (NAEA, 1994).

Assessment
The process of obtaining information that is used to give feedback about students’
progress, strengths and weaknesses i.e. make educational decisions about students
(Weir, 2004). The process of judging student behaviour or product in terms of some
criteria (Clark, 1975).

Assessor
Someone who assigns a score to a candidate’s performance in a test, using subjective
judgement to do so. Assessors are normally qualified in the relevant field, and are
required to undergo a process of training and standardisation. Also referred to as
examiner, marker or rater (ALTE, 1998).

Assessment objectives
Assessment objectives are the means by which the formal elements, processes and
practices can be defined and assessed to ensure that a coherent and meaningful
course has been followed.

Attainment targets
Descriptions of expected standards. A kind of performance matrix in which different
levels of achievement to be expected in students' works are described, based on a set
of agreed criteria. Attainment targets act as descriptors of performance,
knowledge, skills and understanding.

Authentic assessment
A characteristic of assessments that have a high degree of similarity to tasks
performed in the real world; a view that assessment should include task types which
resemble real life activities as closely as possible.

Bias
A test or item can be considered to be biased if one particular section of the
candidate population is advantaged or disadvantaged by some feature of the test or
item which is not relevant to what is being measured. Sources of bias may be
connected with gender, age, culture, etc. (ALTE, 1998)

Candidate
A test/examination taker. Also referred to as examinee.

Cognitive model

A theory concerning the way in which a person’s knowledge, in the sense of both
concepts and processes, is structured. This is important in examinations because
such a theory may have an effect on choice of instrument or examination content.

Competence
The knowledge or ability to do something. (ALTE, 1998)

Connoisseurship
The art of appreciation, the ability to define the quality of an object or environment
(Armstrong, 1994).

Construct
Can be viewed as definitions of abilities that permit us to state specific hypotheses
about how these abilities are or are not related to other abilities, and about the
relationship between these abilities and observed behaviour.

Context
A set of interrelated conditions that influence and give meaning to the development
and reception of thoughts, ideas or concepts and that define specific cultures and eras
(NAEA, 1994).

Criterion
A distinguishing property or characteristic of anything, by which its quality can be
judged or estimated, or by which a decision or classification can be made
(Sadler, 1987). A behaviour, characteristic, or quality of a product or performance
about which some judgement is made (Clark, 1975).

Criterion-referenced assessment


Means that achievement is being assessed in reference to some student outcome that
can be expected as a result of an educational experience (Armstrong, 1994).

Curricular materials
Documents such as curriculum guidelines and syllabuses, lists of objectives, lists of
criteria and lists of grade descriptors.

Descriptors
A brief description accompanying a band on a rating scale, which summarises the
degree of proficiency or type of performance expected for a candidate to achieve that
particular score (ALTE, 1998).

Discrimination
The power of an item to discriminate between weaker and stronger candidates.
Various indices of discrimination are used. Some (e.g. point-biserial, biserial) are
based on a correlation between the score on the item and a criterion, such as total
score on the test or some external measure of proficiency. Others are based on the
difference in the item’s difficulty for low and high ability groups (ALTE 1998).
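
For example (a standard formulation, not drawn from the glossary sources), the point-biserial index for a dichotomously scored item is

\[ r_{pb} = \frac{M_1 - M_0}{s_X}\sqrt{p(1-p)} \]

where \(M_1\) and \(M_0\) are the mean total scores of candidates answering the item correctly and incorrectly, \(s_X\) is the standard deviation of total scores and p the proportion answering correctly; higher values indicate stronger discrimination.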

Evaluation
A judgement of merit based on various measurements, notable events and subjective
impressions (Armstrong, 1994).

Examinations
Measure the attainment of a candidate at the end of a course of study. They may be
external or school based, or a combination of both, and are usually conducted in a formal
manner. Examinations may include a variety of instruments such as practical tests,
written essays, coursework and portfolios.

Examiner
The person who is responsible for judging a candidate’s performance in a test or
examination.

Expressiveness
Expressive features. Elements evoking affects such as joy, sadness, or anger (NAEA,
1994).

Expression
A process of conveying ideas, feelings, and meanings through selective use of the
communicative possibilities of the visual arts (NAEA, 1994).

Feedback
Comments of people involved in the testing process (examinees, administrators, etc.)
which provide a basis for evaluating that process. Feedback may be gathered
informally, or using specially-designed questionnaires (ALTE, 1998).

Formalism
A conception of aesthetics developed in the late 19th and early 20th centuries; it
focuses on the analysis of physical and perceptual characteristics of art objects and
involves the reduction of form to elements such as line, shape, and colour and
principles of design such as rhythm, balance, and unity (Freedman, 2003, p. 27).

Formative assessment
Testing which takes place during, rather than at the end of, a course or programme of
instruction. The results may enable the teacher to give remedial help at an early
stage, or change the emphasis of a course if required. Results may also help a
student to identify and focus on areas of weakness (ALTE, 1998).

Formative Evaluation
Ongoing evaluation of a process, which allows for that process to be adapted and
improved as it continues. It can refer to a programme of instruction (ALTE, 1998).

Global assessment
A method of scoring. The assessor gives a single mark according to the general
impression made by the work or performance produced, rather than by breaking it
down into a number of marks.

Grade

A score may be reported to the candidate as a grade, for example on a scale of A to
E, where A is the highest grade available, B is a good pass, C a pass and D and E are
failing grades (ALTE, 1998).

Grading
A process of assigning a symbol for some judgement of quality relative to some
criterion.

Holistic
Holistic, whether describing learning or assessment, refers to the conscious
awareness of the way parts interact and influence, looking globally rather than
analytically (Armstrong, 1994).

Holistic Scoring
Scoring based upon an overall impression (as opposed to traditional test scoring
which counts up specific errors and subtracts points on the basis of them). In holistic
scoring the rater matches his or her overall impression to the point scale to see how
the portfolio, product or performance should be scored.

Impact
The effects created by an assessment instrument, both in terms of influence on
general educational processes, and in terms of the individuals who are affected by
test results.

Instrument of assessment
Assessment tool. A method of gathering data about student performance
(Armstrong, 1994).

Instructions
General directions given to candidates, for example on the front page of the answer
paper or booklet, giving information about such things as how long the test lasts,
how many tasks to attempt and where to record their responses (ALTE 1998).

Internal consistency
A feature of a test, represented by the degree to which candidates’ scores on the
individual items in a test are consistent with their total score. Estimates of internal
consistency can be used as indices of test reliability; various indices can be
computed, for example, KR-20, Cronbach alpha (ALTE, 1998).
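
As an illustration (the formula is standard measurement theory rather than specific to any source cited here), Cronbach's alpha for a test of k items is

\[ \alpha = \frac{k}{k-1}\left(1 - \frac{\sum_{i=1}^{k}\sigma_i^{2}}{\sigma_X^{2}}\right) \]

where \(\sigma_i^{2}\) is the variance of scores on item i and \(\sigma_X^{2}\) the variance of total test scores; values closer to 1 indicate greater internal consistency.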

Inter-rater reliability
An estimate of test reliability based on the degree to which different assessors agree
in their assessment of candidates’ performance (ALTE, 1998).
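
By way of illustration (a standard index, not one prescribed by the sources cited here), Cohen's kappa corrects the observed agreement between two assessors for the agreement expected by chance:

\[ \kappa = \frac{p_o - p_e}{1 - p_e} \]

where \(p_o\) is the observed proportion of agreement and \(p_e\) the proportion expected by chance; \(\kappa = 1\) indicates perfect agreement and \(\kappa = 0\) agreement no better than chance.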

Intra-rater agreement
The degree of agreement between two assessments of the same sample of
performance made at different times by the same assessor (ALTE 1998).

Level

The degree of proficiency required for a student to be in a certain class or represented
by a particular test is often referred to in terms of a series of levels. These are
commonly given names such as ‘elementary', ‘intermediate', ‘advanced', etc.
(ALTE, 1998).

Mark
A symbolic number which represents the achievement level of the students' work.

Marker
Someone who assigns a score to a candidate’s responses to a written test. This may
involve the use of expert judgement, or, in the case of a clerical marker, the relatively
unskilled application of a mark scheme (ALTE, 1998).

Marking
Assigning a mark to a candidate’s responses to a test. This may involve professional
judgement, or the application of a mark scheme which lists all acceptable responses
(ALTE, 1998).

Measurement
Generally, the process of finding the amount of something by comparison with a
fixed unit, e.g. using a ruler to measure length. In the social sciences, measurement
often refers to the quantification of characteristics of persons (ALTE, 1998).

Method of assessment
Instruments and procedures used to measure student performance in meeting the
standards for a learning outcome. These assessments must relate to a learning
outcome, identify a particular kind of evidence to be evaluated, define tasks that
elicit that evidence and describe systematic scoring procedures.

Moderation
Moderation is the means by which the marks of different teachers in different
centres/schools are equated with one another.

Norm-referenced
If a test is norm-referenced it aims to place candidates on some sort of ordered scale,
so that they can be compared with one another (Alderson et al 1995).

Objective testing
Objective testing refers to items such as multiple-choice, true-false, and error-
recognition, amongst others, where the candidate is required to produce a response
which can be marked as either ‘correct’ or ‘incorrect’. In objective marking the
examiner compares the candidate’s response to the response or range of responses
that the item writer has determined is correct (Alderson et al 1995).

Perception
Visual and sensory awareness, discrimination, and integration of impressions,
conditions, and relationships with regard to objects, images and feelings (NAEA,
1994).

Portfolio
A portfolio is defined as a purposeful collection of selected student work, exhibiting
effort, progress and achievements in more than one area, and including student
participation in selecting the contents and self-reflection (Lindström, 1998).

Practicality
A practical assessment instrument is easy to administer and to score without
wasting too much time or effort.

Question
Sometimes used to refer to a task or item (ALTE, 1998).

Rating scale
A scale consisting of several ranked categories used for making subjective
judgements such as mark schemes or assessment matrices. Rating scales for
assessing performance are typically accompanied by grade descriptors which make
their interpretation clear.

Reliability
Reliability is the extent to which test scores are consistent: if candidates took the test
again tomorrow after taking it today, would they get the same result (assuming no
change in their ability)? There are three aspects to reliability: the circumstance in
which the test is taken, the way in which it is marked and the uniformity of the
assessment it makes. There are several ways of measuring the reliability of
‘objective' tests (test-retest, parallel form, split-half, KR-20, KR-21, etc.). The
reliability of subjective tests is measured by calculating the reliability of the marking.
This is done in several ways (inter-rater reliability, intra-rater reliability etc.)

Reliability coefficient
A measure of reliability, in the range 0 to 1. Reliability estimates can be based on
repeated administrations of a test (which should produce similar results) or where
this is not practicable, on some form of internal consistency measure. Sometimes
known as reliability index (ALTE, 1998).

Rubric
The instructions given to candidates to guide their responses to a particular test task
(ALTE, 1998).

Scale
A set of numbers or categories for measuring something. Four types of measurement
scale are distinguished – nominal, ordinal, interval and ratio (ALTE, 1998).

Score
The result obtained by a student on an assessment, expressed as a number.

Self-Assessment

Students reflect about their own abilities and performance, related to specified
content and skills and related to their effectiveness as learners, using specific
performance criteria, assessment standards, and personal goal setting.

Skill
Ability to do something, expertness.

Specifications
Specifications provide the official statement about the method of assessment and
how it assesses what it intends to assess.

Standard
'A definite level of excellence or attainment, or definite degree of any
quality viewed as a prescribed object of endeavour or as the recognised measure of
what is adequate for some purpose, so established by authority, custom, or
consensus' (Sadler, 1987).

Standards in the arts


Challenging, but attainable visions of art student outcomes (i.e. what students should
know and be able to do and appreciate, resulting from their art education experience).
Art standards for excellence can motivate change in art education programs,
curriculum, instruction and assessment. National standards in art education can guide
educational decisions about art programs and fill a gap in the large picture of art in
education (Armstrong, 1994).

Standardisation
The process of ensuring that assessors adhere to an agreed procedure and apply
rating scales in an appropriate way (ALTE, 1998).

Summative assessment
Testing which takes place at the end of a course or programme of instruction (ALTE,
1998).

Syllabus
A list of headings outlining a curriculum (Steers, 1979). A detailed document which
lists all the areas covered in a particular programme of study, and the order in which
content is presented (ALTE, 1998).

Task
A combination of rubric, input and response.

Techniques
Specific methods or approaches used in a larger process; for example, graduation of
value or hue in painting or conveying linear perspective through overlapping,
shading or varying size or colour (NAEA, 1994).

Test
Assessment instrument based on objectivity, usually pencil-and-paper tasks requiring
short answers through, for example, items like matching, multiple choice, alternative
items or essays.

Testing
Testing is one procedure used to obtain data for purposes of forming descriptions or
judgments about one or more human behaviours. Tests can be used to obtain
summative assessments and in external examinations, '…the typical testing situation
puts students in an artificial situation in the sense that their performance will be
appraised and rewarded accordingly' (Eisner, 1972, p. 205).

Traditional assessment instruments


Objective tests, short essays.

Validity
The degree to which an assessment instrument actually measures what it is supposed
to measure. Validity occurs when assessment procedure measures the performance
described in the objective, that it claims to measure (Armstrong, 1994). Validity is
the extent to which a test measures what it is intended to measure: it relates to the
uses made of test scores and the ways in which test scores are interpreted, and it
is therefore always relative to test purpose (Alderson et al, 1995).

Validation
The process of gathering evidence to support the inferences made from assessment
results. The extent to which assessment scores enable inferences to be made which
are appropriate, meaningful and useful, given the purpose of the assessment
instrument. Different aspects of validity are identified, such as content, criterion and
construct validity; these provide different kinds of evidence for judging the overall
validity of the assessment for a given purpose.

Value-added measures
Value-added measures measure the progress made by individual pupils in an
institution rather than raw outcome scores.

Visual arts education


The process of learning to understand, make and adequately respond to the
complexities of the visual arts.

Weighting
The assignment of a different number of maximum points to an item, task or
component in order to change its relative contribution in relation to other parts of the
same assessment instrument.
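
A small worked example may clarify (the figures are hypothetical, not taken from the study): if a portfolio component and a written component are each marked out of 20, but the portfolio is to count for twice as much, the portfolio mark is doubled before the marks are combined, so a candidate scoring 15 and 12 obtains

\[ (2 \times 15) + (1 \times 12) = 42 \text{ out of } 60. \]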

Bibliography

ALEXANDER, P.A. (1992). Domain Knowledge: Evolving themes and emergent


concerns. Educational Psychologist, 27(1), 33-51.

ALEXANDER, R.; BROADFOOT, P.; PHILLIPS, D. (Eds.) (1999). Learning from


Comparing: New directions in Comparative Educational Research. Oxford:
Symposium Books.

ALDERSON, J.C., CLAPHAM, C., & WALL, D. (1995). Language test


construction and evaluation. Cambridge, UK: Cambridge University Press.

ALDRICH, R. & WHITE, J. (1998). The National Curriculum beyond 2000:


the QCA and the aims of education. London: Institute of Education,
University of London.

ALLISON, B. (1982). Identifying the Core in Art and Design, Journal of Art and
Design Education, 1(1), pp.59-66.

ALLISON, B. (1986). Some Aspects of Assessment in Art and Design Education. In:
Ross, M. (Ed.). Assessment in Arts Education. Oxford: Pergamon Press.

ALLISON, B. (1986). Assessment in Art and Design. In Ross, M. (Ed.) Assessment


in the Arts. Oxford: Pergamon.

ALTE (1998). Multilingual Glossary of Language Testing Terms. Cambridge:


Cambridge University Press.

ADDISON, N. AND BURGESS, L. (2003). (Eds.) Issues in art and design teaching.
London: Routledge Falmer.

ADDISON, N. AND BURGESS, L. (2000). (Eds.) Learning to Teach Art and


Design in the Secondary School. London: Routledge Falmer

AFONSO, A.J. (1998). Estado, mercado, comunidade e avaliação. Revista Crítica


das Ciências Sociais 51. Coimbra: Centro de Estudos Sociais, pp. 109-136.

AMABILE, T. M. (1990). Within you, without you: The social psychology of


creativity, and beyond. In M. A. Runco and R.S. Albert (Eds.) Theories of
Creativity, pp.61-91. Newbury Park, CA: Sage.

AMABILE, T.M. (1983). The social psychology of creativity. N.Y.: Springer-Verlag.

ANASTASI, A. (1988). Psychological Testing (6th Edition). New York: Macmillan.

ANDERSON, T. (1994). The International Baccalaureate Model of Content-based
Art Education. Art Education 47 (2), pp.19-24.

APECV (2002). Parecer da Apecv sobre a prova de MTEP, 1ª chamada 2002. Porto:
Apecv Reports.

ARMSTRONG, C.L. (1994). Designing Assessment in Art. Reston: NAEA.

ASH, A.; SCHOFIELD, K. and STARKEY, A. (2000). Assessment and Examinations


in Art & Design. In: Addison, N. and Burgess, L. (Eds) Learning to Teach Art and
Design in the Secondary School. London: Routledge Falmer.

ASKIN, W. (1985). Evaluating the Advanced Placement Portfolio in Studio Art.


Princeton, New Jersey: Advanced Placement and The College Board.

ATKINSON, D. (1999). A Critical Reading of The National Curriculum for Art in


the Light of Contemporary Theorisations of Subjectivity. In: Broadside 2. Swift, J. &
Hughes, A. (Eds.) Birmingham: University of Central England.

ATKINSON, D. (2002). Art in Education: Identity and Practice.


Dordrecht/Boston/London: Kluwer Academic Publishers.

BACHMAN, L.F. (1990). Fundamental considerations in language testing. Oxford,


UK: Oxford University Press.

BAKER, R. (1997). Classical Test Theory and Item Response Theory in Test
Analysis. Preston, Lancashire: SDS Supplies Limited.

BARRETT, M. (1990). Guidelines for Evaluation and Assessment in Art and Design
Education 5-18 years. Journal of Art and Design Education, 9 (3), pp.233-313.

BEATTIE, D. K. (1996). Objects of Assessment: What Aspects of Learning Effects


Do We Assess: Products and/or Processes? In: Art&Fact: Learning Effects of Arts
Education, International Conference. World Trade Center Rotterdam. The
Netherlands, pp.44-60.

BEATTIE, D. K. (1997). Visual Arts Criteria, Objectives, And Standards: A Revisit.


Studies in Art Education 38 (4), pp.217-231.

BEATTIE, D. K. (1994). The Mini-Portfolio: Locus of a Successful Performance


Examination. Art Education, 47 (2), pp. 14-18.

BELL, J. (1989). Doing Your Research Project. Milton Keynes: Open University
Press.

BENAVENTE, A.; COSTA, A. F.; MACHADO, F. L.; NEVES, M. C. (1992). Do


Outro Lado da Escola. Lisboa: Editorial Teorema.

BENAVENTE, A. (1990). Escola, Professoras E Processos De Mudança. Lisboa:
Livros Horizonte.

BERNARDES, C. & MIRANDA, F.B: (2003). Portfolio: Uma Escola de


Competências. Porto: Porto Editora.

BEST, D. (1996). A Racionalidade do Sentimento. O Papel das Artes na Educação.


Colecção Perspectivas Actuais. Lisboa: Ed. ASA.

BEST, D. (1982). Objectivity and Feeling in the Arts. Journal of Art & Design
Education, 1 (3), pp.373-390.

BINCH, N. (1994). The Implications of the National Curriculum Orders for Art.
Journal of Art & Design Education, 13 (2).

BLACK, P.; HARRISON, C.; LEE, C.: MARSHALL, B.; WILIAM, D. (2003).
Assessment for Learning: Putting it into practice. Maidenhead: Open University
Press.

BLAIKIE, F. (1994). Approaches to Secondary Studio Art Assessment in North


America: 1970-1993, Journal of Art and Design Education, 13 (3), pp. 299-311.

BLAIKIE, F. (1994). Studio Art in North America: Advanced Placement, Arts


Propel, and International Baccalaureate. Studies in Art Education, 35 (4), pp.237-
251.

BLAIKIE, F. (1996). Qualitative Assessment of Studio Art: Problems, Definitions


and Solutions. Canadian Review of Art Education, 23 (1), pp.17-30.

BLISS, J., MONK, M. & OGBORN, J. (1983). Qualitative Data Analysis for
Educational Research. London: Croom Helm.

BOAVIDA, J. & BARREIRA, C. (1993). Nova Avaliação: Novas Exigências.


Inovação 6. IIE.

BOUGHTON, D. (1994). Evaluation and Assessment in Visual Arts Education.


Geelong, Australia: Deakin University Press.

BOUGHTON, D. (1996). Evaluating and Assessing Art Education: Issues and


Prospects. In Boughton, D., Eisner, E. & Ligtvoet, J. (Eds.) Evaluating and Assessing
the Visual Arts in Education. New York: Teachers College Press.

BOUGHTON, D. (1996)b. Assessment of Student Learning in the Visual Arts. In:


Translations From Theory To Practice. Reston, VA: NAEA.

BOUGHTON, D. (1996)c. Assessing Learning Effects in the Visual Arts: What


Criteria Should Be Used? In: Art & Fact: Learning Effects of Arts Education,
International Conference 27/28 March 1995. The Netherlands: World Trade Center
Rotterdam, pp. 72- 86.

BOUGHTON, D. (1999). Framing Art Curriculum and Assessment Policies in
Diverse Cultural Settings. In: Mason, R. and Boughton, D. (Eds.) Beyond
Multicultural Art Education: International Perspectives. New York: Waxmann,
pp. 331-348.

BOUGHTON, D. (1997). Reconsidering Issues of Assessment and Achievement


Standards in Art Education. Studies in Art Education, 38 (4), pp.199-213.

BOWDEN, J. (2001). Boys' under-achievement in art and design: A project


summary. A’N’D: The Newsletter of The National Society for Education in Art, nº2:
Autumn 2001, p.9.

BROADFOOT, P. (1992). Multilateral Evaluation: a case study of the national


evaluation of records of achievement (PRAISE) project. British Educational
Research Journal 18 (3), pp. 245-257.

BROADFOOT, P. (1996). Education, Assessment and Society. Milton Keynes: Open


University Press.

BROADFOOT, P. (1998). Records of Achievement and the Learning Society: a tale


of Two Discourses. Assessment in Education 5 (3), pp. 447-477.

BROADFOOT, P. (2000). Preface in Ann Filer (Ed.) Assessment: Social Practice


and Social Product. London: RoutledgeFalmer, pp. ix-xiii.

BROWN, N.C. (1997). The Meta-Representation of Standards, Outcomes and


Profiles in Visual Arts Education. Australian Art Education 20 (1/2), pp. 34-43.

BROWN, S. (1994). Assessment: A Changing Practice. In Moon, B. & Shelton


Mayers, A. (Eds) Teaching And Learning In The Secondary School. London: Open
University Press

BTEC. (1986). Assessment and Grading: General Guide. London: Business and
Technician Education Council.

BURGESS, R.G. (1982). The Unstructured Interview as a Conversation. In: Burgess,


R. C. (Ed.) Field Research: A Sourcebook and Field Manual. London: Allen &
Unwin.

BURGESS, R.G. (1993). (Ed.) Educational Research & Evaluation for Policy and
Practice? London: Falmer Press.

BURGESS, L. & ADDISON, N. (2000). Contemporary art in schools: why bother? In:
Hickman, R. (Ed) Art education : meaning, purpose and direction. London:
Continuum.

BURKHART, R. (1965). Evaluation of Learning in Art. Art Education, 19 (4),
pp.3-5.

CARDINET, J. (1986). Évaluation Scolaire et Mesure. Bruxelles: De Boeck-Wesmael.

CARLINE, R. (1968). Draw They Must. London: Edward Arnold Publishers.

CNEES (1998). Relatório sobre os exames de 1998. Portuguese Council of National


Examinations of Secondary Education.

CHALMERS, G. (1981). Art Education as Ethnology. Studies in Art Education 22


(3), pp.6-14.

CHAPMAN, L. (1978). Evaluation of Learning in Art. In: Approaches to Art in


Education. New York: Teachers College Press.

CLARK, G. (1975). Evaluation in Art Education: Less Subconscious And More


Intentional. In D. J. Davis (Ed.). Behavioural Emphasis in Art Education, pp. 43-50.
Reston, VA: National Art Education Association.

CLARK, G.; ZIMMERMAN, E. and ZURMUEHLEN, M. (1987). Understanding


Art Testing. Reston, VA: National Art Education Association.

CLARK, G. (1995). Clark's Drawing Abilities Test. Bloomington: Arts Publishing Co.
Inc.

COHEN, L. and MANION, L. (1994). Research Methods in Education (4th Ed.).


London: Routledge.

COLLINS, G. and SANDELL, R. (1987). Women's Achievements in Art: An Issues


Approach for the Classroom. Art Education, 40(3),pp.12-21.

COMISSÃO INTERNACIONAL SOBRE EDUCAÇÃO PARA O SÉCULO XXI:


UNESCO (1996). Educação Um Tesouro A Descobrir. Porto: Colecção Perspectivas
Actuais: Ed. ASA.

CRESWELL, J.W. (1998). Qualitative Inquiry And Research Design, Choosing


Among Five Traditions. Thousand Oaks: Sage.

CROSSLEY, M. & BROADFOOT, P. (1992). Comparative and International


Research In Education: scope, problems and potential. In: British Educational
Research Journal 18 (2).

CRONBACH, L.J. (1990). Essentials of Psychological Testing (5th edition). New


York: Harper and Row.

CRONBACH, L.J. (1971). Test validation. In: Thorndike, R.L. (Ed.) Educational
Measurement (2nd edition). Washington, DC: The American Council on Education
and The National council on Measurement in Education.

CSIKSZENTMIHALYI, M. (1990). The domain of creativity. In M. A. Runco and
R.S. Albert (Eds.) Theories of Creativity. Newbury Park, CA: Sage, pp.190-212.

DASH, P. (1999). Thoughts on a Relevant Curriculum for the 21st Century. Journal
of Art and Design Education, 18 (1), pp. 123-127.

DAVIES, I. K. (1987). Art and Design in the GCSE: Objectives, Criteria and
Assessment. Journal of Art and Design Education, 6 (1), pp.51- 65.

DARLING-HAMMOND, L. (1994). Performance-Based Assessment and


Educational Equity. Harvard Educational Review, Symposium. Equity in
Educational Assessment. 64 (1), pp.5-29.

DEPARTAMENTO DO ENSINO SECUNDÁRIO: DES (1992). Orientações e


Gestão dos Programas: Oficina de Artes. Lisboa: M.E.

DEPARTAMENTO DO ENSINO SECUNDÁRIO: DES (1992). Orientações e


Gestão dos Programa da Disciplina De Materiais E Técnicas de Expressão
Plástica:. Lisboa: M.E.

DEPARTAMENTO DO ENSINO SECUNDÁRIO: DES (2000). Revisão Curricular.


Lisboa: M.E.

DEPARTAMENTO DO ENSINO SECUNDÁRIO (2001). Resultados dos exames


do 12º ano. Lisboa: Ministério da Educação.

DEPARTAMENTO DO ENSINO SECUNDÁRIO (1996): Instruções para a


realização dos exames. Lisboa: Ministério da Educação.

DEPARTMENT FOR EDUCATION AND SKILLS (2003). 14–19: opportunity and


excellence. Nottinghamshire: DfES Publications.

DENZIN, N.K. & LINCOLN, Y.S. (1998). Strategies of Qualitative Inquiry.


Thousand Oaks: SAGE.

D'HAINAUT, L. (1977). Des Fins aux Objectifs de l'Education. Bruxelles:
Editions Labor.

DOBBS, S. (1992). The DBAE Handbook: An overview of Discipline Based Art


Education, Santa Monica: Getty Center for the Arts.

DEPRYCK, K. (1993). Paper presented at Creativity 93 World Congress in Madrid.

DEWEY, J. (1950). Experience and Education. New York: Macmillan Publishing Co.

DIAZ DE RADA, V. (1999). Técnicas de Análisis de Datos para Investigadores


Sociales: Aplicaciones prácticas con SPSS para Windows. Madrid: Ra-Ma.

DORN, C. M. (2002). The teacher as stakeholder in student art assessment and art
program evaluation. Art Education 55 (4), pp. 40-45.

EARLE, D.M. (1986). Art Examinations in Secondary Schools. London: Institute of


Education, PhD thesis.

EÇA, T. (1999). Assessment of Creative Work in Portuguese MTEP External


Examinations. London: Roehampton Institute, MA thesis.

EDEXCEL (2000). GCE A level Art & Design: Guide for Teachers and Moderators.
London: Edexcel.

EDEXCEL (2000). A Student's Guide to the AS and Advanced GCE in Art & Design.
London: Edexcel.

EISNER, E. (1972). Educating Artistic Vision. London: Macmillan.

EISNER, E. (1985). The Educational Imagination. New York: Macmillan.

EISNER, E. (1986). The Art of Educational Evaluation. Philadelphia: The Falmer


Press.

EISNER, E. (1988). Structure and Magic in Discipline-Based Art Education.


Journal of Art and Design Education 7 (2), pp.185-196.

EISNER, E. (1991). The Enlightened Eye: Qualitative Inquiry and the Enhancement
of Educational Practice. New York: Macmillan.

EISNER, E. (1992). The Misunderstood Role of the Arts in Human Development. Phi
Delta Kappan, April 1992.

EISNER, E. (1998). The Kind of Schools We Need. Personal Essays. Portsmouth:


Heinemann.

EISNER, E. (2003). The Arts and the Creation of Mind. New Haven & London:
Yale University Press.

EISNER, E. (2003). Qualitative research in the new millennium. In: Addison, N. and
Burgess, L. (2003). (Eds.) Issues in art and design teaching . London: Routledge
Falmer , pp: 52- 60.

EMERY, L. (1996). Heuristic Inquiry. Australian Art Education 19 (3), p. 29.

EFLAND, A, KOROSCIK, J. and PARSONS, M. (1991). Assessing Art Learning


Based on Novice-Expert Paradigms: A Progress Report. Unpublished manuscript.
The Ohio State University. Ohio, U.S.A.

EFLAND, A, FREEDMAN, K. & STUHR, P. (1996). Postmodern Art Education:
An Approach To Curriculum. Reston, Virginia: NAEA.

EPPI Centre (2004) A Systematic Review of the Impact of Formal Assessment on


Secondary School Art and Design Education. Evidence for Policy and Practice
Information and Co-ordinating Centre: University of London, Institute of Education,
Social Science Research Unit.

ESTRELA, A. (1994). Avaliações em Educação: Novas Perspectivas. Porto: Porto


Editora.

EU COMENIUS 3.1/SOCRATES & YOUTH (1999). Portfolio assessment in


secondary art education and final examination. Report of project 40966-CP-1-97-1-FI-
C31, June 1999. Helsinki: UIAH, University of Art and Design, Department of Art
Education.

FELDHUSEN, J.F. & GOH, B.E. (1995). Assessing and Accessing Creativity: An
Integrative Review of Theory, Research, and Development. Creativity Research
Journal 8 (3), pp. 231-247.

FERNANDES, D. (1992). Práticas e Perspectivas da Avaliação (Dois anos de


experiências no Instituto de Inovação Educacional). Lisboa: I.I.E.

FIELDING, R. (1996). A Justification for Subjectivity in Art Education Research.


Australian Art Education 19 (2).

FIELDING, R. (1998). Creative Interpretation in Art Criticism. Australian Art


Education 21 (3).

FILER, A. (2000). (Ed.) Assessment: Social Practice and Social Product. London:
Routledge Falmer.

FLINDERS, D. & MILLS, G.E. (1993). (Eds.) Theory and Concepts in Qualitative
Research. N.Y.: Teachers College Press.

FODDY, W. (1994). Constructing Questions for Interviews and Questionnaires.


Cambridge University Press.

FOUCAULT, M. (1977). Discipline and Punish. New York: Vintage Books.

FREEDMAN, K. (1993). The social production of computer graphics and art


education. In Muffoletto, R. and Knupfer, N. (Eds.) Computers in education: Social,
political, historical perspectives. Hampton, NJ: Hampton Press.

FREEDMAN, K. (2003). Teaching Visual Culture: Curriculum, Aesthetics and the


Social Life of Art. N.Y.: Teachers College, Columbia University.

FREEDMAN, K. (2003)b. Recent shifts in US art education. In: Addison, N. &


Burgess, L. (Eds.) Issues in art and design teaching. London: RoutledgeFalmer, pp.
8-18.

GENTILE, R., & MURNYACK, N. (1989). How shall Students be Graded in
Discipline-Based Art Education? Art Education 42 (6), pp.33-41.

GARDNER, H. & GRUNBAUM, J. (1986). The assessment of artistic thinking:


comments on the national assessment of educational progress in the arts.
Unpublished paper, Harvard Project Zero.

GARDNER, H. (1990). Multiple Intelligences: Implications for Art and Creativity.


In Moody, W. J. (Ed.) Artistic Intelligences: Implications for Education. New York:
Teachers College Press.

GARDNER, H. (1990)b. The Assessment of Student Learning in the Arts.


Conference Paper. Bosschenhoofd: Netherlands.

GARDNER, H. (1992). Assessment in Context: The Alternative to Standardized


Testing. In Gifford, B. & O'Connor, M.C. (Eds.), Changing Assessment: Alternative
Views of Aptitude, Achievement and Instruction. Boston: Kluwer.

GARDNER, H. (1996). The Assessment of Student Learning in the Arts. In:


Boughton, D., Eisner, E. & Ligtvoet, J (Eds.) Evaluating and Assessing the Visual
Arts in Education. New York: Teachers College Press.

GARDNER, H. (1999). Intelligence Reframed: Multiple Intelligences for the 21st


Century. New York: Basic Books.

GETAP: MINISTÉRIO DA EDUCAÇÃO (1992). Programa da Disciplina de Oficina


de Artes. Lisboa: M.E.

GETAP: MINISTÉRIO DA EDUCAÇÃO (1992). Programa da Disciplina Materiais


E Técnicas de Expressão Plástica. Lisboa: M.E.

GIPPS, C. AND STOBART, G. (1997). Assessment: A teacher's guide to the issues.


London: Hodder & Stoughton.

GOSWAMI, A. (1996). Creativity and the Quantum: A unified Theory of Creativity.


Creativity Research Journal 9 (1), pp.47-61.

GOUGH, S. & SCOTT, W. (2000). Exploring the Purposes of qualitative Data


Coding in Educational Enquiry: insights from recent research. Educational Studies
26 (3).

HALPIN, T. (2003). Teachers caught cheating to help pupils make grade. In: The
Times, October 11, 2003, p. 1.

HANNAN, B. (1985). Assessment and Evaluation in Schooling. Victoria: Deakin


University Press.

HANS, N. (1964). Comparative Education. London: Routledge.

HARDY, T. (2001). Farewell to the Wow. Times Educational Supplement, 16th
November 2001, p. 7.

HARDY, T. (2002). AS Level Art: Farewell to the ‘Wow’ Factor. The International
Journal of Art and Design Education 21 (1), pp.52- 59.

HARLEN, W., GIPPS, C., BROADFOOT, P. & NUTTALL, D. (1994). Assessment


and the Improvement of Education. In: Moon, B. Shelton Mayes, A. (Eds.) Teaching
and Learning in the Secondary School. London: Open University Press.

HARGREAVES, A (1984). Experience counts, theory doesn't. How teachers talk


about their work. Sociology of Education. 57, pp.244-254.

HARGREAVES, D.; GALTON, M.J. & ROBINSON, S. (1996). Teachers’


assessments of primary children’s classroom work in the creative arts. Educational
Research 38 (2), pp. 199-211.

HASSELGREN, A. (1998). Small words and valid testing. PhD thesis. Bergen:
University of Bergen, Department of English.

HAUSMAN, J. (1994). Standards and Assessment. Art Education 47 (2), pp. 9-13.

HAUSMAN, J. (1995). Evaluation in Art Education. Australian Art Education, 18


(3), pp. 25-29.

HENNING, G. (1987). A guide to language testing. Cambridge, Mass: Newbury


House.

HENRY, C. (1990). Grading Student Artwork: A Plan for Effective Assessment. In:
Little, B.E. (Ed.) Secondary Art Education: An Anthology of Issues. Reston, Virginia:
NAEA, pp.: 61- 68.

HERMAN, J.L.; KLEIN, D.C.D. AND WAKAI, S.T. (1997). American Students'
Perspectives on Alternative Assessment: Do they Know It's Different? Assessment in
Education 4 (3), pp.339-52.

HERMANS, P (1996). Why Assess? In: Art & Fact: Learning Effects of Arts
Education, International Conference 27/28 March 1995. The Netherlands: World
Trade Center Rotterdam, pp: 97-107.

HENNESSY, B. A. (1994). The Consensual Assessment Technique: An Examination


of the Relationship between Ratings of Product and Process Creativity. Creativity
Research Journal, 7(2), pp.193- 208.

HERNANDEZ, F. (1997). Educación Y Cultura Visual . Sevilla: Morón.

HERNANDEZ, F. (1998). La Necessidad de Una Perspectiva Critica en la


Educacion Artistica y la Enseñanza de las Artes . Ugo (21). Barcelona: Universidad
de Barcelona

HERNANDEZ, F., VALLS, M.R. & RODRIGUEZ, J.M.B. (1998). Pedagogia de l'Art: Identitat de l'Artista, Context i Ensenyament de les Arts. Barcelona: Universitat de Barcelona.

HEYFRON, V. (1986). Assessment in Art and Design. In: Ross, M. (Ed.) Objectivity and Assessment in the Arts. Oxford: Pergamon.

HICKMAN, R. (2000). (Ed.) Art Education 11-18: Meaning, Purpose and Direction. London: Continuum.

HITCHCOCK, G. & HUGHES, D. (1995). Research and the teacher. New York: Routledge.

HOLMES, E.R. & VAN DE GRAAFF (1973). Relevant Methods in Comparative Education. UNESCO.

ILTA (2000). Code of Ethics for ILTA. International Language Testing Association. [Available March 2000: http://www.dundee.ac.uk/languagestudies/Itest/ilta.html]

INSTITUTO DE INOVAÇÃO EDUCACIONAL (1995). Estudo Comparativo dos sistemas de avaliação em quatro países europeus. Lisboa: IIE.

JAMESON, F. (1998). La Condición Posmoderna: Un Informe Sobre el Ser. Ugo (21). Barcelona: Universidad de Barcelona, pp. 3-20.

KARPATI, A. (1994). Hungarian Examinations in Visual Arts - the academic tradition. Paper presented at the 3rd European Congress of InSEA, Lisbon, 17-21 July 1994.

KELLAGHAN, T. & MADAUS, G.F. (1991). National Testing: Lessons for America from Europe. Educational Leadership 49, pp. 87-93.

KINNEAR, T.C. & TAYLOR, J.R. (1996). Marketing Research (5th Ed.). New York: McGraw-Hill, Inc.

KLENOWSKI, V. (2003). Developing Portfolios For Learning And Assessment. London: RoutledgeFalmer.

LAMBERT, D. & LINES, D. (2000). Understanding Assessment. London: RoutledgeFalmer.

LANE, R. (2003). Liverpool Teachers' Exam Concerns. A'N'D (10). National Society for Education in Art & Design, issue Winter 2003/4.

LINDSTRÖM, L. (1997). Integration, Creativity, or Communication? Paradigm Shifts and Continuity in Swedish Art Education. Arts Education Policy Review 99 (1).

LINDSTRÖM, L. (1999). The Multiple Uses of Portfolio Assessment. In: Piironen, L. (Ed.) Portfolio Assessment in Secondary Art Education and Final Examination. University of Art and Design Helsinki (Finland), Department of Art Education, pp. 7-16. Report of EU Comenius 3.1 project.

LINDSTRÖM, L. (1999b). Portfolio Assessment of Creative Skills in the Visual Arts. Paper presented at the 29th Congress of the Nordic Educational Research Association (NERA), Stockholm, 15-18 March 2001.

LINCOLN, Y.S. & GUBA, E.G. (1985). Naturalistic Inquiry. London: Sage Publications.

LEMOS, V. (1993). O Critério do Sucesso: Técnicas de Avaliação da Aprendizagem. Lisboa: Texto Editora.

LEARY, A. (1986). Art, Assessment and Ethnocentrism. Journal of Art & Design Education 5 (1 & 2).

LINACRE, M. (1989). Many-faceted Rasch measurement. Chicago, IL: MESA Press.

LYOTARD, J. (1984). The Postmodern Condition: a Report on Knowledge. Manchester: Manchester University Press.

MACE, M. (1997). Toward an Understanding of Creativity Through a Qualitative Appraisal of Contemporary Art Making. Creativity Research Journal 10 (2 & 3), pp. 265-278.

McNAMARA, T.F. (1996). Measuring Second Language Performance. New York: Addison Wesley Longman.

MACGREGOR, R.N. (1992). A Short Guide to Alternative Practices. Art Education 45 (6), pp. 34-38.

MACGREGOR, R.N. (1996). What You See and What You Get: Assessment and Its Constituents. In: Art & Fact: Learning Effects of Arts Education, International Conference 27/28 March 1995. The Netherlands: World Trade Center Rotterdam.

MACKINNON, D. & STATHAM, J. (1999). Education in the UK: Facts & Figures (3rd Ed.). London: Hodder & Stoughton/The Open University.

MADAUS, G. (1994). A technological and historical consideration of equity issues associated with proposals to change the nation's testing policy. Harvard Educational Review 64 (1), pp. 76-95.

MASON, R. & BOUGHTON, D. (1999). (Eds.) Beyond Multicultural Art Education: International Perspectives. New York: Waxmann.

MARTINS, G. (1999). Práticas Avaliativas, Momentos Formativos. MA thesis. Lisboa: Universidade Católica de Lisboa.

MARTINDALE, C. (1999). Aesthetic and Cognition. Paper presented at the conference Educação Estética - Abordagens Transdisciplinares. Lisboa: Fundação Gulbenkian (September 1999).

MESSICK, S. (1971). Validity. In: Thorndike, R.L. (Ed.) Educational Measurement (2nd edition). Washington, D.C.: The American Council on Education and The National Council on Measurement in Education.

MESSICK, S. (1989). Meaning and Values in Test Validation: The Science and Ethics of Assessment. Educational Researcher 18 (5), pp. 10-11.

MESSICK, S. (1992). Validity of Test Interpretation and Use. In: Alkin, M.C. (Ed.) Encyclopedia of Educational Research (6th edition). New York: Macmillan.

MIRZOEFF, N. (1999). An Introduction to Visual Culture. London/New York: Routledge.

MILES, M.B. & HUBERMAN, A.M. (1994). Qualitative Data Analysis: An Expanded Sourcebook (2nd Ed.). London: Sage Publications.

MINISTÉRIO DA EDUCAÇÃO: Diário da República (1993). Despacho Normativo nº 338/93. Lisboa.

MINISTÉRIO DA EDUCAÇÃO: Diário da República (1993). Despacho Normativo nº 162/ME/9. Lisboa.

MCFEE, J. (1986). Cross-Cultural Inquiry into the Social Meaning of Art: Implications for Art Education. Journal of Cross-Cultural and Multicultural Research in Art Education 4 (1), pp. 6-16.

MICHAEL, J. (1980). Studio Art Experience: The Heart of Art Education. Art Education 33 (2), pp. 15-19.

MINKIN (1998). Inviting A Response to the Agenda. Draft for discussion in the Consultative Group on Creative Development. Unpublished paper.

MOURA, A. (1999). Art Patrimony in Portuguese Middle Schools: Problems of Cultural Bias. In: Boughton, D. & Mason, R. (Eds.) Beyond Multicultural Education. NY: Waxmann, pp. 115-133.

MOUSTAKAS (1990). Heuristic Research. USA: Sage.

MYFORD, C.M. & WOLFE, E.W. (2000). Monitoring Sources of Variability Within the Test of Spoken English Assessment System. Educational Testing Service, Report 65. [Available June 2003: http://www.toefl.org.]

MUMFORD, M.D., BAUGHMAN, W., MAHER, M.A., COSTANZA, P. & SUPINSKI, E.P. (1997). Process-Based Measures of Creative Problem-Solving Skills: IV. Category Combination. Creativity Research Journal 10 (1), pp. 59-71.

NACCCE (1999). All Our Futures: Creativity, Culture and Education. London: DCMS and DfEE.

NEAB (1999). Art and Design (Advanced): Syllabus number 4191. Northern Examinations and Assessment Board. [Available January 2000: http://www.neab.ac.uk/syllabus/arts/artdes/ss992191.htm]

NEWTON, P. (1997). Examining Standards Over Time. Research Papers in Education 12 (3), pp. 227-248.

NAEA: NATIONAL ART EDUCATION ASSOCIATION (1994). The National Visual Arts Standards. Reston, VA: NAEA.

NAEA: NATIONAL ART EDUCATION ASSOCIATION (1994). Visual Arts Assessment and Exercises Specifications. Reston, VA: NAEA.

NCTE (2000). Developing a Test Taker's Bill of Rights. Milwaukee: NCTE Annual Meeting.

NORUSIS, M.J. (2000). SPSS 10.0: Guide to Data Analysis. New Jersey: Prentice Hall.

NOVAK, J.D. & GOWIN, D.B. (1984). Learning How to Learn. Cambridge: Cambridge University Press.

NUTTALL, D. (1990). Talk presented to Chief Examiners for General Certificate of Secondary Education Examiners in Art and Design. Bath, March 1990, UK. Unpublished paper.

NUTTALL, D. (1987). The Validity of Assessment. European Journal of Psychology of Education II (2).

OECD (1972). Programas de Ensino a partir de 1980. Organisation for Economic Cooperation and Development report.

O'SULLIVAN, B. (2001). Multi-facet Rasch Practical. Unpublished paper.

PACHECO, J.A. (1995). A Avaliação Dos Alunos Na Perspectiva Da Reforma. Porto: Porto Editora.

PATEMAN, T. (1988). Key Concepts: A Guide to Aesthetics, Criticism and the Arts in Education. London: The Falmer Press.

PIIRONEN, L. (1999). (Ed.) Portfolio Assessment in Secondary Art Education and Final Examination. University of Art and Design Helsinki (Finland), Department of Art Education.

PISA (2000). Mesurer les connaissances et les compétences des élèves: Lecture, Mathématiques et Sciences: l'évaluation de PISA 2000. Programme international pour le suivi des acquis des élèves (PISA).

PRENTICE, R. (1995). (Ed.) Teaching Art and Design: Addressing Issues and Identifying Directions. London: Continuum.

POSTLETHWAITE, N.T. (1988). The Encyclopedia of Comparative Education and National Systems of Education. NY: Pergamon Press.

PRICE, E. (1982). National Criteria for a Single System of Examining at 16+ with reference to Art and Design. Journal of Art and Design Education 1 (3), pp. 397-407.

QCA (1999). Curriculum guidance for 2000. [Available July 2000: http://www.qca.org.uk.]

QCA (2000). England: Context and principles of Education. [Available July 2000: http://www.inca.org.uk/.]

QCA (2000). GNVQ Code of Practice. London: QCA.

QCA (2000). GCSE? A level? Vocational A level (Advanced GNVQ)? AS qualification? Entry level award? NVQ. [Available July 2000: http://www.qca.uk.]

QCA (2000). GCSE and GCE A/AS Code of Practice. London: QCA.

RAGIN, C. (1987). The Comparative Method. Los Angeles: University of California Press.

RAYMENT, T. (1999). Assessing National Curriculum Art AT2, Knowledge and Understanding: A small-scale project at Key Stage 3. Journal of Art and Design Education 18 (2), pp. 188-193.

READ, H. (1943). Education Through Art. London: Faber and Faber.

REID, L.A. (1986). Assessment in Art Education. In: Ross, M. (Ed.) "Art" and the Arts. Oxford: Pergamon.

RESNICK, L. & RESNICK, D. (1992). Assessing the Thinking Curriculum: New Tools for Educational Reform. In: Gifford & O'Connor (Eds.) Changing Assessment: Alternative Views of Aptitude, Achievement and Instruction. Boston: Kluwer.

RIBEIRO, C. (1993). Educação e Arte. Braga: Revista Portuguesa da Educação 6 (1), pp. 103-108.

RIBEIRO, L.C. (1997). Avaliação da Aprendizagem. Educação Hoje. Cacém: Texto Editora.

ROBINSON, K. (1982). The Arts in Schools: Principles, Practice and Provision. London: Calouste Gulbenkian Foundation.

ROBINSON, V.M.J. (1993). Problem Based Methodology: research for the improvement of practice. Oxford: Pergamon.

ROBSON, C. (1993). Real World Research. Oxford: Blackwell.

ROSS, M. (1978). The Creative Arts: Evaluation, Assessment and Certification. Oxford: Pergamon.

ROSS, M. (1986). Against Assessment. In: Ross, M. (Ed.) Assessment in Art Education. Oxford: Pergamon.

ROSS, M. (1989). The Claims of Feelings: Readings in Aesthetic Education. UK: Heinemann.

ROSS, M. (1992). Assessment of Arts Achievement in the United Kingdom: The Reflective Conversation. Journal of Aesthetic Education 26 (3).

ROSS, M., RADNOR, H., MITCHELL, S. & BIERTON, C. (1993). Assessing Achievement in the Arts. Buckingham: Open University Press.

SÁ CHAVES, I. (2000). Portfolios Reflexivos. Aveiro: Universidade de Aveiro.

SADLER (1987). Specifying and Promulgating Achievement Standards. Oxford Review of Education. ERIC Digest ED348238. Bloomington.

SANTOS, A.S., FRAGATEIRO, C., VIEIRA, J., PEIXINHO, J., DIAS, J.A. & QUADROS, A. (1996). Ensino Artístico. Porto: Edições Asa.

SCHÖNAU, D.W. (1999). The Future of Nation-Wide Testing in the Visual Arts. Art Education 18 (2), pp. 183-187.

SCHÖNAU, D.W. (1994). Final Examinations in the Visual Arts in the Netherlands. Art Education 47 (2), pp. 35-39.

SCHÖNAU, D.W. (1996). Nationwide Assessment of Studio Work in the Visual Arts: Actual Practice and Research in the Netherlands. In: Boughton, D., Eisner, E. & Ligtvoet, J. (Eds.) Evaluating and Assessing the Visual Arts in Education. New York: Teachers College Press.

SCHÖNAU, D.W. (1999). A Survey on Visual Art Exams in Europe. In: Piironen, L. (Ed.) Portfolio assessment in secondary art education and final examination. EU Comenius 3.1/Socrates & YOUTH, Report of the project 40966-CP-1-97-1-FI-C31, June 1999. UIAH Helsinki: Department of Art Education, University of Art and Design.

SCRIVEN, M. (1967). The Methodology of Evaluation. In: Tyler, R. et al (Eds.) Perspectives on Curriculum Evaluation. AERA Monograph Series on Curriculum Evaluation (1). Chicago: Rand McNally.

SEFTON-GREEN, J. & SINKER, R. (2000). Evaluating Creativity: Making and learning by young people. London: Routledge.

SHOHAMY, E. (2001). The Power of Tests. Essex: Pearson Education Limited.

SPOLSKY, B. (1995). Measured Words. Oxford: Oxford University Press.

STAKE, R. (1978). The Case Study Method in Social Inquiry. Educational Researcher 7, pp. 5-8.

STEERS, J. (1987). Art, Craft and Design Education in Great Britain: A Summary. Canadian Review of Art Education 15 (1), pp. 15-20.

STEERS, J. (1988). Art and Design in the National Curriculum. Journal of Art and Design Education 7 (3), pp. 303-323.

STEERS, J. (1994). Art and Design Assessment and Public Examinations. Journal of Art and Design Education 13 (3), pp. 287-297.

STEERS, J. (1994b). Orthodoxy in Art Education. Paper presented at the 3rd European Congress of InSEA, Lisbon, 17-21 July 1994.

STEERS, J. (1996). Response to Papers by Gardner and Schönau: Evaluation in the Visual Arts: a Cross-cultural Perspective. In: Boughton, D., Eisner, E. & Ligtvoet, J. (Eds.) Evaluating and Assessing the Visual Arts in Education. New York: Teachers College Press.

STEERS, J. (2001). New A and AS levels. A'N'D: The Newsletter of The National Society for Education in Art, nº 1, Summer 2001, p. 11.

STEERS, J. (2003). Art and design in the UK: the theory gap. In: Addison, N. & Burgess, L. (Eds.) Issues in art and design teaching. London: RoutledgeFalmer, pp. 19-31.

STEERS, J. (2003b). Art and Design. In: White, J. (Ed.) Rethinking The School Curriculum: Values, Aims and Purposes. London: RoutledgeFalmer.

STEWART, R. (1996). Constructing Neo-narratives in Art Education. Australian Art Education 19 (3), p. 46.

STOKROCKI, M. (1997). Qualitative Forms of Research Methods. In: Lapierre, S.D. & Zimmerman, E. (Eds.) Research Methods and Methodologies for Art Education. USA: NAEA, Indiana University.

SULLIVAN, G. (1996). Research in Art Education. Australian Art Education 19 (3).

SUTHERLAND, G. (1996). Assessment: Some Historical Perspectives. In: Goldstein, H. & Lewis, T. (Eds.) Assessment: Problems, Developments and Statistical Issues. UK: John Wiley.

SWIFT, J. & HUGHES, A. (1999). (Eds.) Broadside 2. Birmingham: UCE: University of Central England.

SWIFT, J. & STEERS, J. (1999). A Manifesto for Art in Schools. Journal of Art & Design Education 18 (1), pp. 7-13.

TATTERSALL, K. (2003). A national obsession. In: Guardian Education, September 30, 2003, p. 4.

TAYLOR, R. (1986). Educating for Art. Harlow, Essex: Longman.

THISTLEWOOD, D. (1992). Editorial: Controversies Highlighted by the National Curriculum. Journal of Art and Design Education 11 (1), pp. 3-7.

TEUNE, H. (1990). Comparing Countries: Lessons Learned. In: Oyen, E. (Ed.) Comparative Methodology: Theory and Practice in International Social Research. London: SAGE.

TGAT (1988). Task Group on Assessment and Testing: a Report. London: DES.

THOMAS, R.M. (1990). International Comparative Education. London: Routledge.

TORRANCE, H. (1989). Ethics and Politics in the Study of Assessment. In: Burgess, R.G. (Ed.) The Ethics of Educational Research. Philadelphia: Falmer Press.

TORRANCE, H. (2000). Postmodernism and Educational Assessment. In: Filer, A. (Ed.) Assessment: Social Practice and Social Product. London: RoutledgeFalmer.

TUCHMAN, G. (1998). Historical Social Science: Methodologies, Methods and Meanings. In: Denzin, N.K. & Lincoln, Y.S. (Eds.) Strategies of Qualitative Inquiry. Thousand Oaks: SAGE.

VALADARES & GRAÇA, M. (1998). Avaliando para Melhorar a Aprendizagem. Coimbra: Plátano Editora.

VERMA, G.K. & MALLICK, K. (1999). Researching Education: Perspectives and Techniques. London: The Falmer Press.

VOS, A.J. & BRITS, V.M. (1987). Comparative and International Education for Student Teachers. London: Butterworth.

VICKERMAN, C. (1986). Assessment in Arts Education. In: Ross, M. (Ed.) The Arts in Public Examinations. Oxford: Pergamon.

VYGOTSKY, L.S. (1972). Thought and Language. Cambridge, MA: MIT Press.

WALLING, D.R. (2000). Rethinking How Art is Taught. Thousand Oaks: Corwin Press, Inc. (Sage).

WALKER, J.A. & CHAPLIN, S. (1997). Visual Culture: an introduction. Manchester: Manchester University Press.

WELLINGTON, J.J. (1996). Methods and Issues in Educational Research. University of Sheffield: USDE Papers in Education.

WEIR, C.J. (1993). Understanding and developing language tests. New York: Prentice Hall.

WEIR, C.J. (2004). Language Testing and Validity Evidence. In print.

WHITE, J. (2003). (Ed.) Rethinking The School Curriculum: Values, Aims and Purposes. London: RoutledgeFalmer.

WILIAM, D. (1992). Some technical issues in assessment: a user's guide. British Educational Research Journal 22, pp. 537-548.

WILLIAMS (2001). Forced in the same mould. In: The Times Educational Supplement, 16th November 2001.

WIGGINS, G.P. (1993). Assessing Student Performance: Exploring the Purpose and Limits of Testing. San Francisco: Jossey-Bass Education Series.

WOLF, A. (1988). Opening up Assessment. Educational Leadership 45 (4), pp. 24-29.

WOLF, A. (1995). Competence-Based Assessment. Buckingham: Open University Press.

WOLF, D. (1988). Artistic Learning: What and Where is it? Journal of Aesthetic Education 22 (1).

WOLF, D., BIXBY, J., GLEN, J. & GARDNER, H. (1991). To Use Their Minds Well: Investigating New Forms of Assessment. In: Grant, G. (Ed.) Review of Research in Education. Washington, D.C.: American Educational Research Association.

WORKING GROUP ON 14-19 REFORM. [Available March 2003: www.dfes.gov.uk/14–19]

YOUNG, J.O. (1995). Relativism and the Evaluation of Art. Journal of Aesthetic Education 29 (4).

ZABALZA, M.M. (1992). Planificação e Desenvolvimento Curricular na Escola. Porto: Edições ASA.

ZIMMERMAN, E. (1992). Assessing Students' Progress and Achievements in Art. Art Education 45 (6), pp. 14-24.

Appendix XIV
Booklet (Instructions for an experimental examination in art and design)

Instructions for an experimental examination in art and design

Index:
A. General Instructions for teachers
B. Instructions for students
C. Assessment Matrix
D. Grade descriptors
E. Model Project Brief

A. General Instructions for teachers

Teachers are responsible for the delivery of the examination and for the first marking of portfolios. Training sessions will be conducted during March.

The final exam in art and design consists of the submission of a portfolio by candidates. Schools are responsible for organising the exam. Teachers are responsible for the development of the unit for external assessment. Students should have access to examples of project briefs, assessment criteria and mark schemes one month before the exam (before the Easter holidays) in order to start the preparatory investigation through independent study.

1. Project Brief
There are three options for project briefs:
1. The theme and optional final products provided.
2. Theme and final products designed by the students and the class teacher.
3. Theme and final product designed by the individual student.

The project brief can be designed by the students and class teacher taking into account students' motivations. The class teacher and/or the students will develop a project brief according to a chosen theme and starting points, to be realised during class time, taking into account the aims, objectives and contents of the discipline(s). Materials and techniques should be specified in the task definitions (according to school conditions); however, the choice can also be left open to students. The project brief can be transdisciplinary, involving several disciplines (for example: Technologies; Studio Art and Design; MTEP; History of Art). The project brief designed by the teachers and students must include the following:

1. Aims and objectives (clear explanation of what is intended to be developed and the expected qualities of the works)
2. Themes
3. Starting points and possible connections (examples of starting points; problems to find; possible media and possible sources to use)
4. Final products (a range of optional tasks according to the disciplines)
5. Assessment criteria, mark schemes and weightings
6. Bibliography, references, etc.

2. Portfolio
Students construct their portfolios around one task/theme or starting point. The portfolio contains a selected collection of investigation work, preparatory and developmental studies (annotations, sketches, experiments, visual studies), the final product and a self-assessment report (written, oral, visual). The separate items in the portfolio should be connected and depict the entire process. The portfolio should also reflect the development of the project in time; all the evidence should be dated and numbered as a part of the process. The portfolio should include comprehensive documentation (models, sources).
The portfolio can be, for example, an expanding file, document case, box, album, web page, CD-ROM, video, work journal or notebook, etc. The appearance of the portfolio can be a part of the artistic production and reflect the character of the project and the personality of the author. The appearance can contribute to the general visual expression, and help to form a positive evaluation.

[Diagram: contents of the student portfolio]
• Forms: Form 1: students' submission form; Project Brief; Form 3: internal standardisation form; Form 4: Awarding; Assessment matrices.
• Reports, records or notes about previous experiences, intentions, interests, etc.
• Preliminary studies: experiments; exploration of possibilities with reflection on and evaluation of the processes.
• Work journals (not compulsory).
• Investigation: information; sources; models and related critical analysis.
• Final products (visual): paintings, drawings, sculptures, CDs, web pages, design products, photographs, films, video records of performances, installations, exhibitions, 2D or 3D design, etc.
• Crits/self-assessment: written or oral (tape, video, digital record) about the student's intentions, investigation, progress, decisions; techniques, materials, purposes, functions, meanings; achievement and evaluation.

3. Time
Portfolio tasks should be developed during class time, except investigative work, which can be developed outside class time.

• Preparation/investigation/developmental studies: March/April, 25 hours outside class time.
• Developmental studies and final products: May, 15-20 hours during class time.
• Self-assessment report: 2 hours (class time).

Teachers should be available for tutorials with students during the preparation time. When portfolios are evaluated, the time available should be taken into account, as well as the student's ability to plan the work beforehand and carry out his/her plans in the time given.

4. Special considerations
Teachers should ensure that students with special needs can conduct their work under appropriate conditions. Students with physical disabilities should have special arrangements or special equipment according to their different needs. Non-native language students should have additional support in order to understand the instructions of the assessment and the task requirements (for example, access to translation).

5. Equipment and materials
Candidates choose and purchase the materials needed for their project. The range of materials may be restricted in the definition of the tasks. Advice on materials and techniques can be sought from the class teacher before the final exam period.

6. Role of the class teacher
The class teacher is responsible for the unit project description and mark schemes. The unit project and mark schemes should be designed following discussion with students. The class teacher should give students a clear explanation of the assessment criteria, project brief, tasks and mark schemes before the examination period. The teacher can, on request, advise the candidates on the purchase of materials and equipment. The teacher is not, however, allowed to intervene with the project work itself, or with the use of materials and equipment during the final exam period. The teacher can, on request, advise candidates on gathering background information before the final exam period. Teachers should record the nature of the assistance they offered to individual candidates during preparation time, and their observations can be used as reasons to explain the internal marking of candidates' portfolios.

This instrument of assessment is not familiar to students. The teacher needs to prepare students in a systematic way, advising them to make written comments while the project develops and to make critical reflections about the sources and the processes used; teachers need to ensure that students understand they have deadlines for submission of the work, and to help them plan their work. Sometimes students spend a lot of energy and time collecting useless information, pasting and copying documents; it is very important that students understand the purposes of critical reflection and the role of personal opinions as a starting point for the development of ideas and experiments. Students are not used to developing ideas in quantity and variety. A specific number of preparatory studies is not indicated in the instructions, but teachers must help the students to develop their ideas intensively and not give up too easily. Students are not used to justifying the quality of their work, their intentions and purposes; it is up to the teachers to help them to acquire specific vocabulary in order to explain the merits of the work and the meaning and function of their proposals. Portfolio assessment requires great effort and commitment from the students in terms of time and persistence; students must see studio art or studio design as important disciplines, and they must understand why the portfolio will be beneficial for them in terms of fair assessment and preparation for further studies.

7. Components and criteria
Students' portfolios must include the description of the project brief, developmental studies, final products and a self-assessment report. The portfolio components for assessment are:

Component        Evidence                                                Weighting range
Process          Investigative notes or studies; developmental           50%-60%
                 studies; preparatory notes and studies in the work
                 journals or developmental studies
Product          Final products                                          30%-40%
Self-assessment  Self-assessment report and notes about                  10%
                 self-assessment in the work journals or
                 developmental studies

Teachers are responsible for verifying the authenticity of their students' work.

The assessment criteria to be negotiated with students are as follows:

AC1: Record personal ideas, intentions, experiences, information and opinions in visual and other forms.
AC2: Critical analysis of sources from visual culture showing understanding of purposes, meanings and contexts.
AC3: Develop ideas through purposeful experimentation, exploration and evaluation.
AC4: Present a coherent and organised sample of works and final product revealing a personal and informed response that realises their intentions.
AC5: Evaluate and justify the qualities of the work.

The criteria are not strict guidelines; they must be used with the necessary flexibility. Teachers and students may develop them, adding rubrics for each. However, the general structure and the mark schemes provided in the assessment matrix must not be changed, in order to preserve a common language for assessment.

8. Assessment procedures
The quality of the work and student achievement are estimated holistically, taking into account overall judgement and the criteria.

The teacher must mark the portfolios of all the candidates. Portfolios are assessed in accordance with the negotiated assessment instructions (assessment criteria, assessment matrix). The class teacher must provide the external assessors with a copy of the portfolio tasks and mark scheme used in the project unit, and explain the methodology and strategies used in the project unit to the moderator(s).

8.1. Internal Marking
Internal marking should be conducted after the students' submission of the portfolios. Students' work and achievement is assessed by the student's own teacher before the moderator's visit to the centre. Teachers will take into account the quality of the portfolios and their own observations of students' behaviour during classroom time. Teachers must use the assessment criteria and the language of the assessment matrix. A total mark out of 20 points must be awarded for each portfolio. Marks will be recorded in the assessment matrix and assessment form for each candidate.
Teachers should take into account their own observations of the student's performance during the course. Evidence for some rubrics, for example those related to 'crits', can only be observed by the student's own teacher. Teachers should use such observations to justify their award.
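As an illustrative sketch of how an award might be built up (the criterion maxima below are taken from the assessment matrix in Section C; converting the 200-point criterion total to the final mark out of 20 by dividing by ten is an assumption, not a rule stated in these instructions):

AC1 24/30 + AC2 20/30 + AC3 45/60 + AC4 40/60 + AC5 15/20 = 144/200
144 ÷ 10 = 14.4, which would be recorded as a total mark of 14 out of 20.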

Art departments are reminded that it is their responsibility to ensure that, where more than one teacher has marked the work in a centre, effective standardisation (group assessment) has been carried out across all teaching groups. This procedure ensures that the work of all candidates at the centre is marked to the same standard.

8.2. External Assessment: Moderation
If there are more than ten candidates in one school, only a sample of ten portfolios will be subject to moderation. The sample will include the candidate achieving the highest mark and the candidate achieving the lowest mark, and will otherwise be a random selection of portfolios determined by the moderator. (For example, in a school entering 23 candidates, the moderator would review the highest- and lowest-marked portfolios plus eight others chosen at random.)

Moderation visits shall be conducted in the period 1-15 June. Teachers will display the required samples of students' work in appropriate conditions for the moderation.

Students' work can be displayed as folders, exhibitions, multimedia presentations, etc. The class teacher must organise the display of the sample of students' portfolios in appropriate audio-visual conditions. Only the portfolios in the identified sample should be displayed; however, all the students' portfolios should be available. Teachers must ensure that all the portfolios are stored in appropriate conditions at the centre during the period June-September.

Whatever the chosen means of presentation, each portfolio presentation must be clearly distinguished, one from another, and include the candidate's name and number.

The following documentation must be available to the visiting moderator at the start
of the moderation visit:

• Copies of the project brief for each candidate.
• Copies of the assessment matrix for each candidate (with awarded marks).
• Awarding forms for each candidate.

The teacher should meet the visiting moderator(s) at the beginning of the visit to introduce them to the work, and should be readily available throughout the visit in case they are required. Visiting moderators will review the portfolios in the moderation sample in order to ensure that the centre's marking is:
• In accordance with the marking criteria stipulated;
• In conformity with the overall standards of the examination.

The moderator(s) check the marks of the sample. Moderation is not a re-marking but rather a confirmation or otherwise of the teachers' marks. Re-marking is only recommended if a serious discrepancy of marks (a 5-point difference) occurs between moderators and teachers, and will be conducted on a second moderation visit, which can be requested by the teacher or by the moderator. At the end of the visit the teacher(s) will be informed of the moderators' findings.

The visiting moderator may request a further visit by a second moderator if there are any aspects of the work and the moderation which the visiting moderator believes should be subject to further consideration.

In this case candidates' portfolios must be retained in the same conditions as viewed by the original visiting moderator. As a result of the second visit, the second moderator may, in certain circumstances, find it necessary to recommend that the original moderator's marks should be amended upwards or downwards. Should this be the case, the recommendations of the second moderator will stand.

Moderation standards are checked in the following ways:

• Accompanying visits, where the moderator is observed by a second external assessor.
• Follow-up visits, where schools are visited by a second moderator after the initial moderator's visit.
• Statistical review of moderators' performance, which takes place after the moderation when all marks are in the system. All moderated marks will be reviewed and may be subject to further adjustment if necessary. Teachers will be informed about any further adjustments.

9. Meetings with teachers
During March teachers will be asked to attend the preparation meeting. The meeting has two sessions (over one or two days): the first session concerns the explanation of the instructions, and the second session the assessment of students' portfolios. The assessment of a sample of students' works aims to achieve consensus about the interpretation of the criteria and the standards of the examination.

A third session will be conducted with external assessors in order to prepare them for moderation; this will include the marking of students' portfolios and is intended to verify the accuracy of moderators' marking.

10. Reports about the examination
Teachers will receive a report on the examination in September.

B. Instructions for Students

1. Components
The work to be submitted will take the form of a portfolio. The portfolio includes three components: process, product and self-assessment.

Component        Evidence                                                Weighting
Process          Investigation notes or studies; developmental           50%-60%
                 studies; preparatory notes and studies in the work
                 journals or developmental studies
Product          Final products                                          30%-40%
Self-assessment  Self-assessment report and notes about                  10%
                 self-assessment in the work journals or
                 developmental studies

1.1. Assessment Criteria
The following assessment criteria describe the qualities that teachers and external assessors will look for in your portfolio:

AC1: Record personal ideas, intentions, experiences, information and opinions in visual and other forms.
AC2: Critical analysis of sources from visual culture showing understanding of purposes, meanings and contexts.
AC3: Develop ideas through purposeful experimentation, exploration and evaluation.
AC4: Present a coherent and organised sample of works and final product revealing a personal and informed response that realises their intentions.
AC5: Evaluate and justify the qualities of the work.

In other words, the qualities that teachers and external assessors will look for in your portfolio will be:

AC1: Visibility of the intentions. Show that you are able to express your ideas, motivations, opinions and purposes through words and images. Explain your intentions at the beginning of the project and, as it develops, make annotations about your progress: if you re-formulate ideas, explain why; if something influences you, explain it; explain the importance of the examples of visual culture that you have selected and used in your work. Show that you are able to plan your project and respect deadlines; if for some reason you cannot fully realise what you have proposed, explain why it was not possible.

AC2: Searching abilities; abilities to interpret and use examples of visual culture. Show that you can collect information about the work of others in the world of visual culture (art, design, media, etc.) and that you can interpret its meanings, purposes and functions. But be careful: do not spend too much time collecting information that is not relevant to your project. You are not required to copy and paste information, but rather to reflect critically on it, expressing personal opinions and informing your work through what you have learned in your search. Show that your sources were useful for developing your own ideas and experiments with techniques, materials, etc.

AC3: Purposeful development of ideas and experiments. Several qualities will be evaluated under this criterion. The quantity and quality of sketches or initial ideas is important. Show that you are persistent and that you do not give up easily. The teachers will look for personal style and the capacity to communicate your ideas visually; they will look for your critical reflection about the ideas explored and the processes used, and your capacity to explain your decisions. They will look for your skills in finding and solving problems, raising issues, presenting possibilities and evaluating them. Show that you can find problems alone, and not only the problems stated by your teacher; many times problems are formulated through searching the work of others and exploring ideas. Show that you can be imaginative and propose new ways of representing a problem from several angles. Your technical skills will be very important: show that you can represent real and imaginary things with expertise. For example, in drawing, painting and sculpture you must understand and fluently use the formal elements such as shape, light, space, composition, colour, rhythm, etc. In product design you must understand and apply notions of ergonomics, anthropometrics, design methodology, etc.

AC4: The portfolio as a whole and the final product. The teacher will look for your knowledge, understanding and skills in art and design, and whether your personal response was informed by what you have learned through the process and was adequate for realising your intentions. The teacher will see how you used the formal elements of visual language, art and design concepts and conventions, your technical skills and your personal style.

AC5: Knowing the strengths and weaknesses of your work; explaining and justifying its meaning and purposes. The teacher will look for your skills of evaluation: whether you are able to reflect upon the process and product you have developed, and whether you can justify the purposes, meaning and function of your final outcome by using specific vocabulary. Show that you are aware of the difficulties you encountered, and that you can point out what you achieved in terms of knowledge, understanding and skills in art and design.

2. Portfolio
Students construct their portfolios according to the chosen project brief. The portfolio contains a selected collection of investigation work, preparatory and developmental studies (annotations, sketches, experiments and visual studies), the final product and a self-assessment report (written, oral, visual). The separate items in the portfolio should be connected and depict the entire process. The portfolio should also reflect the development of the project in time; all the evidence should be dated and numbered as a part of the process. Your portfolio should include comprehensive documentation (models, sources).

The portfolio can be, for example, an expanding file, document case, box, album, web page, CD-ROM, video, work journal or notebook, etc. The appearance of the portfolio can be a part of the artistic production and reflect the character of the project and your personality. The appearance can contribute to the general visual expression, and lead to a positive evaluation.

Project Brief
You have three options for your project brief:
1. The theme and optional final products provided.
2. A theme and final products designed by the students and the class teacher.
3. Develop your own theme and final products.

The project brief may involve the contents of one discipline or of several disciplines. You must describe the project brief. In the description you must state:

1. The theme.
2. The tasks: what you are proposing to do and what kind of final product you will develop. Explain why you want to develop it, and submit diagrams of development and methodology.
3. Timetable: submit a planning diagram or schema showing a timeline to completion of your project.
4. What type of knowledge, understanding and skills will you develop in the project? What kind of help and resources do you need?

During March-April you must choose and develop the project brief. You can use the example provided, examples provided by your teacher, or design your own project brief in negotiation with your teacher. Your teacher will help you to design your project brief through individual tutorials at your request.
You must choose a theme for your work; the portfolio is built around one task or final product. You have one month to study the tasks, gather necessary background information and make your choice before the final exam period.

2.2. What kind of evidence to submit in the portfolios
The following diagram describes some types of evidence to include in portfolios:

[Diagram: types of evidence in the student portfolio]
• Forms: Form 1: students' submission form; Project Brief; Form 3: internal standardisation form; Form 4: Awarding; Assessment matrix.
• Reports, records or notes about previous experiences, intentions, interests, etc.
• Preliminary studies: experiments; exploration of possibilities with reflection on and evaluation of the processes.
• Work journals (not compulsory).
• Investigation: information; sources; models and related critical analysis.
• Final products (visual): paintings, drawings, sculptures, CDs, web pages, design products, photographs, films, video records of performances, installations, exhibitions, 2D or 3D design, etc.
• Crits/self-assessment: written or oral (tape, video, digital record) about the student's intentions, investigation, progress, decisions; techniques, materials, purposes, functions, meanings; achievement and evaluation.

2.3. Suggestions for carrying out portfolio tasks

1. Preparation (March-April)
You are expected to make the preparatory studies outside school over a period of 25 hours. Prior to the timed examination, you must produce and submit preparatory supporting studies undertaken before the examination period. The preparatory supporting studies should chart the development of the timed work from conception to completion; include a description of your intentions and motivations; and include an analysis and interpretation of sources from visual culture, or of other things seen, imagined or remembered, which are important for the work. You may include initial ideas and first experimentation with materials and processes. Use your work journal to describe your intentions at the starting point of the portfolio by explaining the reasons for choosing a specific task, what was interesting about the task, the aims and goals of your work, and your planned timetable. For example:
- What is the content of your project: what do you want to express, study or tell?
- What is at the root of your personal artistic expression: who or what has inspired your personal style? What is the reason behind your choice of mode of expression, techniques and materials? What kind of experiments and exploration do you want to try, and why?

2. Timed examination (15-20 hours + 2 hours for the self-assessment report, during class time)
The work produced during the examination period should include:

• Continuation of the developmental studies initiated during preparation time, for example: experimental works; exploration of ideas/techniques/materials; sketches or other visual attempts; visual, written or oral annotations; experiments with and exploration of sources, materials and processes; ideas evaluated, rejected, selected and developed.

• Final products.

• A self-assessment report explaining your progress and achievement.

The portfolio provides evidence about how you have tackled your chosen task and contains material produced during the process. You should choose related sketches, studies and experiments that, together with the final product, best illustrate your project and end results.

The portfolio should illustrate the origins of your chosen content and mode of expression. It should also reveal pivotal decisions and their consequences: how you started and developed your form and content, mode of expression, techniques and use of materials. You can also include annotated supporting materials that you have used and analysed, such as texts, newspaper clips, material samples, reproductions of images or media files. You may include the work journal in your portfolio.

The portfolio should show that you have made an art and design study of your subject. This means that making sketches, studies and decisions on alternative solutions should be a part of the process. The portfolio should also show how your project developed in time: date, sign and number all your developmental studies and other material during the work period. The work journal may be useful to help you document and critically reflect on the progress and purposes of the project; it helps you to record and justify your decisions. Notes, visual or textual, also help you to develop your ideas and to reflect on problems found and your decision-making process.

You may include the original final product; but if it is impossible to include it because of size, media or other reasons, include a video, photographic or digital reproduction of it. The final product(s) or the record of it must be labelled as Product.

3. Self-assessment report guidelines
This part should be completed after completion of the work (during the specified time). Try to answer the following questions using, for example, oral, visual or written media.

Describe how you developed the form and content of your project; your progress; how you overcame the difficulties; and the sequence of the work (leaps of imagination, how new ideas appeared and why). If you changed your point of departure, why and how was that? Did you use any sources or models; were they useful, and why? How well did you attain your purposes, aims and goals; how well did you achieve the qualities required by the assessment criteria and mark schemes? Did you achieve your purposes; why and how? Does your work reveal a personal reflection about an issue; why and how? Is your personal reflection important for others? Is your work meaningful for others? Why? An honest explanation of partial failure may be as valuable as an exaggerated claim that all went smoothly.

C. Assessment Matrix


The matrix sets five criteria against five mark bands (Level 1-4, Level 5-9, Level 10-14, Level 15-17 and Level 18-20), with the points available in each band shown in brackets.

CA1: Record personal ideas, intentions, experiences, information and opinions in visual and other forms. (30 pts)
• Level 1-4 (1-6 pts): Limited or inappropriate records. The student has no conscious intention for what she/he is doing.
• Level 5-9 (7-14 pts): Small amount of records, not always persistent. The student knows what she/he wants to achieve, but her/his intention is not explicit.
• Level 10-14 (15-21 pts): Reasonable amount of records; the intentions are sometimes clear. The student shows persistence and curiosity.
• Level 15-17 (22-26 pts): The student's intention is obvious. The student shows persistence and combines some information with the work according to the intentions.
• Level 18-20 (27-30 pts): Considerable amount of records with personal reflection. Intentions are clearly stated. The student approaches themes and problems from several angles and develops them through a series of drafts and sketches.

CA2: Critical analysis of sources from visual culture showing understanding of purposes, meanings and contexts. (30 pts)
• Level 1-4 (1-6 pts): The student only uses the sources indicated by the teacher; the analysis is limited to collecting information.
• Level 5-9 (7-14 pts): The student uses the sources indicated by the teacher and others, but limits the search to collecting and organising information.
• Level 10-14 (15-21 pts): The student actively searches for sources to get ideas for her/his own work; collects, organises and selects information according to intentions.
• Level 15-17 (22-26 pts): The student actively searches for various sources in time and space and uses them in a well-integrated way in her/his own work (collect, organise, select, combine).
• Level 18-20 (27-30 pts): The student actively searches for and critically reflects on various sources in time and space and uses them in a versatile, independent and well-integrated way in her/his own work (collect, organise, select, combine, critique and re-organise).

CA3: Develop ideas through purposeful experimentation, exploration and evaluation. (60 pts)
• Level 1-4 (1-12 pts): Limited or inappropriate development of obvious ideas, little experimentation and no signs of critical reflection about the experiments and decisions made.
• Level 5-9 (13-27 pts): The student shows a small amount of exploration of ideas, lacking a sense of order and technical abilities, with a tendency to repeat ideas and experiments. No critical reflection about the experiments and decisions made.
• Level 10-14 (28-42 pts): The student can use pre-established problems. Reasonable and safe exploration of ideas (without risk-taking). Shows some understanding of visual language, concepts and techniques but reduced reflection about the experiments and decisions made.
• Level 15-17 (43-51 pts): The student can re-formulate problems. Comprehensive exploration of appropriate ideas taking risks, revealing understanding of visual language, concepts and techniques and critical reflection about the experiments and decisions made.
• Level 18-20 (52-60 pts): The student can find and formulate problems in an independent way; constantly experiments and explores possibilities, taking risks and finding unexpected responses. Shows critical reflection about the experiments and decisions made.

CA4: Present a coherent and organised sample of works and final product revealing a personal and informed response that realises their intentions. (60 pts)
• Level 1-4 (1-12 pts): An inappropriately small amount of works and final product showing lack of understanding of visual language, concepts and techniques.
• Level 5-9 (13-27 pts): The amount of works and final product shows little understanding of visual language, concepts and techniques.
• Level 10-14 (28-42 pts): A considerable amount of works and final product shows reasonable understanding of visual language, concepts and techniques.
• Level 15-17 (43-51 pts): The amount of works and final product shows understanding of visual language, concepts and techniques.
• Level 18-20 (52-60 pts): A set of works was selected and organised. The works and final product show excellent understanding of visual language, concepts and techniques.

CA5: Evaluate and justify the qualities of the work. (20 pts)
• Level 1-4 (1-4 pts): Unable to explain the reasons for her work. The student cannot point out the strengths and weaknesses of her own work or distinguish between works that are successful and those that are less successful.
• Level 5-9 (5-9 pts): The student explains the merits of her work using vague terms, and can refer to the intentions and the sources used, but is unable to justify the quality and purpose of the work.
• Level 10-14 (10-14 pts): The student evaluates the characteristics and merit of her work using specific terms, and can describe the progress made referring to intentions, sources and problems encountered. Justifies the purpose and meaning of her work using vague terms.
• Level 15-17 (15-17 pts): The student evaluates the characteristics and merit of her work using specific terms, and can explain the progress made referring to intentions, sources and problems encountered. Justifies the purpose and meaning of her work in cultural and social contexts.
• Level 18-20 (18-20 pts): The student fluently evaluates the characteristics and merit of her work using specific terms, and can explain and justify the progress made referring to intentions, sources and problems encountered. Justifies in fluent terms the purpose and meaning of her work in cultural and social contexts.


D. Grade Descriptors

1. Assessment criteria
A list of draft criteria was established to be used as 'windows' to help the assessors to judge students' artworks.

Students should:

AC1: Record personal ideas, intentions, experiences, information and opinions in visual and other forms.
AC2: Critically analyse sources from visual culture showing understanding of purposes, meanings and contexts.
AC3: Develop ideas through purposeful experimentation, exploration and evaluation.
AC4: Present a coherent and organised sample of works and final product revealing a personal and informed response realising their intentions.
AC5: Evaluate and justify the qualities of their work.

2. Global Descriptors
The following descriptors or band scales might be useful for holistic assessment. However, teachers are reminded to use them with the necessary flexibility.

1-4. Reduced


A clearly inadequate and disorganised quantity of work has been completed, including a few records of obvious or literal ideas; some collected but not analysed sources from visual culture; few developmental studies; and a final product which does not realise the intentions. Personal expression is weak and stated in vague terms, and the student is unable to justify the strengths and weaknesses of their work. Overall the work lacks a sense of order, basic technical skills and relevant understanding of art and design processes of inquiry (formal elements/composition/design principles/visual language).

5-9. Limited

A small amount of work has been produced, including some record of personal ideas and intentions and some organisation of sources, but the student does not use them to explore his/her own ideas. There is some sense of order and structure in the way ideas are formed, but few explanations of decisions made. The exploration of ideas is abandoned too early; ideas and experiments are repeated without any changes in form and content. The student sometimes critically reflects on his/her work and the work of others, but the vocabulary is clumsy and unrefined. Overall the work demonstrates a limited understanding and use of art and design concepts, processes and techniques (formal elements/composition/design principles/visual language).

10-14. Basic/Reasonable/Appropriate

A reasonable sample of work has been presented, including records of a range of ideas; clear explanation of the intentions and motivations; and an organised selection of sources appropriate to the project but with superficial critical reflection. Some sources seem to be used during the development of ideas but are not fully explored. The original ideas may be consolidated too early; experiments are not fully developed and the outcomes are predictable; however, there is some sense of order and method. The student sometimes critically reflects on his/her work and the work of others using appropriate vocabulary. Overall the work demonstrates an adequate understanding of the contexts, concepts, processes and techniques of art and design production (formal elements/composition/design principles/visual language).

15-17. Good

A good sample of work has been organised, including a record of some unusual ideas; the majority of development studies show a personal style and reasonable technical expertise, and sometimes unexpected experiments which progressively evolve towards the realisation of intentions through systematic evaluation. The student understands and critically reflects on the contexts, meanings and purposes of visual culture production; uses the sources to develop personal interpretations; experiments and explores possibilities, taking risks and sometimes evaluating the decisions; uses art and design specific vocabulary to reflect on and evaluate his/her work and the work of others; and explains the importance of the work presented within social and cultural contexts. The work overall illustrates personal experimentation, exploration of ideas and possibilities, and a good resolution of concepts, media and technical expression through developmental studies and final products.

18-20. Very Good/Excellent/Fluent

A considerable sample of work has been organised, including a record of a range of unusual ideas; development studies showing a personal style and technical expertise, and unexpected experiments which progressively evolve towards the realisation of intentions through systematic evaluation. The student understands and critically reflects on the contexts, meanings and purposes of visual culture production; uses the sources to develop personal interpretations; experiments and explores possibilities, taking risks and always evaluating the decisions made; uses art and design specific vocabulary to reflect on and evaluate his/her work and the work of others; and persuades the audience (viewer) of the importance of the work presented within social and cultural contexts. The work overall illustrates personal and sophisticated experimentation, exploration of ideas and possibilities, and an outstanding resolution of concepts, media and technical expression through developmental studies and final products.


E. Examination in Art and Design (Experiment)


March-May 2003
Model Project Brief

Timed examination: 22 hours


This paper may be given to students as an option for the project brief. Students have a four-week period in which to complete preparatory supporting studies prior to the timed examination.

1. General instructions

This paper and the examination instructions are given to you in advance of the examination so that you can prepare sufficiently.

The portfolios for external assessment will be developed according to a project brief theme. You may choose to develop:
1. This model project theme
2. A theme and project brief developed by the class and the teacher
3. Your own theme and project brief

The works to be included in the portfolio are:


• Preparatory studies
• Developmental studies
• Final product
• Self-assessment report

The assessment criteria and weightings are:

AC1: Record personal ideas, intentions, experiences, information and opinions in visual and other forms. (30 points)
AC2: Critically analyse sources from visual culture, showing understanding of purposes, meanings and contexts. (30 points)
AC3: Develop ideas through purposeful experimentation, exploration and evaluation. (60 points)
AC4: Present a coherent and organised sample of works and a final product revealing a personal and informed response that realises your intentions. (60 points)
AC5: Evaluate and justify the qualities of the work. (20 points)


2. Model Project Briefs

2.1. Theme: ‘Excluded people’

Excluded people are those who are rejected by society, or discriminated against by the community, because they are different from the majority of its members. People can be rejected because they are physically or mentally different; others can be excluded because they have different ethnic origins or belong to a different race. Some people are discriminated against because of their sexual characteristics; others are excluded because they hold different beliefs, come from different social backgrounds or simply live differently from the majority.

You are asked to make a visual study reflecting upon the people who are excluded from, or discriminated against by, the community: a personal reflection about 'the other'.

Dwarfs, beggars, crippled and mad people have been depicted in works of art throughout history, for example by Velasquez, Murillo and Gericault, and in picaresque paintings. Contemporary artists have been working on projects about homeless people. Similarly, in film and photography some artists have approached discriminated or excluded people. The photographs of Sebastião Salgado offer a vision of the other, and some of his works are about minorities. Gabriel Orozco's photograph 'the island in the island' is a reflection on closed and discriminating spaces. In her works the photographer Diane Arbus presented a personal vision of different people, often called monsters. Monsters are also often present in comics and science fiction films: super-heroes, androgynes, cyborgs and mutants are in some way a vision of different people and their place in society.

Some theatre and dance companies, and some film producers, work with artists with disabilities; do you know of any? In advertising and publicity, different people are not commonly represented: do you often see fat, disabled or poor people in advertisements? Why? There is, however, some advertising about different people funded by humanitarian organisations. Designers have invented products to make daily life easier for disabled people, for example cutlery and scissors for left-handed people.

2.2. Starting points

• Three ways to look at the other

We can look at the other in three different ways: we can discriminate against, tolerate or integrate them. We discriminate against the other when we think that they do not belong to our circle and should not share our lives. We tolerate the other when we see them as different but with their own rights; we accept their presence but do not invite them to share our lives. We integrate the other when we invite them to share our daily lives with no constraints.

• Me as the other

I can be the other, the different one. I might be discriminated against for some reason. How will I feel within the community? What will my sensations and feelings be? How shall I live? Who are my friends? How do other people look at me in the street? Why do they insist on staring at me?

What is my relationship with objects made for 'normal' people? Are the entrances to buildings accessible to a wheelchair? What about the size of street furniture? If I am blind, are the streets safe for me? If I am locked in a psychiatric hospital or a prison cell, how do I feel in that space?

• The other is a human being

How does the community accept the other? How can we inform people and make them aware of the other? How can we fight racism and other discriminatory attitudes? How can we change people's attitudes towards difference? How can we inform and persuade the community of the need to be aware of discriminatory attitudes towards different people? How can we denounce inequalities?

2.3. Tasks and media

• Fine Arts
Fine arts can be defined as a means of expressing personal experiences, feelings and particular visions; they are not aimed at solving practical or utilitarian problems. Nevertheless, fine arts media can transmit strong messages about reality and the world, including social questions. If you choose this field you can make a painting, a drawing, a sculpture, a print, an installation, a performance, a site-specific work, etc. You can also choose ceramics, textiles or other crafts.

Several artists have approached this theme: research them and try to understand their visions of different people. Develop your ideas from that research and from your own opinions about the issue.

• Photography, film, video and multimedia

Photography, film, video and multimedia can be seen as autonomous fields or as products embedded in other fields, for example photography and advertising, photo-journalism, film and publicity, video clips and multimedia productions.

Research photography and media products in which different people are presented, and try to understand the underlying visions of the issue. As an outcome you can produce one or several finished photographs, a film or a video presenting your own views and opinions about the theme. You can also make a web page, video clip or other multimedia presentation.

• Comics
Comics are a narrative medium that combines images and text. You can make a comic strip or a graphic novel to tell a story related to the issue. The graphic novel need not be a finished product, but you must include the complete storyboard, sketches for scenes and characters, drafts for page layouts and some finished plates ('pranchas').

• Graphic design
The aim of graphic design production is the communication of messages using the expressive, symbolic, formal and functional aspects of the image. Messages can be informative and/or persuasive.

You must identify a need or a problem, investigate specific areas of the problem, determine relevant sources of information and use these to develop and redefine the problem. You can produce outcomes in illustration, packaging, advertising, computer-aided design, typography or multimedia design: for example, a logo, a pictogram, a signage system, a web page, a leaflet, a billboard/show card, etc.

• Product design
Try to identify needs related to different and disabled people in your environment before thinking about the final product (you can use brainstorming strategies to help you). You should show evidence of an understanding of the appropriateness of the medium to its function and of fitness for purpose. Use your knowledge of the design process to develop your ideas (concept, formulation of the brief, research, experimentation, realisation and evaluation). You can re-design a utilitarian object or a space, or invent a new one.

