6 Operations - the SurveyLang software platform
6.1 Introduction
SurveyLang has developed an integrated, state-of-the-art, feature-rich software system for the design, management and delivery of the language tests and accompanying questionnaires. The platform is fine-tuned to the specific requirements of the ESLC project and is designed to support the delivery of both paper-based and computer-based tests. The software platform also supports all major stages of the survey process.
6.2 Requirements
The technical and functional requirements of the software platform were developed in close cooperation between the SurveyLang partners. At a high level, the software platform should:

- support all the various stages in the development and implementation of the survey (see Figure 17)
- enable the automation of error-prone and expensive manual processes
- be flexible enough to handle the variety of task types used by the survey
- support the implementation of the complex test design used in the survey
- meet the high security requirements of international assessment surveys like the ESLC
- reduce the technical and administrative burden on the local administrators to a minimum
- run on existing hardware platforms in the schools
- be an open source platform (to be made available by the European Commission after the completion of the project), for free use by any interested party.
Figure 17 Stages and roles in the development and delivery of the survey
[Figure 17 shows the stages of the survey (test item authoring, test item localization, test assembly, test rendering, test administration, data integration, data preparation, data analysis, and reporting and dissemination) arranged around the central test item databank, together with the roles involved at each stage: authors, translators, the Item Bank Manager, test and sample designers, local test administrators (proctors), candidates, scorers and data managers, analysts, and secondary analysts and users.]
6.3 Architecture
The high level architecture of the software platform that has been designed to provide
this functionality can be seen in Figure 18.
Figure 18 High level architecture
The platform consists of a central Test-item databank interfacing with two different tools over the Internet: the Test-item authoring tool and the Test assembly tool. In addition, an interface to a translation management system is provided. As a whole, these distributed tools, plus the Test-item databank, are designed to support the central test development team in their efforts to develop and fine-tune the language tests.
The production of paper-based and computer-based test materials is handled by the
Test assembly tool. The physical production of computer tests (delivered on USB
memory sticks) is, however, done by a USB memory stick production unit containing
specially built hardware as well as software components.
To support the test-delivery phase of the project, another set of tools is provided. These are i) a Test-rendering tool to be installed on the test computers in all the schools taking part in computer-based testing and ii) a data upload service which allows the test administrator to upload student test data from the test USB memory sticks to the central databank.
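A minimal illustrative sketch of how the distributed tools might interface with the central databank is given below; the interface and method names are assumptions made for illustration and are not the actual SurveyLang API.

// Hypothetical sketch of the high-level architecture: a central Test-item
// databank accessed over the Internet by the authoring and assembly tools.
import java.util.List;

public class ArchitectureSketch {

    /** Central Test-item databank reached by the remote tools. */
    interface TestItemDatabank {
        String uploadTask(byte[] taskPackage);          // returns a task identifier
        byte[] downloadTask(String taskId);             // task content and resources
        List<String> findTasks(String metadataQuery);   // metadata-based search
    }

    /** Test-item authoring tool: creates and edits tasks, then stores them centrally. */
    interface AuthoringTool {
        void saveTask(TestItemDatabank bank, byte[] taskPackage);
    }

    /** Test assembly tool: pulls signed-off tasks and builds test sequences. */
    interface AssemblyTool {
        byte[] assembleTest(TestItemDatabank bank, List<String> taskIds);
    }

    public static void main(String[] args) {
        System.out.println("Sketch only; concrete implementations are project-specific.");
    }
}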
The various tools and components are described in further detail in the following paragraphs.
The left navigation frame, shown in Figure 19 above, allows the user to browse the Item Bank and to find, create or upload tasks to the Item Bank, etc. The various elements of the task are shown in the task display area to the right, where the user can add or edit the content, upload multimedia resources such as images and audio, and define the properties of the task. The elements and functionality of the task display area are driven by a set of predefined task templates.
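The report does not specify the template format; the following is a minimal, hypothetical sketch of how a task template might declare the elements that the display area renders (all field names are assumptions).

// Hypothetical sketch of a task template driving the task display area.
import java.util.List;

public class TaskTemplateSketch {

    record ItemSlot(String itemId, int numberOfOptions) {}

    record TaskTemplate(
            String taskType,          // e.g. multiple choice, gap fill
            boolean hasRubric,        // fixed instruction text
            boolean hasInputText,     // reading text or listening transcript
            List<String> mediaSlots,  // e.g. "image", "audio"
            List<ItemSlot> items) {}  // the individual items of the task

    public static void main(String[] args) {
        TaskTemplate template = new TaskTemplate(
                "multiple-choice", true, true,
                List.of("audio"),
                List.of(new ItemSlot("item-1", 3), new ItemSlot("item-2", 3)));
        System.out.println("Template declares " + template.items().size() + " items");
    }
}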
Figure 20 Metadata input window
Tasks can be found in the Item Bank by searching their metadata. A free-field text search as well as an advanced structured search dialogue is provided. An example of the latter is displayed in Figure 21.
One of the metadata elements describes at what stage in the life-cycle a task is
currently positioned. This defines who will have access to the tasks and what type of
operations can be performed on the task. The structure of the implemented life-cycle is
described in Figure 22.
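The individual life-cycle stages are not listed in this chapter; the sketch below shows, under assumed stage and role names, how stage-based access control of the kind described here might be modelled.

// Hypothetical sketch: the life-cycle stage of a task gates which roles may edit it.
// Stage names, role names and permissions are assumptions for illustration only.
import java.util.EnumSet;
import java.util.Map;
import java.util.Set;

public class LifeCycleSketch {

    enum Stage { DRAFT, IN_TRANSLATION, IN_REVIEW, SIGNED_OFF }
    enum Role  { AUTHOR, TRANSLATOR, REVIEWER, ITEM_BANK_MANAGER }

    private static final Map<Stage, Set<Role>> EDIT_RIGHTS = Map.of(
            Stage.DRAFT,          EnumSet.of(Role.AUTHOR),
            Stage.IN_TRANSLATION, EnumSet.of(Role.TRANSLATOR),
            Stage.IN_REVIEW,      EnumSet.of(Role.REVIEWER),
            Stage.SIGNED_OFF,     EnumSet.of(Role.ITEM_BANK_MANAGER));

    static boolean mayEdit(Role role, Stage stage) {
        return EDIT_RIGHTS.getOrDefault(stage, Set.of()).contains(role);
    }

    public static void main(String[] args) {
        System.out.println(mayEdit(Role.TRANSLATOR, Stage.IN_TRANSLATION)); // true
        System.out.println(mayEdit(Role.AUTHOR, Stage.SIGNED_OFF));         // false
    }
}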
soundtrack and load the task back up to the databank. The databank includes a
version control mechanism keeping track of where the task is in the life-cycle as well
as a secure role-based authentication system, making sure that only authorized
personnel can see or change a task at the various stages in the life-cycle.
The Test-item databank is implemented in Java on top of Apache Tomcat and MySQL and communicates with the various remote clients through Adobe BlazeDS.
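BlazeDS remoting typically exposes plain Java classes to remote clients as remoting destinations declared in remoting-config.xml. The sketch below shows what such a service class might look like; the class name, methods and in-memory storage are assumptions, not the actual SurveyLang code.

// Hypothetical sketch of a service class that could be exposed as a BlazeDS
// remoting destination. In the real system the storage layer would be MySQL.
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class TaskBankService {

    private final Map<String, byte[]> taskStore = new ConcurrentHashMap<>();
    private final Map<String, Integer> versions = new ConcurrentHashMap<>();

    /** Store a new version of a task and return its version number. */
    public int saveTask(String taskId, byte[] content) {
        taskStore.put(taskId, content);
        return versions.merge(taskId, 1, Integer::sum);
    }

    /** Return the identifiers of all stored tasks. */
    public List<String> listTasks() {
        return List.copyOf(taskStore.keySet());
    }

    public static void main(String[] args) {
        TaskBankService service = new TaskBankService();
        System.out.println("Stored version " + service.saveTask("task-001", new byte[0]));
    }
}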
One of the most innovative features of the Item Bank is its ability to manage the audio
tracks of the Listening tasks. Creating high quality audio is normally a time consuming
and expensive operation. Traditionally the full length track of a task has been created
in one go and stored as an audio-file. If a change is made to this task at a later stage,
the audio-file is no longer usable and a completely new recording is thus required. To avoid this, an audio segmentation model has been developed whereby the audio is recorded as a set of the shortest possible fragments. The various fragments are stored along with the other resources of the task and are assembled into full-length audio-tracks when the test materials are produced.
The basic principles of this audio segmentation model are shown below. The model consists of:

- Test level segments, which are reusable segments used to introduce and end the test as a whole, as well as to introduce the next task within the test.
- System level segments, which contain the fixed task rubrics as well as shorter
The model also specifies the number of seconds of silence between the various types
of segments when these are assembled into full-length audio-tracks. The assembly of
segments is handled by the system when the various test series are defined and as an
early step of the test material production process.
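A minimal sketch of how audio fragments might be concatenated into a full-length track with fixed silences between segments is shown below, using only the standard javax.sound.sampled API. The file names and the three-second gap are illustrative assumptions; the report does not describe the production chain at this level of detail.

// Illustrative sketch: assemble task audio fragments into one full-length track,
// inserting a fixed number of seconds of silence between segments.
import javax.sound.sampled.*;
import java.io.*;
import java.util.List;

public class AudioAssemblySketch {

    /** An AudioInputStream of silence of the given length, matching the format. */
    static AudioInputStream silence(AudioFormat format, double seconds) {
        int frames = (int) (format.getFrameRate() * seconds);
        byte[] zeros = new byte[frames * format.getFrameSize()];
        return new AudioInputStream(new ByteArrayInputStream(zeros), format, frames);
    }

    /** Concatenate segments (all in the same format) with silence between them. */
    static AudioInputStream assemble(List<File> segments, double gapSeconds) throws Exception {
        AudioInputStream result = AudioSystem.getAudioInputStream(segments.get(0));
        AudioFormat format = result.getFormat();
        for (int i = 1; i < segments.size(); i++) {
            AudioInputStream next = AudioSystem.getAudioInputStream(segments.get(i));
            AudioInputStream gap = silence(format, gapSeconds);
            long frames = result.getFrameLength() + gap.getFrameLength() + next.getFrameLength();
            result = new AudioInputStream(
                    new SequenceInputStream(result, new SequenceInputStream(gap, next)),
                    format, frames);
        }
        return result;
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical segment files: test introduction, task rubric, item prompt.
        List<File> segments = List.of(
                new File("test_intro.wav"), new File("task_rubric.wav"), new File("item_1.wav"));
        try (AudioInputStream track = assemble(segments, 3.0)) {
            AudioSystem.write(track, AudioFileFormat.Type.WAVE, new File("full_track.wav"));
        }
    }
}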
Each line in this table is a task and each column is a test (or more accurately, the
Reading section of a test administration). In this example a total of 25 individual tasks
are used to construct a series of 36 unique tests. The 8 tests to the left are A1-A2, the
middle group of tests are A2-B1 and the rightmost group are B1-B2. See section 2.5
for further details on the test design.
The illustration below focuses on the A1-A2 group of tests:
Figure 26 Example of a test design for a single difficulty level of a single skill
section
The first column in this table contains the task identifiers. The second column shows the testing time of each task, and the bottom row shows the overall testing time for this skill section in each test. The column to the right shows the number of tests each task is included in. Coloured cells signal that a task is used in a test, and the number shows the sequence in which the selected tasks will appear in the test. Similar designs are developed for Listening and Writing.
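A hedged sketch of how such a design matrix might be represented and checked is shown below: rows are tasks, columns are tests, and a non-zero cell gives the task's position within that test. The task identifiers, timings and matrix values are invented for illustration.

// Illustrative sketch of a tasks-by-tests design matrix.
public class TestDesignSketch {

    public static void main(String[] args) {
        String[] taskIds  = {"R-A1-01", "R-A1-02", "R-A2-01", "R-A2-02"};
        int[] taskSeconds = {300, 300, 420, 420};

        // 4 tasks x 3 tests; e.g. test 0 presents task 0 first, then task 2.
        int[][] design = {
                {1, 0, 2},
                {0, 1, 0},
                {2, 2, 0},
                {0, 0, 1},
        };

        // Per-test total testing time (the bottom row of the table).
        for (int test = 0; test < design[0].length; test++) {
            int total = 0;
            for (int task = 0; task < taskIds.length; task++) {
                if (design[task][test] > 0) total += taskSeconds[task];
            }
            System.out.println("Test " + test + ": " + total + " seconds");
        }

        // Per-task usage count (the column to the right of the table).
        for (int task = 0; task < taskIds.length; task++) {
            int uses = 0;
            for (int cell : design[task]) if (cell > 0) uses++;
            System.out.println(taskIds[task] + " used in " + uses + " tests");
        }
    }
}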
The Test Assembly Tool has a specialized graphical interface allowing the user to specify the test designs as illustrated above. Tasks in the Item Bank which are signed off, and thus approved for inclusion in a test, are dragged and dropped into the first column of the table. The content of each single booklet or test sequence is then defined by clicking the cells of the table.
The Tool has another graphical interface which allows the user to inspect and preview
the various test sequences which are created. An example of this interface is shown in
Figure 27.
Figure 27 Test preview interface
The interface provides a graphical overview of the test series using colour-coded bars
to indicate tasks of different difficulty levels. The length of a task bar is proportional to
the length of the task (in seconds). Each line in this display is thus a booklet or test
sequence. By clicking the buttons to the right, the user can either preview how a test will render in the test rendering tool for computer-based testing, or produce and inspect the test booklet that will be used for paper-based testing. This functionality proved very useful in the final rounds of quality assurance of the test materials.
6.9 Allocation
The second role of the Test Assembly Tool is to decide which booklet or task sequence each single student will get. To accomplish this, the system combines information from the previously described test design with information about the student samples from the Sample Data Base. The latter is a database designed to enter and store the information about student samples (see chapter 4 for further information about the sampling process).

The allocation is partly targeted and partly random. The system makes sure that each single student receives a test sequence which corresponds to their proficiency level, as indicated by the routing test. The rest is random. For each single student the system first selects the two skills in which the student will be tested (from Listening, Reading and Writing); booklets for these skills are then randomly selected from the available booklets at that level.
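A minimal sketch of this targeted-plus-random allocation logic is given below. The skill codes and booklet identifiers follow the pattern seen in the allocation log, but the pools themselves, and the routing-level keys, are assumptions for illustration; the real system also draws on the Sample Data Base.

// Illustrative sketch: each student is assigned two of the three skills at random,
// and a booklet matching the student's routing level is drawn at random per skill.
import java.util.*;

public class AllocationSketch {

    static final List<String> SKILLS = List.of("L", "R", "W");

    // Hypothetical booklet pools per skill for routing level 1 (illustrative only).
    static final Map<String, Map<Integer, List<String>>> BOOKLETS = Map.of(
            "L", Map.of(1, List.of("EL1/1", "EL1/2")),
            "R", Map.of(1, List.of("ER1/1", "ER1/2")),
            "W", Map.of(1, List.of("EW1/1", "EW1/2")));

    static Map<String, String> allocate(int routingLevel, Random rnd) {
        // Targeted part: only booklets at the student's routing level are eligible.
        // Random part: which two skills, and which booklet within each pool.
        List<String> skills = new ArrayList<>(SKILLS);
        Collections.shuffle(skills, rnd);
        Map<String, String> result = new LinkedHashMap<>();
        for (String skill : skills.subList(0, 2)) {
            List<String> pool = BOOKLETS.get(skill).get(routingLevel);
            result.put(skill, pool.get(rnd.nextInt(pool.size())));
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(allocate(1, new Random())); // e.g. {R=ER1/2, W=EW1/1}
    }
}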
The allocation process is managed through the interface displayed in Figure 28 below:
Figure 28 Allocation interface
The user will first decide which educational system (sample) and testing language the allocations should be run for. Secondly, the relevant test designs for the various skills are indicated. An allocation for an educational system/language combination takes about a minute and produces a log where the user can inspect the properties of the allocations. Due to the random nature of the allocation process, the resulting allocation will in many cases be biased in one direction or another. Normally it will therefore be necessary to run the allocation process a few times before a balanced allocation appears. Each allocation was reviewed and, when satisfactory, signed off by the Project Director.
An example of the allocation log is displayed in Figure 29 below:
Figure 29 Allocation log example for sample country
Sample country is 100% paper-based
Target language: English
TOTAL NUMBER OF STUDENTS: 1000
NUMBER OF ALLOCATED STUDENTS: 1000
ALLOCATION TIME: 01:04
ERRORS DURING ALLOCATION: No errors
STUDENT DISTRIBUTION BY TEST TYPE:
paper-based: 1000
STUDENT DISTRIBUTION BY SKILL:
L: 656
R: 667
W: 677
STUDENT DISTRIBUTION BY DIFFICULTY:
1: 153
2: 322
3: 525
STUDENT DISTRIBUTION BY TEST:
01. EL1/1: 50
02. EL1/2: 50
03. EL2/3: 104
04. EL2/4: 118
05. EL3/5: 171
06. EL3/6: 163
07. ER1/1: 6
This is the log for a sample country consisting of 1000 students to be tested in paper-based mode. We can see that the distribution by skill is balanced. By inspecting the distribution by test, we can also assess to what extent the algorithm has produced a balanced allocation across the various test booklets that are available.
The following table shows the percentages of students for each educational system
allocated to each testing mode (paper or computer-based).
Table 26 Number of students allocated Main Study tests by educational system and mode
[Table 26 lists, for each participating educational system (the French, German and Flemish Communities of Belgium, Bulgaria, Croatia, England, Estonia, France, Greece, Malta, the Netherlands, Poland, Portugal, Slovenia, Spain and Sweden), the share of students allocated to computer-based (CB) and paper-based (PB) testing. Grand Total: CB 28.5% (16,390 students), PB 71.5% (42,487 students).]
and adds the request to the production queue. Several production requests can be started and run in parallel. Because the system produces individualized test materials for each single student, the production for a single educational system/target language combination can take several hours.
Figure 30 Test materials production interface
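A hedged sketch of how several production requests might be queued and run in parallel is shown below; the request type and thread-pool size are assumptions and not the actual SurveyLang implementation.

// Illustrative sketch: production requests placed on a queue and run in parallel.
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class ProductionQueueSketch {

    record ProductionRequest(String educationalSystem, String targetLanguage) {}

    public static void main(String[] args) {
        ExecutorService queue = Executors.newFixedThreadPool(4);
        List<ProductionRequest> requests = List.of(
                new ProductionRequest("Sample country", "English"),
                new ProductionRequest("Sample country", "French"));
        for (ProductionRequest request : requests) {
            queue.submit(() -> System.out.println("Producing individualized materials for "
                    + request.educationalSystem() + " / " + request.targetLanguage()));
        }
        queue.shutdown();
    }
}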
When completed, the materials for paper-based testing are copied onto DVDs (booklets) and CDs (audio) and distributed to the countries. The materials for computer-based testing are transferred to the USB memory stick production unit for the physical production of test USB memory sticks. All of these steps were subject to quality control, with manual checks and sign-off of each stage in the process.
The navigation bar at the bottom of the screen is used to navigate through the test and also to inform the student about the length and structure of the test. Colour codes indicate whether or not a task has been completed.
Prior to testing, students were referred to the ESLC Testing Tool Guidelines for Students and to demonstrations available on the SurveyLang website (http://www.surveylang.org/Project-news-and-resources/CB-Familiarisationmaterials.html). This ensured that students were familiar with both the task types and the software system before taking the actual Main Study tests.
To upload the student test data files, a Data Upload Service was provided. This is a web-based solution (implemented in Java) which extracts the relevant data from the test USBs one by one. The solution checks the integrity of the student data by opening the files and comparing the incoming data with information from the sample database. The system also provides a log where the Test Administrator can check whether or not all data files have been uploaded.
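A minimal sketch of the kind of integrity check described above is given below: the uploaded student data files are compared against the expected student identifiers from the sample database. The file naming scheme and identifiers are assumptions for illustration.

// Illustrative sketch of an upload integrity check: every expected student should
// have a data file on the USB stick, and no unknown files should be present.
import java.io.File;
import java.util.*;

public class UploadCheckSketch {

    /** Returns a simple log of missing and unexpected student data files. */
    static List<String> checkUpload(Set<String> expectedStudentIds, File usbDataDir) {
        List<String> log = new ArrayList<>();
        Set<String> found = new HashSet<>();
        File[] files = usbDataDir.listFiles((dir, name) -> name.endsWith(".dat"));
        for (File file : files == null ? new File[0] : files) {
            String studentId = file.getName().replace(".dat", "");
            if (expectedStudentIds.contains(studentId)) found.add(studentId);
            else log.add("Unexpected file: " + file.getName());
        }
        for (String id : expectedStudentIds) {
            if (!found.contains(id)) log.add("Missing data for student " + id);
        }
        return log;
    }

    public static void main(String[] args) {
        Set<String> expected = Set.of("ST-0001", "ST-0002", "ST-0003");
        checkUpload(expected, new File("/media/usb/testdata")).forEach(System.out::println);
    }
}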
6.17 Performance
The developed technology proved to be very efficient and robust. The system provided the necessary support for the test developers and reduced manual errors to a minimum. The number of mistakes and errors in the computer-based and paper-based test materials was negligible in the Field Trial as well as in the Main Study. It would have been impossible to implement a survey of this complexity without the support of a system like this.

The system also performed well during the test delivery phase. In the Field Trial, a data loss of 1.9 percent was caused by different types of issues related to the computer-based testing tools. To reduce this number even further, we systematically addressed all issues reported from the Field Trial and improved the system wherever possible. This was done by reviewing all issues reported to SurveyLang during the Field Trial and through the NRC feedback report and Quality Monitor reports (see Chapter 8 for further details). The data loss in the Main Study was consequently reduced to 0.75 percent. These figures include all data loss occurrences, not only those relating to computer-based administration.