Development and trialling of Computer-Based Cambridge English: Young Learners tests
LTF conference, 17 Nov 2013
Szilvia Papp & Agnieszka Walczak
© UCLES 2013
Overview
• Introduction
• What’s new and what’s not new in CB YLE?
• Stages of development and trialling
• Examples of validation studies
• Results of PB/CB comparability study
What’s not new in CB YLE?
• test content – identical syllabus, same wordlists, same grammar and structures list, same topics
• task types – identical across PB and CB
• number of questions and tasks – identical
• overall timing of papers – identical
• marking – identical, speaking assessed by same YLE examiners
• level of difficulty – all material calibrated to same standard
• results – shields for each skill and same YLE certificate
• purpose – to measure children’s language ability, offering candidates a positive, fun experience that has a beneficial impact on future language learning
What’s new in CB YLE?
• new style graphics
• navigation: arrows and light bulbs
• response mechanisms
• test functionality
– adjustable sound volume
– onscreen keyboard
– enlargeable graphics
• onscreen timer
• simplified device-neutral rubrics
Test Format – what’s new?
• response mechanisms
What’s not new in CB YLE Speaking?
• same 1:1 ratio
• same examiner script
• same visual prompts
• same timings
What is new in CB YLE Speaking?
• speaking test delivered via computer
• uniform test experience
• animated characters to indicate speaking window
• no examiner support
• CB marking process
CB YLE development & trialling
Development started: Autumn 2011 (Papp, Dec 2011)
Stages of trialling
Movers & Flyers
• March-Nov 2012: China (Shanghai; Beijing, Shanghai, Guangzhou, Suzhou)
• January 2013: Hong Kong
• April 2013: Spain demo road show
Starters
• July-Aug 2013: Hong Kong
Starters, Movers & Flyers
• August 2013: Mexico
• Sep 2013: Spain
• Nov 2013: Mexico, Argentina, Spain, Hong Kong (launch)
Trial reports: (Galaczi & Miller, Apr 2012), (Papp, Khabbazbashi & Miller, June 2012), (Papp & Miller, March 2013), (Papp, June 2013), (Papp & Miller, Sep 2013), (Papp and Walczak 2013)
Aim of trials
To ensure comparability between CB and PB/f2f
In CB Listening (L) and Reading & Writing (RW), this involved fine-tuning functionality and ensuring that delivery mode has no impact on marks.
In CB Speaking, this involved investigating
– quality of sound files
– timing
– support
– marking.
Timing in Speaking tests
Measuring length of responses in f2f Starters Speaking tests

Response length    Examiners (No. = 17)    Candidates (No. = 17)
Under 1 second*    8%                      41%
1 second           26%                     34%
2 seconds          30%                     9%
3 seconds          15%                     5%
4 seconds          7%                      2%
5 seconds          4%                      2%
6+ seconds         10%                     7%
Total              738                     658
* average: 75 ms examiners, 73 ms candidates
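A rough sketch of how response lengths could be tallied into the one-second bands used in the table above. The durations are invented for illustration, and the binning rule (each response goes into the whole second it falls in, with everything from 6 seconds up grouped together) is an assumption, since the slide does not say how the bands were defined.

```python
from collections import Counter

# Band labels follow the table; the binning rule itself is an assumption.
BANDS = ["Under 1 second", "1 second", "2 seconds", "3 seconds",
         "4 seconds", "5 seconds", "6+ seconds"]


def band_for(duration_s):
    """Map a response duration in seconds to one of the table's bands."""
    whole_seconds = int(duration_s)       # truncate to the whole second
    return BANDS[min(whole_seconds, 6)]   # 6 seconds and above share one band


def band_percentages(durations):
    """Return the rounded percentage of responses falling in each band."""
    counts = Counter(band_for(d) for d in durations)
    total = len(durations)
    return {band: round(100 * counts[band] / total) for band in BANDS}


# Hypothetical durations (in seconds) for ten candidate responses.
sample = [0.4, 0.7, 0.9, 0.2, 1.3, 1.8, 1.1, 2.5, 3.9, 7.2]
percentages = band_percentages(sample)
```

With real timing data, the same tally run separately for examiners and candidates would give the two columns of the table.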
Timing in Speaking tests
Measuring length of processing time (including hesitation phenomena) in f2f Starters Speaking tests

Processing time    Response frequency    Average length    Maximum        Total
Noemi              27 times              1.7 seconds       4.7 seconds    46.4 seconds
Cheng Quing        29 times              1 second          3.2 seconds    28.9 seconds
Timing in Speaking tests
Fine-tuning length of response windows in CB Movers
Part 1: Spot the difference
Task description: Describing 2 pictures by using short responses
Input: 2 similar pictures, 4 questions
Part 2: Telling a story
Task description: Understanding the beginning of a story and then continuing it based on a series of pictures
Input: 1 picture sequence (consisting of 4 pictures)
Part 3: Odd one out
Task description: Suggesting a picture which is different from a group of pictures and explaining why
Input: 3 questions about picture sets
Part 4: Question and Answer
Task description: Understanding and responding to personal questions
Input: 4 open-ended questions about the candidate

Candidate thinking and talking time allowed
Part                            March 2012 Shanghai trial            May 2012 Beijing trial               Jan 2013 Hong Kong trial
Part 1                          25 seconds                           30 seconds                           30 seconds
Part 2                          35 seconds (approx 12 per picture)   42 seconds (approx 14 per picture)   45 seconds (approx 15 per picture)
Part 3                          15 seconds per question (45 total)   15 seconds per question (45 total)   15 seconds per question (45 total)
Part 4                          15 seconds per question (60 total)   12 seconds per question (48 total)   12 seconds per question (48 total)
Total speaking time available   2 min 45 seconds (165 seconds)       2 min 45 seconds (165 seconds)       2 min 48 seconds (168 seconds)
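The totals in the last row follow directly from the per-part windows. A minimal sketch of that arithmetic, using the figures from the table above (the dictionary layout and function names are illustrative only):

```python
# Seconds allowed per part in each trial, copied from the table above.
# Parts 3 and 4 are written as (number of questions) * (seconds per question).
WINDOWS = {
    "March 2012 Shanghai trial": {"Part 1": 25, "Part 2": 35, "Part 3": 3 * 15, "Part 4": 4 * 15},
    "May 2012 Beijing trial": {"Part 1": 30, "Part 2": 42, "Part 3": 3 * 15, "Part 4": 4 * 12},
    "Jan 2013 Hong Kong trial": {"Part 1": 30, "Part 2": 45, "Part 3": 3 * 15, "Part 4": 4 * 12},
}


def total_speaking_time(parts):
    """Sum the response windows (in seconds) over all four parts."""
    return sum(parts.values())


def as_min_sec(seconds):
    """Format a number of seconds in the 'M min S seconds' style of the slide."""
    minutes, rest = divmod(seconds, 60)
    return f"{minutes} min {rest} seconds"


totals = {trial: total_speaking_time(parts) for trial, parts in WINDOWS.items()}
for trial, seconds in totals.items():
    print(f"{trial}: {as_min_sec(seconds)} ({seconds} seconds)")
```

The sums show that the first two trials redistributed time between parts while keeping the total at 165 seconds, and that the Hong Kong trial added 3 seconds to Part 2.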
Support in Speaking tests
Quantifying and classifying examiner support in f2f YLE Speaking tests
Trish: Hm. Put the tomato in front of the train. Where’s the train? Where is the train? [circling] Is this the train? [pointing]
George: You put the pencil on the box. [gives him card] Where is the box? On the box. [pointing]
Starters
(N = 12)
Ave
%
Movers
(N = 12)
%
Flyers
(N = 12)
Ave
%
15.6
35%
acknowledging
candidate response
Aha. OK. Good.
Right. I see.
17.8
50%
10.3
23%
back-channelling
Uhum. Yes.
4.7
13%
3.3
9%
Ave
acknowledging
candidate response
Aha. OK. Good.
Right. I see.
18.2
50%
acknowledging
candidate response
Aha. OK. Good.
Right. I see.
pointing to direct
candidate’s attention
5.3
15%
pointing to direct
candidate’s attention
8.7
19%
checking
comprehension
OK? Hm? Yes?
asking a YES/NO
question
4.2
11%
back-channelling
Uhum. Yes.
repeating question,
And…?
3.8
11%
asking a YES/NO
question
2.8
6%
pointing to direct
candidate’s attention
2.8
8%
2.5
6%
repeating question,
And…?
2.6
7%
back-channelling
Uhum. Yes.
3.3
9%
checking
comprehension
OK? Hm? Yes?
checking
comprehension
OK? Hm? Yes?
0.8
2%
repeating question,
And…?
1.7
4%
asking a YES/NO
question
2.2
6%
asking a Wh-question
0.5
1%
asking a Wh-question
1.6
4%
asking a Wh-question
1.7
5%
supplying the answer
0.2
0%
giving first half of
response
1.2
3%
supplying the answer
0.4
1%
giving first half of
response
0.0
0%
supplying the answer
0.5
1%
giving first half of
response
0.0
0%
Marking in Speaking tests
Movers
                                                 PB            CB            CB            CB            CB
                                                 March 2012    March 2012    May 2012      July 2012     Jan 2013
                                                 Shanghai      Shanghai      Beijing       China         Hong Kong
                                                 N=25          N=25          N=39          N=259         N=25
Total mark                                       8.60 (.76)    8.41 (.87)    6.91 (1.84)   6.21 (2.39)   7.00 (1.63)
Reception: Listening & responding                2.92 (.28)    2.92 (.27)    2.41 (.77)    2.10 (.88)    2.36 (.69)
Production: Appropriacy, extent and promptness   2.72 (.46)    2.61 (.49)    1.95 (.73)    1.89 (.85)    2.08 (.72)
Production: Grammar & vocabulary                 -             -             -             -             -
Production: Pronunciation                        2.96 (.20)    2.88 (.33)    2.37 (.77)    2.22 (.84)    2.34 (.69)

Flyers
                                                 PB            CB            CB            CB            CB
                                                 March 2012    March 2012    May 2012      July 2012     Jan 2013
                                                 Shanghai      Shanghai      Beijing       China         Hong Kong
                                                 N=23          N=23          N=32          N=263         N=25
Total mark                                       11.13 (1.29)  11.01 (1.24)  9.19 (2.18)   9.64 (2.36)   9.52 (3.38)
Reception: Listening & responding                2.83 (.39)    2.93 (.25)    2.56 (.67)    2.60 (.70)    2.50 (.84)
Production: Appropriacy, extent and promptness   2.83 (.39)    2.62 (.49)    2.00 (.67)    2.24 (.68)    2.40 (.90)
Production: Grammar & vocabulary                 2.61 (.50)    2.59 (.50)    2.03 (.65)    2.18 (.67)    2.46 (.86)
Production: Pronunciation                        2.87 (.34)    2.87 (.37)    2.59 (.61)    2.62 (.69)    2.40 (.95)
PB/CB comparability study
Research design
• 129 Mexican & 219 Spanish Movers and Flyers trial candidates took a live paper-based (PB) test version & its computer-delivered format (CB) (all exam components)
• CB test taken either on PC/laptop or tablet
• 56 Starters trial candidates took CB test (all exam components) and a f2f Speaking test
Are PB & CB YLE tests comparable?
RQ1: How do PB and CB scores relate to each other?
RQ2: What explains trial candidate performance in PB and CB tests?
RQ3: If there are differences in scores in the two delivery modes, what can they be attributed to?
RQ1: Relationship between PB & CB scores for Movers & Flyers
(1) How do PB and CB scores relate to each other?
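The slide poses this question without naming the statistic used. As an illustration only, one common way to quantify how two sets of scores from the same candidates relate is the Pearson correlation coefficient. The sketch below uses invented scores and a hypothetical helper `pearson_r`; it is not a reconstruction of the study's analysis.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation between two equally long lists of scores."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length with at least 2 scores")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of cross-products and squared deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one list is constant")
    return cov / sqrt(var_x * var_y)


# Hypothetical PB and CB totals for the same six candidates.
pb_scores = [10, 12, 14, 9, 13, 11]
cb_scores = [11, 12, 13, 9, 14, 10]
r = pearson_r(pb_scores, cb_scores)
print(f"r = {r:.2f}")
```

A value close to 1 would indicate that candidates are ranked similarly by the two delivery modes; it says nothing by itself about whether one mode yields higher scores.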
RQ2: Performance in PB & CB tests for Movers & Flyers
(2) What explains trial candidate performance in PB and CB tests?
• For Movers and Flyers separately, the effects of
– individual background variables (age, gender)
– years of English instruction
– preference for exam mode (on computer, on paper [B], no difference)
– frequency of computer use (every day, once or twice a week, at weekends [B])
– reason for computer use (English homework, games, email/chat, other)
– type of computer at home (PC/laptop [B], tablet, combination)
• For Movers and Flyers, the effects of computer device (PC/laptop, tablet) used in CB tests on candidate performance
Labels on slide: M1&2, M3, M4
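The categorical variables listed above each carry one category marked [B]. Assuming that [B] denotes the baseline (reference) category against which the other categories are compared, which the slide does not state explicitly, the coding could be sketched as follows. The variable names and the `dummy_code` helper are hypothetical.

```python
# Categories as listed on the slide. The first entry in each list is the one
# marked [B], ASSUMED here to be the baseline (reference) level.
LEVELS = {
    "exam_mode_preference": ["on paper", "on computer", "no difference"],
    "computer_use_frequency": ["at weekends", "every day", "once or twice a week"],
    "computer_at_home": ["PC/laptop", "tablet", "combination"],
}


def dummy_code(record):
    """Turn one candidate's categorical answers into 0/1 indicator columns.

    The baseline level gets no column of its own: a candidate at baseline
    has 0 in every indicator for that variable.
    """
    row = {}
    for variable, levels in LEVELS.items():
        value = record[variable]
        if value not in levels:
            raise ValueError(f"unknown level {value!r} for {variable}")
        for level in levels[1:]:  # skip the baseline level
            row[f"{variable}={level}"] = 1 if value == level else 0
    return row


# Hypothetical questionnaire answers for one candidate.
candidate = {
    "exam_mode_preference": "on computer",
    "computer_use_frequency": "every day",
    "computer_at_home": "PC/laptop",
}
coded = dummy_code(candidate)
```

Under this coding, an effect reported for "every day" computer use would be read as a difference relative to candidates who use a computer only at weekends.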
Statistical tests show that there is a curvilinear relationship between PB/CB total scores and age for Movers and for Flyers.
MOVERS
There is a difference in the performance of Mexican and Spanish Movers trial candidates.
For Movers, only age matters for explaining trial candidate performance on PB and CB tests.
Movers trial candidates who use a computer every day perform significantly better than those who use a computer only at weekends, but the type of computer at home does not affect performance in the CB test.
FLYERS
In Flyers, trial candidates in Spain performed better than trial candidates in Mexico, both in PB and CB tests.
Flyers trial candidates with more years of English instruction and a preference for taking tests on computers perform better in the PB test and in the CB test, and boys perform better than girls in the CB test.
Frequency of computer use does not affect Flyers trial candidates’ performance in the CB test.
Flyers trial candidates who have a tablet at home perform significantly better than those with a PC/laptop, but the reason for computer use does not affect CB performance.
STARTERS
For Starters, years of English instruction have a positive effect on candidate performance on the CB test.
Effect of device used (iPad vs PC) on CB performance for Movers and Flyers
The device (iPad or PC) that trial candidates (Flyers & Movers) used in the test does not have a significant effect on their performance in the CB test.
RQ3: Difference in PB & CB scores
Difference in PB and CB scores is not affected by frequency of computer use or type of computer at home.
The test order does not explain the difference in trial candidates’ PB and CB scores.
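A minimal sketch of the kind of comparison behind such a statement: compute each candidate's CB minus PB difference, then compare the mean difference across groups, for example by test order. The records, field names and group labels are invented, and the study's actual statistical procedure is not given on the slides.

```python
from statistics import mean


def mean_difference_by(records, key):
    """Mean (CB - PB) score difference for each value of a grouping field."""
    groups = {}
    for record in records:
        difference = record["cb"] - record["pb"]
        groups.setdefault(record[key], []).append(difference)
    return {group: mean(diffs) for group, diffs in groups.items()}


# Hypothetical trial records: PB total, CB total, and which test came first.
records = [
    {"pb": 30, "cb": 31, "order": "PB first"},
    {"pb": 25, "cb": 24, "order": "PB first"},
    {"pb": 28, "cb": 28, "order": "CB first"},
    {"pb": 33, "cb": 34, "order": "CB first"},
]
by_order = mean_difference_by(records, "order")
```

Similar mean differences across groups would be consistent with the grouping variable not explaining the PB/CB gap; a significance test on real data would still be needed to support that conclusion.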
Summary
• CB and PB provide comparable results
• Scores are affected by the same variables on PB and CB: years of English instruction, preference for computer
• Difference between PB and CB scores is not affected by individual variables that might put some children at a disadvantage
• The device on which candidates took the CB test does not have an effect on CB performance
Conclusions
For YLE candidates
• A real choice between PB and CB YLE tests, in line with Cambridge English’s ‘test for best’ principle
• CB YLE tests are an intuitive, accessible, contemporary, fun alternative way to assess children’s language ability.
For Cambridge English
• CB YLE tests provide access to learner performance data and examiner scores which allow
– on-going research into the nature of the CB test construct and interaction
– review and QA of test material and assessment criteria
– refinement of data-based scales and performance descriptors.
Candidate feedback (Flyers, Hong Kong)
I enjoyed taking the test on the computer because it was easy to use (Taylor Holly Nor Chen)
I liked the speaking test the most. (Adam Chris Wong)
Yes, because I can use the computer to do the test which I think it’s not bored. (Cheuk Long Ngan)
Speaking - It's fun/special - I can say to the computer
I enjoyed taking the test because it was easy and fun and helped my english.
Yes, because it is not easy and not too hard, it just right.
Because I learnd new things.
Candidate feedback (Starters, Hong Kong)
Computer / Face-to-face
• very funny
• I like compewter
• relax using computer, real human nervous
• I don't like the human
• because I like to talk to men
• because I can listen the teacher voile caily / because rew people I can hear caily
• can speak more
• speak slower
Observer comment: They seemed really eager and keen in speaking to the computer - spoke freely and followed instructions well.
Candidate feedback (Starters, Hong Kong, Spain)
Computer
Face-to-face
Yuk Ting Tina Wong, Starters trial candidate, age 8, Hong Kong
Marta Asuncion Lacomba Gascó, Starters trial candidate, age 9, Spain
Hiu Ching Chow, Starters trial candidate, age 5, Hong Kong
Yuet Yiu Cheung, Starters trial candidate, age 7, Hong Kong
Lucia Ariza Espejo, Starters trial candidate, age 7, Spain
Candidate feedback (Starters, Spain)
Ana De Hevia Selma, age 8, Spain
Sergio Ruiz Lozano, age 9, Spain
Marta Asuncion Lacomba Gascón, age 9, Spain
Elena Abadía Gonzalvo, age 9, Spain
Candidate feedback (Movers & Flyers, Mexico)
Natalia Moreno Trejo, Flyers trial candidate, age 12, Mexico
Jose Daniel Hurtado Bravo, Flyers trial candidate, age 12, Mexico
Selected candidate testimonials (Spain)
I enjoyed the computer exam, it was like a game - it was fun. I would tell my friends to take the exam because it's from Cambridge and they study a lot for this.
Javier Gámiz Fernández, Flyers trial candidate, age 11, Spain
I enjoyed taking the exam on the computer because you don't get as nervous and it is more fun. The best bit was the listening exercise. I would recommend it to my friends because it's a difficult exam that's fun at the same time.
Marti Ambros Viedma, Movers trial candidate, age 9, Spain
I like it - it's quicker and more fun, to tell you the truth I liked all of it, but if I had to choose one part it would be the speaking. I would recommend it to my friends, I would tell them: try it, it's fun and not boring!
Maria Palazon Guerrero, Movers trial candidate, age 11, Spain
I enjoyed taking the test on the computer - it's very fun. I would tell my friends to do the exam because it's fun, cool and entertaining.
Antonio Hidalgo Calderón, Starters trial candidate, age 8, Spain
Selected parental testimonials (Spain)
The teacher recommended that my children try the exam. They enjoyed the test because it was easier to correct yourself if you make a mistake, and it's more comfortable than the paper-based exam. I would recommend it then because the children enjoyed it, and I think it's more environmentally-friendly than on paper.
Parent of Lidia Lopez Nuñez, Starters trial candidate, age 10, Spain
Our child took the test because it seemed a good experience and you could learn how good your child is with language. She liked the listening exercises because you can hear really well with the headphones, it's easier to concentrate.
Parent of Pilar Herbella Narváez, Flyers trial candidate, age 11, Spain
My child took the test to gain more knowledge. She said it was like a game and as a mother I have seen more motivation with the computer and overall.
Parent of Lucia Arize Rubio Barreda, Starters trial candidate, age 8, Spain