Welcome to
“Testing: A Double-sided Analysis of
Testing Tools and Techniques”
(The two sides of a coin)
Prof. Juan Díaz – 2009
Content of the workshop/talk:
1. The teacher/tester personality
2. Testing activities: pros and cons,
and solutions
Testing: A Double-sided Analysis of Classroom Practices
and Testing Tools and Techniques at School.
Juan Oscar Díaz (2009)
Abstract
Teaching and testing should be considered two phases of the same process of education. They should
be viewed as part of a cyclical process. Our educational system and educationalists have encouraged and
insisted on the idea that teaching and testing are two different instances in education that should be kept well
apart, each bearing its own rules, conditions, expected behaviors and set of attitudes. These beliefs are so deeply
rooted in our culture that even well-meaning individuals with a robust theoretical background find it hard to shun
this view and stop acting accordingly.
As a result of this divide between teaching and testing, two very different, clashing and sometimes
schizophrenic personalities emerge, that of the teacher and that of the tester. At a closer look, this phenomenon
may pose several problems and inconveniences, to say the least, to all the participants in the process of teaching
and learning.
This personality split is more than apparent in the wide gap existing between teaching and testing
practices, instruments, behaviors, attitudes, emotions, and other variables. Becoming a better educationalist and
teacher starts with an exploratory journey into ourselves, the language we use, and the non-judgemental
observation of some of our basic human emotions.
In its first section, this workshop/course aims to explore our own attitudes and beliefs and to raise
awareness of the great divide that separates our teaching practices from our testing ones. We also seek to
evaluate well-known (but usually misunderstood, ill-employed or badly constructed/applied) testing tools and
techniques in order to become familiar with the pluses and minuses of each of them and to be able to construct
testing/assessment instruments that are finely tuned and geared to our learners and to the learning objectives set
for each particular course or learning event.
Key words: teaching vs. testing, assessment, awareness, attitudes and beliefs, classroom practices, testing tools,
instruments.
Biodata
Juan Oscar Díaz is a teacher of English Language and Literature for higher education, a translator of English, and
he has successfully completed the Master’s Degree programme in Applied Linguistics at Facultad de Lenguas,
Universidad Nacional de Córdoba.
He has been a teacher of Reading Comprehension (ESP) in the engineering courses (electronics,
telecommunications, aeronautics and computing) at Instituto Universitario Aeronáutico (IUA). He has also been
chief instructional designer and the coordinator of the Proficiency Test at the Facultad de Ingeniería. He has
worked as a learning designer at Asociación de Investigaciones Tecnológicas on the design and development
phases of e-learning courses.
He has held the chairs of English Language 3, 4 and 9, Grammar 3 and 4, and Linguistics at Siglo 21
University and has also been in charge of mentoring undergraduates of the Licenciatura en Lengua Inglesa in
their Final Dissertations. He has worked as a counselor and facilitator for e-learning, blended and face-to-face
courses.
He has trained staff at Siglo 21 University, the Language Department of IUA, IDIPE Empresas (an in-company
training institute), the German School, the National University of Villa María, the National University of Santiago
del Estero, and other institutions in the instructional design of reading comprehension material and other related topics.
He has been counselor and lecturer in the articulation project between the National University of Villa María and
Secondary School Education: Reading Comprehension and Instructional Design and has also lectured at several
ACPI-organized seminars, courses and workshops. He has participated in National and International
Conferences.
He has published reading comprehension teaching materials and courses in both English and
Spanish. He is the author of E-Learning English: Strategic Reading and Listening for Business, a series of three e-books for business English addressing the two receptive skills of reading and listening comprehension from a
strategy-based instructional approach.
At present he is working for Trainex Group, a Buenos Aires-based instructional design company
working for the Royal Dutch Shell Company. At Trainex, Juan is project manager, coordinator and supervisor of
blended learning developments based on e-learning formats. The job involves negotiation with clients, project
management, distribution and allocation of human resources, quality control, staff training, etc. He is also
working as a teacher of conversation and reading comprehension at the Engineering School of the Catholic
University of Córdoba.
Meet some of our experts…
Salim Razi
Douglas Brown
Elana Shohamy
Penny Ur
Charles Alderson
John Munby
Andrew Cohen
Arthur Hughes
Tim McNamara
Barbara Dobson
Cyril Weir
Participants in the Testing Process/Event
Boring - I see they’re frightened
Disappointed when I see the results
I don’t like preparing them or marking them
Hate to correct tests, exams, quizzes,….
The hours I spend on that, all for nothing
I don’t see that they’re much use
I’d like to tell them it’s not a matter of life or death
I’m not really sure what it is I’m assessing
Multiple choice - Failing marks make me really angry – They don’t study for ME
I try to be innovative but it isn’t easy
I already know who is going to pass and who isn’t
Correcting takes so much time
How to help students to feel better
Nobody likes them but they’re an obligation
Tiring - Not sure how to test something
Filling in grade sheets wears me out – Irritating
I use and reuse things I already have because otherwise I don’t have enough time
I always have problems knowing how to be fair
They copy from each other, you can tell, and it bothers me
I don’t know how to keep them under control
That day I’m a real witch
Making a study plan and having little time
Knowing who is marking and adapting what you write
I like it when the teacher helps me understand
I used to get very nervous and had stomach aches
I hated fill-in-the-blanks and trick multiple choice
Trick questions
Studying by heart… and now I scold my students for it!
I felt I wouldn’t be able to cope with so much
I used to get really anxious
Marks were the only thing I looked at
How am I doing? How much do I know? What are they going to ask me?
Difficulty
Results, if they depended on the teachers’ “mood” - I remember I cried several times at home
What if I do badly?
I felt like killing my brother….
Anguish in oral exams
Felt like crying with some teachers
Nerves, pressure, tension
A chance to show what I had learned
Discomfort, not knowing how I was going to be assessed
A certain boredom
Everything stayed at the theoretical level
Very patient
I love teaching
I see myself as a tolerant teacher
I think I’m clear and engaging
I have a very good relationship with my students
I like discipline, but also disorder and noise
I see myself as patient and efficient; I know how to do my job
I tend to help and to answer every question
I try to make sure they are prepared for mid-term tests and exams
Loved by almost all my students
I think I’m a good teacher
Impatient and always in a hurry
I don’t like exams
I can’t stand it when they cheat or show off
Sometimes I feel I’m not fair
They hate me in tests, I’m very demanding
I try to get them to notice their mistakes on their own
Sometimes I don’t know what to test them on
I don’t know if I’m fair in the way I mark,…
It annoys me when they ask me so many questions during exams
I’m a bit authoritarian
Sometimes I overdo it and aim too high with the activities
Very strict with time
How do we start making changes (for the better)?
Situation → Interpretation → Emotion → Behavior
Reflecting on our own “testing” practices.
• How often do you test your students?
• What type of instruments do you use?
• Do you test the four skills?
• What abilities do you find harder to test?
• How do you prepare your tests?
• Do you test only “book” material?
• Do you try to innovate in the type of exercises you include?
• Do you ever include a “surprise” element?
• Do you take a long time to prepare the test?
• What type of activities/exercises do you include?
• Why do you test your students?
• What type of exercises do you find harder to score?
• Do you write comments?
• Do you give feedback? What type? When? How?
• What do you do before the test? During the test? After the test?
• How do your students react to the tests you prepare?
• Where do you get the written and oral texts for your tests from?
• What aspects of your tests do you think need improving?
• What aspects are you satisfied with?
What does the TESTER do?
• Plans what to include in the test (according to objectives, range of topics covered, etc.)
• Prepares the test and the testing props
• Administers the test
• Marks and scores the test
• Hands out the tests with/without feedback
H. Douglas Brown, Language Assessment (2004)
A deeply-rooted dissociation
What is a test?
A method of measuring a person’s
ability, knowledge, or performance
in a given domain.
• A test is a method
• A test must measure
• A test measures an individual’s
ability, knowledge, or performance
• A test measures performance
(competence) – knowledge
• A test measures a given domain
Why take testing so seriously?
Tests represent a social technology deeply embedded in education, government, and business; as such, they
provide the mechanism for enforcing power and control. Tests are most powerful as they are often the single
indicators for determining the future of individuals.
TESTER → INSTRUCTIONS – TESTS – ACTIVITIES
How to test?
The best method!
• Multiple-choice tests
• True and False tests
• Gap-filling tests
• Questionnaires
• Ordering tasks
• Summary writing
• The cloze test
• The C-test
• The cloze-elide test
• Fill-in-the-blanks tests
Reading teachers’ discomfort
Although teachers use a variety of techniques in reading classes, they do not tend to use the same variety of
techniques when they administer reading tests.
Any teaching activity can easily be turned into a testing item, and vice versa.
No single method satisfies reading teachers: they test for different purposes.
Reading teachers: what they need when testing
• Selecting the most appropriate testing method for their students;
• Discrete-point techniques when they intend to test a particular point at a time;
• Integrative techniques when the aim is to see the overall picture of a reader.
Multiple-choice Activities: a definition
What is the right option to complete the blank in the following sentence?
“They ______ Mary with her boyfriend in the pub last night.”
1. looked
2. saw
3. turned
4. watched
A multiple-choice question
consists of a stem and a number
of options (usually 4), from which
the testee has to select the right
one.
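To make the structure concrete, here is a minimal sketch in Python (the class name MultipleChoiceItem and the dichotomous scoring are my own illustration, not part of the workshop materials) of an item as a stem plus four options with a single key:

# A minimal sketch (assumed names, not from the workshop materials) of a
# multiple-choice item: a stem, a set of options (usually 4) and exactly one key.
from dataclasses import dataclass

@dataclass
class MultipleChoiceItem:
    stem: str
    options: list        # usually four options
    key_index: int       # position of the single correct option

    def score(self, chosen_index: int) -> int:
        """Dichotomous scoring: 1 point if the testee selects the key, 0 otherwise."""
        return 1 if chosen_index == self.key_index else 0

item = MultipleChoiceItem(
    stem="They ______ Mary with her boyfriend in the pub last night.",
    options=["looked", "saw", "turned", "watched"],
    key_index=1,          # "saw"
)
print(item.score(1))      # 1
print(item.score(3))      # 0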
Multiple-choice Activities
Pluses:
1. It is time-saving to score.
2. It provides a very objective way of testing.
3. It is a widely-researched technique.
4. It provides testers with the means to control test-takers’ thought processes when responding.
5. It allows the tester to control the range of possible answers.
6. Testees are very familiar with the technique.
Minuses:
1. It tests only receptive knowledge/recognition.
2. Guessing may have a considerable but unknowable effect on scores.
3. It restricts what can be tested.
4. It is very difficult to write successful items.
5. Washback may be harmful.
6. Cheating is facilitated.
Multiple-choice Activities: some caution!
Being a good reader doesn’t guarantee being successful in multiple-choice tests.
Distracters may deliberately trick test-takers, which results in a false measurement.
Test-takers do not necessarily link the stem and the answer in the same way that the tester assumes.
Test-takers are provided with possibilities that they might not otherwise have thought of.
Multiple-choice Activities: difficult to construct?
It is very difficult to write successful items for a number of
reasons.
• Items meant to test grammar only may also test lexical knowledge.
• Distracters may be eliminated as being absurd.
• Correct responses may be common knowledge.
• Items are answerable on the basis of just memory and
not comprehension.
• Items get encumbered with material meant to trick the careless student (e.g., the use of double negatives).
• Items are constructed with more than one correct
answer (even if students have not been taught both
possible answers).
• Items are constructed with clues as to which is the
correct answer (the difference in length or structure).
• It may require training in improving educated guesses
rather than in learning the language.
Multiple-choice Activities: using errors
…the possibility of using high-frequency errors among students (in completion exercises) to construct the
distracters for grammar and vocabulary. …the answers provided by students in a blanks exercise were somehow
different from the options chosen by professionals to use in the multiple-choice exercise.
…all or most of the distracters are somewhat
appropriate, but only one is the best answer.
…suitable for developing students’ problem-solving abilities. It is the students’ task to work out why one option is
the best and why the others should be discarded. It may also be useful, at least from time to time, to use totally
implausible distracters, just to see which students are not actively engaged in the reasoning process of finding
the best answer.
Multiple-choice Activities
How about including errors as part of the options?
Ingram:
• incorrect forms in item elicitations are perfectly acceptable
• teaching and testing: a learning situation and a discrimination situation
Chastain:
• students make enough errors
• with some ingenuity, teachers can test common errors
Multiple-choice Activities
Multiple-choice editing task
Multiple-choice Activities: some useful tips!
• Only one option should be correct; otherwise, call the item multiple-response (M-R) and warn students.
• There should be no question about which option is right.
• Distracters must be plausible.
• Options should agree grammatically with the stem.
• Options should be balanced in terms of syntax, gender, number, verb tense, etc.
• Options should not repeat each other or use synonyms.
• They should be roughly the same length.
• One of the options should not help infer the right one.
• Display them at random unless a chronological or sequential order is needed.
• Options should include students’ common errors as distracters.
• Do not use “all of the above,” “none of the above,” “I don’t know,” etc. (they only show the tester’s inability to think of other plausible options).
• Avoid the use of negative forms (not, no, never, etc.) or absolute terms (never, always, totally).
• Do not repeat words in the stem and in the options.
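Several of these tips can be checked mechanically. The sketch below is a hypothetical “lint” pass (the function name lint_mc_item and the length threshold are assumptions, not a tool mentioned in the workshop) that flags an item with more than one key, with banned options, or with options of very uneven length:

# A hypothetical "lint" pass over a multiple-choice item, checking a few of the
# tips above. The threshold and names are assumptions, not from the workshop.
BANNED_OPTIONS = {"all of the above", "none of the above", "i don't know"}

def lint_mc_item(options, key_indices):
    warnings = []
    if len(key_indices) != 1:
        warnings.append("Not exactly one key: rewrite, or label the item M-R and warn students.")
    if any(opt.strip().lower() in BANNED_OPTIONS for opt in options):
        warnings.append("Avoid 'all/none of the above' and 'I don't know' options.")
    lengths = [len(opt) for opt in options]
    if max(lengths) > 2 * min(lengths):
        warnings.append("Options differ a lot in length; the longest one may give the key away.")
    return warnings

print(lint_mc_item(["looked", "saw", "turned", "none of the above"], [1, 3]))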
True & False Activities
Dichotomous Items (True-False Technique)
1. T – F (or Yes – No)
2. T – F – Not given
3. T – F + correcting
4. T – F + How do you know?
True & False Activities
Pluses:
1. They are easy to design (echoing statements?).
2. Easy and fast to score.
3. It provides a very objective way of testing.
4. It is widely used – most students are familiar with the technique.
5. They may tap meaning-inference skills (comprehension?).
6. Good for machine-markable activities.
7. (Perceived as) less threatening than other types of testing techniques.
Minuses:
1. We do not know exactly what the test measures – local or global comprehension, inference of word meaning, guessing, the ability to spot lexico-grammatical patterns?
2. 50% chance of guessing (T – F).
3. 33% chance of guessing (T – F – Not given).
4. Chance of hitting the right one without actual comprehension.
5. High chances of cheating.
Gap-filling Tasks
• Fill-in-the-blank tests: rational deletion
• Cloze tests
  - Fixed-ratio deletion
  - Multiple-choice cloze tests
• C-tests
• Cloze-elide tests
Cloze Test
...typically constructed by deleting from selected texts every n-th word ... and simply requiring the test-taker to
restore the word that has been deleted. … ‘n’ usually ranges from every 5th word to every 12th word….
…according to research, in order to achieve reliable results there should be at least 50 deletions in a cloze test.
• Reading passages of 150 to 300 words
• 30-50 blanks
• Deletion of every 7th word (+/- 2) (see the sketch below)
• Integrative test approach
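As a rough illustration of fixed-ratio deletion, the sketch below (Python; the function name, the length of the intact lead-in and the naive punctuation handling are my assumptions) removes every 7th word and keeps the deleted words as the answer key:

# A minimal sketch of fixed-ratio cloze construction: delete every n-th word
# (here n = 7) and collect the deleted words as the key. The intact lead-in and
# the naive handling of punctuation are simplifying assumptions.
def make_cloze(text, n=7, lead_in=10):
    words = text.split()
    gapped, key = [], []
    for i, word in enumerate(words):
        if i >= lead_in and (i - lead_in) % n == 0:
            key.append(word)
            gapped.append("(%d) ______" % len(key))
        else:
            gapped.append(word)
    return " ".join(gapped), key

passage = ("One cool autumn evening, Bob Lang, a young professional, returned "
           "home from a trip to the supermarket to find his computer gone.")
gapped_text, answer_key = make_cloze(passage)
print(gapped_text)
print(answer_key)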
Cloze Test
Cloze results: good measures of overall proficiency.
A number of abilities (competence in language) – knowledge of:
• vocabulary
• grammatical structure
• discourse structure
• reading skills and strategies
• internalized expectancy grammar
→ Global language proficiency
Cloze Test
• “n” is a number between 5 and 11. This type of test does not require extracting information by skimming.
• “n” is a number between just 5 and 7. Cloze tests are sensitive to constraints beyond 5 to 11 words on either side of a blank.
• Cloze tests do not assess global reading ability, but they do assess local-level reading.
Cloze Test
Pluses:
1. It is time-saving and easy to mark and score.
2. It provides a very objective way of testing.
3. It is a widely-researched technique.
4. It MAY provide testers with the means to control output and focus on specific language points.
5. It MAY allow the tester to control the range of possible answers.
6. It tests local reading.
7. It activates content and formal schemata.
Minuses:
1. Construction is problematic.
2. It restricts what can be tested.
3. We do not know exactly what the test measures.
4. It is very difficult to write successful and balanced pieces geared to particular objectives.
5. Scoring based on “acceptable/appropriate” answers is problematic.
6. It is irritating for testees.
7. High levels of exam anxiety.
8. It does not assess global comprehension.
The C-test
One cool autumn evening, Bob Lang, a young
professional, returned home from a trip to the
supermarket to find his computer gone. Gone! All
so- of cr- thoughts ra- through h- mind: H- it be- stolen? H- it be- kidnapped? H- searched h- house
f- a cl- until h- noticed a sm- piece o- printout pa- stuck un- a mag- on h- refrigerator do-. His he- sank a- he re- this sim- message: “can’t continue,
file closed, bye.”
The C-test
• integrative testing instrument
• measures overall language competence
• consists of four to six short, preferably authentic, texts in the
target language,
• “the rule of two” has been applied: the second half of every second word has been deleted (see the sketch after this list),
• beginning with the second word of the second sentence;
• the first and last sentences are left intact.
• If a word has an odd number of letters, the “bigger” part is
omitted, e.g., proud becomes pr-.
• One-letter words, such as I, are ignored in the counting.
• The students’ task is to restore the missing parts.
• In a typical C-test there are 100 gaps, that is, missing parts.
• Only entirely correct restorations are accepted.
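The “rule of two” can be expressed compactly in code. The sketch below (Python; splitting sentences on “. ” and the handling of punctuation are deliberate simplifications of mine, not the workshop’s procedure) mutilates every second counted word, starting with the second word of the second sentence, and leaves the first and last sentences intact:

# A sketch of the "rule of two": from the second word of the second sentence on,
# the second half of every second word is deleted ("proud" -> "pr-"); one-letter
# words are skipped in the counting; first and last sentences stay intact.
def mutilate(word):
    keep = len(word) // 2          # the "bigger" half is omitted: proud -> pr-
    return word[:keep] + "-"

def make_c_test(text):
    sentences = text.split(". ")   # simplistic sentence splitting
    result = [sentences[0]]        # first sentence left intact
    counted = 0
    for s_idx, sentence in enumerate(sentences[1:-1]):
        new_words = []
        for w_idx, word in enumerate(sentence.split()):
            if (s_idx == 0 and w_idx == 0) or len(word) <= 1:
                new_words.append(word)     # keep the very first word; skip one-letter words
                continue
            counted += 1
            new_words.append(mutilate(word) if counted % 2 == 1 else word)
        result.append(" ".join(new_words))
    result.append(sentences[-1])   # last sentence left intact
    return ". ".join(result)

print(make_c_test("Bob came home late one evening. He found his computer gone. "
                  "All sorts of crazy thoughts raced through his mind. "
                  "He searched the whole house for a clue."))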
The C-test
…the C-test (or “prueba C”) is a closure test developed from traditional cloze tests. Its creators, Klein-Braley and
Raatz (1981), considered it a very suitable assessment instrument for measuring global linguistic competence in
a foreign language.
C-tests are thought to be more reliable and valid than cloze tests as assessment instruments, but also more
irritating than cloze tests.
…[they are] based upon the same theory of closure as the cloze test.
The Cloze Elide test
A test or an examination (or "exam") is an open assessment,
which often administered on this paper or on the computer,
intended to measure the test-takers' or respondents' (often a
good student) knowledge, skills, sport aptitudes, etc. Tests
that usually are often used in education, the professional
certification, counseling, psychology, the some military, and
many other the fields. The measurement that is the goal of
assessment testing is to called a test score, and is "a
summary of the written evidence which contained in an
examinee's few responses to the items....."
The Cloze Elide test
• an alternative integrated approach
• a technique introduced as the ‘Intrusive Word Technique’, also known as “...‘text retrieval’, ‘text interruption’, ‘doctored text’, ‘mutilated text’ and ‘negative cloze’...”
• the tester inserts words and the test-taker is asked to find the words that do not belong to the text (see the sketch below)
• be sure that the inserted words do not belong to the text.
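A sketch of how such a doctored text could be produced (Python; the intruder list, the fixed seed and the function name are my assumptions): words that do not belong to the passage are inserted at random positions for the test-taker to find and cross out.

# A hypothetical sketch of cloze-elide ("intrusive word") construction. The
# intruder words and the seed are assumptions; in practice the tester checks
# that the inserted words really do not belong to the text.
import random

def make_cloze_elide(text, intruders, seed=0):
    random.seed(seed)              # fixed seed so the doctored text is reproducible
    words = text.split()
    for intruder in intruders:
        words.insert(random.randint(1, len(words) - 1), intruder)
    return " ".join(words)

passage = ("A test is an assessment intended to measure a test-taker's "
           "knowledge, skills or aptitudes.")
print(make_cloze_elide(passage, ["open", "sport", "few"]))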
The Cloze Elide test
Cloze-elide tests are good,
indirect measures of English
language proficiency,
comparing very favorably with
more commonly used testing
procedures.
Questions
Questions can relate to:
• Main idea/topic
• Supporting details
• Implied information (inference)
• Textual reference
• Scanning of details
• Recognition of unstated information
• Inference of word/idiom meanings
• Grammatical features
• Pragmatic function of a chunk of text
• Cohesion devices
Questions
3 main levels or strands of comprehension (from superficial/explicit to deeper/implicit):
1. Literal comprehension
2. Interpretive or referential comprehension
3. Critical reading comprehension
Questions
3 main levels or strands of comprehension:
1. Literal comprehension
Comprehension at this level involves surface meanings. Teachers can ask students to find information and ideas
that are explicitly stated in the text. In addition, it is also appropriate to test vocabulary.
2. Interpretive or referential comprehension
Students need to be able to see relationships among ideas: re-arrange the ideas or topics discussed in the text;
explain the author’s purpose for writing the text; summarize the main idea when this is not explicitly stated in the
text; select conclusions which can be deduced from the text.
3. Critical reading comprehension
Ideas and information are evaluated: the ability to differentiate between facts and opinions, to recognize
persuasive statements, and to judge the accuracy of the information given in the text.
Ordering tasks
Through ‘ordering tasks’, test-takers are
asked to put the scrambled words,
sentences, paragraphs or texts into correct
order.
Put the following sentences in order.
A. it was called “The Last Waltz”
B. the street was in total darkness
C. because it was one he and Richard had learnt at school
D. Peter looked outside
E. he recognized the tune
F. and it seemed deserted
G. he thought he heard someone whistling
Possible orderings: DGECABF – DBFGECA
Ordering tasks
Pluses:
1. It is not difficult to design (?).
2. It tests cohesion-detection abilities.
3. It assesses the recognition of overall text organization.
4. It tests the spotting of grammatical patterns, lexical chains, etc.
Minuses:
1. It is difficult to administer (proposed new order).
2. Scoring is very problematic (some sequences may be correct but not the whole ordering).
3. Scoring: wholly correct or wholly incorrect.
4. Protocols are too complex to make the effort and time worthwhile.
….testing professionals think it unfair to evaluate this type of question
according to the traditional method of marking it completely right or
completely wrong.
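One alternative that is often discussed is partial-credit scoring. The sketch below is only an illustration of such a scheme (not a method prescribed in the workshop): it awards a point for every adjacent pair in the student’s sequence that also appears in the correct relative order in one of the accepted keys.

# One possible partial-credit scheme (an illustration, not the workshop's own
# method): a point for every adjacent pair in the student's ordering that is
# also in the right relative order in one of the accepted keys.
def partial_credit(student, accepted_keys):
    best = 0
    for key in accepted_keys:                        # e.g. "DGECABF" and "DBFGECA"
        position = {letter: i for i, letter in enumerate(key)}
        score = sum(1 for a, b in zip(student, student[1:])
                    if position[a] < position[b])    # assumes the student used the given letters
        best = max(best, score)
    return best

print(partial_credit("DGECABF", ["DGECABF", "DBFGECA"]))  # 6 = full credit (6 adjacent pairs)
print(partial_credit("DGECAFB", ["DGECABF", "DBFGECA"]))  # 5 = near miss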
Ordering tasks
Summary-writing
• Summary tests
• Gapped summary tests
Summary-writing
Pluses:
1. It involves almost no construction of the activity (?).
2. It tests global comprehension.
3. It taps the ability to select from a hierarchy of ideas.
4. It activates content and formal schemata.
5. It allows for free-recall techniques.
Minuses:
1. Scoring is very problematic.
2. Does the test assess reading/listening comprehension or writing?
3. Very difficult to know exactly what the test measures.
4. It lends itself to very subjective measures.
5. Scoring based on selection of main ideas (?).
6. Should spelling, grammar and lexical-choice errors be considered?
7. It may allow for a copy-and-paste approach.
Summary of main ideas
Main ideas for whom?
Parallel texts
• Author?
• Teacher?
• Student?
• A reader with a particular purpose in mind?
IDEA UNITS: the TEXT’s idea units (IU 1 – IU 5) mapped against the STUDENT’S SUMMARY
Is the test assessing READING COMPREHENSION or WRITING PRODUCTION?
1- Asking the test-takers to write the summary in their first language.
2- Presenting a number of summaries and asking the testees to select the best one.
3- The gapped summary:
   1. Students read a text for a specific period of time.
   2. They put away the text.
   3. They are presented with a summary of the text with words missing.
   4. They are to restore the missing words.
   5. The teacher scores the summary as if it were a gap-filling test (see the sketch below).
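A minimal sketch of steps 3 to 5 (Python; which words are gapped and the exact-match scoring are my assumptions about a typical implementation, not the workshop’s own procedure):

# A minimal sketch of the gapped-summary procedure: selected words are removed
# from a teacher-written summary and only exact restorations score, as in a
# gap-filling test. The choice of gapped words here is an assumption.
def gap_summary(summary, words_to_gap):
    targets = {w.lower() for w in words_to_gap}
    tokens = summary.split()
    key = []
    for i, token in enumerate(tokens):
        bare = token.strip(".,;:")
        if bare.lower() in targets:
            key.append(bare)
            tokens[i] = "______" + token[len(bare):]   # keep trailing punctuation
    return " ".join(tokens), key

def score_gapped_summary(key, answers):
    """One point per gap; only exact (case-insensitive) restorations count."""
    return sum(1 for expected, given in zip(key, answers)
               if expected.lower() == given.strip().lower())

gapped, key = gap_summary("Bob returned home and found his computer missing.",
                          ["computer", "missing"])
print(gapped)                                           # Bob returned home and found his ______ ______.
print(score_gapped_summary(key, ["computer", "gone"]))  # 1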
A tester’s obligation….
Triangulate measurements of 2, 3 or more performances and/or contexts before drawing a conclusion.
TESTER → INSTRUCTIONS – TESTS – ACTIVITIES
Questions or no questions?
I won’t answer any questions whatsoever!
Where…? What….? How….? ????
Please, ask me any questions you want.
How do you answer questions 5, 6, 7 and 8? Are 1, 2, 3 and 4 correct?
How do I mark the correct one?
I’ve run out of space – do I keep writing on the back?
Is the answer all right like this?
Do we have more time? I’m still on number 3.
Do we take another sheet to write the answers on?
And do I put down everything it says about….?
Can more than one word go in the blank space?
What does …. mean?
And do you mark spelling mistakes?
And if it isn’t completely right, is it worth half the marks?
How many lines long?
Do I read it first?
And how are you going to mark this?
How much is this exercise worth?
And how do I do this exercise? We never did it in class.
This article is full of words I don’t understand – what do I do?
Can you give an example of how to do the exercise?
Can we answer in pencil?
Do we put a cross in the table or write the word in the space?
A generic test template
• Introduction, objectives, topics tested, timing, tips about how to do the test, anticipation of students’ FAQs.
• Test activities with clear and complete directions, reminders of time, tips about particular activities, anticipated answers to students’ FAQs, marks per activity or item, etc.
• An area for students to comment on their difficulties and thought processes, express uncertainty, communicate with the teacher, express feelings, etc.
• Conclusion, motivating and relaxing remarks, information about test results delivery date and format, type of feedback to be expected, etc.
The use of an AVATAR in the test/exam sheet.
So, where are the 2 sides of the coin?
Teaching and Testing
The Teacher and the Test
The Teacher and the Tester
Pluses and Minuses of
activity types
INTEGRATION
THANK YOU VERY MUCH
for taking part in
“Testing: A Double-sided Analysis of
Testing Tools and Techniques”
(The two sides of a coin)
Prof. Juan Díaz – 2009