ASSESSING SPEAKING
There are four categories of listening performance assessment tasks. A similar taxonomy emerges for oral production.
Imitative
At one end of a continuum of types of speaking performance is the ability to simply parrot back (imitate) a word, phrase, or possibly a sentence. While this is a purely phonetic level of oral production, a number of prosodic, lexical, and grammatical properties of language may be included in the criterion performance.
We are interested only in what is traditionally labeled "pronunciation"; no inferences are made about the test-taker's ability to understand or convey meaning or to participate in an interactive conversation. The only role of listening here is in the short-term storage of a prompt, just long enough to allow the speaker to retain the short stretch of language that must be imitated.
Intensive
A second type of speaking frequently employed in assessment contexts is the production of short stretches of oral language designed to demonstrate competence in a narrow band of grammatical, phrasal, lexical, or phonological relationships (such as prosodic elements: intonation, stress, rhythm, and juncture).
The speaker must be aware of semantic properties in order to respond, but interaction with an interlocutor or test administrator is minimal at best. Examples of intensive assessment tasks include directed response tasks; reading aloud; sentence and dialogue completion; limited picture-cued tasks, including simple sequences; and translation up to the simple sentence level.
Responsive
Responsive assessment tasks include interaction and test comprehension, but at the somewhat limited level of very short conversations, standard greetings and small talk, simple requests and comments, and the like.
The stimulus is almost always a spoken prompt (in order to preserve authenticity), with perhaps only one or two follow-up questions or retorts:
A. Mary: Excuse me, do you have the time?
   Doug: Yeah. Nine-fifteen.
B. T: What is the most urgent environmental problem today?
   S: I would say massive deforestation.
C. Jeff: Hey, Stef, how's it going?
   Stef: Not bad, and yourself?
   Jeff: I'm good.
   Stef: Cool. Okay, gotta go.
Interactive
The difference between responsive and interactive speaking is in the length and complexity of the interaction, which sometimes includes multiple exchanges and/or multiple participants. Interaction can take the two forms of transactional language, which has the purpose of exchanging specific information, or interpersonal exchanges, which have the purpose of maintaining social relationships. (Of the three dialogues cited above, A and B were transactional, and C was interpersonal.) In interpersonal exchanges, oral production can become pragmatically complex with the need to speak in a casual register and use colloquial language, ellipsis, slang, humor, and other sociolinguistic conventions.
Extensive (monologue)
Extensive oral production tasks include speeches, oral presentations, and storytelling, during which the opportunity for oral interaction from listeners is either highly limited (perhaps to nonverbal responses) or ruled out altogether. Language style is frequently more deliberative (planning is involved) and formal for extensive tasks, but we cannot rule out certain informal monologues, such as casually delivered speech (for example, my vacation in the mountains, a recipe for outstanding pasta primavera, or recounting the plot of a novel or movie).
Micro- And Macroskills Of Speaking
A list of listening micro- and macroskills enumerated the various components of listening that make up criteria for assessment. A similar list of speaking skills can be drawn up for the same purpose: to serve as a taxonomy of skills from which you will select one or several that will become the objective(s) of an assessment task. The microskills refer to producing the smaller chunks of language such as phonemes, morphemes, words, collocations, and phrasal units. The macroskills imply the speaker's focus on the larger elements: fluency, discourse, function, style, cohesion, nonverbal communication, and strategic options. The micro- and macroskills total roughly 16 different objectives to assess in speaking.
Microskills
- Produce differences among English phonemes and allophonic variants.
- Produce chunks of language of different lengths.
- Produce English stress patterns, words in stressed and unstressed positions, rhythmic structure, and intonation contours.
- Produce reduced forms of words and phrases.
- Use an adequate number of lexical units (words) to accomplish pragmatic purposes.
- Produce fluent speech at different rates of delivery.
- Monitor one's own oral production and use various strategic devices— pauses, fillers, self-corrections, backtracking—to enhance the clarity of the message.
- Use grammatical word classes (nouns, verbs, etc.), systems (e.g., tense, agreement, pluralization), word order, patterns, rules, and elliptical forms.
- Produce speech in natural constituents: in appropriate phrases, pause groups, breath groups, and sentence constituents.
- Express a particular meaning in different grammatical forms.
- Use cohesive devices in spoken discourse.
Macroskills
- Appropriately accomplish communicative functions according to situations, participants, and goals.
- Use appropriate styles, registers, implicature, redundancies, pragmatic conventions, conversation rules, floor-keeping and -yielding, interrupting, and other sociolinguistic features in face-to-face conversations.
- Convey links and connections between events and communicate such relations as focal and peripheral ideas, events and feelings, new information and given information, generalization and exemplification.
- Convey facial features, kinesics, body language, and other nonverbal cues along with verbal language.
- Develop and use a battery of speaking strategies, such as emphasizing key words, rephrasing, providing a context for interpreting the meaning of words, appealing for help, and accurately assessing how well your interlocutor is understanding you.
As you consider designing tasks for assessing spoken language, these skills can act as a checklist of objectives. While the macroskills have the appearance of being more complex than the microskills, both contain ingredients of difficulty, depending on the stage and context of the test-taker.
There is such an array of oral production tasks that a complete treatment is almost impossible within the confines of one chapter in this book. Below is a consideration of the most common techniques, with brief allusions to related tasks. As already noted in the introduction to this chapter, consider three important issues as you set out to design tasks:
- No speaking task is capable of isolating the single skill of oral production. Concurrent involvement of the additional performance of aural comprehension, and possibly reading, is usually necessary.
- Eliciting the specific criterion you have designated for a task can be tricky because beyond the word level, spoken language offers a number of productive options to test-takers. Make sure your elicitation prompt achieves its aims as closely as possible.
- Because of the above two characteristics of oral production assessment, it is important to carefully specify scoring procedures for a response so that ultimately you achieve as high a reliability index as possible.
Designing Assessment Tasks: Imitative Speaking
You may be surprised to see the inclusion of simple phonological imitation in a consideration of assessment of oral production. After all, endless repetition of words, phrases, and sentences was the province of the long-since-discarded Audiolingual Method, and in an era of communicative language teaching, many believe that nonmeaningful imitation of sounds is fruitless. Such opinions have faded in recent years as we discovered that an overemphasis on fluency can sometimes lead to the decline of accuracy in speech. And so we have been paying more attention to pronunciation, especially suprasegmentals, in an attempt to help learners be more comprehensible.
An occasional phonologically focused repetition task is warranted as long as repetition tasks are not allowed to occupy a dominant role in an overall oral production assessment, and as long as you artfully avoid a negative washback effect. Such tasks range from word level to sentence level, usually with each item focusing on a specific phonological criterion. In a simple repetition task, test-takers repeat the stimulus, whether it is a pair of words, a sentence, or perhaps a question (to test for intonation production).
Word repetition task
Test-takers hear: Repeat after me:
beat [pause] bit [pause]
bat [pause] vat [pause]
etc.
I bought a boat yesterday.
The glow of the candle is growing.
etc.
When did they go on vacation?
Do you like coffee?
etc.
Test-takers repeat the stimulus.
A variation on such a task prompts test-takers with a brief written stimulus, which they are to read aloud. (In the section below on intensive speaking, tasks are described in which test-takers read aloud longer texts.) Scoring specifications must be clear in order to avoid reliability breakdowns. A common form of scoring simply indicates a two- or three-point system for each response.
Scoring scale for repetition tasks
2 acceptable pronunciation
1 comprehensible, partially correct pronunciation
0 silence, seriously incorrect pronunciation
The longer the stretch of language, the more possibility for error and therefore the more difficult it becomes to assign a point system to the text. In such a case, it may be imperative to score only the criterion of the task. For example, in the sentence "When did they go on vacation?" since the criterion is falling intonation for wh-questions, points should be awarded regardless of any mispronunciation.
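To see how the two- or three-point scale described above turns into usable scores, here is a minimal sketch in Python. It is a hypothetical illustration, not part of any published test: the category labels and the helper function are invented for this example, and the per-item judgments are assumed to come from a human rater, since only the tallying is automated.

```python
# Hypothetical tally for the 2-1-0 repetition rubric described above.
# The judged category for each item comes from a human rater.

RUBRIC = {
    "acceptable": 2,          # acceptable pronunciation
    "partially_correct": 1,   # comprehensible, partially correct pronunciation
    "unacceptable": 0,        # silence or seriously incorrect pronunciation
}

def score_repetition_items(judgments):
    """Convert a rater's per-item judgments into points and a total."""
    points = [RUBRIC[j] for j in judgments]
    return points, sum(points)

# Example: four repetition items as judged by a rater
points, total = score_repetition_items(
    ["acceptable", "partially_correct", "acceptable", "unacceptable"]
)
print(points, total)  # [2, 1, 2, 0] 5
```

Keeping the rubric in a single lookup table makes it easy to adjust the point values (for instance, for the intensive-task variant of the same scale) without touching the tallying logic.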
PhonePass Test
An example of a popular test that uses imitative (as well as intensive) production tasks is PhonePass, a widely used, commercially available speaking test employed in many countries. Among a number of speaking tasks on the test, repetition of sentences (of 8 to 12 words) occupies a prominent role. It is remarkable that research on the PhonePass test has supported the construct validity of its repetition tasks not just for a test-taker's phonological ability but also for discourse and overall oral production ability (Townshend et al., 1998; Bernstein et al., 2000; Cascallar & Bernstein, 2000).
PhonePass test specifications
Part A:
Test-takers read aloud selected sentences from among those printed on the test sheet. Examples:
1. Traffic is a huge problem in Southern California.
2. The endless city has no coherent mass transit system.
3. Sharing rides was going to be the solution to rush-hour traffic.
4. Most people still want to drive their own cars, though.
Part B:
Test-takers repeat sentences dictated over the phone. Example: "Leave town on the next train."
Part C:
Test-takers answer questions with a single word or a short phrase of two or three words. Example: "Would you get water from a bottle or a newspaper?”
Part D:
Test-takers hear three word groups in random order and must link them in a correctly ordered sentence. Example: was reading/my mother/a magazine.
Part E:
Test-takers have 30 seconds to talk about their opinion about some topic that is dictated over the phone. Topics center on family, preferences, and choices.
Scores for the PhonePass test are calculated by a computerized scoring template and reported back to the test-taker within minutes. Six scores are given: an overall score between 20 and 80, and five subscores on the same scale that rate pronunciation, reading fluency, repeat accuracy, repeat fluency, and listening vocabulary.
Designing Assessment Tasks: Intensive Speaking
At the intensive level, test-takers are prompted to produce short stretches of discourse (no more than a sentence) through which they demonstrate linguistic ability at a specified level of language. Many tasks are "cued" tasks in that they lead the test-taker into a narrow band of possibilities.
Parts C and D of the PhonePass test fulfill the criteria of intensive tasks, as they elicit certain expected forms of language. Antonyms like high and low or happy and sad are prompted so that the automated scoring mechanism anticipates only one word. The either/or task of Part C fulfills the same criterion. Intensive tasks may also be described as limited response tasks (Madsen, 1983) or mechanical tasks (Underhill, 1987), or what classroom pedagogy would label controlled responses.
Directed Response Tasks
In this type of task, the test administrator elicits a particular grammatical form or a transformation of a sentence. Such tasks are clearly mechanical and not communicative, but they do require minimal processing of meaning in order to produce the correct grammatical output.
Directed response
Test-takers hear: Tell me he went home.
Tell me that you like rock music.
Tell me that you aren’t interested in tennis.
Tell him to come to my office at noon. Remind him what time it is.
Picture-Cued Tasks
One of the more popular ways to elicit oral language performance at both intensive and extensive levels is a picture-cued stimulus that requires a description from the test-taker. Pictures may be very simple, designed to elicit a word or a phrase; somewhat more elaborate and "busy"; or composed of a series that tells a story or incident. Here is an example of a picture-cued elicitation of the production of a simple minimal pair.
Scoring responses on picture-cued intensive speaking tasks varies, depending on the expected performance criteria. The tasks above that asked just for one-word or simple-sentence responses can be evaluated simply as "correct" or "incorrect." The three-point rubric (2, 1, and 0) suggested earlier may apply as well, with these modifications:
Scoring scale for intensive tasks
2 comprehensible; acceptable target form
1 comprehensible; partially correct target form
0 silence, or seriously incorrect target form
Opinions about paintings, persuasive monologue, and directions on a map create a more complicated problem for scoring. More demand is placed on the test administrator to make calculated judgments, in which case a modified form of a scale such as the one suggested for evaluating interviews (below) could be used:
• grammar
• vocabulary
• comprehension
• fluency
• pronunciation
• task (accomplishing the objective of the elicited task)
Each category may be scored separately, with an additional composite score that attempts to synthesize overall performance. To attend to so many factors, you will probably need to have an audiotaped recording for multiple listenings.
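The per-category scoring just described can be sketched as a small composite-score routine. This is a hypothetical illustration only: the six category names follow the list above, but the 0-5 rating scale, the equal weighting, and the function itself are assumptions for the sketch, not a procedure prescribed by any particular test.

```python
# Hypothetical composite scorer for the six evaluation categories above.
# Assumes each category is rated on a 0-5 scale and weighted equally.

CATEGORIES = ("grammar", "vocabulary", "comprehension",
              "fluency", "pronunciation", "task")

def composite_score(ratings):
    """Average the six category ratings into one overall score."""
    missing = [c for c in CATEGORIES if c not in ratings]
    if missing:
        raise ValueError(f"missing ratings for: {missing}")
    return round(sum(ratings[c] for c in CATEGORIES) / len(CATEGORIES), 2)

# Example: one test-taker's ratings after multiple listenings
scores = {"grammar": 4, "vocabulary": 3, "comprehension": 5,
          "fluency": 3, "pronunciation": 4, "task": 4}
print(composite_score(scores))  # 3.83
```

An equally weighted mean is only one design choice; a rater who considers the task criterion more important than, say, pronunciation could replace the plain average with a weighted one without changing the rest of the routine.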
One moderately successful picture-cued technique involves a pairing of two test-takers. They are supplied with four identical sets of numbered pictures, each minimally distinct from the others by one or two factors. One test-taker is directed by a cue card to describe one of the four pictures in as few words as possible. The second test-taker must then identify the picture. On the next page is an example of four pictures.
Designing Assessment Tasks: Responsive Speaking
Assessment of responsive tasks involves brief interactions with an interlocutor, differing from intensive tasks in the increased creativity given to the test-taker and from interactive tasks by the somewhat limited length of utterances.
Question and Answer
Question-and-answer tasks can consist of one or two questions from an interviewer, or they can make up a portion of a whole battery of questions and prompts in an oral interview. They can vary from simple questions like "What is this called in English?" to complex questions like "What steps should governments take, if any, to stem the rate of deforestation in tropical countries?" The first question is intensive in its purpose; it is a display question intended to elicit a predetermined correct response. We have already looked at some of these types of questions in the previous section. Questions at the responsive level tend to be genuine referential questions in which the test-taker is given more opportunity to produce meaningful language in response.
In designing such questions for test-takers, it's important to make sure that you know why you are asking the question. Are you simply trying to elicit strings of language output to gain a general sense of the test-taker's discourse competence? Are you combining discourse and grammatical competence in the same question? Is each question just one in a whole set of related questions? Responsive questions may take the following forms:
Questions eliciting open-ended responses
Test-takers hear:
1. What do you think about the weather today?
2. What do you like about the English language?
3. Why did you choose your academic major?
4. What kind of strategies have you used to help you learn English?
a. Have you ever been to the United States before?
b. What other countries have you visited?
c. Why did you go there? What did you like best about it?
d. If you could go back, what would you like to do or see?
e. What country would you like to visit next, and why?
Oral interaction with a test administrator often involves the latter forming the questions. The flip side of this usual concept of a question-and-answer task is to elicit questions from the test-taker. To assess the test-taker's ability to produce questions, prompts such as these can be used:
Elicitation of questions from the test-taker
Test-takers hear:
- Do you have any questions for me?
- Ask me about my family or job or interests.
- If you could interview the president or prime minister of your country, what would you ask that person?
Test-takers respond with questions.
A potentially tricky form of oral production assessment involves more than one test-taker with an interviewer, which is discussed later in this chapter. With multiple students in an interview context, both test-takers can ask questions of each other.
Designing Assessment Tasks: Interactive Speaking
Interview
Every effective interview contains a number of mandatory stages. Two decades ago, Michael Canale (1984) proposed a framework for oral proficiency testing that has withstood the test of time. He suggested that test-takers will perform at their best if they are led through four stages:
- Warm-up. In a minute or so of preliminary small talk, the interviewer directs mutual introductions, helps the test-taker become comfortable with the situation, apprises the test-taker of the format, and allays anxieties. No scoring takes place in this phase.
- Level check. Through a series of preplanned questions, the interviewer stimulates the test-taker to respond using expected or predicted forms and functions. If, for example, from previous test information, grades, or other data, the test-taker has been judged to be a "Level 2" (see below) speaker, the interviewer's prompts will attempt to confirm this assumption. The responses may take very simple or very complex form, depending on the entry level of the learner. Questions are usually designed to elicit grammatical categories (such as tense or subject-verb agreement), discourse structure (a sequence of events), vocabulary usage, and/or sociolinguistic factors (politeness conventions, formal/informal language). This stage could also give the interviewer a picture of the test-taker's extroversion, readiness to speak, and confidence, all of which may be of significant consequence in the interview's results. Linguistic target criteria are scored in this phase. If this stage is lengthy, a recording of the interview is important.
- Probe. Probe questions and prompts challenge test-takers to go to the heights of their ability, to extend beyond the limits of the interviewer's expectations, through increasingly difficult questions. Probe questions may be complex in their framing and/or complex in their cognitive and linguistic demands. Through probe items, the interviewer discovers the ceiling or limitation of the test-taker's proficiency. This need not be a separate stage entirely but might be a set of questions interspersed into the previous stage. At the lower levels of proficiency, probe items may simply demand a higher range of vocabulary or grammar from the test-taker than predicted. At the higher levels, probe items will typically ask the test-taker to give an opinion or a value judgment, to discuss his or her field of specialization, to recount a narrative, or to respond to questions that are worded in complex form. Responses to probe questions may be scored, or they may be ignored if the test-taker displays an inability to handle such complexity.
- Wind-down. This final phase of the interview is simply a short period of time during which the interviewer encourages the test-taker to relax with some easy questions, sets the test-taker's mind at ease, and provides information about when and where to obtain the results of the interview. This part is not scored.
Discussions and Conversations
As formal assessment devices, discussions and conversations with and among students are difficult to specify and even more difficult to score. But as informal techniques to assess learners, they offer a level of authenticity and spontaneity that other assessment techniques may not provide. Discussions may be especially appropriate tasks through which to elicit and observe such abilities as
- Topic nomination, maintenance, and termination;
- Attention getting, interrupting, floor holding, and control;
- Clarifying, questioning, paraphrasing;
- Comprehension signals (nodding, "uh-huh," "hmm," etc.);
- Negotiating meaning;
- Intonation patterns for pragmatic effect;
- Kinesics, eye contact, proxemics, and body language; and
- Politeness, formality, and other sociolinguistic factors.
CONCLUSION
In assessing English speaking, several factors must be considered. Speaking assessments also differ from one another, depending on the level of the learner. Accordingly, there are several guidelines for how we can assess learners' speaking ability. By choosing assessment methods appropriate to the learner and the task, we can better evaluate whether learners' spoken English is coherent and cohesive.
REFERENCES
Brown, H. D. (2003). Language assessment: Principles and classroom practices. California: Longman.