
Pactera Transcription Guidelines

Version 3.1, 2013/11/12

1. WRITE WHAT YOU HEAR. Transcriptions should reflect exactly what the speaker says.

a. It is not necessarily the same as the text provided.

If you see "order a pizza" in the text but hear "order a hamburger" in the audio file, transcribe it as "order a hamburger".
b. Do not correct the speaker's grammar mistakes. For example, if you hear "apples is delicious", transcribe the sentence as "apples is delicious".

c. Transcribe the repetitions/restarts as uttered by the speaker. If the word is damaged, mark it as <UNKNOWN/>.
it's it's it's very interesting (the speaker says "it's" three times)
it's ni~~ nice (the speaker does not finish the word) -> it's <UNKNOWN/> nice
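The cut-off-word rule above can be sketched in a few lines of Python. This is only an illustration: the function name `mark_cut_off_words` is hypothetical, and treating a trailing `~~` as the marker for an unfinished word is an assumption taken from the example line, not a convention the guidelines define.

```python
UNKNOWN_TAG = "<UNKNOWN/>"


def mark_cut_off_words(draft: str, marker: str = "~~") -> str:
    """Replace every token that ends with `marker` by <UNKNOWN/>.

    Assumption: the transcriber's draft flags an unfinished word with a
    trailing "~~", as in the example "ni~~". Repetitions are left untouched.
    """
    tokens = draft.split()
    return " ".join(UNKNOWN_TAG if tok.endswith(marker) else tok for tok in tokens)


print(mark_cut_off_words("it's ni~~ nice"))
print(mark_cut_off_words("it's it's it's very interesting"))
```

Repeated words pass through unchanged, which matches the instruction to transcribe repetitions as uttered.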
2. No shorthand writing should be used. The following are examples of wrong transcriptions, each followed by its correction:
a. coffee & tea -> coffee and tea
b. I'll come @ 5 -> I'll come at five
c. you're #1 -> you're number one
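A quick automated check for the shorthand symbols listed above might look like the following sketch. The `SHORTHAND` table only covers the three symbols from the examples, and `find_shorthand` is a hypothetical helper name. Tokens containing a backslash are skipped on the assumption that they are spoken-punctuation tokens in the SYMBOL\NAME format described in rule 4.

```python
# Symbols taken from the examples above, with their spoken replacements.
SHORTHAND = {"&": "and", "@": "at", "#": "number"}


def find_shorthand(transcription: str) -> list:
    """Return (symbol, suggested_word) pairs for shorthand found in the text."""
    hits = []
    for token in transcription.split():
        # Assumption: a backslash marks a SYMBOL\NAME token, which is allowed.
        if "\\" in token:
            continue
        for ch in token:
            if ch in SHORTHAND:
                hits.append((ch, SHORTHAND[ch]))
    return hits


print(find_shorthand("coffee & tea"))
print(find_shorthand("coffee and tea"))
```

The check only reports problems; choosing the right replacement still depends on what the speaker actually said.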

3. All numbers should be spelled out as said by the speaker. The number zero, when spoken as "o", is transcribed as "oh".
Toy Story 3 -> Toy Story three
January 2008 -> January two thousand eight (or two oh oh eight, or two zero zero eight, etc., depending on how it was spoken)
Room 104 -> room one oh four (depending on how it was spoken)
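For numbers read out digit by digit, the spelling can be sketched as below. The function name `spell_digit_by_digit` and the `zero_as_oh` flag are illustrative assumptions. The sketch covers only the digit-by-digit readings ("one oh four", "two zero zero eight"), not readings such as "two thousand eight", and the transcriber still has to decide which reading matches the audio.

```python
DIGIT_WORDS = ["zero", "one", "two", "three", "four",
               "five", "six", "seven", "eight", "nine"]


def spell_digit_by_digit(digits: str, zero_as_oh: bool = False) -> str:
    """Spell a digit string one digit at a time.

    Set zero_as_oh=True when the speaker pronounced zero as "oh".
    """
    if not (digits.isascii() and digits.isdigit()):
        raise ValueError(f"expected ASCII digits only, got {digits!r}")
    words = []
    for d in digits:
        if d == "0" and zero_as_oh:
            words.append("oh")
        else:
            words.append(DIGIT_WORDS[int(d)])
    return " ".join(words)


print(spell_digit_by_digit("104", zero_as_oh=True))
print(spell_digit_by_digit("2008"))
```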
4. If the speaker did not say a punctuation mark, it should NOT appear in the transcription, unless it is a normal written part of a word or abbreviation, as in "it's" or "Inc."

Nice to meet you. -> nice to meet you
Are you sure? -> are you sure

If someone speaks a punctuation mark, please use the S\N format (SYMBOL\NAME). This should only be done when it is certain that the user has spoken the punctuation:
Google.com -> Google .\DOT com (if the speaker says "google dot com")
Thank you! -> thank you (if the speaker says "thank you")
Thank you! -> thank you !\EXCLAMATION_MARK (if the speaker says "thank you exclamation mark")
Thank you! -> thank you !\EXCLAMATION_POINT (if the speaker says "thank you exclamation point")
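The SYMBOL\NAME format from the examples above can be produced and checked with a small helper. This is a sketch under assumptions: `spoken_symbol` and `is_sn_token` are hypothetical names, and the pattern (one symbol character, a backslash, then an upper-case name with underscores between words) is inferred from the examples rather than from a formal specification.

```python
import re

# One non-space symbol, a backslash, then UPPER_CASE words joined by "_".
# Assumption: this shape is inferred from .\DOT and !\EXCLAMATION_MARK.
SN_TOKEN = re.compile(r"\S\\[A-Z]+(?:_[A-Z]+)*")


def spoken_symbol(symbol: str, name: str) -> str:
    # Build a SYMBOL\NAME token from the symbol and the name as spoken.
    return f"{symbol}\\{name.upper().replace(' ', '_')}"


def is_sn_token(token: str) -> bool:
    # True if the token has the SYMBOL\NAME shape.
    return SN_TOKEN.fullmatch(token) is not None


print("Google", spoken_symbol(".", "dot"), "com")
print("thank you", spoken_symbol("!", "exclamation mark"))
```

The name should follow the words the speaker used, which is why "exclamation mark" and "exclamation point" yield different tokens.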
Special attention should be paid to the correct transcription of hyphens: for words where the hyphen provides semantic value (such as French numbers), the hyphen must be transcribed.

Apostrophes should be included as part of the transcription if said that way, such as:

5. In cases of letter-by-letter spelling (including spelled words, spelled acronyms, and spelled abbreviations), each
spelled-out letter will be capitalized and followed by a dot and a space.

tv -> T. V.
dvd -> D. V. D.
UN -> U. N.
ok -> O. K.
AIDS (if spoken as a word, [eɪdz]) -> Aids
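The letter-by-letter format above is mechanical enough to sketch in code. The helper name `spell_letters` is hypothetical. It applies only when the speaker actually spelled the item out, so a word like "Aids" spoken as a word must not be passed through it.

```python
def spell_letters(spelled: str) -> str:
    """Format a spelled-out item as capital letters, each followed by a dot.

    Letters are separated by a single space; non-letter characters are ignored.
    """
    letters = [ch.upper() for ch in spelled if ch.isalpha()]
    return " ".join(f"{ch}." for ch in letters)


for item in ["tv", "dvd", "UN", "ok"]:
    print(item, "->", spell_letters(item))
```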

6. If used as an exclamation, "oh" should be transcribed as "ohh".

Oh I didn't know! -> ohh I didn't know

7. Trademarks, brand names, and registered names should appear as used by the company that created them.
Hotmail not hot mail

8. Do not capitalize a word just because it's at the beginning of a sentence. Words should only be capitalized if they are usually capitalized mid-sentence (like names and titles).

Correct: apples are delicious
Correct: Michael told me that he couldn't come to the party

Wrong: tom seems unhappy today
Wrong: It's very nice to meet you.

9. Deliberately misspelled movie titles, song titles, band names, etc. should appear as used. For example, the following
should be kept as they are.
Inglourious Basterds
Boyz n the Hood
Pet Sematary
10. Use well-recognized representations of spoken forms, such as wanna, gonna, kinda, sorta, betcha, but do not invent non-words ("fulla" for "full of", "lika" for "like a").

Examples (only a small, non-exhaustive set of potential issues):
- Use "yup" and "yeah" if this is what the speaker says. [The transcription "yea" (rhymes with "day") is only used for the exclamation meaning "woohoo".]
- Note: Do not use shortened forms like "gettin'" or "gettin" to transcribe -ing words like "getting".

- You may transcribe with "ouais" rather than "oui" if this is what the speaker says.

- You may use 1st-person singular forms without the final -e if this is what the user says ("ich geh ins Kino"), and conversely imperative forms with the final -e ("denke nicht dran").

Spelling Reform:
Use post-reform spelling rules for languages with a spelling reform (German, French, Portuguese), e.g. German: write "dass" rather than "daß".

Tagging Guidelines:
The following tags will be used:
Each entry gives the tag, its definition, and its keyboard shortcut(s).
<NPS></NPS> Non-primary speaker (NOT overlapping).
If you hear a secondary speaker other than the primary (main) speaker and the two do NOT overlap, use this tag.
When there is more than one non-primary speaker, it is only necessary to tag once.
1. If the speech can be clearly understood, transcribe what is said between the tags.
2. If the speech cannot be understood, mark it as <NPS><UNKNOWN/></NPS>.
<NPS> : Ctrl + n
</NPS> : Alt + n
<CNPS></CNPS> Continuous non-primary speaker (overlapping).
When other speakers (including but not limited to those from TV) are speaking at the same time as the primary speaker, use this tag around the transcription for the primary speaker.
This tag pair indicates continuous secondary speakers in the background, whether or not the background speech can be understood.
If you don't hear this clearly, do not tag. Don't push your headphones against your ears. Don't raise the volume above 60%.
If you hear people talking in the background (CNPS) AND a very loud continuous non-human noise in the background (CNON) or speech noise like humming or coughing (CSPN), choose CNPS, as speech takes precedence over noises. In theory, human background speech can easily interfere with recognition of the primary speaker, even at lower volumes, so choose CNPS over CSPN and CNON.
Only use one of the continuous noise tags. Continuous noise tags are not to be used together.
<CNPS> : Ctrl + q
</CNPS> : Alt + q
<CNPS></CNPS> : Ctrl + 1
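The precedence among continuous tags can be sketched as a small decision function. The name `choose_continuous_tag` is hypothetical. The guidelines state only that CNPS takes precedence over CSPN and CNON; they do not rank CSPN against CNON, so this sketch raises an error for that combination instead of guessing.

```python
from typing import Optional, Set

CONTINUOUS_TAGS = {"CNPS", "CSPN", "CNON"}


def choose_continuous_tag(heard: Set[str]) -> Optional[str]:
    """Pick the single continuous tag to use, given everything that was heard."""
    unknown = heard - CONTINUOUS_TAGS
    if unknown:
        raise ValueError(f"not a continuous tag: {sorted(unknown)}")
    if not heard:
        return None  # nothing continuous in the background: no tag
    if "CNPS" in heard:
        return "CNPS"  # background speech takes precedence over noises
    if len(heard) == 1:
        return next(iter(heard))
    # Assumption: CSPN vs. CNON is not ranked in the guidelines, so defer.
    raise ValueError("precedence between CSPN and CNON is not specified")


print(choose_continuous_tag({"CNPS", "CNON"}))
print(choose_continuous_tag({"CSPN"}))
```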
<UNKNOWN/> Obvious human speech command, but the actual words cannot be determined for any reason.
Situations where <UNKNOWN/> is usually used:
- The word is cut off in the middle;
- Strong accent and/or foreign language;
- The speaker's voice is too low or too fast to hear;
- You are not sure what the speaker is trying to say;
- No meaningful words at all.

Usage of another language is okay when these words are part of the usual language (for example: sorry, okay) because it is EXPECTED. For other words, just put <UNKNOWN/>. This does not automatically require <NONNATIVE/>; the criteria for <NONNATIVE/> stand alone.
Ctrl + w

<NIS/> Use the <NIS/> ("Not Intended for Service/Device") tag for accidental recordings caused by "pocket dial" (accidental dialing):
- The audio is entirely casual conversation that is obviously not intended for web search or the device;
- The audio is entirely TV or radio;
- The whole utterance is continuous non-human noise.

No other tags are needed when an utterance is marked as <NIS/>.
If the primary speaker is addressing the device, please transcribe the irrelevant non-primary speakers in the utterance if their words can be clearly understood.
If the whole utterance is continuous non-human noise, please mark it as <NIS/>. If it's one sudden non-human noise (for example, a beep), please mark it as <NON/>.
If part of the speech is directed to the machine, please transcribe it.
Ctrl + d

<SPN/> Non-speech noise (human noise), e.g. singing, coughing, audible breath, etc.
Ctrl + p
<CSPN></CSPN> Continuous human noise (with no words) in the background, e.g. continuous singing (even singing words), baby crying, coughing, audible breath, humming of songs, laughter, etc.
Only used when you're sure there is no one talking and there are only human noises in the background.
If you don't hear this clearly, do not tag. Don't push your headphones against your ears. Don't raise the volume above 60%.
If you hear people talking in the background (CNPS) and someone humming or making other human noise in the background (CSPN), choose CNPS over CSPN and use it alone. Continuous noise tags are not to be used together.
Ctrl + k
Alt + k
Ctrl + 2
<NON/> Non-human noise, e.g. static from radios or TV, mouse or keyboard clicks.
Even if you hear a sudden noise in the middle of a word, use <NON/>. If the noise is closer to the beginning of the word, tag it before the word; if the noise is closer to the end of the word, tag it after the word.

Ctrl + o
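The placement rule for a sudden noise inside a word can be sketched as follows. The function name `place_non_tag` and the idea of expressing the noise position as a fraction of the word's duration are assumptions made for illustration. The guidelines do not say what to do when the noise falls exactly in the middle, so placing the tag after the word in that case is an arbitrary choice.

```python
def place_non_tag(word: str, noise_position: float) -> str:
    """Put <NON/> before or after `word`.

    noise_position is the point where the noise occurs, as a fraction of the
    word's duration: 0.0 is the very start of the word, 1.0 the very end.
    """
    if not 0.0 <= noise_position <= 1.0:
        raise ValueError("noise_position must be between 0.0 and 1.0")
    if noise_position < 0.5:
        return f"<NON/> {word}"  # closer to the beginning: tag before the word
    # Closer to the end. Assumption: an exact midpoint is treated the same way.
    return f"{word} <NON/>"


print(place_non_tag("weather", 0.2))
print(place_non_tag("weather", 0.9))
```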
<CNON></CNON> Continuous non-human noise: tag the section of speech that is accompanied by continuous non-human noise. This can be the whole utterance or part of an utterance.

IMPORTANT: Only tag when the volume of the continuous non-human background noise is near, at, or higher than the volume of the speech, so that it interferes with the primary speaker's speech. If you can hear what the primary speaker is saying without problems, then DO NOT TAG.
Note: ONLY if recordings are made in noisy environments should the continuous humming non-human background noise be tagged with <CNON></CNON>; noises that STAND OUT or sudden non-human noises shall be tagged as <NON/>.
Ctrl + u
Alt + u
Ctrl + 3
<FILL/> Use the <FILL/> tag if the user is producing a filled pause (umm, er, ah, etc.). If they are using a real word as filler, please transcribe that word.

Ctrl + f
<SBP/> Sentence Boundary Pause. Usage of <SBP/> is subjective, and it should ONLY be used where there is clearly a significant pause. DO NOT use this tag if you are not sure.
Ctrl + b
<SB/> Sentence Boundary. Two sentences/commands together without much of a pause in between.
Ctrl + g
<NOSOUND/> Use the <NOSOUND/> tag if the audio file can be opened but there is no sound at all.
Ctrl + j
<NONNATIVE/> Put <NONNATIVE/> at the beginning of the transcription if the primary speaker of the utterance sounds like a non-native speaker of the locale.

This tag applies to all locales, including en-US, and should be used whenever the transcriber hears a speaker with a non-native accent. For instance, a de-DE utterance should be marked <NONNATIVE/> if it sounds like the speaker is not a native speaker of German from Germany.
If you're not sure, or you cannot tell, DON'T TAG <NONNATIVE/>.
The NATIVE criterion applies to the locale you're working in. Anything else is NONNATIVE. For example, an Argentinean accent in an es-MX (Spanish, Mexico) task is <NONNATIVE/>.
Ctrl + m
<CORRUPT/> Use the <CORRUPT/> tag if the utterance cannot be opened due to an issue with the audio file.
Note: ONLY use it when the audio cannot be played. If an utterance can be played but there's no sound at all, use the <NOSOUND/> tag.

Ctrl + r
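As a closing illustration, the tag inventory above lends itself to a simple consistency check on finished transcriptions. The sketch below is not part of any official tooling: `check_tags` is a hypothetical helper, the split into standalone and paired tags follows the way the tags are written in these guidelines, and allowing properly nested pairs is an assumption.

```python
import re

STANDALONE = {"UNKNOWN", "NIS", "SPN", "NON", "FILL", "SBP", "SB",
              "NOSOUND", "NONNATIVE", "CORRUPT"}
PAIRED = {"NPS", "CNPS", "CSPN", "CNON"}
CONTINUOUS = {"CNPS", "CSPN", "CNON"}
TAG_RE = re.compile(r"<(/?)([A-Z]+)(/?)>")


def check_tags(transcription: str) -> list:
    """Return a list of problems found in the tags (empty list means OK)."""
    problems = []
    open_stack = []
    continuous_used = set()
    for m in TAG_RE.finditer(transcription):
        closing, name, self_closing = m.group(1) == "/", m.group(2), m.group(3) == "/"
        if self_closing and not closing:
            if name not in STANDALONE:
                problems.append(f"unknown standalone tag <{name}/>")
        elif not self_closing and name in PAIRED:
            if not closing:
                open_stack.append(name)
                if name in CONTINUOUS:
                    continuous_used.add(name)
            elif open_stack and open_stack[-1] == name:
                open_stack.pop()
            else:
                problems.append(f"unexpected </{name}>")
        else:
            problems.append(f"unknown tag {m.group(0)}")
    problems.extend(f"unclosed <{name}>" for name in open_stack)
    if len(continuous_used) > 1:
        problems.append("more than one kind of continuous tag used")
    return problems


print(check_tags("<NPS><UNKNOWN/></NPS>"))
print(check_tags("<CNPS>call <CNON>mom</CNON></CNPS>"))
```

A check like this catches only formal slips, such as writing `<SBP>` without the slash or combining two continuous tags; whether a tag is appropriate still depends on listening to the audio.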