Transcription Guidelines

  1. Introduction
  2. Transcription accuracy and structure
  3. Full verbatim transcription
  4. Clean verbatim transcription




Transcribing is the process of converting audio to text, either manually or with automated transcription software. It is used by researchers, students, journalists, lawyers and others, in scientific and non-scientific fields.

Professional transcribers and people who use automated transcription software both need to follow certain guidelines. Transcription software transcribes the file automatically, so you only need to check the result against the guidelines and make changes where necessary. Guidelines vary with the type of audio being transcribed and from one transcription company to another. They exist to make transcripts accessible to everyone. Every transcription company has a guideline book that you must follow if you work for it as a professional transcriber. If you transcribe something for yourself, you don’t need to follow any specific rules, but some of these rules make the transcript more structured and therefore easier to read and analyze.

Guidelines in one company may contradict the guidelines in other companies.

Let’s have a look at some of the factors every transcriber should consider before, during and after transcribing a file depending on the company guidelines or personal choices.



Transcription accuracy and structure


Type the words you hear in the audio file. There might be some words you don’t understand on the first try. Listen to that part again. If you still don’t understand the word, one option is to tag it as “inaudible” with its timestamp. If you are doing the transcript for yourself, you can also make a note in a separate notebook.
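As a rough illustration of tagging an unclear word with its timestamp, the sketch below builds a tag from a playback position in seconds. The tag layout ([inaudible HH:MM:SS]) and the function names are assumptions made for this example; a company guideline book may prescribe a different format.

```python
def format_timestamp(seconds: float) -> str:
    """Convert a playback position in seconds to HH:MM:SS (fractions are dropped)."""
    total = int(seconds)
    hours, remainder = divmod(total, 3600)
    minutes, secs = divmod(remainder, 60)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}"


def inaudible_tag(seconds: float) -> str:
    """Build an inaudible tag; the bracketed layout is an assumed convention."""
    return f"[inaudible {format_timestamp(seconds)}]"


# 754 seconds is 12 minutes and 34 seconds into the recording.
print(inaudible_tag(754))  # prints [inaudible 00:12:34]
```

Keeping the timestamp in a fixed HH:MM:SS form makes it easy to jump back to the same spot in the audio later.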

Don’t omit any words deliberately. Those words could be important in the context, especially if the topic is about a serious issue. Even though you may feel that those words are off-topic, you still should type them.

Don’t add any words that aren’t in the audio file.

Don’t paraphrase or reconstruct the words, sentences or ideas you hear in the audio.

Spelling is important. In some cases, the correct spelling depends on the context. For instance, some words are spelt differently in American and British English.

Some speakers make grammatical mistakes. It is generally recommended to leave those mistakes as they are, but whether you should correct them depends on the regulations of the company you work with. If it is your own transcript and you plan to publish it, you will probably want to correct the grammatical mistakes you find, and even have a professional editor or proofreader go over it.

Dividing the transcript into paragraphs makes it easier to read. Searching for something in a wall of text without paragraphs is difficult. The length of each paragraph depends on the file you have. Paragraphs can be divided according to speakers or ideas. As a rule of thumb, keep every paragraph to 4-5 lines where possible.

All new sentences should start with a capital letter. Punctuation marks are important in order for the reader to be able to differentiate between the sentences.

Labelling speakers is another factor every transcriber should pay attention to. You can label the speakers as Speaker 1, Participant 1, Organizer, etc. If their name is known, you can use that instead. If the name is revealed later at some point, you can go back and change the speaker label to that name. With the “search and replace” function, you can find and change them with one click.
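If the transcript is stored as plain text, the “search and replace” step can also be scripted. The sketch below assumes each turn starts a line with a label followed by a colon (for example, Speaker 1: Hello.); that layout, the function name and the sample names are assumptions for illustration only.

```python
import re


def rename_speaker(transcript: str, old_label: str, new_label: str) -> str:
    """Replace a speaker label only where it starts a line and is followed by a colon."""
    pattern = re.compile(rf"^{re.escape(old_label)}:", flags=re.MULTILINE)
    # A function is used as the replacement so the new label is inserted literally.
    return pattern.sub(lambda _match: f"{new_label}:", transcript)


sample = (
    "Speaker 1: Hello.\n"
    "Speaker 2: Hi, did Speaker 1: really say that?\n"
    "Speaker 10: Hey.\n"
    "Speaker 1: Yes."
)
print(rename_speaker(sample, "Speaker 1", "Anna"))
```

Anchoring the match to the start of a line and requiring the colon keeps the replacement from touching Speaker 10 or a label that is merely mentioned inside someone’s speech.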

Titles of films, TV shows, books, magazines, etc., should be italicized.

It is also important to consider nonverbal communication. Nonverbal communication such as crying, laughter, pauses, etc., is usually transcribed using square brackets: [crying], [laughter], [pause].

If the speaker quotes someone, the quote should be inside double quotation marks.

Proofreading the transcript is one of the last steps but one of the most important. You can hire a proofreader or use proofreading apps to correct any possible mistakes you may have made.

As mentioned earlier in the article, the format, structure and other aspects of a transcript depend on the guidelines each company has. When it comes to what to include in the transcript, there are two forms of transcription.

  1. Full verbatim transcription
  2. Clean verbatim transcription


Full verbatim transcription


Full verbatim transcription - transcribing everything you hear.

It includes the mistakes, errors, corrections, etc., that the speaker makes.

What should you include during the full verbatim transcription?

False starts – The speaker wants to start with a certain sentence but changes it to a similar one. For example: “I would like to talk, um … today’s topic is climate change and how it affects countries.”

Speech errors – The speaker makes a mistake but corrects it. For example: “I was in a meeting on Thursday, sorry no, Wednesday.”

Slang words – These are used informally in speech rather than in writing and can be specific to certain people or areas. Widely used examples include gonna, wanna, shoulda, gotta, kinda, yep and yeah.

Filler words – Words that don’t contribute much to speech most of the time: um, oh, you know, like, kind of.

Filler words may have significant importance in some cases.

Repetitions – Words or sentence fragments that we naturally repeat sometimes: “I would … I would like to talk about climate change.”

Stutters – These are incomplete word fragments: “I ar-arrived here the last mo-month.”


Clean verbatim transcription


Clean verbatim transcripts don’t usually contain false starts, speech errors, filler words, slang words, repetitions and stutters, but there are exceptions.

Repetitions – Some words repeated a few times may add value to the sentence and context. These aren’t true repetitions, because they are used to place emphasis on something: “It is a very, very good idea.”

Filler words – Although filler words are usually easy to identify, in some contexts they aren’t filler at all and contribute to the speech.

Slang words – In a clean verbatim transcript, slang words should be written in their full, proper forms (yes instead of yep, don’t know instead of dunno, want to instead of wanna, kind of instead of kinda, etc.), and some slang words should be avoided completely.

Words such as “okay” should be written in full (“okay”) rather than abbreviated (“ok”).
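For a transcript held as plain text, the written-in-full rule can be sketched as a small word-level replacement. The mapping below is only an illustrative assumption built from the examples in this article (with wanna expanded to want to); a real style guide would supply its own list, and context-dependent words still need a human check.

```python
import re

# Illustrative mapping only; extend or change it to match your guideline book.
INFORMAL_TO_FULL = {
    "yep": "yes",
    "dunno": "don't know",
    "wanna": "want to",
    "kinda": "kind of",
    "gonna": "going to",
    "ok": "okay",
}

# Match whole words only, regardless of case.
_PATTERN = re.compile(
    r"\b(" + "|".join(re.escape(word) for word in INFORMAL_TO_FULL) + r")\b",
    flags=re.IGNORECASE,
)


def write_in_full(text: str) -> str:
    """Replace informal forms with their full forms, keeping a leading capital."""

    def replace(match: re.Match) -> str:
        word = match.group(0)
        full = INFORMAL_TO_FULL[word.lower()]
        return full[0].upper() + full[1:] if word[0].isupper() else full

    return _PATTERN.sub(replace, text)


print(write_in_full("Yep, I kinda wanna go, but I dunno."))
# prints: Yes, I kind of want to go, but I don't know.
```

Matching on word boundaries keeps the replacement from touching longer words that merely contain one of the informal forms, such as “look”.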

If there is a dialogue in which one speaker (speaker A) asks questions and the other speaker (speaker B) answers “yes” to each of them, you may combine speaker A’s questions into one paragraph and record all of speaker B’s “yes” answers as a single “yes”.