Trans-ylvania by Steve Hullfish - ProVideo Coalition

October 17, 2009

The topic of transcription came up on several forums recently, so I thought I’d share the peanut gallery’s wisdom on the subject. If you don’t think you need transcriptions, think again. With the recent FCC rules, all TV programs need closed-captioning, and unless you’re doing scripted drama, you’ll need transcriptions for the closed-captioner. It’s also handy to have as an editor when you’re trying to “Frankenstein” a soundbite together and you’re looking for the sentence-ending word with an “S” at the end or something like that. Sometimes if I’m trying to end a speaker’s thought in mid-sentence, I do a search for the same ending word with a period after it. Splice them together and you’ve got an elegant ending to an otherwise choppy sounding sentence. If you’ve got a transcription to search, doing these kinds of tasks is much, much easier.
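If your transcription is a plain text file, the "same ending word with a period after it" search is easy to script. Here is a minimal Python sketch of the idea. The function name and the sample sentence are made up for illustration, and it assumes the transcript marks sentence endings with a plain period.

```python
import re

def find_sentence_endings(transcript: str, word: str) -> list[int]:
    """Return character offsets where `word` is immediately followed by a period."""
    # \b keeps us from matching the tail of a longer word (e.g. "holidays.").
    pattern = re.compile(r"\b" + re.escape(word) + r"\.", re.IGNORECASE)
    return [match.start() for match in pattern.finditer(transcript)]

# Hypothetical transcript text, not taken from a real interview.
sample = "We shot for three days. The days were long, but those were good days."

for offset in find_sentence_endings(sample, "days"):
    # Print a little context so you can spot the take you want.
    print(offset, repr(sample[max(0, offset - 20):offset + 5]))
```

Only the occurrences that close a sentence are reported, so the mid-sentence "days" is skipped. Those are the reads that usually have the falling inflection you want for a clean out.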

I could claim to be somewhat of a celebrity in transcription circles… Long ago, I posted an off-the-cuff response to a similar thread about transcription, and it was pointed out to me that since then, my name has been co-opted as a verb! Google “Hullfishing transcription.” It shows up on the websites and blogs of producers, editors, and audio editors, and even on a stenography blog! Hopefully, I get known for something a little better than this before someone has to write my obituary.

Many voice-to-text programs, like Dragon NaturallySpeaking, do a much better job of transcribing if they are first “trained” to understand a particular speaker’s voice patterns. Obviously, if you’re trying to transcribe interviews for a documentary, you can’t train the software to each speaker’s voice. So my idea was to train the software to your voice – or an assistant’s – and then listen to the interviews with headsets on and speak the words you hear shortly after you hear them, much like an on-camera talent using an ear prompter.

Once it’s set up properly, it actually works fairly well – especially with the improvements in speech-to-text software.

But there are lots of other solutions and hopefully one of them will work for your particular set of circumstances.

One recent post on the topic asked about making dubs for the transcriptionist. This may still be the way some of them work, but it’s a digital world and instead of VHS dubs, most transcriptionists use digital files. That’s good news, since you have to digitize the footage or transfer it into your editing system anyway.

What you send to the transcriptionist depends a lot on whether you want timecodes in the transcript and how often. If you just want a plain transcription, you can usually send an .mp3 or .wav file. Some transcriptionists will prefer a QT file with a timecode burn. You can send large files a number of ways:

• If you, your client or the transcriptionist has an ftp site, use that.

• On a Mac, if you’ve got an iDisk (MobileMe) account, use that.

• Try one of the web-based services for sending large files, like www.yousendit.com.

• A recent internet story showed that sending a carrier pigeon 60 miles with a flash drive attached to its leg was faster than DSL, so you could do that.

The original post was trying to figure out a way of maybe recording on set to some other medium so that the files could get to the transcriber faster. Remember that the transcriptionist can probably only work on one file at a time, so if you capture just the first tape into your editor and spit out a simple mp3 or QT file, you’ve gotten her started with part of the project and you can continue sending as you go.

If you want to record on set on the cheap so you can send something even faster, try a cheap digital recorder from one of the big-box office supply stores. If you’re looking for something cheaper and hipper, try the Griffin iTalk app for your iPhone. For those of us without the latest iPod, Belkin’s TuneTalk Stereo turns regular iPods into recording devices.

Something to remember

Most transcriptionists charge by the minute, so if you send them a file with lots of wasted “blank” time, you’re just throwing money away. Trim your files to eliminate any time before or after the interview itself.

There are plenty of transcriptionists out there on the net. Terry Curren pointed out that he has outsourced some transcriptions overseas because it’s cheaper. This is obviously a job that could be done by almost anyone, anywhere with a computer, some decent typing speed and good hearing (but PLEASE know how to spell!), so I would think that the supply and demand equation is weighted in your favor.

I was contacted by a transcriptionist who does entertainment work and here is what she had to say:
I charge $1.25/min ($75/1-hour file or a portion thereof) for transcription. Files are usually sent to me as mp3’s (wav). Since many of the files I receive are too large to send via email, the files are uploaded to a server such as Box.net and SendthisFile.com. Both servers are free and I have used them for this purpose.
I have transcribed programs for the National Geographic and Smithsonian Channels before.
Please feel free to browse through my website at www.sstranscriptions.com and to contact me if I can be of further assistance.
Regards, Susan Siegmann

I suspect that her rates and workflow are typical.
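To put a number on the "trim your files" advice, here is a rough cost sketch using the $1.25/min rate quoted above. It assumes partial minutes are billed as whole minutes, which is my guess and not something the quote spells out, and the file lengths are invented.

```python
import math

RATE_PER_MINUTE = 1.25  # rate quoted above; other transcriptionists will differ

def transcription_cost(duration_seconds: float,
                       rate_per_minute: float = RATE_PER_MINUTE) -> float:
    """Estimate cost, ASSUMING partial minutes are billed as whole minutes."""
    billed_minutes = math.ceil(duration_seconds / 60)
    return billed_minutes * rate_per_minute

# Hypothetical file: 47 minutes as captured, 41.5 minutes after trimming
# the dead air before and after the interview.
untrimmed = transcription_cost(47 * 60)
trimmed = transcription_cost(41 * 60 + 30)

print(f"untrimmed: ${untrimmed:.2f}")
print(f"trimmed:   ${trimmed:.2f}")
print(f"saved:     ${untrimmed - trimmed:.2f}")
```

Check how your own transcriptionist rounds before trusting the totals. The quote above also mentions a per-hour figure "or a portion thereof," which could round very differently.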

Doing it yourself

There are a number of solutions that make transcription easier than repeatedly hitting the space bar in Final Cut while typing madly away in Word.

One of the first pieces of software that I was introduced to for this – and wrote about in an earlier column – was called InqScribe.

Check out this PVC article where I first heard about InqScribe:
https://www.provideocoalition.com/index.php/apple/story/large_scale_final_cut_pro_installations_part_ii/

The genius of InqScribe (www.inqscribe.com) is that it allows you to control playback of a .wav or QT movie without leaving the app to type. You can pause and play with a tap of the Tab key. You can jump back 8 seconds – which is a great interval of time when transcribing – with the control-tab key. And you can control the speed of playback in one-tenth increments, so if you can find a speed where you can keep up, you don’t have to pause at all! This obviously depends a lot on the speaker. The other thing is that you can hit a keystroke that automatically inserts the precise timecode from the file into the text without having to type it out!
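For the curious, the arithmetic behind a timecode stamp and a fixed jump-back is simple. This is only a sketch of the idea in Python, not InqScribe's actual code. It assumes a whole-number, non-drop-frame rate of 30 fps, and both function names are made up.

```python
def seconds_to_timecode(seconds: float, fps: int = 30) -> str:
    """Format a playback position as HH:MM:SS:FF (non-drop-frame, integer fps assumed)."""
    total_frames = int(round(seconds * fps))
    frames = total_frames % fps
    total_seconds = total_frames // fps
    hours = total_seconds // 3600
    minutes = (total_seconds % 3600) // 60
    secs = total_seconds % 60
    return f"{hours:02d}:{minutes:02d}:{secs:02d}:{frames:02d}"

def jump_back(position_seconds: float, interval: float = 8.0) -> float:
    """Step back by `interval` seconds without running past the start of the file."""
    return max(0.0, position_seconds - interval)

position = 3661.5  # hypothetical playhead position, in seconds
print(f"[{seconds_to_timecode(position)}]")  # the stamp you would insert in the text
print(jump_back(position))                   # where playback resumes after a jump back
```

Real footage at 29.97 fps uses drop-frame timecode, which needs extra frame-skipping logic that this sketch leaves out.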

I am NOT a great typist and I can usually do my own transcriptions in about 3 times real time, meaning 3 hours to transcribe a one-hour interview.

The software is now available for both Mac and Windows and it can even export for closed captioning, or as XML files that go directly into Final Cut Pro. A few months ago, when it was in version 1, I think it was $30, but now it’s in version 2 and is $99.

Adobe Speech Search in Premiere Pro and Soundbooth

I’ve been working with Premiere’s “Speech Search” function for about a year now. Sometimes it’s pretty good, and sometimes it produces absolute gibberish. A friend of mine gave it a try on my advice and his comment on his file was that Speech Search “nailed it.”

Check out the PVC Speech Search/ScriptSync article at:
https://www.provideocoalition.com/index.php/shullfish/story/speech_search_meets_scriptsync/

It seems to depend on the speaker. Having a good audio recording is a given, of course. With crappy audio, it’s pointless. But Adobe Premiere Pro CAN create transcriptions automatically. You capture or import files into it (it also works with Soundbooth), then go to the Metadata panel for the clip and hit the Transcribe button. Transcribing a clip is basically a realtime process. It puts the transcription in the metadata window, and you can copy and paste it to Word or anything else that you want.

I wrote another article here on PVC about using Adobe’s Speech Search capability and then exporting the file to Avid, where ScriptSync lets you use the transcribed file as a bin, basically, linking the words to the exact point in the video file. Adobe does this too (linking the exact transcribed words to the exact part of the video where they’re spoken). It allows you to actually edit dialog by using text instead of scrubbing to find a word.

The other cool thing about Adobe Speech Search is that the metadata of the transcription is embedded in the video permanently, so if you edit a video with transcribed clips and export it to an edited, final video for the web, you can actually do a web search on words that are SPOKEN in the video! This requires a specific free Adobe app to do the search now and the video has to be coded, but in the near future it is possible that Google and Yahoo will be able to search for spoken words in the transcripts of video and audio files just as they search web pages for text now.
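If you want a mental picture of what "linking the words to the exact point in the video" means, think of a list of words that each carry a start time. The Python sketch below is purely illustrative. It is not how Adobe or Avid store their data, and the type, the function and the sample words are all hypothetical.

```python
from typing import NamedTuple, Optional

class TimedWord(NamedTuple):
    text: str
    start: float  # seconds from the start of the clip

def find_phrase(words: list[TimedWord], phrase: str) -> Optional[float]:
    """Return the start time of the first occurrence of `phrase`, or None."""
    target = phrase.lower().split()
    if not target:
        return None
    # Compare on lowercased words with trailing punctuation stripped.
    tokens = [w.text.lower().strip(".,!?") for w in words]
    for i in range(len(tokens) - len(target) + 1):
        if tokens[i:i + len(target)] == target:
            return words[i].start
    return None

# Made-up transcript data for illustration only.
clip = [
    TimedWord("We", 12.0), TimedWord("shot", 12.3), TimedWord("for", 12.6),
    TimedWord("three", 12.8), TimedWord("days.", 13.1),
]

print(find_phrase(clip, "three days"))  # time to cue the playhead to
print(find_phrase(clip, "four days"))   # phrase is not in the clip
```

Searching the text and jumping to the returned time is the whole trick behind editing dialog by text instead of scrubbing.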

If you’re planning on using Avid, there’s really no need for a transcriptionist to put in timecodes, since when you import a transcription into Avid using ScriptSync, it will link the text up directly to the words in the interview so you don’t need timecodes to find them.

Another transcription application comes from Videotoolshed. You can check it out here: http://www.videotoolshed.com/?page=products&pID=27

Terry Curren suggested this freeware app that is actually for subtitling, but can create a text file which could be used for transcription:
http://www.urusoft.net/products.php?cat=sw&lang=1

And finally some great additional tools from a post by Tod Hopkins:

His company uses three different tools for different reasons, including the one I already mentioned – InqScribe. For doing transcription on the same machine as the media is already on – like an FCP system – InqScribe can utilize ANY QT source, allowing you to transcribe directly from the edit media. Another cool feature that Tod points out about InqScribe is that it can match back to the source file when you click on the timecode markers in the transcribed file.

Another application they use is DittoAV, which is only $10 US. It transcribes from QT files too and can insert timecode, but only at the “end of file,” which is where it assumes you are. DittoAV is Mac only.
http://www.softlow.com/mac-os-x/business/word-processing/demo/dittoav.html

ExpressScribe Free also looks like a nice app for Mac, PC or even Linux. There’s a full-featured version and a “lite” free version. It transcribes from audio only, though. On a humorous note, while checking out the website I noticed that the software can use foot pedals for start and stop, as many transcription programs (including InqScribe) do, but their system can also be rigged to use the foot controllers from game systems! They also offer wiring directions for hacking a $12 Radio Shack foot pedal to work on your PC (sorry, no Mac-hack).
http://www.nch.com.au/scribe/pedals.html

I’d love to know how to hack my old Yamaha synth pedals to function via USB.

And if you want to REALLY get transcribed files back from your transcriptionist in a hurry, the same company that gives you ExpressScribe for free also has a phone transcription program, so you could just send the audio live through your cell phone directly to your transcriptionist!