Every now and then a little tool comes along that just might transform your editing and post-production workflow. I’m not talking about a new NLE update but rather a little tool that might stand alone or plug into your favorite NLE. One of those tools that might fit that description is Digital Anarchy’s Transcriptive. This is a $299 Adobe Premiere Pro panel ($249 until September 30) that uses cloud-based services (per-minute pricing is above and beyond the $299 cost) to transcribe your video clips and integrate them into Premiere Pro in a number of useful ways. Dollar for dollar, Transcriptive might be one of the single most useful third-party tools I’ve ever encountered for the storytelling editor.
Transcriptive is an Adobe Premiere Pro panel from Digital Anarchy that integrates automated, cloud-based transcriptions right into Premiere Pro. We took a quick look at Transcriptive a few months ago at NAB 2017. After a few months of beta testing (and an advance sale at a reduced price) Transcriptive was released this week.
Before you dig into the details of my experience (thus far) with Transcriptive check out the press release that provides a good top-down overview of Transcriptive and the bullet list points of what it can do. Like many bullet lists of a new product the actual implementation of certain features might not do exactly what you’re expecting the first time but it can give you an idea of what the product will try to do and where it might be going. Digital Anarchy also has a short FAQ on their website.
I used the beta a good bit over the months Transcriptive was in beta testing. It was an interesting process as I saw the tool go from a rather simple, straightforward transcription panel to a much more advanced tool with a lot more options and a lot more usefulness. While the basic interface didn’t change very much at all, new features were added that make Transcriptive quite adaptable to many different workflows. There are a number of features I have yet to use but as a craft editor I found myself going back to the same features over and over.
Pricing and set-up and … secrets
Digital Anarchy has priced Transcriptive at $299. That is the price for the PPro panel product and doesn’t include any actual transcription minutes. This is important to know as automated transcription is based around how many minutes of footage you want to transcribe. With this model Digital Anarchy makes its money off the selling of the panel and it’s up to the editor how they want to pay for the transcribing minutes.
Currently there are two options (I’m guessing more could be added in the future): Speechmatics and IBM’s Watson. Speechmatics is more accurate, works better with less than pristine audio and from what I can tell might be a bit faster. It’s also more expensive at 7¢ per minute vs. Watson’s 2¢. And Watson gives you 1000 free minutes a month. That’s huge. The more accurate the speech recognition the less editing and cleanup you have to do afterward. While the Transcriptive interface is optimized and pretty fast for text cleanup here’s the dirty little secret about speech recognition even at above 90% accuracy: the cleanup still sucks, isn’t much fun and can take up a lot of time.
But here’s another little secret to all you craft editors out there who aren’t looking to generate word-accurate transcriptions: You can get an incredible amount of use out of these automated transcriptions when it comes to digging into an interview without cleaning up much text. This can give you a jump on telling your story as being able to read through hours and hours of talking heads can get you to that first creative cut a lot faster than watching and logging it all yourself. This was true back when Adobe had speech analysis built into PPro (though few folks used it) and it’s especially true now with a tool like Transcriptive where speech analysis is more accurate and the interface is better for interacting with that analysis.
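To put those per-minute numbers in perspective, here is a minimal Python sketch of the arithmetic. The rates and the free allowance are the ones quoted above; the function name and the assumption that Watson’s free minutes are used up first each month are mine, so check each service’s current pricing before relying on it.

```python
# Per-minute rates in cents, as quoted in this review (may have changed since).
RATE_CENTS = {"speechmatics": 7, "watson": 2}

# Free monthly allowance in minutes, as quoted in this review.
# Assumption: free minutes are consumed before any billing starts.
FREE_MINUTES = {"speechmatics": 0, "watson": 1000}


def monthly_cost_cents(service: str, minutes: int) -> int:
    """Estimate one month's transcription cost in cents (hypothetical helper)."""
    billable = max(0, minutes - FREE_MINUTES[service])
    return billable * RATE_CENTS[service]


if __name__ == "__main__":
    for minutes in (600, 1500):
        for service in RATE_CENTS:
            cost = monthly_cost_cents(service, minutes)
            print(f"{minutes} min on {service}: ${cost / 100:.2f}")
```

Under those assumptions, ten hours of interviews in a month would be free on Watson and about $42 on Speechmatics, which is the trade-off against cleanup time described above.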
But what about actually using the Transcriptive panel?
Transcriptive both benefits from and suffers from the Adobe Premiere Pro panel/extension architecture. I talked to several developers of PPro panels recently and they all told me the same thing: developing a Premiere panel/extension can be a real pain as there are a lot of limitations as to what it can do; documentation on the process isn’t great; it can be quirky and frustrating. These were common themes.
But the plus side that I heard (and this is something I’ve always thought about panels as well) is that these third-party extension panels can add a whole level of both functionality (into the app itself) and interactivity (say with web-based services) that few NLEs can match. I do hope Adobe engineers keep developing this little area of the NLE.
How does Transcriptive suffer? Not a lot as it’s quite functional across the board.
One example is using UNDO/REDO in the Transcriptive panel. I’m on a Mac and COMMAND+Z doesn’t undo. It’s CONTROL+Z as undo, CONTROL+SHIFT+Z as a redo. This is apparently not Transcriptive’s fault but rather the panel architecture. If you’re working in the Transcriptive panel and you COMMAND+Z to undo something it will undo actions in say the PPro timeline so be careful. There are undo/redo buttons in the Transcriptive interface.
Another limitation is that if you have a lot of transcriptions you’ve done and you’re moving between sequences, Transcriptive can’t automatically load up the correct transcription as you swap sequences.
You’ll need to go under the Transcriptive menu and Load Transcript to make a current sequence active in Transcriptive. A button or keyboard shortcut for this would be nice. It would also be very nice if somewhere in the Transcriptive interface the NAME of the sequence or clip was shown as that would be a quick way to know what was loaded.
On the flip side those benefits of the Adobe Premiere Pro panel/extension architecture are tremendous. The simple fact that we have this advanced tool running right inside PPro is a testament to that. While you might have heard PPro panels are HTML5-based their development can go well beyond that as Transcriptive is pretty much its own custom developed application. The big plus of that is the advanced functionality you see as well as security so someone doesn’t just reverse engineer it and offer it up for free.
It’s easiest to go through some of the features of Transcriptive with some images so here we go.
Editing incorrect text in the Transcriptive panel will be straightforward to anyone who has used a word processor. The editing functions of any of these cloud transcription tools have to be fast and easy and Transcriptive is no exception.
A number of single click editing buttons at the top allow for some simple corrections. Transcriptive will show words in red that it has less confidence in being correct. I’m guessing this is data from the cloud service that Transcriptive is using. Hit RETURN to go into a full editing mode and ESCAPE to leave editing mode. The status bar in the upper right will change to let you know what you’re doing.
The Transcriptive panel at work
Transcriptive is a very feature-rich little tool. I’m not going to get into what is one of its biggest advantages: utilizing transcriptions for things like open and closed captions. That’s mainly a matter of exporting into one of the supported formats. Rather, I’m going to talk more about the experience itself and how you can use the data in Premiere Pro. Captions and a lot more features are detailed with videos on Digital Anarchy’s website.
If you have multiple speakers, Transcriptive can use the speech analysis from the cloud to attempt to identify those speakers. It’s not going to be 100% accurate (nothing speech-cloud related is) but it can be very helpful.
There’s a search box at the bottom to search for words or phrases. They highlight in green.
This is great when you have a speaker that might have an accent (such as the above example interview in New Orleans) and the automated transcription constantly gets a phrase wrong. A FIND AND REPLACE while in edit mode would be fantastic as would a button to jump between search results. Remember we are at version 1 of a brand new tool so maybe that’ll be in future updates. Or maybe there’s a keyboard shortcut I don’t know about.
There are a number of tools that can come in handy such as the Group Transcript by Speaker button. Doing what the name says, it can really tidy up a transcript when you have multiple people talking.
Even though the same person is speaking a lot in the transcript above they are broken into different lines.
Clicking the Group Transcript by Speaker button can clean it up and bring a lot more text into view. This will be especially useful after you’ve gone in and manually corrected errors.
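For anyone curious what that kind of grouping amounts to, here is a minimal Python sketch of the idea. It is not Transcriptive’s code: the `(speaker, text)` structure and the function name are hypothetical, and it simply merges consecutive lines that share a speaker label.

```python
from typing import List, Tuple

# Hypothetical transcript line: (speaker label, text).
Line = Tuple[str, str]


def group_by_speaker(lines: List[Line]) -> List[Line]:
    """Merge consecutive lines that share the same speaker label."""
    grouped: List[Line] = []
    for speaker, text in lines:
        if grouped and grouped[-1][0] == speaker:
            # Same speaker as the previous line: append to that line.
            grouped[-1] = (speaker, grouped[-1][1] + " " + text)
        else:
            # New speaker: start a new line.
            grouped.append((speaker, text))
    return grouped
```

Only back-to-back lines are merged in this sketch, so a speaker who returns after someone else has talked still gets a fresh line.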
There are a few things worth noting under the Transcriptive menu in the upper left that I haven’t been able to try out yet.
Those are the BATCH functions. The ability to batch transcribe clips in a number of different ways could save a ton of time. Just be careful of your minutes and the charges for the services as that could add up if you’re not careful.
Job management is where you can go to manage all of the data that Transcriptive has generated. I don’t know much about this.
And a lot of that data ends up in a Transcriptive “scratch” folder. The .flac files are the audio clips that Transcriptive extracts and uploads to Speechmatics or Watson. I suppose reading the manual would be a good idea with all of these different menu options.
What about accuracy in the transcriptions?
While the general consensus is that Speechmatics is more accurate than Watson I wanted to see them side-by-side so I took a 42 second talking head courtesy of one of my wife’s music projects and sent that to both services. We aren’t looking at how well they handle bad audio so this audio was recorded well but there are a couple of restarts in the take, a bit of mugging when there is a mess-up and just a tiny bit of background words from the camera op.
Here is the 42 second audio clip:
How did Speechmatics do?
Quite well. Near the end it’s completely accurate but early on it had some trouble with the word psalm and the numbers. It correctly identified the proper name though my wife’s spelling is Karin with an i but it had no way of knowing that.
How did Watson do?
You can see that Watson wasn’t as accurate as Speechmatics. The first thing I noticed was that Watson was never able to correctly get the word “psalm.” Being a rather interesting word that’s one reason I chose this clip. Watson also didn’t identify the “107” with numbers instead writing them out. Neither of them really knew what to do with the vocal frustrations from the talent. That’ll be a source of amusement on a lot of jobs!
What you can see from both of those transcriptions above is how you can get a very usable transcription very quickly. Both clips took about 22 seconds to export, upload, transcribe and return the transcription data. Your mileage will vary there depending on factors such as clip length, computer speed and mainly internet connection. Transcriptive doesn’t upload a piece of video for transcription but rather a much smaller extracted .flac audio file.
How can you use that transcription data?
Beyond the obvious use of open and closed captioning what I see as Transcriptive’s killer feature is the ability to send the transcription data to a number of places in Adobe Premiere Pro. The flexibility is quite amazing.
Transcriptive can export data and lay that transcription data into the sequence as Sequence Markers.
Pretty simple stuff especially in a short sequence like my 42 second example. You have an option of how often to lay those markers down but it’ll depend a bit on how Transcriptive has laid down the data. I didn’t see much difference in the above export with either 1 or 5 seconds.
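One possible explanation for seeing no difference between 1 and 5 seconds, sketched in Python. This is a guess at the general idea and not how Transcriptive is known to work: the `(start, text)` segments and the function are hypothetical, and the sketch assumes a marker can only start where a transcript segment starts.

```python
from typing import List, Tuple

# Hypothetical transcript segment: (start time in seconds, text).
Segment = Tuple[float, str]


def markers_at_interval(segments: List[Segment], interval: float) -> List[Segment]:
    """Start a new marker only when a segment begins at least `interval`
    seconds after the previous marker; otherwise append its text to that marker."""
    markers: List[Segment] = []
    for start, text in segments:
        if markers and start - markers[-1][0] < interval:
            markers[-1] = (markers[-1][0], markers[-1][1] + " " + text)
        else:
            markers.append((start, text))
    return markers
```

If the segments coming back from the service already start six or more seconds apart, a 1 second and a 5 second setting would produce the same markers under this assumption, and only a longer interval would start merging them.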
While you do have the ability to search for words right in the Transcriptive panel once you get that data into markers you can also search for words in the Premiere Marker window or the timeline.
Another potentially useful export option for transcriptions is to send that data to Clip Markers.
With a Clip Marker export there’s no link between the Transcriptive interface and the master clip loaded in the Source monitor, because Transcriptive works with a sequence timeline when it jumps your playhead around as you click in the interface. This might be a Premiere panel limitation. But you do get those full transcriptions in the clip’s Markers window which is incredibly useful in its own way. I’m glad Transcriptive puts the actual transcription into the Comments pane and not the Name pane.
A third way you can use the transcription data is to export it to Speech Analysis.
This will attach all of Transcriptive’s speech analysis to a clip which is then viewed and marked up in the Premiere Pro Metadata panel under, you guessed it, Speech Analysis.
Didn’t know Speech Analysis was there? You’re not alone. The metadata field has been there for a long time, though the built-in analysis that used to fill it was removed a few versions ago. Despite what you may have heard it was actually useful. And did you notice the IN to OUT marked in the image above? That was marked with an I and O in the Analysis Text field under Speech Analysis.
Just be careful after exporting to Speech Analysis metadata. I wasn’t able to go back to the same clip and export to Clip Markers.
For most of what I work on this ability for Transcriptive to export transcriptions as Sequence Markers, Clip Markers and Speech Analysis is the killer feature. Depending on the clip types you’re working with, Clip Markers and Speech Analysis can travel with the clips between projects (my tests were ProRes .mov files). Imagine a workflow where you have an assist station transcribing and correcting clips before sending them to the editor. Print out the transcriptions and you’ve got the best of both worlds.
I am a very big fan of Transcriptive. Adobe Premiere Pro seems like the perfect place for this to live as, if I had to guess, I think Premiere is currently enjoying a lot of success in what is often the boring world of corporate video. Something similar is coming for Final Cut Pro X and the amazingly user-friendly SpeedScriber is now shipping with multiple NLE support.
But as of right now Transcriptive seems to be the only real option for an affordable, fast, integrated transcription solution inside a non-linear editing application.
I mentioned corporate video above. Often corporate video is a lot of boring, mundane talking heads where there’s no way “they” are going to spend money on transcripts for the editor. At 300 bucks and 1000 free IBM minutes Transcriptive could be the independent editor’s secret weapon. Come to think of it … I don’t want anyone else to really know about it so I can keep it for myself.
ctrl-alt-delete this review.
- Sequence, clip and speech analysis metadata via automated cloud-based transcription services deeply integrated into Premiere Pro. What’s not a pro about that?
- Quite full featured for a version 1 product
- Documentation is a bit lacking
- Full progress bars when Transcriptive is performing a task would be nice
- No way to see what sequence you have loaded
- You’ve got to set up (and pay for) the speech transcription services independently of Transcriptive
- I’m guessing IBM could decide to do away with 1000 free minutes whenever they want to … or they could give away more
- It isn’t perfect either in accuracy or overall use but I struggle to see how Transcriptive won’t easily pay for itself many times over for most interview/talking head/dialog-based workflows
- Be sure and keep IN/OUT points cleared from the sequence you’re working in when using Transcriptive. Marked IN to OUT points seem to confuse it.