Although I have seen scattered pieces of the tools contained in Descript elsewhere (some of which I have reviewed before), Descript is the very first application/service I have seen that combines auto-transcription of audio (without training) with manual correction and editing of the text, as in a dedicated text editor, where the audio instantly reflects the changes made in the text. For example, you eliminate ums and repeated words via text, and they are instantly removed from the audio track. Descript allows us to export the resulting edited audio and text files in a variety of formats, including captions/subtitles. I was blown away by the tutorial, and then used Descript to work with an actual recording: my recent interview on Chris Curran’s show, the Podcast Engineering Show. Read on to learn how the tutorial and actual use blew me away, and what I hope Descript will add soon.
Recent audio interview
Chris Curran recently invited me on his Podcast Engineering Show, and I am scheduled to be back there soon. I used the audio file from this episode to learn how to use Descript, beyond the tutorial. Although I have not (yet) done the entire episode, I did more than enough to determine how intuitive and powerful the Descript app is.
Why didn’t I use any of my own show material to test Descript? Because currently, Descript only accepts audio files that are in the English language. Although I am preparing shows in English, the only ones that are already “on air” (like CapicúaFM) are not in English. Fortunately, I was able to use the above interview with Chris Curran, since it is in English.
Descript’s killer features that blew me away
Although some of these features exist in other tools (some of which I have reviewed before), the unique synergy of having all of them in a single app changes everything.
- Auto transcription: I imported the audio file, which runs over an hour. Five minutes later, Descript had finished the automated transcription of the interview. Thanks to Google Speech technology, it is amazingly accurate for automated software.
- Correct text: I was able to fix text immediately, e.g. homophones (like hear/here, their/there, to/too/two, you’re/your) and punctuation marks. In those cases, the audio sync is already perfect, and only the visual element needs to be polished.
Venn diagram which compares and contrasts heterographs, homophones, homographs, and synonyms. Diagram by Will Heltsley, used by permission under GNU Free Documentation License.
- Edit audio (by editing text): I was able to delete ums and repeated words via text in the same window, and they were instantly removed from the audio! I just had to switch from Correct text to Edit audio within the text window.
- Audio word processing: By clicking the words on the Wordbar (upper part of the audio timeline, just above the waveforms, as shown above) you can delete or reorder words or phrases. You can also add or remove pauses.
- Export audio, text, captions/subtitles: See more details ahead in this article.
- Powerful keyboard shortcuts for almost everything, which means keeping your hands on the keyboard most of the time.
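To illustrate the idea behind "Edit audio (by editing text)", here is a minimal sketch in Python. It is not Descript's actual implementation, which I have not seen. It simply assumes that the transcription gives each word a start and end time, so deleting a word in the text tells the editor which slice of audio to skip. The `Word` class, the `FILLERS` set and the `keep_segments` function are hypothetical names invented for this example.

```python
from dataclasses import dataclass


@dataclass
class Word:
    text: str
    start: float  # seconds from the beginning of the recording
    end: float


# Hypothetical list of filler words to delete; a real editor lets you pick them.
FILLERS = {"um", "uh"}


def is_filler(word):
    return word.text.lower() in FILLERS


def keep_segments(words, is_removed):
    """Return merged (start, end) time ranges covering the words that are kept."""
    segments = []
    previous_kept = False
    for word in words:
        if is_removed(word):
            previous_kept = False
            continue
        if previous_kept:
            # Extend the current segment through this word (keeps natural pauses).
            segments[-1] = (segments[-1][0], word.end)
        else:
            segments.append((word.start, word.end))
        previous_kept = True
    return segments


transcript = [
    Word("So", 0.0, 0.3),
    Word("um", 0.35, 0.8),
    Word("I", 0.85, 0.95),
    Word("use", 0.95, 1.3),
    Word("uh", 1.4, 1.9),
    Word("Descript", 2.0, 2.6),
]

for start, end in keep_segments(transcript, is_filler):
    print(f"keep audio from {start:.2f}s to {end:.2f}s")
```

Playing back only the kept ranges, one after another, gives audio without the deleted words. A real editor would also need crossfades at the cut points to avoid clicks, which this sketch leaves out.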
Three features I haven’t yet tried myself, but are self-explanatory:
- Export session for fine tuning in Apple Logic Pro X/Avid Pro Tools (.aaf), Adobe Audition (.sesx), OMF or Samplitude EDL (.edl).
- Web publishing and commenting allows collecting feedback by letting collaborators listen to audio and leave comments on an interactive transcript.
- White Glove service (optional) to subcontract Descript for human editing at US$1 per minute. (Descript employees and White Glove transcriptionists sign confidentiality agreements.)
What’s the deal?
The Descript app is a free download, currently for macOS. (More platforms to come.)
With Descript Standard, automatic transcription costs US$0.07 per minute. In the Conclusions section of this article, there’s a link to get your first 100 minutes of transcription free. (Otherwise you’d only get 30 minutes free.) The early adopter price for Descript Standard is currently US$10 per month, rather than the official list price of US$20 per month. That includes full audio editing capabilities. This offer is subject to change.
Descript Free is for occasional transcribers who don’t need to edit audio, but only transcribe and export text, captions and/or subtitles. For them, the automatic transcription costs US$0.15 per minute (15 cents), with no monthly subscription.
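For readers weighing the two plans purely on transcription cost, the prices above can be compared with a little arithmetic. The following Python sketch is only an illustration under my own assumptions: Standard is billed as the US$10 early adopter fee plus US$0.07 per minute, Free is billed at US$0.15 per minute, and the free introductory minutes are ignored. Amounts are kept in whole cents to avoid rounding surprises.

```python
# Prices quoted in this article, in US cents (subject to change).
STANDARD_MONTHLY_CENTS = 1000  # early adopter price per month
STANDARD_PER_MINUTE_CENTS = 7
FREE_PER_MINUTE_CENTS = 15


def standard_cost_cents(minutes):
    """Monthly cost on Descript Standard for the given transcribed minutes."""
    return STANDARD_MONTHLY_CENTS + STANDARD_PER_MINUTE_CENTS * minutes


def free_cost_cents(minutes):
    """Monthly cost on Descript Free for the given transcribed minutes."""
    return FREE_PER_MINUTE_CENTS * minutes


def break_even_minutes():
    """Minutes per month at which both plans cost the same."""
    return STANDARD_MONTHLY_CENTS / (FREE_PER_MINUTE_CENTS - STANDARD_PER_MINUTE_CENTS)


for minutes in (60, 125, 200):
    standard = standard_cost_cents(minutes) / 100
    free = free_cost_cents(minutes) / 100
    print(f"{minutes} min: Standard US${standard:.2f}, Free US${free:.2f}")

print(f"Break-even: {break_even_minutes():.0f} minutes per month")
```

Under those assumptions, the two plans cost the same at 125 minutes of transcription per month. Below that, Descript Free is cheaper for pure transcription. Above it, Descript Standard is cheaper, and it also includes the audio editing.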
Audio export options
The edited audio can be exported as AIFF, WAV or AAC (.m4a).
I am surprised that the MP3 codec is not currently included in the list, since it recently became royalty free. I happen to prefer AAC over MP3, but I realized how important it was to distribute in MP3 many years ago, when a guy from Tenerife, Spain wrote to tell me that he wanted to subscribe to my radio show as a podcast, but he could only play MP3 files on his device, not AAC. That’s why I switched and, for now, still distribute my radio show as a podcast using the MP3 codec. It’s old and it’s inefficient compared to AAC, but it plays everywhere and sounds good, as long as your original is very good and you know the ideal way to encode it.
Text export options
The text export options are Microsoft Word (.docx) and Rich Text (.rtf).
Captions/Subtitles export format
After you transcribe and edit audio and text, you can export subtitles and captions from transcripts, directly from Descript. Descript currently exports:
- SubRip (SRT) files, commonly used to add subtitles or captions to DivX and DVD video playback, and also Amazon Video Direct’s preferred format for subtitles, as I covered in this 2016 article.
- Web Video Text Tracks (VTT) files, commonly used to add subtitles or captions to web video formats.
Two formats are far fewer than the 28 export formats available from MovieCaptioner for Mac (which I reviewed in 2016), but that is a much more specialized tool. Descript might offer more caption/subtitle export formats in the future, depending upon customer demand. Otherwise, Descript could be used for part of the work, and other tools, like MovieCaptioner for Mac or SoftNI for Windows, could be used for the rest. See My wish list (ahead) to see the particular format I’d like added first to Descript.
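The two caption formats that Descript exports are close relatives. A WebVTT file begins with a `WEBVTT` header line and uses a period before the milliseconds in its timings, while SubRip uses a comma and has no header. The Python sketch below shows a simple conversion from one to the other. It is my own illustration, not something Descript provides. The `srt_to_vtt` function name is hypothetical, and the sketch assumes a plain SRT file without styling tags or positioning data.

```python
import re

# Matches an SRT timestamp such as 00:00:01,000 (hours:minutes:seconds,milliseconds).
_SRT_TIMESTAMP = re.compile(r"(\d{2,}:\d{2}:\d{2}),(\d{3})")


def srt_to_vtt(srt_text):
    """Convert plain SubRip text to WebVTT text."""
    lines = ["WEBVTT", ""]
    for line in srt_text.strip().splitlines():
        if "-->" in line:
            # Only timing lines are changed; commas in the caption text stay as-is.
            line = _SRT_TIMESTAMP.sub(r"\1.\2", line)
        lines.append(line)
    return "\n".join(lines) + "\n"


sample_srt = """1
00:00:01,000 --> 00:00:03,500
Hello, world.

2
00:00:04,000 --> 00:00:06,000
Second cue.
"""

print(srt_to_vtt(sample_srt))
```

The SRT cue numbers are left in place, since WebVTT accepts a line before each timing as an optional cue identifier.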
My wish list
Here is my wish list, in order of priority (for me). You may have a different set of priorities:
- While maintaining the English-language user interface, add transcription of audio in Castilian, the most widely used Spanish language, but not the only one. Considering the Google engine already supports transcribing Castilian (even though Google inaccurately calls it “Spanish”), this should take minimal effort for the Descript developer. Castilian is the second most spoken language in the US, and an official language in 21 countries. Castilian is considered the world’s second-most spoken native language, after Mandarin, which is the most widely spoken Chinese language.
- Add MP3 export option (even though it’s old and inefficient) for the reason indicated earlier in this article.
- Add SCC export capability, Amazon Video Direct’s preferred format for closed captions. This is separate from SubRip (SRT), Amazon Video Direct’s preferred format for subtitles, which fortunately is already available in the current version of Descript, as indicated in the prior section. If you have read my article about Amazon Video Direct, you already know that closed captions are obligatory there, just like on over-the-air television in the United States since 2010.
- Web version of Descript, to expand its compatibility beyond macOS to ChromeOS (Chromebook and Chromebox), Linux and Windows. (This one is already listed as “coming soon” by Descript.)
Conclusions, and how to get your first 100 minutes of auto-transcription free.
If you would like to try Descript, use this special referral link to get 100 minutes of auto-transcription free. The Descript application is free to download. Current pricing is on the Descript website.
I have never seen anything like Descript before. For a 1.0.1 version (which happens to be a palindromic number, and I love palindromes), Descript has taken auto and manual audio transcription, text editing, audio editing, captioning and subtitling where no single app has ever gone before (to my knowledge).
Upcoming articles, reviews, radio shows, books and seminars/webinars
Stand by for upcoming articles, reviews, and books. Sign up to my free mailing list by clicking here. Most of my current books are at books.AllanTepper.com, and my personal website is AllanTepper.com.
If you would like to subscribe to my list in Castilian, visit here. If you prefer, you can subscribe to both lists (Castilian and English).
No manufacturer is specifically paying Allan Tépper or TecnoTur LLC to write this article. Descript gave Allan Tépper sufficient transcription credit to carry out the auto transcription of his test for this review. Some of the other manufacturers listed above have contracted Tépper and/or TecnoTur LLC to carry out consulting and/or translations/localizations/transcreations. Many of the manufacturers listed above have sent Allan Tépper review units. So far, none of the manufacturers listed above is/are sponsors of the TecnoTur programs, although they are welcome to do so, and some are, may be (or may have been) sponsors of ProVideo Coalition magazine. Some links to third parties listed in this article and/or on this web page may indirectly benefit TecnoTur LLC via affiliate programs. Allan Tépper’s opinions are his own.
Copyright and use of this article
The articles contained in the TecnoTur channel in ProVideo Coalition magazine are copyright Allan Tépper/TecnoTur LLC, except where otherwise attributed. Unauthorized use is prohibited without prior approval, except for short quotes which link back to this page, which are encouraged!