Live Action to Anime with Corridor Digital by Iain Anderson

With machine learning to speed things along

March 8, 2023

Introduction

In recent years, the quality and variety of YouTube videos has gone way, way up. As viewers split their time across more and more platforms, and as traditional broadcast audiences shrink, you’re more likely to find content for your specific niche on YouTube than you are by searching through all the cable channels in the world.

Visual effects is one of those niches, and it’s the major focus of Corridor Digital — though they also create videos on debunking, stunts, and other filmmaking topics. Recently, Weta’s VFX Supervisor for Avatar: The Way of Water appeared as a guest in one of their videos, breaking down some of the film’s most advanced water-related VFX shots.

But pure VFX isn’t our focus today. Recently, they showcased a new way to turn live action into animation without having to draw every frame by hand, and that’s what I’ll be examining here. Before we dig into the details of their workflow, let’s have a look at how this problem has been tackled in the past.

Before we start, please note — this is not a sponsored post and there are no affiliate links. If you want to explore Corridor Digital’s workflow, most of their content is free on YouTube, but the full tutorial is behind a paywall with a free trial. This article provides a quick overview, but please support Corridor directly if you’d like to use this workflow yourself.

Rotoscoping and other approaches

Anime-style animation has always been drawn by hand, and the move to computers hasn’t made it a much more automated process. While 3D models can play a part with background elements, characters are typically still drawn line by line. At the end of the day, an artist ends up drawing the outline of a character in one frame, and then another, and another, until it’s done.

Because it’s not always easy to create a movement entirely from your imagination, live video is frequently used as a reference — often tracing directly over the original video. Max Fleischer invented this technique, calling it rotoscoping, in 1915, and when his patents expired, Disney used it on some of their biggest movies in the 1930s. (Here’s some history.) The term “rotoscoping” is now used to describe other frame-by-frame painting projects, such as removing wires from a VFX shot, but it’s still used in the traditional animation sense too.

In the last couple of decades, the growing power of computers has inspired some filmmakers to create animations using a more generative style, where animation is more obviously linked to the original filmed sources. A Scanner Darkly is a well-known example from 2006, and the Amazon series Undone followed a similar path in 2019 and 2022. Here’s what that looks like:

If you don’t have the budget or time to employ a team of animators for years, you can get at least part-way there by processing your video using a plug-in or pre-made effect.

Final Cut Pro’s Comic Looks effects were introduced in 2018 — not bad for realtime

Final Cut Pro includes a series of Comic Looks effects which don’t do a bad job at all, but of course, one of the big reasons to use animation in the first place is to create realities that can’t be filmed. At some point, you’ll probably need to step beyond the basics.

What AI can offer

The rise of generative AI has been explored here in many prior articles, and it’s proven to be a solid way to create novel still images in a theme. While the process today is a little like outsourcing your work to someone who doesn’t quite understand what you want, it’s cheap and quick enough that you can keep asking for more until you get what you want.

One of the major issues that generative AI faces today is accusations of copyright infringement. Machine learning models are trained on a vast set of images, and it seems that not all of those artists gave permission for their images to be included. Getty Images, for example, has alleged that Stable Diffusion used their image library without permission, and it’s certainly possible to include “stock photo” in a text prompt and receive a watermarked “iStock” image in return.

On my first attempt with Diffusion Bee, I asked for “model stock photo”, and got this

However, you don’t have to use these pre-built, potentially legally dubious models. You can train your own model, using your own source images, to create artwork in a particular style, or featuring particular people. That’s the workflow Corridor Digital went for, using a selection of source frames from the Vampire Hunter D: Bloodlust anime as the training pool.

While the legalities of this are still ambiguous (and I Am Not A Lawyer™) it’s important to remember that the images themselves are not being directly copied, just the style they’re produced in. It’s a little like asking an animator to watch all the Ghibli films a thousand times, and then asking them to make art that looks like it. Indeed, Hollywood films often use another film’s music throughout editing, then ask a composer to make a close match at the end — a process at least as legally dubious, uncreative and widely accepted.

That’s not to say that the process is free from controversy, and of course if you go looking, you’ll find a number of people who don’t like what Corridor have done here at all. If you’re curious, start in the comments, then follow the trail of angry reaction videos. Still, in theory at least, it’s possible to use a tool like this legally. So how did they do it?

The workflow

Here’s their overview video, which explains everything — watch the whole thing.

First, they shoot their live video plates, in costume, against a green screen, with phones. Like the text-to-image generators, they’re using a Diffusion technique to transform one image into another, rather than starting from a text prompt. To control the output style, they’re using a model trained in a particular anime style, and with a series of shots of the actual people (themselves) who appear in the real-life videos.

The major remaining issue is that output tends to flicker wildly frame-to-frame. Because the AI processes each frame on its own rather than as part of a sequence, temporal stability can suffer. It’s an issue I’ve sometimes seen with the excellent AI-based person-keying plug-in Keyper. To combat this, a simple Deflicker effect is added in DaVinci Resolve, or a series of them if needed. That’s it: style flicker and light flicker controlled.
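To make the idea behind deflickering concrete, here is a minimal Python sketch of one simple approach: pulling each frame’s average brightness toward a rolling average of its neighbours. This is an illustration under assumptions, not how DaVinci Resolve’s Deflicker effect is actually implemented. The function name `deflicker_gains`, the `radius` value, and the sample brightness figures are all hypothetical.

```python
def deflicker_gains(brightness, radius=2):
    """Return one gain per frame that pulls its brightness toward the
    average of the surrounding frames (a simple temporal smoothing).

    `brightness` is a list of per-frame mean luminance values.
    `radius` is how many neighbours on each side are averaged (assumed value).
    """
    gains = []
    for i, value in enumerate(brightness):
        lo = max(0, i - radius)
        hi = min(len(brightness), i + radius + 1)
        window = brightness[lo:hi]
        target = sum(window) / len(window)
        # Avoid dividing by zero on a fully black frame.
        gains.append(target / value if value else 1.0)
    return gains


def spread(values):
    """Difference between the brightest and darkest frame."""
    return max(values) - min(values)


# Hypothetical per-frame mean brightness with visible flicker.
measured = [100.0, 120.0, 95.0, 118.0, 102.0, 121.0]
gains = deflicker_gains(measured)
corrected = [b * g for b, g in zip(measured, gains)]

print(f"flicker before: {spread(measured):.1f}, after: {spread(corrected):.1f}")
```

Stacking several Deflicker instances is roughly comparable to running a smoothing pass like this more than once: each pass narrows the frame-to-frame variation a little further, at the cost of softening genuine lighting changes.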

Next, they strip out the green-screen background with a keyer, reduce the frame rate to 12fps, and that’s the characters sorted. Backgrounds are 2D screenshots of a stock 3D model they’ve positioned cameras within (not 3D moves around the space) and that makes the environments a relatively simple affair. From that point, it’s a more traditional compositing, editing and sound design job. There’s a lot of work in this, using animated overlays and effects for light rays, glows and lens distortions, but there’s no secret sauce in these steps.
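The keying and frame-rate steps can be sketched in a few lines of Python. This is a simplified illustration, not Corridor’s actual pipeline: real keyers handle spill, soft edges and motion blur, and the function names, the `dominance` threshold, and the source frame rates used here are assumptions made for the example.

```python
def green_screen_alpha(pixel, dominance=1.3):
    """Return 0 (transparent) if the pixel is strongly green, else 255.

    `pixel` is an (r, g, b) tuple of 0-255 values. `dominance` is an
    assumed threshold: how much greener than red and blue a pixel must be.
    """
    r, g, b = pixel
    return 0 if g > dominance * max(r, b, 1) else 255


def frames_to_keep(source_frame_count, source_fps, target_fps=12):
    """Pick which source frame indices to keep when dropping to target_fps.

    Output frame k shows the source frame at or just before time
    k / target_fps, using integer arithmetic to avoid rounding drift.
    """
    kept = []
    k = 0
    while True:
        index = k * source_fps // target_fps
        if index >= source_frame_count:
            break
        kept.append(index)
        k += 1
    return kept


print(green_screen_alpha((20, 220, 30)))    # green backdrop pixel
print(green_screen_alpha((200, 170, 150)))  # skin-tone pixel
print(frames_to_keep(12, source_fps=24))    # every second frame
print(frames_to_keep(15, source_fps=30))    # uneven 2-3 cadence
```

Dropping frames rather than blending them keeps each drawn-looking pose crisp, which suits the stepped look of hand-drawn animation.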

You can explore the full workflow here, but it’s not free — you’ll have to sign up first.

The results

Here’s the finished video:

And here’s a before/after comparison of each shot:

While it has obvious imperfections, this short is still a big technological step forward. Yes, I’m sure you could thoroughly sanitise the input data to avoid legal issues, though ethical issues would remain. I’m sure that time, effort, and a few more Deflicker instances could deal with a few of the other issues too. Maybe it would be worth generating several versions of each potential still image, and then automatically choosing the images that work best together? Temporal consistency is an area of active research in the AI community, and there’s more to come. And of course, with more time and money, you could pay artists to draw a selection of the key frames and then train the AI to fill in the gaps.
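The idea of generating several versions of each frame and picking the ones that work best together can be sketched as a small dynamic-programming problem. To be clear, this is speculation made concrete, not something Corridor did. The function names are hypothetical, the “images” are tiny lists of numbers, and a plain pixel difference stands in for whatever perceptual measure a real tool would need (one that also accounts for legitimate motion between frames).

```python
def frame_distance(a, b):
    """Mean absolute difference between two frames (flat lists of pixel values)."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def pick_consistent_sequence(candidates):
    """Choose one candidate per frame so that consecutive picks differ least.

    `candidates[t]` is a list of alternative images for frame t. Dynamic
    programming finds the sequence with the lowest total frame-to-frame
    distance. Returns the chosen candidate index for each frame.
    """
    # best_cost[j]: lowest total cost of any path ending at candidate j.
    best_cost = [0.0] * len(candidates[0])
    back_pointers = []
    for t in range(1, len(candidates)):
        new_cost, pointers = [], []
        for current in candidates[t]:
            costs = [
                best_cost[i] + frame_distance(previous, current)
                for i, previous in enumerate(candidates[t - 1])
            ]
            best_i = min(range(len(costs)), key=costs.__getitem__)
            new_cost.append(costs[best_i])
            pointers.append(best_i)
        best_cost = new_cost
        back_pointers.append(pointers)

    # Walk backwards from the cheapest final candidate.
    choice = min(range(len(best_cost)), key=best_cost.__getitem__)
    path = [choice]
    for pointers in reversed(back_pointers):
        choice = pointers[choice]
        path.append(choice)
    return path[::-1]


# Three frames, each with two tiny hypothetical 2-pixel "images".
candidates = [
    [[10, 10], [90, 90]],
    [[85, 95], [12, 8]],
    [[11, 9], [50, 50]],
]
chosen = pick_consistent_sequence(candidates)
print(chosen)
```

Here the search settles on the three candidates that stay close to one another across the sequence, rather than picking each frame’s candidate in isolation.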

Conclusion

None of this is slowing down, and no matter where you sit in the filmmaking universe, it’s wise to be aware of new techniques. Anything your clients can see is something your clients can ask for, and you just might need to get your head around a more advanced workflow someday. Yes, this will get easier over time, but that’s a double-edged sword: if a pre-made plug-in can make it easier for you, it’s making it easier for everyone else too.

On the other hand, if you can build something tailored to a specific need, something that isn’t quite as accessible to everyone, that’s your unique selling point. Embrace the challenge next time you’re between jobs, and you’ll be better suited for whatever comes next. Thanks to Corridor Digital for sharing their workflow and for everything else they do too.