Loathing and Loving Prisma by Eric Escobar

And The Tools to Build Your Own

August 1, 2016

You’ve probably noticed your Instagram feed getting filled with fine art-ish looking pictures from your friends. You’re also certain that none of your friends are actual fine artists, or at least not good enough to make such interesting images out of their usual mundane Instagram imagery.

“Coffee Grinder”, Oil Pastel on Canvas by Author (ok, not really)

Chances are, they’re using Prisma, an app for iOS and Android that converts photos into artistic renderings modeled on the work of popular artists and on various art styles. In response, you did one of two things: immediately downloaded the free app and started spamming away (like me); or you blew it off as just another auto “art” filter, soon to go the way of Kai’s Power Tools and Corel Painter.

I am almost always in the second camp: cynical about the idea of automated art and burnt out by bad final renders. Most of these attempts to build effects that mimic the delicate work of brush strokes are done with a ham-fisted convolutional filter. In the end, the images look like someone took a Sharpie marker and splooshed colored blotches all over a photograph. There’s no thought to what the image is, just a pure application of pixel math.

Prisma is not another kernel image process or a simple raster-to-vector conversion. It is something much deeper, more interesting and more troubling. Prisma is a machine learning (ML) process that deconstructs an image, identifies shapes and objects in the frame (faces, trees, water, etc.) and then changes those objects based on the style of the artist you’ve selected. The artistic style is also derived from machine learning, whereby an artistic work (or many) is fed into the system and an artistic style is abstracted into a mathematical model. The final image isn’t just a filter applied to your original image. Rather, it’s a brand new image that is an interpretation of your original photo based on a set of rules and concepts derived from an entirely different image.

Call it “neural art”.

PRISMA IS DEFINITELY NOT “ON YOUR PHONE”

Prisma doesn’t live on your phone. It runs on servers (I have no idea how many), somewhere in the cloud (the developers are in Russia). When you download the app, you’re really just downloading the front end tool where you select your image and the process you want to apply. When you click the render button, you’re just sending your order off into the ether to be analyzed and rendered, then beamed back to you, all via the wonder of the internet. There’s no local processing on your smartphone, because it would take forever and probably make it explode. There’s a lot of hot math going on.

Accessing Prisma on your iPhone is orders of magnitude faster than doing a similar process on a single personal computer, locally. This is an image effects process that takes a few hours on the fastest trashcan Mac, but a few seconds on racks of servers. As a user of this effect, you’re in a more efficient position to use it with a $300 camera phone than with a $9000 Mac, since A) your phone has a camera and you can snap directly to the Prisma servers and B) there’s no OS X app available yet. Even if there were a desktop version of the app, you’d still only be on equal footing, because the processing and rendering would be sent up to the cloud either way. Imagine if every effect you used worked this way.

In light of the rush to a subscription model for almost every application you use right now, what happens when all of the processing and rendering also moves to racks and racks of servers in the cloud? What does a post house look like in three years? Where does post happen?

ROLL YOUR OWN PRISMA

Prisma is most likely an implementation of “A Neural Algorithm of Artistic Style”, a paper that spells out the concept and its application in detail. For further interesting reading, along with some tools to build your own version of Prisma, take a look at “Neural Style” up on GitHub.
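To make the paper's core idea a little more concrete, here is a minimal sketch of its loss functions in plain Python. It is not Prisma's code (nobody outside the company knows exactly what they run), and the function names and the toy feature lists are my own assumptions. In the paper the "features" are the filter responses of a pretrained convolutional network; here they are just small lists of numbers standing in for N filters with M activations each. Style is captured by a Gram matrix (how strongly each pair of filters fires together), content by the raw feature values, and the new image is whatever minimizes a weighted sum of the two.

```python
def gram_matrix(features):
    """Correlations between filters. `features` is a list of N filter
    responses, each flattened to a list of M activations."""
    return [[sum(a * b for a, b in zip(fi, fj)) for fj in features]
            for fi in features]


def style_loss(gen_features, style_features):
    """Squared difference between Gram matrices, scaled by 1 / (4 N^2 M^2)
    as in the paper's per-layer style term."""
    n = len(gen_features)        # number of filters
    m = len(gen_features[0])     # activations per filter
    g = gram_matrix(gen_features)
    a = gram_matrix(style_features)
    squared = sum((g[i][j] - a[i][j]) ** 2 for i in range(n) for j in range(n))
    return squared / (4.0 * n ** 2 * m ** 2)


def content_loss(gen_features, content_features):
    """Half the squared difference between the raw feature values."""
    return 0.5 * sum((x - y) ** 2
                     for fg, fc in zip(gen_features, content_features)
                     for x, y in zip(fg, fc))


def total_loss(gen, content, style, alpha=1.0, beta=1000.0):
    """Weighted sum that an optimizer would minimize by changing the
    generated image. The alpha and beta defaults are illustrative only."""
    return alpha * content_loss(gen, content) + beta * style_loss(gen, style)
```

One thing worth noticing: shuffling the positions inside every filter the same way leaves the Gram matrix unchanged, so `gram_matrix([[1, 2], [3, 4]])` and `gram_matrix([[2, 1], [4, 3]])` come out identical. That is roughly why the style term carries texture and palette but not the layout of the original painting.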

This idea of “neural art”, in concept and practice, has been floating around arXiv and GitHub for a couple of years now. We all got a taste of it a while back with Google’s Deep Dream images, the ones that shocked us with trippy, psychedelic dog-infused imagery. However, the process of reconstructing what the eye sees into constituent brush strokes, shapes and color palettes is something that human artists have done since the first cave paintings. Mastering the styles of fellow artists is a key step in developing one’s own voice as a fine artist. Picasso spent years in the Louvre copying the works of the masters. Training a neural network to approximate this process is exciting and frightening.

If you dig into the arXiv and GitHub articles I linked to, you’ll see the wackiest consequence of abstracting artistic style from the art product: style merges. Take a little from Van Gogh, and a little from Kandinsky, sprinkle in some Degas and you’ll have something strange and new and unprecedented. This is similar to IBM’s Watson recipe-generating AI process, coming up with dishes a human wouldn’t have thought of. Or AlphaGo making Go moves that confounded the best players in the world. There is a different kind of “thinking” involved.
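A style merge can be sketched in a few lines once style is just a number to minimize. The version below is one plausible way to do it, not necessarily what any particular tool does: compute a Gram-matrix style loss against each artist separately, then mix the losses with normalized weights. The function names, the weights and the toy feature lists are hypothetical, and the features again stand in for the filter responses of a pretrained network.

```python
def gram_matrix(features):
    """Correlations between filters (N filters, M activations each)."""
    return [[sum(a * b for a, b in zip(fi, fj)) for fj in features]
            for fi in features]


def style_loss(gen_features, style_features):
    """Squared Gram-matrix difference, scaled by 1 / (4 N^2 M^2)."""
    n = len(gen_features)
    m = len(gen_features[0])
    g = gram_matrix(gen_features)
    a = gram_matrix(style_features)
    squared = sum((g[i][j] - a[i][j]) ** 2 for i in range(n) for j in range(n))
    return squared / (4.0 * n ** 2 * m ** 2)


def blended_style_loss(gen_features, style_feature_sets, weights):
    """Mix several styles: a weighted average of the per-style losses.
    Weights are normalized, so [3, 1] means 75% of one, 25% of the other."""
    total = float(sum(weights))
    return sum((w / total) * style_loss(gen_features, style)
               for style, w in zip(style_feature_sets, weights))


# Toy stand-ins for features extracted from two different paintings.
van_gogh = [[1, 0], [0, 1]]
kandinsky = [[1, 1], [1, 1]]
generated = [[1, 0], [0, 1]]

mix = blended_style_loss(generated, [van_gogh, kandinsky], weights=[3, 1])
```

Because the generated features here already match the first style exactly, only the second style contributes to `mix`. An optimizer pushing this number down would be pulled toward both painters at once, in proportion to the weights.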

How good can this process be?

Take a look at this article about a “neural art” system that created a brand new Rembrandt painting using a 3D printer. It’s not just that a computer system was used to cannily recreate an image that looked like a pretty good Rembrandt (art forgers have been copying and selling masterworks for centuries). It’s that a system created a new image that never existed before that looked like it could actually be an original work of the long-dead painter. The system determined the most likely type of image that Rembrandt would have painted, had he painted it, then rendered it with a 3D printer.

MORE “LESS HUMANS WORKING”

Consider this a follow-up to my alarm-bell-ringing article last year about most editorial jobs going away in the coming years. Not just editorial, but color grading, sound design and every aspect of media creation will be affected by the combination of ML and massive computational power in the cloud. Systems like Prisma and other kinds of neural art challenge the role of human creativity in the process of creating media. It may not completely replace what we do as artists, but it most certainly devalues our work. And we’re all going to jump on it, mostly because it takes away a lot of the grunt work we had to spend years learning how to do. The argument is that these kinds of intelligent applications will augment our creative work, which, indeed, they already do.

In response to the article last year, as well as the subsequent podcast, I got some responses saying that I was describing some kind of post-production singularity moment. Some people argued that machines will never be able to make the choices an experienced editor would make. Others told me that we are decades away from anything approaching that moment.

I offer up Prisma as an example of just how quickly these changes happen. A year or two ago, there wasn’t anything as powerfully complex in output or as simple to use as Prisma outside of GitHub or in computer science doctoral programs. Now it’s on your phone and millions of people are pinging the servers. In a month, Prisma will include the option to upload video clips and voila “instant Miyazaki”.

And that’s the important point: the source material (pictures, movies, paintings, etc.) used to build these models and processes is not pulled from a vacuum. Rather, it is derived from the collected works of artists over hundreds of years. This is the part that is hard to swallow: every bit of human creativity and originality that exists as recorded media serves as a free data mine for neural art machine learning systems. There is no “Miyazaki” filter without the life and art of the man who produced it. There will be no “Murch Edit Process” without a careful machine learning analysis of every single one of his edits (some of which took him days to decide). These creators will not be remunerated for their effort and work. Artists using these tools will benefit from them, but will most likely be paid far less than in the past. The work is machine-enhanced, will require less time to master, and will be about as valuable as those pictures of your plants run through the “Mononoke” filter on Instagram.

Prisma has abstracted and quantified the artistic styles of graphic artists and fine art painters. The resulting images are impressive and unprecedented. There are already projects in development applying similar ML concepts to editing style, color grading, sound design, actor performance, etc. ML is delivering, and poised to deliver even more, incredibly sophisticated tools for every aspect of media creation by digging into the areas that are currently dominated by actual human artists.

THE BRIGHT SIDE

There’s nothing stopping artists from taking control of these tools. Imagine a fine artist using an ML system to create incredible new images using their own work — inspiring bold new ideas. When it comes to analyzing temporal media, how about a tool that offers up cut and shot suggestions to an experienced editor to choose from?

Like AlphaGo making game moves a person wouldn’t make, a creative ML system may offer artistic choices that a person doesn’t see, but that spur a new rush of creativity.

UPDATE: My friend Josh Welsh, a Prisma/Instagram enthusiast (Prismagrammer?), pointed me in the direction of Artisto, the “Prisma for Video” app created in just 8 days by the Russian company Mail.ru. It’s crashy but a lot of fun. He also noticed that after running his iPhone images through Prisma, they were now geo-tagged as originating in the province of Jiangsu in China. So that’s iPhone snaps from Southern California, run through Russian code on Chinese servers, then posted to Instagram. It’s a small world after all.