My wrap up of interesting/practical machine learning of 2020

As part of my work (but also out of personal interest), I try to stay on top of not just the research side of machine learning (ML, which many folks think of as part of artificial intelligence) but also practical and interesting examples. On the professional side, this is because a lot of the software I’ve worked to create over the past 13 years has directly built in elements of machine learning. On a personal level, I’m particularly interested in the economic impacts of what machine learning will bring: which jobs will be automated away entirely, and which will have significant portions automated (see: “Humans Need Not Apply”).

To some extent, this is a dystopian view, and I won’t get into my thoughts on what can or should be done about it. I do want to point out, though, that it’s not just the simple, repetitive, or labor-intensive jobs that can be automated. Some of the most interesting developments in machine learning over the past couple of years have been in creative tasks and in tasks that most people associate with the type of thinking only a human could do.

In this post, I’m going to outline some of the most interesting projects in ML/AI that fit the bill of doing creative tasks or logical reasoning and that have online demos or videos of the demos. Most of them launched roughly in the past 1-2 years. You can actually go play with many of these things yourself to get a sense of where certain aspects of ML stand at the start of 2021.

Image Generation

One of the things that most people associate with “a thing only a human could do” is to generate art. Maybe it’s taking a lovely photo or painting something very creative. Here are some online demos that show that machines can now do this too (and how well they do so):

DeepArt allows you to upload a photo you’ve taken and then apply a style. For example, here I am “painted” automatically by a machine in the style of Vincent van Gogh in just minutes from a photo I took in seconds. There are a number of interesting implications to this, ranging from forgery to novel artwork creations to “allowing anyone to become an artist.”

GauGAN allows you to create photo-realistic images by just drawing an image like you would in MS Paint. Here’s an image I drew of a mountain and a hill in the ocean next to a beach with a cloud in just a few minutes and an example output:

It doesn’t take much to imagine how you could use something like this to create art of places that don’t or can’t exist. You can also imagine combining something like this with something like DeepArt to create paintings that require very little skill and only a good imagination.

Dall-E: Taking the previous examples a step further, what if you could just type up what you want an image of? That’s what Dall-E does (fresh off the presses as of January 5, 2021). Dall-E can take text that you type and generate an image for it. Their examples on the blog do a lot to spark imagination and you can play around with a few examples. You can go to this link to see how something like this might generate an image of “an armchair in the shape of an avocado” or “a professional high quality emoji of a happy alpaca” or my favorite: “an illustration of a baby daikon radish in a tutu walking a dog.” This type of thing has the potential to radically change illustration and design work.

Audio/Music Generation

It’s not just visual art/artists that are going to be under the ML gun. ML can now make music too.

MuseNet allows computers to dynamically generate new music from a text prompt. For example, “use the first 5 notes of Chopin Op 1, No 9 as a basis to generate an entirely new song.”

The original piece
Computer generated piece

If you follow through to the MuseNet blog, you’ll see it can combine musical styles, generate new music from a few starting notes, or work from a prompt like “Bluegrass piano-guitar-bass-drums.”

GPT-3 lyric generation. It doesn’t stop at generating the music: ML can generate lyrics now too. Here’s a song with lyrics written entirely by a machine:

Oh yeah, and ML can even sing your song for you. Here are over 7,000 songs that are generated and sung entirely by machines. Are they perfect? No, especially not the rhyme schemes or some of the voice impersonations. But those are getting better too…

Impersonation

There is now a series of “this _____ does not exist” generators that you can explore. This person doesn’t exist, this cat doesn’t exist, this horse doesn’t exist, this artwork doesn’t exist, and hey, even this chemical doesn’t exist because why not. Reload each page to see a new one of these things that don’t exist. Don’t see the category of thing you want to create? There’s a way to generate a new category here if you have some software know-how. These seem fairly benign on the surface (who cares that a fake person/cat/horse/… image could be generated), but the implications of this type of thing go far beyond the amusing.

Want to impersonate another person’s voice? Generate your own audio as Dr Who or HAL 9000 at 15.ai.

Want to impersonate another person entirely as a video? All of the following are fabricated by having ML figure out how to generate a person’s lookalike with explicit facial expressions.

This technology is over 2 years old
https://www.youtube.com/watch?v=VhFSIR7r7Yo
Now it’s being used in entirely new ways, for satire and for much more nefarious purposes
Definitely not the most PC, but that’s part of the craziness of ML-generated audio/video

The visual artifacts you see in these videos are going to disappear over time as computational power increases. Now imagine combining the two: fake speech generated by an ML model of a famous person, paired with a fake video of that person’s facial expressions and movements. You don’t even need to hire a voice actor to create really serious problems in areas like fake news, legal disputes over verbal contracts, etc.

Games

The classic example of ML beating a human is in chess, and more recently in Go, a game computers were long thought to be unable to play competitively. But there are other games you may not think of.

For example, ML can play Pictionary against you, live in your browser. Or hey, need help drawing your Pictionary item? ML can help you complete your sketch. It can answer trivia questions. Or it can make up a Dungeons & Dragons game on the fly for you. Or check out these 3 videos of ML playing games you’ve probably played, and doing so better than you.

ML playing Mario
ML playing pool
My favorite: AI playing hide and seek

There are a number of interesting implications to this type of thing. One is that, if you play games, I imagine you’ll see much more sophisticated AI bots playing against you. But the AI playing hide and seek is particularly interesting because it involves some lightweight construction with specific goals. Far more advanced versions of this kind of engineering and behavioral optimization exist outside of these demos. For example, in the past year, an AI pilot beat an experienced Air Force fighter pilot 5-0 in a simulated dogfight. You can see how “games” can quickly apply to real-world situations.

Other Professions

There are already entire companies set up to reduce the time certain professions take, improve the quality of their output, or remove people from the process entirely. Here are a few recent examples:

And there are a variety of other professions which already have working demos or systems in place to help.

This list is not comprehensive: “academic” work and certain other types of applications that are still too new to be available to the public in demo form aren’t here. But I hope it helps show a bit of what’s come around in the past year or so in the world of ML, in ways you can go explore yourself!