AI's Cool New Trick

An AI lets you generate an image based on a simple sentence, such as "Stonehenge with a McDonald's drive-thru" or "Gangnam Style Lego."

Paul Carroll

May 2, 2022

[Image: An astronaut riding a white horse in outer space]

One of my lingering misgivings from my days as a reporter at the Wall Street Journal concerns the role I played in unleashing PowerPoint on the world.

I was just doing my job. Honest. Like all good journalists, I was at the bar, at a conference in the spring of 1987, when a consultant I knew introduced me to one of the two founders of a startup that was about to unveil PowerPoint. The demo he showed me was intriguing, and the prospects for the product checked out with a number of smart folks I interviewed at the conference, so I wrote about it.

Next thing you know, Microsoft buys the startup, and we're all awash in bullet points and so many fonts and type sizes that presentations may look like ransom notes.

Today, I'd like to introduce you to the latest advancement in visual presentations: an artificial intelligence that lets you generate an image based on a single sentence. For instance, the AI produced the image above based on the prompt, "astronaut riding a horse in a photorealistic style."

I can't guarantee that problems won't eventually arise with this technology, too, but at least it'll be a lot more fun than fussing with PowerPoint templates.

The AI was developed by OpenAI, a research company founded as a nonprofit and backed by Microsoft, among many others, that has the very serious goal of developing what's known as artificial general intelligence -- basically, AI that works broadly, like the human brain, rather than being fine-tuned for a specific task, such as playing chess or recognizing images. OpenAI has produced an intriguing series of highlights, including defeating a professional e-sports world championship team on a livestream and solving a Rubik's Cube with a robot hand.

But I'm mostly intrigued by the playfulness that becomes possible for presentations via OpenAI's latest development, which is known as DALL-E 2.

At the risk of telling you more than you want to know about my twisted sense of humor, here are the two images generated by DALL-E 2 that I've enjoyed most so far:

[Image: Medieval painting of complaining that the Wi-Fi isn't working]

That was generated based on the phrase, "medieval painting of complaining that the Wi-Fi isn't working."

And:

[Image: Ancient Egyptian painting depicting an argument about whose turn it is to take out the trash]

"Ancient Egyptian painting depicting an argument about whose turn it is to take out the trash."

Oh, okay, two more:

[Image: Photo of a grizzly bear confused in calculus class]

"Photo of a grizzly bear confused in calculus class."

And:

[Image: Leonardo enters the metaverse]

"Leonardo enters the metaverse."

I can imagine that the ability to just conjure up an image could contribute to the growing problem of deepfakes -- though the OpenAI folks are also working on ways to make sure AI is used ethically. So, for now, I'll just let my imagination enjoy itself.

Cheers,

Paul
