Wonder and Despair over AI Art
How will artificial intelligence change the world of art and design?
Doesn’t it sometimes feel like technology no longer has the ability to impress us? We’re living in an age of unparalleled computational power and yet… meh. Perhaps the change is too incremental. Apple adds a little black “dynamic island” or another camera to their phones, a rugged watch to their lineup, and a faster chip in their laptops, and all I can think is, “cool, but hardly groundbreaking.” Other innovations feel borderline oppressive, like Zuckerberg’s insistence that we wear the equivalent of an orthodontic head brace to play pretend in the metaverse. No thanks.
If you, like me, have felt a little uninspired as of late, may I turn your attention to the thrilling new advancements at the intersection of artificial intelligence and creativity? It’s unofficially called AI-generated art, and I’m serious when I say that it blows my mind.
While you may have heard of AI art before or seen mentions of it online, for the sake of this article, I’m going to assume it’s a pretty new concept and start with the basics of the technology before discussing its impact and the lingering unease I feel about the whole thing.
What is AI-Generated Art and How Does it Work?
AI-generated art is actually quite a simple concept to understand even if the algorithms are not. In a nutshell, new AI tools can utilize textual prompts (aka written words) to create totally new images or “art”.
To understand how this works, think first about a normal image search on a platform like Google. If you punch the word “elephant” into the search bar, millions of images appear containing some version of an elephant. Google scoured the web and recalled every single image of an elephant it could find.
How did Google know each image actually contained an elephant and not something else, like a rhino? Though technology is getting better at literally “reading” images (like the ability to search through Google or Apple photos with words like “beach”, “dog”, or “food”), the main way Google knows the contents of an image is through text. Sometimes Google uses the words you see on the screen, say in an article about elephants that contains the word “elephant” a lot.
More often, though, an image search relies on something called alt-text, a word or string of words manually attached to an image file by a marketer, web admin, editor, or photographer who wants their image to show up in search results. A regular web user will never see alt-text, though it’s attached to most images on the web within the HTML.1 When Google bots look at a website, they see the alt-text and use it later to populate your search results with relevant images.
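For the curious, alt-text lives inside the HTML as an attribute on the image tag, and a crawler collects it in roughly the way this minimal Python sketch does (the page snippet and class name here are just illustrations, not any search engine’s real code):

```python
from html.parser import HTMLParser

class AltTextCollector(HTMLParser):
    """Collects the alt text of every <img> tag, roughly the way a
    search crawler associates words with the images on a page."""
    def __init__(self):
        super().__init__()
        self.alt_texts = []

    def handle_starttag(self, tag, attrs):
        # attrs arrives as a list of (name, value) pairs
        if tag == "img":
            alt = dict(attrs).get("alt")
            if alt:
                self.alt_texts.append(alt)

page = '<p>Safari photos</p><img src="e1.jpg" alt="African elephant at a watering hole">'
collector = AltTextCollector()
collector.feed(page)
print(collector.alt_texts)  # ['African elephant at a watering hole']
```

A real crawler does far more (it also weighs surrounding text, file names, and captions), but the basic association of words to images starts with attributes like this one.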
AI tools were trained on the same alt-text but instead of being merely a tool to recall information, the AI takes things a step further, and, though the metaphor is a little bit obvious, thinks like a brain. When you type “elephant” into an AI art tool, you’re not asking it to search for images of elephants, you’re asking it to create a new image of an elephant, to generate one based on the AI’s cognition of all the other elephants it has ever seen.
Where things get really crazy is in the AI's ability to accept long strings of words that combine ideas and even give stylistic instructions. For example, instead of just generating a photo of an elephant, why not have it sit in a rocking chair? And how about making the elephant type on a typewriter? Best of all, I can ask the AI to make our image an illustration. In a matter of seconds, the AI can create something like this.2
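The way these prompts layer a subject, extra ideas, and a stylistic instruction can be sketched as a tiny helper function (purely illustrative, not part of any real tool's API):

```python
def build_prompt(subject, details=(), style=None):
    """Assemble a text-to-image prompt from a subject, optional
    extra details, and an optional stylistic instruction."""
    parts = [subject]
    parts.extend(details)
    if style:
        parts.append(f"in the style of {style}")
    return ", ".join(parts)

prompt = build_prompt(
    "an elephant",
    details=("sitting in a rocking chair", "typing on a typewriter"),
    style="a children's book illustration",
)
print(prompt)
# an elephant, sitting in a rocking chair, typing on a typewriter,
# in the style of a children's book illustration
```

The point is simply that a prompt is ordinary text: each clause you bolt on steers the generated image a little further.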
The Potential (and Limitations) of AI-Generated Art
I used a tool called Dall-E 2 (a play on Salvador Dali and Wall-E) to generate the image of the elephant above as well as the image at the top of the newsletter.
To further demonstrate the ability of these tools, let me showcase how it responds to stylistic instructions. I’m going to use another AI tool called Midjourney for this task, and since I’m writing about art, I’ll use the prompt “A New Zealand hillside” (my favorite landscape) followed by “in the style of _____” and pick a few different artists.
Notice the similarities. Each has a similar color palette, and, with the exception of the van Gogh, included a mountain peak rising above the hills. The style varies widely, however, and the inspiration of each great artist is clear in the AI-generated picture. It took me about five minutes to make all four images.
Generating art with artificial intelligence, while not a creative act, per se, still requires creativity. The main limitation is your own ability to craft an interesting sentence, speak in plain, direct language, and adopt the lingo of each tool to achieve the best results.
That said, the tools aren’t perfect. If you have a certain image in mind, it’s challenging to make the platforms bend to your will. As of right now, there isn’t a way to edit the images aside from generating similar-looking variations.
As a little bonus, one interesting feature both platforms offer, though Dall-E 2 excels at it, is generating similar images from an uploaded image. In this case, no text prompt is required. I uploaded my headshot to see what Dall-E 2 could do. Call me vain, but I think I’m the most handsome of the doppelgangers.
AI Art and the Consumption Cycle
These tools are powerful, but how far-reaching are their effects? Of everything I’ve read, Ben Thompson, author of the popular Stratechery blog, has the clearest understanding of the issue. Thompson says it takes three steps for art or information to go from an idea in the creator’s head to being consumed by the audience. They are:
Substantiation: The first physical (or digital) expression of the creation, a story, painting, music, aka the creation itself
Duplication: Producing multiple versions of the creation
Distribution: Making the duplications available to a wide audience
4,000 years ago, when humans only communicated orally or visually, each of these steps, including consumption, happened at the same time. Think of Homer traveling around ancient Greece from village to village to relay The Odyssey. There was no written version, and therefore no duplication or distribution; you had to be in the room to hear the story.
In the subsequent millennia, different innovations have removed the bottlenecks between each step. Writing, and, more importantly, the printing press, removed the duplication bottleneck. Incremental innovations around distribution made it easier to obtain duplicate copies, but it was the internet that finally removed the major bottleneck. You no longer needed printed words in the form of a book or newspaper; you could simply read online and access vastly more information than print ever offered.
Until now, the last bottleneck remained around substantiation because the creative process, be it painting, writing, composing, or anything else, always took the most time. AI generation will effectively remove the substantiation bottleneck, allowing anyone with a Wi-Fi connection to produce art in a matter of seconds.
I like Thompson’s framework because it helps me understand the limits of AI’s disruption to creativity, isolating it in the substantiation part of the cycle. Furthermore, I don’t think the current AI tools have fully removed the substantiation bottleneck just yet, but they’re a step in that direction, a sign of the impending progress to come.
What Will AI-Generated Art Disrupt the Most?
As I’ve mentioned in previous newsletters, my wife runs an Etsy shop where she sells her own paintings. When I showed her these tools, she was immediately deflated and asked the most relevant question on every artist’s mind: “Who is going to buy art if you can make your own so easily?” This question became very real when a piece of AI art won a contest at the Colorado State Fair. Immediately, artists were up in arms over this obvious violation of creative morality.
There will be more moments like this, though I’m not very worried about AI disrupting Fine Art or Etsy businesses for two reasons.
The first is the human element behind art or any creative output. One of the reasons we consume art in the first place is because we recognize human ingenuity in the creation. Additionally, art carries so much other human baggage with it, concepts like status and historicity, that draw our attention. In other words, AI art has no cachet.
Second, there’s a certain physicality to art that we appreciate. Most of us would prefer to view or own an original. In the case of Etsy, I think this still holds true because the customers know an original exists and was originally crafted by a human even if they only purchase a print.
AI art feels a little bit like an NFT, those digital images that soared in value in 2021 only to crash in 2022. It’s hard to place too much economic value on something that exists only digitally, and it’s even harder to place value on a piece of AI art.
Let me further illustrate my point with some questions.
Would you pay money for an AI image generated by someone famous?
Would you feel cheated if you went to the Louvre only to find out the Mona Lisa was a replica?
Would you pay money for ownership over a certain string of text used to generate a specific image?
Would you frame an AI image you generated yourself and hang it on your wall?
Nothing about AI art, even attempts to add status with celebrity, would make it all that appealing because it still fails the two tests of human touch and physicality.
AI art has its best use case when humanity and physicality don’t matter as much. I’m thinking mostly about graphic design. While we appreciate craft and skill, most people (except maybe other designers) aren’t thinking about who made the piece, nor do we care about any kind of originality since the design is made to be duplicated and blasted to the masses. We already refer to images of this type with a different vocabulary; it’s graphic design, marketing collateral, UX, not “art”.
The disruption to graphic design will look a little bit like what happened to photography. It used to be that photography was only available to a select few who could afford the equipment. 35mm film, digital, and smartphones made quality photography available to everyone. Did this destroy the photography industry? Not really. Just because one can take a photo doesn’t make the photo good, nor does it mean it’s worth sharing or consuming. Professional photographers with actual skills in composition and lighting are still in demand. Additionally, photographers embrace new technology rather than shun it, using the new tools to keep themselves relevant, faster, and more efficient.
Graphic design has the same talent gap between professionals and amateurs as photography, and AI art won’t change that too much. Most professionals in the field will use AI tools to automate and improve aspects of their day-to-day work, though some graphic design, particularly work requiring a lower level of skill, may be totally replaced by AI.
Some AI Despair
I’ve tried to show my awe of this technology as well as a fair assessment of its potential impact. Nevertheless, I feel a nagging pessimism in the back of my mind after using the tools.
Creating images with AI is addictive. I feel like I’m in control of a magic wand. Each new image I make gives me that quick flash of dopamine so common to the digital world, the same sensation I’ve felt seeing an app notification, posting a photo to Instagram, or playing a mindless game on my phone. It doesn’t matter if the generated image is mediocre or not what I expect. I’m the master of the tool and can simply adjust my input and produce a new work instantaneously.
Using these tools also feels like a substitute for ingenuity, and that scares me. I made the point earlier that making a new AI-generated image takes creativity. Perhaps a better wording would be to say “it takes cleverness”. Cleverness is an admirable skill to be sure, but it’s not the actual creative work. Creativity has always involved time and effort. AI art requires almost no work, yet something new and wonderful is generated every time. It’s easy to imagine people falling into the delusion that they’re undertaking some creative project when in reality, they’re simply tweaking an algorithm.
The digital world is such an effective substitute for reality, so close to the real thing.
A zoom call is almost as effective as a face-to-face meeting.
Social media is nearly as good as hanging out in person.
Conquering a new level in a video game is almost as rewarding as conquering something in real life.
Is creativity the next human experience to fall prey to a digital replication? To date, digital tools have only helped our ability to create, but these AI tools, especially as they improve, feel like the literal replacement of our craftsmanship. Why spend maximum effort for a maximum outcome when you can spend 5% effort for a 90% outcome? Many will settle for good enough.
1. Alt-text is also helpful for people with visual impairments who use screen readers that read web pages aloud.
2. Make sure to look closely at the image to see all the weird stuff going on. First, the tail appears to be attached to the chair rather than the elephant. The typewriter’s cord plugs into the elephant’s crotch. The paper has a cord coming out of it, and, for some reason, the elephant is holding a book.