The AI art revolution, pt.1

Suddenly, in the year 2022, we are a long way from persistent claims that “machines can not create art”. It was only 2014 when the Lovelace 2.0 Test of AI creativity was proposed as the new stronghold for humanity’s intellectual superiority. It was only 2014 when a researcher invented Generative Adversarial Networks (GANs) by pitting an image-generating AI against an image-recognition AI in the role of art critic. It was only 2015 when Google’s DeepDream project had an image recognition AI enhance specks until they looked like hallucinations of dogs. It was only 2017 when photorealistic faces generated by GANs flooded the internet with deepfakes. Today, image generating AI algorithms like DALL-E, Midjourney, and Stable Diffusion can turn any brief description into a visual, enabling anyone to create images in seconds. As whimsical AI-generated art started drowning out 90% of all other online content, there was bound to be a backlash from both artistic and AI communities. We are at the epicenter of an AI art revolution, and as the world struggles for answers on how to deal with it, I want to address what this means for art and artists.

You see, I used to be a comic artist before I took an interest in AI, as witnessed by the half-hour doodles that occasionally illustrate these articles. I started drawing as a talentless 5-year-old who couldn’t even draw a straight line. Refusing to let that stop me, I spent 10 years tracing animal encyclopedias and drawing from life until I could draw subjects from any angle by heart, then another 5 partaking in art contests before my illustrations were published in magazines. After that, I joined a collective of artists to create, publish, and promote manga-inspired comics in a society that thought ill of them, until we gained acceptance. Hence many of my acquaintances are illustrators and animators by profession, some the creators of award-winning graphic novels and your childhood’s Nickelodeon cartoons. They have concerns.

As there are many aspects to AI art, this article will come in two parts: The first to clear up the current situation, the second to suggest ways forward. Throughout, I will also try to generate a specific image myself.

The workings of the technology

I don’t think a mathematical explanation really helps anyone, so instead imagine a neural network AI as a sea of numbers. The numbers represent currents of various strengths. Pixels of a cat image are scattered into the sea and carried across the currents. The pixels that do not sink along the way eventually beach on cat island or dog island, two categories. This is repeated with millions of cat and dog images, while the strengths of the currents are semi-randomly adjusted until the pixels reach the correct island most of the time. The resulting network is thus “trained”, i.e. its pathways are optimally configured, to recognise cats and dogs by the sum of pixels that end up in each category.
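For those who do like to see it spelled out, here is a minimal sketch of that sea of numbers in Python. Everything in it is an assumption for the sake of illustration: the four-pixel “images”, the rule that cats are bright on the left and dogs on the right, and the function names. It also takes the “semi-random adjustment” literally by nudging the numbers at random and keeping a nudge only when it does not hurt, whereas real networks are usually adjusted with a more directed method called gradient descent.

```python
import random

random.seed(0)  # fixed seed so the toy run is repeatable

def make_image(label):
    """Hypothetical 4-pixel 'image': cats are bright on the left, dogs on the right."""
    bright = [random.uniform(0.6, 1.0) for _ in range(2)]
    dark = [random.uniform(0.0, 0.4) for _ in range(2)]
    return bright + dark if label == "cat" else dark + bright

def classify(weights, pixels):
    """Carry the pixels along the currents (weights) and see which island they reach."""
    score = sum(w * p for w, p in zip(weights, pixels))
    return "cat" if score > 0 else "dog"

def accuracy(weights, dataset):
    """Fraction of images that beach on the correct island."""
    hits = sum(classify(weights, pixels) == label for pixels, label in dataset)
    return hits / len(dataset)

def train(dataset, steps=300, nudge=0.5):
    """Randomly nudge the currents, keeping a nudge only if accuracy does not drop."""
    weights = [0.0] * 4
    best = accuracy(weights, dataset)
    for _ in range(steps):
        candidate = [w + random.gauss(0, nudge) for w in weights]
        score = accuracy(candidate, dataset)
        if score >= best:
            weights, best = candidate, score
    return weights, best

# 50 cats and 50 dogs
dataset = [(make_image(label), label) for label in ["cat", "dog"] * 50]
untrained_accuracy = accuracy([0.0] * 4, dataset)
trained_weights, trained_accuracy = train(dataset)
print(untrained_accuracy, trained_accuracy)
```

With all currents at zero, every image washes up on dog island, so the untrained network is right exactly half the time. Because a nudge is only kept when it does no harm, the trained accuracy can never end up lower than that.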

To generate an image of a cat, the network is given an image of random static noise and a description of a cat. It then uses its ability of “recognising” or hallucinating the described features to try and remove noise from the image in steps, until all that remains is what it thought to see. It’s like when you stare at speckled tiles and start seeing faces in them after a while, then focus and imagine more details. A game of connect-the-dots, but with dots all over.
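The step-by-step removal of noise can be sketched in a few lines of Python as well. This is a toy under loud assumptions: the EXPECTED table is a made-up stand-in for what a trained network has learned about cats (a real generator stores no such template and instead predicts the noise with a neural network), and the step count and strength are arbitrary.

```python
import random

random.seed(1)  # fixed seed so the toy run is repeatable

# Hypothetical stand-in for the features a trained network associates with "cat".
# A real image generator has no stored template like this.
EXPECTED = {"cat": [0.9, 0.8, 0.2, 0.1]}

def estimate_noise(image, description):
    """Toy denoiser: whatever differs from the expected features counts as noise."""
    return [pixel - target for pixel, target in zip(image, EXPECTED[description])]

def generate(description, steps=20, strength=0.3):
    """Start from static and remove a fraction of the estimated noise at each step."""
    image = [random.random() for _ in range(4)]
    for _ in range(steps):
        noise = estimate_noise(image, description)
        image = [pixel - strength * n for pixel, n in zip(image, noise)]
    return image

def distance(a, b):
    """Total pixel difference between two images."""
    return sum(abs(x - y) for x, y in zip(a, b))

result = generate("cat")
print(distance(result, EXPECTED["cat"]))
```

Each step leaves 70% of the remaining noise in place, so after 20 steps almost nothing is left but what the denoiser “thought to see”. Different starting static gives a different path, which is where the variety in real generators comes from.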

Rather than remembering parts of the images on which the AI is trained, it learns the features they have in common, after which it no longer needs the original image database. An abundance of brown pixels may represent the colour palette of cats. The frequent location of brown pixels near the center of an image may represent where cats are commonly depicted. A combination of a dark and light pixel may represent an edge, and a cluster of contrasting pixels may represent the texture of fur.
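To make the dark-and-light-pixel example concrete, here is a hand-written edge feature in Python. The row of pixel values, the threshold, and the function names are assumptions for illustration. A trained network is not given such a rule; it arrives at comparable detectors by itself during training.

```python
def edge_strengths(row):
    """Contrast between each pixel and its right-hand neighbour."""
    return [abs(b - a) for a, b in zip(row, row[1:])]

def find_edges(row, threshold=0.5):
    """Positions where a dark pixel sits next to a light one."""
    return [i for i, s in enumerate(edge_strengths(row)) if s > threshold]

# One row of brightness values: dark, dark, light, light, dark
row = [0.1, 0.1, 0.9, 0.9, 0.2]
edges = find_edges(row)
print(edges)  # [1, 3]
```

The feature says nothing about which image it came from, only that dark meets light at two places. That is the sense in which the network keeps features rather than pictures.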

Artificial neural networks are known to memorise small-scale features like colours and textures better than large-scale concepts like shapes and compositions, because the shapes of cats in the training examples have far more variation than their colours. This is why image generating AIs often produce disfigured shapes, or paws with an excessive number of claws, and still assume they have painted a “cat”, whatever that is. It is as if the AI is trying to recreate the brushstrokes without knowing the subject matter. The basic structure of artificial neural networks may be inspired by biological ones, but it is misleading to say that AI works just like the human brain, or that image generators work the same as artists. In practice, the field of AI is all about taking shortcuts to achieve the same results.

One of the current advantages of human artists over AI is their ability to edit parts, as inevitably requested by clients. With most AI image generators, one can only try to type a better description and spin the reels again as the image is reformed in whole with a degree of randomisation. However, given that AI such as face recognition is particularly flexible with scale and position, I expect that it will not be a year before AI can also redo a specified area and blend it with the previous result. Indeed, some image generators like DALL-E are already capable of this. Inherent flaws of statistical algorithms will more likely remain: Results will be worse for rare subjects, and unrestrained by logic, but the majority will eventually be “good enough” for use, given the vested interests in making this so. Remember that “machines can not…” is the very phrase that birthed creative AI.

The nature of art

More philosophical writers have attempted to define “art” without consensus. Instead, I shall acknowledge all established forms of art, from Rembrandt paintings to a banana stuck to a wall in an art exhibition. Judging from the public discourse about AI art, I think it is most useful to make a distinction between expressive art and impressive art. In expressive art, a visual carries a message or feeling that the artist wanted to express. In impressive art, a viewer unilaterally derives an impression or feeling from a visual. Both can be meaningful, only the source of that meaning differs. A wanton splatter of paint on canvas is unimpressive, but may be an artist’s attempt to express a feeling. A commissioned Rembrandt portrait is impressive, but carries no message. AI art can be acknowledged in the category of impressive art: The computer meant nothing by it, but people may be impressed by it nonetheless.

Traditionally, skill and creative effort have been key characteristics of art, but as touted by the producers of image generators, AI art requires little of either. Even though today’s image generators rarely produce good results without manual refinement, this won’t be the case in the future. Some experience with the digital tool is needed, naturally. It takes a few tries to recognise that the AI does not understand language so much as keywords. But the effort is minimal when compared to a week’s process of composing, sketching, revising, and painting, not to mention the decades of training of a human artist. In this regard I find AI art most analogous to photography: It takes knowledge of the tools, and a taste in composition and lighting, but the actual process of creation is reduced to the press of a button and a choice of results. Although AI-generated art mimics the styles from which it derives, it had best be treated as a separate category of art, just like photography. The method is vastly different, and with that, the criteria by which people judge and appreciate it. That simply means art contests should create a separate category for AI art, art sites should implement new tags and filters, and AI users should not imagine themselves to be on equal footing with traditional and digital artists, any more than photographers imagine themselves to be painters. Art is not just what you make, but also how you make it.

The makings of an artist

I think it is easy enough to agree that an artist is someone who creates art. Writing a description of an idea and then reviewing the returned propositions is more accurately what a client does when commissioning an artist to make such-and-so. That would make AI users clients or curators, while the AI image generator fulfills the role of artist, literally the one trained to generate art. Writing effective text prompts for image generators may well be a skill, but if that were an art, then we should also consider everyone who has effectively used Google Image Search an artist, and the search results their personal creations.

Of course there is a grey area, wherein someone takes an initially AI-generated image, and edits it to perfection. Depending on the amount of change and creative choices, that makes them an editor or artist, as digital editing requires many of the same skills as digital art. The question then merely becomes whether they are good at it.

Professional artists have noted that AI art tends not to be good art, because those who outsource the creative process typically lack the experience and resulting intuition to recognise what makes an artwork work. Composition, lighting, subject matter, everything appears average, which is what one might expect from an algorithm that learns statistically average features. Art generated by the popular NovelAI algorithm is quickly becoming “that style”: Overlit, soft shaded, soft edged, centered characters with a nondescript out-of-focus background. It takes after some of the best artists that it was trained on, but the quantity of lacklustre results reduces a previously top tier style to the literally ordinary, gratifying no-one.

Artists be damned, apparently

It is all too easy to assume that technological progress is good and therefore all detractors evil, but technology can be used for either, and thus the concerns of artists should be heard. Contrary to popular opinion, artists are not opposed to the progress of technology: They have welcomed digital painting and 3D modeling in the past. What they take issue with are the sociological consequences.

Let’s start with the most apparent consequence: The sudden flood of competition from cheap automated art threatens the livelihoods of artists, many of whom already work below minimum wage. The prospect of losing one’s income will put anyone on edge, and it heats up the debate considerably. Yet surprisingly, most professional artists actually say that they do not (yet) feel threatened, because the job of artist is much more than making a pretty picture. One of their main tasks is to figure out what the client wants, which is rarely what the client says. Artists have to research the subject matter and negotiate revisions to create something that works for both the client and their target audience, and convince the client of this. Even as AI art improves, few serious paying clients will fumble about with AI directly, because the reasons they needed artists remain: They are not experts in visual matters, and they have other business to attend to.

On the other hand, freelance artists that cater to low-budget commissions may take a hit, as for example stylised portraits or abstract book covers can be easily done with apps. Even though parallel breakthroughs in text generating AI will increase the output of books and the need for illustrating them, I would expect clients who find their way to text generators to also find their way to image generators. Standing out as an artist for hire on sites like DeviantArt and ArtStation has also become more difficult among the 50 to 1 output ratio of AI art. As a counter-consequence, I am certain other art portfolio sites will rise to exclude AI art entirely, which could make professional artists more visible to paying clients as target audiences are split. Either art sites have to change, or artists will have to migrate.

A more moral issue is that AI image generators are trained on existing artists’ artworks. Although image generators do not literally collage bits and pieces of existing artworks (I will come back to copyright later), they undeniably did use artists’ work for commercial purposes without their knowledge and consent, and not in the way that a human artist gleans the occasional element from another’s art. An image generator is not a human by any means. The primary sources of inspiration for human artists are still nature and their own experiences, while AI users can literally specify to have images generated “in the style of [popular artist]”. Outdated laws notwithstanding, plagiarism is a morally acknowledged wrong, a violation of honour and respect. Some artists do not have as much of a problem with the generated art as they do with it being tied to their name. What AI art proponents don’t seem to realise is that art is much more personal than any other product: It is an extension of the artist’s thoughts and feelings. That connection deserves respect, and that is exactly what has been absent in both the training and publishing of AI art. A show of respect could go a long way.

Let’s say that AI art was only derived from 16th century art, and there was no danger of job loss, would AI art still be opposed? Judging from the arguments used when discussing the above issues, I’d say yes. In fact, the supposed “first” AI-generated portrait was based on 14th to 20th century paintings, and its auction was ill received by traditional and AI artists alike. I think the core problem with AI art is our collective belief in merit. We instinctively believe that effort deserves reward, and logically so, as this motivates people to continue their efforts and contribute to the community. AI art takes the collective effort of artists, and displaces the reward to people who have dedicated relatively little effort. None see this contrast better than artists who have spent their entire lives honing their skills, worked and paid their way through art school, and still ended up undervalued. To make things worse, those same lifetimes of effort are now being used against them, competing with and devaluing their own art.

Of course the inventors of the technology meant for none of this. They merely picked up the gauntlet thrown by philosophers who said machines could never create art. Others ran with it. Artists can run with it too, but that does not stabilise the situation. I will address the positive possibilities later, but first I must point something out.

We did not stop to think if we should

History is a foundation for the future, but not a guarantee. The first industrial revolution automated back-breaking physical tasks, and coal miners became mechanics and operators, better jobs. The second technological revolution automated painstaking agriculture and infrastructure, and farmboys became factory workers. The digital revolution automated mind-numbing administrative tasks, and bank tellers became customer service agents. The online revolution automated 24/7 on-demand remote services, and cashiers became warehouse workers, worse jobs. Each technological revolution enabled an increase in production, in turn enabling greater consumer demand, which, after a period of turmoil, returned a need for workers in related positions. However, where the first two revolutions solved the serious problem of unhealthy work, we are now solving luxuries, and at our own expense.

You’d think that after four industrial revolutions, we would have social safety nets in place to smooth the transitions, but each time it is every person for themselves. This time it’s artists that are being trampled with complete disregard, but they will soon be followed by other creative and communicative fields. With the boom of both image and text generating AI, the roles of artists stand to be reduced to editors and curators, writers to proofreaders, and programmers to debuggers. As someone who has experience with all, I can not say I would enjoy these new roles: They are a considerable intellectual downgrade to the mundane parts of the job. For many in our society, the act of creating is what gives their lives meaning, and in using AI we outsource the creative process to machines that can derive no pleasure from it. Of course there will still be jobs after this revolution, many artists already need a second job to make ends meet, but they also want satisfaction from what they do, and acknowledgement of their hard-earned skills. Can you say otherwise?

The promise of automation still echoes: “Machines will do all the repetitive, boring tasks, freeing people to pursue more meaningful activities”. Elon Musk recently made a similar glamorous statement about his Teslabot robots-to-be, while he had his employees working an impossible schedule to rush out a prototype for cryptocurrency investors. Despite centuries of automation, we are still working 40-hour weeks in rotating night shifts. We are given more tasks, to do in less time, for less reward. Production has tripled, but burnouts have become common, and we struggle to find free time. Now we are automating writing, and music, and art: Those meaningful activities that we would be doing once machines took over our “repetitive, boring tasks”. Aren’t we in the wrong lane? Did we skip a revolution?

There are undoubtedly applications where image generating AI is welcome, places where it does alleviate arduous routine or enrich all our lives, but that is not where this technology seems to be heading right now: It is heading off the rails. I think we should take a moment to consider carefully what future we want to create. One that appreciates people, or products?

To be continued

In the upcoming second part of this article, hopefully next month, I will go into how artists may benefit from image generating AI, how all this works with copyrights, and how the future may turn out. In the meantime, I advise artists to try out an image generator to know what they are dealing with, and alleviate some anxiety.

DALL-E: Requires email and phone number. 50 free tries. Easy to use, decent results, nice editing options.

Midjourney: Requires Discord account sign-up. 25 free tries. Cumbersome interface, good results.

Stable Diffusion: No signup required. Unlimited free tries. Easy to use, pretty bad results.