The Art the AI Never Asked Permission to Learn From

Monet is in the training data. So is Greg Rutkowski, a living Polish fantasy artist whose name became a more popular style prompt on Midjourney than Picasso's. He never consented. Here's what happened when artists discovered the internet had been scraped, what they've built to fight back, and what it actually means for anyone who creates for a living.

When AI image generators launched to the public in 2022, most people marvelled at the output. A small group of people recognised the input. Illustrators, concept artists, photographers, and painters started noticing something unsettling: the systems weren't just producing generic images in a general visual style. They were producing work that looked uncannily like specific living artists, down to their brushstroke patterns, colour palettes, and compositional quirks.

The reason was simple, and nobody had really tried to hide it. These models were trained on billions of images scraped from the internet. Every portfolio site, every DeviantArt gallery, every ArtStation page, every personal website where an artist had proudly published their work. All of it had been harvested. The data scrapers didn't ask. The terms of service said "publicly available." The artists said that wasn't what publicly available meant.

78M: artworks opted out of AI training via Have I Been Trained

#1: Greg Rutkowski, used in more Midjourney prompts than Picasso at peak

2025: UK High Court rules in Getty Images v Stability AI

5B+: images in the LAION-5B training dataset, backbone of most AI image models

The Greg Rutkowski Problem

Greg Rutkowski is a Polish fantasy artist whose work graces the covers of Dungeons & Dragons sourcebooks and the card art of Magic: The Gathering. He has spent years building a distinctive visual style: warm light, dramatic scale, figures dwarfed by mythic landscapes.

In 2022, he discovered that his name had become one of the most-used style descriptors on Midjourney and Stable Diffusion. Users were generating images "in the style of Greg Rutkowski" by the thousands. At one point, his name appeared in more AI art prompts than Picasso's.

He had not consented to any of this. He received no compensation. His name had become a raw ingredient, a style parameter that users could invoke to summon visual output that competed directly with the work he had spent his career building. He spoke publicly about his concern that AI-generated work labelled with his style would flood search results and commission platforms, making it harder for clients to find the real thing and harder for them to justify paying for it.

"I'm very concerned about my career and the careers of other artists. It will be very hard to find a new Greg Rutkowski if the algorithm will make all these copies of the existing one."

Greg Rutkowski, artist

Rutkowski's situation wasn't unique. It was representative. Thousands of artists found their names turned into prompts, their styles extracted and commodified, their years of portfolio building turned into free training data for a technology they'd never been consulted about.

The Monet Question

Claude Monet died in 1926. His work entered the public domain decades ago. Under copyright law, anyone can reproduce, remix, or build on his paintings freely. AI companies have trained on Monet extensively. Technically, they're in the clear. The Impressionist masters are, in legal terms, fair game.

But artists argue that Monet is precisely the wrong kind of precedent. The problem isn't Monet. The problem is what happens to the artists being trained on today. A 35-year-old illustrator whose work is in the training data right now will eventually be in the same legal position as Monet, once copyright on their work expires decades after their death. The difference is that their style is being extracted in real time, while they're still alive and still need to earn a living from it.

The public domain argument proves the technical legality but misses the point entirely. You can't retroactively consent on behalf of a dead artist. You can still ask a living one.

There's also a second Monet problem: the museum angle. Major cultural institutions (the Musée d'Orsay, the Metropolitan Museum of Art) own physical canvases that have never been digitised for AI training. But AI models have been trained on high-resolution photographs of these works scraped from museum websites and third-party databases. The museums weren't consulted either. Some have begun asserting rights over digital reproductions of works in their physical collections, a legal argument that remains contested but signals the breadth of the stakeholder problem.

Getty Images vs Stability AI: The Case That Actually Has Teeth

Getty Images is one of the world's largest stock photography agencies. In early 2023, it filed lawsuits against Stability AI in both the UK and the United States, claiming the company scraped millions of Getty's copyrighted images without permission or compensation to train Stable Diffusion. The evidence was striking: in some cases, the AI-generated outputs contained corrupted, half-rendered versions of the Getty watermark, a ghost of the original that pointed straight back to what the model had been trained on.

In December 2023 the UK High Court refused Stability AI's attempt to have key parts of the claim thrown out before trial, letting Getty's copyright and trademark claims proceed. The judgment that followed in November 2025 was much narrower than Getty had sought. Getty dropped its central claim about training partway through the trial, the court rejected its argument that the model itself is an infringing copy, and the win it did secure was a limited trademark finding over those reproduced watermarks. The US case is still pending. The liability question is therefore far from settled: no court in these cases has yet ruled squarely on whether scraping copyrighted images for commercial AI training is infringement.

Getty's case is the most financially consequential, but it's not alone. A class action filed in January 2023 by artists Sarah Andersen, Kelly McKernan, and Karla Ortiz against Stability AI, Midjourney, and DeviantArt alleges systematic copyright violation across the creative AI industry. Together, these cases test the assumption that "publicly available" provides blanket legal cover, the assumption on which the AI companies built their entire data acquisition strategy.

What Artists Built to Fight Back

While lawyers argued in court, researchers and artists built tools.

Have I Been Trained? (haveibeentrained.com)

Created by Spawning.ai, this tool lets artists search the LAION-5B dataset (the backbone of most major AI image models) to see if their work appears. It also provides an opt-out mechanism. To date, artists have used it to opt approximately 78 million images out of AI training. Spawning has partnered with ArtStation and Shutterstock to implement opt-out at the platform level, giving artists a route that doesn't require hunting down individual image URLs.
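LAION-5B is distributed as a list of image URLs paired with captions rather than as the image files themselves, so checking whether your work appears comes down, at its simplest, to matching links. What follows is a minimal sketch of that idea under stated assumptions: the sample rows, the `artist.example` domain, and the `find_my_images` helper are all hypothetical, and this is not how Have I Been Trained itself is implemented.

```python
from urllib.parse import urlparse


def find_my_images(rows, my_domains):
    """Return the URLs in `rows` that are hosted on one of `my_domains`.

    `rows` is an iterable of (url, caption) pairs, mimicking the
    URL-plus-caption layout of a scraped image dataset (hypothetical data).
    """
    wanted = {domain.lower() for domain in my_domains}
    hits = []
    for url, _caption in rows:
        host = (urlparse(url).hostname or "").lower()
        # Match the domain itself or any subdomain of it, but not
        # look-alike hosts such as "notartist.example".
        if host in wanted or any(host.endswith("." + d) for d in wanted):
            hits.append(url)
    return hits


# Hypothetical sample of dataset rows; real datasets hold billions of these.
sample_rows = [
    ("https://portfolio.artist.example/dragon.jpg", "dragon over a mountain"),
    ("https://cdn.stock.example/123.jpg", "office handshake"),
    ("https://artist.example/castle.png", "castle at dusk"),
    ("https://notartist.example/copy.png", "castle at dusk, fan art"),
]

opt_out_list = find_my_images(sample_rows, ["artist.example"])
print(opt_out_list)
```

The resulting list is the kind of thing an artist would otherwise assemble by hand, which is why platform-level opt-out matters: it spares artists from tracking down every individual URL.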

Glaze (University of Chicago)

Developed by Professor Ben Zhao's research team, Glaze applies subtle, nearly imperceptible perturbations to digital images before they're published online. To the human eye, the image looks essentially unchanged. To an AI training pipeline, the style signal is scrambled, making it much harder for a model to learn and replicate a specific artist's visual identity. It's a defensive shield for living artists who want to share their work publicly without feeding the systems that compete with them.
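The core intuition, that a change too small to see can still move what a model "sees", can be shown with a toy. The sketch below is not Glaze's algorithm: the six-pixel "image", the hand-picked `weights` standing in for a style feature extractor, and the `cloak` helper are all assumptions made for illustration. The real tool works against learned neural features, not a handful of numbers.

```python
def style_feature(pixels, weights):
    """Toy stand-in for a style extractor: a weighted sum of pixel values."""
    return sum(p * w for p, w in zip(pixels, weights))


def cloak(pixels, weights, budget=3):
    """Nudge every pixel by at most `budget` (on a 0-255 scale), each in the
    direction that shifts the toy style feature the most."""
    cloaked = []
    for p, w in zip(pixels, weights):
        step = budget if w > 0 else -budget if w < 0 else 0
        cloaked.append(min(255, max(0, p + step)))  # stay in valid pixel range
    return cloaked


# Hypothetical extractor weights and a hypothetical six-pixel image.
weights = [0.5, -0.25, 0.75, -1.0, 0.25, -0.5]
original = [120, 64, 200, 30, 90, 180]

cloaked = cloak(original, weights, budget=3)

max_pixel_change = max(abs(a - b) for a, b in zip(original, cloaked))
feature_shift = style_feature(cloaked, weights) - style_feature(original, weights)

print("largest change to any pixel:", max_pixel_change)
print("shift in the toy style feature:", feature_shift)
```

No pixel moves by more than 3 out of 255, yet every one of those tiny moves pushes the feature the same way, so the effect on the model's view of the image adds up. That asymmetry is the general principle this kind of cloaking relies on.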

Nightshade (University of Chicago)

A more aggressive tool from the same team. Where Glaze protects, Nightshade actively poisons. Images processed through Nightshade embed hidden signals that corrupt AI outputs when the model trains on them. A batch of Nightshade-processed cat images, for example, might cause a model to output dogs when asked to generate cats. The goal is to raise the cost of non-consensual scraping high enough that AI companies begin seeking licensed data instead.
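The cat-and-dog example can be sketched with a deliberately crude toy. Everything here is an assumption for illustration: the one-number "features", the `train` and `nearest_concept` helpers, and the sample data are hypothetical, and Nightshade itself crafts real images against real models rather than editing numbers. The sketch shows only the mechanism of poisoning: samples captioned "cat" whose underlying features look like "dog" drag the model's idea of a cat toward a dog.

```python
from statistics import mean


def train(samples):
    """Toy 'generator': learn the average feature value for each caption."""
    by_caption = {}
    for caption, feature in samples:
        by_caption.setdefault(caption, []).append(feature)
    return {caption: mean(values) for caption, values in by_caption.items()}


def nearest_concept(value, prototypes):
    """Name the true concept that a generated feature value most resembles."""
    return min(prototypes, key=lambda name: abs(prototypes[name] - value))


# Hypothetical ground truth: real cats sit near 0, real dogs near 10.
PROTOTYPES = {"cat": 0.0, "dog": 10.0}

clean_data = [("cat", 0.2), ("cat", -0.1), ("cat", 0.4), ("dog", 9.8), ("dog", 10.3)]

# Poisoned samples: captioned "cat", but with features that look like a dog.
poison = [("cat", 9.9), ("cat", 10.1), ("cat", 10.0), ("cat", 9.7), ("cat", 10.2)]

clean_model = train(clean_data)
poisoned_model = train(clean_data + poison)

clean_result = nearest_concept(clean_model["cat"], PROTOTYPES)
poisoned_result = nearest_concept(poisoned_model["cat"], PROTOTYPES)

print("clean model, asked for a cat, produces something like a:", clean_result)
print("poisoned model, asked for a cat, produces something like a:", poisoned_result)
```

In this toy the poison has to outnumber the clean cat samples to flip the result. How much poisoned material a real model would need is a separate empirical question that this sketch does not address.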

The Style Copyright Gap, and Why It Matters

Copyright law has never protected artistic "style." You can paint like Rembrandt without infringing Rembrandt's estate. You can write sentences that sound like Hemingway without paying his heirs. This was always the rule, and for most of history, it was a reasonable one. Learning to paint like Rembrandt took years of study and still produced something new.

AI changes the calculus entirely. The gap between "learning from" and "instantly replicating at scale" is so large that treating the two alike in law feels absurd. An artist can protect their specific expression (their actual paintings) but not the visual language they spent decades developing. AI companies have built commercial products directly on that loophole. It's legal. It may not remain so. And even where it stays legal, the question of whether it's right is a separate conversation the industry has largely avoided having.

The One Company Doing It Differently

Adobe's AI image tool, Firefly, was trained exclusively on licensed images from Adobe Stock, content Adobe owns outright, and public domain material. Adobe created a compensation model for Stock contributors whose work was used in training. When Firefly launched, it was commercially safe in a way that Midjourney and Stable Diffusion were not, because the data behind it came with permission attached.

Firefly's output was initially considered lower quality than Midjourney's. That gap has narrowed considerably. But the more important point is that Adobe demonstrated the approach is viable: you can build a commercially competitive AI image generator on consented, licensed data. It costs more and takes longer. Several companies chose not to do it that way, and are now in court explaining why.

What It Means for Creators Right Now

The picture is mixed. Concept art and illustration have been hit hardest. These are fields where AI output is genuinely competitive with mid-tier commercial work, and clients who once paid for that work have, in some cases, stopped doing so. Using AI for early-stage visual development is now standard practice at many studios.

But the field hasn't collapsed. The artists who have adapted, using AI as a workflow tool rather than treating it as a replacement, are competitive in ways they weren't before. The distinctive, deeply personal work at the top of the market remains in demand because it is not easily replicated by a style prompt. What's been disrupted is the middle: the large volume of competent, generic commercial illustration that was always the bulk of the industry's revenue base.

The tools that matter right now: register your work with the Copyright Office or the equivalent in your country. Use Glaze or Nightshade before publishing high-resolution work online. Check Have I Been Trained to see if your work is already in LAION. Watch the Getty litigation too. Whatever the courts finally decide, and whatever they award, will signal whether unlicensed scraping is a legal risk companies can absorb or one that is financially catastrophic to continue.

The Bottom Line

The art world's fight with AI is not about being anti-technology. It's about the difference between inspiration and extraction. Between a tradition of artists learning from other artists across centuries, and a systematic commercial harvest of creative output that never asked anyone's permission. The tools to fight back exist. The legal framework is slowly catching up. The companies that built on licensed data from the start are proving the ethical path was also viable. The question now is whether the industry corrects itself or waits to be corrected.