Kefir is an independently built C compiler that commenters describe as unusually small, correct, and capable enough to pass GCC’s torture tests, which made the shutdown feel like more than the loss of a niche side project. In the announcement, the author says they still enjoy the hard parts of compiler work but no longer feel justified publishing that effort, because the biggest beneficiaries now look like companies vacuuming up code for large language models. That landed with a lot of sympathy. Several people said they have already stopped publishing code, art, or writing for the same reason, or have put sites behind logins because AI crawlers ignore robots.txt and hammer small servers.
If more maintainers start treating public release as optional instead of default, AI data extraction stops being a copyright fight and becomes a supply problem for future open source and developer tooling.
The mood was mostly frustrated and resigned. People admired Kefir and were unhappy to lose an impressive compiler, but the stronger sentiment was a broader disillusionment: AI scraping has broken the informal reciprocity many creators thought they were signing up for when they published their work.
01 The key shift is not legal doctrine but the creator’s mental model of what "copying" means.
One commenter argued that older open source norms assumed copying required obvious reuse, so licenses and reciprocity expectations lined up with reality. LLMs break that intuition by reproducing the functional value of public work while laundering its origins, which lets code contribute to closed commercial systems without attribution, sharing, or even user awareness of where it came from.
The damage here is motivational before it is legal. When reuse becomes invisible, the old social incentives to publish stop working.
02 Private retreat can become a structural moat, not just an individual protest.
Commenters noted that Kefir is effectively a one-person project, so closing development likely ends the public line outright. More broadly, if creators stop posting new work or pull old work down, incumbent model vendors keep their already-scraped corpus while everyone else faces a thinner future commons. That does not bring back open source. It creates closed knowledge networks and advantages the companies that extracted first.
The first-order risk is not only fewer public projects but also a more concentrated AI market built on a one-time harvest of open culture.
03 The strongest concrete objection was not abstract training theory but output behavior.
Commenters pointed to models returning broken but recognizable GPL code fragments and to examples like Copilot reproducing Quake code, arguing that real systems do sometimes emit license-bound material without attribution. That weakens the clean claim that model use is merely "learning" and makes the enforcement gap feel immediate rather than philosophical.
For many developers, observed leakage settles the issue. If models can regurgitate licensed code, the compliance problem is not hypothetical.
04 Several comments widened this beyond licensing into a quality and innovation problem.
Public code and writing used to signal that someone had invested thought, which gave artifacts social value beyond raw utility. Cheap synthetic output weakens that signal, floods the channel, and makes sharing less rewarding. A minority pushed back that automating boilerplate is exactly the point and that most code was never especially original anyway, but even that view conceded the economics of publishing have changed because output itself is no longer scarce.
AI changes the value of publishing by making code abundant and provenance blurry. That hits reputation, discovery, and willingness to share, even if productivity rises.
01 The cleanest pro-LLM position was that free software has always permitted broad downstream use, and model training is use, not redistribution.
This view treats LLMs as closer to a human learning patterns from public code than to a transpiler or archive copying a specific program. On that reading, trying to carve out AI as a forbidden use case is the actual break from FOSS norms, not the training itself.
If you believe open source is about freedom to use rather than control over outcomes, anti-training arguments look like a category error.
02 One commenter suggested that AI is upsetting for reasons beyond scraping or licensing.
It is also exposing an uncomfortable truth about the audience for language tools and deep technical work. That hints at a separate source of demoralization. If the market now rewards convenience and generated output over craftsmanship, maintainers may lose motivation even without any legal or ethical fight about training data.
Some of the burnout may be cultural, not just extractive. AI can make maintainers feel alienated from the people they were building for.
03 A harder-line minority said the whole complaint is economically and legally misplaced.
They argued that others benefiting from your code does not exploit you, that GPL explicitly allows commercial use, and that model outputs should only trigger enforcement when they actually reproduce protected code. In that frame, LLMs extend the reach of open source rather than betray it, and the real concern should be access to the models, not access to the training data.
This view sees the problem as enclosure of AI products, not training on public code. Open source values are preserved if the resulting tools remain broadly available.