Wikipedia bans AI-generated content. Here are the reasons for this decision

The English-language version of the encyclopedia forbids the use of large language models to generate or reword entries, leaving only two narrow exceptions: very basic processing of one's own text and translations after human verification. This is not a sudden turn, but the next stage in a longer process of tightening the rules – from restrictions on AI-generated images and comments to quick removal of pages that appear to have been created by an LLM.
But the most important thing is why Wikipedia rejects generative AI. The simplest answer is: hallucinations. But not the most spectacular ones, i.e. making up complete nonsense. A much bigger problem is hallucinations that act like a wolf in sheep's clothing: content that looks credible and reveals its true nature only after careful analysis. One example is quotes in which the model subtly changes the wording of a statement and in practice completely distorts its meaning. Importantly, this is not just theoretical speculation. Wikipedia's moderators and authors have been struggling with the pressure of AI slop for several years and have had many opportunities to observe what generative confabulations look like in practice.
Practice shows that Wikipedia's concerns are justified
I know this problem not only from other people's analyses, but also from everyday work with AI – either in the form of the company's Copilot based on ChatGPT, or in the form of a more advanced version of Gemini, paid for out of my own pocket. I use these tools primarily to search for information, because classic search engines more and more often resemble a trash heap in which finding something valuable is a rare surprise. This is enough for me to see all the sins of artificial intelligence on a daily basis and to warn both myself and others: uncritically trusting AI is very risky when it is used for topics that you cannot yet verify yourself.
Some of these sins are relatively minor. When I was preparing an article about Huawei's new accelerators, I tasked Copilot with creating a table comparing the specifications of different accelerators. Something was wrong with one of the cells. After I pointed this out, the language model apologized to me for the – and I quote – “Czech mistake” (a Polish idiom for a transposition error). The AI had taken data from another cell, describing a completely different parameter of another accelerator, and inserted it in the wrong place, because the numbers 141 and 148 looked the same to it. Such “Czech errors” are not unusual for AI, because this technology has a problem with nuance and gets lost when pieces of information resemble each other.
The above example is a small thing in practice, but this tendency to lose context and miss nuances can manifest itself in much more problematic ways. I recently came across an AI-generated text fragment about a person in “Chinese Taipei”. This is a perfect example of a statement that looks correct from the point of view of a language model, but raises a red flag for anyone even slightly versed in geopolitics.
I also regularly encounter the situations described by Wikipedia, in which the AI is wrong in a very convincing and well-disguised way. The most striking example came when I was preparing an article about Jensen Huang. I was looking for detailed information about Nvidia sending accelerators to researchers at the University of Toronto. Responding to my query, the AI took three different events several years apart, mixed them together, and finally supplemented the words actually spoken with made-up quotes matching the narrative it had created. After I pointed out that “I don't see these quotes in the sources provided,” Gemini replied, in part: “My previous mistake – this particular quote comes from a different source.” So it admitted its mistake and then… created another quote that did not exist in the linked article.
No, Gemini. Those quotes just aren't there. (Onet)
Enforcing the new AI policy will become increasingly difficult
In order not to fall into a cheap anti-technology narrative, it must be admitted that artificial intelligence can be very useful. It works well as a tool for quickly mapping a topic, spotting clues, organizing material, etc. However, this does not change the fact that Wikipedia's concerns are perfectly reasonable, and the goal of the people managing the site is noble. Just… does this fight have any future?
AI enthusiasts will probably say that this technology is developing so quickly that it is only a matter of time before all the identified problems are solved. Models are getting better and better, and people are learning to formulate queries more precisely, which has a huge impact on the results obtained. Perhaps one day it will even be possible to make AI-generated text stop sounding bland, a quality that results from the models' natural tendency to average everything out. And even if AI-generated text is not perfect, according to this logic it is enough for it to be better than text written by many people, who also make mistakes, can be biased and are usually not masters of writing.
In short, the ban on publishing AI-generated content will become increasingly difficult to enforce. It is also possible that in some applications it may simply become redundant. The problem is that even if we reach this stage of AI development and high-quality artificial intelligence becomes widely available – and not only to the richest – Wikipedia's mission may first lose out to something much more prosaic: economics.
The specter of the Internet losing its economic footing
Whether we like it or not, ChatGPT has permanently changed how people search for information. The direction of Google's development further fuels the trend in which users visit sources less and less often and more and more often uncritically accept what artificial intelligence serves them, without noticing the so-called Gell-Mann amnesia (although it is dangerous not only in the case of AI-generated content). This has a very large impact on the finances of all websites where content is still created by people.
Ouroboros, or a snake eating its own tail.
It is true that this mainly affects commercial media, but… Wikipedia, which relies on donations, is also seeing declining traffic and, at the same time, growing bot traffic, which it also has to pay for. Sooner or later this will have a very noticeable impact on its operations. Let's hope that the Internet will find some new idea for itself before then. We already know from practice that artificial intelligence is a snake that is not above biting its own tail and, if necessary, cites content generated by another AI, for example from Grokipedia. In this way, a closed loop of synthetic content is created, in which models reinforce each other's answers, moving further and further away from the original sources, and therefore also from the truth. So we are currently dealing with a paradox in which AI needs quality content created by people to function properly, while effectively cutting those people off from the resources and motivation necessary to continue working.
To sum up, Wikipedia is still trying to defend the old principle that a sentence without a verifiable source should not enter the circulation of knowledge. The question is not only whether it will be able to defend it against AI slop, but also whether the Internet in its current form will allow such places to continue to function. This will largely depend on whether we still remember that content its author put no effort into creating rarely deserves the reader's effort.





