Cloudflare sides with the free internet and refuses to accept data theft by AI models

Ashley Davis, 02/07/2025

Technically, Cloudflare relies on extensive heuristics and machine learning to distinguish generative AI agents from traditional search engines and archivers. Back in 2024 it released a button that blocks some of these crawlers, and more than a million customers have enabled it. Now this protection becomes always-on, i.e. enabled by default, and users can only add exceptions if they want to share data with selected business partners.

Matthew Prince, head of Cloudflare, warned at a conference that publishers face an existential moment: AI algorithms increasingly give the answer without sending the reader to the source, while indexing robots download many times more pages in the background than they ever return in the form of clicks. Within half a year, the download-to-referral ratio worsened from 6:1 to 18:1, and for OpenAI it reaches as much as 1,500:1. In other words, OpenAI sends back an average of one user for every 1,500 pages it downloads from a given website. With ratios like these, it is difficult to monetize content using the most popular model, which is displaying ads.
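To make these ratios concrete, here is a minimal Python sketch of the arithmetic. The function names are illustrative assumptions, not part of any Cloudflare tool; the only inputs taken from the article are the three ratios quoted above.

```python
def download_to_referral_ratio(pages_downloaded: int, referrals: int) -> float:
    """Return how many pages a bot downloads per visitor it sends back."""
    if referrals <= 0:
        raise ValueError("referrals must be positive to form a ratio")
    return pages_downloaded / referrals


def referrals_per_thousand_downloads(ratio: float) -> float:
    """Convert an N:1 ratio into visitors returned per 1,000 pages downloaded."""
    return 1000 / ratio


# Ratios quoted in the article, written as (pages downloaded, referrals).
for label, downloaded, referred in [
    ("earlier", 6, 1),
    ("half a year later", 18, 1),
    ("OpenAI", 1500, 1),
]:
    ratio = download_to_referral_ratio(downloaded, referred)
    per_thousand = referrals_per_thousand_downloads(ratio)
    print(f"{label}: {ratio:.0f}:1, {per_thousand:.1f} visitors per 1,000 downloads")
```

Read this way, a 1,500:1 ratio means fewer than one visitor comes back for every 1,000 pages downloaded.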

Publishers gained strong support

For publishers, it is primarily a chance to regain control and a new stream of revenue. Pay per crawl is already being tested by, among others, Time, Condé Nast and the Associated Press. The system is meant to work in a simple way: the publisher sets a rate or refuses access entirely, and the AI company decides whether to pay or to settle for data from another source.
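The decision described above can be modeled in a few lines of Python. This is only a sketch of the logic as the article describes it: `PublisherPolicy`, `crawler_decision` and the prices are hypothetical names and values, and nothing here reflects Cloudflare's actual interface.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PublisherPolicy:
    """Hypothetical publisher setting: a per-request price, or a full block."""
    price_per_crawl: Optional[float]  # None means access is refused outright


def crawler_decision(policy: PublisherPolicy, max_price: float) -> str:
    """Return what a crawler with a spending cap does for one request."""
    if policy.price_per_crawl is None:
        return "blocked"         # publisher refuses access entirely
    if policy.price_per_crawl <= max_price:
        return "pay"             # price is within the crawler's budget
    return "look elsewhere"      # too expensive, use another source


# Made-up prices, purely for illustration.
print(crawler_decision(PublisherPolicy(price_per_crawl=0.01), max_price=0.05))
print(crawler_decision(PublisherPolicy(price_per_crawl=0.10), max_price=0.05))
print(crawler_decision(PublisherPolicy(price_per_crawl=None), max_price=0.05))
```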

Cloudflare is part of a broader trend in which content creators choose between the courtroom and the license. The New York Times is suing OpenAI and Microsoft, accusing them of mass copyright violations, but in May it also concluded its first licensing agreement, with Amazon, covering Alexa and Amazon's AI models. Axel Springer and News Corp have signed similar agreements with OpenAI.

However, there is no guarantee that the technical barriers proposed by Cloudflare will prove watertight. Some scrapers already ignore bot-control files such as robots.txt, and cybercriminals can impersonate ordinary users to automate content theft. As a result, this can turn into an arms race: ever smarter bots against ever more invasive analysis of behavior on the server side. This, of course, raises questions about privacy and false alarms.
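Why robots.txt alone is a weak barrier becomes clearer with a short example. The sketch below uses Python's standard `urllib.robotparser` to evaluate a made-up robots.txt; the bot name `ExampleAIBot` and the URL are hypothetical. The rules only take effect when the crawler itself chooses to run a check like this before fetching a page.

```python
from urllib.robotparser import RobotFileParser

# A made-up robots.txt: block one (hypothetical) AI crawler, allow everyone else.
ROBOTS_TXT = """\
User-agent: ExampleAIBot
Disallow: /

User-agent: *
Allow: /
"""


def is_allowed(user_agent: str, url: str) -> bool:
    """Check a URL against ROBOTS_TXT the way a well-behaved crawler would."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)


url = "https://example.com/article"
print(is_allowed("ExampleAIBot", url))  # the named bot is disallowed
print(is_allowed("Mozilla/5.0", url))   # any other agent falls under the * rule
```

A scraper that skips this check, or that presents itself under a browser-like user agent, is not stopped by the file, which is why detection ends up relying on behavioral analysis.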

Billions of dollars at stake

From the advertising market's perspective, the stakes are high. A Press Gazette analysis showed that the appearance of an AI Overview in Google search reduces the click-through rate on Mail Online results by up to 56 percent. This is an outflow of traffic that makes free content financed by banner ads stop adding up economically.

If the trend persists, publishers may switch en masse to paywalls, micropayments or hybrid API licenses. At the same time, AI platforms will test revenue-sharing models, from displaying links with a commission to embedding contextual ads in the answer stream itself. However, there is a risk that the smallest creators, who have lived off advertising in the long tail of search, will no longer find a place in this ecosystem. On the one hand, they will no longer receive enough referrals from search engines and LLMs; on the other, their audience will be too small to earn from paid content.