Photo by Rahul Pandi
More than 36 intellectual property-focused businesses, including Universal Music Group, have welcomed a move by internet infrastructure firm Cloudflare to block AI companies from automatically collecting content online — a shift that could reshape the AI licensing market.
AI web crawlers are automated programs or bots used by artificial intelligence companies to scan and collect large amounts of data from websites. This data, which can include text, images, audio, and video, is then used to train AI models, including the large language models behind tools like ChatGPT, as well as image generators.
As of July 1, AI web crawlers are blocked by default from accessing content on sites using Cloudflare’s infrastructure. With the company powering nearly 20% of the internet, the change could significantly disrupt AI models trained on scraped online content, much of it copyrighted. Website owners will also gain more control: they’ll be able to choose which crawlers access their content and introduce “pay-per-crawl” systems, letting them charge AI companies for access or negotiate licensing deals directly.
Originally designed to help search engines index the web, crawlers are now commonly used by AI firms to gather text, images, and media, often without permission. As AI increasingly becomes the go-to source for information and entertainment, many publishers have seen sharp declines in traffic, a trend known as the “zero-click” phenomenon.
“If the internet is going to survive the age of AI, we need to give publishers the control they deserve,” said Matthew Prince, co-founder and CEO of Cloudflare. “Our goal is to put the power back in the hands of creators, while still helping AI companies innovate.”
The move has broad support across media and IP-based industries, including backing from the News/Media Alliance, Associated Press, Conde Nast, Fortune, Sky News Group, Time, and others.
UMG Chief Operating Officer Boyd Muir also praised the decision. “This initiative will help address the disruptive and unauthorized scraping of both creative and commercial IP by AI developers and support new licensing opportunities,” he said. “We believe AI, used ethically and transparently, can unlock meaningful creative and commercial potential.”
Cloudflare already has a verification system that lets crawlers identify themselves and their intent. If they don’t comply, the company can apply tools typically used against malicious traffic.
“A web crawler is just another kind of bot,” said Will Allen, Cloudflare’s head of AI privacy, control, and media products. “All our experience tracking bots helps us understand what crawlers are doing.”
Still, some researchers caution against a blanket approach. “Not all AI systems compete with all web publishers,” said Shayne Longpre, a PhD candidate at the MIT Media Lab. “Open research and personal use shouldn’t be sacrificed.”




