AI and the newsroom: Nine Indian media houses are opting out of AI trackers. Here’s why

Concerns vary from copyright issues to loss of revenue and jobs.

Written by: Shivnarayan Rajpurohit
Illustration of a journalist hiding his laptop screen from a robot.

These days, almost every other news website has a think piece on how AI can change industries for the better, from healthcare to e-commerce to education.

The future of journalism too lies in AI, some say, with potential to copy edit reports, transcribe interviews, subtitle videos, gather research, and crunch data. 

But even as news publishers experiment with the idea of using generative AI to improve content creation, distribution and management, they’re also anxious about how these AI platforms will use their content. Concerns range from the loss of jobs and revenue to the loss of audiences.

Which is why around half the news publishers in the world have denied permission to AI platforms to scrape their websites, according to a tracker run by homepage.news, an open-source platform. 

Out of 1,149 news publishers surveyed, 559 – or 48.7 percent – have “instructed OpenAI, Google AI or the non-profit Common Crawl to stop scanning their sites”. This includes nine media houses in India.

At the time of publishing this story, 48.4 percent of the surveyed organisations had opted out of Microsoft-backed OpenAI and 35.8 percent had opted out of Common Crawl. Just seven percent had opted out of Google AI.

OpenAI, Google AI and Common Crawl scrape data on the internet with the help of “crawler” programmes, also called spiders. This data is fed to generative chatbots like ChatGPT and Bard, which talk to us like humans and answer questions on anything from the Israel-Palestine violence to the difference between a chatbot and a crawler.
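For readers curious about what opting out looks like in practice, here is a minimal sketch. It assumes that publishers signal the opt-out through their site's robots.txt file, and that the three crawlers identify themselves there with the tokens GPTBot (OpenAI), Google-Extended (Google AI) and CCBot (Common Crawl). The sample file and the function name opted_out are made up for illustration. This is not the tracker's own code.

```python
from urllib.robotparser import RobotFileParser

# Assumption: these are the robots.txt user-agent tokens for the three crawlers.
AI_CRAWLERS = {
    "OpenAI": "GPTBot",
    "Google AI": "Google-Extended",
    "Common Crawl": "CCBot",
}

# Hypothetical robots.txt for an imaginary publisher: it blocks OpenAI and
# Common Crawl but leaves every other crawler, including Google AI, free to scan.
SAMPLE_ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: CCBot
Disallow: /

User-agent: *
Allow: /
"""


def opted_out(robots_txt, url="https://example.com/"):
    """Return, per AI platform, whether robots.txt forbids it from fetching url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {
        platform: not parser.can_fetch(agent, url)
        for platform, agent in AI_CRAWLERS.items()
    }


if __name__ == "__main__":
    for platform, blocked in opted_out(SAMPLE_ROBOTS_TXT).items():
        print(f"{platform}: {'opted out' if blocked else 'allowed'}")
```

In this made-up example, the publisher would count as having opted out of OpenAI and Common Crawl but not Google AI. A robots.txt file is a request rather than a technical barrier, so it works only if the crawler chooses to honour it.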

Ben Welsh, a news application editor at Reuters, created the open-source code for the Homepage.news project in March 2022. Subsequently, a group of volunteers contributed to the project. He initially tracked 100 media houses; as more volunteers contributed to the code, the repository expanded to over 1,000 media houses. 

Welsh told Newslaundry the tracker’s findings reflect a “broad cross-section of the English-language publishing world, but it is far from comprehensive”.

Importantly, the tracker comes close on the heels of tense battles between Big Tech and news publishers over revenue and audiences across the world.

This fight is being waged in India too. Last year, India’s News Broadcasters & Digital Association and other news organisations filed complaints against Google for “abusing its dominant position”.

