Google is calling for a public discussion about AI use of web content

Google announced today that it will launch a public discussion on developing new protocols and policies for how AI systems access and use content from websites.

In a blog post, Google says it wants to “explore technical and ethical standards to give web publishers choice and control for emerging AI and research use cases.”

The announcement follows Google’s recent I/O conference, where the company discussed new AI products and its AI principles, which aim to ensure AI systems are fair, transparent and accountable.

Google’s blog post reads:

“We believe everyone benefits from a vibrant content ecosystem. The key to this is that web publishers have meaningful choice and control over their content and the ability to derive value from participating in the web ecosystem.”

Google acknowledges that technical standards like robots.txt were created almost 30 years ago, before the arrival of modern AI technologies capable of analyzing web data at scale.

Robots.txt allows publishers to control how search engines crawl and index their content. However, it provides no mechanism for governing how AI systems may use that data to train algorithms or develop new products.
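
To illustrate that gap, here is a minimal sketch of how robots.txt rules are evaluated, using Python's standard `urllib.robotparser` module. The crawler names `ExampleSearchBot` and `ExampleAIBot` and the `example.com` URLs are hypothetical and chosen only for illustration; they do not refer to any real crawler or to anything Google has proposed.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: rules are keyed by crawler name and URL path only.
ROBOTS_TXT = """\
User-agent: ExampleSearchBot
Disallow: /private/

User-agent: ExampleAIBot
Disallow: /
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text into a parser that can answer fetch queries."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)

# The search crawler may fetch public pages but not the /private/ section.
search_ok = parser.can_fetch("ExampleSearchBot", "https://example.com/articles/post.html")
search_private = parser.can_fetch("ExampleSearchBot", "https://example.com/private/draft.html")

# The hypothetical AI crawler is blocked from the whole site.
ai_ok = parser.can_fetch("ExampleAIBot", "https://example.com/articles/post.html")

print(search_ok, search_private, ai_ok)
```

Under these assumptions, a publisher can only answer the question "may this named crawler fetch this path?" Nothing in the file can express what a crawler that is allowed in may then do with the content, such as using it to train a model.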

Google invites members of the web and AI community, including web publishers, academics, civil society groups and its partners, to participate in a public discussion on the development of new protocols and ethical guidelines.

Google states:

“We want this to be an open process and hope that a wide range of stakeholders will participate in discussing how AI advancements can be reconciled with privacy, agency and control over data.”

The discussion reflects the growing recognition that AI technologies can use web data in new ways, posing ethical challenges around data use, privacy and bias.

By initiating an open process, Google is aiming for a collaborative solution that serves the interests of both tech companies and content creators.

The outcome of these discussions could influence the way AI systems interact with and use data from websites for years to come.

“The web has enabled so much progress, and AI has the potential to build on that progress,” says Google. “But we have to do it right.”

Criticism of Google’s data collection methods

Google’s announcement comes at a time when the company has been criticized for how much data it has already collected from across the web to train its AI systems and language models.

These data collection practices will be set out in an update to Google’s privacy policy.

Some in the SEO community argue that Google’s efforts are too little, too late.

Barry Adams poked fun at the announcement on Twitter, saying:

“Having already trained our LLMs on all of your proprietary and copyrighted content, we will finally consider providing you with a way to opt-out of future content for our wealth.”

Others argue that Google needs to do more to collect feedback in this process.

Nate Hake, a travel marketer, tweeted:

“To start a discussion, you have to actually have the other side SAY something. This is an email capture form only. No field to give feedback. Not even a confirmation message.”

AI relies on data – but how much is too much?

AI systems require large amounts of data to function, improve and benefit society. However, the more data AI has access to, the greater the risks to privacy.

There are difficult tradeoffs between enabling AI advances and protecting people’s information.

There is debate about whether people should be able to opt out of having their public social media data used by AI systems. Some say individuals should be in control of their data, while others say opt-outs would slow AI progress.

Both sides have valid arguments, and we are far from a consensus on the right policy approach.

Looking ahead

Google’s call for discussion is a step in the right direction, but the company needs to follow through by acting on the feedback it receives.

Google is not alone in facing these challenges. Every tech company developing AI relies on data collected from the internet. The discussion should include the entire tech industry, not just Google.


Featured image: JDres/Shutterstock