Google updates privacy policy to collect public data for AI training

Over the weekend, Google updated its privacy policy to allow the company to collect and analyze information people share online to train its AI models.

Google says it will use this information to improve its services and develop new AI-powered products.

Google’s privacy policy update reads:

“Google uses information to improve our services and develop new products, features and technologies that benefit our users and the public. For example, we use publicly available information to train Google’s AI models and develop products and features such as Google Translate, Bard, and Cloud AI capabilities.”

Here’s more about the policy change and what it could mean for internet users.

A shift from “language” models to “AI” models

The updated policy marks a significant departure from the previous version of Google’s privacy policy.

Before this weekend’s update, Google’s policy said it uses people’s data to improve “language” models.

Now, Google reserves the right to use personal data to improve all of its “AI” models and products, including translation systems, text generation systems, and cloud AI services.

Google highlights the changes on the archive page of its privacy policy (green represents newly added information):

Screenshot from: policies.google.com/privacy/, July 2022.

Typically, privacy policies limit companies to collecting data that users directly provide. Google’s new policy allows the company to use any information that is publicly posted online.

Privacy concerns

The use of AI systems to analyze people’s online posts raises privacy concerns.

AI technologies like Google’s Bard and OpenAI’s ChatGPT can ingest and reuse people’s posts, reviews, and other online content.

Anything posted publicly online is already visible to everyone. What changes is how that information might be used. The main concern shifts from who can access the data to how it could be used.

In addition, the legality of this data collection method is still uncertain.

Going forward, the courts can be expected to deal with complex copyright issues.

Web scraping

The issue of web scraping has caught the attention of high-profile tech figures like Elon Musk, who has been vocal about his concerns and has even blamed several recent Twitter mishaps on the platform’s efforts to prevent data extraction.

Over the weekend, Twitter capped the number of tweets users can see per day, rendering the service all but unusable. Musk described the cap as a key response to “data scraping” and “system manipulation”.

The spread of web scraping by tech giants has become central to the debate over privacy and the use of consumer data.

How to protect your data

If you’re concerned about the changes to Google’s privacy policy, here are some steps you can take to prevent Google from using your data to train its AI systems:

  • Only post information publicly if you are comfortable with anyone, including Google, accessing and using it.
  • Use Google’s privacy settings. Go to your Google account and check your privacy settings. You can turn off options like Web & App Activity, Location History, and Voice & Audio Activity.
  • Use alternative services. Instead of using Google services like Search, Gmail, YouTube, Chrome, etc., you can switch to alternative providers with stricter privacy policies. Options include DuckDuckGo for searching, ProtonMail for email, Vimeo for video sharing, and Brave for web browsing.
  • When using Google services, turn on incognito or private browsing mode.
  • Read the privacy policy of any website, mobile app, or other service before using it. Be wary of those that say they share your information with Google.
  • Contact Google directly to raise your concerns about how your data could be used to train its AI models.

In summary

Google’s update, which allows the company to collect and analyze public data to train its AI systems, highlights two key issues.

First, as AI technologies continue to evolve, tech companies have an increasing need for data. However, this data collection should be done in a legal and ethical manner, with users’ consent and knowledge of how their data is being used.

Second, people should choose carefully what they share online, recognizing that public posts can be used in ways that are difficult to predict.

While AI promises many benefits, it also brings new challenges that we need to overcome in order to build a responsible future with AI.


Featured image: Ian Dewar Photography/Shutterstock