Brave browser under fire for alleged sale of copyrighted data

Brave, a privacy-focused web browser, is under fire for allegedly selling proprietary data to train artificial intelligence models.

This has sparked debates about the ethical use of data and the need for transparency.

An article by Stack Diary’s Alex Ivanovs brought the allegations against Brave to light.

Ivanovs expressed concerns that Brave may collect user data without permission and sell it to companies developing AI systems.

Although Brave touts strong privacy protections, the alleged sale of copyrighted material for AI training raises questions about data practices that could violate user trust and privacy expectations.

The emerging controversy highlights the tension between using personal data to advance AI capabilities and protecting privacy and property rights. It underscores the need for clear communication and for obtaining users' consent before their information is shared.

The situation calls into question whether, as claimed, Brave really prioritizes user privacy and data control.

Unpacking the allegations

Ivanovs claimed that Brave provides access to copyrighted content through its Brave Search API and allows third parties to use that data for AI training without the appropriate license.

He argued that Brave’s lack of regard for copyright and the monetization of data access were ethically questionable practices.

Ivanovs writes:

“Brave allows you to ingest copyrighted material through the Brave Search API, for which you are also assigned ‘rights’.”

Brave's response

The allegations prompted Josep M. Pujol, Brave’s chief of search, to defend the company’s actions. Pujol said the rights in question apply to Brave’s search engine results, not to the underlying content itself.

Pujol explains:

“Brave Search has the right to monetize the results of its search engine and set terms of use.”

Pujol also stated that any data provided is always mapped to the URL of the content.

The investigation

Ivanovs noted that Brave Search provides long “Extra Alternative Snippets,” similar to Google’s Featured Snippets. He questioned whether these long snippets, ranging from 150 to 260 words, comply with copyright fair use principles.

In addition, Ivanovs criticized Brave for not disclosing details about its web crawler, which indexes website content. He argued this prevents site owners from blocking Brave from potentially selling their content.

Brave countered that its crawler respects the robots.txt standard that websites use to control crawlers.
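
To make the robots.txt point concrete, here is a minimal sketch of how a rule in that file is evaluated, using Python's standard `urllib.robotparser`. The crawler name `ExampleBot`, the `example.com` URLs, and the helper `is_allowed` are all hypothetical and chosen only for illustration; they do not come from Brave. The sketch also shows why the crawler's name matters: a site owner can only target a specific crawler if its user-agent name is known.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content. "ExampleBot" is a made-up crawler name,
# not the name of any real crawler.
ROBOTS_TXT = """\
User-agent: ExampleBot
Disallow: /

User-agent: *
Disallow: /private/
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # The named crawler is blocked from the whole site.
    print(is_allowed(ROBOTS_TXT, "ExampleBot", "https://example.com/article"))
    # Any other crawler falls back to the wildcard rules.
    print(is_allowed(ROBOTS_TXT, "OtherBot", "https://example.com/article"))
    print(is_allowed(ROBOTS_TXT, "OtherBot", "https://example.com/private/page"))
```

Without a published crawler name, the only lever left in this sketch is the wildcard `User-agent: *` block, which applies to every compliant crawler rather than to one in particular.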

The implications

Concluding his report, Ivanovs noted that the ramifications of Brave’s practices extend beyond the search engine itself.

He raised concerns about the possibility of abuse of the system and ambiguity about the legality of Brave’s methods.

He also questioned Brave’s stance that, as a search engine, the company had the right to extract and resell data verbatim.

Ivanovs warns:

“I don’t see a world where this cannot be abused.”

The debate is likely to continue.

The dispute raises important questions about the ethical use of data, profiting from other people's content, and the transparency of large tech companies.

The technology industry will be watching closely as these conversations develop.