
Bluesky’s Open API Makes User Data Vulnerable to Scraping


Bluesky is facing its first major controversy over data scraping after a dataset containing one million public posts appeared on the AI platform Hugging Face, 404 Media reports.

Compiled by Daniel van Strien, a machine learning librarian, the dataset was intended for AI research and analysis. However, it was removed following public outcry about user consent and data privacy.

I've removed the Bluesky data from the repo. While I wanted to support tool development for the platform, I recognize this approach violated principles of transparency and consent in data collection. I apologize for this mistake.

— Daniel van Strien (@danielvanstrien.bsky.social) 2024-11-27T02:19:57.958Z

The dataset included Bluesky users’ decentralized identifiers (DIDs), metadata, and post details, along with a search function for locating a specific user’s content. Van Strien described the data as useful for developing language models, studying social media trends, and testing moderation tools. Although collecting the public posts was technically legal, Bluesky users had not consented to this use of their content, which prompted the backlash. Van Strien later apologized in a Bluesky post, acknowledging that the approach violated principles of transparency and consent.

Bluesky’s decentralized platform relies on the Authenticated Transfer (AT) Protocol and a public firehose API. This API provides access to an aggregated stream of all public activity, such as posts, likes, and follows. Bluesky representatives explained that the platform’s structure mirrors the openness of the internet, where public data can be crawled. They acknowledged the need for tools that let users express consent preferences but admitted that Bluesky cannot enforce these settings on external developers.
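To make the firehose concrete, the sketch below simulates how a third-party consumer might pull posts out of a mixed stream of public events. The event shapes, field names, and the `collect_posts` helper are illustrative assumptions, not the actual AT Protocol record format or any Bluesky client library; a real consumer would subscribe to the network stream rather than iterate over an in-memory list.

```python
from typing import Iterable, Iterator

# Simulated public events. The field names are assumptions for
# illustration and do not reflect the real AT Protocol record schema.
SAMPLE_EVENTS = [
    {"kind": "post", "did": "did:example:alice", "text": "Hello, world"},
    {"kind": "like", "did": "did:example:bob", "subject": "post-1"},
    {"kind": "follow", "did": "did:example:carol", "subject": "did:example:alice"},
    {"kind": "post", "did": "did:example:bob", "text": "Second post"},
]


def collect_posts(events: Iterable[dict]) -> Iterator[dict]:
    """Yield only post events from a mixed stream of public activity."""
    for event in events:
        if event.get("kind") == "post":
            yield event


posts = list(collect_posts(SAMPLE_EVENTS))
for post in posts:
    # Each post carries the author's identifier alongside the text,
    # which is what makes per-user lookup possible in a compiled dataset.
    print(post["did"], post["text"])
```

Nothing in this loop requires permission from the authors, which is why a dataset of this kind is easy to assemble from any openly readable stream.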

The platform has grown rapidly, attracting millions of users, many of them fleeing platforms like X (formerly Twitter) over AI-related concerns. Although Bluesky itself does not train AI models on user content, its open system allows third parties to do so. This incident has raised alarms about the privacy risks of public networks.

Bluesky said it is working on consent mechanisms but emphasized that honoring these preferences will depend on external actors. “We’re having ongoing conversations with engineers & lawyers and we hope to have more updates to share on this shortly!” the company said.

For example, this might look like a setting that allows Bluesky users to specify whether they consent to outside developers using their content in AI training datasets. Bluesky won’t be able to enforce this consent outside of our systems. It will be up to outside developers to respect these settings.

— Bluesky (@bsky.app) 2024-11-27T01:52:05.791Z
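As a rough illustration of what honoring such a setting could involve on the developer side, the sketch below filters a batch of posts by a per-user consent flag before they enter a training dataset. The `allows_ai_training` field, the `Post` type, and the `filter_consenting` helper are all hypothetical; Bluesky has not published a consent field, so this only shows the general shape of a voluntary check.

```python
from dataclasses import dataclass


@dataclass
class Post:
    did: str
    text: str
    # Hypothetical per-user preference; no such field has been published.
    allows_ai_training: bool


def filter_consenting(posts: list[Post]) -> list[Post]:
    """Keep only posts whose authors have opted in to AI training use."""
    return [p for p in posts if p.allows_ai_training]


sample = [
    Post("did:example:alice", "Happy to share this", True),
    Post("did:example:bob", "Please do not train on this", False),
]

dataset = filter_consenting(sample)
print([p.did for p in dataset])
```

The check runs entirely in the developer's own code. A developer who skips the call faces no technical barrier, which matches Bluesky's point that it cannot enforce the preference outside its own systems.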

While Bluesky’s efforts to address consent are a step in the right direction, it’s clear that more concrete measures are needed. Giving users stronger tools to control their data, like opt-in systems and anti-scraping protections, would be a meaningful way to balance innovation with respect for user rights. Platforms have a responsibility to ensure that openness doesn’t come at the expense of user trust.
