Bluesky has put forward a new proposal on how user data can be used for artificial intelligence training and public archiving. The company plans to give users options to indicate whether they allow their posts and data to be scraped.
Bluesky CEO Jay Graber discussed the proposal on stage at South by Southwest (SXSW) early last week. The issue resurfaced on Friday night, however, after Graber posted about the plan on Bluesky. Many users reacted strongly, recalling Bluesky’s earlier assurances that it would not sell their data to advertisers or use their posts to train artificial intelligence.
Some users criticized the move, arguing that the platform was moving away from its privacy-oriented approach. A user with the handle “Sketchette” commented: “Oh my God, no! The beauty of this platform was that information was not shared. Especially with artificial intelligence! Don’t take a step back now.”
Graber said that artificial intelligence companies are already scraping everyone’s public data from across the internet, and that Bluesky wants to make this process more transparent and controlled. According to her, the system Bluesky is proposing would let users state their preferences through a mechanism similar to the robots.txt file, which websites use to communicate with search engines.
The ongoing debates over artificial intelligence training and copyright have also highlighted problems such as the fact that the robots.txt file is not legally binding. Bluesky says the new standard it is proposing will work in a similar way: it will create an ethical framework but will not impose a legal obligation.
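To make the robots.txt comparison concrete, here is a minimal sketch using Python's standard `urllib.robotparser` module. The robots.txt content, the crawler names and the URL are all made up for illustration. Note that the parser only reports what the file asks for; whether a crawler actually obeys the answer is entirely up to the crawler, which is the non-binding aspect described above.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; "ExampleAIBot" is a made-up crawler name.
ROBOTS_TXT = """\
User-agent: ExampleAIBot
Disallow: /

User-agent: *
Allow: /
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# A well-behaved crawler asks before fetching; nothing forces it to.
url = "https://example.com/post/1"
ai_bot_allowed = parser.can_fetch("ExampleAIBot", url)
other_bot_allowed = parser.can_fetch("SomeSearchBot", url)

print("ExampleAIBot allowed:", ai_bot_allowed)
print("SomeSearchBot allowed:", other_bot_allowed)
```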
What is Bluesky proposing?
According to the proposal, users of the Bluesky app, or of other apps built on the same infrastructure, will be able to choose from the Settings menu whether to allow the use of their data in four different categories:
- Artificial intelligence training (providing data for training artificial intelligence models),
- Protocol bridging (connecting different social media platforms),
- Bulk datasets (creating large data collections for research or analysis),
- Web archiving (for example, sharing with archiving services such as the Wayback Machine).
When a user indicates that they do not want their data used for artificial intelligence training, companies and research teams will be expected to respect this preference. This applies both to scraping and to bulk data transfers made through the protocol.
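The following is a minimal sketch of how a data collector might honor such a preference signal before building a training set. It is not Bluesky's actual schema or API: the `Use` enum, the `is_allowed` and `filter_for_training` helpers, and the sample users are all hypothetical. Treating a missing preference as "not allowed" is also an assumption of this sketch, since the article does not say what the default would be.

```python
from enum import Enum


class Use(Enum):
    """The four categories named in the proposal (identifiers are hypothetical)."""
    AI_TRAINING = "ai_training"
    PROTOCOL_BRIDGING = "protocol_bridging"
    BULK_DATASETS = "bulk_datasets"
    WEB_ARCHIVING = "web_archiving"


def is_allowed(prefs: dict[Use, bool], use: Use, default: bool = False) -> bool:
    """Return the user's stated preference, or `default` if none was stated.

    Defaulting to "not allowed" is an assumption of this sketch.
    """
    return prefs.get(use, default)


def filter_for_training(
    posts: list[dict], prefs_by_user: dict[str, dict[Use, bool]]
) -> list[dict]:
    """Keep only posts whose authors have allowed AI training."""
    return [
        post
        for post in posts
        if is_allowed(prefs_by_user.get(post["author"], {}), Use.AI_TRAINING)
    ]


# Hypothetical users and posts.
prefs_by_user = {
    "alice": {Use.AI_TRAINING: False, Use.WEB_ARCHIVING: True},
    "bob": {Use.AI_TRAINING: True},
}
posts = [
    {"author": "alice", "text": "hello"},
    {"author": "bob", "text": "hi"},
    {"author": "carol", "text": "hey"},  # no preferences stated
]

training_set = filter_for_training(posts, prefs_by_user)
print([post["author"] for post in training_set])
```

As with robots.txt, the check only works if the collector chooses to run it; the signal itself cannot enforce anything.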
Technology writer Molly White called the proposal a positive step and said she found it surprising that Bluesky was being criticized for it. According to her, the change is not meant to encourage artificial intelligence scraping but to add user consent to a process that is already happening.
However, White questions how effective such “preference signals” will be, including the similar ones proposed by Creative Commons. She said these systems work only if data scrapers act in good faith, and that some artificial intelligence companies continue to scrape content in disregard of robots.txt rules.
Bluesky’s proposal has not yet been finalized, and discussions are continuing. It appears that users will need to be convinced that the platform is not going back on its earlier statements about transparency and privacy.