You probably don’t know about Automattic, but they know you.
It is the parent company of WordPress, and its content management systems host around 43 percent of the internet’s 10 million most popular websites. Meanwhile, it also owns a vast suite of mega-platforms including Tumblr, where a massive number of embarrassing personal posts live. All this is to say that, through all those countless Terms & Conditions and third-party consent forms, Automattic potentially has access to a huge chunk of the internet’s content and data.
According to 404 Media earlier this week, Automattic is finalizing deals with OpenAI and Midjourney to provide a ton of that information for their ongoing artificial intelligence training pursuits. Most people see the results in chatbots, since tech companies need the text within millions of websites to train the conversational abilities of large language models. But this can also take the form of training facial recognition algorithms using your selfies, or improving image and video generation capabilities by analyzing original artwork you uploaded online. It’s hard to know exactly what and how much data is used, however, since companies like Midjourney and OpenAI maintain black box tech products, and this imminent business deal is no exception.
So, what if you wanna opt out of ChatGPT devouring your confessional microblog entries or daily workflows? Good luck with that.
When asked to comment, a spokesperson for Automattic directed PopSci to its “Protecting User Choice” page, published Tuesday afternoon after 404 Media’s report. The page attempts to offer you a number of assurances. There’s now a privacy setting to “discourage” search engines from indexing sites on WordPress.com and Tumblr, and Automattic promises to “share only public content” hosted on those platforms.
Additional opt-out settings will also “discourage” AI companies from trawling data, and Automattic plans to regularly update its partners on which users “newly opt out,” so that their content can be removed from future training and past source sets.
There is, however, one little caveat to all this:
“Currently, no law exists that requires crawlers to follow these preferences,” says Automattic.
OpenAI wants to devour a huge chunk of the internet. Who’s going to stop them?