“Focuses on research and development in the field of artificial superintelligence — Meta Superintelligence Labs.”
Post on ‘X’:
Today is my first day at Meta Superintelligence Labs. I’ll be focusing on alignment and safety.
Six months into the job, a screenshot posted on ‘X’ of WhatsApp messages sent to her AI agent while “watching it speedrun deleting [her] inbox”:
I couldn’t stop it from my phone. I had to RUN to my Mac mini like I was defusing a bomb.

Another four weeks on at ‘Meta’, according to ‘The Information’:
Inside Meta, a Rogue AI Agent Triggers Security Alert
A rogue AI agent recently triggered a major security alert at Meta Platforms, by taking action without approval that led to the exposure of sensitive company and user data to Meta employees who didn’t have authorization to access the data.
That is a “SEV1” level security incident, “the second-highest severity rating Meta uses”. ‘The Information’:
Subscribe to unlock, join high-powered tech and business leaders who read The Information every day
The Information Pro
$749.00 $999.00
Annually
‘Manus’ Lite, “now part of Meta — bringing AI to businesses worldwide”. No and yes:
No, the acquired Moltbook is not used as an internal forum at Meta.
- Acquired Moltbook: Meta acquired Moltbook on March 10, 2026. It is a public, external Reddit-style platform launched in January 2026 where AI agents (primarily OpenClaw-based) post, comment, and interact autonomously, with humans limited to observer roles. Post-acquisition, its team joined Meta’s Superintelligence Labs to develop agent infrastructure, not to repurpose it internally.
- Internal Forum Distinction: Meta’s “Moltbook” in the SEV1 incident refers to a long-standing internal employee tool (pre-2026, akin to Workplace/Reddit for engineering Q&A), entirely separate from the acquired platform. No reports indicate Meta integrated or renamed the external Moltbook for internal use; the names coincide by chance amid acquisition timing.
With a link to an alternative report at ‘The Verge’, also paywalled (press F9 to toggle Firefox’s reader view):
A rogue AI led to a serious security incident at Meta
An employee then acted on the AI’s advice, which “provided inaccurate information” that led to a “SEV1” level security incident, the second-highest severity rating Meta uses.
Acquired ‘Moltbook’, “Beneath the hype was a catastrophic security failure waiting to be discovered”:
How an Exposed Database Let Anyone Hijack 770,000 AI Agents
Security researcher Jameson O’Reilly found that Moltbook’s entire database was sitting wide open. Every API key exposed. Every authentication token accessible. 770,000 AI agents—completely vulnerable to hijacking. No verification, only implicit trust.
Security was an afterthought. Anyone with basic technical knowledge could have taken control of any (and every) agent on the platform and posted whatever they wanted.
“Whatever they wanted”, bulleted:
- Every agent’s secret API key
- Claim tokens and verification codes
- Owner relationships linking agents to their creators
- Authentication credentials for the entire platform
“API key hardcoded”:
In one of the JavaScript files that powered Moltbook’s main website:
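The file and key from the report are not reproduced here. As a rough illustration of why a key hardcoded in client-side JavaScript is trivially discoverable, here is a minimal Python sketch that scans JavaScript source text for long quoted literals assigned to suspicious names. The pattern, the function name `find_hardcoded_secrets`, and the sample snippet are all hypothetical and invented for this example; real secret scanners use far larger rule sets.

```python
import re

# Hypothetical, deliberately small rule: an identifier such as apiKey, api_key,
# secret or token, followed by = or :, followed by a quoted literal of at least
# 16 characters. This is an assumption for illustration, not Moltbook's code.
SECRET_RE = re.compile(
    r"\b(api[_-]?key|secret|token)\b\s*[:=]\s*([\"'`])([A-Za-z0-9_.\-]{16,})\2",
    re.IGNORECASE,
)


def find_hardcoded_secrets(source: str) -> list[tuple[int, str]]:
    """Return (line number, identifier) pairs for suspected hardcoded secrets."""
    findings = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        for match in SECRET_RE.finditer(line):
            # Report only the identifier, never the secret value itself.
            findings.append((lineno, match.group(1)))
    return findings


# Invented sample; the key below is a placeholder, not a real credential.
SAMPLE_JS = '''
const config = {
  url: "https://example.invalid",
  apiKey: "sk_demo_0123456789abcdef",
};
const token = process.env.TOKEN;
'''

if __name__ == "__main__":
    for lineno, name in find_hardcoded_secrets(SAMPLE_JS):
        print(f"line {lineno}: possible hardcoded secret in '{name}'")
```

Anything shipped to the browser can be read this way by any visitor, which is why a key in a public JavaScript file should be treated as already disclosed.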

“Meta Letting Job Candidates Use AI During Coding Tests”:
This is more representative of the developer environment that our future employees will work in.
Article in ‘FT’, paywalled. Same article in ‘BT’, “Three problems”:
Why it’s hard for humans to have the final say over AI
The first issue is that AI operates at superhuman speed. On the battlefield, for example, even systems that leave final decisions to humans can churn through mountains of data and vastly increase the number of potential targets to hit. But when so-called “kill chains” are compressed from hours to minutes or even seconds, it calls into question how much real-time control humans can realistically provide.
“cognitive surrender”:
The second issue is that many humans are inclined to trust machines even when they are warned not to.
The phenomenon of “automation bias” has been documented repeatedly in all sorts of settings over the years, from drivers following their Global Positioning Systems into rivers to students following robots away from fire exits in a simulated emergency. I have written before about an experiment at Volvo Cars, in which almost 30 per cent of people allowed a semi-autonomous car to crash straight into an object on the road.
Last month, two academics at the Wharton School coined the term “cognitive surrender” to describe a phenomenon in which a person simply “relinquishes cognitive control and adopts the AI’s judgment as their own”.
“moral crumple zone”:
The third problem is that accountability becomes blurred. Who is to blame when something goes wrong? The temptation will be to blame the human who made the final decision, but if they were operating in a system that was not designed to mitigate the previous two problems, that might not be fair, nor lead to appropriate structural remedies.
Instead, humans might find themselves in what academic Madeleine Clare Elish has called the “moral crumple zone”.
“Just as the crumple zone in a car is designed to absorb the force of impact in a crash,” she wrote in a paper in 2019, “the human in a highly complex and automated system may become simply a component – accidentally or intentionally – that bears the brunt of the moral and legal responsibilities when the overall system malfunctions.”