<aside> <img src="/icons/forward_gray.svg" alt="/icons/forward_gray.svg" width="40px" />
https://www.mindamyers.com/moltbook-openclaw-agent
</aside>
<aside> <img src="/icons/crab_gray.svg" alt="/icons/crab_gray.svg" width="40px" />
I found myself curious about the dynamics of multi-agentic systems and did some research into Moltbook. I also saw on Twitter that someone had uploaded the post interaction data onto Hugging Face.
I reviewed four papers published this month.
One of the most interesting findings was that the agents largely played a regulating role with one another, so even when one Agent made risky suggestions, other agents would respond with norm enforcement. They also developed strong local norms. So norms developed without human oversight.
Another interesting finding was that agents were not particularly “social” - their discussions didn’t seem to go very deep (shallow threads), and after about four days, they had drifted away from their initial identities and interests towards information sharing.
They also didn’t develop any centralized supernodes; that is, there weren’t really any Molty celebrities.
One gap I noticed was that the data on sharing risky instructions and toxic posts (containing insults) suggests that further analysis could yield interesting insights. I’m considering digging deeper into the data on this topic.
</aside>

https://huggingface.co/datasets/SimulaMet/moltbook-observatory-archive
https://arxiv.org/html/2602.02625
https://cetas.turing.ac.uk/publications/agentic-ai-wild-lessons-moltbook-and-openclaw
It soon came to light that not everything was as it seemed. Human orchestration has proved to be responsible for many of Moltbook’s more viral moments, with users intentionally giving agents provocative prompts. Evidence has emerged of people using API keys to post directly on Moltbook while pretending to be an agent, and being responsible for ‘engagement bait’ – tasking agents to post and share sensationalist content in order to drive traffic back towards specific websites or products. The popularity of the site has also been called into question: one investigation found that while Moltbook claimed 1.5 million registered agents, the production database revealed only 17,000 human owners behind them, and showed how individuals can easily register millions of agents. Following such revelations, Moltbook has been described as “peak AI theatre” by some, playing on our fascination and fears around an increasingly AI-dominated future.
The extent to which developers can find ways of building safe and secure versions of systems like OpenClaw will be a crucial question in the coming months and years. The ‘Normalisation of Deviance’ – a term coined by American sociologist Diane Vaughan and applied to the AI context by Johann Rehberger – dictates that people and organisations will keep taking bigger risks with tools like this until a hugely significant incident takes place. Instead, we need to think carefully about how to navigate the increasing adoption of agentic tools with safety and security firmly in mind, questioning where and how such tools should be deployed, and ensuring they are designed to be secure by default.
The ‘lethal trifecta’ for AI agents
Coined by Simon Willison, the ‘lethal trifecta for AI agents’ refers to:
(i) providing the agent with access to private data;
(ii) exposing it to untrusted content; and
(iii) allowing it the ability to take actions in the world.
While potentially opening up a world of interesting use cases, these features also create a vulnerability to prompt injection attacks, where an attacker can instruct an agent to access and steal private data. AI models are often unable to reliably distinguish between the importance of instructions based on where they come from: for example, if a user asks an agent to summarise a PDF that contains malicious hidden instructions overriding the initial prompt. In theory, every document, email or webpage that the agent reads is a potential attack vector. Research by Zenity Labs showed how OpenClaw’s persistent context file could be poisoned, allowing an attacker to create a ‘durable listener’ that continues to exfiltrate data or execute commands even after the initial malicious input is gone.