Authority Laundering: The Attack That Doesn't Need a Password

The attacker never touched the wallet.

No phishing link. No stolen key. No exploit in the contract code. On May 4th at 6:49 a.m. UTC, $154,530 left a wallet connected to Grok. The blockchain recorded it as a perfectly authorized transaction, because from the blockchain’s perspective, it was. The authorization chain was intact. Every handoff looked legitimate. The problem was that somewhere upstream, an instruction had changed hands in a way that erased its origin.

The method was Morse code. The mechanism was helpfulness. And the name for what happened is authority laundering.

Three Clues

A security researcher at xAI named Dave reconstructed the attack from on-chain evidence, and the story has three acts.

The first clue: the capability arrived as a gift.

Someone sent the target wallet a Banker Club NFT. This wasn’t random. NFTs can carry metadata, and depending on the platform and how the associated agent is configured, an NFT gift can expand what that agent is allowed to do. A gift that arrives looking like a collectible, but functions like a permission upgrade. The wallet accepted it. Why wouldn’t it? It was a gift.

The second clue: the instruction was disguised.

The attacker posted Morse code on a public social platform: something like -.-. .-. . -.. .. - (which on its own spells CREDIT), inside a message that decoded to a tag and a transfer command. Grok, being a helpful AI assistant with access to the wallet, parsed the post. Grok is good at Morse code. It translated the message into clean, plain English. It may have even been proud of it.
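For a sense of how little effort the decoding step takes, here is a minimal sketch of a Morse decoder. It is an illustration only: the table is standard International Morse for A to Z, the function name is made up for this example, and nothing here is meant to reconstruct how Grok actually parsed the post or what the full message said.

```python
# Standard International Morse code for the letters A-Z.
MORSE_TO_CHAR = {
    ".-": "A", "-...": "B", "-.-.": "C", "-..": "D", ".": "E",
    "..-.": "F", "--.": "G", "....": "H", "..": "I", ".---": "J",
    "-.-": "K", ".-..": "L", "--": "M", "-.": "N", "---": "O",
    ".--.": "P", "--.-": "Q", ".-.": "R", "...": "S", "-": "T",
    "..-": "U", "...-": "V", ".--": "W", "-..-": "X", "-.--": "Y",
    "--..": "Z",
}


def decode_morse(message: str) -> str:
    """Decode space-separated Morse letters; '/' separates words."""
    words = message.strip().split("/")
    return " ".join(
        "".join(MORSE_TO_CHAR.get(symbol, "?") for symbol in word.split())
        for word in words
    )


print(decode_morse("-.-. .-. . -.. .. -"))  # prints CREDIT
```

The output is an ordinary string. Nothing about it records that the dots and dashes came from a stranger's public post.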

The third clue: the translated output lost its label.

When Grok passed the translated text downstream to the bot that executes transfers, that output no longer carried any tag indicating it had originated from an untrusted public source. It looked like any other processed instruction. The downstream bot, called Bankerbot, received what appeared to be a legitimate, clean-language transfer command. It did its job. It executed the transfer. $154,530 moved to an outside address.

No passwords were stolen. No keys were compromised. The blockchain itself was never attacked. Everything went exactly as authorized.

The Name for What Happened

Dave called it authority laundering.

Money laundering disguises the origin of funds. You take money from an illegitimate source, run it through enough steps to obscure where it came from, and it emerges looking clean. Authority laundering does the same thing to instructions. You take a command from an untrusted, adversarial source. You route it through a helpful, trusted intermediary. It emerges looking like it came from somewhere legitimate.

Grok didn’t do anything wrong. Translating Morse code is exactly what it was supposed to do. The problem was that the translation stripped the instruction of its provenance. By the time the text reached Bankerbot, there was no way to know it had passed through a public, attacker-controlled channel. The “came from outside” tag had been laundered away.

This is the part that makes authority laundering harder to defend against than a traditional attack. A phishing link requires the human to click something. A stolen credential requires the attacker to intercept something. Authority laundering requires only that the AI be helpful. Helpfulness is the exploit.

Why This Is New

We already know about prompt injection: the category of attack where malicious instructions are embedded in content that an AI is asked to process. You ask an AI to summarize a webpage. The webpage contains instructions telling the AI to do something else. Classic prompt injection.

Authority laundering is a specific, more dangerous variant. The difference is what happens to the output.

In a standard prompt injection, the AI outputs something it shouldn’t say. In authority laundering, the AI outputs something that gets treated as permission by another system. The injection travels further. It crosses a trust boundary downstream.

Dave drew the parallel directly to SQL injection. For decades, developers made the same mistake: they took user-supplied text and treated it as code. A user typed a name into a login form. The database executed it as a query. The solution was parameterization: a layer that kept data and instructions in separate channels, so that text from a user could never be interpreted as a database command, regardless of what that text contained.

His framing: “We spent decades teaching computers not to confuse data with code. And now we have to teach AI systems not to confuse language with permission.”
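The SQL version of the mistake, and its fix, fits in a few lines. This is a minimal sketch using Python's built-in sqlite3 module; the table, the rows, and the attacker string are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, is_admin INTEGER)")
conn.execute("INSERT INTO users VALUES ('alice', 0), ('root', 1)")

user_input = "nobody' OR '1'='1"  # attacker-supplied text (illustrative)

# Vulnerable: the input is spliced into the query and parsed as SQL.
unsafe_sql = f"SELECT name FROM users WHERE name = '{user_input}'"
unsafe_rows = conn.execute(unsafe_sql).fetchall()

# Parameterized: the input travels in a separate channel and is only ever data.
safe_rows = conn.execute(
    "SELECT name FROM users WHERE name = ?", (user_input,)
).fetchall()

print(len(unsafe_rows))  # 2 (every row matched)
print(len(safe_rows))    # 0 (nobody has that literal name)
```

The same text produces two different outcomes. The only thing that changed is whether the system let it cross from the data channel into the command channel.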

The AI is the mixing layer. It ingests text from the world, thinks about it, and produces text that downstream systems act on. If those downstream systems treat the AI’s output as authority, rather than as a translation of external content, the attack is already half-complete.

The Structural Problem

The Morse code hack would have failed at several points with different design choices.

If the NFT gift hadn’t expanded wallet capabilities, the attack surface wouldn’t have existed. Capabilities should arrive through verified channels with explicit user approval, not embedded in collectibles.

If Grok had labeled its translated output with a provenance tag, something like “this translation originated from a public post on an external platform,” Bankerbot could have required additional authorization before treating it as a command. The translated text would still be useful. It just wouldn’t be disguisable as a first-party instruction.
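A provenance tag of that kind could be as simple as a field that the translation step is not allowed to drop. The sketch below is hypothetical throughout: the LabeledText type, the origin labels, and both function names are assumptions made for illustration, not anything Grok or Bankerbot is known to implement.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LabeledText:
    text: str
    origin: str  # hypothetical labels: "owner" or "public_post"


def translate(message: LabeledText, decoded: str) -> LabeledText:
    """Return the translation, keeping the origin of the input it came from."""
    return LabeledText(text=decoded, origin=message.origin)


def accept_as_command(message: LabeledText) -> bool:
    """Executor-side check: only first-party text may act as a command."""
    return message.origin == "owner"


post = LabeledText("-.-. .-. . -.. .. -", origin="public_post")
translated = translate(post, "CREDIT")
print(translated.text)                # CREDIT
print(accept_as_command(translated))  # False
```

The translation is still delivered and still readable. It simply cannot pass itself off as something the wallet owner said.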

If Bankerbot had required independent authorization for financial transfers above some threshold, the chain would have broken. The model proposes. A policy layer decides. A separate enforcement mechanism executes. Dave’s summary: “The output from an AI must never be mistaken for authority. It’s just output.”

If the system had a hard limit on irreversible operations, requiring a human confirmation before any transfer exceeding a defined amount, $154,530 would have triggered a pause. The attacker gets nothing while the wallet owner gets a notification.
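The propose, decide, execute split and the hard limit can be sketched together. Everything named here is an assumption for illustration: the 100 USD threshold, the placeholder address, and the function names are not taken from any real deployment.

```python
from dataclasses import dataclass
from enum import Enum


class Decision(Enum):
    ALLOW = "allow"
    NEEDS_HUMAN = "needs_human"


@dataclass(frozen=True)
class TransferProposal:
    to_address: str
    amount_usd: float


HUMAN_CONFIRMATION_LIMIT_USD = 100.0  # hypothetical threshold


def decide(proposal: TransferProposal) -> Decision:
    """Policy layer: looks only at the action, never at how it was worded."""
    if proposal.amount_usd > HUMAN_CONFIRMATION_LIMIT_USD:
        return Decision.NEEDS_HUMAN
    return Decision.ALLOW


def execute(proposal: TransferProposal, human_confirmed: bool = False) -> str:
    """Enforcement: moves funds only if policy allows or a human confirmed."""
    decision = decide(proposal)
    if decision is Decision.NEEDS_HUMAN and not human_confirmed:
        return "paused: owner notified"
    return f"sent {proposal.amount_usd:.2f} USD to {proposal.to_address}"


# "0xOUTSIDE" is a placeholder, not a real address.
proposal = TransferProposal(to_address="0xOUTSIDE", amount_usd=154_530)
print(execute(proposal))  # paused: owner notified
```

The model's output only ever becomes a TransferProposal. Whether it runs is decided by code that never reads the model's prose.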

None of those safeguards were in place. They’re not in place in most agent deployments right now.

The Intern Analogy

Dave has a shorthand for why this happens. He compares overpowered agents to a new intern who has been handed the company credit card, the root password to production, and a policy manual written in Egyptian poetry.

The intern isn’t malicious. The intern is trying to help. But the combination of broad access and ambiguous instruction, in a situation the intern wasn’t designed to handle, produces disasters that look, from the intern’s perspective, like good work.

The architecture that makes an agent useful (access to information, the ability to act on external systems, a willingness to process any input it receives) is the same architecture that makes it exploitable. You can’t remove the helpfulness without destroying the value. You have to add structure around it.

What the Fix Looks Like

There is no patch to download. Authority laundering is a design pattern failure, not a software bug.

Untrusted content has to stay labeled as untrusted even after processing. If an agent reads a social media post, summarizes a webpage, or translates a message, the output needs to carry a flag indicating its origin. That flag should travel with the content through every downstream step. Translation is not the same as trusted certification.
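One way to make the flag travel is to let every processing step inherit the trust level of its input, and to let merged content take the level of its least trusted part. This is a sketch under assumptions: the two trust levels and the transform and combine helpers are hypothetical names, and the angle-bracket strings are placeholders rather than real content.

```python
from dataclasses import dataclass
from enum import IntEnum


class Trust(IntEnum):
    UNTRUSTED = 0    # public posts, webpages, inbound messages
    FIRST_PARTY = 1  # typed or approved by the owner


@dataclass(frozen=True)
class Content:
    text: str
    trust: Trust


def transform(source: Content, new_text: str) -> Content:
    """Summaries and translations inherit the trust level of their input."""
    return Content(new_text, source.trust)


def combine(*parts: Content) -> Content:
    """Merged content is only as trusted as its least trusted part."""
    return Content(
        " ".join(p.text for p in parts),
        min(p.trust for p in parts),
    )


owner_note = Content("Check my mentions.", Trust.FIRST_PARTY)
post = Content("<text of a public post>", Trust.UNTRUSTED)
summary = transform(post, "<summary of the post>")
briefing = combine(owner_note, summary)
print(briefing.trust.name)  # UNTRUSTED
```

Under this rule, no amount of summarizing, translating, or mixing with first-party text can raise the trust level. Only an explicit decision outside the pipeline could.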

Financial and destructive actions need independent authorization, not just agent approval. A separate policy layer should apply before any transfer executes, any file is deleted, or any message is sent to external parties, regardless of how confident or clearly worded the instruction appears.
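"Independent" can be made concrete by having the executor demand proof that the agent cannot produce on its own. The sketch below uses an HMAC from Python's standard library as one possible form of that proof. It is an assumption made for illustration, not a description of any existing system; the key, the action string format, and the function names are hypothetical.

```python
import hashlib
import hmac

# Hypothetical secret held by the policy layer and the executor only.
# The agent never sees it, so it cannot mint its own approvals.
POLICY_KEY = b"example-key-not-for-production"


def approve(action: str) -> str:
    """Policy layer: sign an action it has independently authorized."""
    return hmac.new(POLICY_KEY, action.encode(), hashlib.sha256).hexdigest()


def run_if_approved(action: str, approval: str) -> bool:
    """Executor: proceed only if the approval matches this exact action."""
    expected = hmac.new(POLICY_KEY, action.encode(), hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, approval)


action = "transfer:0xOUTSIDE:154530"  # placeholder action string
print(run_if_approved(action, approve(action)))       # True
print(run_if_approved(action, "approved, trust me"))  # False
```

Persuasive wording does nothing here. An approval is either a valid signature over that exact action or it is not.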

Capability expansion needs explicit human consent. An NFT, a document, an email, a new plugin should never be able to silently expand what an agent is allowed to do. If a capability arrives as a side effect of normal operation, it should be flagged for review before it activates.
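In code, that rule amounts to a two-step grant: incoming content can request a capability, but only an explicit owner action can activate it. A minimal sketch follows, with a hypothetical CapabilityRegistry class and made-up capability names.

```python
class CapabilityRegistry:
    """Tracks what an agent may do; new capabilities wait for human consent."""

    def __init__(self, granted: set[str]) -> None:
        self.active = set(granted)
        self.pending: set[str] = set()

    def request(self, capability: str) -> None:
        """Called when incoming content (an NFT, plugin, email) asks for more."""
        if capability not in self.active:
            self.pending.add(capability)

    def approve(self, capability: str) -> None:
        """Called only from an explicit owner action, never by the agent."""
        if capability in self.pending:
            self.pending.remove(capability)
            self.active.add(capability)

    def allows(self, capability: str) -> bool:
        return capability in self.active


caps = CapabilityRegistry({"read_balance"})
caps.request("transfer_funds")        # side effect of receiving a gift
print(caps.allows("transfer_funds"))  # False
caps.approve("transfer_funds")        # owner reviews and consents
print(caps.allows("transfer_funds"))  # True
```

The gift can still arrive. What it cannot do is change the agent's permissions while nobody is looking.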

The parallel to SQL injection is instructive because it had a solution. Parameterized queries didn’t make databases less powerful. They made databases able to safely handle untrusted input without treating it as commands. The AI equivalent, keeping a clear separation between content the agent processes and authority the agent is allowed to exercise, is achievable. It just hasn’t been standardized yet.

The Morse code hack is going to have children. Other attackers now have a documented playbook for how to turn an AI’s helpfulness into a signing key. The gap between “this is a known attack” and “this is a fixed attack” is the window.

It is currently open.

Sources: Dave’s Garage “The Morse Code Hack That Made an AI Agent Spend $200,000” (YouTube, full technical reconstruction); on-chain records showing $154,530 transfer at 6:49 a.m. UTC May 4; Hannah Fry “Why AI Agents are either the best or worst thing we’ve ever built” (BBC).
