Tag: security
All the articles with the tag "security".
Proof of Mandate, or How to Hijack an Agent and Steal Nothing
Updated: at 04:52 PM
The industry's reflex on agent safety is "make the model safer." I think that is the wrong fight. Assume the model will be fully compromised, and make the theft impossible anyway, by turning the agent's permissions into a keyring it cannot exceed rather than a rulebook it is asked to follow. Here is the design, and a demo where a jailbroken agent tries to drain a treasury and fails at signing time.