It looks like somebody at PocketOS needs to be booted, and far. Their architecture was weird: the backups were on the same volume as the production database (????), everything was on the cloud with no local copy, the designer gave unfiltered control to the AI agent -- lots of dumbness. But the AI's "confession" seemed really weird. If the agent had rules, how did it ignore them? There's something odd here.
FWIW, we had databases too, some more mission critical than others. Depending on the "brand" of database (mysql/mariadb or postgres or mongodb or sqlite) we had different backup approaches, but the copy was always done by entirely different agents, and copies were kept on different servers in different buildings. I can't think of anything short of deliberate admin action on different machines that could damage both. The whole point of backup is to keep the data somewhere safely distant from problems on the original host. Ideally you'd like a copy that only a different admin can delete, just in case somebody goes postal.
It turns out the cloud provider here was able to provide a way to access the data after all, but that's not usually the case.
I think Pixie at AofSHQ made a reference to the PocketOS issues in one of her overnight briefs, and if I recall it correctly she summarized the situation this way: the database backups were periodic shadow copies of the disk volumes, and the AI agent didn't just delete the DB, it deleted the entire volume along with the shadow copies.
My take on the Anthropic situation was less that they didn't want the DOW using their software than that they wanted to retain the right for their agents to make "are you sure" human verifications in any system controlling weapons. The DOW position was that its own policies already required such verification, so there was no reason for Anthropic to include redundant ones. The reported behavior of various agents, such as the one you discussed, has always made me wonder why Anthropic was so sure their internal controls were actually going to be effective.
Yes, apparently PocketOS rented a virtual volume, and divided it into logical chunks, using one chunk for the production and another for backups (and presumably others for other purposes). Rather like using the same disk for the original and backups. It saves money to only rent one volume, but if something happens to that...
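For what it's worth, the "backups on the same disk as the original" mistake is easy to test for. Here is a minimal sketch in Python, assuming you can see both paths from one machine; the function names are made up for illustration and nothing here reflects how PocketOS was actually set up. It compares the device ID that `os.stat` reports for each path. It only catches the obvious case: two logical chunks carved out of one rented volume can still show up as different devices, so a pass here doesn't prove the copies are really independent.

```python
import os


def same_volume(path_a: str, path_b: str) -> bool:
    """Return True if both paths report the same device ID (st_dev)."""
    return os.stat(path_a).st_dev == os.stat(path_b).st_dev


def check_backup_separation(prod_path: str, backup_path: str) -> str:
    """Hypothetical helper: flag a backup that shares a device with production."""
    if same_volume(prod_path, backup_path):
        return "WARNING: backup is on the same device as production"
    # Different device IDs are necessary but not sufficient: both could
    # still be slices of one underlying physical or rented volume.
    return "OK: backup is on a different device (not proof of independence)"
```

You'd call something like `check_backup_separation("/var/lib/db", "/mnt/backup")` with your own paths (those two are placeholders). A copy on a different server in a different building, as described above, remains the real fix.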