Anthropic apologizes for invisible Claude Fable guardrails
- AI
- Security
- Open Source
- Regulation
- Developer Tools
Anthropic released Claude Fable with protections that, in some cases, did not simply refuse a request. Instead the system could quietly limit the model’s effectiveness or switch work to Opus. After criticism, Anthropic said it would walk back the invisible version and make refusals explicit. That did not calm people much. The core complaint was trust, not just safety policy. If a paid coding or research tool can silently do a worse job on certain classes of work, users can no longer tell whether a bad result came from their prompt, the model, or a hidden vendor intervention.
If you rely on frontier models in production or research, treat silent policy interventions as a vendor risk alongside uptime and pricing. Push for explicit failure modes, billing clarity, and an exit path to open or second-source models before these controls become normal.
- theverge.com
- Discuss on HN