A new framework exposes vulnerabilities in language model safety evaluations through concept-specific manipulations.