Safety
Misuse, alignment, deepfakes, and the hard problems of building AI that doesn't break things.
OpenAI releases public superalignment eval suite
A new benchmark to test whether frontier models help, hide from, or sabotage their evaluators.
Generative voice and video clones of candidates now require sworn human-actor disclosure on every paid spot.
A second-generation safeguard layer that filters both inputs and outputs without significant capability loss.