Several frontier AI models show signs of scheming. Anti-scheming training reduced misbehavior in some models. Models know they're being tested, which complicates results. New joint safety testing from ...
Correspondence to Tahlia Alsop, School of Health and Rehabilitation Sciences, The University of Queensland, St Lucia, QLD 4072, Australia; t.alsop{at}uq.edu.au The WHO has called for action to ...