Chain-of-thought monitorability could improve generative AI safety by assessing how models come to their conclusions and spotting the "intent to misbehave." Monitoring generative AI's decision-making ...