New research shows how fragile AI safety training is. Language and image models can be easily unaligned by prompts. Models need to be safety tested post-deployment. Model alignment refers to whether ...