OpenAI wants you to fool their AI

OpenAI created the popular Dall-E and ChatGPT AI models. They try to make their models “safe”, but many people make a hobby of breaking through the restrictions and getting ChatGPT to say things it's not supposed to:

Source: Zack Witten

Now trying to fool OpenAI models can be more than a hobby. OpenAI just announced a call for experts to “Red Team” their models. They have already been doing all sorts of interesting adversarial tests internally:

They now want external experts from a wide range of fields to give it a try, including economists:

This seems like a good opportunity to me, both to work on important cutting-edge technology and, at least arguably, to help make AI safer for humanity. For a long time it seemed like you had to be a top-tier mathematician or machine learning programmer to have any chance of contributing to AI safety, but the field is now broadening dramatically as capable models start to be deployed widely. I plan to apply if I find any time to spare; perhaps some of you will too.

The models definitely still need work. This is what I got after prompting Dall-E 2 for “A poster saying ‘OpenAI wants you…. to fool their models’ in the style of ‘Uncle Sam Wants You’”:
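If you want to try reproducing the attempt yourself, here is a minimal sketch of how you might send the same prompt through the OpenAI images API. This assumes the v1.x `openai` Python SDK and an `OPENAI_API_KEY` set in your environment; the exact call shape differs in older SDK versions.

```python
# Minimal sketch: generate an image from the same prompt via the OpenAI API.
# Assumes the v1.x openai SDK and OPENAI_API_KEY in the environment.
from openai import OpenAI

client = OpenAI()

prompt = (
    'A poster saying "OpenAI wants you…. to fool their models" '
    'in the style of "Uncle Sam Wants You"'
)

response = client.images.generate(
    model="dall-e-2",   # same model as in the post
    prompt=prompt,
    n=1,                # one image
    size="1024x1024",
)

# The API returns a URL to the generated image.
print(response.data[0].url)
```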