OpenAI’s latest model will block the ‘ignore all previous instructions’ loophole

Nemeski@lemm.ee · 4 months ago

OpenAI’s latest model will block the ‘ignore all previous instructions’ loophole

kometes@lemmy.world · 3 months ago

What happens if you make a mistake with your initial instructions?

Avatar_of_Self@lemmy.world · 3 months ago

You’d change the system prompt, just like now. If you mean in the session, I’m sure it’ll ignore your session’s prompt’s instructions as normal but if not, I guess you’d just start a new session prompt.

vxx@lemmy.world · edit-2 3 months ago

The “issue” is that people were able to override bots on twitter with that method and make them feed their own instructions.

I saw it first time being used on a Russian propaganda bot.