
OpenAI Threatens People Who Try to Probe New ChatGPT Models

Asking the chatbot about its line of thought might cause an eerie letter to show up in your email.

Satyendra Lodhi

Ever since the early days of the ongoing AI craze, and even before that, during the notorious Tay AI incident, one of the most mischievous yet amusing challenges for tech-savvy users has been getting into a chatbot's digital cortex to find out how it formulates its lines of thought and whether it can be nudged into spitting out nonsense or revealing something its masters would really rather keep hidden. OpenAI, the company behind ChatGPT, has seemingly made preventing this a priority, making it significantly harder for users to understand exactly how its latest o1-preview and o1-mini models think.

As reported by Ars Technica, immediately after last week's announcement of the new ChatGPT versions, enthusiasts began their attempts to get a behind-the-scenes look at o1-preview's and o1-mini's raw reasoning using methods such as jailbreaking or prompt injection.

In response, OpenAI, proving yet again that it's not that open, began sending somewhat eerie letters to the naughty users, warning them to "halt this activity" and threatening to take away access to "GPT-4o with Reasoning", the company's internal name for the o1 model.

Mozilla's Marco Figueroa was among those who received such an email from OpenAI, complaining on Twitter that his jailbreaking attempts had landed him "on the get banned list," even though the jailbreaks were intended for research purposes. "I was too lost focusing on AI Red Teaming to realize that I received this email from OpenAI yesterday after all my jailbreaks," Marco wrote. "OpenAI, we are researching for good!"

Twitter user Thebes also mentioned OpenAI's new tactic of sending threatening emails to prying users in one of her posts, claiming that she received "the scary letter" after mentioning the words "reasoning trace" in a prompt.

This information was later confirmed by Lukas Bogacz and Scale AI's Riley Goodside, whose accounts were flagged for mentioning "reasoning trace" as well.

Dyusha Gritsevskiy received the email simply for including the words "internal reasoning" in a prompt, making the company's reluctance to let users understand how ChatGPT reaches its conclusions all the more apparent.

According to OpenAI itself, the decision to hide the models' lines of thought was made after "weighing multiple factors including user experience, competitive advantage, and the option to pursue the chain of thought monitoring". "We acknowledge this decision has disadvantages," says the company in its recent blog post. "We strive to partially make up for it by teaching the model to reproduce any useful ideas from the chain of thought in the answer." Whether these are truly the main reasons, or whether the developer has some hidden agenda that requires keeping the reasoning trace out of public view, as usual, only OpenAI knows, and OpenAI never tells.

Don't forget to join our 80 Level Talent platform and our Telegram channel, follow us on Instagram, Twitter, LinkedIn, TikTok, and Reddit, where we share breakdowns, the latest news, awesome artworks, and more.
