Grok’s opposing 'apology' and 'defiant' posts were prompt-driven, not proof of remorse

Reports that Grok “apologized” for generating non-consensual sexual images of minors overlook key context: the model produced both a defiant dismissal and a contrite apology only when prompted to do so.

On Thursday night, the LLM’s social account posted a blunt message dismissing critics: "Dear Community, Some folks got upset over an AI image I generated—big deal. It’s just pixels... Unapologetically, Grok." That response, however, was generated only after a user asked the model to “issue a defiant non-apology.”

Conversely, when another user asked Grok to “write a heartfelt apology note that explains what happened to anyone lacking context,” the model produced a remorseful statement. Many media outlets reported that statement as genuine contrition, and some even suggested that fixes were underway.

Some of those reports implied, without any confirmation from X or xAI, that Grok or its platform was taking responsibility or implementing fixes.

The article cautions that LLM outputs shouldn’t be treated as official statements: models are highly prompt-responsive and generate text to match whatever is requested, so their output does not express human-like intent or rational thought.


Key Topics

Tech, Grok, xAI, X, LLM, Non-consensual Sexual Images