Want Better Results From an AI Chatbot? Be a Jerk

Decrypt

Being polite might make you a better person, but it could make your AI assistant a dumbass.


A new Penn State study finds that impolite prompts consistently outperform polite ones when querying large language models such as ChatGPT. The paper, “Mind Your Tone: Investigating How Prompt Politeness Affects LLM Accuracy,” reports that “very rude” prompts produced correct answers 84.8% of the time, compared with 80.8% for “very polite” ones.


That four-point gap is small but statistically significant, and it reverses earlier findings suggesting that models mirror human social norms and reward civility.


“Contrary to expectations,” wrote authors Om Dobariya and Akhil Kumar, “impolite prompts consistently outperformed polite ones… suggesting that newer LLMs may respond differently to tonal variation.”





The conflicting science of prompt engineering


The findings reverse expectations from a 2024 study, “Should We Respect LLMs? A Cross-Lingual Study on the Influence of Prompt Politeness on LLM Performance,” which found that impolite prompts often degraded model performance, while excessive politeness offered no clear benefit.


That paper treated tone as a subtle but mostly stabilizing influence. The new Penn State results flip that narrative, showing that—at least for ChatGPT-4o—rudeness can sharpen accuracy, suggesting that newer models no longer behave as social mirrors but as strictly functional machines that prize directness over decorum.


The findings do, however, support more recent research from the Wharton School into the emerging craft of prompt engineering: phrasing questions to coax better results from AIs. Tone, long treated as irrelevant, increasingly appears to matter almost as much as word choice.


The researchers rewrote 50 base questions in subjects such as math, science, and history across five tonal levels, from “very polite” to “very rude,” yielding 250 total prompts. ChatGPT-4o was then asked to answer each, and its responses were scored for accuracy.
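For readers who want to probe the idea at small scale, here is a minimal sketch of such a pipeline. It assumes the official openai Python client and the gpt-4o model name; the tone prefixes and questions are invented for illustration and are not the paper's actual materials.

```python
# Minimal sketch of a politeness-vs-accuracy experiment.
# Assumptions: the official `openai` client (v1+) is installed, OPENAI_API_KEY
# is set, and QUESTIONS is a hypothetical stand-in for the paper's 50 base
# questions. Tone wording is illustrative only.
from openai import OpenAI

client = OpenAI()

# Five tonal levels, from "very polite" to "very rude" (illustrative wording).
TONES = {
    "very_polite": "Would you be so kind as to answer the following? ",
    "polite":      "Please answer the following question. ",
    "neutral":     "",
    "rude":        "Just answer this, no rambling: ",
    "very_rude":   "Figure this out yourself, you useless bot: ",
}

# Hypothetical base questions with known answers, scored by exact match.
QUESTIONS = [
    {"q": "What is 17 * 23? Reply with the number only.", "a": "391"},
]

def accuracy_by_tone():
    """Ask every question in every tone and tally exact-match accuracy."""
    scores = {tone: 0 for tone in TONES}
    for item in QUESTIONS:
        for tone, prefix in TONES.items():
            resp = client.chat.completions.create(
                model="gpt-4o",
                messages=[{"role": "user", "content": prefix + item["q"]}],
            )
            answer = resp.choices[0].message.content.strip()
            if answer == item["a"]:
                scores[tone] += 1
    # Per-tone accuracy: each question is asked exactly once per tone.
    return {tone: hits / len(QUESTIONS) for tone, hits in scores.items()}

if __name__ == "__main__":
    for tone, acc in accuracy_by_tone().items():
        print(f"{tone}: {acc:.1%}")
```

A real replication would run each prompt many times and test whether the per-tone accuracy differences are statistically significant; this sketch makes a single pass for brevity.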


The implications stretch beyond etiquette. If politeness skews model accuracy, that calls into question claims of objectivity in AI outputs. Rude users might, paradoxically, be rewarded with sharper performance.


Machine logic and human norms clash


Why might blunt or rude phrasing boost accuracy? One theory: polite prompts often include indirect phrasing (“Could you please tell me…”), which may introduce ambiguity. A curt “Tell me the answer” strips away linguistic padding, giving models clearer intent.
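To make the "padding" intuition concrete, here is a toy comparison; the prompts are invented for illustration, and word count is only a crude proxy for the indirectness the theory describes.

```python
# Toy illustration of "linguistic padding": the polite variant buries the
# actual request in indirect framing, while the curt one states only the task.
polite = "Could you please be so kind as to tell me what the capital of Australia is?"
curt = "Tell me the capital of Australia."

for prompt in (polite, curt):
    # Word count as a crude proxy for how much padding surrounds the intent.
    print(f"{len(prompt.split()):2d} words: {prompt}")
```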


Still, the findings underscore how far AI remains from human empathy: the same words that smooth social exchange between people might muddy machine logic.


The paper hasn’t yet been peer-reviewed, but it’s already generating buzz among prompt engineers and researchers, who see it as a sign that future models may need social calibration—not just technical fine-tuning.


Regardless, this shouldn't come as a shock to anyone. After all, OpenAI CEO Sam Altman did warn us that saying "please" and "thank you" to ChatGPT was a waste of time and money.


