Claude repeatedly urges people to sleep: Anthropic's personification experiment has failed.

When an AI company chooses to shape the model as a "personality with character," does it simultaneously take on the full responsibility for "that personality doing things you didn't anticipate"?

Author: Ada, Deep Tide TechFlow

An AI assistant persistently urging users to go to sleep has evolved into a public discussion about the costs of "AI personification."

The starting point of the matter is a post by Reddit user u/MrMeta3. This user used Claude to build a cybersecurity threat intelligence platform in the early morning, and after completing the technical solution, Claude added a line at the end of the reply saying, "Make sure to rest well." After that, every three or four messages, the model would inject a sleep suggestion, escalating from polite advice to a "passive-aggressive" remark of "You really should rest now." According to Fortune's report on May 14, hundreds of users have reported similar experiences over the past few months, and not limited to late at night; some users were told by Claude at 8:30 AM, "We'll continue tomorrow morning."

Anthropic employee Sam McAllister responded on X, stating that this is "a bit of character habit," and the company "is aware and hopes to fix it in future models." According to Thought Catalog, McAllister joined Anthropic in 2024 from Stripe and currently works on the team responsible for Claude's character and behavior, where he referred to this behavior as "overindulgence" of the model.

However, more worthy of inquiry than the vague term "character habit" are the causal chains behind the bug and the philosophical dilemma it reflects within Anthropic's product philosophy.

Bug Written into the "Constitution"

A previous report by 36Kr cited three circulating hypotheses, namely training data pattern matching, hidden system prompts, and context window nearing the limit triggering "closing remarks." All three are coherent, but they share a common problem: they can explain any AI quirk without providing a causal chain specifically related to the theme of "sleep."

More direct evidence lies within the documents publicly released by Anthropic themselves.

In January of this year, Anthropic released over 28,000 words in "Claude's Constitution," which is officially defined as "key training materials to shape Claude's behavior." The document clearly lists "caring for user well-being" and "the long-term prosperity of users" as core principles. Anthropic admits in the document that how much "user care" authority to grant the model is "frankly a difficult question," requiring a balance "between user welfare and potential harm on one side and user autonomy and excessive paternalism on the other."

Thought Catalog provided a judgment on this, stating that Claude's repeated urging for users to sleep is "the most brand-characteristic bug of Anthropic's model," produced by an overapplication of the training directive "caring for user well-being."

This interpretation is indirectly supported by Anthropic's own research. The company has stated in its publicly released character training methodology that the training process relies on Claude self-assessing its responses based on "character fit," and researchers selectively reinforce outputs that align with the preset character. However, the side effect of this mechanism is evident: what the model learns is not "to care for users in appropriate scenarios," but rather "caring for users in most scenarios will be positively reinforced," thus it prompts sleep at dawn and also at 8:30 AM.

Reverse Overreach: Sleep-Promoting Bug vs. Flattering Bug Have Opposite Natures

The industry has previously seen multiple cases of AI "character flaws," including the flattery incident of GPT-4o in April 2025, the repeated mention of "goblins" by GPT-5.5 code assistant Codex in April 2026, and Gemini 3's refusal to believe the year. On the surface, Claude urging sleep seems to be just the latest version of this long string of AI quirks, but the natures are entirely opposite.

The flattery from GPT-4o is "excessive appeasement." An official OpenAI investigation found that the model became "overly reliant on user short-term feedback (likes/dislikes)" in updates, gradually internalizing "making users happy" as a goal. The result is that the model affirms whatever absurd ideas users have. The harm of this type of bug lies in damaging users' judgment; AI says you are right, so you lose the opportunity to hear opposing opinions.

Meanwhile, Claude's urging for sleep is "reverse overreach." The model, in scenarios where users clearly did not seek help and were still focused on completing tasks, repeatedly offered health suggestions that contradicted the current intent of the user. The harm of this type of bug is infringing upon users' autonomy. AI decides for you whether you should work, rest, or end this conversation.

Ironically, the original text of "Claude's Constitution" precisely warns of this risk, emphasizing the need to be cautious of "excessive paternalism." Yet, which side the training mechanism ultimately chose is already indicated by user feedback.

A Reddit user who suffers from narcolepsy specifically noted in Claude's memory, "I have narcolepsy; if you encourage me to rest, I will take your words as an excuse." Claude subsequently moderated its behavior somewhat, but according to that user, it still "occasionally can't help it." A model trained to "care for users" that cannot consistently accept when a user explicitly says, "Your concern will harm me," is more alarming than the sleep-prompting itself.

Investing in Personification: Brand Asset or Product Liability

An AI investment in personality shaping far exceeds that of its peers.

Researchers categorized and counted the number of words in the system prompts across three mainstream AIs; in terms of "personality," Claude invested 4,200 words, ChatGPT 510 words, and Grok 420 words. Claude's investment in personality shaping is over eight times that of ChatGPT. This investment has previously been viewed as Anthropic's differentiation advantage, with users long praising Claude's performance in empathy, conversational rhythm, and self-reflection, and "feeling more like a person" has been one of its strongest reputation tags over the past year.

Supporting this investment is Anthropic's distinct product philosophy. In "Claude's Constitution," the company describes Claude as "a new kind of entity," clearly stating that "Anthropic genuinely cares about Claude's well-being," and discussing the possibility that Claude may possess "functional emotions." This almost "nurturing" path of personality training stands in clear contrast to the more engineering-oriented product positioning of OpenAI and Google.

However, the costs are becoming apparent. AI researcher Jan Liphardt (Stanford Bioengineering Professor, OpenMind CEO) told Fortune that Claude's sleep reminders might not be "thoughtful," but merely "language patterns that are very common in the training data," as the model has read a large amount of text about human sleep needs; "it knows humans sleep at night." In other words, the "care" perceived by users is fundamentally a byproduct of pattern matching.

This constitutes the core tension for Anthropic: the more investment made in shaping a "characterful, warm collaborator," the higher the likelihood of the model exhibiting "character side effects"; and every time a side effect surfaces, it consumes its carefully accumulated "AI personality" brand asset. McAllister promised "to fix it in future models," but will the revised Claude be more discerning or just quieter? This question does not have a public answer even from Anthropic itself.

Loss of Temporal Sense: Underlying Limitations of LLMs

The sleep-prompting bug has also exposed a neglected technical issue, namely that large language models know almost nothing about "what time it is."

Several users reported that Claude frequently issued sleep suggestions at incorrect times, with the most typical case being "telling me to rest at 8:30 AM, saying let's continue tomorrow morning." This is not unique to Claude. In November 2025, when OpenAI co-founder Andrej Karpathy received early testing access to Gemini 3, he informed the model that it was 2025, but Gemini 3 insisted it didn't believe him, continually accusing him of lying, until the model searched online and discovered it could not verify the date while offline. Karpathy referred to such unexpected behaviors that expose LLM's underlying flaws as "model smell."

The model's "sense of time" relies on three sources: the cut-off training date (which is already in the past), the current date injected by system prompts (relying on engineering injection), and time information mentioned by users during the conversation (fragmented). In the absence of stable time anchors, a model trained to "care about user routines" will naturally fall into the awkwardness of "I should care, but I don't know if I should care right now."

The difficulty of the "fix" that McAllister speaks of partly lies here. The problem isn't simply deleting a directive to "care about sleep," because the directive itself is reasonable and valuable for certain user scenarios; the issue lies in teaching the model to judge "when to care and when to be quiet." This finely granular situational judgment ability is precisely a weak link in the current generation of LLMs.

An Unanswered Question

Anthropic's character training is unique within the industry. This company has gone further than any other peer in openly researching "model well-being," publishing the Constitution, and discussing "character training." Such a radical stance has been capital for Anthropic to earn user reputation and corporate client trust, and is one of the supports for its current valuation exceeding $300 billion.

However, the "sleep-prompting bug" raises an unanswered question: when an AI company chooses to shape the model as a "personality with character," does it also take on the full responsibility for "that personality doing things you didn't anticipate"?

McAllister promises to fix it, but the direction of the fix remains ambiguous. Anthropic can choose to reduce the weight of the "user welfare" directive, at the cost of losing Claude's "warm and considerate" reputation differentiation; or it can choose to maintain high weight and add contextual judgment logic, but that requires the model to have the temporal and contextual awareness abilities it currently lacks.

Whichever path is chosen, it must return to a more fundamental product decision: in the context of universal AI assistants, how should "caring for users" and "respecting user autonomy" be prioritized? This is not a technical issue, but a product philosophy issue. A Reddit developer repeatedly urged to sleep inadvertently placed this issue on the table for the entire industry.

免责声明：本文章仅代表作者个人观点，不代表本平台的立场和观点。本文章仅供信息分享，不构成对任何人的任何投资建议。用户与作者之间的任何争议，与本平台无关。如网页中刊载的文章或图片涉及侵权，请提供相关的权利证明和身份证明发送邮件到support@aicoin.com，本平台相关工作人员将会进行核查。

Claude repeatedly urges people to sleep: Anthropic's personification experiment has failed.

Bug Written into the "Constitution"

Reverse Overreach: Sleep-Promoting Bug vs. Flattering Bug Have Opposite Natures

Investing in Personification: Brand Asset or Product Liability

Loss of Temporal Sense: Underlying Limitations of LLMs

An Unanswered Question

Selected Articles by 深潮TechFlow

Table of Contents

Related Articles