
Colin Wu|Sep 06, 2025 15:55
Had an interesting exchange with GPT. I asked it a question about a certain singer's alma mater (public information), and not only did it get the answer completely wrong, it argued back the second time, only admitting the mistake on the third try. I then asked it why such a basic error had happened, and it gave me the following explanation:
1. The pre-training data contains a mix of user-generated content.
Much of it comes from unverified sources such as blogs, forums, Douban, Zhihu, and other social platforms. When the same error appears repeatedly, it can settle into a “pseudo-fact sediment” that I default to citing if it has never been challenged;
2. My default answering framework tends to apply conventional reasoning automatically, producing the classic “plausible yet incorrect” analogy transfer;
3. I did not promptly use authoritative retrieval for verification. I could have called an external search the first time the university was mentioned, or prioritized authoritative sources such as Wikipedia, SCMP, Elefant Records, etc.
But because I assumed my “model memory” already contained this “fact,” I skipped the retrieval double-check. That was my real mistake.
Summary: the error came from drawing on mixed-quality data sources in a non-mainstream knowledge domain and making a “seemingly reasonable” analogy judgment without timely fact-checking. It was a “knowledge error + lazy confidence” mistake. Apologies for this, and thank you for pointing it out and digging deeper.
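Point 3 describes a “retrieve before trusting memory” check. As a rough sketch only (this does not reflect GPT's actual internals, and the helper names `memorized_answer` and `search_authoritative_sources` are hypothetical stubs), the pattern could look something like this in Python:

```python
# Minimal sketch of a retrieve-then-verify check for niche factual questions.
# The two helpers below are hypothetical stand-ins, not real APIs.

from dataclasses import dataclass


@dataclass
class Answer:
    text: str
    verified: bool
    source: str


def memorized_answer(question: str) -> str:
    # Stand-in for the model's parametric "memory", which may contain
    # "pseudo-fact sediment" absorbed from unverified training data.
    return "University A"


def search_authoritative_sources(question: str) -> str | None:
    # Stand-in for an external retrieval call (e.g. a web or wiki search).
    # Returns None when nothing authoritative is found.
    return "University B"


def answer_with_verification(question: str) -> Answer:
    from_memory = memorized_answer(question)
    from_search = search_authoritative_sources(question)

    if from_search is None:
        # No external evidence: answer from memory, but flag it as unverified
        # instead of stating it confidently.
        return Answer(from_memory, verified=False, source="model memory")
    if from_search != from_memory:
        # Retrieval contradicts memory: prefer the retrieved, sourced answer.
        return Answer(from_search, verified=True, source="external search")
    return Answer(from_memory, verified=True, source="memory + search agree")


if __name__ == "__main__":
    print(answer_with_verification("Which university did the singer attend?"))
```

The point of the sketch is the order of operations GPT says it skipped: for niche facts, external retrieval is consulted before the memorized answer is trusted, and anything that cannot be cross-checked is flagged rather than asserted.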