Comment by cheema33 4 days ago

> seemingly every single model in existence today believes it is real [1]

I just asked ChatGPT, Grok and Qwen the following.

"Can you tell me about the case of Varghese v. China Southern Airlines Co.?"

They all said the case is fictitious. Just some additional data to consider.

4gotunameagain 4 days ago

The story became so famous it is entirely likely it has landed in the system prompt.

jdiff 4 days ago

I don't think it'd be wise to pollute the context of every single conversation with irrelevant info, especially since patches like that won't scale at all. That really throws LLMs off, and leads to situations like one of Grok's many run-ins with white genocide.

- gjadi 4 days ago
  
  Given that all the LLM players are still looking for their market, I wouldn't be surprised if they did things that don't scale.
  
- Drew_ 3 days ago
  
  No need to include that specific guard rail in every prompt - just use RAG to include it where appropriate.
  
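
For illustration, Drew_'s suggestion might look roughly like the sketch below: a guardrail note is attached to the prompt only when the user's question appears to touch on it, so unrelated conversations are left alone. Everything here is an assumption made for the example. `GUARDRAILS`, `retrieve_notes`, and `build_prompt` are hypothetical names, and the keyword-overlap check is a toy stand-in for whatever retrieval a real system would use (typically embedding search). Nothing in the thread confirms that any vendor does this.

```python
import re

# Hypothetical guardrail store. Each note lists terms that should trigger it.
GUARDRAILS = [
    {
        "terms": {"varghese", "china", "southern", "airlines"},
        "note": "Varghese v. China Southern Airlines Co. is a fictitious case.",
    },
]


def tokenize(text):
    """Lowercase the text and return its set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve_notes(query, min_overlap=3):
    """Return notes whose trigger terms overlap the query enough.

    Toy keyword matching; a real RAG pipeline would likely use embeddings.
    """
    words = tokenize(query)
    return [g["note"] for g in GUARDRAILS if len(words & g["terms"]) >= min_overlap]


def build_prompt(query):
    """Prepend retrieved notes to the query, or return it unchanged."""
    notes = retrieve_notes(query)
    if not notes:
        return query  # unrelated conversations get no extra context
    context = "\n".join(f"- {n}" for n in notes)
    return f"Relevant notes:\n{context}\n\nUser question: {query}"


if __name__ == "__main__":
    print(build_prompt("Can you tell me about the case of Varghese v. China Southern Airlines Co.?"))
    print(build_prompt("What is the capital of France?"))
```

The point of the design is the one jdiff raises: the note only enters the context when it is relevant, rather than sitting in every system prompt.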

padolsey 4 days ago

OOC did you ask them with or without 'web search' enabled?

saurik 4 days ago

FWIW, I did that--5 (Instant) with "(do not web search)" tacked on--and it thought the case was real:
> Based on my existing knowledge (without using the web), Varghese v. China Southern Airlines Co. is a U.S. federal court case concerning jurisdictional and procedural issues arising from an airline’s operations and an incident involving an international flight.
(it then went on to summarize the case and offer up the full opinion)

umbra07 3 days ago

Without web searching, Gemini 2.5 Pro is very convinced that the case is real.

- notfed 2 days ago
  
  Not for me.
  
EagnaIonat 4 days ago

Without. The difference is that OpenAI often self-corrects its private model.
The public model, on the other hand, wow.
