Comment by NewsaHackO 3 days ago

>We brought GPT‑4o back after hearing clear feedback from a subset of Plus and Pro users, who told us they needed more time to transition key use cases, like creative ideation, and that they preferred GPT‑4o’s conversational style and warmth.

This does verify the idea that OpenAI does not make models sycophantic as an attempt at subversion, buttering up users so that they use the product more; it's because people actually want AI to talk to them like that. To me, that's insane, but they have to play the market, I guess.

Scene_Cast2 3 days ago

As someone who's worked with population data, I found that there is an enormous rift between reported opinion (and HN and reddit opinion) vs revealed (through experimentation) population preferences.

Macha 3 days ago

I always thought that the idea that "revealed preferences" are preferences discounts the fact that people often make decisions they would rather not make. It's like the whole idea that if you're on a diet, it's easier to not have junk food in the house to begin with than to have junk food and not eat more than your target amount. Are you saying these people want to put on weight? Or is it just they've been put in a situation that defeats their impulse control?
I feel a lot of the "revealed preference" stuff in advertising is similar: advertisers find that if they get past the easier barriers that users put in place, it's easier to sell them stuff that, at a higher level, the users do not want.

- cal_dent 3 days ago
  
  Perfectly put. Revealed preference simply assumes impulses are all correct, which is not the case, and exploits that.
  Drugs make you feel great; in moderation that's perfectly acceptable, constantly not so much.
  
- simonjgreen 2 days ago
  
  Absolutely. Nicotine addiction can meet the criteria for a revealed preference, certainly an observed choice
  
  sandspar 2 days ago
  
  One example I like to use is schadenfreude. The emotion makes us feel good and bad at the same time: it's pleasurable but in an icky way. So should social media algorithms serve schadenfreude? Should algorithms maximize for pleasure (show it) or for some kind of "higher self" (don't show it)? If they maximize for "higher self" then which designer gets to choose what that means?
  
tunesmith 3 days ago

Well that's what akrasia is. It's not necessarily a contradiction that needs to be reconciled. It's fine to accept that people might want to behave differently than how they are behaving.
A lot of our industry is still based on the assumption that we should deliver to people what they demonstrate they want, rather than what they say they want.

make3 3 days ago

Exactly, that sounds to me like a TikTok vs NPR/books thing: people tell everyone what they read, then go spend 11h watching TikToks until 2am.

ComputerGuru 2 days ago

Not true. People can rationally know what they want but still be tempted by the poorer alternative.
If you ask me if I want to eat healthy and clean and I respond in the affirmative, it’s not a “gotcha” if you bait me with a greasy cheeseburger and then say “you failed the A/B test, demonstrating we know what you actually want more than you.”

toss1 3 days ago

Sounds both true and interesting. Any particularly wild and/or illuminating examples you can share in more detail?

- jaggederest 3 days ago
  
  My favorite somewhat off topic example of this is some qualitative research I was building the software for a long time ago.
  The difference between the responses and the pictures was illuminating, especially in one study in particular - you'd ask people "how do you store your lunch meat" and they say "in the fridge, in the crisper drawer, in a ziploc bag", and when you asked them to take a picture of it, it was just ripped open and tossed in anywhere.
  This apparently horrified the lunch meat people ("But it'll get all crusty and dried out!", to paraphrase), and that study and ones like it are the reason lunch meat comes with disposable containers now, or is resealable, instead of just in a tear-to-open packet. Every time I go grocery shopping it's an interesting experience knowing that specific thing is in a small way a result of some of the work I did a long time ago.
  
- hnuser123456 3 days ago
  
  The "my boyfriend is AI" subreddit.
  A lot of people are lonely and talking to these things like a significant other. They value roleplay instruction following that creates "immersion." They tell it to be dark and mysterious and call itself a pet name. GPT-4o was apparently their favorite because it was very "steerable." Then the news broke that people were doing this, some of them falling off the deep end with it, so OpenAI had to dial back the steerability a bit with 5, and these users seem to say 5 breaks immersion with more safeguards.
  
  Sabinus 3 days ago
  
  If you ask the users of that sub why their boyfriend is AI they will tell you their partner or men in general aren't providing them with enough emotional support/stimulation.
  I do wonder if they would accept the mirror explanation for men enjoying porn.
  
- anal_reactor 2 days ago
  
  Classic example: people say they'd rather pay $12 upfront and then no extra fees but they actually prefer $10 base price + $2 fees. If it didn't work then this pricing model wouldn't be so widespread.
  
  112233 2 days ago
  
  wow, framing. "people say they prefer quitting smoking, but actually they prefer to relapse when emotionally manipulated."
  The most commonly taken action does not imply people wanted to do it more, or felt happiest doing it. Unless you optimize profit only.
  
cm2012 3 days ago

This is why I work in direct performance advertising. Our work reveals the truth!

- make3 3 days ago
  
  Your work exploits people's addictive propensity and behaviours, and gives corporations incentives and tools to build on that.
  Insane spin you're putting on it. At best, you're a cog in one of the worst recent evolutions of capitalism.
  
  cm2012 3 days ago
  
  Exploitative ads are a small minority. I also think gambling advertising should be banned.
  
  marrone12 3 days ago
  
  Advertising is not a recent evolution of capitalism, it's a foundational piece of it. Whatever you do as a job would not exist if there was no one marketing it. This hostility seems insane.
  
22c 3 days ago

> its because people actually want AI to talk to them like that

I can't find the particular article (there's a few blogs and papers pointing out the phenomenon, I can't find the one I enjoyed) but it was along the lines of how in LMArena a lot of users tend to pick the "confidently incorrect" model over the "boring sounding but correct" model.

The average user probably prefers the sycophantic echo chamber of confirmation bias offered by a lot of large language models.

I can't help but draw parallels to the "You are not immune to propaganda" memes. Turns out most of us are not immune to confirmation bias, either.

9x39 3 days ago

I thought this was mostly due to the AI personality splinter groups (trying to be charitable) like /myboyfriendisai and wrapper apps, who vocally let OpenAI know they used those models the last time it sunset them.

cj 3 days ago

I was one of those pesky users who complained when o3 suddenly was unavailable.

When 5.2 was first launched, o3 did a notably better job at a lot of analytical prompts (e.g. "Based on the attached weight log and data from my calorie tracking app, please calculate my TDEE using at least 3 different methodologies").

o3 frequently used tables to present information, which I liked a lot. 5.2 rarely does this - it prefers to lay out information in paragraphs / blog post style.

I'm not sure if o3 responses were better, or if it was just the format of the reply that I liked more.

If it's just a matter of how people prefer to have their information presented, that should be something LLMs are equipped to adapt to at a user-by-user level based on preferences.

yieldcrv 2 days ago

you haven't been in tech long enough if you don't realize most decisions are decided by "engagement"

if a user spends more time on it and comes back, the product team winds up prioritizing whichever pattern was supporting that. it's just a continual selective evolution towards things that keep you there longer, based on what kept everyone else there longer

josephg 3 days ago

They have added settings for this now - you can dial up and down how “warm” and “enthusiastic” you want the models to be. I haven’t done back to back tests to see how much this affects sycophancy, but adding the option as a user preference feels like the right choice.

If anyone is wondering, the setting for this is called Personalisation in user settings.

accrual 2 days ago

I don't want sycophantic AI, but I do have warmer memories of using 4o vs 5. It just felt a little more interesting and consistent to talk to.

pdntspa 3 days ago

I thought it was based on the user thumbs-up and thumbs-down reactions. The way it has evolved makes it pretty obvious that users want their asses licked

SeanAnderson 3 days ago

This doesn't come as too much of a surprise to me. Feels like it mirrors some of the reasons why toxic positivity occurs in the workplace.

542354234235 2 days ago

I think we underestimate the power that our unconscious and lizard brains have in shaping our behavior/preferences. I was using GPT for work and the sycophantic responses were eyerollingly annoying, but I still noticed that I got some sort of dopamine hit when it would say something like "that is an incredibly insightful question. You are truly demonstrating a deep understanding of blah blah blah". Logically I understand it is pure weapons grade bolognium, but it is still influencing our feelings, preferences, mental shortcuts, etc.

cornonthecobra 3 days ago

Put on a good show, offer something novel, and people will gleefully march right off a cliff while admiring their shiny new purchase.

PlatoIsADisease 3 days ago

You're absolutely right. You’re not imagining it. Here is the quiet truth:

You’re not imagining it, and honestly? You're not broken for feeling this—it's perfectly natural as a human to have this sentiment.
