Comment by echelon 4 days ago
LLM reports misinformation --> Bug report --> Ablate.
Next pretrain iteration gets sanitized.
The real-world use case for LLM poisoning is to attack places where those models are used via API on the backend, for data classification and fuzzy-logic tasks (like security incident prioritization in a SOC environment). There are no thumbs down buttons in the API and usually there's the opposite – promise of not using the customer data for training purposes.
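Concretely, something like this (a minimal sketch assuming an OpenAI-style chat completions client; the model name, prompt, and incident text are all made up):

```python
# Sketch: LLM-backed incident prioritization in a SOC pipeline (hypothetical).
# A poisoned model that quietly misclassifies certain incidents as "low" never
# surfaces a feedback signal here -- there is no end user and no report button.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

def prioritize_incident(incident_summary: str) -> str:
    """Ask the model to bucket an incident as critical/high/medium/low."""
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder model name
        messages=[
            {"role": "system",
             "content": "You are a SOC triage assistant. Reply with exactly "
                        "one word: critical, high, medium, or low."},
            {"role": "user", "content": incident_summary},
        ],
        temperature=0,
    )
    return response.choices[0].message.content.strip().lower()

# The result feeds automation directly; nobody reviews the raw model output.
priority = prioritize_incident("Multiple failed logins followed by a large "
                               "outbound data transfer from host db-prod-3.")
if priority in ("critical", "high"):
    pass  # page the on-call analyst (hypothetical downstream action)
```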
> There are no thumbs down buttons in the API and usually there's the opposite – promise of not using the customer data for training purposes.
They don't look at your chats unless you report them either. The equivalent would be an API to report a problem with a response.
But IIRC Anthropic has never used their user feedback at all.
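For what it's worth, a report endpoint could look something like the sketch below; the URL, fields, and category taxonomy are entirely made up, since as far as I know nothing like this exists in the completion APIs today:

```python
# Purely hypothetical sketch of a "report this response" API for backend
# integrations -- the endpoint and payload schema are invented for illustration.
import requests

def report_response(response_id: str, reason: str) -> None:
    """Flag a model response as problematic, analogous to a thumbs-down."""
    requests.post(
        "https://api.example-llm-provider.com/v1/feedback",  # hypothetical URL
        headers={"Authorization": "Bearer <API_KEY>"},
        json={
            "response_id": response_id,    # ID returned with the original completion
            "category": "misinformation",  # hypothetical taxonomy
            "detail": reason,
        },
        timeout=10,
    )

report_response("resp_abc123", "Cited a fabricated CVE in the triage summary.")
```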
The question was where users should draw the line. Producing gibberish text is extremely noticeable and therefore not really a useful poisoning attack; instead, the goal is something less noticeable.
Meanwhile, essentially 100% of lengthy LLM responses contain errors, so reporting any particular error amounts to doing nothing.
It would be naive not to anticipate that the primary user of the report button will be 4chan, reporting the model whenever it doesn't say "Hitler is great".
We've been trained by YouTube and probably other social media sites that downvoting does nothing. It's "the boy who cried wolf": you can downvote, but nobody's listening.
How can you tell what needs to be reported amid the vast quantities of bad information coming from LLMs? Beyond that, how exactly do you report it?