Comment by aucisson_masque 10 months ago

It could be used to spot LLM generated text.

Compare the frequency of words to those used in natural human writing and you can tell the computer from the human.
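A minimal sketch of what such a comparison could look like, assuming you already have a reference sample of human writing to compare against. The function names (`word_freqs`, `js_divergence`) and the choice of Jensen-Shannon divergence as the distance are illustrative assumptions, not something the thread or any particular tool prescribes:

```python
import math
import re
from collections import Counter

def word_freqs(text):
    """Relative frequency of each lowercase word in `text` (hypothetical helper)."""
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(words)
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()}

def js_divergence(p, q):
    """Jensen-Shannon divergence (base 2) between two frequency dicts.

    0.0 means identical distributions; 1.0 means no words in common.
    """
    div = 0.0
    for w in set(p) | set(q):
        pw, qw = p.get(w, 0.0), q.get(w, 0.0)
        m = (pw + qw) / 2  # mixture probability of this word
        if pw:
            div += 0.5 * pw * math.log2(pw / m)
        if qw:
            div += 0.5 * qw * math.log2(qw / m)
    return div

# Usage: a larger value means the sample's word usage is further
# from the reference. Any decision threshold would have to be chosen
# empirically; none is implied here.
reference = word_freqs("the cat sat on the mat and the dog sat too")
sample = word_freqs("the cat sat on the mat")
score = js_divergence(sample, reference)
```

A score like this only says how far two word distributions are apart; on its own it says nothing about why they differ.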

Lvl999Noob 10 months ago

It could be used to differentiate LLM text from pre-LLM human text, maybe. The thing is, our AIs may not be very good at learning, but our brains are. The more we use AI, the more we integrate LLMs and other tools into our lives, the more their output will influence us. I believe there was a study (or a few anecdotes) where college papers checked for AI material were marked as AI-written even though they were written by humans, because the students used AI while studying and learned from it.

MPSimmons 10 months ago

You're exactly right. You only have to look at the prevalence of the word "unalive" in real life contexts to find an example.

thfuran 10 months ago

> our AIs may not be very good at learning but our brains are
Brains aren't nearly as good at slightly adjusting the statistical properties of a text corpus as computers are.

left-struck 10 months ago

> The more we use AI, the more we integrate LLMs and other tools into our life, the more their output will influence us
Hmm, I don’t disagree, but I think it will be a valuable skill going forward to write text that doesn’t read like it was written by an LLM.
This is an arms race that I’m not sure we can win though. It’s almost like a GAN.

TacticalCoder 10 months ago

> ... compare the frequency of words to those used in human natural writings and you spot the computer from the human.

But that's a losing endeavor: if you can do that, you can immediately ask your LLM to fix its output so that it passes that test (and many others). It can introduce typos, make small errors on purpose, and anything you can think of to make it look human.

ithkuil 10 months ago

It may work for a short time, but after a while natural language will evolve through exposure to those new words and word patterns, and even humans will write in ways that, while different from the LLMs, will also differ from what this snapshot captured. It's already the case that people wrote differently 20 years ago than 50 years ago, and even more so than 100 years ago, etc.

slashdave 10 months ago

Hardly. You are talking about a statistical test, which will have rather large errors (since it is based on word frequencies). Not to mention word frequencies will vary depending on the type of text (essay, description, advertisement, etc).
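To put a rough number on "rather large errors", here is a sketch under a simple binomial model, which assumes each word in a text is drawn independently (real text is not, so treat this as optimistic). The function names and the example frequency are illustrative assumptions:

```python
import math

def freq_standard_error(p, n):
    """Standard error of an observed word frequency.

    p: assumed true frequency of the word (e.g. 0.001 = once per 1,000 words)
    n: number of words in the text being tested
    Binomial model: words treated as independent draws (an assumption).
    """
    return math.sqrt(p * (1 - p) / n)

def relative_error(p, n):
    """Standard error expressed as a fraction of the true frequency."""
    return freq_standard_error(p, n) / p

# A word used once per 1,000 words, measured in a 500-word essay
# versus a 500,000-word corpus.
essay = relative_error(0.001, 500)
corpus = relative_error(0.001, 500_000)
```

Under these assumptions the uncertainty for the 500-word essay is larger than the frequency being measured, while the same word in a 500,000-word corpus is pinned down to within a few percent. That is before accounting for the variation between types of text.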
