Comment by ben_w 6 months ago

2 replies

Long tail, coping with typos, and understanding negation.

If natural language were as easy as "enough patience to write down a hundred patterns to match", we'd have had useful natural language interfaces in the early 90s, or even the late 80s if it were really only "a hundred".

semi-extrinsic 6 months ago

For narrow use cases we did have natural language interfaces in the 90s, yes. See e.g. IRC bots.

Or to take a local example, for more than 20 years my city has had a web service where you can type "When is the next bus from Street A to Road B", and you get a detailed response including any transfers between lines. They even had a voice recognition version decades ago that you could call, which worked well.

From the GP post, I was replying specifically to:

> LLMs in data pipelines enable all sorts of “before impossible” stuff.
>
> For example, this creates an event calendar for you based on emails you have received

That exact thing has been a feature of Gmail for over a decade. Remember the 2018 GCal spam?

https://null-byte.wonderhowto.com/how-to/advanced-phishing-i...

  • ben_w 6 months ago

    > For narrow use cases we did have natural language interfaces in the 90s, yes. See e.g. IRC bots.

    "Narrow" being the key word. Thing is, even in the 2010s, we were doing sentiment analysis by counting the number of positive words and negative words, because it doesn't go past "narrow".

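    A minimal sketch of the word-counting style of sentiment analysis mentioned above, to show why it stays "narrow". The word lists and the `count_sentiment` name are made up for illustration, not taken from any particular library, and real lexicons were much larger. Negation breaks it straight away:

```python
# Hypothetical, deliberately tiny word lists (assumption for illustration only).
POSITIVE = {"good", "great", "love", "happy", "excellent"}
NEGATIVE = {"bad", "terrible", "hate", "sad", "awful"}


def count_sentiment(text: str) -> int:
    """Return (#positive words) - (#negative words); above zero reads as positive."""
    # Lowercase each whitespace-separated token and strip surrounding punctuation.
    words = [w.strip(".,!?\"'").lower() for w in text.split()]
    positives = sum(w in POSITIVE for w in words)
    negatives = sum(w in NEGATIVE for w in words)
    return positives - negatives


print(count_sentiment("I love this, it is great"))          # 2
print(count_sentiment("This is not good"))                  # 1, although the sentence is negative
print(count_sentiment("I do not hate it, not bad at all"))  # -2, although the sentence is mild praise
```

    You can bolt on a "flip the sign after 'not'" rule, but then you need rules for "not only", "hardly", sarcasm, and so on, which is the long tail again.
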
    Likewise, "A to B" is great… when it's narrow. I grew up on "Southbrook Road" — not the one in London, not the one in Southampton, not the one in Exeter, …

    And then there's where I went to university. Ond mae hynny'n twyllo braidd, oherwydd y Gymraeg. (In English: "But that's cheating a bit, because of the Welsh language.") But not cheating very much, because of bilingual rules and because of the large number of people with multi-lingual email content. Cinco de mayo etc.

    I also grew up with text adventures, which don't work if you miss the expected keyword, or mis-spell it too hard. (And auto-correction has its own problems, as anyone who really wants to search for "adsorption" not "absorption" will tell you).
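
    A rough sketch of both failure modes, assuming a toy parser that only looks at the first word of a command. The vocabulary, `parse_exact`, and `autocorrect` are hypothetical names for illustration; `difflib.get_close_matches` is from the Python standard library and stands in for whatever fuzzy matching a real system would use:

```python
import difflib

# Hypothetical vocabulary for a toy text-adventure parser.
VERBS = ["take", "drop", "open", "look"]


def parse_exact(command: str):
    """Return the verb only if the first word is exactly in the vocabulary."""
    words = command.lower().split()
    if words and words[0] in VERBS:
        return words[0]
    return None


def autocorrect(word: str, vocabulary) -> str:
    """Return the closest known word, or the input unchanged if nothing is close."""
    matches = difflib.get_close_matches(word, vocabulary, n=1)
    return matches[0] if matches else word


print(parse_exact("take lamp"))   # take
print(parse_exact("tkae lamp"))   # None: the exact-match parser gives up on the typo
print(autocorrect("tkae", VERBS))  # take: fuzzy matching rescues the typo

# The same fuzzy matching "fixes" a correct word that is missing from the dictionary.
print(autocorrect("adsorption", ["absorption", "absolution"]))  # absorption
```

    The fuzzy matcher cannot tell a typo from a rarer word it has never seen, so each fix for one end of the long tail creates a new problem at the other.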

    > That exact thing has been a feature of Gmail for over a decade. Remember the 2018 GCal spam?

    Siri has something similar. It misses a lot and makes up a lot. Sometimes it sets the title to be the date and makes up a date.

    These are examples of not doing things successfully with just a hundred hard-coded rules.