Comment by sodafountan a day ago

Can someone explain to me how this was allowed to happen? Wasn't Siri supposed to be the leading AI agent not ten years ago? How was there such a large disconnect at Apple between what Siri could do and what "real" AI was soon to be capable of?

Was this just a massive oversight at Apple? Were there not AI researchers at Apple sounding the alarm that they were way off with their technology and its capabilities? Wouldn't there be talk within the industry that this form of AI assistant would soon be looked at as useless?

Am I missing something?

raw_anon_1111 a day ago

For context: while I don't have any experience with the inner workings of Siri, I do have extensive experience with voice-based automation for call centers (Amazon Connect) and with Amazon Lex (the AWS counterpart to Alexa).

Siri was never an "AI agent." With intent-based systems, you give the system phrases to match on (intents), and to fulfill an intent, all of the "slots" have to be filled. For instance: "I want to go from $source to $destination," and then the system calls an API.

There is no AI understanding - it's a "1000 monkeys" implementation: you just give the system a bunch of variations and templates you want to match on, in every single language you care about, and map the intents to an API. That's how Google Assistant and Alexa also worked pre-LLM. They just had more monkeys dedicated to creating matching sentences.
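To make the intent/slot idea concrete, here's a minimal sketch in Python. The names (`match_intent`, `BookTrip`) and templates are made up for illustration - this is the general shape of the approach, not actual Lex or Siri internals:

```python
import re

# Every phrasing you want to handle must be written out by hand
# ("monkeys"), per language, with named groups as the slots to capture.
TEMPLATES = {
    "BookTrip": [
        r"i want to go from (?P<source>\w+) to (?P<destination>\w+)",
        r"book a trip from (?P<source>\w+) to (?P<destination>\w+)",
        r"take me from (?P<source>\w+) to (?P<destination>\w+)",
    ],
}

def match_intent(utterance):
    """Return (intent, slots) on a template hit, else (None, {})."""
    text = utterance.lower().strip()
    for intent, patterns in TEMPLATES.items():
        for pattern in patterns:
            m = re.fullmatch(pattern, text)
            if m:
                # All slots filled -> the system can now call the API.
                return intent, m.groupdict()
    return None, {}  # -> "Sorry, I didn't understand that."

print(match_intent("I want to go from Boston to Austin"))
# A phrasing nobody wrote a template for simply falls through:
print(match_intent("Boston to Austin, please"))
```

Any phrasing the monkeys didn't anticipate falls straight through, no matter how obvious the meaning is to a human.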

Post-LLM, you tell the LLM what the underlying system is capable of and the parameters the API requires to fulfill an action, and the LLM can figure out the user's intentions and ask follow-up questions until it has enough info to call the API. You can specify the prompt in English and it works in all of the languages the LLM has been trained on.
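A hedged sketch of the tools-based version. The schema shape is illustrative (loosely modeled on common function-calling formats), and `call_llm` is a hard-coded stub standing in for a real model call:

```python
import json

# One declarative description of the backend capability, written once in
# English; a real model generalizes it across phrasings and languages.
TOOLS = [{
    "name": "book_trip",
    "description": "Book travel from one city to another.",
    "parameters": {
        "type": "object",
        "properties": {
            "source": {"type": "string", "description": "departure city"},
            "destination": {"type": "string", "description": "arrival city"},
        },
        "required": ["source", "destination"],
    },
}]

def call_llm(utterance, tools):
    # Stand-in stub: a real LLM reads the tool schema and either emits a
    # tool call or a follow-up question for whatever slot is missing.
    lowered = utterance.lower()
    if "boston" in lowered and "austin" in lowered:
        return {"tool": "book_trip",
                "arguments": {"source": "Boston", "destination": "Austin"}}
    return {"ask": "Where are you traveling from, and to where?"}

def handle(utterance):
    result = call_llm(utterance, TOOLS)
    if "tool" in result:
        return f"calling {result['tool']}({json.dumps(result['arguments'])})"
    return result["ask"]  # keep asking until required parameters are known

print(handle("Get me from Boston down to Austin"))
print(handle("I need to book a flight"))
```

Note the key difference from the template approach: "Get me from Boston down to Austin" was never enumerated anywhere; the model is expected to map it onto the one tool description.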

Yes, I've done both approaches.

  • sodafountan 2 hours ago

    I appreciate the response, but that doesn't really answer my question.

    I want to know why the executive leadership at Apple failed to see LLMs as the future of AI. ChatGPT and Gemini are what Siri should be at this point. Siri was one of the leading voice-automated assistants of the past decade, and now Apple's only options are to strap an existing solution onto their product's name or let it go defunct. So now Siri is just an added layer to access Gemini? Perhaps with a few hard-coded solutions to automate specific tasks on the iPhone, and that's their killer app into the world of AI? That's pathetic.

    Is Apple already such a bloated corporation that it can no longer innovate fast enough to keep up with modern trends? It seems like only a few years ago they were super lean and able to innovate better than any major tech company around. LLMs were being researched in 2017. I guess three years was too short of a window to change the direction of Siri. They should have seen the writing on the wall here.

    • raw_anon_1111 2 hours ago

      According to everything that has been reported, both Google Assistant and Alexa are less reliable now that they are LLM-based.

      I don't know why; in my much smaller-scale experience, converting from the intent-based approach to an LLM "tools"-based approach is much more reliable.

      Siri was behind pre-LLM because Apple didn't throw enough monkeys at the problem.

      Everything an assistant can do is "hardcoded," even when it is LLM-based.

      Old way: voice -> text -> pattern matching -> APIs to back end functionality.

      New way: voice -> text -> LLM -> APIs to back end functionality.

      How often have you come across a case where Siri understood something and said “I can’t do that”? That’s not an AI problem. That’s Apple not putting people on the intent -> API mapping. An LLM won’t solve the issue of exposing the APIs to Siri.
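      A toy illustration of that failure mode, with made-up intent and API names: recognition succeeds, but the intent was never wired to a backend API.

```python
# Intents the speech/NLU layer can recognize...
RECOGNIZED_INTENTS = {"SetTimer", "SendMessage", "BookTrip"}

# ...but only some of them were ever mapped to a working API.
INTENT_TO_API = {
    "SetTimer": lambda slots: f"timer set for {slots['duration']}",
}

def fulfill(intent, slots):
    if intent not in RECOGNIZED_INTENTS:
        return "Sorry, I didn't understand that."
    api = INTENT_TO_API.get(intent)
    if api is None:
        # Understanding worked fine; the plumbing was never built.
        return "I can't do that."
    return api(slots)

print(fulfill("SetTimer", {"duration": "10 minutes"}))
print(fulfill("BookTrip", {"source": "Boston", "destination": "Austin"}))
# -> "I can't do that."
```

      Swapping the recognition layer for an LLM doesn't change the second case at all - the missing piece is the intent-to-API mapping, not the understanding.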

      • sodafountan an hour ago

        I don't really want to continue this discussion, as AI in general can be absolutely infuriating. It's one of those buzzwords that's just being thrown around without a care in the world at this point. But do you have any links to those reports? I'd be willing to bet that if Google Assistant and Alexa were being run properly, they wouldn't be less reliable when working with an LLM.

        I don't think Apple had too few people working on Siri; I think they had too many people working on the wrong problems. If they'd kept an eye on the industry like they did in their heyday, when Jobs was at the helm, they would've been all over LLMs the way Sam Altman was with his OpenAI startup. This report of Siri using Gemini going forward is one of the biggest signs that Apple is failing to innovate, not to mention the constant rehashing of the iPhone and iOS. They haven't been innovative in years.

        And yes, that's the point I was trying to make: AI assistants shouldn't be hardcoded to do certain things. That's not AI - but with Apple's marketing, they'd have you believe that Siri is what AI should be. Except now everyone's wiser; everyone and their grandmother has used ChatGPT, which is really what Siri should have been. Changes to the iOS API should roll out, and an LLM-backed AI assistant should be able to pick up on those changes automatically. Siri should be an LLM trained on Apple data, its APIs, your personal data (emails, documents, etc.), and a whole host of publicly available data. That would actually make Siri useful going into the future.

        Again, if Apple's marketing team were to be believed, Siri would be the most advanced LLM on the planet, but from a technical standpoint, they haven't even started training an LLM at all. It's nonsense.

        • raw_anon_1111 37 minutes ago

          AI assistants can't magically "do stuff" without "tools" exposed. A tool is always an API that someone has to write and expose to the orchestrator, whether the orchestrator is an AI or just a dumb intent system.

          And ChatGPT can’t really “do anything” without access to tools.

          You don't want an LLM to have access to your whole system without deterministic guardrails and limits on what the tools are permitted to do, just like you wouldn't expose your entire database with admin privileges to the web.
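          A minimal sketch of that kind of deterministic guardrail (tool names and scopes are hypothetical): the model only ever proposes calls, and a plain allowlist plus permission check decides what actually runs.

```python
# The model can only ever propose tools from this allowlist; destructive
# actions simply aren't exposed to it at all.
ALLOWED_TOOLS = {"read_calendar", "create_reminder"}

def execute_tool_call(tool, args, user_scopes):
    if tool not in ALLOWED_TOOLS:
        raise PermissionError(f"tool {tool!r} is not exposed to the model")
    if tool not in user_scopes:
        raise PermissionError(f"user has not granted {tool!r}")
    # ...dispatch to the real, narrowly scoped API here...
    return f"ok: {tool}"

# The LLM's output is just a proposal; this deterministic layer decides.
print(execute_tool_call("create_reminder", {"text": "standup"},
                        user_scopes={"read_calendar", "create_reminder"}))
```

          The point is that the enforcement is ordinary non-AI code, so a hallucinated or malicious tool call can't reach anything that wasn't deliberately exposed.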

          You also don't want to expose too many tools to the system. For every tool you expose, you also have to include a description of what the tool does, the parameters it needs, etc. Too many tools will both blow up your context window and make the model start hallucinating. I suspect that's why Alexa and Google Assistant got worse when they became LLM-based, while my narrow use cases didn't suffer those problems when I started implementing LLM-based solutions.
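          To see why tool descriptions eat the context window, here's a rough back-of-the-envelope sketch (the tools are auto-generated stand-ins, and "one token per ~4 characters of JSON" is only a crude rule of thumb):

```python
import json

def tool_spec(i):
    # Auto-generated stand-in; real assistants expose many device/app
    # actions, each dragging its own schema into every request.
    return {
        "name": f"action_{i}",
        "description": f"Performs device action number {i} on the given "
                       f"target, with an optional confirmation step.",
        "parameters": {"target": "string", "confirm": "boolean"},
    }

def rough_tokens(obj):
    # Crude rule of thumb: roughly 1 token per 4 characters of JSON.
    return len(json.dumps(obj)) // 4

few = [tool_spec(i) for i in range(5)]
many = [tool_spec(i) for i in range(500)]
print(rough_tokens(few), rough_tokens(many))
```

          With hundreds of tools, the schema overhead alone can claim a large slice of the context window before the user has said a single word.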

          And I am purposefully yada-yada-yada-ing some of the technical complexities, and I hate the whole "appeal to authority" thing. But I worked at AWS for 3.5 years until 2 years ago, and I was at one point the second-highest contributor to a popular open-source "AWS Solution" dealing with voice automation that almost everyone in the niche had heard of. I really do know this space.