Comment by bradly
Good question. I had a fairly narrow view of a very large system, but I'll give my personal perspective.
I worked on systems for evaluating the quality of models over time, and for evaluating new models before release to understand how they would perform compared to current models once in the wild. It was difficult to get Siri to use these tools, which were built outside their org. While this wouldn't have solved the breadth of Siri's functionality issues, it would have helped improve the overall user experience with the existing Siri features and avoid the seeming reduction of quality over time.
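The kind of before-release comparison described above can be sketched roughly like this. To be clear, this is a toy illustration, not Apple's actual pipeline: the eval set, the accuracy metric, and the regression threshold are all hypothetical.

```python
# Hypothetical sketch: score a candidate model against the current
# production model on a shared eval set before release, and flag
# whether shipping it would regress quality past a tolerance.

def accuracy(model, eval_set):
    """Fraction of (utterance, expected_intent) pairs the model labels correctly."""
    correct = sum(1 for text, expected in eval_set if model(text) == expected)
    return correct / len(eval_set)

def compare_models(current, candidate, eval_set, max_regression=0.01):
    """Return (current_acc, candidate_acc, ok_to_ship)."""
    cur = accuracy(current, eval_set)
    cand = accuracy(candidate, eval_set)
    return cur, cand, cand >= cur - max_regression

# Toy stand-in models: keyword-based intent classifiers.
current = lambda t: "weather" if "weather" in t else "other"
candidate = lambda t: "weather" if "weather" in t or "rain" in t else "other"

eval_set = [
    ("what's the weather today", "weather"),
    ("will it rain tomorrow", "weather"),
    ("set a timer", "other"),
]

cur, cand, ok = compare_models(current, candidate, eval_set)
```

The useful property is that both models are scored on the same held-out set, so a release decision can be gated on "no worse than production minus a tolerance" rather than on anecdotes.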
Secondly, and admittedly farther from where I was... Apple could have started the move from ML models to LLMs much sooner. The underlying technology for LLMs started gaining popularity in papers and research quite a few years ago, and there was a real problem of each team developing its own ML models for search, similarity, recommendations, etc. These models were quite large, which became a problem for mobile device delivery and storage. If leadership had had a way to bring the orgs together, they might have landed on LLMs much sooner.
Despite my positive experience building systems based on intent recognition, and despite how much better LLMs are than "1000 monkeys", reports suggest that in the two examples we have of LLM-backed assistants - Google's and Amazon's - the switch made them worse.
I don't know why that is at a technical level.