Comment by elzbardico a day ago

It is more like Apple has no need to spend billions on training with questionable ROI when it can just rent from one of the commodity foundation model labs.

nosman 20 hours ago

I don't know why people automatically jump to Apple's defense on this.... They absolutely did spend a lot of money and hired people to try this. They 100% do NOT have the open and bottom-up culture needed to pull off large scale AI and software projects like this.

Source: I worked there

elzbardico 20 hours ago

Well, they stopped.
Culture is overrated. Money talks.
They have done things far more complicated from an engineering perspective. I am far more impressed by what they accomplished alongside TSMC with Apple Silicon than by what AI labs do.

- tech-historian 18 hours ago
  
  Is Apple silicon really that impressive compared to LLMs? Take a step back. CPUs have been getting faster and more efficient for decades.
  Google invented the transformer architecture, the backbone of modern LLMs.
  
  Terretta 15 hours ago
  
  > Google invented...
  "Google" did? Or humans who worked there and one who didn't?
  https://www.wired.com/story/eight-google-employees-invented-...
  In any case, see the section on Jakob Uszkoreit, for example, or Noam Shazeer. And then…
  > In the higher echelons of Google, however, the work was seen as just another interesting AI project. I asked several of the transformers folks whether their bosses ever summoned them for updates on the project. Not so much. But “we understood that this was potentially quite a big deal,” says Uszkoreit.
  Worth noting the value of “bosses” who leave people alone to try nutty things in a place where research has patronage. Places like universities, Xerox, or Apple and Google deserve credit for providing the petri dish.
  
  xmcqdpt2 14 hours ago
  
  You can understand how transformers work from just reading the Attention is All You Need paper, which is 15 pages of pretty accessible DL. That's not the part that is impressive about LLMs.
  
aurareturn 12 hours ago

It’s such a commodity that there are only 3 SOTA labs left and no one can catch them. I’m sure it’ll be consolidated further in the future and you’re going to be left with a natural monopoly or duopoly.

Apple has no control over the most important change to tech. They have ceded that control to Google.

elzbardico an hour ago

Really, don't believe benchmarks as gospel. Chinese models are pretty much competitive with offerings from Anthropic, OpenAI or Google. Meta is currently at a disadvantage, but I believe they will find their mojo and soon be competitive again.
Frankly, a lot of the time I prefer using GLM 4.6 running on Cerebras Inference to dealing with the performance hiccups from Claude. For most practical purposes, I've seen no big penalty in using it compared to Opus 4.5; even the biggest qwen-coder models are pretty much competitive.
Between me and the company I work for, I spend some serious money on AI. I use it extensively in my main job, on two side projects that I have paying customers for, and for graduate school work. I can tell you that there are quite a few more SOTA models around than the benchmarks tell you.

kouteiheika 5 hours ago

> It’s such a commodity that there are only 3 SOTA labs left and no one can catch them.
No one can outpace them in improving the SOTA, but everyone can catch up to them. Why are open-weight models perpetually 6 months behind the SOTA? Given enough data harvested from SOTA models you can eventually distill them.
The biggest differentiator when training better models is not some fancy new architectural improvement (even the current SOTA transformer architectures are very similar to e.g. the ancient GPT-2), but high-quality training data. And if your shiny new SOTA model is hooked into a publicly available API, guess what - you've just exposed a training data generator for everyone to use. (That's one of the reasons why SOTA labs hide their reasoning chains, even though those are genuinely useful for users - they don't want others to distill their models.)

qcnguy 12 hours ago

Four. You forgot xAI. And that's ignoring the Chinese labs.

- aurareturn 11 hours ago
  
  Chinese labs aren’t SOTA due to lack of compute.
  Yes, I forgot xAI. So 4 left. I’m betting that there will be one or two dominant ones in the next 10 years. Apple won’t be one of them.
  