Comment by programjames 18 hours ago

Far too much marketing speak, far too little math or theory, and it completely misses the mark on the 'next frontier'. Maybe four years ago spatial reasoning was the problem to solve, but by 2022 it was solved; all that remained was scaling up. The actual next three problems to solve (in order of when they will be solved) are:

- Reinforcement Learning (2026)

- General Intelligence (2027)

- Continual Learning (2028)

EDIT: lol, funny how the idiots downvote

whatever1 17 hours ago

Combinatorial search is also a solved problem. We just need a couple of universes to scale it up.

  • programjames 17 hours ago

    If there isn't a path humans know how to take with current technology, it isn't a solved problem. That's very different from training an image model for research purposes and knowing that $100M in compute is probably enough for a basic video model.

7moritz7 18 hours ago

Haven't RLHF and RL with LLM feedback been around for years now?

  • programjames 17 hours ago

    Large latent flow models are unbiased. RLHF is not: if you purely use policy optimization, it will be biased towards short horizons. If you add in a value network, the value estimate has its own bias (e.g. MSE loss on the value implies a Gaussian bias). Also, most RL has some adversarial loss (how do you train your preference network?), which makes the loss landscape fractal, and SGD smooths fractal landscapes incorrectly. So, basically, a lot of biases show up in RL training, and they can make it both hard to train and, even when training succeeds, not necessarily optimizing what you want.
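
    A minimal toy sketch of the short-horizon bias (illustrative numbers, not any production setup): with discount gamma < 1, vanilla REINFORCE on a two-arm choice learns to prefer a small immediate reward over a larger delayed one.

    ```python
    import numpy as np

    rng = np.random.default_rng(0)
    gamma = 0.9

    # Arm 0 pays 1.0 immediately; arm 1 pays 2.0 after an 8-step delay.
    def discounted_return(arm, discount):
        return 1.0 if arm == 0 else 2.0 * discount ** 8

    theta = 0.0  # logit of choosing the delayed arm
    for _ in range(2000):
        p1 = 1.0 / (1.0 + np.exp(-theta))
        arm = int(rng.random() < p1)
        G = discounted_return(arm, gamma)  # the return REINFORCE optimizes
        theta += 0.1 * (arm - p1) * G      # grad of log pi(arm), scaled by G

    p1 = 1.0 / (1.0 + np.exp(-theta))
    print(f"P(delayed arm) = {p1:.3f}")    # drifts toward 0
    ```

    Undiscounted, the delayed arm is worth twice as much, but the objective the gradient sees ranks it lower (2.0 * 0.9^8 ≈ 0.86 < 1.0), so the policy converges on the short horizon.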

    • storus 17 hours ago

      We might not even need RL, as DPO has shown.

      • programjames 16 hours ago

        > if you purely use policy optimization, RLHF will be biased towards short horizons

        > most RL has some adversarial loss (how do you train your preference network?), which makes the loss landscape fractal which SGD smooths incorrectly
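
        i.e. DPO is still policy optimization against an implicit preference model, so the same biases apply. For concreteness, a sketch of the standard DPO objective, with placeholder log-prob numbers (a real run gets them from a language model):

        ```python
        import torch
        import torch.nn.functional as F

        beta = 0.1
        # Summed log-probs of one (chosen, rejected) response pair under the
        # trained policy and the frozen reference policy -- placeholders here.
        logp_w, logp_l = torch.tensor(-12.3), torch.tensor(-14.1)
        ref_w, ref_l = torch.tensor(-12.0), torch.tensor(-13.0)

        margin = beta * ((logp_w - ref_w) - (logp_l - ref_l))
        loss = -F.logsigmoid(margin)  # Bradley-Terry preference likelihood
        print(loss.item())
        ```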

l9o 18 hours ago

What do you consider "General Intelligence" to be?

  • programjames 17 hours ago

    A good start would be:

    1. Robustness to adversarial attacks (e.g. on classification models or LLM steering).

    2. Solving ARC-AGI.

    Current models are optimized to solve the specific problem they're presented with, not to find the most general problem-solving techniques.
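
    To make (1) concrete, a minimal FGSM-style sketch (the model and epsilon are made up for illustration): one gradient step on the input is often enough to move a classifier's logits.

    ```python
    import torch
    import torch.nn as nn

    model = nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 3))
    loss_fn = nn.CrossEntropyLoss()

    x = torch.randn(1, 4, requires_grad=True)  # stand-in input
    y = torch.tensor([0])                      # its true label
    eps = 0.1                                  # perturbation budget

    loss_fn(model(x), y).backward()
    x_adv = x + eps * x.grad.sign()            # one-step attack: increase the loss

    print("clean logits:", model(x).detach())
    print("adv logits:  ", model(x_adv).detach())
    ```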

koakuma-chan 18 hours ago

In my view, what AI lacks is a memory system.

  • 7moritz7 17 hours ago

    That has been solved with RAG, OCR-ish image encoding (DeepSeek, recently), and just long context windows in general.
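
    The retrieval half of that is tiny in principle. A toy sketch (hashed bag-of-words as a stand-in for a real embedding model): embed notes, retrieve the nearest one, stuff it into the prompt.

    ```python
    import numpy as np

    docs = [
        "the agent failed the build because the test fixture moved",
        "deploys go through the staging cluster first",
        "the retry queue is drained every five minutes",
    ]

    def embed(text):
        # Hashed bag-of-words stand-in for a real embedding model.
        v = np.zeros(64)
        for tok in text.lower().split():
            v[hash(tok) % 64] += 1.0
        return v / (np.linalg.norm(v) + 1e-9)

    index = np.stack([embed(d) for d in docs])

    query = "why did the build fail"
    scores = index @ embed(query)  # cosine similarity (unit vectors)
    best = docs[int(np.argmax(scores))]
    print(f"Context: {best}\n\nQuestion: {query}")
    ```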

    • Eisenstein 14 hours ago

      RAG is like constantly reading your notes instead of integrating experiences into your processes.

    • koakuma-chan 17 hours ago

      Not really. For example, we still can’t get coding agents to work reliably, and I think it’s a memory problem, not a capabilities problem.

      • atlex2 15 hours ago

        On the other hand, test-time weight updates would make model interpretability much harder.
