Comment by mcbuilder a day ago

This article is complete hype. All they seem to offer is an idea of "replication training," which is just some vague agentic distributed RL. Multi-agent distributed reinforcement learning algorithms have been in the actual literature for a while. I suggest studying what DeepMind is doing for the current state of the art in agentic distributed RL.

janalsncm 21 hours ago

I didn’t think it was vague. Given an existing piece of software, write a detailed spec on what it does and then reward the model for matching its performance.

The vague part is whether this will generalize to other, non-software domains.
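
To make the "reward the model for matching its performance" idea concrete, here is a rough sketch of what such a reward could look like. Everything in it is an assumption for illustration: `replication_reward`, `reference_sort`, and `buggy_sort` are hypothetical names, and the article does not say the reward is computed this way. The sketch just scores a candidate implementation by the fraction of test inputs on which it agrees with the existing software.

```python
from typing import Any, Callable, Iterable


def replication_reward(
    reference: Callable[[Any], Any],
    candidate: Callable[[Any], Any],
    inputs: Iterable[Any],
) -> float:
    """Fraction of inputs on which `candidate` reproduces `reference`.

    Hypothetical reward: 1.0 means the candidate matched the existing
    software on every input, 0.0 means it matched on none.
    """
    cases = list(inputs)
    if not cases:
        return 0.0
    matches = 0
    for case in cases:
        expected = reference(case)
        try:
            if candidate(case) == expected:
                matches += 1
        except Exception:
            # A crash in the candidate counts as a mismatch.
            pass
    return matches / len(cases)


def reference_sort(xs):
    """Stands in for the existing piece of software."""
    return sorted(xs)


def buggy_sort(xs):
    """Stands in for a model-written replica; it wrongly drops duplicates."""
    return sorted(set(xs))


cases = [[3, 1, 2], [1, 1, 2], [], [5]]
perfect_score = replication_reward(reference_sort, reference_sort, cases)
buggy_score = replication_reward(reference_sort, buggy_sort, cases)
print(perfect_score, buggy_score)
```

In this toy setup the replica only fails on the input with duplicates, so it gets partial credit rather than zero.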

intrasight 14 hours ago

> write a detailed spec on what it does
That is a much harder task than writing said software.
