Comment by bbor

bbor 10 months ago

Fascinating. I have two quick questions, if you find the time:

  …we’ve built our own foundation model, GFT (Generative Forecasting Transformer), a 1.5B parameter frontier model that simulates global weather…

I’m constantly scolding people for trying to use LLMs for non-linguistic tasks, and thus getting deceptively disappointing results. The quintessential example is arithmetic, which makes me immediately dubious of a transformer built to model physics. That said, you’ve obviously found great empirical success already, so something’s working. Can you share some of your philosophical underpinnings for this approach, if they exist beyond “it’s a natural evolution of other DL tech”? Does your transformer operate in the same rough way as LLMs, or have you radically changed the architecture to better approach this problem?

  Hence: simulate the Earth.

When I read “simulate”, I immediately think of physics simulations built around interpretable/symbolic systems of elements and forces, which I would usually put in basic opposition to unguided/connectionist ML models. Why choose the word “simulate”, given that your models are essentially black boxes? Again, a pretty philosophical question that you don’t necessarily have to have an answer to for YC reasons, lol

Best of luck, and thanks for taking the leap! Humanity will surely thank you. Hopefully one day you can claim a bit of the NWS’ $1.2B annual budget, or the US Navy’s $infinity budget — if you haven’t, definitely reach out to NRL and see if they’ll buy what you’re selling!

Oh and C) reach out if you ever find the need to contract out a naive, cheap, and annoyingly-optimistic full stack engineer/philosopher ;)

cbodnar 10 months ago

Re question 1: LLMs are already working pretty well for video generation (e.g. see Sora). You can also think of weather as some sort of video generation problem where you have hundreds of channels (one for each variable). So this is not inconsistent with other LLM success stories from other domains.
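To make the "video with hundreds of channels" framing concrete, here is a minimal sketch of how one gridded weather snapshot could be cut into tokens, the way a vision transformer cuts up an image. This is an illustration only and not a description of GFT's actual architecture: the ViT-style patching, the `patchify` name, and the toy grid and channel sizes are all assumptions.

```python
from typing import List

# One "video frame" of weather: frame[channel][lat][lon].
# Each channel stands for one variable (temperature, pressure, ...).
Frame = List[List[List[float]]]


def patchify(frame: Frame, patch: int) -> List[List[float]]:
    """Split a (channels, height, width) frame into flat tokens.

    Each token holds every channel's values for one patch x patch tile,
    so the number of variables only changes the token length.
    """
    channels = len(frame)
    height = len(frame[0])
    width = len(frame[0][0])
    if height % patch or width % patch:
        raise ValueError("grid size must be divisible by patch size")

    tokens = []
    for top in range(0, height, patch):
        for left in range(0, width, patch):
            token = [
                frame[c][top + i][left + j]
                for c in range(channels)
                for i in range(patch)
                for j in range(patch)
            ]
            tokens.append(token)
    return tokens


# Toy frame (assumed sizes): 3 variables on a 4 x 8 grid.
frame = [
    [[float(c * 100 + y * 8 + x) for x in range(8)] for y in range(4)]
    for c in range(3)
]
tokens = patchify(frame, patch=2)
print(len(tokens), len(tokens[0]))  # 8 tokens, each of length 3 * 2 * 2 = 12
```

Going from 3 channels to hundreds changes only the length of each token, not the number of tokens, which is why the extra variables fit the same sequence-modelling setup.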

Re question 2: Simulations don't need to be explainable. Being able to simulate simply means being able to provide a reasonable evolution of a system given some potential set of initial conditions and other constraints. Even for physics-based simulations, when run at huge scale like with weather, it's debatable to what degree they are "interpretable".
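In code, that definition of "simulate" amounts to little more than a rollout loop around a step function, and the loop does not care whether the step is a physics solver or a learned black box. A minimal sketch, where `rollout` and `toy_step` are hypothetical names and the neighbour-averaging rule is only a stand-in for a real model:

```python
from typing import Callable, List

State = List[float]


def rollout(step: Callable[[State], State], initial: State, n_steps: int) -> List[State]:
    """Evolve a system from initial conditions by applying `step` repeatedly."""
    states = [initial]
    for _ in range(n_steps):
        states.append(step(states[-1]))
    return states


def toy_step(state: State) -> State:
    """Stand-in for a model: average each cell with its two neighbours on a ring."""
    n = len(state)
    return [
        (state[(i - 1) % n] + state[i] + state[(i + 1) % n]) / 3.0
        for i in range(n)
    ]


# Initial condition: a single spike that spreads out over time.
trajectory = rollout(toy_step, [0.0, 0.0, 3.0, 0.0, 0.0], n_steps=2)
for state in trajectory:
    print(state)
```

Swapping `toy_step` for a numerical solver or a neural network leaves `rollout` unchanged, which is the sense in which a black-box model can still be said to simulate.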

Thanks for your questions!


britannio 10 months ago

Andrej Karpathy states that LLMs are a highly general-purpose technology for statistical modelling of token streams [1]. For example, comma.ai uses transformers in its self-driving model, which is far from a linguistic task.

[1] https://x.com/karpathy/status/1835024197506187617
[2] https://www.youtube.com/watch?v=-KMdo9AWJaQ&t=1010s
