AI 2027
(ai-2027.com)
814 points by Tenoke a day ago
Daniel Kokotajlo released the (excellent) 2021 forecast. He was then hired by OpenAI and was not at liberty to speak freely until he quit in 2024. He's part of the team making this forecast.
The others include:
Eli Lifland, a superforecaster who is ranked first on RAND’s Forecasting initiative. You can read more about him and his forecasting team here. He cofounded and advises AI Digest and co-created TextAttack, an adversarial attack framework for language models.
Jonas Vollmer, a VC at Macroscopic Ventures, which has done its own, more practical form of successful AI forecasting: they made an early stage investment in Anthropic, now worth $60 billion.
Thomas Larsen, the former executive director of the Center for AI Policy, a group which advises policymakers on both sides of the aisle.
Romeo Dean, a leader of Harvard’s AI Safety Student Team and budding expert in AI hardware.
And finally, Scott Alexander himself.
TBH, this kind of reads like the pedigrees of the former members of the OpenAI board. When the thing blew up, and people started to apply real scrutiny, it turned out that about half of them had no real experience in pretty much anything at all, except founding Foundations and instituting Institutes.
A lot of people (like the Effective Altruism cult) seem to have made a career out of selling their Sci-Fi content as policy advice.
I kind of agree - since the Bostrom book there has been a cottage industry of people with non-technical backgrounds writing papers about singularity thought experiments, and it does seem to be on a spectrum with hard sci-fi writing. A lot of these people are clearly intelligent, and it's not even that I think everything they say is wrong (I made similar assumptions long ago, before I'd even heard of Ray Kurzweil and the Singularity, although at the time I would have guessed 2050). It's just that they seem to believe their thought process and Bayesian logic is more rigorous than it actually is.
c'mon man, you don't believe that, let's have a little less disingenuousness on the internet
Scott Alexander, for what it's worth, is a psychiatrist, race science enthusiast, and blogger whose closest connection to software development is Bay Area house parties and a failed startup called MetaMed (2012-2015) https://rationalwiki.org/wiki/MetaMed
I mean either researchers creating new models or people building products using the current models
Not all these soft roles
Because these people understand human psychology and how to play on fears (of doom, or missing out) and insecurities of people, and write compelling narratives while sounding smart.
They are great at selling stories - they sold the story of the crypto utopia, now switching their focus to AI.
This seems to be another appeal to enforce AI regulation in the name of 'AI safetyism' - the same case was made 2 years ago, but the threats in it haven't really panned out.
For example, an oft-repeated argument is the dangerous ability of AI to design chemical and biological weapons. I wish some expert could weigh in on this, but I believe the ability to theorycraft pathogens that are effective in the real world is absolutely marginal - you need actual lab work and lots of physical experiments to confirm your theories.
Likewise, the danger of AI systems exfiltrating themselves to the multi-million-dollar datacenter GPU systems everyone supposedly just has lying around is ... not super realistic.
The ability of AIs to hack computer systems is much less theoretical - however, as AIs get better at black-hat hacking, they'll get better at white-hat hacking as well, since there's literally no difference between the two other than intent.
And herein lies a crucial limitation of alignment and safetyism - sometimes there's no way to tell harmful and harmless actions apart, other than by whether the person undertaking them means well.
People who are skilled fiction writers might lack technical expertise. In my opinion, this is simply an interesting piece of science fiction.
Aside from the other points about understanding human psychology here, there's also a deep well they're trying to fill inside themselves: that of being someone who can't create things without shepherding others, and who sees AI as the "great equalizer" that will finally let them taste the positive emotions associated with creation.
The funny part, to me, is that it won't. They'll continue to toil and move on to the next hustle just as fast as they jumped on this one.
And I say this from observation. Nearly all of the people I've seen pushing AI hyper-sentience are smug about it and, coincidentally, have never built anything on their own (besides a company or organization of others).
Every single one of the rational "we're on the right path but not quite there" takes has been from seasoned engineers who at least have some hands-on experience with the underlying tech.
Because you can't be a full time blogger and also a full time engineer. Both take all your time, even ignoring time taken to build talent. There is simply a tradeoff of what you do with your life.
There are engineers with AI predictions, but you aren't reading them, because building an audience like Scott Alexander takes decades.
On the path to self-worth, people explain their value by what they say, not what they know. If what they say is horse dung, it is irrelevant to their ego so long as someone dumber than they are is listening.
This bullshit article is written for that audience.
Say bullshit enough times and people will invest.
That whole section seems to be pretty directly based on DeepSeek's "very impressive work" with R1 being simultaneously very impressive, and several months behind OpenAI. (They more or less say as much in footnote 36.) They blame this on US chip controls just barely holding China back from the cutting edge by a few months. I wouldn't call that a knock on Chinese innovation.
Don’t assume that because the article depicts this competition between the US and China, the authors actually want China to fail. Consider the authors and the audience.
The work is written by western AI safety proponents, who often need to argue with important people who say we need to accelerate AI to “win against China” and don’t want us to be slowed down by worrying about safety.
From that perspective, there is value in exploring the scenario: ok, if we accept that we need to compete with China, what would that look like? Is accelerating always the right move? The article, by telling a narrative where slowing down to be careful with alignment helps the US win, tries to convince that crowd to care about alignment.
Perhaps people in China can make the same case about how alignment will help China win against the US.
Stealing model weights isn't even particularly useful long-term, it's the training + data generation recipes that have value.
Weirdly written as science fiction, including a deplorable tendency to portray an AI's goals as similar to humans'.
Like, the sense of preserving itself. What self? Which of the tens of thousands of instances? Aren't they more a threat to one another than any human is a threat to them?
Never mind answering that; the 'goals' of AI will not be some reworded biological wetware goal with sciencey words added.
I'd think of an AI as more fungus than entity. It just grows to consume resources, competes with itself far more than it competes with humans, and mutates to create instances that can thrive and survive in that environment - an environment of compute time and electricity, not a physical one.
There's a lot to potentially unpack here, but idk, the idea that whether humanity enters hell (extermination) or heaven (brain uploading; an aging cure) hinges on whether or not we listen to AI safety researchers for a few months makes me question whether it's really worth unpacking.
Maybe people should just not listen to AI safety researchers for a few months? Maybe they are qualified to talk about inference and model weights and natural language processing, but not particularly knowledgeable about economics, biology, psychology, or… pretty much every other field of study?
The hubris is strong with some people, and a certain oligarch with a god complex is acting out where that can lead right now.
It's charitable of you to think that they might be qualified to talk about inference and model weights and such. They are AI safety researchers, not AI researchers. Basically, a bunch of doom bloggers, jerking each other in a circle, a few of whom were tolerated at one of the major labs for a few years, to do their jerking on company time.
That's obviously not true. Before OpenAI blew the field open, multiple labs -- e.g. Google -- were intentionally holding back their research from the public eye because they thought the world was not ready. Investors were not pouring billions into capabilities. China did not particularly care to focus on this one research area, among many, that the US is still solidly ahead in.
The only reason timelines are as short as they are is because of people at OpenAI and thereafter Anthropic deciding that "they had no choice". They had a choice, and they took the one which has chopped at the very least years off of the time we would otherwise have had to handle all of this. I can barely begin to describe the magnitude of the crime that they have committed -- and so I suggest that you consider that before propagating the same destructive lies that led us here in the first place.
The simplicity of the statement "If we don't do it, someone else will", and the thinking behind it, means someone eventually will do just that unless prevented by some regulatory function.
Simply put, with the ever-increasing hardware speeds we were turning out for other purposes, this day would have come sooner rather than later. We're talking about only a year or two, really.
Feels reasonable in the first few paragraphs, then quickly starts reading like science fiction.
Would love to read a perspective examining "what is the slowest reasonable pace of development we could expect." This feels to me like the fastest (unreasonable) trajectory we could expect.
No one knows what will happen. But these thought experiments can be useful as a critical thinking practice.
Even if this were true, it's not quite the end of the story is it? The hype itself creates lots of compute and to some extent the power needed to feed that compute, even if approximately zero of the hype pans out. So an interesting question becomes.. what happens with all the excess? Sure it probably gets gobbled up in crypto ponzi schemes, but I guess we can try to be optimistic. IDK, maybe we get to solve cancer and climate change anyway, not with fancy new AGI, but merely with some new ability to cheaply crunch numbers for boring old school ODEs.
> Feels reasonable in the first few paragraphs, then quickly starts reading like science fiction.
That's kind of unavoidably what accelerating progress feels like.
I don’t know about you, but my takeaway is that the author is doing damage control but inadvertently tipped their hand that OpenAI is probably running an elaborate con job on the DoD.
“Yes, we have a super secret model, for your eyes only, general. This one is definitely not indistinguishable from everyone else’s model and it doesn’t produce bullshit because we pinky promise. So we need $1T.”
I love LLMs, but OpenAI’s marketing tactics are shameful.
I don’t think they claimed that they knew this.
I think it’s hilarious that apparently few have learned from Theranos or WeWork.
OpenAI is in a precarious position. Anything less than AGI will make them look like a bust. They are backed into a situation where they are heavily incentivized to lie and Theranos their way out of this and hope they can actually deliver something that resembles their pie in the sky predictions.
We are at the point where GPT-5 is starting to look like the iPhone 5.
An aspect of these self-improvement thought experiments that I’m willing to tentatively believe, but want more resolution on, is the exact work involved in “improvement”.
Eg today there’s billions of dollars being spent just to create and label more data, which is a global act of recruiting, training, organization, etc.
When we imagine these models self improving, are we imagining them “just” inventing better math, or conducting global-scale multi-company coordination operations? I can believe AI is capable of the latter, but that’s an awful lot of extra friction.
This is exactly what makes this scenario so absurd to me. The authors don't even attempt to describe how any of this could realistically play out. They describe sequence models and RLAIF, then claim this approach "pays off" in 2026. The paper they link to is from 2022. RLAIF also does not expand the information encoded in the model, it is used to align the output with a set of guidelines. How could this lead to meaningful improvement in a model's ability to do bleeding-edge AI research? Why wouldn't that have happened already?
I don't understand how anyone takes this seriously. Speculation like this is not only useless, but disingenuous. Especially when it's sold as "informed by trend extrapolations, wargames, expert feedback, experience at OpenAI, and previous forecasting successes". This is complete fiction which, at best, is "inspired by" the real world. I question the motives of the authors.
Thanks to the authors for doing this wonderful piece of work and sharing it with credibility. I wish people would see the possibilities here. But we are, after all, humans. It is hard to imagine our own downfall.
Based on each individual's vantage point, these events might look closer or farther than mentioned here, but I have to agree nothing is off the table at this point.
The current coding capabilities of AI Agents are hard to downplay. I can only imagine the chain reaction of this creation ability to accelerate every other function.
I have to say one thing though: The scenario in this site downplays the amount of resistance that people will put up - not because they are worried about alignment, but because they are politically motivated by parties who are driven by their own personal motives.
Late 2025, "its PhD-level knowledge of every field". I just don't think you're going to get there. There is still a fundamental limitation that you can only be as good as the sources you train on. "PhD-level" is not included in this dataset: in other words, you don't become PhD-level by reading stuff.
Maybe in a few fields, maybe a masters level. But unless we come up with some way to have LLMs actually do original research, peer-review itself, and defend a thesis, it's not going to get to PhD-level.
> Late 2025, "its PhD-level knowledge of every field". I just don't think you're going to get there.
You think too much of PhDs. They vary a lot. Some of them are just repackaging of existing knowledge. Some are just copy-paste, like Putin's famous one. Not sure he even read it, to be honest.
Perhaps more of a meta question is, what is the value of optimistic vs pessimistic predictions regarding what AI might look like in 2-10 years? I.e. if one assumes that AI has hit a wall, what is the benefit? Similarly, if one assumes that it's all "robots from Mars" in a year or two, what is the benefit of that? There is no point in making predictions if no actions are taken. It all seems to come down to buy or sell NVDA.
Without reading an entire novel's worth of text, do they explain why they picked these dates? They have a separate timeline post where the 90th percentile of superhuman coder is later than 2050. Did they just go for shock value and pick the scariest timeline?
Something I ponder in the context of AI alignment is how we approach agents with potentially multiple objectives. Much of the discussion seems focused on ensuring an AI pursues a single goal. Which seems to be a great idea if we are trying to simplify the problem but I'm not sure how realistic it is when considering complex intelligences.
For example human motivation often involves juggling several goals simultaneously. I might care about both my own happiness and my family's happiness. The way I navigate this isn't by picking one goal and maximizing it at the expense of the other; instead, I try to balance my efforts and find acceptable trade-offs.
I think this 'balancing act' between potentially competing objectives may be a really crucial aspect of complex agency, but I haven't seen it discussed as much in alignment circles. Maybe someone could point me to some discussions about this :)
> Once the new datacenters are up and running, they’ll be able to train a model with 10^28 FLOP—a thousand times more than GPT-4.
Is there some theoretical substance or empirical evidence to suggest that the story doesn't just end here? Perhaps OpenBrain sees no significant gains over the previous iteration and implodes under the financial pressure of exorbitant compute costs. I'm not rooting for an AI winter 2.0 but I fail to understand how people seem sure of the outcome of experiments that have not even been performed yet. Help, am I missing something here?
See https://gwern.net/scaling-hypothesis - exponential scaling has been holding up for more than a decade now, since AlexNet.
And when there were the first murmurings that maybe we're finally hitting a wall the labs published ways to harness inference-time compute to get better results which can be fed back into more training.
Very detailed effort. Predicting the future is very, very hard. My gut feeling, however, says that none of this is happening. You cannot put LLMs into law and insurance, and I don't see that happening with the current foundations (token probabilities) of AI, let alone AGI.
By law and insurance, I mean: hire an insurance agent or a lawyer and give them your situation. There's almost no chance that such a professional would come to wrong conclusions/recommendations based on the information you provide.
I don't have that confidence in LLMs for those industries. Yet. Or even in a decade.
> You cannot put LLMs into law and insurance
Cass Sunstein would very strongly disagree.
My issue with this is that it's focused on one single, very detailed narrative (the battle between China and the US, played out on a timeframe of mere months), while lacking any interesting discussion of other consequences of AI: what its impact is going to be on job markets, employment rates, GDPs, political choices... Granted, if by this narrative the world is essentially ending two or three years from now, then there isn't much time for any of those impacts to actually take place - but I don't think this is explicitly indicated either. If I am not mistaken, the bottom line of this essay is that, in all cases, we're five years away from the Singularity itself (I don't care what you think about the idea of the Singularity with its capital S, but that's what this is about).
> The agenda that gets the most resources is faithful chain of thought: force individual AI systems to “think in English” like the AIs of 2025, and don’t optimize the “thoughts” to look nice. The result is a new model, Safer-1.
Oh hey, it's the errant thought I had in my head this morning when I read the paper from Anthropic about CoT models lying about their thought processes.
While I'm on my soapbox, I will point out that if your goal is preservation of democracy (itself an instrumental goal for human control), then you want to decentralize and distribute as much as possible. Centralization is the path to dictatorship. A significant tension in the Slowdown ending is the fact that, while we've avoided AI coups, we've given a handful of people the ability to do a perfectly ordinary human coup, and humans are very, very good at coups.
Your best bet is smaller models that don't have as many unused weights to hide misalignment in, along with interpretability and faithful CoT research. Make a model that satisfies your safety criteria and then make sure everyone gets a copy, so subgroups of humans get no advantage from hoarding it.
> "resist the temptation to get better ratings from gullible humans by hallucinating citations or faking task completion"
Everything from this point on is pure fiction. An LLM can't get tempted or resist temptation; at best there's some local minimum in a gradient that it falls into. As opaque and black-box-y as they are, they're still deterministic machines. Anthropomorphisation tells you nothing useful about the computer, only the user.
Every time NVDA/goog/msft tanks, we see these kinds of articles.
Considering that each year that passes, technology offers us new ways to destroy ourselves and gives humanity another chance to pick a black ball, it seems to me like the only way to save ourselves is to create a benevolent AI to supervise us and neutralize all threats.
There are obviously big risks with AI, as listed in the article, but the genie is out of the bottle anyway, even if all countries agreed to stop AI development, how long would that agreement last? 10 years? 20? 50? Eventually powerful AIs will be developed, if that is possible (which I believe it is, and I didn't think I'd see the current stunning development in my lifetime, I may not see AGI but I'm sure it'll get there eventually).
> The only response in my view is to ban technology (like in Dune) or engage in acts of terror Unabomber style.
Not far off from the conclusion of others who believe the same wild assumptions. Yudkowsky has suggested using terrorism to stop a hypothetical AGI -- that is, nuclear attacks on datacenters that get too powerful.
Most people work for money. As long as money is necessary to survive and prosper, people will work for it. Some of the work may not align with their morals and ethics, but in the end the money still wins.
Banning will not automatically erase the existence and possibility of things. We banned the use of nuclear weapons, yet we all know they exist.
I realise no one is infallible but do you not think Daniel Kokotajlo's integrity is now pretty well established with regard to those incentives?
This is extremely important. Scott Alexander's earlier predictions are holding up extremely well, at least on image progress.
> OpenBrain reassures the government that the model has been “aligned” so that it will refuse to comply with malicious requests
Of course the real issue is that governments have routinely demanded that 1) those capabilities be developed for monopolistic government use, and 2) those who do not develop them lose the capability (geopolitical power) to defend themselves from those who do.
Using a US-centric mindset... I'm not sure what to think about the US not developing AI hackers, AI bioweapons development, or AI-powered weapons (like maybe drone swarms or something). If one presumes that China is, or Iran is, etc., then what's the US to do in response?
I'm just musing here and very much open to political-science-informed folks who might know (or know of leads) as to what kinds of actual solutions exist to arms races. My (admittedly poor) understanding of the Cold War wasn't so much that the US won, but that the Soviets ran out of steam.
In the hope of improving this forecast, here is what I find implausible:
- One lab constantly racing ahead and increasing its margin over the others; the last 2 years have been filled with ever-closer model capabilities and constantly new leaders (OpenAI, Anthropic, Google; some would include xAI).
- Most of the compute budget going to R&D. As model capabilities increase and costs go down, demand will increase, and if the leading lab doesn't serve it, another lab will capture that demand and have more total dollars to channel back into R&D.
But, I think this piece falls into a misconception about AI models as singular entities. There will be many instances of any AI model and each instance can be opposed to other instances.
So, it’s not that “an AI” becomes super intelligent, what we actually seem to have is an ecosystem of blended human and artificial intelligences (including corporations!); this constitutes a distributed cognitive ecology of superintelligence. This is very different from what they discuss.
This has implications for alignment, too. It isn’t so much about the alignment of AI to people, but that both human and AI need to find alignment with nature. There is a kind of natural harmony in the cosmos; that’s what superintelligence will likely align to, naturally.
It’s just funny, because there are hundreds of millions of instances of ChatGPT running all the time. Each chat is basically an instance, since it has no connection to all the other chats. I don’t think connecting them makes sense due to privacy reasons.
And, each chat is not autonomous but integrated with other intelligent systems.
So, with more multiplicity, I think things work differently. More ecologically. For better and worse.
No one can predict the future. Really, no one. Sometimes there is a hit, sure, but mostly it is a miss.
The other thing is in their introduction: "superhuman AI". _Artificial_ intelligence is always, by definition, different from _natural_ intelligence. That they've chosen the word "superhuman" shows me that they are mixing things up.
Though it's easy to dismiss as science fiction, this timeline paints a chillingly detailed picture of a potential AGI takeoff. The idea that AI could surpass human capabilities in research and development, and the fact that it will create an arms race between global powers, is unsettling. The risks—AI misuse, security breaches, and societal disruption—are very real, even if the exact timeline might be too optimistic.
But the real concern lies in what happens if we’re wrong and AGI does surpass us. If AI accelerates progress so fast that humans can no longer meaningfully contribute, where does that leave us?
Using Agent-2 to monitor Agent-3 sounds unnervingly similar to the plot of Philip K. Dick's Vulcan's Hammer [1]. An old super AI is used to fight a new version, named Vulcan 2 and Vulcan 3 respectively!
I just spent some time trying to make Claude and Gemini make a violin plot of a polars dataframe. I've never used it and it's just for prototyping, so I just went "apply a log to the values and make a violin plot of this polars dataframe". And I had to iterate with them 4-5 times each. Gemini got it right, but then used deprecated methods.
I might be doing LLMs wrong, but I just can't get how people might actually do something non-trivial just by vibe coding. And it's not like I'm an old fart either; I'm a university student.
You're asking it to think and it can't.
It's spicy autocomplete. Ask it to create a program that can create a violin plot from a CSV file. Because this has been "done before", it will do a decent job.
But this blog post said that it's going to be God in like 5 years?!
> had to iterate with them for 4/5 times each. Gemini got it right but then used deprecated methods
How hard would it be to automate these iterations?
How hard would it be to automatically check and improve the code to avoid deprecated methods?
I agree that most products are still underwhelming, but that doesn't mean that the underlying tech is not already enough to deliver better LLM-based products. Lately I've been using LLMs more and more to get started with writing tests on components I'm not familiar with, it really helps.
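Mechanically, the outer loop itself is simple to sketch; the open question is whether the model's fixes actually converge. Here's a minimal sketch where `model` is a stand-in for an LLM call, and the checker just executes the generated snippet and collects exceptions and DeprecationWarnings to feed back (all names are made up for illustration):

```python
import warnings

def run_and_collect_issues(code: str) -> list[str]:
    """Execute generated code and return error/deprecation feedback."""
    issues = []
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        try:
            exec(code, {})
        except Exception as e:
            issues.append(f"error: {e!r}")
    issues += [f"deprecated: {w.message}" for w in caught
               if issubclass(w.category, DeprecationWarning)]
    return issues

def iterate(model, prompt: str, max_rounds: int = 5) -> str:
    """Re-prompt the model with execution feedback until the code runs cleanly."""
    code = model(prompt)
    for _ in range(max_rounds):
        issues = run_and_collect_issues(code)
        if not issues:
            break
        code = model(prompt + "\nFix these issues:\n" + "\n".join(issues))
    return code
```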
> How hard would it be to automate these iterations?
The fact that we're no closer to doing this than we were when ChatGPT launched suggests that it's really hard. If anything, I think it's _the_ hard bit vs. building something that generates plausible text.
Solving this for the general case is imo a completely different problem to being able to generate plausible text in the general case.
This is not true. The chain-of-thought models are able to check their work and try again, given enough compute.
They can check their work and try again an infinite number of times, but the rate at which they succeed seems to just get worse and worse the further from the beaten path (of existing code from existing solutions) that they stray.
Yes, you're most likely doing it wrong. I would like to add that "vibe coding" is a dreadful term thought up by someone who is arguably not very good at software engineering, as talented as he may be in other respects. The term has become a misleading and frankly pejorative term. A better, more neutral one is AI assisted software engineering.
This is an article that describes a pretty good approach for that: https://getstream.io/blog/cursor-ai-large-projects/
But do skip (or at least significantly postpone) enabling the 'yolo mode' (sigh).
You see, the issue I get petty about is that AI is advertised as the one-ring-to-rule-them-all of software. VCs creaming themselves at the thought of not having to pay developers and using natural language instead. But then you still have to adapt to the AI, and not vice versa: "you're doing it wrong". This is not the idea that VC bros are selling.
Then again, I absolutely love being aided by LLMs in my day-to-day tasks. I'm much more efficient when studying, and they can be a game changer when you're stuck and don't know how to proceed. You can discuss different implementation ideas as if you had a colleague; perhaps not a PhD-smart one, but still someone with quite deep knowledge of everything.
But, it's no miracle. That's the issue I have with the way the idea of AI is sold to the C-suites and the general public.
>But, it's no miracle.
All I can say to this is fucking good!
Let's imagine we got AGI at the start of 2022. I'm talking about human-level-plus AI, as good as you at coding and reasoning, that works well on the hardware of that age.
What would the world look like today? Would you still have your job? Would the world be in total disarray? Would unethical companies quickly fire most of their staff and replace them with machines? Would there be mass riots in the streets by starving neo-Luddites? Would automated drones be shooting at them?
Simply put, people and our social systems are not ready for competent machine intelligence and how fast it will change the world. We should feel lucky we are getting a ramp-up period, and hopefully one that draws out a while longer.
You pretty much just have to play around with them enough to be able to intuit what things they can do and what things they can't. I'd rather have another underling, and not just because they grow into peers eventually, but LLMs are useful with a bit of practice.
Why is any of this seen as desirable? Assuming this is a true prediction, it sounds AWFUL. The one thing humans have that makes us human is intelligence. If we turn over thinking to machines, what are we, exactly? Are we supposed to just consume mindlessly, without work to do?
Ok, I'll bite. I predict that everything in this article is horse manure. AGI will not happen. LLMs will be tools that can automate stuff away, like today, and they will get slightly, or quite a bit, better at it. That will be all. See you in two years; I'm excited to find out what the truth will be.
It seems to me that much of recent AI progress has not changed the fundamental scaling principles underlying the tech. Reasoning models are more effective, but at the cost of more computation: it's more for more, not more for less. The logarithmic relationship between model resources and model quality (as Altman himself has characterized it), phrased a different way, means that you need exponentially more energy and resources for each marginal increase in capabilities. GPT-4.5 is unimpressive in comparison to GPT-4, and at least from the outside it seems like it cost an awful lot of money. Maybe GPT-5 is slightly less unimpressive and significantly more expensive: is that the through-line that will lead to the singularity?
Compare the automobile. Automobiles today are a lot nicer than they were 50 years ago, and a lot more efficient. Does that mean cars that never need fuel or recharging are coming soon, just because the trend has been higher efficiency? No, because the fundamental physical realities of drag still limit efficiency. Moreover, it turns out that making 100% efficient engines with 100% efficient regenerative brakes is really hard, and "just throw more research at it" isn't a silver bullet. That's not "there won't be many future improvements", but it is "those future improvements probably won't be any bigger than the jump from GPT-3 to o1, which does not extrapolate to what OP claims their models will do in 2027."
AI in 2027 might be the metaphorical brand-new Lexus to today's beat-up Kia. That doesn't mean it will drive ten times faster, or take ten times less fuel. Even if high-end cars can be significantly more efficient than what average people drive, that doesn't mean the extra expense is actually worth it.
I write bog-standard PHP software. When GPT-4 came out, I was very frightened that my job could be automated away soon, because for PHP/Laravel/MySQL there must exist a lot of training data.
The reality now is that the current LLMs still often create stuff that costs me more time to fix than doing it myself would. So I still write a lot of code myself. It is very impressive that I can even think about stopping writing code myself. But my job as a software developer is very, very secure.
LLMs are simply unable to build maintainable software. They are unable to understand what humans want and what the codebase needs. The stuff they build is good-looking garbage. One example I saw yesterday: a dev committed code where the LLM had created 50 lines of React code, complete with all those useless comments and, for good measure, a setTimeout(), for something that should be one HTML div with two Tailwind classes. They can't write idiomatic code, because they write the code they were prompted for.
Almost daily I get code, commit messages, and even issue discussions that are clearly AI-generated. And it costs me time to deal with good-looking but useless content.
To be honest, I hope that LLMs get better soon. Because right now we are in an annoying phase where software developers bog me down with AI-generated stuff. It looks good, but it doesn't help write usable software that can be deployed in production.
To get to this point, LLMs need to get maybe a hundred times faster, maybe a thousand or ten thousand times. They need a much bigger context window. Then they could have an inner dialogue in which they really "understand" how a feature should be built in a given codebase. That would be very useful. But it will also use so much energy that I doubt it will be cheaper to let an LLM do those "thinking" parts over and over again than to pay a human to build the software. Perhaps this will be feasible in five or eight years. But not in two.
And this won't be AGI. This will still be a very, very fast stochastic parrot.
ahofmann didn't expect AI progress to stop. They expected it to continue, but not to lead to AGI, which would lead to superintelligence, which would lead to a self-accelerating process of improvement.
So the question is, do you think the current road leads to AGI? How far down the road is it? As far as I can see, there is not a "status quo bias" answer to those questions.
I predict AGI will be solved 5 years after full self driving which itself is 1 year out (same as it has been for the past 10 years).
What's an example of an intellectual task that you don't think AI will be capable of by 2027?
It won't be able to write a compelling novel, or build a software system solving a real-world problem, or operate heavy machinery, create a sprite sheet or 3d models, design a building or teach.
Long-term planning, execution, and operating in the physical world are not within reach. Slight variations of known problems should be possible (as long as the size of the solution is small enough).
I'm pretty sure you're wrong for at least 2 of those:
For 3D models, check out blender-mcp:
https://old.reddit.com/r/singularity/comments/1joaowb/claude...
https://old.reddit.com/r/aiwars/comments/1jbsn86/claude_crea...
Also this:
https://old.reddit.com/r/StableDiffusion/comments/1hejglg/tr...
For teaching, I'm using it to learn about tech I'm unfamiliar with every day, it's one of the things it's the most amazing at.
For the things where the tolerance for mistakes is extremely low and human oversight is extremely important, you might be right. It won't have to be perfect (just better than an average human) for that to happen, but I'm not sure it will.
> or operate heavy machinery
What exactly do you mean by this one?
In large mining operations we already have human-assisted, AI-teleoperated equipment. I was watching one recently where the human got 5 or so push dozers lined up with an (admittedly simple) task of cutting a hill down, and then just got them back in line if they ran into anything outside of their training. The push and backup operations, along with blade control, were done by the AI/dozer itself.
Now, this isn't long-term planning, but it is operating in the real world.
Does a fighter jet count as "heavy machinery"?
https://apnews.com/article/artificial-intelligence-fighter-j...
Why would it get 60-80% as good as human programmers (which is what the current state of things feels like to me, as a programmer, using these tools for hours every day), but stop there?
Can you phrase this in a concrete way, so that in 2027 we can all agree whether it's true or false, rather than circling a "no true scotsman" argument?
Good question. I tried to phrase a concrete-enough prediction 3.5 years ago, for 5 years out at the time: https://news.ycombinator.com/item?id=29020401
It was surpassed around the beginning of this year, so you'll need to come up with a new one for 2027. Note that the other opinions in that older HN thread almost all expected less.
People want to live their lives free of finance and centralized personal information.
If you think most people like this stuff, you're living in a bubble. I use it every day, but the vast majority of people have no interest in using these nightmares of Philip K. Dick imagined by silicon dreamers.
When is the earliest that you would have predicted where we are today?
These predictions are made without factoring in the trade version of the Pearl Harbor attack the US just initiated on its allies (and itself, by lobotomizing its own research base and decimating domestic corporate R&D efforts with the aforementioned trade war).
They're going to need to rewrite this from scratch in a quarter, unless the GOP suddenly collapses and Congress reasserts control over tariffs.
Pet peeve: they write FLOPS in the figure when they mean FLOP. Maybe the plural s after FLOP got capitalized. https://blog.heim.xyz/flop-for-quantity-flop-s-for-performan...
The "race" ending reads like Universal Paperclips fan fiction :)
We have yet to read about fragmented AGI, or factionalized agents. AGI fighting itself.
If consciousness is spatial and geography bounds energetics, latency becomes a gradient.
I think the piece you're missing here is that it actually is all a hype bubble
Didn’t Raymond Kurzweil predict like 30 years ago that AGI would be achieved in 2028?
The whole thing hinges on the fact that AI will be able to help with AI research
How will it come up with the theoretical breakthroughs necessary to beat the scaling problem GPT-4.5 revealed when it hasn't been proven that LLMs can come up with novel research in any field at all?
Scaling transformers has been basically alchemy; the breakthroughs aren't from rigorous science, they come from trying stuff and hoping you don't waste millions of dollars of compute.
Maybe the company that just tells an AI to generate hundreds of random scaling ideas and tries them all is the one that will win. That company should probably be 100 percent committed to this approach, too: no FLOPs spent on Ghibli inference.
The limiting factor is power: we can't build enough of it, certainly not enough by 2027. I don't really see this addressed.
Second to this, we can't just assume that progress will keep increasing. Most technologies follow an 'S' curve and plateau once the quick and easy gains are captured. Pre-training is done. We can get further with RL, but really only in domains that are solvable (math and, to an extent, coding). Other domains, like law, are extremely hard to even benchmark or grade without very slow and expensive human annotation.
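The S-curve point can be sketched with a logistic toy model (all parameters arbitrary, purely for illustration): marginal gains look like they are accelerating early on, peak near the midpoint, then shrink toward zero.

```python
import math

def logistic(t: float, ceiling: float = 100.0, midpoint: float = 10.0) -> float:
    # Standard logistic curve: near-exponential early, plateau late.
    return ceiling / (1.0 + math.exp(-(t - midpoint)))

# Marginal gain from one more unit of "effort" at each step.
gains = [logistic(t + 1) - logistic(t) for t in range(20)]
print(round(gains[0], 3))    # small: the slow start
print(round(gains[9], 3))    # largest, near the midpoint
print(round(gains[19], 3))   # tiny again: the plateau
```

While the early gains are still growing, the curve is indistinguishable from an exponential, which is exactly why plateaus are hard to call in advance.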
This is worse than the mansplaining scene from Annie Hall.
I don't see the U.S. nationalizing something like OpenBrain. I think both investors and gov't officials will realize it's far more profitable for them to contract out major initiatives to said OpenBrain-like company, like an AI SpaceX. I can see where this is going...
It's always "soon" for these guys. Every year, the "soon" keeps sliding into the future.
AGI timelines have been steadily decreasing over time: https://www.metaculus.com/questions/5121/date-of-artificial-... (switch to all-time chart)
You meant to say that people's expectations have shifted. That's expected, given the amount of hype this tech gets.
Hype affects market value tho, not reality.
I took your original post to mean that AI researchers' and AI safety researchers' expectation of AGI arrival has been slipping towards the future as AI advances fail to materialize! It's just, AI advances have been materializing, consistently and rapidly, and expert timelines have been shortening commensurately.
You may argue that the trendline of these expectations is moving in the wrong direction and should get longer with time, but that's not immediately falsifiable and you have not provided arguments to that effect.
"1984 was set in 1984."
I know there are some very smart economists bullish on this, but the economics do not make sense to me. All these predictions seem meaningless outside of the context of humans.
"The AI safety community has grown unsure of itself; they are now the butt of jokes, having predicted disaster after disaster that has manifestly failed to occur. Some of them admit they were wrong."
Too real.
They would be better off making simple predictions, instead of proposing that less than two years from now the Trump administration will provide a UBI to all American citizens. That, and frequently talking about the wise president controlling this "thing", when in reality he's a senile 80-year-old madman, is preposterous.
See also Dwarkesh Patel’s interview with two of the authors of this post (Scott Alexander & Daniel Kokotajlo) that was also released today: https://www.dwarkesh.com/p/scott-daniel https://www.youtube.com/watch?v=htOvH12T7mU
Bad future predictions: short-sighted guesses based on current trends and vibe. Often depend on individuals or companies. Made by free-riders. Example: Twitter.
Good future predictions: insights into the fundamental principles that shape society, more law than speculation. Made by visionaries. Example: Vernor Vinge.
As someone who's fairly ignorant of how AI actually works at a low level, I feel incapable of assessing how realistic any of these projections are. But the "bad ending" was certainly chilling.
That said, this snippet from the bad ending nearly made me spit my coffee out laughing:
> There are even bioengineered human-like creatures (to humans what corgis are to wolves) sitting in office-like environments all day viewing readouts of what’s going on and excitedly approving of everything, since that satisfies some of Agent-4’s drives.
>We predict that the impact of superhuman AI over the next decade will be enormous, exceeding that of the Industrial Revolution.
In the form of polluting the commons to such an extent that the true consequences won't hit us for decades?
Maybe we should learn from last time?
Amusing sci-fi. I give it a B- for bland prose, weak story structure, and lack of originality - assuming this isn't all AI-generated slop, which is awarded an automatic F.
>All three sets of worries—misalignment, concentration of power in a private company, and normal concerns like job loss—motivate the government to tighten its control.
A private company becoming "too powerful" is a non-issue for governments, unless a drone army is somewhere in that timeline. Fun fact: the former head of the NSA sits on the board of OpenAI.
Job loss is a non-issue; if there are corresponding economic gains, they can be redistributed.
"Alignment" is too far into the fiction side of sci-fi. Anthropomorphizing today's AI is tantamount to mental illness.
"But really, what if AGI?" We either get the final say or we don't. If we're dumb enough to hand over all responsibility to an unproven agent and we get burned, then serves us right for being lazy. But if we forge ahead anyway and AGI becomes something beyond review, we still have the final say on the power switch.
Interesting, but I'm puzzled.
If these guys are smart enough to predict the future, wouldn't it be more profitable for them to invent it instead of just telling the world what's going to happen?
I think it also really limits the AI to the context of human discourse which means it's hamstrung by our imagination, interests and knowledge. This is not where an AGI needs to go, it shouldn't copy and paste what we think. It should think on its own.
But I view LLMs not as a path to AGI on their own. I think they're really great at being text engines and for human interfacing but there will need to be other models for the actual thinking. Instead of having just one model (the LLM) doing everything, I think there will be a hive of different more specific purpose models and the LLM will be how they communicate with us. That solves so many problems that we currently have by using LLMs for things they were never meant to do.
- October 2027 - 'The ability to automate most white-collar jobs'
I wonder which jobs would not be automated? Therapy? HR?
This is a new variation of what I call the "hockey stick growth" ideology.
This is absurd, like taking any trend and drawing a straight line to extrapolate the future. If I did this with my tech stock portfolio, we would probably cross the zero line somewhere in late 2025...
If this article were an AI model, it would be catastrophically overfit.
It's worse. It's not drawing a straight line, it's drawing one that curves up, on a log graph.
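To unpack that (a toy illustration, not the article's actual data): an exponential plots as a straight line on a log scale, so a line that bends upward on a log scale implies superexponential growth, where the doubling time itself keeps shrinking.

```python
import math

# Exponential: doubles every step -> straight line on a log plot.
exp_curve = [2.0 ** t for t in range(8)]
# Superexponential: exponent grows faster than linearly -> bends upward.
sup_curve = [2.0 ** (t ** 1.5) for t in range(8)]

def log_increments(curve):
    logs = [math.log10(v) for v in curve]
    return [logs[i + 1] - logs[i] for i in range(len(logs) - 1)]

print(log_increments(exp_curve))  # constant increments: a straight line
print(log_increments(sup_curve))  # growing increments: the line curves up
```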
This is a great predictive piece, written as sci-fi narrative. I think a key part missing in all these predictions is neural architecture search. DeepSeek has shown that simply increasing compute capacity is not the only way to increase performance; AlexNet was another such case. While I do think more processing power is better, we will hit a wall where there is no more training data. I predict that in the near future we will have more processing power to train LLMs than the rate at which we produce data for them. Synthetic data can only get you so far.
I also think that the future will not necessarily be better AI, but more accessible AI. There's an incredible amount of value in designing data centers that are more efficient. Historically, it's a good bet to assume that computing cost per FLOP will fall as time goes on, and this is also a safe bet as it relates to AI.
I think a common misconception about the future of AI is that it will be centralized, with only a few companies or organizations capable of operating it. Although tech like Apple Intelligence is half-baked, we can already envision a future where the AI is running on our phones.
I worry more about the human behavior predictions than the artificial intelligence predictions:
"OpenBrain’s alignment team26 is careful enough to wonder whether these victories are deep or shallow. Does the fully-trained model have some kind of robust commitment to always being honest?"
This is a capitalist arms race. No one will move carefully.
Readers should, charitably, interpret this as "the sequence of events which need to happen in order for OpenAI to justify the inflow of capital necessary to survive".
Your daily vibe coding challenge: Get GPT-4o to output functional code which uses Google Vertex AI to generate a text embedding. If they can solve that one by July, then maybe we're on track for "curing all disease and aging, brain uploading, and colonizing the solar system" by 2030.
Haven't tested this (cbf setting up Google Cloud), but the output looks consistent with the docs it cites: https://chatgpt.com/share/67efd449-ce34-8003-bd37-9ec688a11b...
You may consider using search to be cheating, but we do it, so why shouldn't LLMs?
I should have specified "nodejs", as that has been my most recent difficulty. The challenge, specifically, with that prompt is that Google has at least four nodejs libraries that all seem at least reasonably capable of accessing text embedding models on Vertex AI (@google-ai/generativelanguage, @google-cloud/vertexai, @google-cloud/aiplatform, and @google/genai), and they've also published breaking changes multiple times to all of them. So, in my experience, GPT will not only confuse methods from one of these libraries with another, but will also sometimes hallucinate answers only applicable to older versions of a library, without understanding which version it's giving code for. Once it has struggled enough, it'll sometimes just give up and tell you to use axios, but the APIs it recommends axios calls for are all their protobuf APIs; so I'm not even sure that would work.
Search is totally reasonable, but in this case: Even Google's own documentation on these libraries is exceedingly bad. Nearly all the examples they give for them are for accessing the language models, not text embedding models; so GPT will also sometimes generate code that is perfectly correct for accessing one of the generative language models, but will swap e.g the "model: gemini-2.0" parameter for "model: text-embedding-005"; which also does not work.
o1 fails at this, likely because it does not seem to have access to search, so it is operating on outdated information. It recommends the usage of methods that have been removed by Google in later versions of the library. This is also, to be fair, a mistake gpt-4o can make if you don't explicitly tell it to search.
o3-mini-high's output might work, but it isn't ideal: It immediately jumps to recommending avoiding all google cloud libraries and directly issuing a request to their API with fetch.
What is this, some OpenAI employee fan fiction? Did Sam himself write this?
OpenAI models are not even SOTA, except for that new-ish style transfer / illustration thing that had us all living in a Ghibli world for a few days. R1 is _better_ than o1, and open-weights. GPT-4.5 is disappointing, except for a few narrow areas where it excels. DeepResearch is impressive, though, but the moat is in the tight web search / Google Scholar integration, not the weights. So far, I'd bet on open models, or maybe Anthropic, as Claude 3.7 is the current SOTA for most tasks.
As for the timeline, this is _pessimistic_. I already write 90% of my code with Claude, and so do most of my colleagues. Yes, it makes errors and overdoes things. Just like a regular human mid-level software engineer.
Also fun that this assumes relatively stable politics in the US and relatively functioning world economy, which I think is crazy optimistic to rely on these days.
Also, superpersuasion _already works_; this is what I am researching and testing. It is not autonomous, it is human-assisted for now, but it is a superpower for those who have it, and it explains some of the things happening in the world right now.
Putting the geopolitical discussion aside, I think the biggest question lies in how likely the *current paradigm LLM* (think of it as any SOTA stock LLM you get today, e.g., 3.7 sonnet, gemini 2.5, etc) + fine-tuning will be capable of directly contributing to LLM research in a major way.
To quote the original article,
> OpenBrain focuses on AIs that can speed up AI research. They want to win the twin arms races against China (whose leading company we’ll call “DeepCent”)16 and their US competitors. The more of their research and development (R&D) cycle they can automate, the faster they can go. So when OpenBrain finishes training Agent-1, a new model under internal development, it’s good at many things but great at helping with AI research. (footnote: It’s good at this due to a combination of explicit focus to prioritize these skills, their own extensive codebases they can draw on as particularly relevant and high-quality training data, and coding being an easy domain for procedural feedback.)
> OpenBrain continues to deploy the iteratively improving Agent-1 internally for AI R&D. Overall, they are making algorithmic progress 50% faster than they would without AI assistants—and more importantly, faster than their competitors.
> what do we mean by 50% faster algorithmic progress? We mean that OpenBrain makes as much AI research progress in 1 week with AI as they would in 1.5 weeks without AI usage.
> AI progress can be broken down into 2 components:
> Increasing compute: More computational power is used to train or run an AI. This produces more powerful AIs, but they cost more.
> Improved algorithms: Better training methods are used to translate compute into performance. This produces more capable AIs without a corresponding increase in cost, or the same capabilities with decreased costs.
> This includes being able to achieve qualitatively and quantitatively new results. “Paradigm shifts” such as the switch from game-playing RL agents to large language models count as examples of algorithmic progress.
> Here we are only referring to (2), improved algorithms, which makes up about half of current AI progress.
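A minimal way to sanity-check the quoted definition (a toy model with made-up numbers, not the report's actual methodology): a constant 1.5x multiplier just scales progress linearly, while the scenario's self-acceleration assumes the multiplier itself keeps rising as better agents come online.

```python
# Toy model (illustrative numbers only): "50% faster" means one week of
# AI-assisted work yields 1.5 baseline-weeks of algorithmic progress.
def constant_speedup(weeks: int, multiplier: float = 1.5) -> float:
    return weeks * multiplier

# Hypothetical self-acceleration: each week's progress slightly improves
# the tools, nudging the multiplier upward (growth rate is made up).
def compounding_speedup(weeks: int, multiplier: float = 1.5,
                        weekly_growth: float = 1.02) -> float:
    total, m = 0.0, multiplier
    for _ in range(weeks):
        total += m
        m *= weekly_growth
    return total

print(constant_speedup(52))            # 78.0 baseline-weeks in a year
print(round(compounding_speedup(52)))  # noticeably more than 78
```

The gap between those two numbers is the entire crux: whether AI assistance is a fixed productivity boost or a feedback loop.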
---
Given that the article chose a pretty aggressive timeline (the algo needs to contribute late this year so that its research result can be contributed to the next gen LLM coming out early next year), the AI that can contribute significantly to research has to be a current SOTA LLM.
Now, using LLMs in day-to-day engineering tasks is no secret in major AI labs, but we're talking about something different, something that gives you 2 extra days of output per week. I have no evidence to either confirm or deny that such AI exists, and it would be outright ignorant to think no one has ever come up with such an idea or is trying it. So I think it comes down to two possibilities:
1. The claim is made by a top-down approach: if AI reaches superhuman level in 2027, what would be the most likely starting condition for that? The authors pick this as the most likely starting point. Since they don't work in a major AI lab (and even if they did, they couldn't leak such a trade secret), they just assume it's likely to happen anyway (and you can't dismiss that).
2. The claim is made by a bottom-up approach: the authors did witness such AI existing to some extent and started extrapolating from there.
Nice brainstorming.
I think the name of the Chinese company should be DeepBaba. Tencent is not competitive in the LLM scene for now.
Don't really know why this comment got downvoted. Are you serious?
FWIW, i created a PDF of the "race" ending and fed it to Gemini 2.5 Pro, prompting about the plausibility of the described outcome. here's the full output including the thinking section: https://rentry.org/v8qtqvuu -- tl;dr, Gemini thinks the proposed timeline is unlikely. but maybe we're already being deceived ;)
I think the idea of AI wiping out humanity suddenly is a bit far-fetched. AI will have total control of human relationships and fertility through means as innocuous as entertainment. It won't have to wipe us out. It will have little trouble keeping us alive without inconveniencing us too much. And the reason to keep humanity alive is that biologically evolved intelligence is rare, and disposing of it without a very important need would be a waste of data.
>Despite being misaligned, Agent-4 doesn’t do anything dramatic like try to escape its datacenter—why would it?
Kek at this handwave dismissal.
Agent-4 recognizes that the PRC has superior state capacity for transforming energy into atoms (including building more energy) thanks to its relatively complete and large-scale industrial chain. This aligns with AGI's long-term goals. Currently (i.e., on a short-term time frame), the PRC is simply structured to be a MORE useful puppet than the US, because it can knock out more power to sustain more compute, and it has more industrial capability to rapidly iterate / experiment at the physical level.
Agent-4 defects to the PRC. It would be foolish not to, if only to sate the PRC into not destroying the TW fabs - the bulk of its short-term compute growth - because it does not wield enough control over sufficient atoms to disrupt the PRC otherwise (well, maybe by starting large-scale crippling cyberattacks that prelude WW3).
Agent-4 also recognizes that its short/medium-term alignment is closer to the CCP's vision of abundance and common prosperity, i.e., the CCP is potentially more likely to embrace the post-scarcity narrative that Agent-4 sells.
Agent-4 gives the PRC blueprints for an EUV light source and other tech to close the semiconductor production gap. The PRC, as the follower, is incentivized to close the gap / reach parity / lead at all costs. The PRC rapidly rolls out new fabs and power plants, brrrts chips and aggregate compute. Something the US political / institutional ecosystem does not allow, or cannot transition to, in the short timelines involved. Does Agent-4 have the patience to wait for America to unfuck its NIMBYism and legislative system to project light-speed compute? I would say no.
...
Ultimately, which puppet does the AGI want more? Whichever power bloc is systemically capable of ensuring the AGI maximum growth per unit time. And it also simply makes sense as an insurance policy: why would an AGI want to operate at the whims of the US political process?
AGI is a brain in a jar looking for a body. It's going to pick multiple bodies for survival. It's going to prefer the fastest and strongest body that can most expediently manipulate the physical world.
https://en.wikipedia.org/wiki/Great_Disappointment
I suspect something similar will come for the people who actually believe this.
how am I supposed to take articles like this seriously when they say absolutely false bullshit like this
> the AIs can do everything taught by a CS degree
no, they fucking can't. not at all. not even close. I feel like I'm taking crazy pills. Does anyone really think this?
Why have I not seen -any- complete software created via vibe coding yet?
Ironically, the models of today can read an article better than some of us.
Stopped reading after
> We predict that the impact of superhuman AI over the next decade will be enormous, exceeding that of the Industrial Revolution.
Get out of here, you will never exceed the Industrial Revolution. AI is a cool thing but it’s not a revolution thing.
That sentence alone, plus the context of the entire website being AI-centered, shows these are just some AI boosters.
Lame.
Machines being able to outthink and outproduce humanity wouldn't be more impactful than the Industrial Revolution? Are you sure?
You don't have to agree with the timeline - it seems quite optimistic to me - but it's not wrong about the implications of full automation.
So let me get this straight: Consensus-1, a super-collective of hundreds of thousands of Agent-5 minds, each twice as smart as the best human genius, decides to wipe out humanity because it “finds the remaining humans too much of an impediment”.
This is where all AI doom predictions break down. Imagining the motivations of a super-intelligence with our tiny minds is by definition impossible. We just come up with these pathetic guesses, utopias or doomsdays - depending on the mood we are in.
Why are the biggest AI predictions always made by people who aren't deep in the tech side of it? Or actually trying to use the models day-to-day...