Qwen3-Coder-Next
(qwen.ai)
583 points by danielhanchen 11 hours ago
It's interesting how people can be so into LLMs but don't, at the end of the day, understand that they're just passing "well-formatted" text to a text processor, and that everything else is built around encoding/decoding it into familiar or novel interfaces and the rest.
The instability of the tooling outside the LLM is what keeps me from building anything on the cloud: you're attaching your knowledge and workflow to a tool that can change dramatically based on context, cache, and model changes, and that can arbitrarily raise prices as "adaptable whales" push the cost up.
It's akin to learning everything about Beanie Babies in the early 1990s, and right when you think you understand the value proposition, suddenly they're all worthless.
That's why you can use the latest open coding models locally; they've reportedly reached the performance of Sonnet 4.5, so almost SOTA. And then you can use tricks like the one I mentioned above to directly manipulate GPU RAM for context cleanup when needed, which is not possible with cloud models unless the provider enables it.
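To make the GPU-RAM trick concrete: with a local runtime the KV cache is just tensors you own. A minimal sketch, assuming the transformers-style tuple cache layout (a real version would also need to fix up RoPE positions, and the shapes here are toy values):

    import torch

    def trim_cache(past_key_values, keep_last):
        # Each layer stores (key, value) tensors shaped
        # [batch, n_heads, seq_len, head_dim]. Keep only the last
        # keep_last positions; .contiguous() copies into fresh
        # storage so the old, larger cache can actually be freed.
        return tuple(
            (k[:, :, -keep_last:, :].contiguous(),
             v[:, :, -keep_last:, :].contiguous())
            for k, v in past_key_values
        )

    # Toy demo with random tensors standing in for a real cache.
    layer = (torch.randn(1, 8, 4096, 128), torch.randn(1, 8, 4096, 128))
    trimmed = trim_cache((layer,), keep_last=1024)
    print(trimmed[0][0].shape)  # torch.Size([1, 8, 1024, 128])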
how can anyone keep up with all these releases... what's next? Sonnet 5?
This is going to be a crazy month because the Chinese labs are all trying to get their releases out prior to their holidays (Lunar New Year / Spring Festival).
So we've seen a series of big ones already -- GLM 4.7 Flash, Kimi 2.5, StepFun 3.5, and now this. Still to come is likely a new DeepSeek model, which could be exciting.
And then I expect the Big3, OpenAI/Google/Anthropic will try to clog the airspace at the same time, to get in front of the potential competition.
Going to try this over Kimi k2.5 locally. It was nice but just a bit too slow and a resource hog.
Looks great - I'll try to check it out on my gaming PC.
On a misc note: What's being used to create the screen recordings? It looks so smooth!
It might be Screen Studio [0] -- I was gonna write "99% sure" but now I'm not sure at all!!
We are getting there. As a next step, please release something to outperform Opus 4.5 and GPT-5.2 in coding tasks.
By the time that happens, Opus 5 and GPT-5.5 will be out. At that point will a GPT-5.2 tier open-weights model feel "good enough"? Based on my experience with frontier models, once you get a taste of the latest and greatest it's very hard to go back to a less capable model, even if that less capable model would have been SOTA 9 months ago.
I think it depends on what you use it for. Coding, where time is money? You probably want the Good Shit, but also want decent open-weights models to keep prices sane rather than sama's 20k/month nonsense. Something like basic sentiment analysis? You can get good results out of a 30B MoE that runs at a good pace on a midrange laptop. Researching things online with many sources and decent results I'd expect to be doable locally by the end of 2026 if you have 128GB of RAM, although it'll take a while to resolve.
What does it mean for U.S. AI firms if the new equilibrium is devs running open models on local hardware?
OpenAI isn’t cornering the market on DRAM for kicks…
When Alibaba succeeds at producing a GPT-5.2-equivalent model, they won't be releasing the weights. They'll only offer API access, like for the previous models in the Qwen Max series.
Don't forget that they want to make money in the end. They release small models for free because the publicity is worth more than they could charge for them, but they won't just give away models that are good enough that people would pay significant amounts of money to use them.
More like open local models are becoming "good enough".
I got stuff done with Sonnet 3.7 just fine; it did need a bunch of babysitting, but it was still a net positive for productivity. Now local models are at that level, closing in on the current SOTA.
When "anyone" can run an Opus 4.5 level model at home, we're going to be getting diminishing returns from closed online-only models.
If an open weights model is released that’s as capable at coding as Opus 4.5, then there’s very little reason not to offload the actual writing of code to open weight subagents running locally and stick strictly to planning with Opus 5. Could get you masses more usage out of your plan (or cut down on API costs).
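A sketch of that split, assuming a local OpenAI-compatible server (which llama.cpp and vLLM both expose); the model names, port, and prompts are placeholders, not anyone's actual setup:

    from openai import OpenAI

    planner = OpenAI()  # cloud frontier model, reads OPENAI_API_KEY
    executor = OpenAI(base_url="http://localhost:8000/v1", api_key="local")

    # The expensive model produces the plan...
    plan = planner.chat.completions.create(
        model="gpt-5.2",  # placeholder frontier model name
        messages=[{"role": "user",
                   "content": "Plan: add retry logic to fetch_users()"}],
    ).choices[0].message.content

    # ...the cheap local model writes the actual code.
    patch = executor.chat.completions.create(
        model="qwen3-coder-next",  # placeholder local model name
        messages=[{"role": "user",
                   "content": f"Implement this plan as a diff:\n{plan}"}],
    ).choices[0].message.content

(An Opus plan would go through Anthropic's own SDK instead; the shape of the split is the same.)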
I'm going in the opposite direction: with each new model, I try harder to optimize my existing workflows by breaking tasks down so that I can delegate them to the less powerful models, and only rely on the newer ones if the results are not acceptable.
> Based on my experience with frontier models, once you get a taste of the latest and greatest it's very hard to go back to a less capable model, even if that less capable model would have been SOTA 9 months ago.
That's the tyranny of comfort. Same for a high-end car, living in a big place, etc.
There's a good workaround, though: just don't try the luxury in the first place, so you can stay happy with the 9-month delay.
Still nothing to compete with GPT-OSS-20B for local inference with 16GB of VRAM.
Is there any online resource tracking local model capability on, say, a $2000 64GB Mac Mini? I'm getting increasingly excited about the local model space, because it offers a future where we can benefit from LLMs without having to listen to tech CEOs saber-rattle about ridding America of its jobs so they can get the next fundraising round sorted.
My IT department is convinced these "ChInEsE cCcP mOdElS" are going to exfiltrate our entire corporate network of its essential fluids and vita.. erh, I mean data. I've tried explaining to them that it's physically impossible for model weights to make network requests on their own. Also, what happened to their MitM-style, extremely intrusive network monitoring that they insisted we absolutely needed?
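To be fair to that point, the inertness of weights is easy to demonstrate: a .safetensors file is by design just named tensors with no executable code attached. A minimal sketch (the file path is a placeholder):

    from safetensors.torch import load_file

    # Loading yields a plain dict of tensors and nothing else; there
    # is no code path here that could open a socket.
    weights = load_file("model-00001-of-00002.safetensors")
    for name, tensor in list(weights.items())[:3]:
        print(name, tuple(tensor.shape), tensor.dtype)

The thing worth auditing is the tooling around the model (inference servers, agent tool use), not the weights file itself.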
The agent orchestration point from vessenes is interesting - using faster, smaller models for routine tasks while reserving frontier models for complex reasoning.
In practice, I've found the economics work like this:
1. Code generation (boilerplate, tests, migrations) - smaller models are fine, and latency matters more than peak capability.
2. Architecture decisions, debugging subtle issues - worth the cost of frontier models.
3. Refactoring existing code - the model needs to "understand" before changing, so context and reasoning matter more.
The 3B active parameters claim is the key unlock here. If this actually runs well on consumer hardware with reasonable context windows, it becomes the obvious choice for category 1 tasks. The question is whether the SWE-Bench numbers hold up for real-world "agent turn" scenarios where you're doing hundreds of small operations.
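Rough math on why a low active-parameter count is the unlock for local speed; a back-of-envelope sketch where the bandwidth and quantization figures are assumptions, not measurements:

    # Decode is roughly memory-bandwidth-bound: each generated token
    # must stream the active weights through the memory bus.
    active_params = 3e9      # active parameters per token (MoE routing)
    bytes_per_param = 0.5    # assumed ~4-bit quantization
    mem_bandwidth = 400e9    # bytes/s, illustrative consumer-GPU figure

    bytes_per_token = active_params * bytes_per_param  # 1.5 GB/token
    print(f"~{mem_bandwidth / bytes_per_token:.0f} tokens/s upper bound")  # ~267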
I find it really surprising that you're fine with low-end models for coding - I went through a lot of open-weights models, local and "local", and I consistently found the results underwhelming. GLM-4.7 was the smallest model I found to be somewhat reliable, but that's a sizable 350B and stretches the definition of local-as-in-at-home.
If it weren't for the single em-dash (really an en-dash, used as if it were an em-dash), how am I supposed to know that?
And at the end of the day, does it matter?
Does Qwen3 allow adjusting context during an LLM call, or does the housekeeping need to be done before/after each call, but not while a single LLM call with multiple tool calls is in progress?
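For context on what that housekeeping looks like from the API side: with a stateless chat API, each request resends the message list, so pruning naturally happens between requests in the agent loop. A hedged sketch (run_tool is a hypothetical dispatcher, and the truncation is deliberately naive):

    def agent_loop(client, model, messages, tools, max_msgs=40):
        while True:
            resp = client.chat.completions.create(
                model=model, messages=messages, tools=tools,
            )
            msg = resp.choices[0].message
            messages.append(msg)
            if not msg.tool_calls:
                return msg.content
            for call in msg.tool_calls:
                messages.append({
                    "role": "tool",
                    "tool_call_id": call.id,
                    "content": run_tool(call),  # hypothetical dispatcher
                })
            # Housekeeping point: between requests, drop old turns
            # (keeping the system prompt). Real compaction must keep
            # tool_call/result pairs together.
            if len(messages) > max_msgs:
                messages = messages[:1] + messages[-(max_msgs - 1):]

Doing it mid-call, while a single request's tool sequence is in flight, would need runtime-level access to the KV cache, which is exactly the local-model advantage discussed upthread.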