Claude Code is suddenly everywhere inside Microsoft
(theverge.com) 346 points by Anon84 13 hours ago
At the beginning of my career, sometime around 1999 or 2000, I was at Microsoft with our team because we were trying to integrate our product with this absolute piece of junk called Microsoft Biztalk.
It simply didn’t work. I complained about it and was eventually hauled into a room with some MS PMs who told me in no uncertain terms that indeed, Biztalk didn’t work and it was essentially garbage that no one, including us, should ever use. Just pretend you’re doing something and when the week is up, go home. Tell everyone you’ve integrated with Biztalk. It won’t matter.
I work for Microsoft/Azure and my incentives are (roughly in descending order): minimize large/long outages, ship lots of stuff (with some concern for customer utility, but not too much), don't get yelled at for missing mandated work (security, compliance, etc.) I'd love to improve product quality, but incentives for that are negative. We're running a tight ship, and every second I spend on quality is a second I don't spend on the priorities above. Since there isn't any slack in the system, that means my performance assessment will drop, which I obviously don't want. Multiply that by 200k employees, and you get the current state of quality across the whole product portfolio.
My experience in the Teams org is the same. It's all about security, compliance, and recently AI. Fixing bugs and similar "non-flashy" work is a sure way of postponing one's promotion indefinitely.
I think it's funny, because it's not the first time I've heard about Microsoft employees not using the company's products.
I worked on a project with some Microsoft engineers to create a chatbot plugin for Salesforce, using Microsoft Power Virtual Agent, and the communication tool they used was Slack and not Teams. And I was obligated to use Teams because of the consulting company I worked at at the time.
And also the version control they used at the time was I think SVN, and not TFS.
A friend of mine over there told me their VP put out a mandate that everyone should install and use Claude Code and write a weekly report on their usage (what they did, what worked, etc.). They also track token usage and have a leaderboard of who uses the most tokens.
It reminds me of this [0] Dilbert comic, but heh.
Indeed it's not: https://www.windowslatest.com/2026/01/09/is-microsoft-losing... And: https://www.perspectives.plus/p/microsoft-365-copilot-commer...
Tldr: Copilot has 1% marketshare among web chatbots and 1.85% of paid M365 users bought a subscription to it.
As much as I think AI is overrated already, Copilot is pretty much the worst performing one out there from the big tech companies. Despite all the Copilot buttons in Office, Windows, on keyboards and even on the physical front of computers now.
We have to use it at work but it just feels like if they spent half the effort they spend on marketing on actually trying to make it do its job people might actually want to use it.
Half the time it's not even doing anything. "Please try again later" or the standard error message Microsoft uses for every possible error now: "Something went wrong". Another pet peeve of mine, those useless error messages.
Hmm, 8M paid M365 Copilot users leaked in August, and at last week's earnings call the number was 15M.
Assuming the leak was accurate, almost doubling usage in 4 months for an enterprise product seems like pretty fast growth?
Its growth trajectory seems to be on par with Teams so far, another enterprise product bundled with their M365 suite, though to be fair Teams was bundled for free: https://www.demandsage.com/microsoft-teams-statistics/
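For anyone who wants to sanity-check the "almost doubling" claim, here is a rough sketch of the arithmetic. It assumes the leaked 8M figure is accurate and takes the commenter's "4 months" at face value; neither is verified, and the variable names are just for illustration.

```python
# Figures quoted in the comment above, in millions of paid M365 Copilot users.
users_august = 8.0    # leaked figure; accuracy is assumed, not confirmed
users_latest = 15.0   # figure attributed to the earnings call
months_elapsed = 4    # the commenter's estimate, not a verified interval

# Overall growth factor across the period.
growth_factor = users_latest / users_august

# Implied compound monthly growth rate, if growth were steady.
monthly_rate = growth_factor ** (1 / months_elapsed) - 1

print(f"growth factor: {growth_factor:.3f}x")
print(f"implied monthly growth: {monthly_rate:.1%}")
```

A factor of 1.875x is indeed close to doubling, though the monthly rate only means something if the two numbers measure the same thing.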
Yeah, my problem with the way it has been pushed is that it doesn't make sense at all.
Improve the workflows that would benefit "AI" algorithms, image recognition, voice control, hand writing, code completion, and so on.
No need to put buttons to chat windows all over the place.
Yeah but it's the mainstream public that was just blown away with the LLM party trick. If it sounds like a human it must be smart like a human. So that's what everyone wants to sell :(
PS: When I say party trick I don't deny it has its uses but it's currently used like the jesus-AI that can do anything.
They put it into the Azure portal, and I tried to get it to answer me what the open resource cost us in storage. It appeared retarded at first, but then I realized it didn't have access to know what I had opened or anything.
Until MS makes sure their models get the necessary context, I don't even care to click on them.
I think Copilot is a platform or marketplace more than anything, and Microsoft doesn't really need to care about what models are being used. They don't need to have a secret sauce as much as they need to make the entire ecosystem easy to use. They have had a lot of success over the years with VSC and this seems to build on that.
Claude Code is everywhere inside Apple too. Almost everyone has access to it and many use it
We can certainly see, every Windows update requires flipping a coin now.
I don’t understand how their various Copilot tools are so bad. Are they using a proprietary model instead of ChatGPT or Claude?
What are we discussing here?
The tools or the models? It's getting absurdly confusing.
"Claude Code" is an interface to Claude, Cursor is an IDE (I think?! VS Code fork?), GitHub Copilot is a CLI or VS Code plugin to use with ... Claude, or GPT models, or ...
If they are using "Claude Code" that means they are using Anthropic's models - which is interesting given their huge investment in OpenAI.
But this is getting silly. People think "CoPilot" is "Microsoft's AI" which it isn't. They have OpenAI on Azure. Does Microsoft even have a fine-tuned GPT model or are they just prompting an OpenAI model for their Windows-builtins?
When you say you use CoPilot with Claude Opus people get confused. But this is what I do every day at work.
shrug
Best friendship takes place!
That and if Nvidia backs out of their $100B promise it may not be the death knell but it would certainly be a step backward for OpenAI.
https://www.wsj.com/tech/ai/the-100-billion-megadeal-between...
Yes, the product's secret sauce is out and it's becoming a commodity.
But OpenAI is still innovating with new subcategories, and even in cases where it did not innovate (Claude Code came first and OpenAI responded with Codex), it outdoes its competitors. Codex is being widely preferred by the most popular vibecode devs, notably Moltbook's dev, but also Jess Fraz.
In terms of pricing, OAI holds by far the most expensive product so it's still positioned as a quality option, to give an example, most providers have a 3 tier price for API calls.
Anthropic has 1$/3$/5$ (per output MTokens)
Gemini has 3$/12$ (2 tiers)
OpenAI has 2$/14$/168$
So the competitors are mainly competing in price in the API category
To give another datapoint, Google just released multimodal (image input) models like 1 or 2 months ago. This has been in ChatGPT for almost a year now
Explains why Windows updates have been more broken than usual lately.
But I guess having my computer randomly stop working because a billion dollar corporation needs to save money by using a shitty text generation algorithm to write code instead of hiring competent programmers is just the new normal now.
I switched to Ubuntu last week for my desktop. First time in my 25+ year career I’ve felt like Microsoft was wasting my time more than administering a Linux desktop would take. The slop effect is real.
I've used Kubuntu for several years (my wife too now), which is an official, supported flavor of Ubuntu using the KDE desktop instead of Gnome. It gives a more Windows-like or CDE (Common Desktop Environment, from UNIX systems) feel than Gnome, which gives a more Mac feel.
You might want to change to Debian or some other, more radical distro.
Linux kernels will all eventually be permeated with AI-gen code as well. It will just take longer to see and feel the effects.
2 week old post feeling like part of the other weirdly promotional "Claude is everywhere right now" pieces that were around. Someone called it an advertising carpet bombing run.
A.I. Tool Is Going Viral. Five Ways People Are Using It
https://www.nytimes.com/2026/01/23/technology/claude-code.ht...
Claude Is Taking the AI World by Storm, and Even Non-Nerds Are Blown Away
https://www.wsj.com/tech/ai/anthropic-claude-code-ai-7a46460...
They have been unstable for decades. Does anyone still use self-hosted (running in a basement) windows servers? Running a windows machine feels like it's about as reliable as fast food order accuracy. Most of the time sure, but I hope you can afford to miss out sometimes.
I try GitHub Copilot every once in a while, and just last month it still managed to produce diffs with unbalanced curly braces, or tried to insert (what should be) a top-level function into the middle of another function and screw up everything. This wasn’t on a free model like GPT 4.1 or 5-mini, IIRC it was 5.2 Codex. What the actual fuck? Only explanation I can come up with is that their pay-per-request model made GHC really stingy with using tokens for context, even when you explicitly ask it to read certain files it ends up grepping and adding a couple lines.
You're not using the good models and then blaming the tool? Just use claude models.
Copilot's main problem seems to be people don't know how to use it. They need to delete all their plugins except the vscode, CLI ones, and disable all models except anthropic ones.
The Claude Code reputation diff is greatly exaggerated beyond that.
What, 5.2 Codex isn’t a good model? Claude 4.5 and Gemini 3 Pro with Copilot aren’t any better, I don’t have enough of a sample of Opus 4.5 usage with Copilot to say with confidence how it fares since they charge 3x for Opus 4.5 compared to everything else.
If Copilot is stupid uniquely with 5.2 Codex then they should disable that instead of blaming the user (I know they aren’t, you are). But that’s not the case, it’s noticeably worse with everything. Compared to both Cursor and Claude Code.
I had my first go at using it (GitHub Copilot) last week, for a simple refactoring task. I'd have to say I reasonably specified it, yet it still managed to fail to delete a closing brace when it removed the opening block as specified.
That was using the Claude Sonnet 4.5 model, I wonder if using the Opus 4.5 model would have managed to avoid that.
Reading about ubiquitous Claude Code use inside of Apple and Microsoft, and not Codex, makes me very worried about forthcoming software quality.
Claude Code is fun, full of personality, many features to hack around model shortcomings, and very quick, but it should not be let anywhere near serious coding work.
That's also why OpenClaw uses Claude for personality, but its author (@steipete) disallows any contribution to it using Claude Code and uses Codex exclusively for its development. Claude Code is a slop producer with illusions of productivity.
Apple is all Claude Code internally
(Also a signal for why devs should not bother with their shoddy Xcode AI work - Apple devs are not using it)
I have found that Claude Code is better in every way I've used it. I like to use LLMs just as an advanced refactoring tool, especially where plain string search isn't enough. Anyway, my first experience of Copilot was it plainly lying that it had deleted files I asked it to, and it insisted the file no longer existed (it did).
The difference between the two is stark.
Microsoft have a goal that states they want to get to "1 engineer, 1 month, 1 million lines of code." You can't do that if you write the code yourself. That means they'll always be chasing the best model. Right now, that's Opus 4.5.
> "Microsoft have a goal that states they want to get to "1 engineer, 1 month, 1 million lines of code.""
No, one researcher at Microsoft made a personal LinkedIn post that his team were using that as their 'North Star' for porting and transpiling existing C and C++ code, not writing new code, and when the internet hallucinated that he meant Windows and this meant new code, and started copypasting this as "Microsoft's goal", the post was edited and Microsoft said it isn't the company's goal.
That's still writing new code. Also, it's kind of an extremely bad idea to do that because how are you going to test it? If you have to rewrite anything (hint: you probably don't) it's best to do it incrementally over time because of the QA and stakeholder alignment overhead. You cannot push things into production unless it works as its users are expecting and it does exactly what stakeholders expect as well.
No no, you're talking common sense and logic. You can't think like that. You have to think "How do I rush out as much code as possible?" After all, this is MS we're talking about, and Windows 11 is totally the shining example of amazing and completely stable code. /s
Porting legacy code is definitely one of its strengths. It can even... do wilder things if you're creative enough.
It is kind of funny that throughout my career, there has always been pretty much a consensus that lines of code are a bad metric, but now with all the AI hype, suddenly everybody is again like “Look at all the lines of code it writes!!”
I use LLMs all day every day, but measuring someone or something by the number of lines of code produced is still incredibly stupid, in my opinion.
Microsoft never got that memo. They still measure LoC because it’s all MBAs.
Fuck is there a way to have that degree and not be clueless and toxic to your colleagues and users.
If so, it hasn't always been that way. Steve Ballmer on IBM and KLoC's: https://www.youtube.com/watch?v=kHI7RTKhlz0
(I think it is from "Triumph of the Nerds" (1996), but I can't find the time code)
Ballmer hasn’t been around for a long long time. Not since the Red Ring of Death days. Ever since Satya took the reins, MBAs have filled upper and middle management to try to take over open source so that Sales guys had something to combat RedHat. Great for open source. Bad for Microsoft. However, Satya comes from the Cloud division so he knows how to Cloud and do it well. Azure is a hit with the enterprise. Then along comes AI…
Microsoft lost its way with Windows Phone, Zune, Xbox360 RRoD, and Kinect. They haven’t had relevance outside of Windows (Desktop) in the home for years. With the sole exception being Xbox.
They have pockets of excellence. Where great engineers are doing great work. But outside those little pockets, no one knows.
I wonder if we can use the compression ratio that an LLM-driven compressor could generate to figure out how much entropy is actually in the system and how much is just boilerplate.
Of course then someone is just going to pregenerate a random number lookup table and get a few gigs of 'value' from pure garbage...
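A minimal sketch of the compression-ratio idea and the gaming problem raised above. It swaps in `zlib` as a stand-in for the hypothetical LLM-driven compressor (a far weaker model of redundancy), and both inputs are invented for illustration.

```python
import random
import zlib


def compression_ratio(data: bytes) -> float:
    """Compressed size divided by original size (lower means more redundant)."""
    return len(zlib.compress(data, 9)) / len(data)


# Hypothetical "boilerplate": the same getter repeated many times.
boilerplate = b"def get_value(self):\n    return self._value\n\n" * 500

# A pregenerated random table of the same size, as in the gaming scenario.
rng = random.Random(0)
random_table = bytes(rng.getrandbits(8) for _ in range(len(boilerplate)))

print(f"boilerplate ratio:  {compression_ratio(boilerplate):.3f}")
print(f"random table ratio: {compression_ratio(random_table):.3f}")
```

The repeated getter compresses to a small fraction of its size, while the random table barely compresses at all. So a metric based purely on incompressibility would score the garbage table as nearly all "value", which is exactly the loophole described.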
I believe the "look at all the lines of code" argument for LLMs is not a way to showcase intelligence, but more-so a way to showcase time saved. Under the guise that the output is the/a correct solution, it's a way to say "look at all the code I would have had to write, it saved so much time".
The line of code that saves the most time is the one you don't write.
> measuring someone or something by the number of lines of code produced is still incredibly stupid, in my opinion.
Totally agree. I see LOC as a liability metric. It amazes me that so many other people see it as an asset metric.
Ironically, AI may help get past that. In order to measure "value chunks" or some other metric where LoC is flexibly multiplied by some factor of feature accomplishment, quality, and/or architectural importance, an opinion of the section in question is needed, and an overseer AI could maybe do that.
https://devblogs.microsoft.com/engineering-at-microsoft/welc...
"Microsoft has over 100,000 software engineers working on software projects of all sizes."
So that would mean 100 000 000 000 (100 billion) lines of code per month. Frightening.
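The multiplication behind that figure, as a quick sketch. It assumes every one of the 100,000 engineers hit the "1 million lines per month" slogan, which nobody in the thread claims is an actual company target.

```python
# Headcount from the quoted Microsoft blog post.
engineers = 100_000
# The "1 engineer, 1 month, 1 million lines of code" slogan, applied to everyone.
lines_per_engineer_per_month = 1_000_000

total_lines_per_month = engineers * lines_per_engineer_per_month
print(f"{total_lines_per_month:,} lines per month")  # 100,000,000,000 lines per month
```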
With those kinds of numbers you don’t need logic anymore, just a lookup table with all possible states of the system.
No, no. 100 trillion lines of code per day is great! The only thing better would be 200 trillion ;)
CEO: I want big numbers of things. Big numbers = success.
That's still 10 billion lines of code per month if that insane metric were a real goal (it’s not).
That’s 200 Windows’ worth of code every month.
Maybe they can use 5 - 10 loc to move the classic window shell button so it's not on top of the widgets button
I used to work at a place that had the famous Antoine de Saint-Exupéry quote painted near the elevators where everyone would see it when they arrived for work:
Perfection is achieved, not when there is nothing more to add, but when there is nothing left to take away.
I miss those days. Cf. -2000 Lines Of Code:
Looks like the guy who posted that updated his post to say he was just talking about a research project he is working on.
Which is a bald-faced lie written in response to a PR disaster. The original claims were not ambiguous:
> My goal is to eliminate every line of C and C++ from Microsoft by 2030. Our strategy is to combine AI and Algorithms to rewrite Microsoft’s largest codebases. Our North Star is “1 engineer, 1 month, 1 million lines of code”.
Obviously, "every line of C and C++ from Microsoft" is not contained within a single research project, nor are "Microsoft's largest codebases".
The original claims were not ambiguous, it's "My" goal not "Microsoft's goal".
The fact that it's a "PR disaster" for a researcher to have an ambitious project at one of the biggest tech companies on the planet, or to talk up their team on LinkedIn, is unbelievably ridiculous.
The authentic quote “1 engineer, 1 month, 1 million lines of code” as some kind of goal that makes sense, even just for porting/rewriting, is embarrassing enough from an OS vendor.
As @mrbungie says on this thread: "They took the stupidest metric ever and made a moronic target out of it"
I mean 100% that was his goal. But that was one guy without the power to set company wide goals talking on LinkedIn.
The fact that there are distinguished engineers at MS who think that is a reasonable goal is frightening though.
I mean, if 1% out of 8 billion is "top" and that applies to Lines of Code, too, then ... more code contains more quality, ... by their logic, I guess ...
What if the % declines proportionally (or worse) to the growth in code.
it might, but not if you isolate/repurpose that % (over time), which is the promise
I've written a C recompiler in an attempt to build homomorphic encryption. It doesn't work (it's not correct) but it can translate 5 lines of working code into 100.000 lines of almost-working code.
Any MBAs want to buy? For the right price I could even fix it ...
> “My goal is to eliminate every line of C and C++ from Microsoft by 2030,” Microsoft Distinguished Engineer Galen Hunt writes in a post on LinkedIn. “Our strategy is to combine AI and Algorithms to rewrite Microsoft’s largest codebases.
they're fucked
Eliminate C/C++ in favor of what? Perhaps the plan is to use AI to write plain assembler? Why stop there, maybe let's do prompt in - machine-code out?
Microsoft went from somewhat good in Windows 7 to absolute dog shit in approximately 10 years.
So with this level of productivity Windows could completely degrade itself and collapse in one week instead of 15 years.
We’re back to measuring productivity by lines of code are we? Because that always goes well.
Yay another stupid metric to game!
This will lead to so much enshittification.
That is what the AI said:
1. Classic Coding (Traditional Development)
In the classic model, developers are the primary authors of every line.
Production Volume: A senior developer typically writes between 10,000 and 20,000 lines of code (LOC) per year.
Workflow: Manual logic construction, syntax memorization, and human-led debugging using tools like VS Code or JetBrains IDEs.
Focus: Writing the implementation details. Success is measured by the quality and maintainability of the hand-written code.
2. AI-Supported Coding (The Modern Workflow)
AI tools like GitHub Copilot and Cursor act as a "pair programmer," shifting the human role to a reviewer and architect.
Production Volume: Developers using full AI integration have seen a 14x increase in code output (e.g., from ~24k lines to over 810k lines in a single year).
Work Distribution: Major tech leaders like AWS report that AI now generates up to 75% of their production code.
The New Bottleneck: Developers now spend roughly 70% of their time reviewing AI-generated code rather than writing it.
I think a realistic 5x to 10x is possible. 50.000 - 200.000 LOC per YEAR !!!! Would it be good code? We will see.
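A quick check of the numbers in this comment, as a sketch only. It uses the 10,000 to 20,000 LOC baseline and the 5x to 10x multiplier given above, and assumes the quoted "~24k to over 810k" example refers to lines per year on both sides.

```python
# Baseline from the quoted text: LOC per year for a senior developer.
baseline_low, baseline_high = 10_000, 20_000
# The commenter's "realistic" multiplier range.
speedup_low, speedup_high = 5, 10

projected_low = baseline_low * speedup_low
projected_high = baseline_high * speedup_high
print(f"{projected_low:,} to {projected_high:,} LOC per year")

# Ratio implied by the quoted example figures (~24k lines to ~810k lines).
quoted_ratio = 810_000 / 24_000
print(f"quoted example implies about {quoted_ratio:.2f}x")
```

The 5x to 10x range does reproduce the 50.000 to 200.000 figure. The quoted example, taken literally, works out to roughly 34x rather than the stated 14x, so the two numbers in the AI's answer may not describe the same measurement.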
To this day I cannot wrap my head around why Microsoft allowed a culture to grow inside the company (either through hiring, or through despondence) that at best is indifferent towards the company's products and at worst openly despises them.
I'm sure no other tech company is like this.
I think technologies like the Windows kernel and OS, the .NET framework, their numerous attempts to build a modern desktop UI framework with XAML, their dev tools, were fundamentally good at some point.
Yet they can't or won't hire people who would fix Windows, rather than just maintain it, really push for modernization, make .NET actually cool and something people want to use.
They'd rather hire folks who were taught at school that Microsoft is the devil and Linux is superior in all ways, who don't know the first thing about the MS tech stack, and would rather write React on their MacBooks (see the start menu incident) than touch anything made by Microsoft.
It seems somehow the internal culture allows this. I'm sure if you forced devs to use Copilot, and provided them with the tools and organizational mandate to do so, it would become good enough eventually to not have to force people to use it.
The main complaint I keep hearing about Azure (which I do not use at work)