Comment by mystcb 5 days ago

Update 16:57 UTC:

Azure Portal Access Issues

Starting at approximately 16:00 UTC, we began experiencing Azure Front Door issues resulting in a loss of availability of some services. In addition, customers may experience issues accessing the Azure Portal. Customers can attempt to use programmatic methods (PowerShell, CLI, etc.) to access/utilize resources if they are unable to access the portal directly. We have failed the portal away from Azure Front Door (AFD) to attempt to mitigate the portal access issues and are continuing to assess the situation.

We are actively assessing failover options of internal services from our AFD infrastructure. Our investigation into the contributing factors and additional recovery workstreams continues. More information will be provided within 60 minutes or sooner.

This message was last updated at 16:57 UTC on 29 October 2025

---

Update 16:35 UTC:

Azure Portal Access Issues

Starting at approximately 16:00 UTC, we began experiencing DNS issues resulting in availability degradation of some services. Customers may experience issues accessing the Azure Portal. We have taken action that is expected to address the portal access issues here shortly. We are actively investigating the underlying issue and additional mitigation actions. More information will be provided within 60 minutes or sooner.

This message was last updated at 16:35 UTC on 29 October 2025

---

Azure Portal Access Issues

We are investigating an issue with the Azure Portal where customers may be experiencing issues accessing the portal. More information will be provided shortly.

This message was last updated at 16:18 UTC on 29 October 2025

---

Message from the Azure Status Page: https://azure.status.microsoft/en-gb/status

planewave 5 days ago

Azure Network Availability Issues

Starting at approximately 16:00 UTC, we began experiencing Azure Front Door issues resulting in a loss of availability of some services. We suspect that an inadvertent configuration change was the trigger event for this issue. We are taking two concurrent actions: blocking all changes to the AFD services and, at the same time, rolling back to our last known good state.

We have failed the portal away from Azure Front Door (AFD) to mitigate the portal access issues. Customers should be able to access the Azure management portal directly.

We do not have an ETA for when the rollback will be completed, but we will update this communication within 30 minutes or when we have an update.

This message was last updated at 17:17 UTC on 29 October 2025

croemer 5 days ago

"We have initiated the deployment of our 'last known good' configuration. This is expected to be fully deployed in about 30 minutes from which point customers will start to see initial signs of recovery. Once this is completed, the next stage is to start to recover nodes while we route traffic through these healthy nodes."
"This message was last updated at 18:11 UTC on 29 October 2025"

- croemer 5 days ago
  
  At this stage, we anticipate full mitigation within the next four hours as we continue to recover nodes. This means we expect recovery to happen by 23:20 UTC on 29 October 2025. We will provide another update on our progress within two hours, or sooner if warranted.
  This message was last updated at 19:57 UTC on 29 October 2025
  
cyptus 5 days ago

AFD is down quite often regionally in Europe for our services. In 50%+ of the cases they just don't report it anywhere, even if it's for 2h+.

RajT88 5 days ago

Spam those Azure tickets. If you have a CSAM, build them a nice PowerPoint telling the story of all your AFD issues (that's what they are there for).
> In 50%+ of the cases they just don't report it anywhere, even if it's for 2h+.
I assume you mean publicly. Are you getting the service health alerts?

- tomashubelbauer 5 days ago
  
  CSAM apparently also means Customer Success Account Manager for those who might have gotten startled by this message like me.
  
  ifwinterco 5 days ago
  
  Alternative für Deutschland was strange enough, when I saw CSAM I was really wondering what thread I had stumbled into
  
  cyptus 4 days ago
  
  haha :D
  
  linohh 5 days ago
  
  Thank you, not going to google that shit.
  
- psunavy03 5 days ago
  
  Some really unfortunate acronyms flying around the Microsoft ecosystem . . .
  
  RajT88 5 days ago
  
  Quite so. The acronym collision rate is high.
  
- nijave 5 days ago
  
  Back when we used Azure the only outcome was them trying to upsell us on Premium Support
  
  RajT88 4 days ago
  
  Do you recall the kind of premium support? Azure Rapid Response?
  
- cyptus 5 days ago
  
  In many cases: no service health alerts, no status page updates, and no confirmations from the support team in tickets. Still, we can confirm these issues from different customers across Europe. Mostly the issues are region-dependent.
  
- cyberax 5 days ago
  
  > CSAM
  Child Sex-Abuse Material?!? Well, a nice case of acronym collision.
  
  mirekrusin 5 days ago
  
  They should rename to Success Customer Account Manager.
  
  RajT88 5 days ago
  
  Definitely the most baffling acronym collision I have seen with Microsoft. I once counted 4 different products abbreviated VSTS.
  
  dotancohen 5 days ago
  
  Didn't MS have three things called "link" at one time? They were all spelled differently, of course.
  
  SAI_Peregrinus 5 days ago
  
  They must really depend on their government contracts with this administration…
  
  codeduck 5 days ago
  
  Oh dear. Will make for an awkward thing to have on your resume.
  
- alias_neo 4 days ago
  
  Where do these alerts supposedly come from? I started having issues around 4PM (GMT), couldn't access portal, and couldn't make AKV requests from the CLI, and initially asked our Ops guys but with no info and a vague "There may be issues with Portal" on their status page, that was me done for the day.
  
- llama052 5 days ago
  
  I got a service health alert an hour after it started, saying the portal was having issues. Pretty useless and misleading.
  
  RajT88 5 days ago
  
  That should go into the presentation you provide your CSAM with as well.
  Storytelling is how issues get addressed. Help the CSAM tell the story to the higher ups.
  
nevf1 5 days ago

This is the single most frustrating thing about these incidents: you're hamstrung on what you can do or how you can react until Microsoft officially acknowledges a problem. It took nearly 90 minutes both today and when it happened on 9th October.

- cyptus 5 days ago
  
  So true. Instead of getting fast feedback, we waste time searching for our own issues first.
  
hallh 5 days ago

Same experience. We've recently migrated fully away from AFD due to how unreliable it is.

jjp 5 days ago

Whilst the status message acknowledges the issue with Front Door (AFD), it seems as though the rest of the actions are about how to get Portal/internal services working without relying on AFD. For those of us using Front Door, does that mean we're in for a long haul?

llama052 5 days ago

Please migrate off of Front Door. Historically, it's been a failure mode ever since it came out. Anything else is better at this point.

- everfrustrated 5 days ago
  
  Didn't the underlying vendor they used for Azure Front Door go bankrupt? It's probably on life support.
  
  guptadagger 4 days ago
  
  I understood that to be a different third party that provided a CDN, separate from AFD. https://learn.microsoft.com/en-us/azure/frontdoor/migrate-cd...
  
progmetaldev 5 days ago

Currently even the Front Door landing page is only partially loading.
https://azure.microsoft.com/en-us/products/frontdoor

8cvor6j844qw_d6 5 days ago

I'll be interested in the incident writeup since DNS is mentioned. It will be interesting in a way if it is similar to what happened at AWS.

Insanity 5 days ago

It's pretty unlikely. AWS published a public 'RCA' https://aws.amazon.com/message/101925/. A race condition in a DNS 'record allocator' causing all DNS records for DDB to be wiped out.
I'm simplifying a bit, but I don't think it's likely that Azure has a similar race condition wiping out DNS records on _one_ system that then propagates to all others. The similarity might just end at "it was DNS".
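
The kind of race described above (uncoordinated writers updating the same records, with one of them arriving late) can be sketched as a toy model in Python. This is not AWS's implementation: `RecordStore`, `apply_unguarded`, `apply_guarded`, `cleanup`, and the plan version numbers are hypothetical names chosen for illustration, and the cleanup step is an assumption about how a stale write could end with an empty record set.

```python
from dataclasses import dataclass, field


@dataclass
class RecordStore:
    """Toy stand-in for one DNS name: the plan version applied and its addresses."""
    version: int = 0
    addresses: list = field(default_factory=list)

    def apply_unguarded(self, plan_version, addresses):
        # Last writer wins: a delayed writer can overwrite newer data.
        self.version = plan_version
        self.addresses = list(addresses)
        return True

    def apply_guarded(self, plan_version, addresses):
        # Compare before write: reject any plan not newer than the stored one.
        if plan_version <= self.version:
            return False
        self.version = plan_version
        self.addresses = list(addresses)
        return True

    def cleanup(self, oldest_live_version):
        # Hypothetical garbage collector: drop records left by plans
        # that are considered stale.
        if self.version < oldest_live_version:
            self.addresses = []


def run(guarded):
    """Replay the same interleaving with or without the version guard."""
    store = RecordStore()
    apply = store.apply_guarded if guarded else store.apply_unguarded
    apply(2, ["10.0.0.2"])  # fast writer applies the newer plan first
    apply(1, ["10.0.0.1"])  # slow writer arrives late with the older plan
    store.cleanup(oldest_live_version=2)  # cleaner removes anything older than plan 2
    return store.addresses


if __name__ == "__main__":
    print("unguarded:", run(guarded=False))
    print("guarded:  ", run(guarded=True))
```

With `run(guarded=False)` the late write wins and the cleanup then empties the record set; with `run(guarded=True)` the stale plan is rejected and the newer addresses survive. A version check like this is only one possible guard, shown here to make the failure mode concrete.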

- parliament32 5 days ago
  
  That RCA was fun. A distributed system with members that don't know about each other, don't bother with leader elections, and basically all stomp all over each other updating the records. It "worked fine" until one of the members had slightly increased latency and everything cascade-failed down from there. I'm sure there was missing (internal) context but it did not sound like a well-architected system at all.
  
  nijave 5 days ago
  
  >slightly increased latency
  They didn't provide any details on latency. It could have been delayed an hour or a day and no one noticed
  
  RajT88 5 days ago
  
  Needs STONITH
  
- kyrra 5 days ago
  
  https://isitdns.com/
  
- cdr420 5 days ago
  
  It's always DNS
  
  tempest_ 5 days ago
  
  It is a coin flip, heads DNS, tails BGP
  
layer8 5 days ago

DNS has both naming and cache invalidation, so no surprise it’s among the hardest things to get right. ;)

- dotancohen 5 days ago
  
  That's three of the hardest problems in CS ))
  
NDizzle 5 days ago

They briefly had a statement about using Traffic Manager to work with your AFD to work around this issue, with a link to learn.microsoft.com/...traffic-manager, and the link didn't work, due to the same issue affecting everyone right now.

They quickly updated the message to REMOVE the link. Comical at this point.

Aperocky 5 days ago

The statement is still there on the status page though.

- NDizzle 5 days ago
  
  They re-added it once the site was accessible.
  
jdc0589 5 days ago

Yeah, it's not just the portal. microsoft.com is down too.

mystcb 5 days ago

Yeah, I am guessing it's just a placeholder till they get more info. I thought I saw somewhere that internally within Microsoft it's seen as a "Sev 1" with "all hands on deck" - Annoyingly I can't remember where I saw it, so if someone spots it before I do, please credit that person :D
Edit: Typo!

- verst 5 days ago
  
  It's a Sev 0 actually (as one would expect - this isn't a big secret). I was on the engineering bridge call earlier for a bit. The Azure service I work on was minimally impacted (our customer facing dashboard could not load, but APIs and data layer were not impacted) but we found a workaround.
  
- chad_c 5 days ago
  
  It was here https://news.ycombinator.com/item?id=45749054 but that comment has been deleted.
  
PeterCorless 5 days ago

Seems all Microsoft-related domains are impacted in some way.
• https://www.xbox.com/en-US also doesn't fully paint. Header comes up, but not the rest of the page.
• https://www.minecraft.net/en-us is extremely slow, but eventually came up.

bossyTeacher 5 days ago

It sure must be embarrassing for the website of the second richest company in the world to be down.

daxfohl 5 days ago

Downdetector says AWS and GCP are down too. Might be in for a fun day.

- rozenmd 5 days ago
  
  From what I can tell, Downdetector just tracks traffic to their pages without actually checking if the site is down.
  The other day during the AWS outage they "reported" OVH down too.
  
- jdc0589 5 days ago
  
  Yeah, I saw that, but I'm not sure how accurate that is. A few large apps/companies I know to be 100% on AWS in us-east-1 are cranking along just fine.
  
- linhns 5 days ago
  
  Not sure if this is true. I just logged in to the console with no glitch.
  
- NetMageSCW 5 days ago
  
  AWS had performance issues, and I believe that is resolved.
  
planewave 5 days ago

Yes, and it seems that, at least for some, login.microsoftonline.com is down too, which is part of the Entra login / SSO flow.

jonathanlydall 5 days ago

Yet another reason to move away from Front Door.

We already had to do it for large files served from Blob Storage since they would cap out at 2MB/s when not in cache of the nearest PoP. If you’ve ever experienced slow Windows Store or Xbox downloads it’s probably the same problem.

I had a support ticket open for months about this and in the end the agent said “this is to be expected and we don’t plan on doing anything about it”.

We’ve moved to Cloudflare and not only is the performance great, but it costs less.

The only thing I still need to move off Front Door is a static website for our docs served from Blob Storage; this incident will make us do it sooner rather than later.

out_sider 5 days ago

We are considering the same, but because our website uses an APEX domain we would need to move all our DNS to cloudfront, right? Does it have as nice a "rule set builder" as Azure?

- jonathanlydall 5 days ago
  
  Unless you pay for Cloudflare’s Enterprise plan, you’re required to have them host your DNS zone. You can use a different registrar as long as you point your NS records to Cloudflare.
  Be aware that if you’re using Azure as your registrar, it’s (probably still) impossible to change your NS records to point to Cloudflare’s DNS servers; at least it was for me about 6 months ago.
  This also makes it impossible to transfer your domain to them, as Cloudflare’s domain transfer flow requires you to set your NS records to point to them before their interface shows a transfer option.
  In our case we had to transfer to a different registrar, we used Namecheap.
  However, transferring a domain from Azure was also a nightmare. Their UI doesn’t have any kind of transfer option, I eventually found an obscure document (not on their Learn website) which had an az command which would let you get a transfer code which I could give to Namecheap.
  Then I had to wait over a week for the transfer timeout to occur because there is no way on Azure side that I could find to accept the transfer immediately.
  I found Cloudflare’s way of building rules quite easy to use. It’s different from Front Door, but I’m not doing anything more complex than some redirects and reverse proxying.
  I will say that Cloudflare’s UI is super fast, with Front Door I always found it painfully slow when trying to do any kind of configuration.
  Cloudflare also doesn’t have the problem that Front Door has where it requires a manual process every 6 months or so to renew the APEX certificate.
  
  out_sider 5 days ago
  
  Thanks :). We don't use Azure as our registrar. It seems I'll have to plan for this then. We also had another issue: AFD has a hard 500ms TLS handshake timeout (no matter what you put in the origin timeout settings), which means that if our server was slow for some reason we would get a 504 origin timeout.
  
  Figs 5 days ago
  
  CloudFlare != CloudFront
  
  out_sider 5 days ago
  
  I meant Cloudflare
  
nosefrog 5 days ago

Front Door is not good.

eddie_catflap 5 days ago

We saw issues before 16:00 UTC, at approx. 15:38.

ThatManulTheCat 5 days ago

DNS. Ofc.

rconti 5 days ago

Sounds like they need to move their portal to a region with more capacity for the desired instance type. /s
