A logging loop in GKE cost me $1,300 in 3 days – 9.2x my actual infrastructure

5 points by nthypes 6 hours ago

3 comments

Last month, a single container in my GKE cluster (Sao Paulo region) entered an error loop, outputting to stdout at ~2k logs/second. I discovered the hard way that GKE's default behavior is to ingest 100% of this into Cloud Logging with no rate limiting. My bill jumped nearly 1000% before alerts caught it.
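For a sense of scale, here is a back-of-envelope sketch of what ~2k logs/second adds up to. The rate, duration, and bill come from the post. The $0.50/GiB ingestion price is an assumption (check current Cloud Logging pricing for your account), the free allotment is ignored, and the loop is assumed to have run for the full three days.

```python
# Back-of-envelope estimate of the log volume behind the bill.
LOGS_PER_SECOND = 2_000     # from the post ("~2k logs/second")
DAYS = 3                    # from the title
LOGGING_BILL_USD = 1_300    # from the post
PRICE_PER_GIB_USD = 0.50    # ASSUMPTION: ingestion list price, not stated in the post

SECONDS_PER_DAY = 86_400

# Total number of log entries emitted over the incident.
entries = LOGS_PER_SECOND * SECONDS_PER_DAY * DAYS

# Volume the bill would correspond to at the assumed price.
implied_gib = LOGGING_BILL_USD / PRICE_PER_GIB_USD

# Average billed size per entry implied by the two numbers above.
implied_bytes_per_entry = implied_gib * 2**30 / entries

print(f"entries: {entries:,}")
print(f"implied volume: {implied_gib:,.0f} GiB")
print(f"implied size per entry: {implied_bytes_per_entry:,.0f} bytes")
```

If the assumed price is off, only the last two figures change; the entry count follows directly from the rate and duration.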

Infrastructure (Compute): ~$140 (R$821 BRL)

Cloud Logging: ~$1,300 (R$7,554 BRL)

Ratio: Logging cost 9.2x the actual servers.
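A quick check of the arithmetic, using only the figures quoted above. The one assumption is that the bill would otherwise have been compute only, which is how the "nearly 1000%" jump is reproduced here.

```python
# Figures as quoted in the post (the USD values are rounded).
compute_usd, logging_usd = 140, 1300
compute_brl, logging_brl = 821, 7554

# Logging cost relative to compute, in each currency.
ratio_usd = logging_usd / compute_usd
ratio_brl = logging_brl / compute_brl

# ASSUMPTION: the normal bill is compute only, so the increase is
# measured against that baseline.
bill_multiple = (compute_brl + logging_brl) / compute_brl
increase_pct = (bill_multiple - 1) * 100

print(f"USD ratio: {ratio_usd:.1f}x")
print(f"BRL ratio: {ratio_brl:.1f}x")
print(f"Bill increase over baseline: {increase_pct:.0f}%")
```

The 9.2x in the title matches the BRL amounts; the rounded USD amounts give a slightly higher ratio.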

https://imgur.com/jGrxnkh

I fixed the loop and paused the `_Default` sink immediately.

I opened a billing ticket requesting a "one-time courtesy adjustment" for a runaway resource—standard practice for first-time anomalies on AWS/Azure.

I have been rejected twice.

The latest response: "The team has declined the adjustment request due to our internal policies."

If you run GKE, the `_Default` sink in Log Router captures all container stdout/stderr.

There is NO DEFAULT CAP on ingestion volume, which is absurd!

A simple `while true; do echo "error"; done` can bankrupt a small project.
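Since nothing caps ingestion by default, one possible guard is to cap log output inside the application itself. The following is a minimal sketch of a token-bucket filter for Python's standard `logging` module. It is not something the post describes, and `RateLimitFilter`, `rate`, and `burst` are hypothetical names chosen for illustration.

```python
import logging
import time


class RateLimitFilter(logging.Filter):
    """Token bucket: allow up to `burst` records at once, refilled at `rate` per second."""

    def __init__(self, rate: float, burst: int, clock=time.monotonic):
        super().__init__()
        self.rate = rate              # tokens added per second
        self.burst = burst            # maximum tokens held
        self.tokens = float(burst)    # start with a full bucket
        self.clock = clock            # injectable so the filter can be tested
        self.last = clock()
        self.dropped = 0              # count of suppressed records

    def filter(self, record: logging.LogRecord) -> bool:
        now = self.clock()
        # Refill according to elapsed time, never beyond the burst size.
        self.tokens = min(self.burst, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True               # record is emitted
        self.dropped += 1
        return False                  # record is suppressed


# Usage: at most ~10 records/second sustained, with bursts of up to 100.
handler = logging.StreamHandler()
handler.addFilter(RateLimitFilter(rate=10, burst=100))
logging.getLogger().addHandler(handler)
```

This only protects code paths that go through the filtered handler. A process printing directly to stdout, like the shell loop above, bypasses it, so a sink-level exclusion is still worth having.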

Go to Logging -> Log Router and edit the `_Default` sink.

Add an exclusion filter: `resource.type="k8s_container" severity=INFO` (or exclude specific namespaces).
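To make the two variants of that filter concrete (by severity, or by namespace), here is a small sketch that builds the filter string. `k8s_exclusion_filter` is a hypothetical helper and the namespace names are made up; the output is meant to be pasted into the exclusion filter box and reviewed in the console before saving.

```python
from typing import Iterable, Optional


def k8s_exclusion_filter(severity: Optional[str] = None,
                         namespaces: Iterable[str] = ()) -> str:
    """Build an exclusion filter string for GKE container logs."""
    # Always scope the exclusion to container logs.
    clauses = ['resource.type="k8s_container"']
    if severity:
        clauses.append(f"severity={severity}")
    # Match any of the listed namespaces (hypothetical names in the example).
    ns = [f'resource.labels.namespace_name="{name}"' for name in namespaces]
    if ns:
        clauses.append("(" + " OR ".join(ns) + ")")
    # Clauses are joined explicitly with AND.
    return " AND ".join(clauses)


# Variant from the post: drop INFO-level container logs.
by_severity = k8s_exclusion_filter(severity="INFO")

# Alternative from the post: drop everything from specific namespaces.
by_namespace = k8s_exclusion_filter(namespaces=["dev-sandbox", "load-test"])

print(by_severity)
print(by_namespace)
```

Note that an exclusion on `severity=INFO` alone would not stop a loop that logs at a higher severity, so the namespace variant may be the safer choice for noisy non-production workloads.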

Has anyone successfully escalated a billing dispute past Tier 1 support recently?

It seems their policy is now to enforce full payment even on obvious runaway/accidental usage, which is absurd since it's LOGS! TEXT!

pants2 5 hours ago

Congrats on finding it within three days. I recently discovered something similar on our infra that had been going on for months (just an INFO log in a tight loop). This is the type of thing that major cloud providers should absolutely have sensible defaults and alerts on, but don't because it's how they make all their money.

It's one thing when it's company cash but I would never ever use a big cloud provider for a personal project (with my credit card). There are way too many ways to accidentally run up an infinite bill. DigitalOcean has plenty of services and more predictable costs, or your local data center will be happy to make a deal on some bare metal servers if you need more horsepower.

I hope you're able to get it reversed - try finding your GCP rep. Either way let this be a valuable lesson on using cloud.

  • nthypes 3 hours ago

    Yes, this is my 4th email to Billing Support and I keep getting "No" as the answer. Moving to Azure.

    • pants2 an hour ago

      That's my point though. Azure or AWS are just as bad and that won't solve anything. What are you running that requires a major cloud provider?