Comment by geoctl 10 hours ago

Is it? I honestly kinda believe that etcd is probably the weakest point in vanilla k8s. It is simply unsuitable for write-heavy environments and causes lots of consistency problems under heavy write loads. It's generally slow, it has value size constraints, it offers very primitive querying, etc. Why not replace etcd altogether with something like Postgres + Redis/NATS?

itsnowandnever 9 hours ago

that touches on what I consider the dichotomy of k8s: it's a really scalable system that makes it easy to spin up a cluster locally on your laptop and interact with the full API locally just like in prod. so it's a super scalable system with a dense array of features. but paradoxically most shops won't ever need the vast majority of k8s features, and by the time they scale to where they do need a ton of distributed init features, they're extremely close to the point where they'd be better served by a bespoke system conceived from scratch in house that solves problems very specific to the business in question. if you have many thousands of k8s nodes, you're probably in the gray area of whether using k8s is worth it, because the k8s pull/watch control loop will never be as fast as a centralized push control plane. and naturally at scale that problem will only compound

  • pas 8 hours ago

    but it's also standard, you can hire for it, outsource it, etc.

    and it's pretty modular too, so it can even serve as the host for the bespoke whatever that's needed

    though I remember reading the fly.io blog post about their custom scheduler/allocator, which illustrates nicely how much of a difference a custom in-house solution makes if it works well

  • trenchpilgrim 8 hours ago

    The other draw: Because k8s is open, you can easily hire employees, contractors, consultants and vendors and have them immediately solve problems within the k8s ecosystem. If you run a bespoke system, you have to train engineers on the system before they can make large contributions.

varispeed 9 hours ago

> Why not replace etcd altogether with something like Postgres + Redis/NATS?

Holy Raft protocol is the blockchain of cloud.

  • trenchpilgrim 9 hours ago

    You can do leader election without etcd. The thing etcd buys you is you can have clusters of 3, 5, 7 or 9 DB nodes and lose up to 1, 2, 3, or 4 nodes respectively. But honestly, the vast majority of k8s users would be fine with a single SQL instance backing each k8s cluster and just running two or more k8s clusters for HA.

    k3s doesn't require etcd, and I'm pretty sure GKE uses Spanner and Azure uses Cosmos DB under the hood.
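
The 3/5/7/9 figures above follow from majority quorum, which is what Raft (and therefore etcd) relies on. Here is a minimal sketch of that arithmetic; the function names `quorum` and `tolerated_failures` are made up for illustration and are not part of any etcd or Kubernetes API.

```python
def quorum(members: int) -> int:
    """Smallest majority of a cluster with `members` voting nodes."""
    return members // 2 + 1


def tolerated_failures(members: int) -> int:
    """How many nodes can be lost while a majority is still reachable."""
    return members - quorum(members)


if __name__ == "__main__":
    # Cluster sizes mentioned in the comment above.
    for members in (3, 5, 7, 9):
        print(
            f"{members} nodes: quorum={quorum(members)}, "
            f"can lose {tolerated_failures(members)}"
        )
```

Under the same formula, an even-sized cluster tolerates no more failures than the odd size below it (4 nodes tolerate 1 loss, same as 3), which is presumably why only odd sizes are listed.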