KubeVPN: Revolutionizing Kubernetes Local Development
(github.com)
65 points by naison a day ago
> the database
> the backend
Many k8s clusters are quite a bit more complicated than a single database and single backend. Some have dozens or hundreds of deployments. Even if you have the horsepower to run large deployments on your system, you will have problems wiring it all up to match production.
I think you are talking past each other about different stages of development.
At early stages you are writing some code and tests within a single component; here you are iterating on a single binary/container. At some stage a change may involve multiple components.
Once you are satisfied with your code changes you would want to run those components in an environment that simulates how they communicate normally.
In Kubernetes this may mean you need your cluster and its networking components, which may themselves need configuration changes tested as part of your new feature. You may have introduced new business metrics that you want to verify are collected and shipped to your desired metrics aggregator so that you can build and expose dashboards, you may want to create new alerts from those metrics and verify that they trigger as expected, and so on.
You can see how you may need to run many components in order to test a change in only one. I don’t think this is bad engineering, and I don’t think it’s specific to kubernetes or “web-scale”.
That's quite a big conclusion to draw from my statement. Whether or not it is good engineering really depends on the problem you're solving, your team structure, your integration footprint, etc. Not everything is a custom CRUD app.
My ideal k8s dev env (I wonder if any of the tools do this):
- local on my machine.
- ingress with https + subdomains integrated with mDNS (so I can access the services easily from my phone when developing mobile apps). mDNS also means other devs can set it up locally for themselves; rough sketch of the mDNS piece below.
- easily swap what I'm working on: if I have 3 services A, B, C, then while I'm working on A locally I want B and C to run in the cluster and to be able to interact with them; likewise, if I'm working on B, then A and C should run in the cluster.
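Roughly what I have in mind for the mDNS piece, as a sketch with avahi on Linux (the hostname and IP are made up):

    # Advertise a .local alias for the box running the ingress, so phones and
    # other machines on the LAN can resolve it without touching any DNS server.
    avahi-publish -a -R dev-ingress.local 192.168.1.50

One caveat: most mDNS resolvers only handle single-label names directly under .local, so the per-service subdomain part is the bit that tends to need something beyond plain mDNS.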
Instead of mDNS, they could point a DNS record for a subdomain (techno00.dev.thecompany.com, preferably under a different domain than your real one) at their local IP address and then do the DNS-01 challenge with Let's Encrypt to get a valid TLS cert for the subdomain. Then the only problem is that some routers block DNS responses containing RFC-1918 IP addresses, but everyone is using DoT/DoH by now, right? ... right?
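Concretely, something like this, assuming the zone is hosted on Cloudflare and you're using certbot's DNS plugin (the subdomain and credential path are placeholders):

    # The A record for techno00.dev.thecompany.com points at your LAN IP
    # (set once in the DNS provider's UI or API). DNS-01 then proves control
    # of the name without any inbound connectivity.
    # Assumes the certbot-dns-cloudflare plugin and an API token in cloudflare.ini.
    certbot certonly \
      --dns-cloudflare \
      --dns-cloudflare-credentials ~/.secrets/cloudflare.ini \
      -d techno00.dev.thecompany.com

A wildcard (-d '*.techno00.dev.thecompany.com') works the same way over DNS-01 if you want per-service subdomains.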
I have tools I've developed that do some/most of this, but they're internal/proprietary so I can't share them directly. What I _can_ share is how it works. Maybe somebody with more time/energy/will to live than me can take a crack at the problem.
Every developer runs Rancher Desktop as a local k8s cluster.
There's a controller + nginx container in the cluster.
For any appropriately annotated ingress, the controller's mutating webhook patches the ingress to be backed by its own proxy service. It then reaches in and reconfigures the cluster's CoreDNS to resolve the domain to its proxy service as well.
Then as pods are started/stopped, it tracks whether the service your ingress was supposed to point at has any running pods behind it. If it does, it configures nginx to forward the request to the local service. If it doesn't, it configures nginx to proxy it to the upstream URL.
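To give a flavor of the CoreDNS part, here's a simplified sketch of the effect (the domain, namespace, and service names are placeholders; the real controller merges the rule into the existing Corefile programmatically):

    # Pull the cluster's Corefile, add a rewrite rule so in-cluster lookups of the
    # public domain land on the proxy Service, then roll CoreDNS to pick it up.
    kubectl -n kube-system get configmap coredns -o yaml > coredns.yaml
    # ...inside the ".:53 { ... }" server block, add a line like:
    #   rewrite name oursite.com localdev-proxy.localdev-system.svc.cluster.local
    kubectl apply -f coredns.yaml
    kubectl -n kube-system rollout restart deployment coredns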
That all comes together in three main ways:
1. We start chromium with, among other things, a flag to set host rules that rewrite all connections to 127.0.0.1. So going to `oursite.com` loads our site through your cluster. API requests the page makes to `service.oursite.com` get routed through the local cluster.
2. Any requests your containers make to other services can request `oursite.com` and because of the CoreDNS stuff they'll hit the local proxy and get routed appropriately.
3. ... And for anything else we just have a real `localdev.cloud` domain with a wildcard subdomain that resolves to 127.0.0.1 and include that host on all the ingresses as well. So Postman can hit a service at `service.localdev.cloud`.
This puts us in a good place to "easily swap". There's a pile of bash scripts calling themselves a justfile that manages most of this. Run `system start` to bring up the controller and proxy as well as deploy all of the ingresses and services (so it knows what it's rewriting/proxying). Then you just do `project whatever up` and `project whatever down` to create/destroy the deployment and other resources. Mounting the project code into the container is `project whatever mount`--this is a separate step so that, e.g., a FE guy who wants to test a specific BE build can just throw the container tag in a .env, start it up, and keep working on whatever he was working on. (And QA can just start up any build of anything without any extra fuss.)
As for SSL, we're mostly solving for "on the same machine". The controller generates a root certificate on first start (so every developer has their own and it couldn't be used to intercept anyone else's traffic), then uses that to issue + sign certificates for everything else. You could add that to any other devices if you wanted. What we do is just slip chromium an extra flag to tell it to treat our certificates as valid.
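The certificate part is nothing exotic; it's roughly the following, though the controller does it in code rather than shelling out (names and filenames here are placeholders):

    # One-time root CA, unique per developer.
    openssl req -x509 -newkey rsa:4096 -sha256 -nodes -days 825 \
      -subj "/CN=localdev root" -keyout ca.key -out ca.crt

    # Issue a leaf cert for an ingress host, signed by that root.
    openssl req -newkey rsa:2048 -nodes -subj "/CN=oursite.com" \
      -keyout tls.key -out tls.csr
    openssl x509 -req -in tls.csr -CA ca.crt -CAkey ca.key -CAcreateserial \
      -days 365 -sha256 -out tls.crt \
      -extfile <(printf "subjectAltName=DNS:oursite.com,DNS:*.oursite.com")

    # Hand it to the ingress as a normal TLS secret.
    kubectl create secret tls oursite-tls --cert=tls.crt --key=tls.key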
So I can run a just command to open up chromium with a novel's worth of extra flags, go to `https://oursite.com`, and everything works. If there's a specific BE service I need to poke at, I just `project my-service up` and go back to chromium and keep doing stuff. If I want to make some FE changes, `project my-fe up && project my-fe mount` and start editing code.
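Out of that novel of flags, the two interesting ones are roughly these (a sketch; the domain is a placeholder and the exact invocation differs):

    # Resolve our domains to the local cluster without touching /etc/hosts, and
    # trust certs chained to the per-developer root via its SPKI fingerprint.
    SPKI=$(openssl x509 -in ca.crt -pubkey -noout \
      | openssl pkey -pubin -outform der \
      | openssl dgst -sha256 -binary | base64)
    chromium \
      --host-resolver-rules="MAP oursite.com 127.0.0.1, MAP *.oursite.com 127.0.0.1" \
      --ignore-certificate-errors-spki-list="$SPKI" \
      --user-data-dir="$HOME/.localdev-chromium"  # separate profile so the overrides don't leak into normal browsing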
There's a lot more to all of this but this comment's already way too long (feel free to ask, though, and I can talk your ear off). End of the day, we went from it taking 1-2 days to get people to a point where they could start editing code to 45 minutes (and I tested this when a computer died one day), most of which was waiting for Rancher to download, then starting it and waiting for it to download more stuff. We went from it taking a day to get some rarely-touched legacy service limping along enough that you could debug something, to it being consistently a single command and about 30 seconds. We went from spending a bunch of time reconfiguring all the services for each combination you might try to run to... just not doing that anymore. And bugs/issues/misalignments get caught much earlier because, it turns out, making it easy to actually run the software together means people will do it more.
I have most of what you're asking for and you're definitely on the right track--it's a way nicer way to live.
Looks a lot like Telepresence[0] to me.
As the other commenter said, k3d is definitely the way to go in 2025. It uses k3s under the covers, which you can actually run in production once you've learned it with k3d locally.
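If you haven't tried it, spinning up a local cluster is basically one command (assuming docker and k3d are installed; names and ports are arbitrary):

    # Three-node local cluster (1 server + 2 agents), with host port 8080
    # mapped to the built-in load balancer for ingress traffic.
    k3d cluster create dev --agents 2 -p "8080:80@loadbalancer"
    kubectl get nodes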
I've been operating a home cluster (on-prem and lately hybrid with cloud instances) for several years now, running a lot of self-hosted applications for my family.
To be frank I struggle to understand what kubevpn does. While I may be missing something important, the quality of the documentation and description is lacking. Documentation is key as you are learning, so I suggest you don't use it and just focus on deploying and using k8s to host something.
IMO, k3s is the best distribution for hobbyists, by far. If you want to add connectivity to it, k3s supports Tailscale now, and I can confirm that it is quite stable.
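If memory serves, the Tailscale integration is just a flag at install time; check the k3s docs for your version, and the join key below is a placeholder you generate in the Tailscale admin console:

    # Recent k3s releases can join a tailnet directly via --vpn-auth.
    curl -sfL https://get.k3s.io | sh -s - server \
      --vpn-auth="name=tailscale,joinKey=tskey-auth-XXXXXXXXXXXX"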
Looks like it's similar to mirrord or DevSpace: it's for developing components in Kubernetes clusters without having to rebuild an image and redeploy every time you want to test something.
I don't really understand the point of this. I have a production cluster and I develop locally. My dev environment is not connected to the production cluster. That seems super dangerous.
My dev environment has the database with mock data, the backend, etc. all running there. I would never connect to the production cluster. I don't need to VPN into another cluster to run locally.
Even if I have a dev cluster/namespace, then I will run the code I'm currently developing there. That's the point of a dev cluster. (Tilt for example can do both: a local cluster (minikube/k3d/...) or a remote test cluster)
I don't understand in what situation you have an app that needs to partially run in the cluster and needs to partially run on your machine.