When Every Network is 192.168.1.x
(netrinos.com) | 153 points by pcarroll 4 days ago
I like to think this is what we did. It's a simple Linux software stack - Linux, nftables, WireGuard, Go... But the goal was also to make it automatic and easy to use. It's not for my Mom. But you don't need a CCNP either. The trick is in the automation and not the stack itself.
The key distinction with an L3VPN setup is that the packets are unmodified from the IP layer upwards (inclusive); they're just encapsulated/labelled/tagged (depending on your choice of distinguisher). That encapsulation is a stateless operation, but comes at the cost of MTU (which in your case should be a controllable factor, since the inner flows don't really hit uncontrolled devices). Depending on what you're trying to do, the statelessness can be anything from useless to service-critical (the latter if you're at risk of DoS through excessive state creation). It can also alleviate NAT problems; e.g. SIP and RTP are "annoying" to NAT.
(ed.: To be fair, 1:1 NAT can be almost stateless too, that is if your server side ["Technician"] can be 1:1 mapped into the customer's network, i.e. the other direction. This only works if you have very few devices on "your" side and/or/according to how many IPs you can grab on the customer network.)
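On the MTU cost mentioned above: each layer of encapsulation subtracts its header overhead from the usable inner MTU. A worked example with WireGuard's worst-case (IPv6 underlay) figures:

# 40 bytes outer IPv6 + 8 bytes UDP + 32 bytes WireGuard framing:
echo $((1500 - 40 - 8 - 32))   # prints 1420, wg-quick's usual default MTU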
The IPv6/NAT64 approach, meanwhile, is very similar to what you did; it just gets rid of the need to allocate unique IP addresses to devices. The first 96 bits of the IPv6 address become a customer/site ID, and the last 32 bits are the unmodified device IPv4 address.
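To illustrate the layout (2001:db8:64:1::/96 is a made-up per-customer prefix; the low 32 bits carry the device's IPv4 address verbatim):

# Embed device 192.168.1.100 into the customer's /96 prefix:
printf '2001:db8:64:1::%02x%02x:%02x%02x\n' 192 168 1 100
# prints 2001:db8:64:1::c0a8:0164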
10. is a /8 (24 payload bits), 172.16 is a /12 (so 20), and 192.168 is a /16 (so 16). Very little need to spend more than 18 bits of space to map every 'usable' private IPv4 address once per customer. Probably also fewer than 14 bits (16k) of customers to service.
There are more ranges I didn't know about offhand but found when looking up the 'no DHCP server' autoconf range (IPv4 link-local, 169.254.0.0/16).
That's all true on a statement level, but doesn't make an IPv4:IPv4 NAT solution better than either VRF/encap or IPv6 mapping.
The benefit with VRF/encap is that the IPv4 packets are unmodified.
The benefit with IPv6 mapping is that you don't need to manage IPv4:IPv4 tables and have a clear boundary of concerns & zoning.
In both cases you don't give a rat's ass which prefixes the customer uses. That math/estimation you're doing there… just entirely not needed.
The problem with talking to a telco is that it's never just one: you have to deal with whichever telcos your customers happen to use. And if there are multiple routers between the cameras and that telco router at the customer location, it's a shitshow trying to configure anything.
Much easier to drop some router on site that is telco neutral and connect back to your telco neutral dc/hq.
The Metro Ethernet Forum standardized a lot of services telcos can offer, many years ago
No good when the upstream is some wifi connection provided by the building management, rather than a telco themselves.
May as well pick a single solution that works across all Internet connections and weird setups, be an expert in that, vs having to manage varying network approaches based on telco presence, local network equipment, operating country, etc.
On the latter points: VRF in my scenarios won't scale.
Need to provide support access to 10k-50k locations, all with the same subnet (industry-standard equipment where the vendor mandates specific IP addressing, for better or worse). They are always feeding data into the core, too.
Much easier to just VPN+NAT.
That is a valid point. Though I would probably check first what the scaling limits on VRFs actually are; there was some netdev work a while back to fix scaling with 100k to 1M devices (a VRF is a device, though also a bit more than that). It's only the server ("technician") that needs to have all of them (whether that helps depends on the setup); intermediate devices just need to forward without looking at the tags, and the VPN entry point only cares about its own subset of customers.
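For what it's worth, the per-customer binding on that server can be a one-liner (a sketch; vrf-cust42 is an assumed VRF name, see man ip-vrf):

# Run a support session inside a single customer's routing domain:
ip vrf exec vrf-cust42 ssh admin@192.168.1.10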
I'd probably use the IPv6 + NAT64 setup in your situation.
What we could do is increase the number of IP addresses available. Just imagine if we enlarged the IP address space from 32 bits to 128 bits: Every device on the Internet could have a unique IP address!
The thing is, this upgrade you two are praising is designed to satisfy the original article's needs and no one else's.
Why do all those devices need to talk to each other btw? It's never specified. Is it a user need or a data collection/spyware need?
In a world where security articles make the news saying that you could obtain access to something IF the attacker already has local root and IF the moon is in a quarter phase and IF the attacker is physically present in the same room as the machine and this means the sky is falling...
... we should be questioning why disparate devices on unrelated home networks need to talk to each other.
The issue is that we DO NOT want every device to have a publicly routable IP address. It does make sense for some machines, but you probably don't want your Internet-of-Shit devices to have public IPs. Of course you can firewall the devices, but you are always one misconfiguration or bug away from exposing devices that should not be exposed, when a local network is a more natural solution for what is supposed to remain local in the first place.
Why not IPv6? Pretending that it doesn't exist??
https://en.wikipedia.org/wiki/List_of_IPv6_transition_mechan...
The solution is to run IPv6 on the overlay and have the customer-site gateway translate it to the target IPv4. Conveniently, you can do the translation more or less statelessly, and very easily, because you can just embed the IPv4 addr in the IPv6 address. For example, you could grab a /64 prefix, assign 32 bits to a customer/gateway ID and the other 32 bits to the target IPv4 addr.
The squishy side.
Incidentally, I think that's an overestimate of the number of devices that don't support IPv6. At this point, vendors have to go out of their way to disable IPv6, and they lose out on some government/enterprise tenders that require IPv6 even if they're not running it (yet).
Right, IPv6 is baked into the NIC, so it’s up to developers to use it.
Yes, I was going to suggest NAT64 encapsulating the customer's v4 network on the WireGuard overlay, but their embedded device is presumably a little Linux board, and mainline Linux still lacks any SIIT/CLAT/NAT64 in netfilter. So I guess they'd end up in a world of pain with out-of-tree modules like Jool, or with inefficient funnelling through tun/tap, tayga-style.
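For anyone who does take the userspace route anyway, a minimal tayga.conf sketch (directive names from Tayga's documentation; the prefix and pool are placeholders):

tun-device nat64
ipv4-addr 192.168.255.1
prefix 2001:db8:64:1::/96
dynamic-pool 192.168.255.0/24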
IPv6 solves the addressing problem, not the reachability problem. Good luck opening ports in the stateful IPv6 firewalls in the scenarios outlined in TFA:
> And that assumes a single NAT. Many sites have a security firewall behind the ISP modem, or a cellular modem in front of it. Double or triple NAT means configuring port forwarding on two or three devices in series, any of which can be reset or replaced independently.
I'm not really seeing a reason why it would be impossible to open firewalls in that scenario. More work, sure, but by no means impossible. In any case TFA says right up front that it is trying to solve the problem of overlapping subnets, which IPv6 solves nicely.
Then you've probably never worked in any serious networked embedded systems space. Getting people to open ports on the firewall and making the firewall configuration palatable to the end customer is like a quarter of what I think about when my team makes new features.
> I'm not really seeing a reason why it would be impossible to open firewalls in that scenario.
Cheap-ass ISP-managed routers. You've got to be lucky for these rubbish bins to even somewhat reliably provide IPv6 connectivity to clients at all, or you run into bullshit like new /64s being assigned every 24 hours, or they provide IPv6 but no firewall control whatsoever...
Companies with an IT department, maybe. Companies without IT, not much. People, nope.
I can't see my neighbors opening ports on their switch. What's a switch, to start with. And what happens when they change provider and switch next month?
It's much easier to tell them: I install two boxes. One is the camera (or whatever), the other one is necessary to make the camera work properly, keep it online, don't switch it off.
With IPv6 you don’t forward ports at all. The device already has a public address.
That's why I said "open ports", not "forward ports".
Stateful firewalls are very much a thing on v6. Many mobile ISPs don't allow incoming connections by default, for example.
Many CPEs (home routers) also come with a v6 firewall (I'd guess it's probably more common than not?), and not everybody has admin access to theirs.
That's the addressing problem, although I have some bad news on that: NAT is used with IPv6 in some places.
The reachability problem is, even with public addresses, sometimes you have to do the same thing to "configure port forwarding" with stateful IPv6 firewalls as with double or triple NAT IPv4.
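Where you do control the firewall, the v6 version of a "port forward" is a single accept rule rather than an address rewrite. A sketch, assuming an existing inet filter table with a forward chain, and a made-up device address:

nft add rule inet filter forward ip6 daddr 2001:db8::100 tcp dport 8080 accept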
I recently changed my default subnet to 10.X.Y..., rolling two random numbers to make it highly unlikely that my home subnet, reached through WireGuard, would conflict with the subnet where I am connecting from.
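Something like this, presumably (shuf is from coreutils; the /24 size is my assumption):

# Roll a random /24 under 10.0.0.0/8:
echo "10.$(shuf -i 0-255 -n 1).$(shuf -i 0-255 -n 1).0/24"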
I just use /24s in the lower-middle range of 172.16. Very unlikely to have a conflict there.
When I separated my scientific instruments from IT, I went to fixed IP and set each device to 192.A.B.x where x is different for each instrument or PC. And A & B are for my lab only, but definitely not the same as the "generic" address range IT is using.
One day somebody working days or nights "helpfully" plugged one of IT's loose office-machine-network cables into one of my little lab ethernet switches which had a vacant spot :\
With separate IP subnets it really kept the traffic from crossing; no damage was done, and nobody ever knew until a PC configured for DHCP was plugged into the lab network and IT's router tried to auto-assign it an IP address.
So, uh.
I kinda don't want to share this because:
A) it's a bad idea
B) it means it will be less unique
and
C) I got teased for it a long time ago by my other nerd friends.
But the US DoD has huge blocks of prefixes that it doesn't appear to do anything with; presumably they use them for internal routing, so every device they have could route publicly without NAT.
One of those prefixes is 7.0.0.0/8.
My home network uses that. I have never had an issue with S2S VPNs.
However, there have been a few bits of software (pfSense, for example) that have RFC 1918 hardcoded in some areas and treat 7.0.0.0/8 like a public network; overriding that means doing the entire network setup manually, without the helping hand of the system to build out a working boilerplate.
We chose Go as the development language. Go produces statically compiled binaries that include all dependencies. The only external deps are wireguard, nftables, nmap, etc. All easy stuff. So we have no need for Docker. We publish binaries for ARM64 and AMD64. Avoiding Docker has made it much easier to work with.
I had this happen at home. I'm not convinced it was a good idea to choose default subnets as /20.
It was pretty easy to cause myself problems with Docker Compose. Eventually I ran out of subnets in the 172.16 range, and it happily created subnets in the 192.168 range. Some of them overlapped with subnets on my LAN.
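One way to keep Compose out of your LAN's ranges (a sketch; the pool base here is an arbitrary example) is to pin Docker's default address pools in /etc/docker/daemon.json:

{
  "default-address-pools": [
    { "base": "10.200.0.0/16", "size": 24 }
  ]
}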
Yes, we use Docker (or podman) but generally never rely on Docker’s internal address ranges.
I find a lot of Docker containers using subnets inside 172.16.0.0/12.
Probably for the same reason – 172.16/12 is not as widely used for other networks :-)
This works fine on your end. But the issue we are addressing is on the other end, when you don't control the network and need to reach devices. If all customer sites are running the RFC 1918 blocks, you eventually encounter conflicts. And the conflict will likely be with the second one you try.
I vaguely remember that the 10.x.y address space is widely used by CGNATs.
The IETF really dragged their heels on CGNAT because they thought IPv6 would be easy™ (of course not: it's intentionally designed not to be "almost the same but wider", and includes unworkable stuff like Mobile IPv6[1], which is just a fancy VPN), until they were forced to allocate 100.64.0.0/10 because some ISPs were using not just 10.0.0.0/8 but also US-DoD addresses (especially 11.0.0.0/8, because together they're basically 10.0.0.0/7) as "private" addresses.
[1] Not IPv6 on mobile devices, but a fully-owned IPv6 range that is supposed to be the address for a device regardless of where it is; see RFC 3775.
This is basically what I use Tailscale & their MagicDNS feature for. I manage a few locally hosted Jellyfin servers for myself and some family members, and it's the same problem. I just added Tailscale to them all, and now I can basically do ssh parents.jellyfin.ts.net or inlaws.jellyfin.ts.net.
I need to implement this type of thing for supporting networks of family members, but without the media server aspect - just computer/networking support. I'm looking for a cheap and reliable device that I can put in each home, to give the Tailscale "foothold". Do you happen to know of any tiny devices? I was thinking there must be something even cheaper than a Raspberry Pi to perform this single function at each location.
I was about to say that. This is what I do too.
The only drawback is routes: they won't work for the same CIDR (I mean the Tailscale feature where you can say "to reach 192.168.16.13, a device that does not support Tailscale, go through this Tailscale gateway"). Because of this, I had to renumber my parents' network, which clashed with one of mine, to be able to access stuff like the printer.
The way we did it, routing is not a problem. Any Netrinos client (Windows, Mac, or Linux, including the free version) can act as a gateway. It assigns a unique overlay IP to devices on the local network that can't run software themselves, like cameras, NAS units, or printers, and handles the NAT translation.
Think of it like a router's DMZ feature, but inverted. Instead of exposing one device to the internet, each device gets a private address that's only reachable inside your mesh network.
This overlay approach is fantastic, but I do not think it exists in Tailscale.
I decided to learn IPv6 recently and I'm pleasantly surprised how simple and elegant it is. Truly a joy. Highly recommend, if you've never worked with IPv6 to try it. It's like discovering a bidet.
> The gateway device performs 1:1 NAT. Traffic arriving for 100.97.14.3 is destination-translated to 192.168.1.100, and the source is masqueraded to the gateway's own LAN address.
Couldn't you tell the WG devices that 192.168.2.0/24 refers to the 192.168.1.0/24 network at customer A, such that 192.168.2.55 is routed to 192.168.1.55? Same for 192.168.3.0/24 referring to customer B.
I think this is what the article is getting at but I don't see the value in manually assigning an alias to each non-wg device, versus assigning an alias to the entire LAN.
It's not enough to set fake routes. You have to edit the addresses in the packets, so the end devices will receive them.
Yeah, so instead of DNAT, use NETMAP on the gateway device for that LAN. (Sorry if I'm abusing the terminology, I only do this stuff like once a year for homelab.)
eg this is what I'm currently using to alias my home network
# Rewrite 192.168.150.?? as 192.168.50.??
PreUp = iptables -t nat -A PREROUTING -d 192.168.150.0/24 -j NETMAP --to 192.168.50.0/24
PostDown = iptables -t nat -D PREROUTING -d 192.168.150.0/24 -j NETMAP --to 192.168.50.0/24
With other wg peers getting a 192.168.150.0/24 entry in the AllowedIPs for this gateway (if needed).

We implemented a very similar solution more than five years ago. The NanoPi R3S was not available then, so we used the GL.iNet GL-MT300N-v2 (aka Mango) running OpenWrt as our edge gateways. It's slow and only has two 100 Mb ports, but that was never the bottleneck. At the time, I was able to assemble a batch of 10, including cables and power supplies, for only $300, which was ridiculously cheap for such a flexible solution. If you need a polished, turnkey solution, by all means check Netrinos out. If you have a strong Linux/nftables/WireGuard background, this solution is easy to roll on your own.
The suggested solution involves using the CGNAT /10 in conjunction with a VPN, but I've actually seen someone do this, and still have problems with certain end users where their next hop for routing also involves a router with an IPv4 address in the same space, so it's not really bulletproof either. We may as well consider doing other naughty things like co-opting DoD non-routable /8s or the test net in the RFCs you're not supposed to use, because basically anything you pick is going to have problems.
This is what the NETMAP target in iptables is for - map an entire subnet to another subnet, including the reverse. We were doing this 20 years ago for clients trying to on-board other companies that they'd bought. It's horrible, but it does solve the problem in a pinch.
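The nftables equivalent is stateful prefix NAT (a sketch; syntax per the NETMAP support added around nft 0.9.4 and recent kernels, reusing the subnets from the iptables example above):

nft add table ip alias
nft add chain ip alias prerouting '{ type nat hook prerouting priority -100; }'
nft add rule ip alias prerouting 'dnat ip prefix to ip daddr map { 192.168.150.0/24 : 192.168.50.0/24 }'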
I feel like this is really only an issue with true site to site VPNs. Client to site shouldn't have this issue because the VPN concentrator is like a virtual NAT.
The best strategy might be to maintain the ability to easily reassign the network for a site. If every site is non-overlapping the problem does become trivial. I'd much rather fight a one time "reboot your machines tonight" battle than the ongoing misery of mapping things that do not want to be.
One step beyond this is the multi-subnetted network on each side. You get the DNAT working, but then the app gets more complex over time and suddenly you're calling 192.168.2.x, which leads to asymmetric routes. Some traffic works, some traffic works one way, and other traffic disappears.
Then you as the client/app manager pull your hair out as the network team tells you everything is working fine.
At the risk of ruining my solution: I moved my LAN subnet into the 172.16.0.0/12 block. It's used in virtual private clouds and is not publicly routable. Since switching to this, I have not had any collisions.
Shameless plug: this is exactly the same problem our team had when we had to maintain a bunch of our customers' servers. All of the subnets were the same, and we had to jump through hoops just to access those servers: VPNs, port forwarding, dynamic DNS with VNC, we've tried it all. That is why we developed https://sshreach.me/ - now it's a click of a button.
The initial idea started as a bunch of SSH tunnels. Been doing that for years. But WireGuard seemed a better solution at scale, and more efficient. When I first saw WireGuard, it blew my mind how elegantly simple it was. I always hated VPNs. Now I seem to have made them my life...
Your website landing page is great. No stock photo hipsters drinking coffee, no corporate fluff amid whitespace wasteland. Just straight to the point. Rare sight today.
> But the moment two sites share the same address range, you have an ambiguity that IP routing cannot resolve.
Writing PF or nft rules to NAT these hyper-legacy subnets on the local side of the layer3 tunnel is actually super trivial, like 20 seconds of effort to reason about and write in a config manifest.
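For example, the article's 1:1 mapping from the quote above comes down to a handful of lines (a sketch; the table and chain names are mine, the addresses are the article's):

nft add table ip overlay
nft add chain ip overlay prerouting '{ type nat hook prerouting priority -100; }'
nft add chain ip overlay postrouting '{ type nat hook postrouting priority 100; }'
nft add rule ip overlay prerouting ip daddr 100.97.14.3 dnat to 192.168.1.100
nft add rule ip overlay postrouting ip daddr 192.168.1.100 masquerade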
As written in the article, a device on the customer site is required anyway. At that point you might as well deploy a router that has a supportable software stack and, where possible, speaks sober IP instead of legacy IP.
I have been running IPv6-only networks since 2005 and have been deploying IPv6-only networks since 2009. When I encountered a small implementation gap in my favorite BSD, I wrote and submitted a patch.
Anyone who complained about their favorite open source OS having an IPv6 implementation gap, or was using proprietary software (and then was also dumb enough to complain about it), should be ashamed of themselves for doing so on any forum with "hacker" in the name. But we all know they aren't ashamed of themselves, because the competency crisis is very real and the coddle culture lets such disease fester.
There is no excuse to not deploy at minimum a dual-stack network if not an IPv6-only network. If you deploy an IPv4-only network you are incompetent, you are shitting up the internet for everyone else, and it would be better for all of humanity if you kept any and all enthusiasm you have for computers entirely to yourself (not a single utterance).
> Support for IPv6 is notoriously bad in residential modems.
No? Over here in (South) East Asia we have been deploying IPv6 for nearly a decade now. The users are getting their IPv6 connectivity. Before someone jumps out and shouts SeCuRiTy: the firewall is enabled by default.
I am not saying the support is perfect. I know some people moan about lackluster IPv6 configuration in many routers. But for 90% of residential internet users (who care about pretty much nothing but watching YouTube and browsing social media), it damn sure is.
They clearly haven't talked to a telco or network device vendor; those would've sold them a VRF/EVPN/L3VPN based solution… for a whole bunch of money :)
You can DIY that these days though: plain Linux software stack, with optional hardware offload for some specific things and devices. Basically, you have a traffic distinguisher (VXLAN tunnel, MPLS label, SRv6, heck, even a GRE tunnel), keep a whole bunch of VRFs around (man ip-vrf), and have your end services (server side) bind into the appropriate VRFs as needed.
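A minimal iproute2 sketch of that stack (names, VNI, table number, and addresses are all made up):

# One VRF per customer; the VXLAN VNI is the traffic distinguisher.
ip link add vrf-cust42 type vrf table 1042
ip link set vrf-cust42 up
ip link add vx-cust42 type vxlan id 42 dstport 4789 local 198.51.100.1 remote 203.0.113.7 dev eth0
ip link set vx-cust42 master vrf-cust42 up
# Inner IPv4 packets cross the tunnel unmodified; server-side services
# bind into the right VRF, e.g. via ip vrf exec.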
Also, yeah, with IPv6 you wouldn't have this problem. Regardless of whether it's GUAs or ULAs.
Also-also, you can do IPv6 on the server side until the NAT (which is in the same place as in the article), and have that NAT be a NAT64 with distinct IPv6 prefixes for each customer.