Comment by eqvinox 4 days ago

10 replies

They clearly haven't talked to a telco or network device vendor; either would've sold them a VRF/EVPN/L3VPN-based solution… for a whole bunch of money :)

You can DIY that these days though, plain Linux software stack, with optional hardware offload on some specific things and devices. Basically, you have a traffic distinguisher (VXLAN tunnel, MPLS label, SRv6, heck even GRE tunnel), keep a whole bunch of VRFs (man ip-vrf) around, and have your end services (server side) bind into appropriate VRFs as needed.
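A minimal sketch of the per-customer VRF idea above, not a tested deployment. The naming scheme (`vrf-c<ID>`), the routing table offset, and the tunnel interface name are assumptions made for illustration; the `ip link` commands and the `SO_BINDTODEVICE` socket option are the standard Linux mechanisms for creating a VRF and binding a service into it.

```python
import socket

IFNAMSIZ_MAX = 15  # Linux interface names are limited to 15 characters


def vrf_name(customer_id: int) -> str:
    """Hypothetical naming scheme: one VRF device per customer."""
    name = f"vrf-c{customer_id}"
    if len(name) > IFNAMSIZ_MAX:
        raise ValueError(f"interface name too long: {name}")
    return name


def vrf_setup_commands(customer_id: int, tunnel_ifname: str, table_base: int = 1000) -> list[str]:
    """Return the iproute2 commands that create the VRF and enslave the
    customer's tunnel interface (VXLAN, GRE, ...) to it.
    table_base is an assumed offset to keep clear of the default tables."""
    vrf = vrf_name(customer_id)
    return [
        f"ip link add {vrf} type vrf table {table_base + customer_id}",
        f"ip link set dev {vrf} up",
        f"ip link set dev {tunnel_ifname} master {vrf}",
    ]


def listen_in_vrf(customer_id: int, port: int) -> socket.socket:
    """Open a TCP listener bound into the customer's VRF.
    Linux only; typically needs root or CAP_NET_RAW, and the VRF must exist."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_BINDTODEVICE, vrf_name(customer_id).encode())
    sock.bind(("0.0.0.0", port))
    sock.listen()
    return sock


if __name__ == "__main__":
    for cmd in vrf_setup_commands(42, "vxlan42"):
        print(cmd)
```

Because every customer gets its own routing table, overlapping customer prefixes never meet; the service picks the customer by picking the VRF it binds into.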

Also, yeah, with IPv6 you wouldn't have this problem. Regardless of whether it's GUAs or ULAs.

Also-also, you can do IPv6 on the server side until the NAT (which is in the same place as in the article), and have that NAT be a NAT64 with distinct IPv6 prefixes for each customer.

pcarroll 4 days ago

I like to think this is what we did. It's a simple Linux software stack - Linux, nftables, WireGuard, Go... But the goal was also to make it automatic and easy to use. It's not for my Mom. But you don't need a CCNP either. The trick is in the automation and not the stack itself.

  • eqvinox 4 days ago

    The key distinction with an L3VPN setup is that the packets are unmodified from the IP layer upwards (inclusive); they're just encapsulated/labelled/tagged (depending on your choice of distinguisher). That encapsulation/… is a stateless operation, but comes at the cost of MTU (which in your case should be a controllable factor, since the inner flows don't really hit uncontrolled devices). Depending on what you're trying to do, the statelessness can be anything from useless to service critical (the latter if you're under some risk of DoS due to excessive state creation). It can also alleviate NAT problems, e.g. SIP and RTP are "annoying" to NAT.
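To put rough numbers on the MTU cost mentioned above, here is a small sketch. It assumes an IPv4 underlay, no IP options, basic GRE without key or checksum fields, and a single MPLS label; the table and function names are made up for this example.

```python
# Per-packet overhead in bytes, under the assumptions stated above.
ENCAP_OVERHEAD = {
    "vxlan": 20 + 8 + 8 + 14,  # outer IPv4 + UDP + VXLAN header + inner Ethernet
    "gre": 20 + 4,             # outer IPv4 + basic GRE header
    "mpls": 4,                 # one 4-byte label
}


def inner_mtu(underlay_mtu: int, encap: str) -> int:
    """Largest inner IP packet that fits without fragmentation."""
    return underlay_mtu - ENCAP_OVERHEAD[encap]


if __name__ == "__main__":
    for encap in ENCAP_OVERHEAD:
        print(encap, inner_mtu(1500, encap))
```

Since both ends of the tunnel are under your control here, the inner MTU can simply be configured to the computed value rather than discovered.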

    (ed.: To be fair, 1:1 NAT can be almost stateless too, that is if your server side ["Technician"] can be 1:1 mapped into the customer's network, i.e. the other direction. This only works if you have very few devices on "your" side, and it depends on how many IPs you can grab on the customer network.)
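A sketch of why 1:1 NAT needs (almost) no state: the translation is a pure function of the address, so nothing has to be remembered per flow. The function name and the example prefixes are assumptions for illustration only.

```python
import ipaddress


def map_one_to_one(addr: str, from_net: str, to_net: str) -> ipaddress.IPv4Address:
    """Translate addr from from_net to the same offset inside to_net.
    Both networks must have the same prefix length (a strict 1:1 mapping)."""
    src = ipaddress.ip_network(from_net)
    dst = ipaddress.ip_network(to_net)
    if src.prefixlen != dst.prefixlen:
        raise ValueError("1:1 mapping needs equally sized networks")
    ip = ipaddress.ip_address(addr)
    if ip not in src:
        raise ValueError(f"{ip} is not inside {src}")
    offset = int(ip) - int(src.network_address)
    return dst.network_address + offset


if __name__ == "__main__":
    # Hypothetical: eight technician addresses mapped into a spare /29
    # of the customer's LAN.
    print(map_one_to_one("10.99.0.5", "10.99.0.0/29", "192.168.1.240/29"))
```

The reverse direction is the same function with the two networks swapped.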

    The IPv6/NAT64 approach, meanwhile, is very similar to what you did; it just gets rid of the need to allocate unique IP addresses to devices. The first 96 bits of the IPv6 address become a customer/site ID, the last 32 bits are the unmodified device IPv4 address.
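The address layout described above can be sketched with Python's `ipaddress` module. The base prefix is a made-up ULA, and splitting the 96 bits into a fixed /64 plus a 32-bit customer ID is one possible choice, not something the thread prescribes.

```python
import ipaddress

# Hypothetical ULA /64 reserved for NAT64 mappings.
BASE = ipaddress.IPv6Network("fd12:3456:789a:64::/64")


def customer_prefix(customer_id: int) -> ipaddress.IPv6Network:
    """Per-customer /96: the base /64 followed by a 32-bit customer ID."""
    if not 0 <= customer_id < 2**32:
        raise ValueError("customer_id must fit in 32 bits")
    return ipaddress.IPv6Network((int(BASE.network_address) | (customer_id << 32), 96))


def to_ipv6(customer_id: int, ipv4: str) -> ipaddress.IPv6Address:
    """Embed the device's unmodified IPv4 address in the last 32 bits."""
    v4 = ipaddress.IPv4Address(ipv4)
    return ipaddress.IPv6Address(int(customer_prefix(customer_id).network_address) | int(v4))


def from_ipv6(addr) -> tuple[int, ipaddress.IPv4Address]:
    """Recover (customer_id, device IPv4) from a mapped address."""
    value = int(ipaddress.IPv6Address(addr))
    return (value >> 32) & 0xFFFFFFFF, ipaddress.IPv4Address(value & 0xFFFFFFFF)


if __name__ == "__main__":
    # Two customers using the same camera address no longer collide.
    print(to_ipv6(1, "192.168.1.10"))
    print(to_ipv6(2, "192.168.1.10"))
```

The server side only ever sees the IPv6 addresses; which IPv4 prefixes a customer uses never enters any table.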

    • mjevans 4 days ago

      10. is /8 (24 payload bits), 172.16 is /12 (so 20) and 192.168 is /16 (so 16). Very little need to spend more than 18 bits of space to map every 'usable' private IPv4 address once per customer. Probably also less than 14 bits (16k) of customers to service.

      There's more addresses I didn't know about offhand but found when looking up the 'no DHCP server' autoconf IP address range (Link Local IPv4).

      https://en.wikipedia.org/wiki/IPv4#Special-use_addresses
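For reference, the sizes of the three RFC 1918 blocks can be checked with the standard library. This is only a worked calculation; the helper names are invented for the example, and the 14-bit customer count is taken from the estimate above.

```python
import ipaddress

PRIVATE_BLOCKS = ["10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16"]


def host_bits(cidr: str) -> int:
    """Number of address bits left after the network prefix."""
    return 32 - ipaddress.ip_network(cidr).prefixlen


def total_addresses() -> int:
    """Total addresses across all three private blocks."""
    return sum(ipaddress.ip_network(c).num_addresses for c in PRIVATE_BLOCKS)


def bits_to_index(count: int) -> int:
    """Bits needed to give each of `count` items a distinct index."""
    return (count - 1).bit_length()


if __name__ == "__main__":
    for block in PRIVATE_BLOCKS:
        print(block, host_bits(block))
    total = total_addresses()
    print(total, bits_to_index(total))
    # Flat index over all private addresses plus 14 bits of customer ID.
    print(bits_to_index(total) + 14)
```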

      • eqvinox 3 days ago

        That's all true on a statement level, but doesn't make an IPv4:IPv4 NAT solution better than either VRF/encap or IPv6 mapping.

        The benefit with VRF/encap is that the IPv4 packets are unmodified.

        The benefit with IPv6 mapping is that you don't need to manage IPv4:IPv4 tables and have a clear boundary of concerns & zoning.

        In both cases you don't give a rat's ass which prefixes the customer uses. That math/estimation you're doing there… just entirely not needed.

yardstick 4 days ago

The problem with talking to a telco is that you have to talk with not just one, but with every one your customers may use. And if there are multiple routers between the cameras and that telco router at the customer location, it's a shitshow trying to configure anything.

Much easier to drop some router on site that is telco neutral and connect back to your telco neutral dc/hq.

  • direwolf20 4 days ago

    The Metro Ethernet Forum standardized a lot of services telcos can offer, many years ago

    • yardstick 4 days ago

      No good when the upstream is some wifi connection provided by the building management, rather than by a telco itself.

      May as well pick a single solution that works across all Internet connections and weird setups, be an expert in that, vs having to manage varying network approaches based on telco presence, local network equipment, operating country, etc.

  • eqvinox 4 days ago

    That's all true, but you can also, you know, like, talk to people without buying your whole solution from them :)

    (btw, have you actually read past the first 7 words? I'm much more interested in what people think about the latter parts.)

    • yardstick 4 days ago

      On the later parts, VRF in my scenarios won’t scale.

      Need to provide support access to 10k-50k locations, all with the same subnet (industry standard equipment where the vendor mandates specific IP addressing, for better or worse). They are always feeding data into the core too.

      Much easier to just VPN+NAT.

      • eqvinox 3 days ago

        That is a valid point. Though I would probably check first what the scaling limits on VRFs actually are; there was some netdev work a while back to fix scaling with 100k to 1M devices (a VRF is a device, though also a bit more than that). It's only the server ("technician") that needs to have all of these (depends on the setup if that helps or not), intermediate devices just need to forward without looking at the tags, and the VPN entry point only cares about its own subset of customers.

        I'd probably use the IPv6 + NAT64 setup in your situation.