antithesis-nl 3 days ago

As far as I can tell, Oxide is all about compute, not storage -- i.e., as long as you can guarantee a stable storage layer, your Oxide racks will be fine.

VMware used to go a bit further, in that they allowed your compute nodes to fail and/or your storage nodes to fail, without adverse effects.

If Oxide can do that, out of the box, right now, they'll be having a field day. Otherwise, my reservations about Oxide's business model remain...

  • zellyn 3 days ago

    I _think_ their storage layer is fault tolerant. Not sure though. Hopefully one of them will weigh in here.

    • steveklabnik 2 days ago

      This is not my area of expertise, so I'll quote from our website: https://oxide.computer/product/storage

      The storage service uses OpenZFS for all data storage. This marries Oxide’s distributed data storage and multi-node failure resiliency with the dependability and efficiency OpenZFS has earned in its 20 years of running demanding workloads.

      The Oxide control plane monitors performance metrics as another early signal of component failure. As sleds and SSDs are rotated in and out, the Oxide control plane migrates storage regions to ensure the appropriate redundancy.

      OpenZFS checksums and scrubs all data for early failure detection. Virtual disks constantly validate the integrity of your data, correcting failures as soon as they are discovered.

      • zellyn 2 days ago

        I figured ZFS would do the low-level redundancy, but I was under the impression that Crucible does the higher-level stuff, and I don't know much about it.

        • steveklabnik a day ago

          Yeah, I believe so too, but I haven't ever worked on it so I don't know a ton about it either.

houseofzeus 3 days ago

I think most of these solutions, including OpenShift Virtualization, Hyper-V, Proxmox, etc., do live migration. What the previous post is talking about is some of the more advanced VMware live migration features, like storage live migration and cross-cluster live migration, and some of the automations layered on top of them.