scrp 5 days ago

After years in the making, ZFS raidz expansion is finally here.

Major features added in this release:

  - RAIDZ Expansion: Add new devices to an existing RAIDZ pool, increasing storage capacity without downtime.

  - Fast Dedup: A major performance upgrade to the original OpenZFS deduplication functionality.

  - Direct IO: Allows bypassing the ARC for reads/writes, improving performance in scenarios like NVMe devices where caching may hinder efficiency.

  - JSON: Optional JSON output for the most used commands.

  - Long names: Support for file and directory names up to 1023 characters.
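
To get a quick feel for the new surface area, the commands look roughly like this (a sketch; the pool/dataset names are made up, and the flags are as I understand the 2.3 docs):

    # optional JSON output for the common commands
    zpool status -j
    zfs list -j

    # long names have to be enabled per dataset (feature@longname)
    zfs create -o longname=on tank/very-long-names
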
  • eatbitseveryday 5 days ago

    > RAIDZ Expansion: Add new devices to an existing RAIDZ pool, increasing storage capacity without downtime.

    More specifically:

    > A new device (disk) can be attached to an existing RAIDZ vdev
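
    In command terms that looks roughly like this (a sketch; pool, vdev, and device names are illustrative):

        # grow an existing raidz2 vdev by one disk, online
        zpool attach tank raidz2-0 /dev/sdf
        zpool status tank   # shows expansion progress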

  • cromka 4 days ago

    So if I’m running Proxmox on ZFS and NVMe drives, will I be better off enabling Direct IO when 2.3 gets rolled out? What are the use cases for it?

    • 0x457 3 days ago

      Direct IO is useful for databases and other applications that use their own disk caching layer. Without knowing what you run in Proxmox, no one will be able to tell you if it's beneficial or not.
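
      If you do want to experiment once you know your workload, it's a per-dataset property in 2.3 (a sketch; "tank/pgdata" is a made-up dataset name):

          # default is "standard", which honors O_DIRECT from the app
          zfs set direct=standard tank/pgdata
          # or force all reads/writes on this dataset to bypass the ARC
          zfs set direct=always tank/pgdata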

    • Saris 3 days ago

      I would guess for very high performance NVMe drives.

  • jdboyd 5 days ago

    The first 4 seem like really big deals.

    • snvzz 5 days ago

      The fifth is too, once you consider non-ASCII names.

      • GeorgeTirebiter 5 days ago

        Could someone show a legit reason to use 1000-character filenames? It seems to me that when filenames are that long, they are actually capturing several KEYS that can be easily searched via ls and regexes, e.g.

        2025-Jan-14-1258.93743_Experiment-2345_Gas-Flow-375.3_etc_etc.dat

        But to me this stuff should be in metadata. It's just that we don't have great tools for grepping the metadata.

        Heck, the original Macintosh File System (MFS) had no subdirectories - it did not support a true hierarchy at all. Instead, the illusion of subdirectories was created by embedding folder-like names into the filenames themselves.

        This was done by using colons (:) as separators in filenames. A file named Folder:Subfolder:File would appear to belong to a subfolder within a folder. This was entirely a user interface convention managed by the Finder. Internally, MFS stored all files in a flat namespace, with no actual directory hierarchy in the filesystem structure.

        So, there is 'utility' in "overloading the filename space". But...

  • cm2187 5 days ago

    But I presume it is still not possible to remove a vdev.

    • ryao 5 days ago

      That was added a while ago:

      https://openzfs.github.io/openzfs-docs/man/master/8/zpool-re...

      It works by making a read-only copy of the vdev being removed inside the remaining space. The existing vdev is then removed. Data can still be accessed from the copy, but new writes go to the remaining vdevs, and space is gradually reclaimed from the copy as its old data is no longer needed.
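
      The removal itself is a single command (a sketch; the pool and vdev names are illustrative):

          # evacuate and remove a top-level vdev
          zpool remove tank mirror-1
          zpool status tank   # shows the evacuation progress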

      • lutorm 5 days ago

        Although "Top-level vdevs can only be removed if the primary pool storage does not contain a top-level raidz vdev, all top-level vdevs have the same sector size, and the keys for all encrypted datasets are loaded."

    • mustache_kimono 5 days ago

      Is this possible elsewhere (re: other filesystems)?

      • cm2187 5 days ago

        It is possible with Windows Storage Spaces (remove a drive from a pool) and mdadm/LVM (remove a disk from a RAID array, remove a volume from LVM), which to me are the two major alternatives. Don't know about Unraid.

      • c45y 5 days ago

        Bcachefs allows it

      • pantalaimon 4 days ago

        btrfs has supported online adding and removing of devices in a pool from the start

      • unixhero 5 days ago

        Btrfs

        • tw04 5 days ago

          Except you shouldn’t use btrfs for any parity-based RAID if you value your data at all. In fact, I’m not aware of any vendor that has implemented btrfs with parity-based RAID; they all resort to btrfs on md.

  • BodyCulture 5 days ago

    How well tested is this in combination with encryption?

    Is the ZFS team handling encryption as a first class priority at all?

    ZFS on Linux inherited a lot of fame from ZFS on Solaris, but everyone using it in production should study the issue tracker very well for a realistic impression of the situation.

    • p_l 5 days ago

      The main issue with encryption is occasional attempts by a certain (specific) Linux kernel developer to lock ZFS out of access to advanced instruction set extensions (far from the only weird idea of that specific developer).

      The way ZFS encryption is layered, the features should be pretty much orthogonal to each other, but I'll admit that there's a bit lacking in ZFS native encryption (though in my experience mainly in upper-layer tooling rather than the actual on-disk encryption parts).

      • ryao 4 days ago

        These are actually wrappers around CPU instructions, so what ZFS does is implement its own equivalents. This does not affect encryption (beyond the inconvenience that we did not have SIMD acceleration for a while on certain architectures).

      • snvzz 4 days ago

        >occasional attempts by certain (specific) Linux kernel developer

        Can we please refer to them by the actual name?

    • ryao 4 days ago

      The new features should interact fine with encryption. They are implemented at different parts of ZFS' internal stack.

      There have been many man hours put into investigating bug reports involving encryption and some fixes were made. Unfortunately, something appears to be going wrong when non-raw sends of encrypted datasets are received by another system:

      https://github.com/openzfs/zfs/issues/12014

      I do not believe anyone has figured out what is going wrong there. It has not been for lack of trying. Raw sends from encrypted datasets appear to be fine.

poisonborz 5 days ago

I just don't get how the Windows world - by far the largest PC platform by userbase - still doesn't have any answer to ZFS. Microsoft had WinFS and then ReFS, but it's on the back burner, and while there is active development (Win11 ships some bits from time to time), a release is nowhere in sight. There are some lone warriors attempting the giant task of creating a ZFS compatibility layer, but those projects are far from mature/usable.

How come Windows still uses a 32-year-old file system?

  • GuB-42 5 days ago

    To be honest, the situation with Linux is barely better.

    ZFS has license issues with Linux, preventing full integration, and Btrfs is 15 years in the making and still doesn't match ZFS in features and stability.

    Most Linux distros still use ext4 by default, which is 19 years old, but ext4 is little more than a series of extensions on top of ext2, which is the same age as NTFS.

    In all fairness, there are few OS components that are as critical as the filesystem, and many wouldn't touch filesystems that have less than a decade of proven track record in production.

    • mogoh 5 days ago

      ZFS might be better than any other FS on Linux (I don't judge that).

      But you must admit that the situation on Linux is quite a bit better than on Windows. Linux has so many filesystems in the mainline kernel. There is a lot of development. BTRFS had a rocky start, but it got better.

    • stephen_g 5 days ago

      I’m interested to know what ‘full integration’ would look like. I use ZFS in Proxmox (Debian-based) and it’s really great and super solid, but I haven’t used ZFS in more vanilla Linux distros. Does Proxmox have things that regular Linux is missing out on, or are there shortcomings and things I just don’t realise about Proxmox?

      • whataguy 5 days ago

        The difference is that the ZFS kernel module is included by default with Proxmox, whereas with e.g. Debian, you would need to install it manually.

      • BodyCulture 5 days ago

        You probably don’t realise how important encryption is.

        It’s still not supported by Proxmox. Yes, you can do it yourself somehow, but then you are on your own, missing features, and people report problems with double or triple filesystem layers.

        I do not understand how they do not have encryption out of the box; this seems to be a problem.

        • kevinmgranger 5 days ago

          I'm not sure about proxmox, but ZFS on Linux does have encryption.

    • lousken 5 days ago

      As far as stability goes, btrfs is used by Meta, Synology and many others, so I wouldn't say it's not stable, but some features are lacking.

      • azalemeth 5 days ago

        My understanding is that single-disk btrfs is good, but raid is decidedly dodgy; https://btrfs.readthedocs.io/en/latest/btrfs-man5.html#raid5... states that:

        > The RAID56 feature provides striping and parity over several devices, same as the traditional RAID5/6.

        > There are some implementation and design deficiencies that make it unreliable for some corner cases and *the feature should not be used in production, only for evaluation or testing*.

        > The power failure safety for metadata with RAID56 is not 100%.

        I have personally been bitten once (about 10 years ago) by btrfs just failing horribly on a single desktop drive. I've used either mdadm + ext4 (for /) or zfs (for large /data mounts) ever since. Zfs is fantastic and I genuinely don't understand why it's not used more widely.

      • jeltz 5 days ago

        It is possible to corrupt the file system from user space as a normal user with Btrfs. The PostgreSQL devs found that when working on async IO. And as far as I know, that issue has not been fixed.

        https://www.postgresql.org/message-id/CA%2BhUKGL-sZrfwcdme8j...

      • _joel 5 days ago

        I'm similar to some other people here; I guess once you've been bitten by data loss due to btrfs, it's difficult to advocate for it.

      • fourfour3 5 days ago

        Do Synology actually use the multi-device options of btrfs, or are they using Linux softraid + LVM underneath?

        I know Synology Hybrid RAID is a clever use of LVM + MD raid, for example.

    • cesarb 5 days ago

      > Btrfs [...] still doesn't match ZFS in features [...]

      Isn't the feature in question (array expansion) precisely one which btrfs already had for a long time? Does ZFS have the opposite feature (shrinking the array), which AFAIK btrfs also already had for a long time?

      (And there's one feature which is important to many, "being in the upstream Linux kernel", that ZFS most likely will never have.)

      • wkat4242 5 days ago

        ZFS has also had expansion for a long time, but it was offline expansion. I don't know if btrfs has had online expansion for a long time too?

        And shrinking no, that is a big missing feature in ZFS IMO. Understandable considering its heritage (large scale datacenters) but nevertheless an issue for home use.

        But raidz is rock-solid. Btrfs' raid is not.

    • bayindirh 5 days ago

      > Most Linux distros still use ext4 by default, which is 19 years old, but ext4 is little more than a series of extensions on top of ext2, which is the same age as NTFS.

      However, ext4 and XFS are much simpler and more performant than BTRFS & ZFS as root filesystems on personal systems and small servers.

      I personally won't use either on a single disk system as root FS, regardless of how fast my storage subsystem is.

      • ryao 5 days ago

        ZFS will outscale ext4 in parallel workloads with ease. XFS will often scale better than ext4, but if you use L2ARC and SLOG devices, it is no contest. On top of that, you can use compression for an additional boost.

        You might also find ZFS outperforms both of them in read workloads on single disks where ARC minimizes cold cache effects. When I began using ZFS for my rootfs, I noticed my desktop environment became more responsive and I attributed that to ARC.
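
        For reference, adding those devices and enabling compression is straightforward (a sketch; the pool name and device paths are made up):

            # add an L2ARC (read cache) and a SLOG (sync write log)
            zpool add tank cache /dev/nvme0n1
            zpool add tank log /dev/nvme1n1
            zfs set compression=zstd tank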

    • honestSysAdmin 4 days ago

        https://openzfs.github.io/openzfs-docs/Getting%20Started/index.html
      
      ZFS runs on all major Linux distros, the source is compiled locally and there is no meaningful license problem. In datacenter and "enterprise" environments we compile ZFS "statically" with other kernel modules all the time.

      For over six years now, there has been an "experimental" option presented by the graphical Ubuntu installer to install the root filesystem on ZFS. Almost everyone I personally know (just my anecdote) chooses this "experimental" option. There has been an occasion here and there of ZFS snapshots taking up too much space, but other than this there have not been any problems.

      I statically compile ZFS into a kernel that intentionally does not support loading modules on some of my personal laptops. My experience has been great, others' mileage may (certainly will) vary.

    • xattt 5 days ago

      ZFS on OS X was killed because of Oracle licensing drama. I don’t expect anything better on Windows either.

      • ryao 5 days ago

        There is a third party port here:

        https://openzfsonosx.org/wiki/Main_Page

        It was actually the NetApp lawsuit that caused problems for Apple’s adoption of ZFS. Apple wanted indemnification from Sun because of the lawsuit, Sun’s CEO did not sign the agreement before Oracle’s acquisition of Sun happened and Oracle had no interest in granting that, so the official Apple port was cancelled.

        I heard this second hand years later from people who were insiders at Sun.

      • throw0101a 5 days ago

        > ZFS on OS X was killed because of Oracle licensing drama.

        It was killed because Apple and Sun couldn't agree on a 'support contract'. From Jeff Bonwick, one of the co-creators of ZFS:

        >> Apple can currently just take the ZFS CDDL code and incorporate it (like they did with DTrace), but it may be that they wanted a "private license" from Sun (with appropriate technical support and indemnification), and the two entities couldn't come to mutually agreeable terms.

        > I cannot disclose details, but that is the essence of it.

        * https://archive.is/http://mail.opensolaris.org/pipermail/zfs...

        Apple took DTrace, licensed via the CDDL just like ZFS, and put it into the kernel without issue. Of course a file system is much more central to an operating system, so they wanted much more of a CYA for that.

    • nabla9 5 days ago

      The license is not a real issue. It just has to be distributed as a separate module. No big hurdle.

      • crest 5 days ago

        The main hurdle is hostile Linux kernel developers who aren't held accountable intentionally breaking ZFS for their own petty ideological reasons, e.g. removing the in-kernel FPU/SIMD register save/restore API and replacing it with a "new" API that does the same.

        What's "new" about the "new" API? Its symbols are GPL2-only, to deny its use to non-GPL2 modules (like ZFS). Guess that's an easy way to make sure that BTRFS is faster than ZFS, or to set yourself up as the (to be) injured party.

        Of course a reimplementation of the old API in terms of the new one is an evil "GPL condom" violating the kernel license, right? Why can't you see ZFS's CDDL license is the real problem here for being the wrong flavour of copyleft license. Way to claim the moral high ground, you short-sighted, bigoted pricks. sigh

      • Jnr 5 days ago

        From my point of view it is a real usability issue.

        zfs modules are not in the official repos. You either have to compile it on each machine or use unofficial repos, which is not exactly ideal and can break things if those repos are not up to date. And I guess it also needs some additional steps for secureboot setup on some distros?

        I really want to try zfs because btrfs has some issues with RAID5 and RAID6 (it is not recommended, so I don't use it), but I am not sure I want to risk overall system stability. I would not want to end up in a situation where my machines don't boot and I have to fix them manually.
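
        (For what it's worth, on Debian the module lives in the contrib repo as a DKMS package, so the setup is roughly the sketch below; exact package names vary by distro.)

            apt install linux-headers-amd64 zfs-dkms zfsutils-linux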

      • GuB-42 5 days ago

        It is a problem because most of the internal kernel APIs are GPL-only, which limits the abilities of the ZFS module. It is a common source of argument between the Linux guys and the ZFS on Linux guys.

        The reason for this is not just to piss off non-GPL module developers. GPL-only internal APIs are subject to change without notice, even more so than the rest of the kernel. And because the licence may not allow the Linux kernel developers to make the necessary changes to the module when it happens, there is a good chance it breaks without warning.

        And even with that, all internal APIs may change; it is just a bit less likely than for the GPL-only ones. And because ZFS on Linux is a separate module, there is no guarantee that it won't break with successive Linux versions; in fact, it is more or less a guarantee that it will break.

        Linux is proudly monolithic, and as a constantly evolving monolithic kernel, its developers need to have control over the entire project. It is also community-driven. Combined, you need rules to have the community work together, or everything will break down, and that's what the GPL is for.

      • nijave 4 days ago

        I remember it being a pain in the ass on Fedora, which tracks mainline closely. Frequently a new kernel version would come out that the zfs module didn't support, so you'd have to downgrade and hold back the package until support was added.

        Fedora packages zfs-fuse. I think some distros have arrangements to make sure their kernels have zfs support. It may be less of a headache on those.

        In-tree filesystems don't break that way.

    • nijave 4 days ago

      You've been able to add and remove devices at will for a long time with btrfs (only recently supported in zfs with lots of caveats)

      Btrfs also supports async/offline dedupe

      You can also layer it on top of mdadm. Iirc zfs strongly discourages using anything but direct attached physical disks.

    • BSDobelix 5 days ago

      >ZFS has license issues with Linux, preventing full integration

      No one wants that; OpenZFS is much healthier without Linux and its Foundation/politics.

      • bhaney 5 days ago

        > No one wants that

        I want that

    • LtdJorge 4 days ago

      XFS is 22 and still the best in-tree FS there is :)

  • bayindirh 5 days ago

    > How come that Windows still uses a 32 year old file system?

    Simple. Because most of the burden is taken by the (enterprise) storage hardware hosting the FS. Snapshots, block level deduplication, object storage technologies, RAID/Resiliency, size changes, you name it.

    Modern storage appliances are black magic, and you don't need much more features from NTFS. You either transparently access via NAS/SAN or store your NTFS volumes on capable disk boxes.

    In the Linux world, at the higher end, there's Lustre and GPFS. ZFS is mostly for resilient, but not performance critical needs.

    • BSDobelix 5 days ago

      >ZFS is mostly for resilient, but not performance critical needs.

      Los Alamos disagrees ;)

      https://www.lanl.gov/media/news/0321-computational-storage

      But yes, in general you are right, Cern for example uses Ceph:

      https://indico.cern.ch/event/1457076/attachments/2934445/515...

      • bayindirh 5 days ago

        I think what LANL did predates GPUDirect and the other new technologies that came after 2022, but that's a good start.

        CERN's Ceph is also for their "General IT" needs; their clusters are independent from that. Also, most of CERN's processing is distributed across Europe. We are part of that network.

        Many, if not all, of the HPC centers we talk with use Lustre as their "immediate" storage. Also, there's Weka now, a closed-source storage system supporting insane speeds and tons of protocols at the same time. It is mostly used for and by GPU clusters around the world. You casually connect terabits to such a cluster. It's all flash, and flat out fast.

    • poisonborz 5 days ago

      So private consumers should just pay for a cloud subscription if they want safer/more modern data storage for their PC? (without a NAS)

      • shrubble 5 days ago

        No, private consumers have a choice, since Linux and FreeBSD runs well on their hardware. Microsoft is too busy shoveling their crappy AI and convincing OEMs to put a second Windows button (the CoPilot button) on their keyboards.

      • bluGill 5 days ago

        Probably. There are levels of backups, and a cloud subscription SHOULD give you copies in geographical separate locations with someone to help you (who probably isn't into computers and doesn't want to learn the complex details) restore when (NOT IF!) needed.

        I have all my backups on a NAS in the next room. This covers the vast majority of use cases for backups, but if my house burns down everything is lost. I know I'm taking that risk, but really I should have better. Just paying someone to do it all in the cloud should be better for me as well and I keep thinking I should do this.

        Of course paying someone assumes they will do their job. There are always incompetent companies out there to take your money.

        • pdimitar 5 days ago

          My setup is similar to yours, but I also distribute my most important data in compressed (<5GB) encrypted backups to several free-tier cloud storage accounts. I could restore it by copying one key and running one script.

          I lost faith in most paid operators. Whoops, this thing that absolutely can happen to home users and we're supposed to protect them from now actually happened to us and we were not prepared. We're so sorry!

          Nah. Give me access to 5-15 cloud storage accounts, I'll handle it myself. Have done so for years.

      • BSDobelix 5 days ago

        If you need Windows, you can use something like restic (checksums and compression) and external drives (more than one, stored in more than one place) to make backups. Plus, maybe (but not necessarily) ReFS on your non-Windows partition, which is included in the Workstation/Enterprise editions of Windows.

        I trust my own backups much more than any subscription, not essentially from a technical point of view, but from an access point of view (e.g. losing access to your Google account).

        EDIT: You have to enable check-summing and/or compression for data on ReFS manually

        https://learn.microsoft.com/en-us/windows-server/storage/ref...

      • bayindirh 5 days ago

        I think Microsoft has discontinued Windows 7 backup to force people to buy OneDrive subscriptions. They also forcefully enabled the feature when they first introduced it.

        So, I think that your answer for this question is "unfortunately, yes".

        Not that I support the situation.

      • NoMoreNicksLeft 5 days ago

        Having a NAS is life-changing. Doesn't have to be some large 20-bay monstrosity, just something that will give you redundancy and has an ethernet jack.

      • j16sdiz 4 days ago

        No, if they need ZFS-like function, they just pay for NAS.

        ZFS is not in the same market with AWS S3.

  • mustache_kimono 5 days ago

    > I just don't get it how the Windows world - by far the largest PC platform per userbase - still doesn't have any answer to ZFS.

    The mainline Linux kernel doesn't either, and I think the answer is because it's hard and high risk with a return mostly measured in technical respect?

    • ffsm8 5 days ago

      Technically speaking, bcachefs has been merged into the Linux Kernel - that makes your initial assertion wrong.

      But considering it's had two drama events within 1 year of getting merged... I think we can safely confirm your conclusion of it being really hard

      • mustache_kimono 5 days ago

        > Technically speaking, bcachefs has been merged into the Linux Kernel - that makes your initial assertion wrong.

        bcachefs doesn't implement its erasure coding/RAID yet? Doesn't implement send/receive. Doesn't implement scrub/fsck. See: https://bcachefs.org/Roadmap, https://bcachefs.org/Wishlist/

        btrfs is still more of a legit competitor to ZFS these days and it isn't close to touching ZFS where it matters. If the perpetually half-finished bcachefs and btrfs are the "answer" to ZFS that seems like too little, too late to me.

  • kwanbix 5 days ago

    Honest question: as an end user who uses Windows and Linux and does not use ZFS, what am I missing?

    • poisonborz 5 days ago

      Way better data security and resilience against bit rot. This goes for both HDDs and SSDs. Copy-on-write, snapshots, end-to-end integrity. Pools also make it easier to extend storage and to survive drive failure (and SSDs corrupt in a sneakier way).

      • wil421 5 days ago

        How many of us are using single disks on our laptops? I have a NAS and use all of the above but that doesn’t help people with single drive systems. Or help me understand why I would want it on my laptop.

      • jeroenhd 5 days ago

        The data security and rot resilience only goes for systems with ECC memory. Correct data with a faulty checksum will be treated the same as incorrect data with a correct checksum.

        Windows has its own extended filesystem through Storage Spaces, with many ZFS features added as lesser used Storage Spaces options, especially when combined with ReFS.

    • johannes1234321 5 days ago

      For a while I ran OpenSolaris with ZFS as the root filesystem.

      The key feature for me, which I miss, is the snapshotting integrated into the package manager.

      ZFS allows snapshots more or less for free (due to copy on write), including cron-based snapshotting every 15 minutes. So if I made a mistake anywhere, there was a way to recover.

      And that, integrated with the update manager and boot manager, means that on an update a snapshot is created, and during boot one can switch between states. I never had a broken update, but it gave a good feeling.

      On my home server I like the raid features, and on Solaris it was nicely integrated with NFS etc., so one could easily create volumes, export them, and set restrictions (max size etc.) on them.
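
      On the ZFS side that was little more than the following (a sketch from memory; the dataset names are made up, and the OpenZFS commands match what Solaris had):

          # crontab entry: snapshot every 15 minutes (% must be escaped in cron)
          */15 * * * * zfs snapshot tank/home@auto-$(date +\%Y\%m\%d-\%H\%M)
          # share a volume over NFS with a size restriction
          zfs set sharenfs=on tank/export
          zfs set quota=100G tank/export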

      • attentive 4 days ago

        > is the snapshotting integrated into the package manager.

        Some Linux distros have that by default with btrfs. And usually it's a package install away if you're already on btrfs.

    • chillfox 5 days ago

      Much faster launching of applications/files you use regularly. The ability to always roll back updates in seconds if they cause issues, thanks to snapshots. Fast backups with snapshots + zfs send/receive to a remote machine. Compressed disks, which both let you store more on a drive and make accessing files faster. Easy encryption. The ability to mirror 2 large USB disks so you never have your data corrupted or lose it to drive failures. You can move your data or entire OS install to a new computer easily by using a live disk and just doing a send/receive to the new PC.

      (I have never used dedup, but it's there if you want I guess)
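
      Most of that boils down to a handful of one-liners (a sketch; host and dataset names are invented):

          zfs snapshot tank/home@pre-change    # checkpoint before an update
          zfs rollback tank/home@pre-change    # roll it back in seconds
          # incremental backup to a remote machine
          zfs send -i tank/data@monday tank/data@tuesday | ssh backup zfs receive backup/data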

    • hoherd 5 days ago

      Online filesystem checking and repair.

      Reading any file will tell you with 100% guarantee if it is corrupt or not.

      Snapshots that you can `cd` into, so you can compare any prior version of your FS with the live version of your FS.

      Block level compression.
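
      Concretely (a sketch; the pool name, mountpoint, and snapshot name are made up):

          zpool scrub tank            # online check and repair
          zpool status -v tank        # lists any files found corrupt
          # browse and diff an old snapshot against the live tree
          cd /tank/.zfs/snapshot/daily-2025-01-10
          diff -r /tank/.zfs/snapshot/daily-2025-01-10/project /tank/project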

      • snvzz 4 days ago

        >Reading any file will tell you with 100% guarantee if it is corrupt or not.

        Only possible if it was not corrupted in RAM before it was written to disk.

        Using ECC memory is important, irrespective of ZFS.

    • e12e 5 days ago

      Cross platform native encryption with sane fs for removable media.

      • lazide 5 days ago

        Who would that help?

        MacOS also defaults to a non-portable FS for likely similar reasons, if one was being cynical.

    • wkat4242 5 days ago

      Snapshots (note: NTFS does have this in the form of Volume Shadow Copy, but it's not as easily accessible to the end user as it is in ZFS). Copy-on-write for reliability under crashes. Block checksumming for data protection (bitrot).

  • zamadatix 5 days ago

    NTFS has been extended in various ways over the years, to the point that what you could do with an NTFS drive 32 years ago feels like a completely different filesystem from what you can do with it on current Windows.

    Honestly I really like ReFS, particularly in the context of Storage Spaces, but I don't think it's relevant to Microsoft's consumer desktop OS, where users don't have 6 drives they need to pool together. Don't get me wrong, I use ZFS because that's what I can get running on a Linux server and I'm not going to go run Windows Server just for the storage pooling... but ReFS + Storage Spaces wins my heart with the 256 MB slab approach. This means you can add and remove mixed-size drives and get the maximum space utilization for the parity settings of the pool. Meanwhile, ZFS is only now getting online adds of same-or-larger drives, 10 years later.

  • nickdothutton 5 days ago

    OS development pretty much stopped around 2000. ZFS is from 2001. I don't count a new way to organise my photos or integrate with a search engine as "OS" though.

  • doctorpangloss 5 days ago

    The same reason file deduplication is not enabled for client Windows: greed.

    For example, there are numerous new file systems people use: OneDrive, Google Drive, iCloud Storage. Do you get it?

  • badgersnake 5 days ago

    NTFS is good enough for most people, who have a laptop with one SSD in it.

    • wkat4242 5 days ago

      The benefits of ZFS don't need multiple drives to be useful. I'm running ZFS on root for years now and snapshots have saved my bacon several times. Also with block checksums you can at least detect bitrot. And COW is always useful.

      • zamadatix 5 days ago

        Windows manages volume snapshots on NTFS through VSS. I think ZFS snapshots are a bit "cleaner" of a design, and the tooling is a bit friendlier IMO, but the functionality to snapshot, rollback, and save your bacon is there regardless. Outside of the automatically enabled "System Restore" (which only uses VSS to snapshot specific system files during updates) I don't think anyone bothers to use it though.

        CoW, advanced parity, and checksumming are the big ones NTFS lacks. CoW is just inherently not how NTFS is designed and checksumming isn't there. Anything else (encryption, compression, snapshots, ACLs, large scale, virtual devices, basic parity) is done through NTFS on Windows.

uniqueuid 5 days ago

It's good to see that they were pretty conservative about the expansion.

Not only is expansion completely transparent and resumable, it also maintains redundancy throughout the process.

That said, there is one tiny caveat people should be aware of:

> After the expansion completes, old blocks remain with their old data-to-parity ratio (e.g. 5-wide RAIDZ2, has 3 data to 2 parity), but distributed among the larger set of disks. New blocks will be written with the new data-to-parity ratio (e.g. a 5-wide RAIDZ2 which has been expanded once to 6-wide, has 4 data to 2 parity).

  • chungy 5 days ago

    I'm not sure that's really a caveat; it just means old data might be in a suboptimal layout. Even with that, you still get the full benefits of raidzN, where up to N disks can completely fail and the pool will remain functional.

    • crote 5 days ago

      I think it's a huge caveat, because it makes upgrades a lot less efficient than you'd expect.

      For example, home users generally don't want to buy all of their storage up front. They want to add additional disks as the array fills up. Being able to start with a 2-disk raidz1 and later upgrade that to a 3-disk and eventually 4-disk array is amazing. It's a lot less amazing if you end up with 55% storage efficiency rather than the 66% you'd ideally get from a 2-disk to 3-disk upgrade. That's 11% of your total disk capacity wasted, without any benefit whatsoever.

      • ryao 5 days ago

        You have a couple options:

        1. Delete the snapshots and rewrite the files in place like how people do when they want to rebalance a pool.

        2. Use send/receive inside the pool.

        Either one will make the data use the new layout. They both carry the caveat that reflinks will not survive the operation, such that if you used reflinks to deduplicate storage, you will find the deduplication effect is gone afterward.

        • pdimitar 2 days ago

          Can you give sample commands on how to achieve both options that you gave?

      • bmicraft 5 days ago

        Well, when you start a raidz with 2 devices you've already done goofed. Start with a mirror or at least 3 devices.

        Also, if you don't wait to upgrade until the disks are at 100% utilization (which you should never do! you're creating massive fragmentation upwards of ~85%) efficiency in the real world will be better.

      • chungy 5 days ago

        It still seems pretty minor. If you want extreme optimization, feel free to destroy the pool and create it new, or create it with the ideal layout from the beginning.

        Old data still works fine, the same guarantees RAID-Z provides still hold. New data will be written with the new data layout.

    • stavros 5 days ago

      Is that the case? What if I expand a 3-1 array to 3-2? Won't the old blocks remain 3-1?

      • Timshel 5 days ago

        I don't believe it supports adding parity drives, only data drives.

  • wjdp 5 days ago

    The caveat is very much expected; you should expect ZFS features not to rewrite existing blocks. Changes to settings only apply to new data, for example.

  • rekoil 5 days ago

    Yeah, it's a pretty huge caveat to be honest.

        Da1 Db1 Dc1 Pa1 Pb1
        Da2 Db2 Dc2 Pa2 Pb2
        Da3 Db3 Dc3 Pa3 Pb3
        ___ ___ ___ Pa4 Pb4
    
    ___ represents free space. After expansion by one disk you would logically expect something like:

        Da1 Db1 Dc1 Da2 Pa1 Pb1
        Db2 Dc2 Da3 Db3 Pa2 Pb2
        Dc3 ___ ___ ___ Pa3 Pb3
        ___ ___ ___ ___ Pa4 Pb4
    
    But as I understand it it would actually expand to:

        Da1 Db1 Dc1 Dd1 Pa1 Pb1
        Da2 Db2 Dc2 Dd2 Pa2 Pb2
        Da3 Db3 Dc3 Dd3 Pa3 Pb3
        ___ ___ ___ ___ Pa4 Pb4
    
    Where the Dd1-3 blocks are just wasted. Meaning by adding a new disk to the array you're only expanding free storage by 25%... So say you have 8TB disks for a total of 24TB of storage free originally, and you have 4TB free before expansion, you would have 5TB free after expansion.

    Please tell me I've misunderstood this, because to me it is a pretty useless implementation if I haven't.

    • ryao 5 days ago

      ZFS RAID-Z does not have parity disks. The parity and data are interleaved to allow data reads to be done from all disks rather than just the data disks.

      The slides here explain how it works:

      https://openzfs.org/w/images/5/5e/RAIDZ_Expansion_2023.pdf

      Anyway, you are not entirely wrong. The old data will have the old parity:data ratio while new data will have the new parity:data ratio. As old data is freed from the vdev, new writes will use the new parity:data ratio. You can speed this up by doing send/receive, or by deleting all snapshots and then rewriting the files in place. This has the caveat that reflinks will not survive the operation, such that if you used reflinks to deduplicate storage, you will find the deduplication effect is gone afterward.
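
      The send/receive variant is roughly the following (a sketch; dataset names are made up, and you need enough free space for a second copy while it runs):

          zfs snapshot -r tank/data@rebalance
          zfs send -R tank/data@rebalance | zfs receive tank/data.new
          # verify the copy, then swap the datasets
          zfs destroy -r tank/data
          zfs rename tank/data.new tank/data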

      • chungy 5 days ago

        To be fair, RAID5/6 don't have parity disks either. RAID2, RAID3, and RAID4 do, but they're all effectively dead technology for good reason.

        I think it's easy for a lot of people to conceptualize RAID5/6 and RAID-Zn as having "data disks" and "parity disks" to wrap around the complicated topic of how it works, but all of them truly interleave and compute parity data across all disks, allowing any single disk to die.

        I've been of two minds on the persistent myth of "parity disks" but I usually ignore it, because it's a convenient lie to understand your data is safe, at least. It's also a little bit the same way that raidz1 and raidz2 are sometimes talked about as "RAID5" and "RAID6"; the effective benefits are the same, but the implementation is totally different.

    • magicalhippo 5 days ago

      Unless I misunderstood you, you're describing more how classical RAID would work. The RAID-Z expansion works like you note you would logically expect. You added a drive with four blocks of free space, and you end up with four blocks more of free space afterwards.

      You can see this in the presentation[1] slides[2].

      The reason this is sub-optimal post-expansion is because, in your example, the old maximal stripe width is lower than the post-expansion maximal stripe width.

      Your example is a bit unfortunate in terms of allocated blocks vs layout, but if we tweak it slightly, then

          Da1 Db1 Dc1 Pa1 Pb1
          Da2 Db2 Dc2 Pa2 Pb2
          Da3 Db3 Pa3 Pb3 ___
      
      would after RAID-Z expansion would become

          Da1 Db1 Dc1 Pa1 Pb1 Da2
          Db2 Dc2 Pa2 Pb2 Da3 Db3 
          Pa3 Pb3 ___ ___ ___ ___
      
      Ie you added a disk with 3 new blocks, and so total free space after is 1+3 = 4 blocks.

      However if the same data was written in the post-expanded vdev configuration, it would have become

          Da1 Db1 Dc1 Dd1 Pa1 Pb1
          Da2 Db2 Dc2 Dd2 Pa2 Pb2
          ___ ___ ___ ___ ___ ___
      
      Ie, you'd have 6 free blocks not just 4 blocks.

      Of course this doesn't count for writes which end up taking less than the maximal stripe width.

      [1]: https://www.youtube.com/watch?v=tqyNHyq0LYM

      [2]: https://openzfs.org/w/images/5/5e/RAIDZ_Expansion_2023.pdf

      • ryao 5 days ago

        Your diagrams have some flaws too. ZFS has a variable stripe size. Let’s say you have a 10 disk raid-z2 vdev that is ashift=12 for 4K columns. If you have a 4K file, 1 data block and 2 parity blocks will be written. Even if you expand the raid-z vdev, there is no savings to be had from the new data:parity ratio. Now, let’s assume that you have a 72K file. Here, you have 18 data blocks and 6 parity blocks. You would benefit from rewriting this to use the new data:parity ratio. In this case, you would only need 4 parity blocks. ZFS does not rewrite it as part of the expansion, however.

        There are already good diagrams in your links, so I will refrain from drawing my own with ASCII. Also, ZFS will vary which columns get parity, which is why the slides you linked have the parity at pseudo-random locations. It was not a quirk of the slide’s author. The data is really laid out that way.

cgeier 5 days ago

This is huge news for ZFS users (probably mostly those in the hobbyist/home use space, but still). raidz expansion has been one of the most requested features for years.

  • jfreax 5 days ago

    I'm not yet familiar with zfs and couldn't find it in the release notes: does expansion only work with disks of the same size? Or is adding bigger/smaller disks possible, or do all disks need to be the same size?

    • ryao 5 days ago

      You can use different sized disks, but RAID-Z will truncate the space it uses to the lowest common denominator. If you increase the lowest common denominator, RAID-Z should auto-expand to use the additional space. All parity RAID technologies truncate members to the lowest common denominator, rather than just ZFS.

      • wrboyce 3 days ago

        Is it definitely the LCD? Given drive of size 15 and 20 the LCD would be 1, no? I had assumed it would just use the size of the smallest drive on every drive (so 15+20->15+15=30). When I first read your comment I was thinking of GCF but even that would be fairly inefficient (GCF(15,20) = 5, so 15+20->5+5=10).

      • GauntletWizard 4 days ago

        That's not entirely true. Unraid has mechanisms for unbalanced disks, but they come at a high cost in terms of usability by standard workloads.

    • shiroiushi 5 days ago

      As far as I understand, ZFS doesn't work at all with disks of differing sizes (in the same array). So if you try it, it just finds the size of the smallest disk, and uses that for all disks. So if you put an 8TB drive in an array with a bunch of 10TB drives, they'll all be treated as 8TB drives, and the extra 2TB will be ignored on those disks.

      However, if you replace the smallest disk with a new, larger drive, and resilver, then it'll now use the new smallest disk as the baseline, and use that extra space on the other drives.

      (Someone please correct me if I'm wrong.)

      • mustache_kimono 5 days ago

        > As far as I understand, ZFS doesn't work at all with disks of differing sizes (in the same array).

        This might be misleading, however, it may only be my understanding of word "array".

        You can use 2x10TB mirrors as vdev0, and 6x12TB in RAIDZ2 as vdev1 in the same pool/array. You can also stack as many unevenly sized disks as you want in a pool. The actual problem is when you want a different drive topology within a pool or vdev, or you want to mismatch, say, 3 oddly sized drives to create some synthetic redundancy level (2x4TB and 1x8TB to achieve two copies on two disks) like btrfs does/tries to do.

      • tw04 5 days ago

        This is the case with any parity-based RAID; they just hide it or lie to you in various ways. If you have two 6TB drives and two 12TB drives in a single RAID-6 array, it is physically impossible to have two-drive parity once you exceed 12TB of written capacity. BTRFS and bcachefs can't magically create more space where none exists on your 6TB drives. They resort to dropping to mirror protection for the excess capacity, which you could also do manually with ZFS by giving it partitions instead of the whole drive.

    • chasil 5 days ago

      IIRC, you could always replace drives in a raidset with larger devices. When the last drive is replaced, then the new space is recognized.

      This new operation seems somewhat more sophisticated.

    • zelcon 5 days ago

      You need to buy the same exact drive with the same capacity and speed. Your raidz vdev will be as small and as slow as your smallest and slowest drive.

      btrfs and the new bcachefs can do RAID with mixed drives, but I can’t trust either of them with my data yet.

      • hda111 5 days ago

        It doesn't have to be the same exact drive. Mixing drives from different manufacturers (with the same nominal capacity) is often used to prevent correlated failures. ZFS does not use the whole disk, so different disks can be mixed, because disks of the same nominal size often have slightly varying capacities.

      • tw04 5 days ago

        You can run raid-z across partitions to utilize the full drive just like synology does with their “hybrid raid” - you just shouldn’t.

      • Mashimo 5 days ago

        > You need to buy the same exact drive

        AFAIK you can add larger and faster drives, you will just not get any benefits from it.

        • bpye 5 days ago

          You can get read speed benefits with faster drives, but your writes will be limited by your slowest.

      • unixhero 5 days ago

        Just have backups. I have used btrfs and zfs for different purposes. Never had any lost data or downtime with btrfs since 2016. I only use raid 0 and raid 1, and compression. Btrfs does not have a hungry RAM requirement.

wkat4242 5 days ago

Note: This is online expansion. Expansion was always possible, but you did need to take the array down to do it. You could also move to bigger drives, but you had to do that one at a time (and you only gained the new capacity once all drives were upgraded, of course).

As far as I know shrinking a pool is still not possible though. So if you have a pool with 5 drives and add a 6th, you can't go back to 5 drives even if there is very little data in it.

FrostKiwi 5 days ago

FINALLY!

You can do borderline insane single-vdev setups like RAID-Z3 with 4 disks (3 disks' worth of redundancy) of the most expensive, highest-density hard drives money can buy right now, for an initial effective space usage of 25%, and then keep buying and expanding disk by disk as the space demand grows, up to something like 12-ish disks. Disk prices drop as time goes on, and the failure chance is spread out because disks are added at different times.

  • uniqueuid 5 days ago

    Yes but see my sibling comment.

    When you expand your array, your existing data will not be stored any more efficiently.

    To get the new parity/data ratios, you would have to force copies of the data and delete the old, inefficient versions, e.g. with something like this [1]

    My personal take is that it's a much better idea to buy individual complete raid-z configurations and add new ones / replace old ones (disk by disk!) as you go.

    [1] https://github.com/markusressel/zfs-inplace-rebalancing

shepherdjerred 5 days ago

How does ZFS compare to btrfs? I'm currently using btrfs for my home server, but I've had some strange troubles with it. I'm thinking about switching to ZFS, but I don't want to end up in the same situation.

  • ryao 5 days ago

    I first tried btrfs 15 years ago with Linux 2.6.33-rc4 if I recall. It developed an unlinkable file within 3 days, so I stopped using it. Later, I found ZFS. It had a few less significant problems, but I was a CS student at the time and I thought I could fix them since they seemed minor in comparison to the issue I had with btrfs, so over the next 18 months, I solved all of the problems that it had that bothered me and sent the patches to be included in the then ZFSOnLinux repository. My effort helped make it production ready on Linux. I have used ZFS ever since and it has worked well for me.

    If btrfs had been in better shape, I would have been a btrfs contributor. Unfortunately for btrfs, not only was it in bad shape back then, but other btrfs issues continued to bite me every time I tried it over the years for anything serious (e.g. frequent ENOSPC errors when there is still space). ZFS on the other hand just works. Myself and many others did a great deal of work to ensure it works well.

    The main reason for the difference is that ZFS had a very solid foundation, which was achieved by having some fantastic regression testing facilities. It has a userland version that randomly exercises the code to find bugs before they occur in production and a test suite that is run on every proposed change to help shake out bugs.

    ZFS also has more people reviewing proposed changes than other filesystems. The Btrfs developers will often state that there is a significant manpower difference between the two file systems. I vaguely recall them claiming the difference was a factor of 6.

    Anyway, few people who use ZFS regret it, so I think you will find you like it too.

  • parshimers 5 days ago

    btrfs has similar aims to ZFS, but is far less mature. i used it for my root partitions due to it not needing DKMS, but had many troubles. i used it in a fairly simple way, just a mirror. one day, one of the drives in the array started to have issues - and btrfs fell on its face. it remounted everything read-only if i remember correctly, and would not run in degraded mode by default. even mdraid, without checksumming and so forth, would do better than this. ZFS likewise says that the array is faulted, but of course allows it to be used. the fact that the default behavior was not RAID, because it's literally missing the R part of actually reading the data back, made me lose any faith in it. i moved to ZFS and haven't had issues since. there is much more of a community and lots of good tooling around it.

  • nnadams 4 days ago

    I used Btrfs for a few years but switched away a couple years ago. I also had one or two incidents with Btrfs where some weirdness happened, but I was able to recover everything in the end. Overall I liked the flexibility of Btrfs, but mostly I found it too slow.

    I use ZFS on Arch Linux and overall have had no problems with it so far. There's more customization and methods to optimize performance. My one suggestion is to do a lot of research and testing with ZFS. There is a bit of a learning curve, but it's been worth the switch for me.

jakedata 5 days ago

Happy to see the ARC bypass for NVMe performance. ZFS really fails to exploit NVMe's potential. Online expansion might be interesting. I tried to use ZFS for some very busy databases and ended up getting bitten badly by the fragmentation bug. The only way to restore performance appears to be copying the data off the volume, nuking it and then copying it back. Now -perhaps- if I expand the zpool then I might be able to reduce fragmentation by copying the tablespace on the same volume.

endorphine 5 days ago

Can someone describe why they would use ZFS (or similar) for home usage?

  • mrighele 5 days ago

    Good reasons for me:

    Checksums: this is even more important in home usage, as the hardware is usually of lower quality. Faulty controllers, crappy cables, hard disks stored at higher-than-advised temperatures... there are many reasons for bogus data to be saved, and zfs handles that well and automatically (if you have redundancy)

    Snapshots: very useful to make backups and quickly go back to an older version of a file when mistakes are made

    Ease of mind: compared to the alternatives, I find that zfs is easier to use and makes it harder to make a mistake that could bring data loss (e.g. remove the wrong drive by mistake when replacing a faulty one, the pool becomes unusable, "oops!", put the disk back, the pool goes back to work as if nothing happened). Maybe it is different now with mdadm, but when I used it years ago I was always worried about making a destructive mistake.

    • EvanAnderson 5 days ago

      > Snapshots: very useful to make backups and quickly go back to an older version of a file when mistakes are made

      Piling on here: Sending snapshots to remote machines (or removable drives) is very easy. That makes snapshots viable as a backup mechanism (because they can exist off-site and offline).

  • ryao 5 days ago

    To give an answer that nobody else has given, ZFS is great for storing Steam games. Set recordsize=1M and compression=zstd and you can often store about 33% more games in the same space.

    A friend uses ZFS to store his Steam games on a couple of hard drives. He gave ZFS a SSD to use as L2ARC. ZFS automatically caches the games he likes to run on the SSD so that they load quickly. If he changes which games he likes to run, ZFS will automatically adapt to cache those on the SSD instead.
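
    That setup is only a couple of commands (a sketch; the pool name and device path are invented):

        zfs create -o recordsize=1M -o compression=zstd tank/steam
        zpool add tank cache /dev/disk/by-id/some-ssd   # the SSD becomes L2ARC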

    • chillfox 5 days ago

      The compression and ARC will make games load much faster than they would on NTFS, even without having a separate drive for the ARC.

    • bmicraft 4 days ago

      As I understand, L2ARC doesn't work across reboots which unfortunately makes it almost useless for systems that get rebooted regularly, like desktops.

      • olavgg 4 days ago

        L2ARC has had persistence support for a few years now.

        • bmicraft 4 days ago

          Wow, thanks for pointing that out; apparently it's been around for four years, since the first 2.0 release, without me noticing.

  • chromakode 5 days ago

    I replicate my entire filesystem to a local NAS every 10 minutes using zrepl. This has already saved my bacon once when a WD_BLACK SN850 suddenly died on me [1]. It's also recovered code from some classic git blunders. It shouldn't be possible any more to lose data to user error or single device failure. We have the technology.

    [1]: https://chromakode.com/post/zfs-recovery-with-zrepl/

  • vedranm 5 days ago

    Several reasons, but major ones (for me) are reliability (checksums and self-healing) and portability (no other modern filesystem can be read and written on Linux, FreeBSD, Windows, and macOS).

    Snapshots ("boot environments") are also supported by Btrfs (my Linux installations use that so I don't have to worry about having the 3rd party kernel module to read my rootfs). Performance isn't that great either and, assuming Linux, XFS is a better choice if that is your main concern.

  • Mashimo 5 days ago

    It's relatively easy, and yet powerful. Before this I had MDADM + LVM + dm-crypt + ext4, which also worked, but all the layers gave me a headache.

    Automated snapshots are super easy and fast. They're also easy to access: if you deleted a file, you don't have to restore the whole snapshot, you can just cp it from the hidden .zfs/ folder.

    I've been running it on 6x 8TB disks for a couple of years now, in a raidz2, which means up to 2 disks can die. Would I use it on a single disk on a desktop? Probably not.

    • redundantly 5 days ago

      > Would I use it on a single disk on a Desktop? Probably not.

      I do. Snapshots and replication and checksumming are awesome.

  • PaulKeeble 5 days ago

    I have a home-built NAS that uses ZFS for the storage array, and the checksumming has been really quite useful in detecting and correcting bit rot. In the past I used MDADM with EXT over the top, and that worked, but it didn't defend against bit rot. I have considered BTRFS, since it would get me the same checksumming without the rest of ZFS, but it's not considered reliable for systems with parity yet (although I think it is likely more than reliable enough now).

    I do occasionally use snapshots and the compression feature is handy on quite a lot of my data set but I don't use the user and group limitations or remote send and receive etc. ZFS does a lot more than I need but it also works really well and I wouldn't move away from a checksumming filesystem now.

  • lutorm 5 days ago

    Apart from just peace of mind about bitrot, I use it for the snapshotting capability, which makes it super easy to do backups. You can snapshot and send the snapshots to other storage with e.g. zfs-autobackup, and it's trivial and you can't screw it up. If the snapshots exist on the other drive, you know you have a backup.

  • mshroyer 5 days ago

    I use it on a NAS for:

    - Confidence in my long-term storage of some data I care about, as zpool scrub protects against bit rot

    - Cheap snapshots that provide both easy checkpoints for work saved to my network share, and resilience against ransomware attacks against my other computers' backups to my NAS

    - Easy and efficient (zfs send) replication to external hard drives for storage pool backup

    - Built-in and ergonomic encryption

    And it's really pretty easy to use. I started with FreeNAS (now TrueNAS), but eventually switched to just running FreeBSD + ZFS + Samba on my file server because it's not that complicated.

  • klauserc 5 days ago

    I use it on my work laptop. Reasons:

    - a single solution that covers the entire storage domain (I don't have to learn multiple layers, like logical volume manager vs. ext4 vs. physical partitions)

    - cheap/free snapshots. I have been glad to have been able to revert individual files or entire file systems to an earlier state. E.g., create a snapshot before doing a major distro update.

    - easy to configure/well documented

    Like others have said, at this point I would need a good reason, NOT to use ZFS on a system.

  • NamTaf 5 days ago

    I used it on my home NAS (4x3TB drives, holding all of my family's backups, etc.) for the data security / checksumming features. IMO it's performant, robust and well-designed in ways that give me reassurance regarding data integrity and help prevent me shooting myself in the foot.

  • tbrownaw 5 days ago

    > describe why they would use ZFS (or similar) for home usage

    Mostly because it's there, but also the snapshots have a `diff` feature that's occasionally useful.

  • nesarkvechnep 5 days ago

    I'm trying to find a reason not to use ZFS at home.

    • dizhn 5 days ago

      Requirement for enterprise quality disks, huge RAM (1 gig per TB), ECC, at least x5 disks of redundancy. None of these are things, but people will try to educate you anyway. So use it but keep it to yourself. :)

      • craftkiller 5 days ago

        No need to keep it to yourself. As you've mentioned, all of these requirements are misinformation so you can ignore people who repeat them (or even better, tell them to stop spreading misinformation).

        For those not in the know:

        You don't need to use enterprise quality disks. There is nothing in the ZFS design that requires enterprise quality disks any more than any other file system. In fact, ZFS has saved my data through multiple consumer-grade HDD failures over the years thanks to raidz.

        The 1 gig per TB figure is ONLY for when using the ZFS dedup feature, and dedup is widely regarded as a bad idea except in VERY specific use cases. 99.9% of ZFS users should not and will not use dedup, and therefore they do not need ridiculous piles of RAM.

        There is nothing in the design of ZFS that makes it any more dangerous to run without ECC than any other filesystem. ECC is a good idea regardless of filesystem, but it's certainly not a requirement.

        And you don't need x5 disks of redundancy. It runs great and has benefits even on single-disk systems like laptops. Naturally, having parity drives is better in case a drive fails but on single disk systems you still benefit from the checksumming, snapshotting, boot environments, transparent compression, incremental zfs send/recv, and cross-platform native encryption.

      • tpetry 5 days ago

        The interesting part about the enterprise-quality-disk misinformation is just how wrong it is. The core idea of ZFS was to detect issues when drives or their drivers are faulty. And that happened more with cheap non-enterprise disks at the time.

  • zbentley 5 days ago

    I use ZFS for boot and storage volumes on my main workstation, which is primarily that--a workstation, not a server or NAS. Some benefits:

    - Excellent filesystem level backup facility. I can transfer snapshots to a spare drive, or send/receive to a remote (at present a spare computer, but rsync.net looks better every year I have to fix up the spare).

    - Unlike other fs-level backup solutions, the flexibility of zvols means I can easily expand or shrink the scope of what's backed up.

    - It's incredibly easy to test (and restore) backups. Pointing my to-be-backed-up volume, or my backup volume, to a previous backup snapshot is instant, and provides a complete view of the filesystem at that point in time. No "which files do you want to restore" hassles or any of that, and then I can re-point back to latest and keep stacking backups. Only Time Machine has even approached that level of simplicity in my experience, and I have tried a lot of backup tools. In general, backup tools/workflows that uphold "the test process is the restoration process, so we made the restoration process as easy and reversible as possible" are the best ones.

    - Dedup occasionally comes in useful (if e.g. I'm messing around with copies of really large AI training datasets or many terabytes of media file organization work). It's RAM-expensive, yes, but what's often not mentioned is that you can turn it on and off for a volume--if you rewrite data. So if I'm looking ahead to a week of high-volume file wrangling, I can turn dedup on where I need it, start a snapshot-and-immediately-restore of my data (or if it's not that many files, just cp them back and forth), and by the next day or so it'll be ready. Turning it off when I'm done is even simpler. I imagine that the copy cost and unpredictable memory usage mean that this kind of "toggled" approach to dedup isn't that useful for folks driving servers with ZFS, but it's outstanding on a workstation.

    - Using ZFSBootMenu outside of my OS means I can be extremely cavalier with my boot volume. Not sure if an experimental kernel upgrade is going to wreck my graphics driver? Take a snapshot and try it! Not sure if a curl | bash invocation from the internet is going to rm -rf /? Take a snapshot and try it! If my boot volume gets ruined, I can roll it back to a snapshot in the bootloader from outside of the OS. For extra paranoia I have a ZFSBootMenu EFI partition on a USB drive if I ever wreck the bootloader as well, but the odds are that if I ever break the system that bad the boot volume is damaged at the block level and can't restore local snapshots. In that case, I'd plug in the USB drive and restore a snapshot from the adjacent data volume, or my backup volume ... all without installing an OS or leaving the bootloader. The benefits of this to mental health are huge; I can tend towards a more "college me" approach to trying random shit from StackOverflow for tweaking my system without having to worry about "adult professional me" being concerned that I don't know what running some random garbage will do to my system. Being able to experiment first, and then learn what's really going on once I find what works, is very relieving and makes tinkering a much less fraught endeavor.

    - Being able to per-dataset enable/disable ARC and ZIL means that I can selectively make some actions really fast. My Steam games, for example, are in a high-ARC-bias dataset that starts prewarming (with throttled IO) in the background on boot. Game load times are extremely fast--sometimes at better than single-ext4-SSD levels--and I'm storing all my game installs on spinning rust for $35 (4x 500GB + 2x 32GB cheap SSD for cache)!

    • E39M5S62 4 days ago

      It's great to hear that you're using ZFSBootMenu the way I envisioned it! There's such a sense of relief and freedom having snapshots of your whole OS taken every 15 minutes.

      One thing that you might not be aware of is that you can create a zpool checkpoint before doing something 'dangerous' (disk swap, pool version upgrade, etc) and if it goes badly, roll back to that checkpoint in ZFSBootMenu on the Pool tab. Keep in mind though that you can only have one checkpoint at a time, they keep growing and growing, and a rollback is for EVERYTHING on the pool.
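
      For anyone following along, the checkpoint workflow is (pool name assumed to be "tank"):

          zpool checkpoint tank        # take the single allowed checkpoint
          zpool checkpoint -d tank     # discard it once the change proves fine
          # rolling back outside ZFSBootMenu means re-importing with a rewind
          zpool export tank
          zpool import --rewind-to-checkpoint tank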

      • zbentley 4 days ago

        Oh, are you zdykstra? If so, thanks for creating an invaluable tool!

        > you can create a zpool checkpoint before doing something 'dangerous' (disk swap, pool version upgrade, etc) and if it goes badly, roll back to that checkpoint in ZFSBootMenu on the Pool tab

        Good to know! Snapshots meet most of my needs at present (since my boot volume is a single fast drive, snapshots ~~ checkpoints in this case), but I could see this coming in useful for future scenarios where I need to do complex or risky things with data volumes or SAN layout changes.