GuB-42 5 days ago

To be honest, the situation with Linux is barely better.

ZFS has license issues with Linux, preventing full integration, and Btrfs is 15 years in the making and still doesn't match ZFS in features and stability.

Most Linux distros still use ext4 by default, which is 19 years old, but ext4 is little more than a series of extensions on top of ext2, which is the same age as NTFS.

In all fairness, there are few OS components that are as critical as the filesystem, and many wouldn't touch filesystems that have less than a decade of proven track record in production.

mogoh 5 days ago

ZFS might be better than any other FS on Linux (I won't judge that).

But you must admit that the situation on Linux is quite a bit better than on Windows. Linux has so many filesystems in the main branch, and there is a lot of development. Btrfs had a rocky start, but it got better.

stephen_g 5 days ago

I’m interested to know what ‘full integration’ would look like. I use ZFS in Proxmox (Debian-based) and it’s really great and super solid, but I haven’t used ZFS in more vanilla Linux distros. Does Proxmox have things that regular Linux is missing out on, or are there shortcomings about Proxmox that I just don’t realise?

  • whataguy 5 days ago

    The difference is that the ZFS kernel module is included by default with Proxmox, whereas with e.g. Debian, you would need to install it manually.
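
    For what it's worth, the manual install on Debian is only a couple of commands; a sketch from memory, assuming the contrib component is enabled in your apt sources:

      apt update
      apt install linux-headers-amd64 zfs-dkms zfsutils-linux
      modprobe zfs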

    • pimeys 5 days ago

      And you can't follow the latest kernel until the ZFS module supports it.

      • ryao 4 days ago

        There is a trick for this:

          * Step 1: Make friends with a ZFS developer.
          * Step 2: Guilt him into writing patches to add support as soon as a new kernel is released.
          * Step 3: Enjoy
        
        Adding support for a new kernel release to ZFS is usually only a few hours of work. I have done it in the past more than a dozen times.
      • gf000 4 days ago

        I use NixOS, and it simply updates to the latest kernel that supports ZFS, with a single declarative option.
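
        Something like this in configuration.nix (option name from memory, it may differ between nixpkgs releases):

          # track the newest kernel the ZFS module builds against
          boot.kernelPackages = config.boot.zfs.package.latestCompatibleLinuxPackages;
          boot.supportedFilesystems = [ "zfs" ];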

      • blibble 5 days ago

        for Debian that's not exactly a problem

        • oarsinsync 5 days ago

          Unless you’re using Debian backports, and they backport a new kernel a week before the zfs backport package update happens.

          Happened to me more than once. The second time, I ended up manually changing the kernel version limit just to get back online, but I don’t recall whether that ended up hurting me in the long run.
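
          For anyone in the same spot, the limit I edited was, if I remember right, the Linux-Maximum line in the dkms module source, after which a rebuild picks up the new kernel (paths from memory, adjust to your setup):

            sed -i 's/^Linux-Maximum:.*/Linux-Maximum: 6.12/' /usr/src/zfs-*/META
            dkms autoinstall   # rebuild the module against the running kernel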

  • BodyCulture 4 days ago

    You probably don’t realise how important encryption is.

    It’s still not supported by Proxmox. Yes, you can set it up yourself somehow, but then you are on your own, you miss out on features, and people report problems with double or triple filesystem layers.

    I do not understand how they do not have encryption out of the box; this seems like a problem.

    • kevinmgranger 4 days ago

      I'm not sure about proxmox, but ZFS on Linux does have encryption.
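
      A minimal sketch of a natively encrypted dataset, using standard OpenZFS properties (pool/dataset names are just examples):

        zfs create -o encryption=on -o keyformat=passphrase \
            -o keylocation=prompt rpool/secure
        zfs get encryption,keystatus rpool/secure   # verify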

lousken 5 days ago

as far as stability goes, btrfs is used by Meta, Synology and many others, so I wouldn't say it's not stable, but some features are lacking

  • azalemeth 5 days ago

    My understanding is that single-disk btrfs is good, but raid is decidedly dodgy; https://btrfs.readthedocs.io/en/latest/btrfs-man5.html#raid5... states that:

    > The RAID56 feature provides striping and parity over several devices, same as the traditional RAID5/6.

    > There are some implementation and design deficiencies that make it unreliable for some corner cases and *the feature should not be used in production, only for evaluation or testing*.

    > The power failure safety for metadata with RAID56 is not 100%.

    I have personally been bitten once (about 10 years ago) by btrfs just failing horribly on a single desktop drive. I've used either mdadm + ext4 (for /) or ZFS (for large /data mounts) ever since. ZFS is fantastic and I genuinely don't understand why it's not used more widely.

    • crest 5 days ago

      One problem with your setup is that ZFS by design can't use a traditional *nix filesystem buffer cache. Instead it has to use its own ARC (adaptive replacement cache) with end-to-end checksumming, transparent compression, and copy-on-write semantics. This can lead to annoying performance problems when the two types of filesystem caches compete for available memory. There is a back-pressure mechanism, but it effectively pauses other writes while evicting dirty cache entries to release memory.
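
      (The usual mitigation, for what it's worth, is to cap the ARC via the zfs_arc_max module parameter; the 8 GiB below is an arbitrary example:)

        # persistent: applied at module load
        echo "options zfs zfs_arc_max=8589934592" > /etc/modprobe.d/zfs.conf
        # or at runtime
        echo 8589934592 > /sys/module/zfs/parameters/zfs_arc_max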

      • ryao 5 days ago

        Traditionally, you have the page cache on top of the FS and the buffer cache below the FS, with the two being unified such that double caching is avoided in traditional UNIX filesystems.

        ZFS goes out of its way to avoid the buffer cache, although Linux does not give it the option to fully opt out of it since the block layer will buffer reads done by userland to disks underneath ZFS. That is why ZFS began to purge the buffer cache on every flush 11 years ago:

        https://github.com/openzfs/zfs/commit/cecb7487fc8eea3508c3b6...

        That is how it still works today:

        https://github.com/openzfs/zfs/blob/fe44c5ae27993a8ff53f4cef...

        If I recall correctly, the page cache is also still above ZFS when mmap() is used. There was talk about fixing it by having mmap() work out of ARC instead, but I don’t believe it was ever done, so there is technically double caching done there.

      • [removed] 5 days ago
        [deleted]
    • lousken 5 days ago

      I was assuming OP wanted to highlight filesystem use on a workstation/desktop, not on a file server/NAS. I had a similar experience a decade ago, but these days single drives just work, same with mirroring. For such setups btrfs should be stable. I've never seen a workstation with a raid5/6 setup. Secondly, filesystems and volume managers are different things, even if e.g. btrfs and ZFS are essentially both.

      For a NAS setup I would still prefer ZFS with TrueNAS Scale (or Proxmox if virtualization is needed), just because all these scenarios are supported as well. And as far as ZFS goes, encryption is still something I am not sure about, especially since I want to use snapshots and send them as backups to a remote machine.
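
      (On that last point, my understanding is that raw sends let you replicate an encrypted dataset without ever loading the key on the target; a sketch with example names:)

        zfs send -w tank/enc@snap | ssh nas zfs recv backup/enc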

    • hooli_gan 5 days ago

      RAID5/6 is not needed with btrfs. One should use RAID1, which replicates the same data across multiple drives in a redundant way.

      • johnmaguire 5 days ago

        How can you achieve 2-disk fault tolerance using btrfs and RAID 1?

    • brian_cunnie 5 days ago

      > I have personally been bitten once (about 10 years ago) by btrfs just failing horribly on a single desktop drive.

      Me, too. The drive was unrecoverable. I had to reinstall from scratch.

  • jeltz 4 days ago

    It is possible to corrupt the file system from user space as a normal user with Btrfs. The PostgreSQL devs found this while working on async I/O. And as far as I know, that issue has not been fixed.

    https://www.postgresql.org/message-id/CA%2BhUKGL-sZrfwcdme8j...

  • _joel 5 days ago

    I'm like some other people here, I guess: once you've been bitten by data loss due to btrfs, it's difficult to advocate for it.

    • lousken 5 days ago

      I assume almost everybody at some point experienced data loss because they pulled out a flash drive too early. Is it safe to assume that we stopped using flash drives because of it?

      • _joel 5 days ago

        I'm not sure we have stopped using flash, judging by the pile of USB sticks on my desk :) In relation to the fs analogy: if you used a flash drive that you knew corrupted your data, you'd throw it away for one you know works.

        • ryao 5 days ago

          I once purchased a bunch of flash drives from Google’s online swag store, and just unplugging them was often enough to put them in a state where they claimed to be 8MB devices and nothing I wrote to them could ever be read back, in my limited tests. I stopped using those fast.

  • fourfour3 5 days ago

    Do Synology actually use the multi-device options of btrfs, or are they using linux softraid + lvm underneath?

    I know Synology Hybrid RAID is a clever use of LVM + MD raid, for example.

    • phs2501 4 days ago

      I believe Synology runs btrfs on top of regular mdraid + lvm, possibly with patches to let btrfs checksum failures reach into the underlying layers to find the right data to recover.

      Related blog post: https://daltondur.st/syno_btrfs_1/

cesarb 5 days ago

> Btrfs [...] still doesn't match ZFS in features [...]

Isn't the feature in question (array expansion) precisely one which btrfs already had for a long time? Does ZFS have the opposite feature (shrinking the array), which AFAIK btrfs also already had for a long time?

(And there's one feature which is important to many, "being in the upstream Linux kernel", that ZFS most likely will never have.)
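
For reference, both directions are single commands on a mounted btrfs filesystem (standard btrfs-progs; device and mount names are examples):

  # grow: add a device, then spread existing data across members
  btrfs device add /dev/sdc /mnt && btrfs balance start /mnt
  # shrink: extents are migrated off the device before removal
  btrfs device remove /dev/sdb /mnt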

  • wkat4242 5 days ago

    ZFS also had expansion for a long time, but it was offline expansion. I don't know if btrfs has had online expansion for a long time too?

    And shrinking no, that is a big missing feature in ZFS IMO. Understandable considering its heritage (large scale datacenters) but nevertheless an issue for home use.

    But raidz is rock-solid. Btrfs' raid is not.

    • unsnap_biceps 5 days ago

      Raidz wasn't able to be expanded in place before this. You were able to add to a pool that included a raidz vdev, but that raidz vdev was immutable.

      • wkat4242 5 days ago

        Oh OK, I've never done this, but I thought it was already there. Maybe that was in the original ZFS from Sun? But maybe I just remember it incorrectly, sorry.

        I've used it on multi-drive arrays but I never had the need for expansion.

        • ryao 5 days ago

          You could add top level raidz vdevs or replace the members of a raid-z vdev with larger disks to increase storage space back then. You still have those options now.
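
          Sketches of both, with example pool/device names:

            # option 1: add another top-level raidz vdev
            zpool add tank raidz1 sde sdf sdg
            # option 2: swap in larger disks one at a time, then grow
            zpool set autoexpand=on tank
            zpool replace tank sda sdh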

bayindirh 5 days ago

> Most Linux distros still use ext4 by default, which is 19 years old, but ext4 is little more than a series of extensions on top of ext2, which is the same age as NTFS.

However, ext4 and XFS are much simpler and more performant than Btrfs & ZFS as root filesystems on personal systems and small servers.

I personally won't use either on a single disk system as root FS, regardless of how fast my storage subsystem is.

  • ryao 5 days ago

    ZFS will outscale ext4 in parallel workloads with ease. XFS will often scale better than ext4, but if you use L2ARC and SLOG devices, it is no contest. On top of that, you can use compression for an additional boost.

    You might also find ZFS outperforms both of them in read workloads on single disks where ARC minimizes cold cache effects. When I began using ZFS for my rootfs, I noticed my desktop environment became more responsive and I attributed that to ARC.
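
    For anyone unfamiliar, attaching those tiers is a one-liner each (device names are examples):

      zpool add tank cache nvme0n1               # L2ARC: second-level read cache
      zpool add tank log mirror nvme1n1 nvme2n1  # SLOG: separate intent log for sync writes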

    • jeltz 4 days ago

      Not on most database workloads. There zfs does not scale very well.

      • ryao 4 days ago

        Percona and many others who benchmarked this properly would disagree with you. Percona found that ext4 and ZFS performed similarly when given identical hardware (with proper tuning of ZFS):

        https://www.percona.com/blog/mysql-zfs-performance-update/

        In this older comparison where they did not initially tune ZFS properly for the database, they found XFS to perform better, only for ZFS to outperform it when tuning was done and a L2ARC was added:

        https://www.percona.com/blog/about-zfs-performance/

        This is roughly what others find when they take the time to do proper tuning and benchmarks. ZFS outscales both ext4 and XFS, since it is a multiple block device filesystem that supports tiered storage while ext4 and XFS are single block device filesystems (with the exception of supporting journals on external drives). They need other things to provide them with scaling to multiple block devices and there is no block device level substitute for supporting tiered storage at the filesystem level.

        That said, ZFS has a killer feature that ext4 and XFS do not have, which is low cost replication. You can snapshot and send/recv without affecting system performance very much, so even in situations where ZFS is not at the top in every benchmark such as being on equal hardware, it still wins, since the performance penalty of database backups on ext4 and XFS is huge.
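
        The replication workflow is roughly this (names are examples):

          zfs snapshot tank/db@monday
          zfs send tank/db@monday | ssh backup zfs recv pool/db
          # subsequent backups only send the delta
          zfs snapshot tank/db@tuesday
          zfs send -i @monday tank/db@tuesday | ssh backup zfs recv pool/db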

    • bayindirh 4 days ago

      No doubt. I want to reiterate my point. Citing myself:

      > "I personally won't use either on a single disk system as root FS, regardless of how fast my storage subsystem is." (emphasis mine)

      We are no strangers to filesystems. I personally benchmarked a ZFS7320 extensively, writing a characterization report, plus we have had a ZFS7420 for a very long time, complete with separate log SSDs for read and write on every box.

      However, ZFS is not saturation-proof, plus it is nowhere near a Lustre cluster performance-wise when scaled.

      What kills ZFS and Btrfs on desktop systems is write performance, esp. on heavy workloads like system updates. If I needed a desktop server (performance-wise), I'd configure it accordingly and use them, but I'd never use Btrfs or ZFS on a single root disk due to their overhead, to reiterate myself thrice.

honestSysAdmin 4 days ago

  https://openzfs.github.io/openzfs-docs/Getting%20Started/index.html
ZFS runs on all major Linux distros, the source is compiled locally and there is no meaningful license problem. In datacenter and "enterprise" environments we compile ZFS "statically" with other kernel modules all the time.

For over six years now, there has been an "experimental" option presented by the graphical Ubuntu installer to install the root filesystem on ZFS. Almost everyone I personally know (just my anecdote) chooses this "experimental" option. There has been the occasional issue of ZFS snapshots taking up too much space, but other than that there have not been any problems.

I statically compile ZFS into a kernel that intentionally does not support loading modules on some of my personal laptops. My experience has been great, others' mileage may (certainly will) vary.
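
If anyone wants to try it, the OpenZFS tree ships the pieces for a built-in (non-module) build; a sketch from memory, with example paths:

  ./configure --enable-linux-builtin --with-linux=/usr/src/linux
  ./copy-builtin /usr/src/linux
  # then set CONFIG_ZFS=y in the kernel config and build the kernel as usual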

xattt 5 days ago

ZFS on OS X was killed because of Oracle licensing drama. I don’t expect anything better on Windows either.

  • ryao 5 days ago

    There is a third party port here:

    https://openzfsonosx.org/wiki/Main_Page

    It was actually the NetApp lawsuit that caused problems for Apple’s adoption of ZFS. Apple wanted indemnification from Sun because of the lawsuit; Sun’s CEO did not sign the agreement before Oracle’s acquisition of Sun, and Oracle had no interest in granting it, so the official Apple port was cancelled.

    I heard this second hand years later from people who were insiders at Sun.

    • xattt 5 days ago

      That’s a shame re: NetApp/ZFS.

      While third-party ports are great, they lack the deep integration that first-party support would have brought (non-kludgy Time Machine, which is technically fixed with APFS).

      • [removed] 4 days ago
        [deleted]
  • throw0101a 5 days ago

    > ZFS on OS X was killed because of Oracle licensing drama.

    It was killed because Apple and Sun couldn't agree on a 'support contract'. From Jeff Bonwick, one of the co-creators of ZFS:

    >> Apple can currently just take the ZFS CDDL code and incorporate it (like they did with DTrace), but it may be that they wanted a "private license" from Sun (with appropriate technical support and indemnification), and the two entities couldn't come to mutually agreeable terms.

    > I cannot disclose details, but that is the essence of it.

    * https://archive.is/http://mail.opensolaris.org/pipermail/zfs...

    Apple took DTrace, licensed via the CDDL just like ZFS, and put it into their kernel without issue. Of course a file system is much more central to an operating system, so they wanted much more of a CYA for that.

  • BSDobelix 5 days ago

    >ZFS on OS X was killed because of Oracle licensing drama.

    Nah, it was Jobs' ego, not the license:

    >>Only one person at Steve Jobs' company announces new products: Steve Jobs.

    https://arstechnica.com/gadgets/2016/06/zfs-the-other-new-ap...

    • bolognafairy 5 days ago

      It’s a cute story that plays into the same old assertions about Steve Jobs, but the conclusion is mostly baseless. There are many other, more credible, less conspiratorial possible explanations.

      • wkat4242 5 days ago

        It could have played into it, though. But I agree that the support contract that couldn't be worked out, mentioned elsewhere in the thread, is the more likely explanation.

        But I think these things are usually a combination. When a business relationship sours, agreements are suddenly much harder to work out. The negotiators are still people, and they have feelings that affect their decision-making.

      • [removed] 5 days ago
        [deleted]
nabla9 5 days ago

The license is not a real issue. ZFS just has to be distributed as a separate module. No big hurdle.

  • crest 5 days ago

    The main hurdle is hostile Linux kernel developers who aren't held accountable for intentionally breaking ZFS for their own petty ideological reasons, e.g. removing the in-kernel FPU/SIMD register save/restore API and replacing it with a "new" API to do the same.

    What's "new" about the "new" API? Its symbols are GPL2 only to deny it's use to non-GPL2 modules (like ZFS). Guess that's an easy way to make sure that BTRFS is faster than ZFS or set yourself up as the (to be) injured party.

    Of course a reimplementation of the old API in terms of the new is an evil "GPL condom" violating the kernel license, right? Why can't you see ZFS's CDDL license is the real problem here for being the wrong flavour of copyleft license. Way to claim the moral high ground, you short-sighted, bigoted pricks. sigh

  • Jnr 5 days ago

    From my point of view it is a real usability issue.

    The zfs modules are not in the official repos. You either have to compile them on each machine or use unofficial repos, which is not exactly ideal and can break things if those repos are not up to date. And I guess it also needs some additional steps for a Secure Boot setup on some distros?

    I really want to try zfs because btrfs has some issues with RAID5 and RAID6 (it is not recommended, so I don't use it), but I am not sure I want to risk overall system stability. I would not want to end up in a situation where my machines don't boot and I have to fix them manually.

    • chillfox 5 days ago

      I have been using ZFS on Mint and Alpine Linux for years for all drives (including root) and have never had an issue. It's been fantastic and is super fast. My Linux/ZFS laptop loads games much faster than an identical machine running Windows.

      I have never had data corruption issues with ZFS, but I have had both xfs and ext4 destroy entire disks.

    • harshreality 5 days ago

      Why are you considering raid5/6? Are you considering building a large storage array? If the data will fit comfortably (50-60% utilization) on one drive, all you need is raid1. Btrfs is fine for raid1 (raid1c3 for extra redundancy); it might have hidden bugs, but no filesystem is immune from those; zfs had a data loss bug (it was rare, but it happened) a year ago.

      Why use zfs for a boot partition? Unless you're using every disk mounting point and nvme slot for a single large raid array, you can use a cheap 512GB nvme drive or old spare 2.5" ssd for the boot volume. Or two, in btrfs raid1 if you absolutely must... but do you even need redundancy or datasum (which can hurt performance) to protect OS files? Do you really care if static package files get corrupted? Those are easily reinstalled, and modern quality brand SSDs are quite reliable.
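
      (For concreteness, those profiles are set at mkfs time or via a rebalance; device names are examples:)

        # two copies across devices
        mkfs.btrfs -d raid1 -m raid1 /dev/sda /dev/sdb
        # three copies across three or more devices, survives losing two
        mkfs.btrfs -d raid1c3 -m raid1c3 /dev/sda /dev/sdb /dev/sdc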

      • Jnr 4 days ago

        I am already using ext4 for /boot and / on nvme, and I am happy with that.

        I want to use raid5 for the large storage mount point that holds non-OS files. I want both space and redundancy. Currently I have several separate raid1 btrfs mounts, since raid5 is recommended against.

  • GuB-42 5 days ago

    It is a problem because most of the internal kernel APIs are GPL-only, which limits the abilities of the ZFS module. It is a common source of argument between the Linux guys and the ZFS-on-Linux guys.

    The reason for this is not just to piss off non-GPL module developers. GPL-only internal APIs are subject to change without notice, even more so than the rest of the kernel. And because the licence may not allow the Linux kernel developers to make the necessary changes to the module when that happens, there is a good chance it breaks without warning.

    And even with that, all internal APIs may change; it is just a bit less likely than for the GPL-only ones. And because ZFS on Linux is a separate module, there is no guarantee that it won't break with successive Linux versions; in fact, it is practically guaranteed that it will.

    Linux is proudly monolithic, and to keep a constantly evolving monolithic kernel working, its developers need to have control over the entire project. It is also community-driven. Combined, you need rules for the community to work together, or everything will break down, and that's what the GPL is for.

  • nijave 4 days ago

    I remember it being a pain in the ass on Fedora, which tracks mainline closely. Frequently a new kernel version would come out that the zfs module didn't support, so you'd have to downgrade and hold back the package until support was added.

    Fedora packages zfs-fuse. I think some distros have arrangements to make sure their kernels have zfs support. It may be less of a headache on those.

    In-tree filesystems don't break that way.

nijave 4 days ago

You've been able to add and remove devices at will for a long time with btrfs (only recently supported in zfs, with lots of caveats).

Btrfs also supports async/offline dedupe

You can also layer it on top of mdadm. IIRC zfs strongly discourages using anything but directly attached physical disks.
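
The offline dedupe is typically driven by an external tool that calls the dedupe ioctl; duperemove is the one I know of (flags from memory):

  duperemove -dr /mnt/data   # -d actually dedupes, -r recurses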

BSDobelix 5 days ago

>ZFS has license issues with Linux, preventing full integration

No one wants that. OpenZFS is much healthier without Linux and its "Foundation/Politics".

  • bhaney 5 days ago

    > No one wants that

    I want that

    • BSDobelix 5 days ago

      Then let me tell you that FreeBSD or OmniOS is what you really want ;)

      • bhaney 5 days ago

        You're now 0 for 2 at telling me what I want

LtdJorge 4 days ago

XFS is 22 and still the best in-tree FS there is :)