Comment by mustache_kimono 5 days ago

> Technically speaking, bcachefs has been merged into the Linux Kernel - that makes your initial assertion wrong.

bcachefs doesn't implement its erasure coding/RAID yet? Doesn't implement send/receive. Doesn't implement scrub/fsck. See: https://bcachefs.org/Roadmap, https://bcachefs.org/Wishlist/

btrfs is still more of a legit competitor to ZFS these days and it isn't close to touching ZFS where it matters. If the perpetually half-finished bcachefs and btrfs are the "answer" to ZFS that seems like too little, too late to me.

koverstreet 5 days ago

Erasure coding is almost done; all that's missing is some of the device evacuate and reconstruct paths, and people have been testing it and giving positive feedback (especially w.r.t. performance).

It most definitely does have fsck and has since the beginning, and it's a much more robust and dependable fsck than btrfs's. Scrub isn't quite done - I actually was going to have it ready for this upcoming merge window except for a nasty bout of salmonella :)

Send/recv is a long way off; there might be some low-level database improvements needed before that lands.

Short term (next year or two) priorities are finishing off online fsck, more scalability work (upcoming version for this merge window will do 50PB, but now we need to up the limit on number of drives), and quashing bugs.

  • ryao 5 days ago

    Hearing that it is missing some code for reconstruction makes it sound like it is missing something fairly important. The original purpose of parity RAID is to support reconstruction.

    • koverstreet 5 days ago

      We can do reconstruct reads; what's missing is the code to rewrite missing blocks in a stripe after a drive dies.

      In general, due to the scope of the project, I've been prioritizing the functionality that's needed to validate the design and the parts that are needed for getting the relationships between different components correct.

      e.g. recently I've been doing a bunch of work on backpointers scalability, and that plus scrub are leading to more back and forth iteration on minor interactions with erasure coding.

      So: erasure coding is complete enough to know that it works and for people to torture test it, but yes you shouldn't be running it in production yet (and it's explicitly marked as such). What's remaining is trivial but slightly tedious stuff that's outside the critical path of the rest of the design.

      Some of the code I've been writing for scrub is turning out to also be what we want for reconstruct, so maybe we'll get there sooner rather than later...
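To make the distinction in this reply concrete, here is a minimal sketch of the difference between a reconstruct read and rewriting a missing block. It is not bcachefs code: it assumes simple single-parity XOR (RAID-5 style) as a stand-in for whatever erasure code bcachefs actually uses, and every function name here is hypothetical.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR a list of equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def make_stripe(data_blocks):
    """Build a stripe: the data blocks followed by one XOR parity block."""
    return list(data_blocks) + [xor_blocks(data_blocks)]


def reconstruct_read(stripe, missing):
    """Recompute the block at index `missing` from the surviving blocks.

    The stripe itself is left untouched, so every later read of the
    missing block has to pay the reconstruction cost again.
    """
    survivors = [block for i, block in enumerate(stripe) if i != missing]
    return xor_blocks(survivors)


def rewrite_missing(stripe, missing):
    """Repair step: recompute the lost block and store it back in the stripe.

    In a real filesystem this would write to a replacement device; here it
    simply returns a repaired copy of the stripe.
    """
    repaired = list(stripe)
    repaired[missing] = reconstruct_read(stripe, missing)
    return repaired


stripe = make_stripe([b"AAAA", b"BBBB", b"CCCC"])

# Simulate a dead drive: block 1 is gone.
degraded = list(stripe)
degraded[1] = None

recovered = reconstruct_read(degraded, 1)   # data is still readable
repaired = rewrite_missing(degraded, 1)     # redundancy is restored
```

In this toy model the degraded stripe can still serve reads, but it stays one failure away from data loss until something like `rewrite_missing` runs.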

  • BSDobelix 5 days ago

    > except for a nasty bout of salmonella

    Did the Linux Foundation send you some "free" sushi? ;)

    However, keep the good work rolling; super happy about a good, usable and modern filesystem native to Linux.

  • pdimitar 5 days ago

    FYI: the main reason I gave up on bcachefs is that I can't use devices with native 16K blocks.

    Hope that's coming this year. I have a bunch of old HDDs and SSDs and I could very easily assemble a spare storage server with about 4TB capacity. Already tested bcachefs with most of the drives and it performed very well.

    Also, the lack of ability to reconstruct seems like another worrying omission.

    • koverstreet 5 days ago

      I wasn't aware there were actual users needing bs > ps yet. Cool :)

      That should be completely trivial for bcachefs to support; it'll mostly just be a matter of finding or writing the tests.
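For readers unfamiliar with the shorthand, "bs > ps" is read here as "block size larger than the kernel page size", which is the situation a device with native 16K blocks creates on a system with 4K pages. A rough sketch of how one might check for it on Linux follows. The helper names are hypothetical, and the sysfs path is the usual location for a block device's logical block size, so treat it as an assumption for your system.

```python
import os
import sys
from pathlib import Path


def needs_bs_gt_ps(block_size: int, page_size: int) -> bool:
    """True when the device block size exceeds the kernel page size."""
    return block_size > page_size


def logical_block_size(dev: str) -> int:
    """Read a device's logical block size in bytes from sysfs (Linux only)."""
    path = Path(f"/sys/block/{dev}/queue/logical_block_size")
    return int(path.read_text().strip())


def kernel_page_size() -> int:
    """Return the page size in bytes as reported by the OS."""
    return os.sysconf("SC_PAGE_SIZE")


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python check_bs.py nvme0n1   (device name is an example)
    dev = sys.argv[1]
    bs, ps = logical_block_size(dev), kernel_page_size()
    verdict = "needs bs > ps support" if needs_bs_gt_ps(bs, ps) else "fits within a page"
    print(f"{dev}: block size {bs}, page size {ps}: {verdict}")
```

A 16K-block drive on a 4K-page kernel falls into the first case, which is the configuration pdimitar describes.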

      • pdimitar 5 days ago

        Seriously? But... NVMe drives! I stopped testing because I only have one spare NVMe and couldn't use it with bcachefs.

        If you or others can get it done I'm absolutely starting to use bcachefs the month after. I do need fast storage servers in my home office.

  • mafuy 5 days ago

    Thank you, looking forward to it!