Nothing to report.

  • FrederikNJS@lemmy.zip
    23 days ago

    BTRFS has native checksumming, so it detects bitrot whenever the affected blocks are read or scrubbed. It also supports various RAID levels, so if you have some level of replication or parity, it can combine that redundancy with the checksums to repair bitrot automatically as well.
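
    As a rough illustration of that idea (a toy model, not how btrfs is actually implemented): each block has a stored checksum, a read picks the first copy that still matches it, and any copy that doesn't match gets rewritten from the good one. The names `checksum` and `read_with_repair` are made up for this sketch.

```python
import hashlib


def checksum(block: bytes) -> bytes:
    # blake2b is used only because it comes up in this thread;
    # any strong hash would do for the illustration.
    return hashlib.blake2b(block, digest_size=32).digest()


def read_with_repair(copies: list[bytearray], expected: bytes) -> bytes:
    """Return a copy whose checksum matches, rewriting bad copies from it."""
    good = next((bytes(c) for c in copies if checksum(bytes(c)) == expected), None)
    if good is None:
        # Every replica is damaged: redundancy cannot help, only a backup can.
        raise IOError("all copies fail their checksum")
    for c in copies:
        if checksum(bytes(c)) != expected:
            c[:] = good  # self-heal the damaged replica in place
    return good


data = b"important data"
expected = checksum(data)
mirror = [bytearray(data), bytearray(data)]  # two replicas, RAID1-style
mirror[0][0] ^= 0x01  # simulate a single flipped bit on one replica
result = read_with_repair(mirror, expected)
```

    If every replica is damaged the function raises instead of returning bad data, which is the case where only a backup helps.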

    A proper backup strategy is of course still necessary.

    • Hupf@feddit.org
      23 days ago

      I’m running a 60TB btrfs RAID with all the bells and whistles myself and just recently had an instance of some file being fucked up (probably just the wrong metadata bit being affected or something), which I noticed because btrfs send would repeatedly crash at that inum. All the redundancy may be there, but sometimes it’s not able to recover automagically.

      Not hating on btrfs at all - it helped me recover from a few fubar situations that could easily have been total data loss - but magical thinking (about all the fancy features) is dangerous.

      • FrederikNJS@lemmy.zip
        22 days ago

        Huh, that sounds very weird… If for example you’re running RAID1, then all bits of the metadata should be duplicated. So unless the same bit of metadata was also corrupted on the other disk, it should be recoverable…

        What checksum algorithm are you running?

        • Hupf@feddit.org
          20 days ago

          blake2b checksum, zstd compression, raid1c4 metadata and raid6 data. Kernel 6.12, btrfs-progs 6.17, ECC RAM.

          The files in the affected inode haven’t been touched for a few years. Dmesg said something about zstd decompression failing, which prevented btrfs send of an incremental snapshot as well as access to one single file.

          Due to the size of the array, I don’t always get around to doing a full scrub after an (albeit rare) system crash, so I wrote it off as probably that and didn’t analyze much further at the time.
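
          For reference, a minimal sketch of kicking off such a scrub from a script. The helper names and the `/mnt/array` mountpoint are hypothetical; the only real piece is the `btrfs scrub start -B <path>` invocation, where `-B` keeps the scrub in the foreground and prints statistics when it finishes.

```python
import subprocess


def scrub_command(mountpoint: str) -> list[str]:
    # -B: do not background; wait for the scrub and print stats at the end.
    return ["btrfs", "scrub", "start", "-B", mountpoint]


def run_scrub(mountpoint: str) -> int:
    """Run a foreground scrub and return the exit status (needs root)."""
    return subprocess.run(scrub_command(mountpoint), check=False).returncode


# Hypothetical mountpoint, shown only to illustrate the resulting command.
cmd = scrub_command("/mnt/array")
```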

          • FrederikNJS@lemmy.zip
            20 days ago

            Ah, it’s probably a result of running RAID6 then. All the parity RAID modes in BTRFS still have some issues, such as the “write hole”. This can result in data loss when the filesystem isn’t unmounted cleanly, such as after a crash or power loss.
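
            To make the write hole concrete, here is a toy single-parity (RAID5-style) stripe, not btrfs’s actual on-disk layout: a data block is updated, the crash hits before the parity is rewritten, and a later rebuild of a different block from that stale parity produces garbage. The block contents and the `xor_blocks` helper are made up for the sketch.

```python
from functools import reduce


def xor_blocks(*blocks: bytes) -> bytes:
    # Byte-wise XOR across equally sized blocks (single-parity arithmetic).
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# A consistent stripe: two data blocks plus their parity.
d0_old, d1 = b"AAAA", b"BBBB"
parity_old = xor_blocks(d0_old, d1)

# d0 is overwritten, but the crash happens before the parity is updated.
d0_new = b"CCCC"
on_disk = {"d0": d0_new, "d1": d1, "parity": parity_old}

# Later the disk holding d1 dies and d1 is rebuilt from d0 and the stale parity.
rebuilt_d1 = xor_blocks(on_disk["d0"], on_disk["parity"])

# With an up-to-date parity the same rebuild would have worked.
parity_new = xor_blocks(d0_new, d1)
rebuilt_ok = xor_blocks(d0_new, parity_new)
```

            Here `rebuilt_d1` no longer equals `d1`, even though `d1` itself was never written to. Checksums should still flag the rebuilt block as bad, but with the parity stale there is nothing left to repair it from.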

            RAID5 and RAID6 are still not recommended for production use.