Corrupting a ZFS File on Purpose

Infrastructure
Storage
Open Source

The post is a hands-on experiment in forcing corruption into a ZFS-backed file and observing how ZFS responds. The useful takeaway is not the exact mechanics of `dd` and block offsets, but the reminder that ZFS is built around end-to-end verification. It checks that the data returned for a given block is the data that block was supposed to contain, not just that the drive returned something internally consistent.

If you rely on plain RAID or drive firmware to keep data safe, you are missing failure modes that only filesystem-level checksums can catch. For archival or large critical datasets, pair ZFS-style integrity checks with redundancy and periodic verification instead of assuming modern disks will surface every error cleanly.

June 9, 2026
oshogbo.com

Key insights

Filesystem checksums catch failures drive ECC cannot

Drive-level ECC only validates that the bits read from a sector are consistent with the error-correcting code the drive stored alongside that sector. It cannot prove the firmware wrote the correct payload, stored it at the correct LBA, or returned the block the OS actually asked for. That is why ZFS can flag corruption even when the disk reports a clean read, including cases like the Samsung 840 EVO queued TRIM bug, where nothing below the filesystem reported a problem.

Do not treat successful block reads as proof your data is intact. If the data matters, use a filesystem or verification layer that validates content against higher-level checksums.
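The idea can be illustrated with a toy model. This is a minimal sketch, not ZFS's on-disk format: `ToyDisk`, `write_block`, and `verified_read` are hypothetical names, and SHA-256 stands in for whatever checksum a real filesystem uses. The point it shows is that the checksum lives with the pointer to the block rather than with the block itself, so a block that reads back cleanly from the wrong place still fails verification.

```python
import hashlib


class ToyDisk:
    """Hypothetical block device that never reports a read error."""

    def __init__(self):
        self.blocks = {}

    def write(self, lba, data):
        self.blocks[lba] = data

    def read(self, lba):
        return self.blocks[lba]


def checksum(data):
    return hashlib.sha256(data).hexdigest()


def write_block(disk, lba, data):
    """Write data and return a 'pointer' holding the address and expected checksum."""
    disk.write(lba, data)
    return {"lba": lba, "checksum": checksum(data)}


def verified_read(disk, pointer):
    """Read a block and check it against the checksum kept in the pointer."""
    data = disk.read(pointer["lba"])
    if checksum(data) != pointer["checksum"]:
        raise IOError(f"checksum mismatch at LBA {pointer['lba']}")
    return data


def demo():
    disk = ToyDisk()
    ptr_a = write_block(disk, 10, b"payload A")
    ptr_b = write_block(disk, 11, b"payload B")
    # Simulate a misdirected write: B's payload also lands at A's address.
    # The "drive" still returns a well-formed block for LBA 10.
    disk.blocks[10] = disk.blocks[11]
    good = verified_read(disk, ptr_b)
    try:
        verified_read(disk, ptr_a)
        detected = False
    except IOError:
        detected = True
    return good, detected
```

In `demo()`, the read of block B succeeds and the read of block A raises, even though the toy disk never signals an error for either.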

Attribution:

throw0101c #1
matja #1
ssl-3 #1

Archival storage still needs its own redundancy

Long-stored disks can develop a few bad sectors that slip past normal handling, and independent hashes are often the only reason the damage gets noticed. PAR2 or RAR recovery records can repair scattered corruption, but commenters were clear that sidecar redundancy is not a substitute for full duplicate copies on separate media and in different locations.

For cold storage, keep per-file verification data and maintain at least one independent duplicate. Use PAR2 for repairable drift, not as your only disaster plan.
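Per-file verification data can be as simple as a hash manifest stored next to the archive. The sketch below assumes SHA-256 and uses hypothetical helper names (`build_manifest`, `verify_manifest`). It only detects damage; repairing it still needs PAR2-style recovery data or a second copy.

```python
import hashlib
import tempfile
from pathlib import Path


def file_sha256(path, chunk_size=1 << 20):
    """Hash a file in chunks so large archives do not need to fit in memory."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def build_manifest(root):
    """Map each file's path (relative to root) to its SHA-256 digest."""
    root = Path(root)
    return {
        p.relative_to(root).as_posix(): file_sha256(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def verify_manifest(root, manifest):
    """Return a list of (relative_path, problem) for missing or changed files."""
    root = Path(root)
    problems = []
    for rel, expected in manifest.items():
        p = root / rel
        if not p.is_file():
            problems.append((rel, "missing"))
        elif file_sha256(p) != expected:
            problems.append((rel, "mismatch"))
    return problems


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        (root / "a.bin").write_bytes(b"archive payload A" * 100)
        (root / "sub").mkdir()
        (root / "sub" / "b.bin").write_bytes(b"archive payload B" * 100)
        manifest = build_manifest(root)
        clean = verify_manifest(root, manifest)
        # Simulate drift: flip one bit in one file and lose the other entirely.
        data = bytearray((root / "a.bin").read_bytes())
        data[5] ^= 0x01
        (root / "a.bin").write_bytes(bytes(data))
        (root / "sub" / "b.bin").unlink()
        damaged = verify_manifest(root, manifest)
    return clean, damaged
```

Keeping the manifest on different media from the data it describes means a single bad disk cannot damage both.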

Attribution:

adrian_b #1 #2
ramses0 #1
wongarsu #1

ZFS can survive terrible hardware combinations

One detailed report described a RAIDZ setup built from external USB SMR disks, which is close to a worst-case stack for reliability. Even there, the filesystem usually preserved data through controller issues, dropped drives, corrupted files, and damaged metadata, with recovery possible after manual repair steps. That is a strong vote for ZFS resilience, though not for copying that architecture.

If you are already stuck with flaky storage, ZFS can buy you recovery headroom. It is still cheaper to avoid fragile USB controllers and SMR-heavy designs than to depend on heroics later.

Attribution:

guardiangod #1

Multiple copies on one disk can help

ZFS's ability to keep more than one copy of a block on a single disk sounds odd until you optimize for sector failure rather than whole-drive death. For workloads where isolated block errors are more common than total device loss, extra in-disk copies can reduce corruption exposure without requiring another full mirror device.

Match redundancy to the failure mode you actually expect. If you care about localized media errors on a single device, block-level duplication may be worth considering alongside pool-level redundancy.
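ZFS exposes this as the `copies` dataset property. The toy model below is not how ZFS lays out data; it is a sketch under simplifying assumptions (a fixed 16-byte block, checksums held in a Python dict, a hypothetical `SingleDiskWithCopies` class) showing why a second copy on the same device helps with localized damage and does nothing once every copy of a block is hit.

```python
import hashlib

BLOCK = 16  # toy block size in bytes


def digest(data):
    return hashlib.sha256(data).digest()


class SingleDiskWithCopies:
    """One 'disk' (a bytearray) storing several copies of every block."""

    def __init__(self, nblocks, copies=2):
        self.nblocks = nblocks
        self.copies = copies
        self.media = bytearray(BLOCK * nblocks * copies)
        self.sums = {}  # block index -> expected checksum

    def _offset(self, index, copy):
        # Each copy lives in its own region, so one damaged area
        # is unlikely to cover every copy of the same block.
        return (copy * self.nblocks + index) * BLOCK

    def write(self, index, data):
        if len(data) != BLOCK:
            raise ValueError("data must be exactly one block")
        for c in range(self.copies):
            off = self._offset(index, c)
            self.media[off:off + BLOCK] = data
        self.sums[index] = digest(data)

    def read(self, index):
        """Return (data, copy_used), trying copies until one verifies."""
        for c in range(self.copies):
            off = self._offset(index, c)
            data = bytes(self.media[off:off + BLOCK])
            if digest(data) == self.sums[index]:
                return data, c
        raise IOError(f"all {self.copies} copies of block {index} are damaged")


def demo():
    disk = SingleDiskWithCopies(nblocks=4, copies=2)
    payload = b"0123456789abcdef"
    disk.write(2, payload)
    disk.media[disk._offset(2, 0)] ^= 0xFF  # damage the first copy
    data, used = disk.read(2)
    disk.media[disk._offset(2, 1)] ^= 0xFF  # damage the second copy too
    try:
        disk.read(2)
        lost = False
    except IOError:
        lost = True
    return data, used, lost
```

The model also makes the limit obvious: if the whole bytearray disappears, every copy goes with it, which is the case pool-level redundancy covers.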

Attribution:

BuildTheRobots #1

Against the grain

Random byte flips may model the wrong failure

The experiment's corruption method may not look like a typical failing disk, because real devices often detect unreadable sectors and raise I/O errors instead of quietly returning garbage. That does not undercut ZFS, but it does mean a more realistic test would include truncation, holes, or forced read failures rather than only hand-edited bytes.

When you test storage recovery, simulate the failures your stack is likely to produce. Include outright read errors and missing sectors, not just silent corruption.
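A small fault-injection helper makes it easier to cover more than one failure shape. The sketch below works on a scratch file and uses hypothetical function names. To exercise a checksumming filesystem, the damage has to land below it, for example on an image file backing a throwaway test pool, because writes made through the filesystem are simply recorded as valid new data. Forced read errors are not modeled here, since they need support from the device or a fault-injection layer rather than edits to file contents.

```python
import os
import tempfile


def flip_byte(path, offset):
    """Silent corruption: invert every bit of one byte in place."""
    with open(path, "r+b") as f:
        f.seek(offset)
        b = f.read(1)
        if not b:
            raise ValueError("offset is past the end of the file")
        f.seek(offset)
        f.write(bytes([b[0] ^ 0xFF]))


def zero_range(path, offset, length):
    """A 'hole': overwrite a range with zeros, as if sectors lost their contents."""
    with open(path, "r+b") as f:
        f.seek(offset)
        f.write(b"\x00" * length)


def truncate_file(path, new_size):
    """Truncation: drop everything after new_size bytes."""
    with open(path, "r+b") as f:
        f.truncate(new_size)


def demo():
    original = bytes(range(256)) * 4  # 1024 bytes of known content
    injections = [
        ("flip", lambda p: flip_byte(p, 100)),
        ("hole", lambda p: zero_range(p, 256, 128)),
        ("truncate", lambda p: truncate_file(p, 512)),
    ]
    results = {}
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "scratch.img")
        for name, inject in injections:
            # Start each scenario from a fresh copy of the original content.
            with open(path, "wb") as f:
                f.write(original)
            inject(path)
            with open(path, "rb") as f:
                results[name] = f.read()
    return original, results
```

Running each scenario against a fresh copy keeps the results independent, so a detection or repair failure can be attributed to one failure shape.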

Attribution:

ralferoo #1

The writing style annoyed more than it informed

A noticeable side conversation argued that the post leaned too hard on dramatic phrasing and suspense for what was really a straightforward technical walkthrough. The complaint was not about one word choice. It was about a broader grandiose tone that some readers now associate with LLM-assisted writing and find exhausting in explanatory posts.

If you publish technical writeups for engineers, keep the prose tight and concrete. A clear lab notebook voice will usually travel better than a theatrical one.

Attribution:

anonymous_user9 #1
calcifer #1
rcxdude #1
eigencoder #1

In plain English

dd

A Unix command-line tool used for low-level copying and editing of raw data blocks.

ECC

Error-correcting code, extra data stored alongside each sector (or memory word) that lets hardware detect and often correct some bit errors. Here it refers to the ECC that drives apply internally.

LBA

Logical Block Addressing, the numbered block locations a computer uses to read and write sectors on a disk.

PAR2

Parchive version 2, a file format and toolset that creates recovery data so damaged files can be verified and repaired.

RAIDZ

ZFS's RAID-like storage layout that uses parity across multiple drives to survive disk failures.

SMR

Shingled magnetic recording, a hard drive technology that overlaps adjacent tracks to increase capacity, at the cost of slow and less predictable rewrites.

SSD

Solid-state drive, a type of flash-based storage used in phones and computers.

TRIM

A storage command that tells a solid-state drive which blocks are no longer in use so the drive can erase and reuse them. Queued TRIM is a variant that can be issued alongside other queued commands.

ZFS

A filesystem and volume manager designed with built-in checksumming and data integrity features.

Reference links

File recovery and integrity tools

Parchive
Referenced as a way to add repair data for archive files so corrupted files can be reconstructed.

Related experiments

Btrfs corruption experiment gist
Shared as a similar hands-on exercise for testing corruption behavior on Btrfs.
