Replication Between Pools Causes Corruption

yottabit

Contributor
Joined
Apr 15, 2012
Messages
192


Top-comment update: see comment #50 if you want the raw source dataset that corrupts when replicating to the raw destination dataset. The encryption was a red herring. Current hypothesis is that lz4-compressed data is not decompressed before presentation to the user.



I have an old dataset that I want to encrypt.
  1. Snapshot vol1/jacob.mcdonald
  2. Use replicate task to send the snapshot from the old pool to a new pool
  3. Use replicate task to send the snapshot from the new pool to the old pool into an encrypted dataset
  4. md5sum the unencrypted/original snapshot on the old pool
  5. md5sum the encrypted/new snapshot on the old pool
  6. 0.9% of the files have differing checksums!
Example:
Code:
113ab4fd2edc93565019db624576bf15  /mnt/vol1/jacob.mcdonald/jacob.mcdonald/Video/Rylo/180309 Spring Break 2018/raw/43413741-3635-4639-3433-463035314537.project # original dataset
113ab4fd2edc93565019db624576bf15  /mnt/big_scratch/migration/jacob.mcdonald/jacob.mcdonald/Video/Rylo/180309 Spring Break 2018/raw/43413741-3635-4639-3433-463035314537.project # replicated snapshot to new pool
37181be79c071551faa8a624d7ec2070  /mnt/vol1/jacob.mcdonald_migrated/jacob.mcdonald/Video/Rylo/180309 Spring Break 2018/raw/43413741-3635-4639-3433-463035314537.project # replicated snapshot into encrypted dataset

I'm running checksums on the intermediate dataset on the new pool now; quick checks show that original -> unencrypted intermediate matches, but intermediate -> encrypted does not.
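For anyone following along, the bulk comparison can be scripted along these lines. This is a rough sketch; the dataset paths are illustrative, and pointing it at the `.zfs/snapshot/migration` paths instead gives a like-for-like comparison of the frozen snapshots:

```shell
# Build a relative-path checksum manifest for each copy of the data,
# then diff the manifests; any diff output is a mismatched file.
# Dataset paths are illustrative -- substitute your own.
manifest() {
  ( cd "$1" && find . -type f -print0 | sort -z | xargs -0 md5sum )
}

manifest /mnt/vol1/jacob.mcdonald                  > original.md5
manifest /mnt/big_scratch/migration/jacob.mcdonald > intermediate.md5
manifest /mnt/vol1/jacob.mcdonald_migrated         > migrated.md5

diff original.md5 migrated.md5   # empty output means every file matches
```

Sorting by path keeps the manifests aligned so `diff` only flags real checksum mismatches, not ordering differences.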

Just to preempt the question: the pools show 0 errors. All appears well.

Has anyone else run into this?
 
Joined
Oct 22, 2019
Messages
3,641
That is odd. Out of curiosity, what if you tried a different checksum to compare the files, such as SHA512?

So 99.1% of the files have matching MD5 checksums, yet there's still 0.9% that don't? From the same task? Something feels amiss.
 

yottabit

Contributor
Joined
Apr 15, 2012
Messages
192
Yep, that's right: 99.1% matching hashes and 0.9% non-matching. Even though this is GNU md5sum (where it shouldn't matter), I manually re-ran the hashes on that file with '-b' (binary mode) just to make sure. The result matches what I wrote in the first post.

I'm rerunning the md5sums now from each .zfs/snapshot/migration directory to ensure it's a like-for-like comparison, though I haven't changed anything since the replications, so I really don't know how or why this happened.

The chances of an MD5 collision are pretty close to non-existent, and I did manually inspect the file I referenced above; it had indeed changed. It's a mostly-binary file that I would never edit directly, or even load from this data store, as it's useless unless loaded by the Android/iOS app that uses it natively.
 

AlexGG

Contributor
Joined
Dec 13, 2018
Messages
171
If you re-run your checksumming twice, does it produce the same list of faulty files or a different one? I'm wondering about a hardware problem. Is the NAS itself doing the checksumming, or are two different machines involved, one being the NAS and the other doing the checksumming?
 

yottabit

Contributor
Joined
Apr 15, 2012
Messages
192
The replication was all on the same server, a server-class machine that has been in use for many years with no faults. The source pool consists of two Z1 vdevs, both operating fine for many years: the disks in vdev1 have been in operation around 10 years, and the disks in vdev2 for a few years. The temporary pool used to speed up the shuffling is a single SAS disk, newly created. `zpool status` shows zero read/write/cksum errors for both pools.

The md5sums on all 3 datasets should've finished overnight. I'll start checks again soon and report back.
 

yottabit

Contributor
Joined
Apr 15, 2012
Messages
192
md5sum checks are still running (at least 5-6 more hours to go), but early results show the same problem, even when checksumming directly from the snapshots.

I found a complete text file example.

Contents of the text file on original dataset snapshot (`cat -nv`):
Code:
     1
     2  file 'MVI_0769.MOV'
     3  file 'MVI_0771.MOV'
     4  file 'MVI_0772.MOV'
     5  file 'MVI_0773.MOV'
     6  file 'MVI_0774.MOV'
     7  file 'MVI_0775.MOV'
     8  file 'MVI_0776.MOV'
     9  file 'MVI_0777.MOV'
    10  file 'MVI_0778.MOV'
    11  file 'MVI_0780.MOV'
    12  file 'MVI_0781.MOV'
    13  file 'MVI_0782.MOV'
    14  file 'MVI_0783.MOV'
    15  file 'MVI_0784.MOV'
    16  file 'MVI_0785.MOV'
    17  file 'MVI_0786.MOV'
    18  file 'MVI_0788.MOV'
    19  file 'MVI_0790.MOV'
    20  file 'MVI_0791.MOV'
    21  file 'MVI_0792.MOV'
    22  file 'MVI_0793.MOV'
    23  file 'MVI_0794.MOV'
    24  file 'MVI_0796.MOV'
    25  file 'MVI_0797.MOV'
    26  file 'MVI_0816.MOV'
    27  file 'MVI_0817.MOV'
    28  file 'MVI_0820.MOV'
    29  file 'MVI_0821.MOV'
    30  file 'MVI_0825.MOV'
    31  file 'MVI_0827.MOV'
    32  file 'MVI_0828.MOV'
    33  file 'MVI_0829.MOV'
    34  file 'MVI_0833.MOV'
    35  file 'MVI_0834.MOV'
    36  file 'MVI_0835.MOV'
    37  file 'MVI_0836.MOV'
    38  file 'MVI_0837.MOV'
    39  file 'MVI_0838.MOV'
    40  file 'MVI_0839.MOV'
    41  file 'MVI_0846.MOV'
    42  file 'MVI_0847.MOV'
    43  file 'MVI_0848.MOV'
    44  file 'MVI_0862.MOV'
    45  file 'MVI_0863.MOV'
    46  file 'MVI_0864.MOV'
    47  file 'MVI_0890.MOV'
    48  file 'MVI_0897.MOV'
    49  file 'MVI_0898.MOV'
    50  file 'MVI_0899.MOV'
    51  file 'MVI_0900.MOV'
    52  file 'MVI_0901.MOV'
    53  file 'MVI_0902.MOV'
    54  file 'MVI_0903.MOV'
    55  file 'MVI_0904.MOV'
    56  file 'MVI_0907.MOV'
    57  file 'MVI_0908.MOV'
    58  file 'MVI_0909.MOV'
    59  file 'MVI_0910.MOV'
    60  file 'MVI_0911.MOV'
    61  file 'MVI_0912.MOV'
    62  file 'MVI_0913.MOV'
    63  file 'MVI_0914.MOV'
    64  file 'MVI_0917.MOV'
    65  file 'MVI_0919.MOV'
    66  file 'MVI_0931.MOV'
    67  file 'MVI_0932.MOV'
    68  file 'MVI_0933.MOV'
    69  file 'MVI_0934.MOV'
    70  file 'MVI_0935.MOV'
    71  file 'MVI_0936.MOV'


Contents of the text file on intermediate (still unencrypted) dataset snapshot (md5 check success):
Code:
     1
     2  file 'MVI_0769.MOV'
     3  file 'MVI_0771.MOV'
     4  file 'MVI_0772.MOV'
     5  file 'MVI_0773.MOV'
     6  file 'MVI_0774.MOV'
     7  file 'MVI_0775.MOV'
     8  file 'MVI_0776.MOV'
     9  file 'MVI_0777.MOV'
    10  file 'MVI_0778.MOV'
    11  file 'MVI_0780.MOV'
    12  file 'MVI_0781.MOV'
    13  file 'MVI_0782.MOV'
    14  file 'MVI_0783.MOV'
    15  file 'MVI_0784.MOV'
    16  file 'MVI_0785.MOV'
    17  file 'MVI_0786.MOV'
    18  file 'MVI_0788.MOV'
    19  file 'MVI_0790.MOV'
    20  file 'MVI_0791.MOV'
    21  file 'MVI_0792.MOV'
    22  file 'MVI_0793.MOV'
    23  file 'MVI_0794.MOV'
    24  file 'MVI_0796.MOV'
    25  file 'MVI_0797.MOV'
    26  file 'MVI_0816.MOV'
    27  file 'MVI_0817.MOV'
    28  file 'MVI_0820.MOV'
    29  file 'MVI_0821.MOV'
    30  file 'MVI_0825.MOV'
    31  file 'MVI_0827.MOV'
    32  file 'MVI_0828.MOV'
    33  file 'MVI_0829.MOV'
    34  file 'MVI_0833.MOV'
    35  file 'MVI_0834.MOV'
    36  file 'MVI_0835.MOV'
    37  file 'MVI_0836.MOV'
    38  file 'MVI_0837.MOV'
    39  file 'MVI_0838.MOV'
    40  file 'MVI_0839.MOV'
    41  file 'MVI_0846.MOV'
    42  file 'MVI_0847.MOV'
    43  file 'MVI_0848.MOV'
    44  file 'MVI_0862.MOV'
    45  file 'MVI_0863.MOV'
    46  file 'MVI_0864.MOV'
    47  file 'MVI_0890.MOV'
    48  file 'MVI_0897.MOV'
    49  file 'MVI_0898.MOV'
    50  file 'MVI_0899.MOV'
    51  file 'MVI_0900.MOV'
    52  file 'MVI_0901.MOV'
    53  file 'MVI_0902.MOV'
    54  file 'MVI_0903.MOV'
    55  file 'MVI_0904.MOV'
    56  file 'MVI_0907.MOV'
    57  file 'MVI_0908.MOV'
    58  file 'MVI_0909.MOV'
    59  file 'MVI_0910.MOV'
    60  file 'MVI_0911.MOV'
    61  file 'MVI_0912.MOV'
    62  file 'MVI_0913.MOV'
    63  file 'MVI_0914.MOV'
    64  file 'MVI_0917.MOV'
    65  file 'MVI_0919.MOV'
    66  file 'MVI_0931.MOV'
    67  file 'MVI_0932.MOV'
    68  file 'MVI_0933.MOV'
    69  file 'MVI_0934.MOV'
    70  file 'MVI_0935.MOV'
    71  file 'MVI_0936.MOV'


Contents of the text file on destination (encrypted) dataset snapshot (md5 check fail):
Code:
     1  ^@^@^AxM-y^E
     2  file 'MVI_0769.MOV'^T^@/71^T^@^@^_2^T^@^@^_3^T^@^@^_4^T^@^@^_5^T^@^@^_6^T^@^@^_7^T^@^@^^8^T^@/80^T^@^@^OM-4^@^@^_8M-4^@^@^_8M-4^@^@^_8M-4^@^@^_8M-4^@^@^_8M-4^@^@^_8M- ^@^@^_9M- ^@^@^_9M- ^@^@^_9M- ^@^@^_9M- ^@^@^_9M- ^@^@^_9M-^L^@^@^^9@^A.81(^@/81(^@^@^^2M-4^@.82M-4^@.82^X^A/82P^@^A^N^X^A.82l^B.83M-p^@.83M-p^@/83x^@^@^_3M-\^@^@^_3M-^L^@^@^_3M-^L^@^@^_3M-^L^@^@^_4P^@^@^_4P^@^@^_4P^@^@^^6M-L^A/86M-\^@^@^_6M-\^@^@^O0^B^@^_8M-L^A^A^_9x^@^@^^9M-H^@.90M-L^A.90M-L^A.90M-4^@.90M-4^@.90M-4^@.90^X^A.90M- ^@/90M- ^@^@^_1M- ^@^@^_1M- ^@^@^_1M- ^@^@^_1M- ^@^@^_1M- ^@^@^OM-d^B^@/91M-^L^@^@^_3x^@^@^_3x^@^@^OM-(^B^@^_9M-(^B^@^_9M-(^B^@^T9M-(^B^_^@^A^@nP^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^
@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@


Well, this is a bummer. Here are the dataset properties.

Original dataset:
Code:
$ sudo zfs get all vol1/jacob.mcdonald@migration
NAME                           PROPERTY                VALUE                   SOURCE
vol1/jacob.mcdonald@migration  type                    snapshot                -
vol1/jacob.mcdonald@migration  creation                Mon Jan 31 16:13 2022   -
vol1/jacob.mcdonald@migration  used                    141K                    -
vol1/jacob.mcdonald@migration  referenced              1.92T                   -
vol1/jacob.mcdonald@migration  compressratio           1.01x                   -
vol1/jacob.mcdonald@migration  devices                 on                      default
vol1/jacob.mcdonald@migration  exec                    on                      inherited from vol1/jacob.mcdonald
vol1/jacob.mcdonald@migration  setuid                  on                      default
vol1/jacob.mcdonald@migration  createtxg               48421091                -
vol1/jacob.mcdonald@migration  xattr                   on                      default
vol1/jacob.mcdonald@migration  version                 5                       -
vol1/jacob.mcdonald@migration  utf8only                off                     -
vol1/jacob.mcdonald@migration  normalization           none                    -
vol1/jacob.mcdonald@migration  casesensitivity         sensitive               -
vol1/jacob.mcdonald@migration  nbmand                  off                     default
vol1/jacob.mcdonald@migration  guid                    16919260077066763466    -
vol1/jacob.mcdonald@migration  primarycache            all                     default
vol1/jacob.mcdonald@migration  secondarycache          all                     default
vol1/jacob.mcdonald@migration  defer_destroy           off                     -
vol1/jacob.mcdonald@migration  userrefs                0                       -
vol1/jacob.mcdonald@migration  objsetid                87817                   -
vol1/jacob.mcdonald@migration  mlslabel                none                    default
vol1/jacob.mcdonald@migration  refcompressratio        1.01x                   -
vol1/jacob.mcdonald@migration  written                 17.1M                   -
vol1/jacob.mcdonald@migration  logicalreferenced       1.94T                   -
vol1/jacob.mcdonald@migration  acltype                 nfsv4                   inherited from vol1/jacob.mcdonald
vol1/jacob.mcdonald@migration  context                 none                    default
vol1/jacob.mcdonald@migration  fscontext               none                    default
vol1/jacob.mcdonald@migration  defcontext              none                    default
vol1/jacob.mcdonald@migration  rootcontext             none                    default
vol1/jacob.mcdonald@migration  encryption              off                     default
vol1/jacob.mcdonald@migration  xattr_compat            linux                   default
vol1/jacob.mcdonald@migration  org.freebsd.ioc:active  yes                     inherited from vol1


Intermediate dataset:
Code:
$ sudo zfs get all big_scratch/migration/jacob.mcdonald@migration
NAME                                            PROPERTY               VALUE                  SOURCE
big_scratch/migration/jacob.mcdonald@migration  type                   snapshot               -
big_scratch/migration/jacob.mcdonald@migration  creation               Mon Jan 31 16:13 2022  -
big_scratch/migration/jacob.mcdonald@migration  used                   236K                   -
big_scratch/migration/jacob.mcdonald@migration  referenced             1.91T                  -
big_scratch/migration/jacob.mcdonald@migration  compressratio          1.02x                  -
big_scratch/migration/jacob.mcdonald@migration  devices                on                     default
big_scratch/migration/jacob.mcdonald@migration  exec                   on                     inherited from big_scratch/migration/jacob.mcdonald
big_scratch/migration/jacob.mcdonald@migration  setuid                 on                     default
big_scratch/migration/jacob.mcdonald@migration  createtxg              184273                 -
big_scratch/migration/jacob.mcdonald@migration  xattr                  sa                     inherited from big_scratch/migration
big_scratch/migration/jacob.mcdonald@migration  version                5                      -
big_scratch/migration/jacob.mcdonald@migration  utf8only               off                    -
big_scratch/migration/jacob.mcdonald@migration  normalization          none                   -
big_scratch/migration/jacob.mcdonald@migration  casesensitivity        sensitive              -
big_scratch/migration/jacob.mcdonald@migration  nbmand                 off                    default
big_scratch/migration/jacob.mcdonald@migration  guid                   16919260077066763466   -
big_scratch/migration/jacob.mcdonald@migration  primarycache           all                    default
big_scratch/migration/jacob.mcdonald@migration  secondarycache         all                    default
big_scratch/migration/jacob.mcdonald@migration  defer_destroy          off                    -
big_scratch/migration/jacob.mcdonald@migration  userrefs               0                      -
big_scratch/migration/jacob.mcdonald@migration  objsetid               37487                  -
big_scratch/migration/jacob.mcdonald@migration  mlslabel               none                   default
big_scratch/migration/jacob.mcdonald@migration  refcompressratio       1.02x                  -
big_scratch/migration/jacob.mcdonald@migration  written                1.91T                  -
big_scratch/migration/jacob.mcdonald@migration  logicalreferenced      1.94T                  -
big_scratch/migration/jacob.mcdonald@migration  acltype                nfsv4                  inherited from big_scratch/migration/jacob.mcdonald
big_scratch/migration/jacob.mcdonald@migration  context                none                   default
big_scratch/migration/jacob.mcdonald@migration  fscontext              none                   default
big_scratch/migration/jacob.mcdonald@migration  defcontext             none                   default
big_scratch/migration/jacob.mcdonald@migration  rootcontext            none                   default
big_scratch/migration/jacob.mcdonald@migration  encryption             off                    default
big_scratch/migration/jacob.mcdonald@migration  xattr_compat           linux                  default
big_scratch/migration/jacob.mcdonald@migration  org.truenas:managedby  172.16.42.46           inherited from big_scratch/migration


Destination dataset:
Code:
$ sudo zfs get all vol1/jacob.mcdonald_migrated@migration
NAME                                    PROPERTY                VALUE                         SOURCE
vol1/jacob.mcdonald_migrated@migration  type                    snapshot                      -
vol1/jacob.mcdonald_migrated@migration  creation                Mon Jan 31 16:13 2022         -
vol1/jacob.mcdonald_migrated@migration  used                    0B                            -
vol1/jacob.mcdonald_migrated@migration  referenced              1.91T                         -
vol1/jacob.mcdonald_migrated@migration  compressratio           1.02x                         -
vol1/jacob.mcdonald_migrated@migration  devices                 on                            default
vol1/jacob.mcdonald_migrated@migration  exec                    on                            inherited from vol1/jacob.mcdonald_migrated
vol1/jacob.mcdonald_migrated@migration  setuid                  on                            default
vol1/jacob.mcdonald_migrated@migration  createtxg               48427504                      -
vol1/jacob.mcdonald_migrated@migration  xattr                   on                            default
vol1/jacob.mcdonald_migrated@migration  version                 5                             -
vol1/jacob.mcdonald_migrated@migration  utf8only                off                           -
vol1/jacob.mcdonald_migrated@migration  normalization           none                          -
vol1/jacob.mcdonald_migrated@migration  casesensitivity         sensitive                     -
vol1/jacob.mcdonald_migrated@migration  nbmand                  off                           default
vol1/jacob.mcdonald_migrated@migration  guid                    16919260077066763466          -
vol1/jacob.mcdonald_migrated@migration  primarycache            all                           default
vol1/jacob.mcdonald_migrated@migration  secondarycache          all                           default
vol1/jacob.mcdonald_migrated@migration  defer_destroy           off                           -
vol1/jacob.mcdonald_migrated@migration  userrefs                0                             -
vol1/jacob.mcdonald_migrated@migration  objsetid                93995                         -
vol1/jacob.mcdonald_migrated@migration  mlslabel                none                          default
vol1/jacob.mcdonald_migrated@migration  refcompressratio        1.02x                         -
vol1/jacob.mcdonald_migrated@migration  written                 1.91T                         -
vol1/jacob.mcdonald_migrated@migration  logicalreferenced       1.94T                         -
vol1/jacob.mcdonald_migrated@migration  acltype                 nfsv4                         inherited from vol1/jacob.mcdonald_migrated
vol1/jacob.mcdonald_migrated@migration  context                 none                          default
vol1/jacob.mcdonald_migrated@migration  fscontext               none                          default
vol1/jacob.mcdonald_migrated@migration  defcontext              none                          default
vol1/jacob.mcdonald_migrated@migration  rootcontext             none                          default
vol1/jacob.mcdonald_migrated@migration  encryption              aes-256-gcm                   -
vol1/jacob.mcdonald_migrated@migration  encryptionroot          vol1/jacob.mcdonald_migrated  -
vol1/jacob.mcdonald_migrated@migration  keystatus               available                     -
vol1/jacob.mcdonald_migrated@migration  xattr_compat            linux                         default
vol1/jacob.mcdonald_migrated@migration  org.freebsd.ioc:active  yes                           inherited from vol1


I guess I'll delete the destination dataset and try again, perhaps using lz4 instead of zstd in the destination? But I'm using zstd in the intermediate dataset, and the original dataset is a mix of lz4 and zstd. Perhaps it's a bad interaction between encryption and zstd? I'm grasping at straws here.

If that still fails, I can try again using no encryption on the destination dataset, but that really negates the whole point of this exercise.

Update: I deleted the destination (encrypted) dataset. I'm starting replication again. This time I changed the parent dataset from zstd to lz4, and disabled `Full Filesystem Replication`. The `Include Dataset Properties` is still enabled. Running now... it will be several hours.
 

AlexGG

Contributor
Joined
Dec 13, 2018
Messages
171
If that fails, you need to try again with no encryption on the target. While that negates the practical usefulness, it provides an important data point for figuring out what's going on.

If the ZFS checksums match, it means ZFS wrote these broken files and considered them correct at the time of writing. The data is checksummed after it is compressed and encrypted, and during a write there is no round-trip check to ensure ZFS can decrypt the encrypted data back to the original. So ZFS will accept the data, compress it, encrypt it, compute the checksum over the encrypted data, and write it out to disk. A matching checksum only guarantees that the encrypted data read back is the same as what was written; it does not guarantee that the original data was encrypted correctly.

Now, if the CPU has hardware AES extensions (AES-NI), which it probably does, and the CPU is overheating or simply defective, that would explain the behavior.

So another thing to try would be to disable hardware AES and see if that solves the problem, but I don't know how to do it.
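For what it's worth, I believe OpenZFS on Linux (which SCALE uses) exposes the crypto implementation selection as ICP module parameters; worth verifying against your build before relying on it:

```shell
# See which AES/GCM implementations the OpenZFS ICP is currently using
# (requires the zfs/icp kernel module to be loaded):
cat /sys/module/icp/parameters/icp_aes_impl
cat /sys/module/icp/parameters/icp_gcm_impl

# Force the generic C implementations, bypassing AES-NI/PCLMULQDQ:
echo generic | sudo tee /sys/module/icp/parameters/icp_aes_impl
echo generic | sudo tee /sys/module/icp/parameters/icp_gcm_impl
```

If those parameters behave the same on SCALE, switching them to `generic` should route encryption through the plain software path instead of the CPU's AES instructions.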
 

yottabit

Contributor
Joined
Apr 15, 2012
Messages
192
Agreed on the test methodology.

A few more details about the server:
  • vdev1 drives in operation for ~10 years
  • vdev2 drives in operation for ~4 years
  • Server in operation for ~5 years
  • Chassis: Supermicro 5U w/ redundant PSU and high cooling airflow
  • Motherboard: Supermicro
  • 2xCPU: Intel(R) Xeon(R) CPU E5-2670 0 @ 2.60GHz (AES is listed as a processor flag feature)
  • RAM: 64 GB ECC
  • Pool origination: FreeNAS 9? (FreeBSD)
  • Current OS: TrueNAS-SCALE-22.02-RC.2 (Linux)
Replication task from the intermediate pool dataset (unencrypted) to the destination pool dataset (encrypted) is complete.

I am starting md5sum checks now. I have already seen corruption of the .cshrc file in the home directory: it has been corrupted from text to binary.

File contents on the intermediate dataset:
Code:
[yottabit@nas1 ~]$ cat -nv /mnt/big_scratch/migration/jacob.mcdonald/.cshrc 
     1  # $FreeBSD$
     2  #
     3  # .cshrc - csh resource script, read at beginning of execution by each shell
     4  #
     5  # see also csh(1), environ(7).
     6  # more examples available at /usr/share/examples/csh/
     7  #
     8
     9  alias h         history 25
    10  alias j         jobs -l
    11  alias la        ls -aF
    12  alias lf        ls -FA
    13  alias ll        ls -lAF
    14
    15  # These are normally set through /etc/login.conf.  You may override them here
    16  # if wanted.
    17  # set path = (/sbin /bin /usr/sbin /usr/bin /usr/local/sbin /usr/local/bin $HOME/bin)
    18  # setenv        BLOCKSIZE       K
    19  # A righteous umask
    20  # umask 22
    21
    22  setenv  EDITOR  vi
    23  setenv  PAGER   more
    24
    25  if ($?prompt) then
    26          # An interactive shell -- set some stuff up
    27          set prompt = "%N@%m:%~ %# "
    28          set promptchars = "%#"
    29
    30          set filec
    31          set history = 1000
    32          set savehist = (1000 merge)
    33          set autolist = ambiguous
    34          # Use history to aid expansion
    35          set autoexpand
    36          set autorehash
    37          set mail = (/var/mail/$USER)
    38          if ( $?tcsh ) then
    39                  bindkey "^W" backward-delete-word
    40                  bindkey -k up history-search-backward
    41                  bindkey -k down history-search-forward
    42          endif
    43
    44  endif
    45
    46  if ( $?tcsh ) then
    47          bindkey "^W" backward-delete-word
    48          bindkey -k up history-search-backward
    49          bindkey -k down history-search-forward
    50          bindkey '\e[H'    beginning-of-line      # home
    51          bindkey '\e[F'    end-of-line                # end
    52          bindkey '\e[3~'   delete-char             # delete
    53          bindkey '\e[1;5C' forward-word        # ctrl right
    54          bindkey '\e[1;5D' backward-word     # ctrl left
    55          bindkey '\e[1~'   beginning-of-line    # home
    56          bindkey '\e[4~'   end-of-line              # end
    57      endif
    58


File contents on the destination dataset:
Code:
[yottabit@nas1 ~]$ cat -nv /mnt/vol1/jacob.mcdonald_migrated/.cshrc 
     1  ^@^@^CM-jM-qK# $FreeBSD$
     2  #
     3  # .cshrc - csh resource script, read at beginning of execution by each shellO^@M-^@see alsoO^@M-p^[(1), environ(7).
     4  # more examples availablec^@M-4/usr/share/!^@M-s^M/csh/
     5  #
     6
     7  alias h         history 25^T^@M-#j              jobs -l^Q^@M-^Tla       ls -aF^P^@^Qf^P^@$FA^P^@^Ql^P^@M-pOlAF
     8
     9  # These are normally set through /etc/login.conf.  You may override them here
    10  # if wantedM-k^@^@H^@M-q^@path = (/sbin /^E^@^AM-g^@^E
    11  ^@^E    ^@Wlocal^Y^@^B^P^@^@^_^@P$HOME<^@^R)V^@M-r^Venv BLOCKSIZE       K
    12  # A righteous umask
    13  #^H^@S 22
    14
    15  3^@M-^TEDITOR   vi^Q^@`PAGER    M-^R^AM-p^@
    16
    17  if ($?prompt)M-T^@M-r^Dn
    18          # An interactiveM-f^A1 --#^AM-q^@some stuff up
    19          M-n^@^A?^@M-x^B = "%N@%m:%~ %# "^]^@Qchars"^@2#"
    20  ^Y^@Rfilec^K^@^DM-^^Ab= 1000^T^@@save^X^@^@G^A^@^V^@r merge)^]^@Rautol^]^@M-^PambiguousM-9^@@Use 5^@^@+^BM-v^Ato aid expansion:^@^A^S^@^Vd^P^@brehash^P^@@mails^@P/var/^M^@M-^P/$USER)
    21          +^A@ $?t0^C^D+^AM-w^S   bindkey "^W" backward-delete-word$^@T-k upM-^_^@M-^D-search-4^@
    22  (^@Ldown*^@0for]^@M-^S
    23          endif
    24
    25  ^G^@^OM-^W^@^@^S ^A^@^OM-^]^@^O^C)^@^E*^@^OM-#^@^K^O.^@^@^OM-)^@        ^L/^@`'\e[H'^V^@^EV^DM-^@-of-line^U^@M-^P  # home
    26  ^K^@^@^B^@^DM-^U^@^@8^@^QF8^@:end2^@^F^B^@O# ens^@^C 3~<^@^CM-^S^A^@M-^H^B^F5^@^A8^@^B^Z^@^Ov^@^Bc1;5C' y^A^AM-O^A^A4^@^A7^@Bctrl^?^C^O;^@^E&D'^R^B^@^K^B^H9^@?lef8^@^D^AM-.^@^O!^A^B^O^_^A^H^Q46^@^O^_^A^F^F^]^A^C5^B^_^@^A^@M-^?M-[P^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@


Le sigh. I am going to replicate again, this time without an encrypted dataset destination. If that is successful, I will figure out how to disable CPU AES and replicate again into an encrypted dataset.

This is concerning.
 

yottabit

Contributor
Joined
Apr 15, 2012
Messages
192
Oops, the destination was still compressed with zstd, even though the parent dataset was set to lz4. I guess this is because `Include dataset properties` was enabled in the replication task? If I disable that option, I am presented with error `[EINVAL] replication_update.properties_override: null not allowed`. Well, what if I don't want to override properties, but just want the defaults? I'm not sure what to enter here.

Edit: the `Properties override` and `Properties exclude` fields are hidden when deselecting `Include dataset properties` anyway, so this seems like a UI bug.

Edit2: I added `compression` to the `Properties exclude` and am starting the replication again.

Edit3: well, adding anything to `Properties exclude` or `Properties override` doesn't save correctly. When opening the task again, they're blank.
 

AlexGG

Contributor
Joined
Dec 13, 2018
Messages
171
Can you provide hex dumps for the file comparison, rather than text? It would be interesting to see the alignment and size of the areas that differ, and whether there is any clear period or pattern to the corruption.
 

yottabit

Contributor
Joined
Apr 15, 2012
Messages
192
Yes, can do, but I have already deleted the destination dataset again; I'll provide that when I have a corrupted file again. First I need to figure out how to start a replication that does not include the source compression property, as the UI form seems broken. I may just `zfs send | zfs recv`, but I have had problems in the past where the mountpoint gets all jacked up doing that. I guess I can just exclude the mountpoint property or set it to legacy.
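If I go the manual route, something like this might sidestep both problems. A sketch using the dataset names from this thread; `recv -x`/`-o` need a reasonably current OpenZFS, and an encrypted destination would still need its encryption properties set (or an encrypted parent):

```shell
# Send the snapshot, let the destination inherit compression from its
# parent instead of carrying over the source's zstd property, and set
# mountpoint=legacy to avoid the mangled-mountpoint problem.
sudo zfs send -v big_scratch/migration/jacob.mcdonald@migration | \
  sudo zfs recv -x compression -o mountpoint=legacy \
    vol1/jacob.mcdonald_migrated
```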

Edit: it looks like the `Properties exclude` form is just a UI bug. If I write `compression` there, it works and the destination inherits the parent compression. But opening the replication task form again shows a blank field for `Properties exclude`. Replication is underway again, this time to an lz4 compressed destination. See you again in a few hours. :smile:
 

AlexGG

Contributor
Joined
Dec 13, 2018
Messages
171
Maybe it is a compression problem, actually.

# $FreeBSD$
#
#
.cshrc - csh resource script, read at beginning of execution by each shell
#
#
see also csh(1), environ(7).
# more examples available at /usr/share/examples/csh/
#

alias h history 25
alias j jobs -l
alias la ls -aF


So if you look at the damage pattern, the repeated words are replaced with something.
Above, I took the original text and highlighted damaged parts in bold. And first entries for each word in underline.
You will see the first instance of a word is okay, and subsequent instances are kaput.
That's what a compressed file looks like inside.
As if the system does not decompress data when reading.
I bet if you run LZ4 (or whatever compression was) unpacker on the file, it will produce the original text.
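The behavior described here is the classic LZ77-family trick that LZ4 belongs to: literals are stored verbatim, and repeats become back-references, so reading a compressed block as plain text shows the first occurrence of each phrase intact and garbage where the repeats were. Here is a toy sketch of that idea (illustration only, not ZFS's actual lz4 format):

```python
# Toy LZ77-style codec, for illustration only (not ZFS's real lz4 format).
# Literals are stored verbatim; repeats of 4+ bytes become a 3-byte
# back-reference token: 0xFF escape, offset, length.
# (Assumes 0xFF never appears as a literal in the input.)

def compress(data: bytes, min_match: int = 4) -> bytes:
    out = bytearray()
    i = 0
    while i < len(data):
        best_len, best_off = 0, 0
        for j in range(max(0, i - 255), i):  # naive longest-match search
            k = 0
            while i + k < len(data) and k < 255 and data[j + k] == data[i + k]:
                k += 1
            if k > best_len:
                best_len, best_off = k, i - j
        if best_len >= min_match:
            out += bytes([0xFF, best_off, best_len])
            i += best_len
        else:
            out.append(data[i])
            i += 1
    return bytes(out)

def decompress(blob: bytes) -> bytes:
    out = bytearray()
    i = 0
    while i < len(blob):
        if blob[i] == 0xFF:
            off, length = blob[i + 1], blob[i + 2]
            for _ in range(length):  # byte-by-byte copy handles overlaps
                out.append(out[-off])
            i += 3
        else:
            out.append(blob[i])
            i += 1
    return bytes(out)

text = b"alias h history 25\nalias j jobs -l\nalias la ls -aF\n"
packed = compress(text)
# The first "alias " survives as a literal; later repeats are tokens,
# which is why a still-compressed file is partially readable as text.
```

Running `decompress()` on the packed bytes restores the original text; skipping that step, which is what the hypothesis says ZFS is doing on read, leaves exactly this kind of partially readable damage.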
 

yottabit

Contributor
Joined
Apr 15, 2012
Messages
192
Very interesting hypothesis. Here are the compression states throughout the pipeline:
  • Source file was lz4, with parent recently changed to zstd (but not much data written) <-- File still readable
  • Intermediate dataset should be zstd <-- File still readable
  • Destination dataset should be zstd <-- File unreadable
I believe I was sending all properties, but not recursively, so the dataset was not being replicated block-for-block; only the data was sent and integrated into a new dataset, iirc. Therefore the problem could be in reading/decompressing from zstd or writing/compressing to zstd, whereas the original path was reading/decompressing from lz4 and writing/compressing to zstd.

Replication is about 25% finished, and it looks like the destination dataset is not mounted until replication completes, so I can't check early whether the .cshrc file is still corrupt.
 
Joined
Oct 22, 2019
Messages
3,641
I'm finding this deeply troubling. Under no circumstance (regardless of compression method, encryption, etc), should data be altered in a replication.

The snapshot at the destination should represent exactly that: a pure replica of the filesystem and data at that point in time. Every single file, every bit of metadata (attributes, timestamps, filenames, etc) should be an exact match to the original source.

The only thing that encryption and compression affect is the record itself, at rest. But when the record(s) are loaded into RAM to read the data (decompressed and/or decrypted), this data should 100% match the original data on the source.

Could this possibly be a rare bug in SCALE and/or upstream OpenZFS?
 

AlexGG

Contributor
Joined
Dec 13, 2018
Messages
171
You can't have a pure replica of the source if you ask to change the compression, so your target has different attributes than the source, and the data must be recompressed to effect that change, which in turn changes a significant amount of metadata in the block pointers. However, I'm pretty sure we're looking at a bug. Now the only task remaining is to localize it better and then submit a bug report.
 
Joined
Oct 22, 2019
Messages
3,641
I bet if you run LZ4 (or whatever compression was) unpacker on the file, it will produce the original text.
This would assume it's a problem with decompressing, while everything before it (replicating, compressing, writing) is working fine.

If this was an external hard drive, it would be interesting to see if the issue persists on TrueNAS CORE or even a Linux distro with the latest version of ZFS. Perhaps there's a decompression bug lurking in SCALE or in the version of ZFS it ships with?
 
Joined
Oct 22, 2019
Messages
3,641
You can't have a pure replica of the system if you ask to change the compression. So your target has different attributes than the source.
I'm referring to "replica" in non-ZFS language. "Replica" as in casual English.

Because everything should end up identical on the destination:
  • filenames
  • permissions
  • attributes / extended attributes
  • data (actual file content)

Yes, it's "different" when it is "at rest", but upon reading it into RAM, it should be 100% identical to how it exists on the original source. It should never read a different timestamp, or permissions, or extended attributes, or data of the file itself, etc.

If the file important_doc.txt has a modification timestamp of 2022-02-01-08-15-00 on the source, then after a replication to the destination, it should have that exact same modified timestamp.
 

AlexGG

Contributor
Joined
Dec 13, 2018
Messages
171
This would assume it's a problem with decompressing

It is not a problem with decompression per se; it is a problem that the data is compressed, but the block pointer is marked as plain. Therefore, ZFS reads the data from disk, verifies the checksum (which is correct), and then, instead of decompressing, returns the compressed data as-is. That is the data I think we are looking at.

Now you need to localize it. Do we need encryption for this to happen? Do we need ZSTD? From a hex dump we will be able to tell whether it is a ZSTD- or LZ4-compressed block: LZ4 has a 4-byte compressed length in front of it, and ZSTD has 8 bytes (4 bytes compressed length, 4 bytes algorithm version).
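Once a hex dump is available, the two prefixes can be told apart with a few lines of Python. This sketch assumes big-endian fields laid out exactly as described above (4-byte compressed length for LZ4; 4-byte length plus 4-byte version for ZSTD), which is worth double-checking against the OpenZFS source before relying on it:

```python
import struct

def inspect_prefix(blob: bytes) -> dict:
    """Interpret the leading bytes of a suspect block both ways.

    Assumes big-endian fields per the layout described above;
    verify against the OpenZFS on-disk format before trusting it.
    """
    (lz4_len,) = struct.unpack(">I", blob[:4])
    zstd_len, zstd_version = struct.unpack(">II", blob[:8])
    return {
        "as_lz4_compressed_len": lz4_len,
        "as_zstd_compressed_len": zstd_len,
        "as_zstd_version_field": zstd_version,
    }

# A candidate length somewhat below the dataset recordsize (e.g. under
# 128 KiB) is the plausible reading; the wrong interpretation will
# usually produce an absurd value.
```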

I am still interested, does it affect the same set of files on each write or a different set of files?

Also, by the next day, I suppose someone from iX will be here, because this is a significant issue if it is not some weird one-off.
 

yottabit

Contributor
Joined
Apr 15, 2012
Messages
192
This wouldn't be the first ZFS bug I have encountered.

The first was the livelist corruption bug, which caused FreeBSD (on CORE) to kernel panic when attempting to mount my zpool after cloning a deduped zvol for VM use. I was able to mount the pool in Ubuntu, so I took a chance and upgraded to SCALE pretty early to get access to my data again. I worked with the ZFS team, and a couple of others experiencing the same bug, on Slack to submit a bunch of data. They assured me my data was safe and that I wouldn't need to migrate my pool; the bug was patched and the fix landed in SCALE, but I still suffered from spa errors in my logs for a long time. It didn't affect my data at rest, but it made mounting the zpool take 15+ minutes. The ZFS team eventually stopped responding to my further inquiries.

I'll provide the hex dump as soon as replication finishes, assuming it's reproduced again. (If it isn't reproduced, I'll replicate again with the zstd destination and provide the hex dumps either way.)

I'm really glad I did the md5sum verification now, rather than sending my dataset off to the ether, pulling it back after refactoring, and only then discovering my data was trashed. I always say: Trust. But verify.
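For the "verify" half, a recursive two-tree comparison can be sketched in a few lines (hypothetical helper names; substitute your own source and replica mountpoints):

```python
import hashlib
import os

def file_md5(path: str) -> str:
    """MD5 of a file, read in 1 MiB chunks to keep memory flat."""
    h = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()

def compare_trees(src: str, dst: str):
    """Yield (relative_path, reason) for every file that differs."""
    for root, _dirs, files in os.walk(src):
        for name in files:
            rel = os.path.relpath(os.path.join(root, name), src)
            a, b = os.path.join(src, rel), os.path.join(dst, rel)
            if not os.path.exists(b):
                yield rel, "missing on destination"
            elif file_md5(a) != file_md5(b):
                yield rel, "checksum mismatch"

# Example usage (adjust paths to your mountpoints):
# for rel, reason in compare_trees("/mnt/vol1/jacob.mcdonald",
#                                  "/mnt/vol1/jacob.mcdonald_migrated"):
#     print(reason, rel)
```

Like the manual md5sum pass above, this only proves file content and names match; timestamps, permissions, and extended attributes would need a separate check.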
 

yottabit

Contributor
Joined
Apr 15, 2012
Messages
192
I am still interested, does it affect the same set of files on each write or a different set of files?
I didn't exhaustively check, but it definitely was mostly (could have been 100%, but I didn't check) the same set of files! I wonder if it's being triggered by a specific pattern.
 
Top