Corrupt files; where?

ppmax

Contributor
Joined
May 16, 2012
Messages
111
I have 2 corrupt files in my pool:
Code:
errors: Permanent errors have been detected in the following files:

        volume1/.system/syslog-097c772a7a1e40b5a1773cbed6329f54@manual-2020-02-25_05-01:/log/mdnsresponder.log
        volume1/.system/syslog-097c772a7a1e40b5a1773cbed6329f54@manual-2020-02-25_05-01:/log/middlewared.log.2


volume1 is the name of my pool; where the heck are these files? I can't seem to find these anywhere...and all I want to do is delete them ;)

Any tips?

thx
PP
 

Jailer

Not strong, but bad
Joined
Sep 12, 2014
Messages
4,977

sretalla

Powered by Neutrality
Moderator
Joined
Jan 1, 2016
Messages
9,700
Although, to be clear... the corrupted files listed are actually in snapshots, so not the ones you would find under /var/log...

I would destroy the snapshot "volume1/.system/syslog-097c772a7a1e40b5a1773cbed6329f54@manual-2020-02-25_05-01" if you can and the errors should go.
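If you'd rather do that from a shell than the GUI, something like this should work (a rough sketch; the snapshot name comes from your error output, so double-check it and dry-run first):
Code:
# dry run (-n) with verbose output (-v) to see what would be destroyed
zfs destroy -n -v volume1/.system/syslog-097c772a7a1e40b5a1773cbed6329f54@manual-2020-02-25_05-01

# if that looks right, destroy it for real
zfs destroy -v volume1/.system/syslog-097c772a7a1e40b5a1773cbed6329f54@manual-2020-02-25_05-01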

Look into why you might have experienced corruption in the first place though...
 

ppmax

Contributor
Joined
May 16, 2012
Messages
111
Thanks much @sretalla...I totally forgot about snapshots and have deleted those.

Re the corruption: I just replaced/resilvered a failing drive so hopefully I'm all good now.

Thanks again!
 

sretalla

Powered by Neutrality
Moderator
Joined
Jan 1, 2016
Messages
9,700
A failed drive can't cause corruption in a properly designed pool. I see you're using RAIDZ2 where that would be true, so I would point to other components letting you down (and somehow not being highlighted to you by the system, so again, take care).

Whatever happened, it's potentially still there waiting to kill something more important than a snapshot.
 

ppmax

Contributor
Joined
May 16, 2012
Messages
111
A failed drive can't cause corruption in a properly designed pool. I see you're using RAIDZ2 where that would be true, so I would point to other components letting you down (and somehow not being highlighted to you by the system, so again, take care).

Whatever happened, it's potentially still there waiting to kill something more important than a snapshot.
I should have been more thorough in my response above:
I had a home built box that suffered a cascade of failures and eventually died. I rescued the drives and stuffed them into a used Dell T110 II that I bought. Then a drive died. Hopefully I'm beyond all the stuff that caused this situation.

As mentioned, I just replaced and resilvered the drive and I see those two errors. Per your suggestion I deleted the 2 snapshots, then initiated a scrub.

The scrub just finished and I still see those two errors despite nuking the snapshots. Any ideas why that may be the case? FWIW, the mdnsresponder.log and middlewared.log.2 files do not exist in /var/log.
 

Arwen

MVP
Joined
May 17, 2014
Messages
3,611
You need to zpool clear volume1 before the errors go away completely. Then you run the scrub, which should not find any new errors if your hardware is good.
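Roughly, from a shell (a sketch; substitute your pool name if it differs):
Code:
zpool clear volume1        # drop the logged error records
zpool scrub volume1        # start a fresh scrub
zpool status -v volume1    # check progress; -v lists any files still flagged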
 

ppmax

Contributor
Joined
May 16, 2012
Messages
111
You need to zpool clear volume1 before the errors go away completely. Then you run the scrub, which should not find any new errors if your hardware is good.
Thanks @Arwen, it's been a while since I've dealt with pool-related issues and completely forgot that I need to clear the errors. Where is the slaps head emoji?
 

ppmax

Contributor
Joined
May 16, 2012
Messages
111
OK, so I ran zpool clear volume1, then scrubbed. 2 "new" errors were identified (these are the same errors from up thread):
Code:
  pool: volume1
 state: ONLINE
status: One or more devices has experienced an error resulting in data
    corruption.  Applications may be affected.
action: Restore the file in question if possible.  Otherwise restore the
    entire pool from backup.
   see: http://illumos.org/msg/ZFS-8000-8A
  scan: scrub repaired 0 in 0 days 01:51:17 with 2 errors on Wed Dec 14 14:27:45 2022
config:

    NAME                                            STATE     READ WRITE CKSUM
    volume1                                         ONLINE       0     0     2
      raidz2-0                                      ONLINE       0     0     4
        gptid/5dd66899-aabe-11e1-90d1-6805ca067062  ONLINE       0     0     0
        gptid/86e2281d-7a50-11ed-9e5f-d067e5eda5bd  ONLINE       0     0     0  block size: 512B configured, 4096B native
        gptid/5ea4a2db-aabe-11e1-90d1-6805ca067062  ONLINE       0     0     0
        gptid/5f14865b-aabe-11e1-90d1-6805ca067062  ONLINE       0     0     0

errors: Permanent errors have been detected in the following files:

        volume1/.system/syslog-097c772a7a1e40b5a1773cbed6329f54@manual-2020-02-25_05-01:/log/mdnsresponder.log
        volume1/.system/syslog-097c772a7a1e40b5a1773cbed6329f54@manual-2020-02-25_05-01:/log/middlewared.log.2


I thought those files were part of a couple of snapshots I had lying around, so I nuked the snapshots and rinsed/repeated. Same errors present.

So where are these things hiding? I ran zfs list and I think I can see where:
Code:
volume1                                                    986G   805G  41.4M  /mnt/volume1
volume1/.system                                           3.78G   805G  31.4K  legacy
volume1/.system/configs-097c772a7a1e40b5a1773cbed6329f54  1.01G   805G  1.01G  legacy
volume1/.system/configs-0aeee2e60241454fb5b4c63b115661e1  12.4M   805G  12.4M  legacy
volume1/.system/cores                                     2.23G   805G  2.21G  legacy
volume1/.system/perftest                                  48.6K   805G  47.1K  legacy
volume1/.system/rrd-097c772a7a1e40b5a1773cbed6329f54       286M   805G   187M  legacy
volume1/.system/rrd-0aeee2e60241454fb5b4c63b115661e1       123M   805G   123M  legacy
volume1/.system/rrd-5aff9b55f6744f32844e671d651f6466      44.8K   805G  44.8K  legacy
volume1/.system/samba4                                    8.49M   805G  4.61M  legacy
volume1/.system/syslog-097c772a7a1e40b5a1773cbed6329f54    118M   805G  72.5M  legacy
volume1/.system/syslog-0aeee2e60241454fb5b4c63b115661e1   1.75M   805G  1.75M  legacy
volume1/.system/syslog-5aff9b55f6744f32844e671d651f6466   1.46M   805G  1.46M  legacy
volume1/.system/webui                                     50.8K   805G  31.4K  legacy
volume1/backups                                            723G   805G   723G  /mnt/volume1/backups
volume1/iocage                                            4.21G   805G  10.1M  /mnt/volume1/iocage
volume1/iocage/download                                    559M   805G  31.4K  /mnt/volume1/iocage/download
volume1/iocage/download/11.2-RELEASE                       271M   805G   271M  /mnt/volume1/iocage/download/11.2-RELEASE
volume1/iocage/download/11.3-RELEASE                       288M   805G   288M  /mnt/volume1/iocage/download/11.3-RELEASE
volume1/iocage/images                                     31.4K   805G  31.4K  /mnt/volume1/iocage/images
volume1/iocage/jails                                      2.48G   805G  31.4K  /mnt/volume1/iocage/jails
volume1/iocage/jails/FreeNAS-Plex                         1.52G   805G   131K  /mnt/volume1/iocage/jails/FreeNAS-Plex
volume1/iocage/jails/FreeNAS-Plex/root                    1.52G   805G  1.52G  /mnt/volume1/iocage/jails/FreeNAS-Plex/root
volume1/iocage/jails/channels-dvr                          982M   805G  99.4K  /mnt/volume1/iocage/jails/channels-dvr
volume1/iocage/jails/channels-dvr/root                     981M   805G   981M  /mnt/volume1/iocage/jails/channels-dvr/root
volume1/iocage/log                                        37.4K   805G  37.4K  /mnt/volume1/iocage/log
volume1/iocage/releases                                   1.17G   805G  31.4K  /mnt/volume1/iocage/releases
volume1/iocage/releases/11.2-RELEASE                       890M   805G  31.4K  /mnt/volume1/iocage/releases/11.2-RELEASE
volume1/iocage/releases/11.2-RELEASE/root                  890M   805G   890M  /mnt/volume1/iocage/releases/11.2-RELEASE/root
volume1/iocage/releases/11.3-RELEASE                       309M   805G  31.4K  /mnt/volume1/iocage/releases/11.3-RELEASE
volume1/iocage/releases/11.3-RELEASE/root                  309M   805G   309M  /mnt/volume1/iocage/releases/11.3-RELEASE/root
volume1/iocage/templates                                  31.4K   805G  31.4K  /mnt/volume1/iocage/templates
volume1/media                                              255G   805G   255G  /mnt/volume1/media


Are these datasets safe to destroy with zfs destroy -r .system...or should I only target .system/syslog-*?
Code:
volume1/.system                                           3.78G   805G  31.4K  legacy
volume1/.system/configs-097c772a7a1e40b5a1773cbed6329f54  1.01G   805G  1.01G  legacy
volume1/.system/configs-0aeee2e60241454fb5b4c63b115661e1  12.4M   805G  12.4M  legacy
volume1/.system/cores                                     2.23G   805G  2.21G  legacy
volume1/.system/perftest                                  48.6K   805G  47.1K  legacy
volume1/.system/rrd-097c772a7a1e40b5a1773cbed6329f54       286M   805G   187M  legacy
volume1/.system/rrd-0aeee2e60241454fb5b4c63b115661e1       123M   805G   123M  legacy
volume1/.system/rrd-5aff9b55f6744f32844e671d651f6466      44.8K   805G  44.8K  legacy
volume1/.system/samba4                                    8.49M   805G  4.61M  legacy
volume1/.system/syslog-097c772a7a1e40b5a1773cbed6329f54    118M   805G  72.5M  legacy
volume1/.system/syslog-0aeee2e60241454fb5b4c63b115661e1   1.75M   805G  1.75M  legacy
volume1/.system/syslog-5aff9b55f6744f32844e671d651f6466   1.46M   805G  1.46M  legacy
volume1/.system/webui                                     50.8K   805G  31.4K  legacy
 
Joined
Oct 22, 2019
Messages
3,641
Are these datasets safe to destroy with zfs destroy -r .system...or should I only target .system/syslog-*?
DO NOT destroy the hidden .system dataset, nor any of its children.
 

ppmax

Contributor
Joined
May 16, 2012
Messages
111
DO NOT destroy the hidden .system dataset, nor any of its children.
Thanks for the quick reply--much appreciated.

These are all marked as 'legacy'; I thought these were essentially ignored by FreeNAS/TrueNAS?

In any case, I see that both of the corrupt files live in volume1/.system/syslog-097c772a7a1e40b5a1773cbed6329f54

How can I delete the two corrupt items in that location? I can't cd into the .system dir...
 

ppmax

Contributor
Joined
May 16, 2012
Messages
111
Ok, dunce cap removed. I'm in
/var/db/system/syslog-097c772a7a1e40b5a1773cbed6329f54

...but this directory is empty, i.e. no trace of the two corrupt files:
Code:
volume1/.system/syslog-097c772a7a1e40b5a1773cbed6329f54@manual-2020-02-25_05-01:/log/mdnsresponder.log
volume1/.system/syslog-097c772a7a1e40b5a1773cbed6329f54@manual-2020-02-25_05-01:/log/middlewared.log.2


It appears the @manual is a reference to a snapshot I created from the UI; this snapshot has been deleted.

Is there any method to remove/repair something on the filesystem such that these errors don't appear when I scrub the pool?
 
Joined
Oct 22, 2019
Messages
3,641
Directories and dataset hierarchies are different. Although they can overlap (for the sake of convenience), they are not the same.

The dataset volume1/.system/syslog-097c772a7a1e40b5a1773cbed6329f54 is likely mounted along the /var/db/system/ directory path.

You can confirm with:
zfs mount | grep syslog-097c772a7a1e40b5a1773cbed6329f54
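Plain old mount | grep syslog-097c772a7a1e40b5a1773cbed6329f54 should show the same mapping; the 'legacy' mountpoint you see in zfs list just means the mount is handled outside of the ZFS mountpoint property, not that the dataset is unused.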

EDIT: Just saw that you followed up with another reply seconds before I sent this post. :wink:
 
Joined
Oct 22, 2019
Messages
3,641
...but this directory is empty, i.e. no trace of the two corrupt files:
Can you browse into the secret folder .zfs/snapshot?

cd /var/db/system/syslog-097c772a7a1e40b5a1773cbed6329f54/.zfs/snapshot

Do you see any folders within?
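Side note: with the default snapdir=hidden property the .zfs folder won't appear in an ls listing, but you can still cd into it. You can check the property with zfs get snapdir volume1/.system/syslog-097c772a7a1e40b5a1773cbed6329f54 if in doubt.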
 

ppmax

Contributor
Joined
May 16, 2012
Messages
111
I was not aware there were secret folders ;)

Unfortunately I cannot cd to
Code:
cd /var/db/system/syslog-097c772a7a1e40b5a1773cbed6329f54/.zfs/snapshot


or to the parent .zfs directory; both give:

no such file or directory
 
Joined
Oct 22, 2019
Messages
3,641
I think I know what's going on.

And honestly, this is yet another failure of design/testing of TrueNAS. :frown:

You created a manual (recursive) snapshot of your top-level root dataset. This creates snapshots for all of its child datasets, including .system, which does NOT show up in the GUI.

Hence, when you try to delete the "@manual-2020-02-25_05-01" snapshots using the TrueNAS GUI, it doesn't affect the snapshots of the secret .system dataset.
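For reference, a recursive snapshot from the CLI looks roughly like this (a sketch of the equivalent command, not something you need to run now):
Code:
# creates a same-named snapshot on the root dataset and on every child,
# the hidden .system tree included
zfs snapshot -r volume1@manual-2020-02-25_05-01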

Before destroying anything, go ahead and do this from the terminal:
zfs list -t snap -r volume1/.system | grep manual

EDIT: Wait, if you did destroy those manual snapshots in the terminal (not GUI), then zpool cannot complain about corrupted files in a snapshot that no longer exists.

Are you saying that neither zpool clear nor a subsequent scrub removed these lingering errors?
 
Last edited:

ppmax

Contributor
Joined
May 16, 2012
Messages
111
Interesting! That's a great explanation and I appreciate your help in this. I'm just trying to clean up after some HW failures ;)

Code:
root@freenas[/var/db/system]# zfs list -t snap -r volume1/.system
NAME                                                                               USED  AVAIL  REFER  MOUNTPOINT
volume1/.system@manual-2020-02-25_05-01                                           16.4K      -  31.4K  -
volume1/.system/configs-097c772a7a1e40b5a1773cbed6329f54@manual-2020-02-25_05-01  34.4K      -   634M  -
volume1/.system/cores@manual-2020-02-25_05-01                                     18.7M      -   376M  -
volume1/.system/perftest@manual-2020-02-25_05-01                                  1.49K      -  47.1K  -
volume1/.system/rrd-097c772a7a1e40b5a1773cbed6329f54@manual-2020-02-25_05-01      98.2M      -   171M  -
volume1/.system/rrd-5aff9b55f6744f32844e671d651f6466@manual-2020-02-25_05-01          0      -  44.8K  -
volume1/.system/samba4@manual-2020-02-25_05-01                                     123K      -  4.64M  -
volume1/.system/samba4@update--2020-05-06-23-52--11.3-U1                           133K      -  4.65M  -
volume1/.system/syslog-097c772a7a1e40b5a1773cbed6329f54@manual-2020-02-25_05-01   45.1M      -  63.7M  -
volume1/.system/syslog-5aff9b55f6744f32844e671d651f6466@manual-2020-02-25_05-01   1.49K      -  1.46M  -
volume1/.system/webui@manual-2020-02-25_05-01                                     19.4K      -  31.4K  -


Pardon the noob question: what BBCode or other markup are you using to encapsulate your terminal commands?
 
Joined
Oct 22, 2019
Messages
3,641
Go ahead and nuke them, recursively.

First, to be safe, use a dry run to see what would be destroyed.
zfs destroy -n -v -r volume1/.system@manual-2020-02-25_05-01

If it looks safe, go ahead and nuke them:
zfs destroy -v -r volume1/.system@manual-2020-02-25_05-01

---

For single-line commands, I encapsulate them within [icode].
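For example, something like [icode]zpool status -v volume1[/icode] renders as inline code, while multi-line terminal output goes between [code] and [/code] tags instead.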
 

ppmax

Contributor
Joined
May 16, 2012
Messages
111
Thanks again for your help and explanation @winnielinnie, I appreciate it!

The zfs destroy worked like a charm...and I reclaimed some disk space too. Bonus!

I cleared the pool and just kicked off another scrub...I'll disappear for *at least* 4 hours :cool:

Have a nice day/evening
 

ppmax

Contributor
Joined
May 16, 2012
Messages
111
Performing the actions above was 50% successful. After the clear and scrub I see a "new" error:
Code:
errors: Permanent errors have been detected in the following files:

        volume1/.system/syslog-097c772a7a1e40b5a1773cbed6329f54:/log/mdnsresponder.log


So I then ran:
Code:
root@freenas[~]# zfs list -t snap -r volume1/.system
volume1/.system/samba4@update--2020-05-06-23-52--11.3-U1   3.77M      -  4.65M  -

So I navigated to:
Code:
root@freenas[~]# cd /var/db/system/samba4/.zfs/snapshot
root@freenas[/var/db/system/samba4/.zfs/snapshot]# ll
total 4
dr-xr-xr-x+ 3 root  wheel  -      3 Dec 14 16:19 ./
dr-xr-xr-x+ 3 root  wheel  -      3 Feb 22  2014 ../
drwxr-xr-x  4 root  wheel  uarch 27 May  2  2020 update--2020-05-06-23-52--11.3-U1/
root@freenas[/var/db/system/samba4/.zfs/snapshot]# 


@winnielinnie Since I'm new to destroying datasets...and want to respect the degree of caution you offered previously upthread...I just wanted to request your opinion: Ok to nuke this one? It looks like this is a snapshot generated from an @update.

Thanks again for any opinion you may offer ;)
 