FreeNAS waiting for scrub to complete during boot?

Status
Not open for further replies.

Evi Vanoost

Explorer
Joined
Aug 4, 2016
Messages
91
I'm having a bit of an issue. A scrub was running when I shut down the machine for a planned power outage. On reboot, although everything seems to have been 'okay', the machine hangs while importing the ZFS pool (after all the GEOM messages). There is massive disk activity, which makes me believe it is continuing the scrub and waiting for it to complete before importing the pool or continuing to boot.

Obviously I can "wait" for the scrub to complete (probably 5-7 days), but it's a huge issue that the system is unavailable in the meantime. I could attempt a hard reboot without the disks attached and then hope that importing the pool won't hang. I have waited ~8 hours so far.

Not sure if anyone can replicate this and see whether they have the same issue, or knows a way of booting without waiting for the scrub.
 

Sakuru

Guru
Joined
Nov 20, 2015
Messages
527

Evi Vanoost

Explorer
Joined
Aug 4, 2016
Messages
91
Running zpool status is a bit hard since the system hasn't booted yet, but the last time I ran it, everything was okay.

The pool is 180TB (60x6TB in mirrors) and contains ~120TB of data. There are a number of spares, plus 2 Intel DC P3500 and 2 P3700 400GB NVMe SSDs for Cache/SLOG.
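For reference, the capacity figures above work out as follows. This is only a quick sketch: it assumes two-way mirrors (the post just says "in mirrors"), ignores ZFS metadata overhead and TB/TiB rounding, and the function name is made up for illustration.

```python
def mirrored_capacity_tb(drives, drive_tb, mirror_width=2):
    """Raw usable capacity of a pool built from N-way mirror vdevs (no overhead)."""
    if drives % mirror_width != 0:
        raise ValueError("drive count must be a multiple of the mirror width")
    vdevs = drives // mirror_width   # each mirror vdev contributes one drive's capacity
    return vdevs * drive_tb

usable = mirrored_capacity_tb(60, 6)  # 60 x 6TB drives, assumed two-way mirrors
fill = 120 / usable                   # ~120TB of data, as stated in the post
print(f"usable: {usable:.0f} TB, fill: {fill:.0%}")
```

Under those assumptions the pool is roughly two-thirds full, which fits with a scrub taking several days.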

All of this is contained in a Supermicro 1028U-TN10RT+ with 512GB of DDR4 ECC RAM and dual 10-core Xeon E5-2660. The drives are in 2 Supermicro 847E1C-R1K28JBOD enclosures (4U, 44 drives), all dual-connected back to the system and to each other through an LSI 9300-16E controller.
 

rs225

Guru
Joined
Jun 28, 2014
Messages
878
Scrub should not be the cause. Were any large deletes or snapshot removes in progress when you shut down? Any dedup?

Have previous reboots been quick?
 

Evi Vanoost

Explorer
Joined
Aug 4, 2016
Messages
91
There were no large deletes or snapshot removes in progress when it shut down. There is no dedup on the pool. The system does always reboot relatively quickly (it takes about 5 minutes to enumerate the 120+ drive interfaces).

The system did eventually boot, and once it had, the scrub was also done. The timings of both the scrub and the point where the system continued booting indicate that it waited for the scrub to finish before continuing to boot.
 

Evi Vanoost

Explorer
Joined
Aug 4, 2016
Messages
91
It's the latest stable 9.10, but the pool was created way back when Solaris was still a Sun product. It has since been migrated from older drives using zfs send/receive (it still has all the old mount tags, which are no longer available). The pool is on the latest ZFS version, though.
 

Evi Vanoost

Explorer
Joined
Aug 4, 2016
Messages
91
Actually, reviewing the zpool history now, it seems that right after boot the system began taking snapshots and destroying old ones before it was fully booted. Many cycles of those (both snapshots and snapshot destroys, every hour) ran during the time the system was "down", and each takes less than 1 minute.

So it seems the FreeNAS snapshot scheduling operates even before the system is fully booted, and it continues to operate as normal after the system is fully booted.
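To check how many snapshot and destroy operations were logged while the boot was stalled, the `zpool history` output can be filtered by timestamp. Below is a rough sketch that assumes the usual `YYYY-MM-DD.HH:MM:SS command` line format. The sample lines, the pool name `tank` and the snapshot names are hypothetical, not taken from this system.

```python
from collections import Counter
from datetime import datetime

def count_snapshot_ops(history_lines, start, end):
    """Count 'zfs snapshot' and 'zfs destroy' entries logged between start and end."""
    counts = Counter()
    for line in history_lines:
        stamp, _, command = line.strip().partition(" ")
        try:
            when = datetime.strptime(stamp, "%Y-%m-%d.%H:%M:%S")
        except ValueError:
            continue  # header line ("History for ...") or blank line
        if not (start <= when <= end):
            continue
        words = command.split()
        if words[:2] == ["zfs", "snapshot"]:
            counts["snapshot"] += 1
        elif words[:2] == ["zfs", "destroy"]:
            counts["destroy"] += 1
    return counts

# Hypothetical sample; real input would come from `zpool history <pool>`.
sample = [
    "History for 'tank':",
    "2017-03-04.10:00:05 zfs snapshot tank/data@auto-20170304.1000-2w",
    "2017-03-04.10:00:41 zfs destroy tank/data@auto-20170218.1000-2w",
    "2017-03-04.11:00:04 zfs snapshot tank/data@auto-20170304.1100-2w",
    "2017-03-06.09:00:00 zfs snapshot tank/data@auto-20170306.0900-2w",
]
# Window: from the day of the boot until the scrub completion time.
window = (datetime(2017, 3, 4), datetime(2017, 3, 5, 7, 42, 39))
result = count_snapshot_ops(sample, *window)
print(result)
```

If the counts inside the window match the normal hourly schedule, the pool was importable and writable for the whole time the rest of the boot appeared to be stuck.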

The other difference I see is the following (although I think it's not relevant to the mounting, since the pool must have been mounted for it to take snapshots):
Previous reboots:
zpool import -c /data/zfs/zpool.cache.saved -o cachefile=none -R /mnt -f 5188259721742887111

Last reboot:
zpool import -o cachefile=none -R /mnt -f 5188259721742887111
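The difference between the two commands is easier to see when the options are split out. This small sketch uses only the two command lines quoted above; the helper name is made up. As far as I know, `-c` tells `zpool import` to read the pool configuration from the named cache file instead of searching devices for it, so the last boot would have fallen back to a device search. Whether that had anything to do with the delay is not established here.

```python
import shlex

def option_tokens(command):
    """Split a command line into tokens, dropping the program and subcommand."""
    return shlex.split(command)[2:]

previous = "zpool import -c /data/zfs/zpool.cache.saved -o cachefile=none -R /mnt -f 5188259721742887111"
last = "zpool import -o cachefile=none -R /mnt -f 5188259721742887111"

# Tokens present in the earlier imports but missing from the last one.
last_tokens = option_tokens(last)
missing = [tok for tok in option_tokens(previous) if tok not in last_tokens]
print(missing)
```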

Code:
Scrub was completed: Sun Mar  5 07:42:39 2017
Init and other core things were "up" since March 4th (when I booted the system)
root     1  0.0  0.0    9488    848  -  ILs   4Mar17     0:00.41 /sbin/init --
root    12  4.5  0.0       0   2944  -  WL    4Mar17  2010:43.53 [intr]


SSSD, NFS and other things went "up" on March 5th
Code:
root  4179  8.1  0.0   14476   2976  -  S     5Mar17  3151:38.60 nfsd: server (nfsd)
root  7039  0.7  0.0  360588  47396  -  Ss    5Mar17   334:36.01 /usr/local/sbin/collectd
root  4829  0.1  0.0  456660  38236  -  S     5Mar17    40:14.49 /usr/local/libexec/sssd/sssd_be --domain DIRECTORY1 --debug-to-
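To make the timing argument explicit, here is a sketch that compares the start dates in the `ps` output with the scrub completion time. Note that `ps` only shows the day for older processes, so this can show that the services started on the same day the scrub finished, not that they started after it. The values are copied from the output above; the helper name is hypothetical.

```python
from datetime import datetime

def ps_start_date(field):
    """Parse a ps STARTED field such as '4Mar17' into a date."""
    return datetime.strptime(field, "%d%b%y").date()

scrub_done = datetime.strptime("Sun Mar  5 07:42:39 2017", "%a %b %d %H:%M:%S %Y")

# STARTED values as shown in the ps output.
started = {
    "init": "4Mar17",
    "nfsd": "5Mar17",
    "collectd": "5Mar17",
    "sssd_be": "5Mar17",
}

for name, field in started.items():
    day = ps_start_date(field)
    if day < scrub_done.date():
        relation = "before the day the scrub finished"
    elif day == scrub_done.date():
        relation = "on the day the scrub finished"
    else:
        relation = "after the day the scrub finished"
    print(f"{name}: started {day}, {relation}")
```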
 