Scheduled scrub errors

Status
Not open for further replies.

birt

Dabbler
Joined
Nov 9, 2015
Messages
26
I have about 10 scheduled scrubs (default setting, 35 days) that worked fine on 11.0-U2 but started emailing me errors after I updated to 11.0-U3. The errors persist on 11.0-U4.

What is strange is that the weekly scrub for the boot drive is working fine. The other scrubs send me two types of error emails:
Subject: Cron <root@freenas> PATH="/bin:/sbin:/usr/bin:/usr/sbin:/usr/local/bin:/usr/local/sbin:/root/bin" /usr/local/libexec/nas/scrub -t 35 d4tb4-backups
Code:
Traceback (most recent call last):
  File "/usr/local/bin/midclt", line 10, in <module>
	sys.exit(main())
  File "/usr/local/lib/python3.6/site-packages/middlewared/client/client.py", line 325, in main
	with Client(uri=args.uri) as c:
  File "/usr/local/lib/python3.6/site-packages/middlewared/client/client.py", line 117, in __init__
	raise ClientException('Failed connection handshake')
middlewared.client.client.ClientException: Failed connection handshake


or
Subject: Cron <root@freenas> PATH="/bin:/sbin:/usr/bin:/usr/sbin:/usr/local/bin:/usr/local/sbin:/root/bin" /usr/local/libexec/nas/scrub -t 35 backup_vm
Code:
Traceback (most recent call last):
  File "/usr/local/bin/midclt", line 10, in <module>
	sys.exit(main())
  File "/usr/local/lib/python3.6/site-packages/middlewared/client/client.py", line 325, in main
	with Client(uri=args.uri) as c:
  File "/usr/local/lib/python3.6/site-packages/middlewared/client/client.py", line 114, in __init__
	self._ws.connect()
  File "/usr/local/lib/python3.6/site-packages/middlewared/client/client.py", line 51, in connect
	rv = super(WSClient, self).connect()
  File "/usr/local/lib/python3.6/site-packages/ws4py/client/__init__.py", line 216, in connect
	bytes = self.sock.recv(128)
socket.timeout: timed out
   starting first scrub (since reboot) of pool 'backup_vm'


As I mentioned, the same scrubs were working perfectly fine in 11.0-U2 and were only sending an email with "starting scrub of...".
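
For anyone sorting through a pile of these cron emails, here is a minimal sketch for pulling out the line that ends each traceback, so the two failure types (`ClientException` versus `socket.timeout`) can be counted separately. The function name `final_exception` is made up for illustration, and it assumes the email body looks like the two samples above: indented frame lines followed by one unindented exception line.

```python
def final_exception(body):
    """Return the exception line that ends the last traceback in body, or None."""
    in_traceback = False
    result = None
    for line in body.splitlines():
        if line.startswith("Traceback (most recent call last):"):
            in_traceback = True
            continue
        # Frame lines are indented; the first unindented, non-empty line
        # after the header is the exception itself.
        if in_traceback and line and not line[0].isspace():
            result = line.strip()
            in_traceback = False
    return result
```

Any text after the exception line, such as the indented "starting first scrub" message in the second email, is ignored.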

I saw another thread with tracebacks related to scrubs, but the issue seems to be different in my case, so I decided to open a new thread.

Any idea what might be the issue? Any suggestion on how to fix it or how to proceed?

Since this looks like a timeout of some sort, could it be related to some drives being spun down by the configured power management at that particular time? Note that the same error also shows up for a drive that is guaranteed to be active at the time, but I guess the timeout could still happen if middlewared is waiting for some other drive to spin up, since all the scrubs start at the same time. That might also explain why the weekly scrub works fine: it doesn't start at the same time as the other scrubs and doesn't require powering anything up.
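
To illustrate the kind of failure suspected here, the following is a rough sketch of a client that gives up when its peer answers too slowly. It does not reproduce middlewared or `midclt`; the `handshake` function, the delay and the timeout values are all hypothetical. It only shows that a `recv()` with a fixed timeout raises `socket.timeout` whenever the other side is busy for longer than that timeout, whatever the reason for the delay.

```python
import socket
import threading
import time


def handshake(delay_s, timeout_s):
    """Return 'ok' if the peer replies within timeout_s, else 'timed out'."""
    client, server = socket.socketpair()

    def slow_peer():
        # Stand-in for a service that is busy (for example, waiting on
        # a disk) before it answers the connection.
        time.sleep(delay_s)
        try:
            server.sendall(b"hello")
        except OSError:
            pass

    worker = threading.Thread(target=slow_peer)
    worker.start()
    client.settimeout(timeout_s)
    try:
        client.recv(128)
        return "ok"
    except socket.timeout:
        return "timed out"
    finally:
        worker.join()
        client.close()
        server.close()
```

Calling `handshake(0.0, 1.0)` models a responsive service, while `handshake(0.5, 0.1)` models one that is stalled for longer than the client is willing to wait.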
 
dlavigne

Guest
Please create a report at bugs.freenas.org and post the issue number here.
 