S.M.A.R.T. Services Error When Restarting

Joined
Oct 18, 2018
Messages
969
I got a report that the smart daemon stopped running. When I navigated to the services list and attempted to restart the service I get an error saying that it is unable to start the service. If I navigate away from that menu and back I unexpectedly see the service listed as running. `$ ps aux | grep smartd` reports the service as running as well. I'm interested in what is causing the error to appear in the first place and whether it was something I did and whether I should feel comfortable that my disks are in fact being monitored given that the service is listed as running. I have attached some error logs found from `/var/log/debug.log` as well as my smartd.conf and relevant screen shots.

I am running 11.2-U2.

I posted a bug about this at first but thought that perhaps I should post here as well in case it was my error. The bug can be found here.
 

Attachments

  • smartd.conf.txt
    3.9 KB · Views: 364
  • smartd_daemon alert.png
    smartd_daemon alert.png
    26.2 KB · Views: 367
  • S.M.A.R.T. listed as running.png
    S.M.A.R.T. listed as running.png
    904.4 KB · Views: 352
  • S.M.A.R.T. starting as off.png
    S.M.A.R.T. starting as off.png
    236.7 KB · Views: 351
  • S.M.A.R.T. failing to start.png
    S.M.A.R.T. failing to start.png
    1.1 MB · Views: 388
  • bug_report.txt
    4.2 KB · Views: 379

Meyers

Patron
Joined
Nov 16, 2016
Messages
211
Are you using USB boot media or something similar? If so, you need to disable SMART on it via Storage -> Disks. Select the USB boot drive, click the three dots, edit, then untick "Enable SMART" and click save. Then restart smartd.
 
Joined
Oct 18, 2018
Messages
969
Are you using USB boot media or something similar?
I'm using 2 x 64GB SuperMicro SATA DOM SSD-DM064-SMCMVN1. This system has been up and running for several weeks with no issue using this same boot pool. I have on occasion received SMART temperature alerts for those two SATA DOMs.
 

Meyers

Patron
Joined
Nov 16, 2016
Messages
211
Try tail -F /var/log/messages and then restart smartd and see if anything is logged there.
 

Meyers

Patron
Joined
Nov 16, 2016
Messages
211
On the CLI that is.
 
Joined
Oct 18, 2018
Messages
969
Try tail -F /var/log/messages and then restart smartd and see if anything is logged there.
Saw nothing in /var/log/messages

I found Feb 24 16:38:37 dorothy smartd[77386]: file /var/run/smartd-daemon.pid written containing PID 77386 in /var/log/daemon.log. The pid listed there matches the pid of the running smartd process.

Edit: Add details about logs found in /var/log/daemon.log
 

Meyers

Patron
Joined
Nov 16, 2016
Messages
211
Oh looks like it logs to daemon.log now - you see anything there when you restart?
 
Joined
Oct 18, 2018
Messages
969
Oh looks like it logs to daemon.log now - you see anything there when you restart?
Only what I posted above about the PID being written to /var/run/smartd-daemon.pid
 

Meyers

Patron
Joined
Nov 16, 2016
Messages
211
That's odd. I stopped and then started it and in daemon.log I see a ton of log entries.
 

Meyers

Patron
Joined
Nov 16, 2016
Messages
211
The other thing I'd do: in services, stop smartd. Then on the CLI run ps auxwww | grep smartd to see if a rogue process is still running.
 

Meyers

Patron
Joined
Nov 16, 2016
Messages
211
Or rather, since it's not starting for you, just do the second half.
 

Meyers

Patron
Joined
Nov 16, 2016
Messages
211
Oh - reread the OP. Try killing the process you see running (kill [PID of smartd]) then use the UI to try starting it again.
 
Joined
Oct 18, 2018
Messages
969
Alright, so I turned off the process in the UI and then killed the PID (it was still hanging around after the UI reported SMART services as off. I then tailed /var/log/daemon.log when I restarted it in the UI.

It looks to me like the daemon is running just fine. Based on the error found in /var/log/debug.log that I posted in my original message it seems like perhaps the issue lies in some of the UI .py code? I haven't had a chance to dig in today any more on those errors.
 

Attachments

  • daemon.txt
    15.2 KB · Views: 402

Meyers

Patron
Joined
Nov 16, 2016
Messages
211
So no errors reported in the log? I get a bunch of info logs when I start smartd. You get nothing like that?
 
Joined
Oct 18, 2018
Messages
969
So no errors reported in the log?
Doesn't seem to be. I posted a file in my previous post containing the logs I do get. They look like info level logs. Interestingly, the log does report 26 drives in my system. I expect 14, 12 data drives and 2 boot drives.
 

Meyers

Patron
Joined
Nov 16, 2016
Messages
211
OK - I was on mobile which must not show attachments (or I completely missed it). Here are the drives I grepped out of the log file:

Code:
/dev/ada0
/dev/ada1
/dev/ada2
/dev/ada3
/dev/ada4
/dev/ada5
/dev/ada6
/dev/ada7
/dev/da0
/dev/da1
/dev/da2
/dev/da3
/dev/da4
/dev/da5


I count 14 here.

I don't see any errors either. Do you see anything in /var/log/middlewared.log when you toggle smartd in the UI?
 
Joined
Oct 18, 2018
Messages
969
In addition to the errors that I see in /var/log/debug.log

Code:
Feb 18 10:59:26 dorothy uwsgi: [api.utils:12] Traceback (most recent call last):
Feb 18 10:59:26 dorothy uwsgi: [api.utils:12]   File "./freenasUI/middleware/form.py", line 109, in __save
Feb 18 10:59:26 dorothy uwsgi: [api.utils:12]     return c.call(f"{self.middleware_plugin}.{self._middleware_action}", *args, **kwargs)
Feb 18 10:59:26 dorothy uwsgi: [api.utils:12]   File "/usr/local/lib/python3.6/site-packages/middlewared/client/client.py", line 454, in call
Feb 18 10:59:26 dorothy uwsgi: [api.utils:12]     raise ClientException(c.error, c.errno, c.trace, c.extra)
Feb 18 10:59:26 dorothy uwsgi: [api.utils:12] middlewared.client.client.ClientException: [ESERVICESTARTFAILURE] The smartd service failed to start
Feb 18 10:59:26 dorothy uwsgi: [api.utils:12] 
Feb 18 10:59:26 dorothy uwsgi: [api.utils:12] During handling of the above exception, another exception occurred:
Feb 18 10:59:26 dorothy uwsgi: [api.utils:12] 
Feb 18 10:59:26 dorothy uwsgi: [api.utils:12] Traceback (most recent call last):
Feb 18 10:59:26 dorothy uwsgi: [api.utils:12]   File "./freenasUI/api/resources.py", line 369, in obj_update
Feb 18 10:59:26 dorothy uwsgi: [api.utils:12]     return super(DiskResourceMixin, self).obj_update(bundle, skip_errors=skip_errors, **kwargs)
Feb 18 10:59:26 dorothy uwsgi: [api.utils:12]   File "/usr/local/lib/python3.6/site-packages/tastypie/resources.py", line 2236, in obj_update
Feb 18 10:59:26 dorothy uwsgi: [api.utils:12]     return self.save(bundle, skip_errors=skip_errors)
Feb 18 10:59:26 dorothy uwsgi: [api.utils:12]   File "./freenasUI/api/utils.py", line 415, in save
Feb 18 10:59:26 dorothy uwsgi: [api.utils:12]     form.save()
Feb 18 10:59:26 dorothy uwsgi: [api.utils:12]   File "./freenasUI/middleware/form.py", line 55, in save
Feb 18 10:59:26 dorothy uwsgi: [api.utils:12]     result = self.__save()
Feb 18 10:59:26 dorothy uwsgi: [api.utils:12]   File "./freenasUI/middleware/form.py", line 114, in __save
Feb 18 10:59:26 dorothy uwsgi: [api.utils:12]     raise
Feb 18 10:59:26 dorothy uwsgi: [api.utils:12]   File "./freenasUI/middleware/form.py", line 112, in __save
Feb 18 10:59:26 dorothy uwsgi: [api.utils:12]     raise ServiceFailed(e.extra[0], e.error)
Feb 18 10:59:26 dorothy uwsgi: [api.utils:12] freenasUI.services.exceptions.ServiceFailed: [ESERVICESTARTFAILURE] The smartd service failed to start
Feb 18 10:59:26 dorothy uwsgi: [api.utils:12] 
Feb 18 10:59:26 dorothy uwsgi: [api.utils:12] During handling of the above exception, another exception occurred:
Feb 18 10:59:26 dorothy uwsgi: [api.utils:12] 
Feb 18 10:59:26 dorothy uwsgi: [api.utils:12] Traceback (most recent call last):
Feb 18 10:59:26 dorothy uwsgi: [api.utils:12]   File "./freenasUI/api/utils.py", line 251, in dispatch
Feb 18 10:59:26 dorothy uwsgi: [api.utils:12]     request_type, request, *args, **kwargs
Feb 18 10:59:26 dorothy uwsgi: [api.utils:12]   File "/usr/local/lib/python3.6/site-packages/tastypie/resources.py", line 482, in dispatch
Feb 18 10:59:26 dorothy uwsgi: [api.utils:12]     response = method(request, **kwargs)
Feb 18 10:59:26 dorothy uwsgi: [api.utils:12]   File "./freenasUI/api/utils.py", line 332, in put_detail
Feb 18 10:59:26 dorothy uwsgi: [api.utils:12]     updated_bundle = self.obj_update(bundle=bundle, **self.remove_api_resource_names(kwargs))
Feb 18 10:59:26 dorothy uwsgi: [api.utils:12]   File "./freenasUI/api/resources.py", line 371, in obj_update
Feb 18 10:59:26 dorothy uwsgi: [api.utils:12]     raise ImmediateHttpResponse(response=HttpNotFound())
Feb 18 10:59:26 dorothy uwsgi: [api.utils:12] tastypie.exceptions.ImmediateHttpResponse
Feb 18 10:59:43 dorothy uwsgi: [api.utils:12] Traceback (most recent call last):
Feb 18 10:59:43 dorothy uwsgi: [api.utils:12]   File "./freenasUI/middleware/form.py", line 109, in __save
Feb 18 10:59:43 dorothy uwsgi: [api.utils:12]     return c.call(f"{self.middleware_plugin}.{self._middleware_action}", *args, **kwargs)
Feb 18 10:59:43 dorothy uwsgi: [api.utils:12]   File "/usr/local/lib/python3.6/site-packages/middlewared/client/client.py", line 454, in call
Feb 18 10:59:43 dorothy uwsgi: [api.utils:12]     raise ClientException(c.error, c.errno, c.trace, c.extra)
Feb 18 10:59:43 dorothy uwsgi: [api.utils:12] middlewared.client.client.ClientException: [ESERVICESTARTFAILURE] The smartd service failed to start


I do see several of the following errors in /var/log/middlewared.log

Code:
[2019/02/23 01:36:17] (ERROR) asyncio.default_exception_handler():1266 - Task exception was never retrieved
future: <Task finished coro=<_event_devfs() done, defined at /usr/local/lib/python3.6/site-packages/middlewared/plugins/disk.py:1123> exception=KeyError('devname',)>
Traceback (most recent call last):
  File "/usr/local/lib/python3.6/site-packages/middlewared/plugins/disk.py", line 1138, in _event_devfs
    await middleware.call('disk.multipath_sync')
  File "/usr/local/lib/python3.6/site-packages/middlewared/main.py", line 1145, in call
    return await self._call(name, serviceobj, methodobj, params, app=app, pipes=pipes, io_thread=True)
  File "/usr/local/lib/python3.6/site-packages/middlewared/main.py", line 1081, in _call
    return await methodobj(*args)
  File "/usr/local/lib/python3.6/site-packages/middlewared/schema.py", line 664, in nf
    return await f(*args, **kwargs)
  File "/usr/local/lib/python3.6/site-packages/middlewared/plugins/disk.py", line 777, in multipath_sync
    reserved = await self.get_reserved()
  File "/usr/local/lib/python3.6/site-packages/middlewared/plugins/disk.py", line 264, in get_reserved
    reserved += [i async for i in await self.middleware.call('pool.get_disks')]
  File "/usr/local/lib/python3.6/site-packages/middlewared/plugins/disk.py", line 264, in <listcomp>
    reserved += [i async for i in await self.middleware.call('pool.get_disks')]
  File "/usr/local/lib/python3.6/site-packages/middlewared/plugins/pool.py", line 315, in get_disks
    if os.path.exists(os.path.join("/dev", disk['devname'])):
KeyError: 'devname'
 
Joined
Oct 18, 2018
Messages
969
I count 14 here.
I count the same and yet /var/log/daemon.log reports 26.
Code:
Feb 24 19:20:01 dorothy smartd[86234]: Monitoring 26 ATA/SATA, 0 SCSI/SAS and 0 NVMe devices
 

Meyers

Patron
Joined
Nov 16, 2016
Messages
211
I see. Definitely looks like a middleware bug or something to me. We verified smartd is running anyway so that's the important part while the bug is being checked out.
 

Meyers

Patron
Joined
Nov 16, 2016
Messages
211
I'd probably try regenerating the smartd.conf just for the heck of it, see what happens. Storage -> Disks, edit a disk, untick Enable SMART, save, edit it again, tick Enable SMART, save. On my systems, this regenerates smartd.conf and reloads smartd.
 
Top