Help understanding "daily security run output"

Status
Not open for further replies.

shimon

Dabbler
Joined
Jul 23, 2016
Messages
16
I have a FreeNAS Mini XL and I've received two "daily security run output" messages that I do not understand (they are well over my head). Any advice or help deciphering them would be appreciated.

I received the first message yesterday:

Code:
freenas.local kernel log messages:
> (ada8:ahcich15:0:0:0): FLUSHCACHE48. ACB: ea 00 00 00 00 40 00 00 00 00 00 00
> (ada8:ahcich15:0:0:0): CAM status: ATA Status Error
> (ada8:ahcich15:0:0:0): ATA status: 51 (DRDY SERV ERR), error: 04 (ABRT )
> (ada8:ahcich15:0:0:0): RES: 51 04 10 31 71 40 03 01 00 00 00
> (ada8:ahcich15:0:0:0): Retrying command
> (ada6:ahcich13:0:0:0): FLUSHCACHE48. ACB: ea 00 00 00 00 40 00 00 00 00 00 00
> (ada6:ahcich13:0:0:0): CAM status: ATA Status Error
> (ada6:ahcich13:0:0:0): ATA status: 51 (DRDY SERV ERR), error: 04 (ABRT )
> (ada6:ahcich13:0:0:0): RES: 51 04 60 4f bb 40 b5 00 00 00 00
> (ada6:ahcich13:0:0:0): Retrying command

-- End of security output --


And the second message today:

Code:
freenas.local kernel log messages:
> (ada8:ahcich15:0:0:0): READ_FPDMA_QUEUED. ACB: 60 30 f0 30 71 40 03 01 00 00 00 00
> (ada8:ahcich15:0:0:0): CAM status: ATA Status Error
> (ada8:ahcich15:0:0:0): ATA status: 41 (DRDY ERR), error: 40 (UNC )
> (ada8:ahcich15:0:0:0): RES: 41 40 10 31 71 40 03 01 00 00 00
> (ada8:ahcich15:0:0:0): Retrying command
> (ada8:ahcich15:0:0:0): READ_FPDMA_QUEUED. ACB: 60 30 f0 30 71 40 03 01 00 00 00 00
> (ada8:ahcich15:0:0:0): CAM status: ATA Status Error
> (ada8:ahcich15:0:0:0): ATA status: 41 (DRDY ERR), error: 40 (UNC )
> (ada8:ahcich15:0:0:0): RES: 41 40 10 31 71 40 03 01 00 00 00
> (ada8:ahcich15:0:0:0): Retrying command
> (ada8:ahcich15:0:0:0): READ_FPDMA_QUEUED. ACB: 60 30 f0 30 71 40 03 01 00 00 00 00
> (ada8:ahcich15:0:0:0): CAM status: ATA Status Error
> (ada8:ahcich15:0:0:0): ATA status: 41 (DRDY ERR), error: 40 (UNC )
> (ada8:ahcich15:0:0:0): RES: 41 40 10 31 71 40 03 01 00 00 00
> (ada8:ahcich15:0:0:0): Retrying command
> (ada8:ahcich15:0:0:0): READ_FPDMA_QUEUED. ACB: 60 30 f0 30 71 40 03 01 00 00 00 00
> (ada8:ahcich15:0:0:0): CAM status: ATA Status Error
> (ada8:ahcich15:0:0:0): ATA status: 41 (DRDY ERR), error: 40 (UNC )
> (ada8:ahcich15:0:0:0): RES: 41 40 10 31 71 40 03 01 00 00 00
> (ada8:ahcich15:0:0:0): Retrying command
> (ada8:ahcich15:0:0:0): READ_FPDMA_QUEUED. ACB: 60 30 f0 30 71 40 03 01 00 00 00 00
> (ada8:ahcich15:0:0:0): CAM status: ATA Status Error
> (ada8:ahcich15:0:0:0): ATA status: 41 (DRDY ERR), error: 40 (UNC )
> (ada8:ahcich15:0:0:0): RES: 41 40 10 31 71 40 03 01 00 00 00
> (ada8:ahcich15:0:0:0): Error 5, Retries exhausted

-- End of security output --
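
The status and error bytes in messages like these are bit fields, and the kernel already prints the names of the set bits in parentheses. As a rough guide, UNC is the ATA "uncorrectable data" error (the drive could not read a sector), ABRT means the drive aborted the command, and "Error 5" is the generic I/O error code (EIO). Below is a minimal sketch of the decoding, assuming a POSIX shell. The function names are made up for illustration, and the names of bits that do not appear in this thread are recalled from the FreeBSD CAM source rather than confirmed here.

```sh
#!/bin/sh
# Decode the hex bytes printed after "ATA status:" and "error:" in the
# kernel messages. Pass two hex digits without a 0x prefix, e.g. 51.

# decode_bits HEXBYTE NAME7 NAME6 ... NAME0
# Prints the names of the bits that are set, most significant bit first.
decode_bits() {
    value=$((0x$1))
    shift
    mask=128
    out=""
    for name in "$@"; do
        if [ $((value & mask)) -ne 0 ]; then
            out="${out:+$out }$name"
        fi
        mask=$((mask / 2))
    done
    printf '%s\n' "$out"
}

# Bit names for the status register (assumed order: bit 7 down to bit 0).
decode_ata_status() {
    decode_bits "$1" BSY DRDY DF SERV DRQ CORR IDX ERR
}

# Bit names for the error register (assumed order: bit 7 down to bit 0).
decode_ata_error() {
    decode_bits "$1" ICRC UNC MC IDNF MCR ABRT NM ILI
}
```

For the bytes in the logs above, decode_ata_status 51 gives the same "DRDY SERV ERR" the kernel printed, and decode_ata_error 40 gives "UNC".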
 

shimon

Dabbler
Joined
Jul 23, 2016
Messages
16
"The messages would suggest you have a drive (or two) failing. Do you have regularly scheduled SMART tests?"

I didn't set it up because there was this note in that section of the manual you linked to: "To prevent problems, do not enable the S.M.A.R.T. service if the disks are controlled by a RAID controller."

I thought the FreeNAS Mini XL used a RAID controller for the drives. I have the 32TB version (8 drives) and believe I set it up as RAID-Z3.

Am I misreading that note about SMART and RAID?
 

m0nkey_

MVP
Joined
Oct 27, 2015
Messages
2,739
The drives on the FreeNAS Mini are directly attached.

@dlavigne:
"To prevent problems, do not enable the S.M.A.R.T. service if the disks are controlled by a RAID controller. It is the job of the controller to monitor S.M.A.R.T. and mark drives as Predictive Failure when they trip."
Should the note about RAID controllers be dropped? It's typically recommended not to use a RAID controller under any circumstances with ZFS.
 

shimon

Dabbler
Joined
Jul 23, 2016
Messages
16
It looks like "Short Self-Test" is already enabled to occur every hour on Sunday. That sounds like a lot. Should I change it to only happen for one total hour instead of every "N (1) hour"? I attached a screenshot of the Task screen. The SMART service is enabled.

But it sounds like I need to add another task for a "Long Self-Test" which should be once a month.

By the way...I appreciate the help.
 

Attachments

  • smart_test.jpg (35.7 KB)

Robert Trevellyan

Pony Wrangler
Joined
May 16, 2014
Messages
3,778
"Should I change it to only happen for one total hour instead of every 'N (1) hour'?"
It should only be run once on a given day. Overall frequency recommendations vary from daily to weekly.
"...sounds like I need to add another task for a 'Long Self-Test' which should be once a month."
That's what I would go with. Some members run them more often.
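
For reference, a self-test can also be started by hand from the Shell rather than waiting for the scheduled task. This is a minimal sketch, assuming smartctl is on the path and using /dev/ada6 only as an example device. The wrapper function names are hypothetical.

```sh
#!/bin/sh
# start_selftest KIND DEVICE
# Ask the drive to begin a self-test in the background; KIND is "short" or "long".
start_selftest() {
    smartctl -t "$1" "$2"
}

# show_selftest_log DEVICE
# Print the drive's self-test log, which lists completed tests and any failures.
show_selftest_log() {
    smartctl -l selftest "$1"
}
```

Typical use would be start_selftest short /dev/ada6, waiting for the duration smartctl reports, then show_selftest_log /dev/ada6.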
 

shimon

Dabbler
Joined
Jul 23, 2016
Messages
16
I found this post: https://forums.freenas.org/index.php?threads/scrub-and-smart-testing-schedules.20108/ which was helpful in setting up proper scheduling for scrubs and SMART.

I ran a short SMART test last night and got this message today:
Code:
freenas.local kernel log messages:
> (ada6:ahcich13:0:0:0): READ_FPDMA_QUEUED. ACB: 60 70 38 4f bb 40 b5 00 00 00 00 00
> (ada6:ahcich13:0:0:0): CAM status: ATA Status Error
> (ada6:ahcich13:0:0:0): ATA status: 41 (DRDY ERR), error: 40 (UNC )
> (ada6:ahcich13:0:0:0): RES: 41 40 60 4f bb 40 b5 00 00 00 00
> (ada6:ahcich13:0:0:0): Retrying command
> (ada6:ahcich13:0:0:0): READ_FPDMA_QUEUED. ACB: 60 70 38 4f bb 40 b5 00 00 00 00 00
> (ada6:ahcich13:0:0:0): CAM status: ATA Status Error
> (ada6:ahcich13:0:0:0): ATA status: 41 (DRDY ERR), error: 40 (UNC )
> (ada6:ahcich13:0:0:0): RES: 41 40 60 4f bb 40 b5 00 00 00 00
> (ada6:ahcich13:0:0:0): Retrying command
> (ada6:ahcich13:0:0:0): READ_FPDMA_QUEUED. ACB: 60 70 38 4f bb 40 b5 00 00 00 00 00
> (ada6:ahcich13:0:0:0): CAM status: ATA Status Error
> (ada6:ahcich13:0:0:0): ATA status: 41 (DRDY ERR), error: 40 (UNC )
> (ada6:ahcich13:0:0:0): RES: 41 40 60 4f bb 40 b5 00 00 00 00
> (ada6:ahcich13:0:0:0): Retrying command
> (ada6:ahcich13:0:0:0): READ_FPDMA_QUEUED. ACB: 60 70 38 4f bb 40 b5 00 00 00 00 00
> (ada6:ahcich13:0:0:0): CAM status: ATA Status Error
> (ada6:ahcich13:0:0:0): ATA status: 41 (DRDY ERR), error: 40 (UNC )
> (ada6:ahcich13:0:0:0): RES: 41 40 60 4f bb 40 b5 00 00 00 00
> (ada6:ahcich13:0:0:0): Retrying command
> (ada6:ahcich13:0:0:0): READ_FPDMA_QUEUED. ACB: 60 70 38 4f bb 40 b5 00 00 00 00 00
> (ada6:ahcich13:0:0:0): CAM status: ATA Status Error
> (ada6:ahcich13:0:0:0): ATA status: 41 (DRDY ERR), error: 40 (UNC )
> (ada6:ahcich13:0:0:0): RES: 41 40 60 4f bb 40 b5 00 00 00 00
> (ada6:ahcich13:0:0:0): Error 5, Retries exhausted

-- End of security output --


So, it seems that drive 6 (ada6) is bad. I will contact iXsystems for a replacement, as it is still under warranty.
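
Since both ada6 and ada8 appear in the messages in this thread, it can help to tally the errors per device before deciding which drive to blame. The sketch below assumes a POSIX shell and log lines in the same format as the messages above. The function name is hypothetical, and reading from /var/log/messages is an assumption about where the kernel messages are kept.

```sh
#!/bin/sh
# count_ata_errors
# Read kernel log text on stdin and print "device count" for every adaN
# device that logged a "CAM status: ATA Status Error" line.
count_ata_errors() {
    grep 'CAM status: ATA Status Error' |
        sed -n 's/.*(\(ada[0-9]*\):.*/\1/p' |
        sort | uniq -c | awk '{ print $2, $1 }'
}
```

Usage would be along the lines of count_ata_errors < /var/log/messages.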

I appreciate the help. Thanks.
 

Robert Trevellyan

Pony Wrangler
Joined
May 16, 2014
Messages
3,778
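If sectors really are unreadable, that should show up as a nonzero raw value for attribute #197 (Current_Pending_Sector) or #198 (Offline_Uncorrectable) in the output of smartctl -x /dev/ada6.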
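
To pick just those two attributes out of the long report, something like the following sketch may help. It assumes the usual smartctl attribute table layout, with the attribute ID in the first column and the raw value in the last. The function name is hypothetical.

```sh
#!/bin/sh
# pending_sector_attrs
# Read a smartctl attribute table on stdin and print "ID NAME RAW_VALUE"
# for attributes 197 and 198 only.
pending_sector_attrs() {
    awk '$1 == 197 || $1 == 198 { print $1, $2, $NF }'
}
```

Usage would be smartctl -A /dev/ada6 | pending_sector_attrs, where -A prints only the attribute table. A raw value above zero on either line points at the drive itself rather than cabling.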
 

shimon

Dabbler
Joined
Jul 23, 2016
Messages
16
I've been in touch with iXsystems...they are very responsive. They are asking me to post a copy of the /var/tmp/rc.conf.freenas file.
Can someone help me with the command to do this? I'm not familiar with these UNIX (?) commands yet.
 

shimon

Dabbler
Joined
Jul 23, 2016
Messages
16
I figured it out after much Googling of UNIX commands and locating my SMB shares through the Shell. Getting the basics down: cd, ls, cp, etc.
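
For anyone else who needs to do the same, here is a minimal sketch of copying the file onto a share so it can be picked up from another computer. The destination /mnt/tank/share used in the usage line is a hypothetical path (pool and dataset names will differ), and the function name is made up for illustration.

```sh
#!/bin/sh
# copy_to_share FILE DIR
# Copy FILE into DIR, preserving its timestamps, then list the copy to
# confirm it arrived.
copy_to_share() {
    cp -p "$1" "$2"/ && ls -l "$2/$(basename "$1")"
}
```

Usage would be copy_to_share /var/tmp/rc.conf.freenas /mnt/tank/share, after which the file should be visible in the SMB share.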
 

shimon

Dabbler
Joined
Jul 23, 2016
Messages
16
I just want to post an update on the resolution of this issue. iXsystems tech support was excellent. The tech spent a lot of time helping to troubleshoot the problem instead of just looking for a quick fix or feeding me canned responses. He listened to me, educated me and helped me solve this issue. Kudos to the iXsystems team.

So, after running some SMART tests and going through debug logs, I swapped the suspected bad drives with other drives (changing slot positions) and the problem never came back. It may have just been a reseating issue or something else, but I haven't seen the errors again and the debug logs look clean.

And I know a lot more about FreeNAS and UNIX now. Thanks for the assistance.
 