smartd 2688 - - Device: /dev/ciss0 [cciss_disk_06] [SCSI], SMART Failure: HARDWARE IMPENDING FAILURE TOO MANY BLOCK REASSIGNS

Bnito Kmelas · Sep 22, 2021

Greetings Everyone,

A couple of days ago I started getting this message:

smartd 2688 - - Device: /dev/ciss0 [cciss_disk_06] [SCSI], SMART Failure: HARDWARE IMPENDING FAILURE TOO MANY BLOCK REASSIGNS

However, all my disks seem to be okay. I'd appreciate some feedback on this error.

Thanks in advance!

Alecmascot · Sep 23, 2021

Looks like errors from hardware RAID.
You will need to supply full system details as per the Forum Rules.

Bnito Kmelas · Sep 23, 2021

Alecmascot said:
Looks like errors from hardware RAID.
You will need to supply full system details as per the Forum Rules.

Hi Alecmascot, thanks for replying!

My full system specs:

HP StorageWorks X1600
HP Smart Array P212 controller
10 GB RAM
12 bays, 2 TB SAS disks

Alecmascot · Sep 23, 2021

That's not enough detail.
How are the disks presented to TrueNAS?
What is the output of 'zpool status'?

Have you seen:

What's all the noise about HBAs, and why can't I use a RAID controller?

1) An HBA is a Host Bus Adapter. This is a controller that allows SAS and SATA devices to be attached to, and communicate directly with, a server. RAID controllers typically aggregate several disks into a Virtual Disk abstraction of some sort...

www.truenas.com

jgreco · Sep 23, 2021

Alecmascot said:
That's not enough detail.

Respectfully disagree; the CISS driver is written from a RAID perspective with a goal of keeping "disk issues" hidden from the server to a large extent. You already quoted a great (imho!) resource on that topic:

What's all the noise about HBAs, and why can't I use a RAID controller?


but it is worth noting that CISS controllers have caused endless headaches for some people. In this case, it is actually working about as well as it can, by reporting the issue without being particularly disruptive. OP should replace disk 6.
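For anyone wondering where "disk 6" comes from, here is a minimal Python sketch that pulls the device, disk index, and failure reason out of the smartd line quoted in this thread. The regular expression is inferred from that single log line, so the field layout is an assumption and other smartd messages may not match; `parse_smartd_failure` is a hypothetical helper name.

```python
import re

# Pattern inferred from the one smartd line quoted in this thread.
# The field layout is an assumption, not a documented smartd format.
SMARTD_RE = re.compile(
    r"Device: (?P<device>\S+) \[cciss_disk_(?P<disk>\d+)\].*?"
    r"SMART Failure: (?P<reason>.+)$"
)


def parse_smartd_failure(line):
    """Return (device, disk_index, reason), or None if the line does not match."""
    m = SMARTD_RE.search(line)
    if m is None:
        return None
    return m.group("device"), int(m.group("disk")), m.group("reason").strip()


line = (
    "smartd 2688 - - Device: /dev/ciss0 [cciss_disk_06] [SCSI], "
    "SMART Failure: HARDWARE IMPENDING FAILURE TOO MANY BLOCK REASSIGNS"
)
device, disk, reason = parse_smartd_failure(line)
print(device, disk, reason)
```

Assuming the index in the message corresponds to N in smartmontools' `-d cciss,N` device type, the same disk could then be inspected directly with something like `smartctl -a -d cciss,6 /dev/ciss0`.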

I believe the Smart Array controller may be one of the controllers that places its own "header"/partition table on a disk, and if that's so, it's going to be a real problem to replace it with a real HBA. Best advice is to back up all the data from the pool, put in an LSI HBA running IT firmware 20.00.07.00, and then build a new pool (if it won't import the old one after the controller change).

Bnito Kmelas · Sep 24, 2021

Posting my output:

root@freenas12:~ # zpool status
  pool: ShareDrive
 state: ONLINE
  scan: resilvered 10.3M in 00:00:12 with 0 errors on Tue Sep 21 07:47:24 2021
config:

        NAME                                            STATE     READ WRITE CKSUM
        ShareDrive                                      ONLINE       0     0     0
          raidz3-0                                      ONLINE       0     0     0
            gptid/331c32fc-8ca7-11ea-90c5-78e3b5100ef2  ONLINE       0     0     0
            gptid/36bd52ec-8ca7-11ea-90c5-78e3b5100ef2  ONLINE       0     0     0
            gptid/3a82b069-8ca7-11ea-90c5-78e3b5100ef2  ONLINE       0     0     0
            gptid/3e58c3a7-8ca7-11ea-90c5-78e3b5100ef2  ONLINE       0     0     0
            gptid/4228852c-8ca7-11ea-90c5-78e3b5100ef2  ONLINE       0     0     0
            gptid/45c311b1-8ca7-11ea-90c5-78e3b5100ef2  ONLINE       0     0     0
            gptid/49852449-8ca7-11ea-90c5-78e3b5100ef2  ONLINE       0     0     0
            gptid/4d136413-8ca7-11ea-90c5-78e3b5100ef2  ONLINE       0     0     0
            gptid/50c28f2d-8ca7-11ea-90c5-78e3b5100ef2  ONLINE       0     0     0
            gptid/547dbbcf-8ca7-11ea-90c5-78e3b5100ef2  ONLINE       0     0     0
        cache
          gptid/5a02cce8-8ca7-11ea-90c5-78e3b5100ef2    ONLINE       0     0     0
          gptid/5f32c214-8ca7-11ea-90c5-78e3b5100ef2    ONLINE       0     0     0

errors: No known data errors

  pool: boot-pool
 state: DEGRADED
status: One or more devices has experienced an unrecoverable error.  An
        attempt was made to correct the error.  Applications are unaffected.
action: Determine if the device needs to be replaced, and clear the errors
        using 'zpool clear' or replace the device with 'zpool replace'.
   see: https://openzfs.github.io/openzfs-docs/msg/ZFS-8000-9P
  scan: scrub repaired 0B in 00:03:17 with 0 errors on Sun Sep 19 03:48:18 2021
config:
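
Note that the output above reports ShareDrive as ONLINE while boot-pool is DEGRADED, which is easy to miss in a long listing. As a rough sketch (not an official tool), the following Python helper scans `zpool status` text for `pool:` and `state:` lines and lists every pool that is not ONLINE. The `pool_states` name and the shortened `sample` string are illustrative assumptions; the sample only reuses the pool names and states shown in this thread.

```python
def pool_states(status_text):
    """Map each pool name to its reported state from `zpool status` text."""
    states = {}
    current = None
    for raw in status_text.splitlines():
        line = raw.strip()
        if line.startswith("pool:"):
            # Remember which pool the following "state:" line belongs to.
            current = line.split(":", 1)[1].strip()
        elif line.startswith("state:") and current is not None:
            states[current] = line.split(":", 1)[1].strip()
    return states


# Shortened sample using only the pool names and states from this thread.
sample = """\
  pool: ShareDrive
 state: ONLINE
  pool: boot-pool
 state: DEGRADED
"""

unhealthy = [name for name, state in pool_states(sample).items() if state != "ONLINE"]
print(unhealthy)  # prints ['boot-pool']
```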

AlexGG · Sep 24, 2021

jgreco said:
I believe the Smart Array controller may be one of the controllers that places its own "header"/partition table on a disk, and if that's so, it's going to be a real problem to replace it with a real HBA.

This is correct. Most controllers using configuration-on-disks will put metadata at the end of the disk, but SmartArray does indeed put it at the start, thus offsetting the user-accessible area.

jgreco · Sep 24, 2021

AlexGG said:
SmartArray does indeed put it at the start

I was about 95% certain.

So, back up your pool, tear out the Smart Array controller, replace it with an HBA, create a new pool, and reload your data onto the pool.
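
The backup step could be sketched roughly as below, assuming a second pool is available to receive the data. This Python snippet only builds and prints the commands so they can be reviewed first; it runs nothing. The target pool name `backup` and the snapshot name `migrate` are hypothetical placeholders, and `backup_commands` is an illustrative helper, not part of TrueNAS. `ShareDrive` is the pool name from the output earlier in the thread.

```python
def backup_commands(pool, target, snap="migrate"):
    """Return shell commands for a recursive snapshot and a full replication.

    Nothing is executed here; the caller decides whether to run them.
    """
    snapshot = f"{pool}@{snap}"
    return [
        # Snapshot the pool and all child datasets at one point in time.
        f"zfs snapshot -r {snapshot}",
        # Send the whole hierarchy and receive it under the target pool.
        f"zfs send -R {snapshot} | zfs receive -F {target}/{pool}",
    ]


# "backup" is a hypothetical destination pool; adjust before use.
for cmd in backup_commands("ShareDrive", "backup"):
    print(cmd)
```

After the controller swap, the same send/receive pair in the opposite direction would reload the data onto the newly created pool.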
