Identify disk serial numbers on RAID 0 (P410i)

baby27784 · Jul 23, 2019

Hi everyone,

In my FreeNAS server, each disk is configured as its own single-disk RAID 0 logical drive on a hardware RAID controller (P410i), and those logical drives are presented to ZFS. (I know this is not recommended.)

When I try to look up disk serial numbers so I can identify a drive when it fails, every disk shows the same serial number in FreeNAS.

The confusing part is this: when I run

smartctl -a /dev/da1 (or da2, and so on)

all the serial numbers are the same, and the output says "SMART support is: Unavailable - device lacks SMART capability."

Code:

root@freenas:~ # smartctl -a /dev/da1

smartctl 6.6 2017-11-05 r4594 [FreeBSD 11.2-STABLE amd64] (local build)
Copyright (C) 2002-17, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Vendor:               HP
Product:              RAID 0
Revision:             OK
User Capacity:        146,778,685,440 bytes [146 GB]
Logical block size:   512 bytes
Logical Unit id:      0x600508b1001ce1a22350a816a4976db2
Serial number:        5001438018C8E650
Device type:          disk
Local Time is:        Wed Jul 24 04:18:31 2019 PDT
SMART support is:     Unavailable - device lacks SMART capability.

=== START OF READ SMART DATA SECTION ===
Current Drive Temperature:     0 C
Drive Trip Temperature:        0 C

Error Counter logging not supported

Device does not support Self Test logging

But when I run "smartctl -d cciss,1 -a /dev/ciss0",
I get the drive's own serial number and Logical Unit id, and the output says "SMART support is: Available - device has SMART capability."

Code:

root@freenas:~ # smartctl -d cciss,1 -a /dev/ciss0
smartctl 6.6 2017-11-05 r4594 [FreeBSD 11.2-STABLE amd64] (local build)
Copyright (C) 2002-17, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Vendor:               HP
Product:              EH0146FARWD
Revision:             HPDD
User Capacity:        146,815,737,856 bytes [146 GB]
Logical block size:   512 bytes
Rotation Rate:        15030 rpm
Form Factor:          2.5 inches
Logical Unit id:      0x5000cca00b7b6dec
Serial number:        PLX5WAEE
Device type:          disk
Transport protocol:   SAS (SPL-3)
Local Time is:        Wed Jul 24 04:24:04 2019 PDT
SMART support is:     Available - device has SMART capability.
SMART support is:     Enabled
Temperature Warning:  Enabled

=== START OF READ SMART DATA SECTION ===
SMART Health Status: OK

Current Drive Temperature:     29 C
Drive Trip Temperature:        65 C

Manufactured in week 05 of year 2012
Specified cycle count over device lifetime:  50000
Accumulated start-stop cycles:  100
Elements in grown defect list: 0

Vendor (Seagate) cache information
  Blocks sent to initiator = 1625038809726976

Error counter log:
           Errors Corrected by           Total   Correction     Gigabytes    Total
               ECC          rereads/    errors   algorithm      processed    uncorrected
           fast | delayed   rewrites  corrected  invocations   [10^9 bytes]  errors
read:          0   787711         0    787711          0    1699748.818           0
write:         0   320525         0    320525          0       3421.644           0
verify:        0       30         0        30          0        146.816           0

Non-medium error count:      435

SMART Self-test log
Num  Test              Status                 segment  LifeTime  LBA_first_err [SK ASC ASQ]
     Description                              number   (hours)
# 1  Background long   Completed                   -      44                 - [-   -    -]
# 2  Background short  Completed                   -      23                 - [-   -    -]
# 3  Background short  Completed                   -      19                 - [-   -    -]
# 4  Background short  Completed                   -       0                 - [-   -    -]

Long (extended) Self Test duration: 1394 seconds [23.2 minutes]

1- How can I get a disk's serial number so that I can replace the right drive when it fails?

2- Is SMART working properly for failure detection? If not, what is the alternative?

3- In my setup, how can I find a failed drive?

thanks

garm · Jul 24, 2019

baby27784 said:
I know this is not recommended

Do you understand why?

baby27784 · Jul 24, 2019

garm said:
Do you understand why?

You lose some of the data protection features of ZFS.

Chris Moore · Jul 24, 2019

baby27784 said:
You lose some of the data protection features of ZFS.

You also lose the ability to figure out which disk is which because FreeNAS is not able to access the DRIVE, only the drive controller. Your best option here is to copy the data out of FreeNAS to some other storage and start over. You should NEVER use a hardware RAID controller with FreeNAS.

Chris Moore · Jul 24, 2019

baby27784 said:
2- Is SMART working properly for failure detection? If not, what is the alternative?

NO!

baby27784 said:
3- In my setup, how can I find a failed drive?

Ask the RAID controller.

baby27784 · Jul 26, 2019

Chris Moore said:
You also lose the ability to figure out which disk is which because FreeNAS is not able to access the DRIVE, only the drive controller. Your best option here is to copy the data out of FreeNAS to some other storage and start over. You should NEVER use a hardware RAID controller with FreeNAS.

So if FreeNAS cannot detect a failed drive behind the RAID controller, why do I get these warnings?

CRITICAL:July 24, 2019, 3:57 a.m. Device: /dev/ciss0 [cciss_disk_02] [SCSI], SMART Failure: FAILURE PREDICTION THRESHOLD EXCEEDED: ascq=0x99
CRITICAL:July 24, 2019, 3:57 a.m. - Device: /dev/ciss0 [cciss_disk_03] [SCSI], SMART Failure: DATA CHANNEL IMPENDING FAILURE DATA ERROR RATE TOO HIGH

CRITICAL:July 25, 2019, 3:57 a. - The boot volume state is ONLINE: One or more devices has experienced an unrecoverable error. An attempt was made to correct the error. Applications are unaffected.

Chris Moore · Jul 26, 2019

baby27784 said:
CRITICAL:July 25, 2019, 3:57 a. - The boot volume state is ONLINE: One or more devices has experienced an unrecoverable error. An attempt was made to correct the error. Applications are unaffected.

Is your boot pool also on the hardware RAID controller? I don't know. You have not provided a detailed description of your hardware. Are the drives you are asking about in the first post the boot drives for the system? The drives may not be having problems, but FreeNAS may be seeing an error because of the RAID controller manipulating the data such that ZFS checksum data is not working properly. On the other hand, it is possible that the drives are going bad. Good luck figuring that out. It is the reason we have hardware recommendations saying to NOT use hardware RAID, even if you set each drive up as a RAID-0. Diagnosing problems is difficult enough without the added complexity and misleading information from a RAID controller.

baby27784 · Jul 26, 2019

Chris Moore said:
Is your boot pool also on the hardware RAID controller? I don't know. You have not provided a detailed description of your hardware. Are the drives you are asking about in the first post the boot drives for the system? The drives may not be having problems, but FreeNAS may be seeing an error because of the RAID controller manipulating the data such that ZFS checksum data is not working properly. On the other hand, it is possible that the drives are going bad. Good luck figuring that out. It is the reason we have hardware recommendations saying to NOT use hardware RAID, even if you set each drive up as a RAID-0. Diagnosing problems is difficult enough without the added complexity and misleading information from a RAID controller.

The boot device is a 16 GB SD card and is not on the RAID controller.
Those errors are not for the boot device.

And when I run smartctl -d cciss,1 -a /dev/ciss0, the SMART status is:

SMART support is: Available - device has SMART capability.
SMART support is: Enabled

I do not know anything about HBAs. Please advise and suggest an HBA compatible with the HP DL380 G7.

Elliot Dierksen · Jul 27, 2019

baby27784 said:
So if FreeNAS cannot detect a failed drive behind the RAID controller, why do I get these warnings?

CRITICAL:July 24, 2019, 3:57 a.m. Device: /dev/ciss0 [cciss_disk_02] [SCSI], SMART Failure: FAILURE PREDICTION THRESHOLD EXCEEDED: ascq=0x99

CRITICAL:July 24, 2019, 3:57 a.m. - Device: /dev/ciss0 [cciss_disk_03] [SCSI], SMART Failure: DATA CHANNEL IMPENDING FAILURE DATA ERROR RATE TOO HIGH

CRITICAL:July 25, 2019, 3:57 a. - The boot volume state is ONLINE: One or more devices has experienced an unrecoverable error. An attempt was made to correct the error. Applications are unaffected.

One of my old FreeNAS units was using a similar controller. The ciss driver is reporting the error, not the normal FreeNAS smart reporting system. I think most RAID controller drivers won't tell you that, but that one does. You would still be better off with an HBA as opposed to a RAID controller.
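For reference, the bracketed name in those alerts appears to carry the physical drive index: smartmontools labels a drive addressed with `-d cciss,N` as `cciss_disk_NN`. Below is a minimal sketch, assuming that naming holds on this system, that turns an alert line into the matching smartctl command. `alert_to_smartctl` is a hypothetical helper name, not part of FreeNAS or smartmontools.

```shell
#!/bin/sh
# Sketch only: alert_to_smartctl is a hypothetical helper, not a FreeNAS tool.
# Reads alert lines on stdin. For each line that names a cciss disk, prints
# the smartctl command that queries that physical drive.
# Assumption: "cciss_disk_NN" in the alert corresponds to "-d cciss,N".
alert_to_smartctl() {
    # Capture the controller device (e.g. /dev/ciss0) and the drive index,
    # dropping leading zeros from the index ("02" becomes "2").
    sed -n 's|.*Device: \(/dev/[^ ]*\) \[cciss_disk_0*\([0-9][0-9]*\)].*|smartctl -a -d cciss,\2 \1|p'
}
```

Piping a saved alert through the function, for example `alert_to_smartctl < alerts.txt` (a hypothetical file name), prints one command per affected drive. Running that command shows the serial number of the drive the alert refers to.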

baby27784 · Jul 27, 2019

Elliot Dierksen said:
One of my old FreeNAS units was using a similar controller. The ciss driver is reporting the error, not the normal FreeNAS smart reporting system. I think most RAID controller drivers won't tell you that, but that one does. You would still be better off with an HBA as opposed to a RAID controller.

The problem is that I don't know of an HBA that is compatible with the DL380 G7. When I search for one, I can't find any.
All the HBAs I find only have external ports. My server has 16 bays for 2.5-inch SAS HDDs.

Maybe I should consider another open-source solution.

baby27784 · Jul 27, 2019

The more I read, the more confused I get:

The FreeNAS® 11.2 User Guide (Release 11.2), on page 322, says:

"ZFS was designed for commodity disks so no RAID controller is needed. While ZFS can also be used with a RAID controller, it is recommended that the controller be put into JBOD mode so that ZFS has full control of the disks"

It says a RAID controller is not needed (it does not explicitly say "don't use one"), and that ZFS can also be used with a RAID controller.

And on page 110 it says:

To prevent problems, do not enable the S.M.A.R.T. service if the disks are controlled by a RAID controller. It is the job of the controller to monitor S.M.A.R.T. and mark drives as Predictive Failure when they trip.

Assuming I cannot get fully compatible hardware right now, what if I keep this setup, disable the S.M.A.R.T. service in FreeNAS, and let the RAID card handle SMART monitoring and alerting?

If I do that, what else do I lose?

Elliot Dierksen · Jul 28, 2019

baby27784 said:
The problem is that I don't know of an HBA that is compatible with the DL380 G7. When I search for one, I can't find any.
All the HBAs I find only have external ports. My server has 16 bays for 2.5-inch SAS HDDs.

Maybe I should consider another open-source solution.

You can use an LSI 9207-8i which is also available as an H220.

baby27784 · Jul 28, 2019

baby27784 said:
The more I read, the more confused I get:

The FreeNAS® 11.2 User Guide (Release 11.2), on page 322, says:

"ZFS was designed for commodity disks so no RAID controller is needed. While ZFS can also be used with a RAID controller, it is recommended that the controller be put into JBOD mode so that ZFS has full control of the disks"

It says a RAID controller is not needed (it does not explicitly say "don't use one"), and that ZFS can also be used with a RAID controller.

And on page 110 it says:

To prevent problems, do not enable the S.M.A.R.T. service if the disks are controlled by a RAID controller. It is the job of the controller to monitor S.M.A.R.T. and mark drives as Predictive Failure when they trip.

Assuming I cannot get fully compatible hardware right now, what if I keep this setup, disable the S.M.A.R.T. service in FreeNAS, and let the RAID card handle SMART monitoring and alerting?

If I do that, what else do I lose?

I can get a disk's serial number by running these commands in this order:

zpool status
glabel status
smartctl -d cciss,0 -a /dev/ciss0 | grep -i '^serial'

Is there any way to figure out the drive's physical location in the server?
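The serial lookup in the last command can be repeated for every physical drive index on the controller. The sketch below is one way to do that, not a tested FreeNAS procedure. `extract_serial` and `list_serials` are hypothetical helper names, and the loop assumes drive indexes are contiguous starting at 0, which may not hold on every controller. smartctl prints the field as `Serial number:`, so the match is done case-insensitively.

```shell
#!/bin/sh
# Sketch only: extract_serial and list_serials are hypothetical helper names.

# Read smartctl output on stdin and print the value of the
# "Serial number:" field (first match only).
extract_serial() {
    awk -F': *' 'tolower($1) == "serial number" { print $2; exit }'
}

# Query physical drives 0..(count-1) behind the given controller device
# and print one "cciss,N<TAB>serial" line per index.
# Assumption: indexes are contiguous from 0; empty slots print "unknown".
list_serials() {
    dev=$1
    count=$2
    i=0
    while [ "$i" -lt "$count" ]; do
        serial=$(smartctl -i -d "cciss,$i" "$dev" 2>/dev/null | extract_serial)
        printf 'cciss,%s\t%s\n' "$i" "${serial:-unknown}"
        i=$((i + 1))
    done
}
```

For a 16-bay chassis this could be called as `list_serials /dev/ciss0 16`. The output gives an index-to-serial list, but it does not by itself say which bay holds which index, so the serials still have to be matched against the labels on the drives.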

jgreco · Jul 28, 2019

baby27784 said:
I can get a disk's serial number by running these commands in this order:

zpool status
glabel status
smartctl -d cciss,0 -a /dev/ciss0 | grep -i '^serial'

Is there any way to figure out the drive's physical location in the server?

If the RAID controller isn't already showing an alarm on the drive, then it may be difficult to do via CLI.

Suggest: Pull the drives and inspect the labels. There's a reason some of us tag the front of the drives with the serial numbers.

Also, be aware, CCISS is known to be a problematic driver. You need to replace that with an HBA.
