Brand New Disk Errors?

TAC

Contributor
Joined
Feb 16, 2014
Messages
152
I just installed a few brand new disks in my system and see the following errors on one of the drives (da5). What would be the best way to test this drive and, if it's truly faulty, to provide information to Toshiba for a replacement? The drive is a NAS N300 6TB.

Code:
Aug  5 14:57:21 freenas     (da5:mps0:0:9:0): READ(10). CDB: 28 00 c5 95 e7 b0 00 00 08 00 length 4096 SMID 517 terminated ioc 804b loginfo 31110d00 scsi 0 state c xfer 0
Aug  5 14:57:21 freenas     (da5:mps0:0:9:0): READ(10). CDB: 28 00 1d 2a 80 18 00 00 40 00 length 32768 SMID 156 terminated ioc 804b loginfo 31110d00 scsi 0 state c xfer 0
Aug  5 14:57:21 freenas (da5:mps0:0:9:0): READ(10). CDB: 28 00 c5 95 e7 b0 00 00 08 00
Aug  5 14:57:21 freenas     (da5:mps0:0:9:0): READ(10). CDB: 28 00 1d 2a 80 58 00 00 40 00 length 32768 SMID 189 terminated ioc 804b loginfo 31110d00 scs(da5:mps0:0:9:0): CAM status: CCB request completed with an error
Aug  5 14:57:21 freenas i 0 state c xfer 0
Aug  5 14:57:21 freenas     (da5:mps0:0:9:0): WRITE(10). CDB: 2a 00 ba eb 68 e8 00 00 18 00 length 12288 SMID 122 terminated ioc 804b loginfo 31110d00 sc(da5:si 0 state c xfer 0
Aug  5 14:57:21 freenas mps0:0:9:0):     (da5:mps0:0:9:0): READ(10). CDB: 28 00 1d 2a 80 98 00 00 40 00 length 32768 SMID 624 terminated ioc 804b loginfo 31110d00 scsRetrying command
Aug  5 14:57:21 freenas i 0 state c xfer 0
Aug  5 14:57:21 freenas (da5:mps0:0:9:0): READ(10). CDB: 28 00 1d 2a 80 18 00 00 40 00
Aug  5 14:57:21 freenas (da5:mps0:0:9:0): CAM status: CCB request completed with an error
Aug  5 14:57:21 freenas (da5:mps0:0:9:0): Retrying command
Aug  5 14:57:21 freenas (da5:mps0:0:9:0): READ(10). CDB: 28 00 1d 2a 80 58 00 00 40 00
Aug  5 14:57:21 freenas (da5:mps0:0:9:0): CAM status: CCB request completed with an error
Aug  5 14:57:21 freenas (da5:mps0:0:9:0): Retrying command
Aug  5 14:57:21 freenas (da5:mps0:0:9:0): WRITE(10). CDB: 2a 00 ba eb 68 e8 00 00 18 00
Aug  5 14:57:21 freenas (da5:mps0:0:9:0): CAM status: CCB request completed with an error
Aug  5 14:57:21 freenas (da5:mps0:0:9:0): Retrying command
Aug  5 14:57:21 freenas (da5:mps0:0:9:0): READ(10). CDB: 28 00 1d 2a 80 98 00 00 40 00
Aug  5 14:57:21 freenas (da5:mps0:0:9:0): CAM status: CCB request completed with an error
Aug  5 14:57:21 freenas (da5:mps0:0:9:0): Retrying command
Aug  5 14:57:21 freenas (da5:mps0:0:9:0): READ(10). CDB: 28 00 65 9c dd 30 00 00 08 00
Aug  5 14:57:21 freenas (da5:mps0:0:9:0): CAM status: SCSI Status Error
Aug  5 14:57:21 freenas (da5:mps0:0:9:0): SCSI status: Check Condition
Aug  5 14:57:21 freenas (da5:mps0:0:9:0): SCSI sense: UNIT ATTENTION asc:29,0 (Power on, reset, or bus device reset occurred)
Aug  5 14:57:21 freenas (da5:mps0:0:9:0): Retrying command (per sense data)
Aug  5 14:57:22 freenas (da5:mps0:0:9:0): READ(10). CDB: 28 00 1d 2b 67 18 00 00 40 00
Aug  5 14:57:22 freenas (da5:mps0:0:9:0): CAM status: SCSI Status Error
Aug  5 14:57:22 freenas (da5:mps0:0:9:0): SCSI status: Check Condition
Aug  5 14:57:22 freenas (da5:mps0:0:9:0): SCSI sense: UNIT ATTENTION asc:29,0 (Power on, reset, or bus device reset occurred)
Aug  5 14:57:22 freenas (da5:mps0:0:9:0): Retrying command (per sense data)
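
For anyone reading the log above, the CDB bytes can be decoded by hand. Here is a minimal Python sketch, assuming the standard 10-byte READ(10)/WRITE(10) layout (opcode, flags, 4-byte big-endian LBA, group number, 2-byte transfer length, control) and 512-byte logical blocks. The 512-byte assumption agrees with the log, where a transfer of 8 blocks is reported as `length 4096`. `decode_cdb10` is a hypothetical helper name, not part of any FreeBSD tool.

```python
def decode_cdb10(cdb_hex, block_size=512):
    """Decode a 10-byte READ(10)/WRITE(10) CDB given as space-separated hex.

    block_size is an assumption (512-byte logical blocks).
    """
    cdb = bytes.fromhex(cdb_hex)
    if len(cdb) != 10:
        raise ValueError("expected a 10-byte CDB")

    opcodes = {0x28: "READ(10)", 0x2A: "WRITE(10)"}
    if cdb[0] not in opcodes:
        raise ValueError("not a READ(10)/WRITE(10) opcode: 0x%02x" % cdb[0])

    lba = int.from_bytes(cdb[2:6], "big")     # bytes 2..5: logical block address
    blocks = int.from_bytes(cdb[7:9], "big")  # bytes 7..8: transfer length in blocks
    return {
        "op": opcodes[cdb[0]],
        "lba": lba,
        "blocks": blocks,
        "bytes": blocks * block_size,
    }


if __name__ == "__main__":
    # CDBs copied from the log lines above
    for cdb in ("28 00 c5 95 e7 b0 00 00 08 00",
                "2a 00 ba eb 68 e8 00 00 18 00"):
        print(decode_cdb10(cdb))
```

The byte counts it computes (4096 for the first read, 12288 for the write) match the `length` values in the log. The failed commands also hit widely separated addresses, which may fit a cable or link problem better than one bad patch on the disk, though that alone proves nothing.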

 

TAC

Contributor
Joined
Feb 16, 2014
Messages
152
Guess it doesn't matter anymore since the drive is now off-line. Glad I'm running RAIDZ2.
 

TAC

Contributor
Joined
Feb 16, 2014
Messages
152
I'll fiddle with the cables.
 

TAC

Contributor
Joined
Feb 16, 2014
Messages
152
This drive is plugged into an IBM ServeRAID M1015 SAS/SATA controller. What do you think would be the best way to troubleshoot whether it's the cable or the controller? The board has two bundles of cables, each one going to 4 disks.
 

Chris Moore

Hall of Famer
Joined
May 2, 2015
Messages
10,080
You already put data on these new disks?
 

TAC

Contributor
Joined
Feb 16, 2014
Messages
152
@danb35 thanks for the input. I powered down my system, re-seated both cables from the M1015 and the SATA cable from da5, and now the drive is back online. What do you think, cable or controller? I guess I'll wait and see if FreeNAS starts complaining again.

Come to think of it I was getting some random errors on the old 4 TB drive I removed from this slot before I put in the 6 TB drive.
 

TAC

Contributor
Joined
Feb 16, 2014
Messages
152
You already put data on these new disks?
Yeah, the disks are in the server and I've 'expanded' the data onto this second vdev. I figured since I'm running RAIDZ2 the chances of getting in big trouble with the new disks were slim. lol Go ahead, let me have it. ;-)
 

TAC

Contributor
Joined
Feb 16, 2014
Messages
152
@danb35 in my system I have two spare SATA ports on the motherboard. If I unplug disk da5 from the IBM M1015 controller and plug it into a motherboard SATA port, should it show up backing my vdev as da5 with no problems for the vdev or pool?

Two questions:
1) Are all disk names (da0, da1, ada0, etc) associated with individual drive serial numbers, i.e. could you shuffle them around to different SATA cables without causing any problems to your data pool / vdevs?
2) Per my issue above, since disk da5 was offline for a while, why, when I got the disk back online, didn't the system have to resilver disk da5? Some data had to be written to the pool during that time, since I did watch a show on Plex. Could it have just been that any writes to my 'Data1' pool went to the other vdev that da5 wasn't part of?
 

anmnz

Patron
Joined
Feb 17, 2018
Messages
286
could you shuffle them around to different SATA cables without causing any problems to your data pool
Yes. ZFS identifies a disk not by where it's plugged in but by metadata it has written to the disk. You can shuffle them around no problem.

why when I got the disk back online didn't the system have to resilver disk da5
Are you sure it didn't just do it too quickly for you to notice? It wouldn't have to rewrite all the data on the disk, unlike with some other RAID systems. It would just have to catch up with writes that had happened while it was offline.

Could it have just been that any writes to my 'Data1' pool went to the other vdev that da5 wasn't part of?
If you've added a new empty vdev to your pool then writes would go preferentially to the new vdev, so if this disk is part of the new vdev (which is what it sounds like) then it's unlikely writes would have gone elsewhere.

(Hm, unless ZFS noticed the vdev was degraded through missing a disk and decided to avoid writing to it... I don't think it behaves that way though.)
 

Chris Moore

Hall of Famer
Joined
May 2, 2015
Messages
10,080
Hm, unless ZFS noticed the vdev was degraded through missing a disk and decided to avoid writing to it... I don't think it behaves that way though
You are correct, ZFS continues to use a degraded vdev as normal. If the pool is resilvering, overall performance of the pool is impacted, but all vdevs are still actively used.
Are all disk names (da0, da1, ada0, etc) associated with individual drive serial numbers
Not in FreeNAS. There are other operating systems that establish the pool based on disk device number, but FreeNAS uses the GPTID of the partition to identify the members of a vdev. You can shuffle the disks like a deck of cards and connect them to SAS or SATA, and FreeNAS will still find and use each disk as long as there is a supported hardware data path between the OS and the drive. The da0, ada0 numbers are not even always associated with the same disk from one power cycle to the next. They are assigned in the order that the disks come ready as the OS is booting and are not related to the SAS or SATA port that the drive is connected to. There are times when the da# / ada# will coincide with the SAS or SATA port, but not always, and it can change, so it is important that you don't think of them as being connected.
when I got the disk back online didn't the system have to resilver disk da5?
Look at the output of zpool status. It will tell you if there was a resilver, as long as nothing else happened since. You can remove a disk and reconnect it and the pool will recover automatically. I am sure there is some point at which too much has changed and it won't, but I have had disks out for an hour and put them back in with no trouble.
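
To see the gptid-to-device mapping described above on a live system, `glabel status` on FreeBSD lists each label next to the partition it currently sits on. Here is a rough Python sketch that parses that kind of output. The sample text and its gptid values are made up for illustration, and `parse_glabel_status` and `disk_of` are hypothetical helper names.

```python
import re

# Invented sample in the general shape of `glabel status` output
# (columns: Name, Status, Components). The gptid values are placeholders.
SAMPLE = """\
                                      Name  Status  Components
gptid/11111111-aaaa-11e8-bbbb-000000000001     N/A  da5p2
gptid/22222222-aaaa-11e8-bbbb-000000000002     N/A  ada0p2
"""


def parse_glabel_status(text):
    """Return {gptid label: partition device} for every gptid/ row."""
    mapping = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) == 3 and fields[0].startswith("gptid/"):
            mapping[fields[0]] = fields[2]
    return mapping


def disk_of(partition):
    """Strip a trailing pN partition suffix, e.g. 'da5p2' -> 'da5'."""
    match = re.match(r"^([a-z]+\d+)p\d+$", partition)
    return match.group(1) if match else partition


if __name__ == "__main__":
    for label, part in parse_glabel_status(SAMPLE).items():
        print(label, "->", disk_of(part))
```

Run before and after moving a disk to another port, the gptid column should stay the same while the da#/ada# side is free to change, which is the point made above.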
 

Chris Moore

Hall of Famer
Joined
May 2, 2015
Messages
10,080
PS. The reliable way to identify a disk is by the serial number. In most situations, you can just look at the last four digits of the serial number as that is usually unique. I have only run into one instance where I had a duplicate last four digits out of hundreds of drives.
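
If you want to check whether the last-four-digits shortcut is safe for your own set of drives, here is a small Python sketch. The serial numbers are invented placeholders, not real drive serials, and `short_label_collisions` is a hypothetical helper name.

```python
from collections import defaultdict


def short_label_collisions(serials, digits=4):
    """Group serials by their last `digits` characters and return only
    the suffixes shared by more than one drive."""
    groups = defaultdict(list)
    for serial in serials:
        groups[serial[-digits:]].append(serial)
    return {suffix: members for suffix, members in groups.items()
            if len(members) > 1}


if __name__ == "__main__":
    # Placeholder serials for illustration only
    serials = ["X8A1K0ABC123", "X8A1K0ABD124", "Y7B2K0ZZC123"]
    clashes = short_label_collisions(serials)
    if clashes:
        print("Ambiguous short labels:", clashes)
    else:
        print("Last", 4, "characters are unique for every drive")
```

An empty result means the short label is enough to tell the drives apart; otherwise use more characters on the labels for the drives that clash.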
 

TAC

Contributor
Joined
Feb 16, 2014
Messages
152
To troubleshoot I think I'll swap the two 4-way drive cables on the IBM M1015 SATA controller and see if the issue follows the cable. If not I'll move the da5 disk to a spare SATA port on my MoBo and order a new M1015 controller.

I'm leaning toward a cable issue, since I think losing just one of eight SATA ports on such a highly integrated controller card is unlikely.

With a little luck maybe re-seating the two cables already did the job.
 

tfran1990

Patron
Joined
Oct 18, 2017
Messages
294
Also note, if you think you have a bad port, there is a good chance there will be intermittent problems on the 4 discs connected to that port (talking about the M1015).

Is it just me or is it a Bi*** to get the 8087 end out of the m1015?
 

Chris Moore

Hall of Famer
Joined
May 2, 2015
Messages
10,080
How is a GPTID related to a particular drive?
The GPTID is the name of the partition. The partition is recorded on the drive. You can think of the GPTID as being connected to the drive, but it can change if the partition is deleted and a new partition is created. The definitive identification of the drive is the serial number.
In one of the first FreeNAS systems I built, I put labels on the sides of the drives, so I could easily tell which drive was which, like this:
20160124_090832.png
Later, when I had a system with hot-swap bays, I put labels on the front of the drive trays like this:
20171005_210057.png
The serial number of the drive is the thing to go by when identifying drives. It will never change.

You might want to have a look at some of these system monitoring scripts:

GitHub repository for FreeNAS scripts, including disk burnin
https://www.ixsystems.com/community...for-freenas-scripts-including-disk-burnin.28/
-
Original @Bidule0hm - Scripts to report SMART, ZPool and UPS status, HDD/CPU T°, HDD identification and backup the config
https://www.ixsystems.com/community/threads/scripts-to-report-smart-zpool-and-ups-status-hdd-cpu-t°-hdd-identification-and-backup-the-config.27365/
 

Chris Moore

Hall of Famer
Joined
May 2, 2015
Messages
10,080
Is it just me or is it a Bi*** to get the 8087 end out of the m1015?
Some connectors are smoother than others. I have one cable that is almost impossible to get in or out of the connector, and for me it is the cable, not the card. I have an easier time with the cables at work, but they were probably more expensive; I tend to buy the bargain items for home.
 