LSI (Avago) 9207-8i with Seagate 10TB Enterprise (ST10000NM0016)

miip

Dabbler
Joined
Oct 7, 2017
Messages
15
And there is the first error after one week:

Code:
mpr0: Sending reset from mprsas_send_abort for target ID 6
	(da3:mpr0:0:6:0): WRITE(16). CDB: 8a 00 00 00 00 02 d5 9f 7f 40 00 00 00 08 00 00 length 4096 SMID 971 terminated ioc 804b scsi 0 state c xfer 0
mpr0: Unfreezing devq for target ID 6
(da3:mpr0:0:6:0): WRITE(16). CDB: 8a 00 00 00 00 02 d5 9f 7f 40 00 00 00 08 00 00
(da3:mpr0:0:6:0): CAM status: CCB request completed with an error
(da3:mpr0:0:6:0): Retrying command
(da3:mpr0:0:6:0): SYNCHRONIZE CACHE(10). CDB: 35 00 00 00 00 00 00 00 00 00
(da3:mpr0:0:6:0): CAM status: Command timeout
(da3:mpr0:0:6:0): Retrying command
(da3:mpr0:0:6:0): WRITE(16). CDB: 8a 00 00 00 00 02 d5 9f 7f 40 00 00 00 08 00 00
(da3:mpr0:0:6:0): CAM status: SCSI Status Error
(da3:mpr0:0:6:0): SCSI status: Check Condition
(da3:mpr0:0:6:0): SCSI sense: UNIT ATTENTION asc:29,0 (Power on, reset, or bus device reset occurred)
(da3:mpr0:0:6:0): Retrying command (per sense data)
	(da3:mpr0:0:6:0): SYNCHRONIZE CACHE(10). CDB: 35 00 00 00 00 00 00 00 00 00 length 0 SMID 594 terminated ioc 804b scsi 0 state c xfer 0
(da3:mpr0:0:6:0): SYNCHRONIZE CACHE(10). CDB: 35 00 00 00 00 00 00 00 00 00
(da3:mpr0:0:6:0): CAM status: CCB request completed with an error
(da3:mpr0:0:6:0): Retrying command
(da3:mpr0:0:6:0): SYNCHRONIZE CACHE(10). CDB: 35 00 00 00 00 00 00 00 00 00
(da3:mpr0:0:6:0): CAM status: SCSI Status Error
(da3:mpr0:0:6:0): SCSI status: Check Condition
(da3:mpr0:0:6:0): SCSI sense: UNIT ATTENTION asc:29,0 (Power on, reset, or bus device reset occurred)
(da3:mpr0:0:6:0): Error 6, Retries exhausted
(da3:mpr0:0:6:0): Invalidating pack
	(pass3:mpr0:0:6:0): ATA COMMAND PASS THROUGH(16). CDB: 85 06 2c 00 da 00 00 00 00 00 4f 00 c2 00 b0 00 length 0 SMID 893 terminated ioc 804b scsi 0 state c xfer 0
	(pass3:mpr0:0:6:0): ATA COMMAND PASS THROUGH(16). CDB: 85 06 2c 00 da 00 00 00 00 00 4f 00 c2 00 b0 00 length 0 SMID 266 terminated ioc 804b scsi 0 state c xfer 0
	(pass3:mpr0:0:6:0): ATA COMMAND PASS THROUGH(16). CDB: 85 08 0e 00 d0 00 01 00 00 00 4f 00 c2 00 b0 00 length 512 SMID 424 terminated ioc 804b scsi 0 state c xfer 0
	(pass3:mpr0:0:6:0): ATA COMMAND PASS THROUGH(16). CDB: 85 08 0e 00 d0 00 01 00 00 00 4f 00 c2 00 b0 00 length 512 SMID 265 terminated ioc 804b scsi 0 state c xfer 0
	(pass3:mpr0:0:6:0): ATA COMMAND PASS THROUGH(16). CDB: 85 08 0e 00 d5 00 01 00 06 00 4f 00 c2 00 b0 00 length 512 SMID 492 terminated ioc 804b scsi 0 state c xfer 0
	(pass3:mpr0:0:6:0): ATA COMMAND PASS THROUGH(16). CDB: 85 08 0e 00 d5 00 01 00 06 00 4f 00 c2 00 b0 00 length 512 SMID 682 terminated ioc 804b scsi 0 state c xfer 0
	(pass3:mpr0:0:6:0): ATA COMMAND PASS THROUGH(16). CDB: 85 08 0e 00 d5 00 01 00 01 00 4f 00 c2 00 b0 00 length 512 SMID 764 terminated ioc 804b scsi 0 state c xfer 0
	(pass3:mpr0:0:6:0): ATA COMMAND PASS THROUGH(16). CDB: 85 08 0e 00 d5 00 01 00 01 00 4f 00 c2 00 b0 00 length 512 SMID 524 terminated ioc 804b scsi 0 state c xfer 0
	(pass3:mpr0:0:6:0): ATA COMMAND PASS THROUGH(16). CDB: 85 06 2c 00 da 00 00 00 00 00 4f 00 c2 00 b0 00 length 0 SMID 417 terminated ioc 804b scsi 0 state c xfer 0
	(pass3:mpr0:0:6:0): ATA COMMAND PASS THROUGH(16). CDB: 85 06 2c 00 da 00 00 00 00 00 4f 00 c2 00 b0 00 length 0 SMID 572 terminated ioc 804b scsi 0 state c xfer 0
	(pass3:mpr0:0:6:0): ATA COMMAND PASS THROUGH(16). CDB: 85 08 0e 00 d0 00 01 00 00 00 4f 00 c2 00 b0 00 length 512 SMID 899 terminated ioc 804b scsi 0 state c xfer 0
	(pass3:mpr0:0:6:0): ATA COMMAND PASS THROUGH(16). CDB: 85 08 0e 00 d0 00 01 00 00 00 4f 00 c2 00 b0 00 length 512 SMID 916 terminated ioc 804b scsi 0 state c xfer 0
	(pass3:mpr0:0:6:0): ATA COMMAND PASS THROUGH(16). CDB: 85 08 0e 00 d5 00 01 00 06 00 4f 00 c2 00 b0 00 length 512 SMID 550 terminated ioc 804b scsi 0 state c xfer 0
	(pass3:mpr0:0:6:0): ATA COMMAND PASS THROUGH(16). CDB: 85 08 0e 00 d5 00 01 00 06 00 4f 00 c2 00 b0 00 length 512 SMID 999 terminated ioc 804b scsi 0 state c xfer 0
	(pass3:mpr0:0:6:0): ATA COMMAND PASS THROUGH(16). CDB: 85 08 0e 00 d5 00 01 00 01 00 4f 00 c2 00 b0 00 length 512 SMID 744 terminated ioc 804b scsi 0 state c xfer 0
	(pass3:mpr0:0:6:0): ATA COMMAND PASS THROUGH(16). CDB: 85 08 0e 00 d5 00 01 00 01 00 4f 00 c2 00 b0 00 length 512 SMID 371 terminated ioc 804b scsi 0 state c xfer 0
	(pass3:mpr0:0:6:0): ATA COMMAND PASS THROUGH(16). CDB: 85 06 2c 00 da 00 00 00 00 00 4f 00 c2 00 b0 00 length 0 SMID 999 terminated ioc 804b scsi 0 state c xfer 0
	(pass3:mpr0:0:6:0): ATA COMMAND PASS THROUGH(16). CDB: 85 06 2c 00 da 00 00 00 00 00 4f 00 c2 00 b0 00 length 0 SMID 744 terminated ioc 804b scsi 0 state c xfer 0
	(pass3:mpr0:0:6:0): ATA COMMAND PASS THROUGH(16). CDB: 85 08 0e 00 d0 00 01 00 00 00 4f 00 c2 00 b0 00 length 512 SMID 371 terminated ioc 804b scsi 0 state c xfer 0
	(pass3:mpr0:0:6:0): ATA COMMAND PASS THROUGH(16). CDB: 85 08 0e 00 d0 00 01 00 00 00 4f 00 c2 00 b0 00 length 512 SMID 508 terminated ioc 804b scsi 0 state c xfer 0
	(pass3:mpr0:0:6:0): ATA COMMAND PASS THROUGH(16). CDB: 85 08 0e 00 d5 00 01 00 06 00 4f 00 c2 00 b0 00 length 512 SMID 236 terminated ioc 804b scsi 0 state c xfer 0
	(pass3:mpr0:0:6:0): ATA COMMAND PASS THROUGH(16). CDB: 85 08 0e 00 d5 00 01 00 06 00 4f 00 c2 00 b0 00 length 512 SMID 356 terminated ioc 804b scsi 0 state c xfer 0
	(pass3:mpr0:0:6:0): ATA COMMAND PASS THROUGH(16). CDB: 85 08 0e 00 d5 00 01 00 01 00 4f 00 c2 00 b0 00 length 512 SMID 271 terminated ioc 804b scsi 0 state c xfer 0
	(pass3:mpr0:0:6:0): ATA COMMAND PASS THROUGH(16). CDB: 85 08 0e 00 d5 00 01 00 01 00 4f 00 c2 00 b0 00 length 512 SMID 211 terminated ioc 804b scsi 0 state c xfer 0
	(pass3:mpr0:0:6:0): ATA COMMAND PASS THROUGH(16). CDB: 85 06 2c 00 da 00 00 00 00 00 4f 00 c2 00 b0 00 length 0 SMID 592 terminated ioc 804b scsi 0 state c xfer 0
	(pass3:mpr0:0:6:0): ATA COMMAND PASS THROUGH(16). CDB: 85 06 2c 00 da 00 00 00 00 00 4f 00 c2 00 b0 00 length 0 SMID 477 terminated ioc 804b scsi 0 state c xfer 0
	(pass3:mpr0:0:6:0): ATA COMMAND PASS THROUGH(16). CDB: 85 08 0e 00 d0 00 01 00 00 00 4f 00 c2 00 b0 00 length 512 SMID 424 terminated ioc 804b scsi 0 state c xfer 0
	(pass3:mpr0:0:6:0): ATA COMMAND PASS THROUGH(16). CDB: 85 08 0e 00 d0 00 01 00 00 00 4f 00 c2 00 b0 00 length 512 SMID 266 terminated ioc 804b scsi 0 state c xfer 0
	(pass3:mpr0:0:6:0): ATA COMMAND PASS THROUGH(16). CDB: 85 08 0e 00 d5 00 01 00 06 00 4f 00 c2 00 b0 00 length 512 SMID 265 terminated ioc 804b scsi 0 state c xfer 0
	(pass3:mpr0:0:6:0): ATA COMMAND PASS THROUGH(16). CDB: 85 08 0e 00 d5 00 01 00 06 00 4f 00 c2 00 b0 00 length 512 SMID 492 terminated ioc 804b scsi 0 state c xfer 0
	(pass3:mpr0:0:6:0): ATA COMMAND PASS THROUGH(16). CDB: 85 08 0e 00 d5 00 01 00 01 00 4f 00 c2 00 b0 00 length 512 SMID 524 terminated ioc 804b scsi 0 state c xfer 0
	(pass3:mpr0:0:6:0): ATA COMMAND PASS THROUGH(16). CDB: 85 08 0e 00 d5 00 01 00 01 00 4f 00 c2 00 b0 00 length 512 SMID 764 terminated ioc 804b scsi 0 state c xfer 0
 

Ericloewe

Server Wrangler
Moderator
Joined
Feb 15, 2014
Messages
20,194
Sounds to me like these drives have serious firmware issues if they're acting like this.
 

saltmaster

Cadet
Joined
Nov 2, 2017
Messages
3
My assumption is that these drives are SMR drives (I'm pretty sure they are). Looking through the FreeBSD source code, I noticed:

Code:
	{
		/*
		 * Seagate Lamarr 8TB Shingled Magnetic Recording (SMR)
		 * Drive Managed SATA hard drive. This drive doesn't report
		 * in firmware that it is a drive managed SMR drive.
		 */
		{ T_DIRECT, SIP_MEDIA_FIXED, "ATA", "ST8000AS0002*", "*" },
		/*quirks*/DA_Q_SMR_DM
	},


Could it also be that these drives are not correctly reporting that they are drive-managed SMR? I remember reading somewhere that SMR drives issued commands in a certain pattern can potentially lock up (I can't remember where I read this, so I'm not 100% confident in that statement).
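
If anyone wants to check, a quick way to see which drives currently carry that quirk is to grep the da(4) quirk table in the installed source tree (this assumes /usr/src is populated on your FreeBSD/FreeNAS box):

Code:
# List the quirk-table entries in the da(4) driver that are flagged as drive-managed SMR.
# Assumes the FreeBSD source tree is installed under /usr/src.
grep -B 8 "DA_Q_SMR_DM" /usr/src/sys/cam/scsi/scsi_da.c | grep -E "T_DIRECT|DA_Q_SMR_DM"


If the ST10000NM0016 doesn't show up there, the da(4) driver has no special handling for it at all.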
 

miip

Dabbler
Joined
Oct 7, 2017
Messages
15
My assumption is that these drives are SMR drives (I'm pretty sure they are). Looking through the FreeBSD source code, I noticed:

No, the ST10000NM0016 is a PMR drive. I am currently considering switching to HGST's HUH721010ALE600. Is anyone using these?

And another thing about the errors: they always seem to happen when the drives are under no load. This never happened when there was actual load on the drives, like copying 10TB of data to the pool or while a ZFS scrub was running.
My feeling is that this is some kind of weird power-saving feature in the drive's firmware that is not waking the drive up fast enough.
It could also be some reused code in the FreeBSD driver for these controllers; they are not that different after all.
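
For what it's worth, the drive-side power-saving settings can also be queried and turned off directly with smartctl. This is just a sketch (the device name is an example, and I don't know whether Seagate's firmware actually honours APM):

Code:
# Query the drive's advertised power-management settings (da3 is just an example).
# Behind an mps/mpr HBA you may need to add "-d sat" if auto-detection fails.
smartctl -g apm /dev/da3       # Advanced Power Management level, if supported
smartctl -g aam /dev/da3       # Automatic Acoustic Management, if supported

# Disable APM and the standby (spin-down) timer for the current power cycle.
smartctl -s apm,off /dev/da3
smartctl -s standby,off /dev/da3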
 

Chris Moore

Hall of Famer
Joined
May 2, 2015
Messages
10,080
I know that iXsystems is shipping systems with 10TB hard drives and I know that one of those models is using the LSI/Broadcom SAS 3008 controller. I don't know what brand or model of disk they are using with it.
 

Ericloewe

Server Wrangler
Moderator
Joined
Feb 15, 2014
Messages
20,194
iX generally uses WD, AFAIK.
 

Chris Moore

Hall of Famer
Joined
May 2, 2015
Messages
10,080
This could also be a simple driver issue; the LSI 2008 and 2308 use the same driver in FreeBSD.

So far, no errors with the LSI 3008 controller.

BTW, where did you guys see that the LSI 2008 only supports drives under 8TB? I cannot find anything official about an 8TB limit. There were no issues for several weeks, so I don't think this is some compatibility issue.
It was in the documentation that came with one of the servers I manage, but it is not definitive, because an 8TB drive may simply have been the largest available at the time the documentation was written.
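
As a quick sanity check (assuming the card is the first instance, unit 0), you can confirm which driver actually claimed the HBA and what it reports about itself:

Code:
# SAS2004/2008/2108/2208/2308 HBAs attach to the mps(4) driver; SAS3xxx HBAs attach to mpr(4).
dmesg | egrep "^(mps|mpr)[0-9]"

# Per-device sysctls for the first instance of each driver, including firmware details.
sysctl dev.mps.0
sysctl dev.mpr.0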

Sent from my SAMSUNG-SGH-I537 using Tapatalk
 
Joined
Jan 18, 2017
Messages
525
My feeling is that this is some kind of weird power-saving feature in the drive's firmware that is not waking the drive up fast enough.
Hmmm, this is interesting to me. Do you have your drives spin down? I was looking at the spec sheet for these drives, and their spin-up time from standby is 20 to 30 seconds.
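
If you want to catch the drives in the act, smartctl can report whether a disk is spun down without waking it up (the device name is just an example):

Code:
# -n standby makes smartctl bail out instead of spinning the drive up, so it
# tells you whether da3 is currently active or in standby.
smartctl -i -n standby /dev/da3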
 

miip

Dabbler
Joined
Oct 7, 2017
Messages
15
Hmmm, this is interesting to me. Do you have your drives spin down? I was looking at the spec sheet for these drives, and their spin-up time from standby is 20 to 30 seconds.

Before it happened the first time, I had activated the energy-saving settings in FreeNAS for all drives. I have since deactivated everything; right now it is running without any energy saving in FreeNAS. But this doesn't seem to be normal energy-saving behaviour; it would happen far more frequently if it were.
 

mattlach

Patron
Joined
Oct 14, 2012
Messages
280
Hmm.

So, let me add a nugget of information from my side.

I am a former FreeNAS user. I used to run it in a VM on ESXi, but a while back I migrated from ESXi to KVM and LXC under Proxmox (Debian-based) for a variety of reasons, and the best way I got it to work was to export my FreeNAS pool and import it under ZFS on Linux running on bare metal on the Proxmox host.

I still lurk and ask questions on these forums, as they are a great resource on storage hardware (controllers and drives) even when not using FreeNAS.

Anyway, I have two IBM M1015s flashed to P20 IT mode. These run two RAIDZ2 vdevs of 6 drives each, two SSDs for L2ARC, and two SSDs (in a mirror) as a SLOG, 16 drives in total, with no expander in between, connected via a Norco backplane.

Over the last month or so I have started replacing, one by one, the twelve 4TB WD Reds I have been using since late 2013/early 2014 with 10TB Seagate ST10000NM0016 drives. Thus far I am 4 drives in, with 8 to go. Between the 5 days it takes to run badblocks on one of these bad boys, the 14 hours it takes me to resilver, and my busy work schedule, this has been a slow process. So the first drive has been in there about a month, the second about 3 weeks, the third about 2 weeks, and the most recent one about a week.

Under Linux, these SAS controllers use the mpt2sas driver. Here is everything in my dmesg regarding that driver since my last boot.

Code:
root@proxmox:~# dmesg |grep -i mpt2sas
[	1.327548] mpt2sas_cm0: 64 BIT PCI BUS DMA ADDRESSING SUPPORTED, total mem (198152268 kB)
[	1.383437] mpt2sas_cm0: MSI-X vectors supported: 1, no of cores: 24, max_msix_vectors: -1
[	1.383528] mpt2sas0-msix0: PCI-MSI-X enabled: IRQ 38
[	1.383531] mpt2sas_cm0: iomem(0x00000000fad3c000), mapped(0xffffc900192f8000), size(16384)
[	1.383532] mpt2sas_cm0: ioport(0x000000000000c000), size(256)
[	1.481320] mpt2sas_cm0: Allocated physical memory: size(7579 kB)
[	1.481324] mpt2sas_cm0: Current Controller Queue Depth(3364),Max Controller Queue Depth(3432)
[	1.481326] mpt2sas_cm0: Scatter Gather Elements per IO(128)
[	1.529427] mpt2sas_cm0: LSISAS2008: FWVersion(20.00.04.00), ChipRevision(0x03), BiosVersion(00.00.00.00)
[	1.529429] mpt2sas_cm0: Protocol=(
[	1.530071] mpt2sas_cm0: sending port enable !!
[	1.530684] mpt2sas_cm1: 64 BIT PCI BUS DMA ADDRESSING SUPPORTED, total mem (198152268 kB)
[	1.589643] mpt2sas_cm1: MSI-X vectors supported: 1, no of cores: 24, max_msix_vectors: 8
[	1.589714] mpt2sas1-msix0: PCI-MSI-X enabled: IRQ 42
[	1.589716] mpt2sas_cm1: iomem(0x00000000fa73c000), mapped(0xffffc900194b0000), size(16384)
[	1.589718] mpt2sas_cm1: ioport(0x000000000000b000), size(256)
[	1.688947] mpt2sas_cm1: Allocated physical memory: size(7579 kB)
[	1.688950] mpt2sas_cm1: Current Controller Queue Depth(3364),Max Controller Queue Depth(3432)
[	1.688951] mpt2sas_cm1: Scatter Gather Elements per IO(128)
[	1.737380] mpt2sas_cm1: LSISAS2008: FWVersion(20.00.04.00), ChipRevision(0x03), BiosVersion(00.00.00.00)
[	1.737382] mpt2sas_cm1: Protocol=(
[	1.738229] mpt2sas_cm1: sending port enable !!
[	3.100711] mpt2sas_cm0: host_add: handle(0x0001), sas_addr(0x500605b005f41a10), phys(8)
[	3.350829] mpt2sas_cm1: host_add: handle(0x0001), sas_addr(0x500605b004d5b160), phys(8)
[   12.225377] mpt2sas_cm1: port enable: SUCCESS
[   12.233355] mpt2sas_cm0: port enable: SUCCESS
[701774.651777] mpt2sas_cm1: removing handle(0x000d), sas_addr(0x4433221105000000)
[701774.651782] mpt2sas_cm1: removing : enclosure logical id(0x500605b004d5b160), slot(6)
[1486673.708734] mpt2sas_cm0: log_info(0x31110d00): originator(PL), code(0x11), sub_code(0x0d00)
[1486673.708742] mpt2sas_cm0: log_info(0x31110d00): originator(PL), code(0x11), sub_code(0x0d00)
[1486673.708746] mpt2sas_cm0: log_info(0x31110d00): originator(PL), code(0x11), sub_code(0x0d00)
[1486674.755463] mpt2sas_cm0: removing handle(0x0009), sas_addr(0x4433221100000000)
[1486674.755467] mpt2sas_cm0: removing : enclosure logical id(0x500605b005f41a10), slot(3)
[1924803.785697] mpt2sas_cm0: removing handle(0x000b), sas_addr(0x4433221101000000)
[1924803.785702] mpt2sas_cm0: removing : enclosure logical id(0x500605b005f41a10), slot(2)
[1924824.161995] mpt2sas_cm1: log_info(0x31110d00): originator(PL), code(0x11), sub_code(0x0d00)
[1924824.162004] mpt2sas_cm1: log_info(0x31110d00): originator(PL), code(0x11), sub_code(0x0d00)
[1924824.162009] mpt2sas_cm1: log_info(0x31110d00): originator(PL), code(0x11), sub_code(0x0d00)
[1924824.162013] mpt2sas_cm1: log_info(0x31110d00): originator(PL), code(0x11), sub_code(0x0d00)
[1924824.945398] mpt2sas_cm1: removing handle(0x000c), sas_addr(0x4433221104000000)
[1924824.945402] mpt2sas_cm1: removing : enclosure logical id(0x500605b004d5b160), slot(7)


(Looking at this, I just noticed that I'm on an older P20 firmware, 20.00.04.00, not the latest 20.00.07.00.)
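
For reference, the usual way to check and update these cards is LSI's sas2flash utility. Treat this as a sketch rather than a tested procedure; the file names below are just placeholders for whatever the P20 20.00.07.00 IT package ships with:

Code:
# Show every SAS2 HBA in the system with its current firmware and BIOS versions.
sas2flash -listall

# Flash new IT-mode firmware (and optionally the boot ROM) onto controller 0.
# "2118it.bin" and "mptsas2.rom" are placeholder names from the P20 package.
sas2flash -o -c 0 -f 2118it.bin -b mptsas2.rom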

The error messages in the log above appear - to me - to be tied to my removal of my old WD drives, not to the operation of the new Seagate drives.

Also, I have no errors in my zpool status:

Code:
root@proxmox:~# zpool status
  pool: rpool
state: ONLINE
  scan: scrub repaired 0 in 0h5m with 0 errors on Sun Oct  8 00:29:03 2017
config:

	NAME												 STATE	 READ WRITE CKSUM
	rpool												ONLINE	   0	 0	 0
	  mirror-0										   ONLINE	   0	 0	 0
		ata-Samsung_SSD_850_EVO_500GB_S21HNXAGCXXXXXX	ONLINE	   0	 0	 0
		ata-Samsung_SSD_850_EVO_500GB_S21HNXAGCXXXXXX	ONLINE	   0	 0	 0

errors: No known data errors

  pool: zfshome
state: ONLINE
  scan: resilvered 5.23T in 20h26m with 0 errors on Mon Nov  6 18:31:55 2017
config:

	NAME											 STATE	 READ WRITE CKSUM
	zfshome										  ONLINE	   0	 0	 0
	  raidz2-0									   ONLINE	   0	 0	 0
		ata-ST10000NM0016-1TT101_ZA21XXXX			ONLINE	   0	 0	 0
		ata-ST10000NM0016-1TT101_ZA21XXXX			ONLINE	   0	 0	 0
		ata-WDC_WD40EFRX-68WT0N0_WD-WCC4EXXXXXXX	 ONLINE	   0	 0	 0
		ata-WDC_WD40EFRX-68WT0N0_WD-WCC4EXXXXXXX	 ONLINE	   0	 0	 0
		ata-WDC_WD40EFRX-68WT0N0_WD-WCC4EXXXXXXX	 ONLINE	   0	 0	 0
		ata-WDC_WD40EFRX-68WT0N0_WD-WCC4EXXXXXXX	 ONLINE	   0	 0	 0
	  raidz2-1									   ONLINE	   0	 0	 0
		ata-ST10000NM0016-1TT101_ZA21XXXX			ONLINE	   0	 0	 0
		ata-ST10000NM0016-1TT101_ZA21XXXX			ONLINE	   0	 0	 0
		ata-WDC_WD40EFRX-68WT0N0_WD-WCC4EXXXXXXX	 ONLINE	   0	 0	 0
		ata-WDC_WD40EFRX-68WT0N0_WD-WCC4EXXXXXXX	 ONLINE	   0	 0	 0
		ata-WDC_WD40EFRX-68WT0N0_WD-WCC4EXXXXXXX	 ONLINE	   0	 0	 0
		ata-WDC_WD40EFRX-68WT0N0_WD-WCC4EXXXXXXX	 ONLINE	   0	 0	 0
	logs
	  mirror-2									   ONLINE	   0	 0	 0
		ata-INTEL_SSDSC2BA100G3_BTTV425XXXXX100FGN   ONLINE	   0	 0	 0
		ata-INTEL_SSDSC2BA100G3_BTTV425XXXXX100FGN   ONLINE	   0	 0	 0
	cache
	  ata-Samsung_SSD_850_PRO_512GB_S250NXXXXXXXXXX  ONLINE	   0	 0	 0
	  ata-Samsung_SSD_850_PRO_512GB_S250NXXXXXXXXXX  ONLINE	   0	 0	 0

errors: No known data errors


Now, granted, my pool workload is not terribly write-intensive (usually no more than several GB per day, if I had to guess), so I might just not have teased out the issue, but at the very least the drives should have seen a lot of write activity during resilvering.

So, with the caveats noted above (older firmware revision, not a terribly write-intensive workload), it looks like I have not experienced this issue under Linux, which suggests it may have something to do with BSD, the BSD driver, the BSD implementation of ZFS, or something else in FreeNAS that is absent from my Debian-based Proxmox setup.

I will do some sort of heavy write test later tonight to confirm, but this is what I have for now.
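
The test I have in mind is something rough like the following (the dataset path and sizes are placeholders, and /dev/urandom is used because zeros would just be compressed away by ZFS):

Code:
# Simple sequential write soak against the pool; path and size are placeholders.
dd if=/dev/urandom of=/zfshome/writetest.bin bs=1M count=51200 status=progress

# Or with fio, several parallel sequential writers:
fio --name=soak --directory=/zfshome --rw=write --bs=1M --size=50G \
    --numjobs=4 --ioengine=psync --group_reporting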
 

miip

Dabbler
Joined
Oct 7, 2017
Messages
15
That is interesting; I suspected an issue with the driver.

I will do some sort of heavy write test later tonight to confirm, but this is what I have for now.

I found that the errors appear when there is little to no load on the drives. So some heavy write tests could be counterproductive.
 

mattlach

Patron
Joined
Oct 14, 2012
Messages
280
That is interesting; I suspected an issue with the driver.



I found that the errors appear when there is little to no load on the drives. So some heavy write tests could be counterproductive.

Hmm. Yeah.

That would suggest that maybe something is timing out or going to sleep during periods of low activity. Power-saving and spin-down settings are an area I know next to nothing about.
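
A crude way to test that theory would be to log the reported power state of one of the Seagates every minute while the pool sits idle, then line the timestamps up against the kernel log when the next error shows up. Just a sketch, with an example device name (it would be /dev/sdX on my Linux box, /dev/daX on FreeNAS):

Code:
# Log whether da3 is awake or in standby once a minute, without waking it.
# smartctl exits 0 when the drive is up and non-zero when -n standby skips a sleeping drive.
while true; do
    printf '%s ' "$(date '+%F %T')"
    if smartctl -n standby -i /dev/da3 > /dev/null; then
        echo "awake"
    else
        echo "standby (or smartctl error)"
    fi
    sleep 60
done >> /var/log/da3-powermode.log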
 

miip

Dabbler
Joined
Oct 7, 2017
Messages
15
I got two more errors this week, but ZFS was not affected by them, not even a checksum error. I could live with that, as long as my pool stays intact.

Update: Sometimes it causes a drive to be thrown out of the zpool even with the new controller; sometimes the error is just informational. So, no real improvement.
 

saltmaster

Cadet
Joined
Nov 2, 2017
Messages
3
My 9300-8e arrived; I connected the drives and everything is working perfectly. I have copied (and scrubbed) around 20TB of data with no issues.
 

AMiGAmann

Contributor
Joined
Jun 4, 2015
Messages
106
I was thinking about upgrading my drives from the existing 6TB models to 10TB Seagates. Then I found this thread about the Seagate Enterprise ST10000NM0016 and another thread about the Seagate IronWolf ST10000VN0004, and I am quite shocked. It seems that neither 10TB model is compatible with my LSI 2308.

The source of the problems (drive firmware, driver, or something else) does not seem to be clear yet, is that right?
 

Chris Moore

Hall of Famer
Joined
May 2, 2015
Messages
10,080
The source of the problems (drive firmware, driver, or something else) does not seem to be clear yet, is that right?

The OP, @leveche, has not responded to the thread recently; I wonder how they are getting on. Others who were affected (@miip) last said that the few errors they were getting did not appear to be significant. I don't know where the solution lies. I know that iXsystems (the creators of FreeNAS) are shipping systems with 10TB hard drives, and they are using Western Digital drives. I don't want to tell you to put your data at risk, but I think the risk is small, so you might give it a try. One theory is that the problem is caused by the firmware in the Seagate drives, but I don't know if I would agree with that. The largest drives we are using in servers at work are the 6TB WD Red Pro and the 8TB HGST helium-filled drives, and we have seen nothing like this from either of those.
Sorry, no definitive answer for you.
 

miip

Dabbler
Joined
Oct 7, 2017
Messages
15
Others who were affected (@miip) last said that the few errors they were getting did not appear to be significant.

I just updated my previous post; the errors are still significant. The other thread about the IronWolf drives is very interesting; it's exactly the same issue.
 
Joined
Jan 18, 2017
Messages
525
The IronWolf drives have Seagate's PowerChoice tech on them too, according to their manual. I haven't been able to figure out how to disable the function, but I also don't have one to test with.
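
For anyone who does have one to test with, Seagate's openSeaChest tools are supposed to expose the PowerChoice (EPC) settings. I haven't verified the exact option names against a real drive, so the flags below are assumptions taken from the tool's documentation rather than a tested recipe:

Code:
# List attached drives and their handles (openSeaChest_PowerControl is part of Seagate's openSeaChest suite).
openSeaChest_PowerControl --scan

# Disable the EPC (PowerChoice) feature on one drive.
# The device handle and the --EPCfeature option name are assumptions, not verified here.
openSeaChest_PowerControl -d /dev/da3 --EPCfeature disable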
 

krazos

Dabbler
Joined
Nov 11, 2017
Messages
15
Hi,

I'm having the exact same problem. I have 5x 10TB IronWolf (ST10000VN0004) drives and recently made a thread about this as well.

I have tested with a SAS2308 controller and an LSI 9211-8i with the latest IT firmware, but the problems persist.
 

Chris Moore

Hall of Famer
Joined
May 2, 2015
Messages
10,080
Has anyone contacted Seagate about the issue?
 