Samsung 850 EVO: "disk busy" weirdness?

Status
Not open for further replies.

chuegen

Cadet
Joined
Sep 11, 2017
Messages
6
Hi all,

I'm seeing some odd behavior and I'm not sure whether it's behind the performance problems I'm having.

Problem: two of the SSDs (da0, da1) report high "disk busy" numbers compared to the other six (da2 through da7) under the same load, and I can't seem to get the performance I expect out of the configuration. On a "dd if=/dev/zero of=/mnt/DATASSD/NAS-VMD/freenas/outfile bs=1M count=262144", gstat shows da0 and da1 at 80%+ busy while the rest of the drives sit around 30%.
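
For reference, roughly how that load is generated and watched (dataset path from my setup; gstat flags per gstat(8) on FreeBSD 11; note a stream of zeroes will largely compress away if lz4 is enabled on the dataset):
Code:
# write a 256 GiB stream into the pool in the background
dd if=/dev/zero of=/mnt/DATASSD/NAS-VMD/freenas/outfile bs=1M count=262144 &

# watch per-disk %busy once per second, physical providers only, da0-da7
gstat -p -I 1s -f 'da[0-7]$'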

Dell T320, running FreeNAS 11.0-U3
32 GB RAM, 8x Samsung 850 EVO 1TB SSDs on a Dell PERC controller flashed to "IT" (passthrough) mode
ZFS pool set up as 4 mirrored pairs
10 GbE twinax to the switch, 10 GbE twinax to the compute nodes

NFS shares mounted as:
a.b.c.d:/mnt/DATASSD/NAS-VMD on /mnt/pve/NAS-VMD type nfs (rw,relatime,vers=3,rsize=131072,wsize=131072,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,mountaddr=a.b.c.d,mountvers=3,mountport=x,mountproto=udp,local_lock=none,addr=a.b.c.d)
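
For reference, the dataset settings behind that export can be checked with something like the following (I'm assuming the export path maps to a dataset named DATASSD/NAS-VMD; the sync setting matters because NFS commits can force writes through the ZIL):
Code:
zfs get sync,compression,recordsize DATASSD/NAS-VMD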

Boot time drive info for da0 and da2:
Code:
da0 at mps0 bus 0 scbus0 target 4 lun 0
da0: <ATA Samsung SSD 850 1B6Q> Fixed Direct Access SPC-3 SCSI device
da0: Serial Number S246NXAG602332H
da0: 600.000MB/s transfers
da0: Command Queueing enabled
da0: 953869MB (1953525168 512 byte sectors)
da0: quirks=0x8<4K>
GEOM_ELI: Device da0p1.eli created.
da2 at mps0 bus 0 scbus0 target 6 lun 0
da2: <ATA Samsung SSD 850 1B6Q> Fixed Direct Access SPC-3 SCSI device
da2: Serial Number S33FNCAH502074V
da2: 600.000MB/s transfers
da2: Command Queueing enabled
da2: 953869MB (1953525168 512 byte sectors)
da2: quirks=0x8<4K>
GEOM_ELI: Device da2p1.eli created.


volumes.PNG


The only difference seems to be that da0 and da1 may come from a different production lot, despite being the same model and firmware:

disks.PNG


smartinfo.PNG


During a data migration via NFS from a couple of nodes, here's what the performance metrics look like -- first, "disk busy":

gstat.PNG

diskbusy.PNG


But as you can see, the disks are handling the same number of operations and the same amount of transfer:

diskops.PNG

diskio.PNG



Any thoughts? Would this higher "disk busy" percentage be holding me up? And if so, any ideas as to why da0 and da1 would be the only two drives showing double or triple the busy% for the same transactions?

Thanks,
-c
 

Stux

MVP
Joined
Jun 2, 2016
Messages
4,419
The drives are part of a separate mirror. They may be seeing more activity, or have in the past.

Or maybe they're warmer and throttling?

FWIW, you may benefit from a high-performance NVMe SLOG.
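
If you go that route, the rough shape of it from the CLI is below (nvd0p1 is just a placeholder device name; on FreeNAS you would normally add the log device through the GUI so it gets partitioned and labeled properly):
Code:
# attach a fast, power-loss-protected NVMe partition as a separate log (SLOG) vdev
zpool add DATASSD log nvd0p1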
 

Arwen

MVP
Joined
May 17, 2014
Messages
3,611
Please post the output of zpool list -v for the affected pool.

ZFS will attempt to balance the writes so each mirror pair has about the same amount of blocks used.
That said, I don't know why it would only affect half of a mirror pair. (Or 2 mirror pairs.)
 

Stux

MVP
Joined
Jun 2, 2016
Messages
4,419
Please post the output of zpool list -v for the affected pool.

ZFS will attempt to balance the writes so each mirror pair has about the same amount of blocks used.
That said, I don't know why it would only affect half of a mirror pair. (Or 2 mirror pairs.)

These being EVOs, I thought it might have to do with the first mirror having received more writes in the past, and thus being closer to steady state (i.e. slower) than the rest of the drives.

Alternatively, is it possible the ZIL is only being written to the first vdev?
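
One way to sanity-check that would be to watch the per-vdev write distribution while the load is running, e.g.:
Code:
# per-vdev ops and bandwidth, refreshed every second
zpool iostat -v DATASSD 1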
 

chuegen

Cadet
Joined
Sep 11, 2017
Messages
6
Here is the output of zpool list -v:

Code:
NAME                                             SIZE  ALLOC   FREE  EXPANDSZ   FRAG    CAP  DEDUP  HEALTH  ALTROOT
DATASSD                                         3.62T   884G  2.76T         -    15%    23%  1.00x  ONLINE  /mnt
  mirror                                         928G   221G   707G         -    15%    23%
    gptid/848c802e-9733-11e7-bd11-74867ae04eaa      -      -      -         -      -      -
    gptid/84c9bea5-9733-11e7-bd11-74867ae04eaa      -      -      -         -      -      -
  mirror                                         928G   221G   707G         -    15%    23%
    gptid/a474f984-9733-11e7-bd11-74867ae04eaa      -      -      -         -      -      -
    gptid/a4b1774d-9733-11e7-bd11-74867ae04eaa      -      -      -         -      -      -
  mirror                                         928G   221G   707G         -    15%    23%
    gptid/ba97e9eb-9733-11e7-bd11-74867ae04eaa      -      -      -         -      -      -
    gptid/bad8b254-9733-11e7-bd11-74867ae04eaa      -      -      -         -      -      -
  mirror                                         928G   221G   707G         -    15%    23%
    gptid/cd0b31ef-9733-11e7-bd11-74867ae04eaa      -      -      -         -      -      -
    gptid/cd4b50cf-9733-11e7-bd11-74867ae04eaa      -      -      -         -      -      -


All drives were new and installed at the same time, although I tried a different configuration first -- two 4-drive RAIDZ1 vdevs, da0-da3 and da4-da7. da0 and da1 still exhibited the same higher latency / disk busy percentage.

Delete latency for da0 and da1 seems to be 2-3x higher than for the rest of the drives (da2-da7 average 1.2 ms, da0-da1 average over 3 ms).
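
For what it's worth, the live per-disk delete (TRIM) latency can be watched with gstat's -d flag, which adds BIO_DELETE statistics:
Code:
gstat -dp -I 5s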

I find it odd that the two slow drives have a serial number format quite different from the other six SSDs.

I have to admit I'm not familiar with what an NVMe SLOG might buy me; I'll go do some reading.
 

Arwen

MVP
Joined
May 17, 2014
Messages
3,611
Well, that output removes a potential cause.

I think I agree that the two affected drives have something different internally. My own Samsung 840 EVO 1TB started getting slower because it's a multi-level cell type with a firmware bug. Once the firmware was updated, most (but not all) of the speed came back.

PS: Note that triple-level (and higher) flash devices basically use analog bit detection: a single cell is not just on or off, but can hold in-between voltage levels that encode additional bits. Constant reading (without over-writing) causes these cells to lose a little charge, so the firmware has to account for that and eventually re-write or relocate the block to maintain reliability.
 

chuegen

Cadet
Joined
Sep 11, 2017
Messages
6
Thank you for the guidance; it's what I was thinking too, and it was frustrating me. I originally suspected firmware, but they're all running the same level -- and there appear to be no firmware upgrades available for the EVO series.
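
For what it's worth, one way to compare firmware levels from the FreeNAS shell would be something like this (sh syntax; smartctl ships with FreeNAS):
Code:
# print model, serial and firmware for each of the eight SSDs
for d in da0 da1 da2 da3 da4 da5 da6 da7; do
  smartctl -i /dev/$d | egrep 'Device Model|Serial Number|Firmware Version'
done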

As for MLC, I just couldn't pass up the price for new 1TB EVOs ($300), and this isn't a "money is no object" kind of application.
 

chuegen

Cadet
Joined
Sep 11, 2017
Messages
6
Update:
It does, indeed, appear to be the specific drives. I found a spare drive with the "S33" serial number prefix, so I offlined da0, replaced the drive, resilvered, and started another data migration job. As you might expect, da0's "busy" percentage came into line with the rest of them; note that da1 is now the only drive showing a high "disk busy" percentage.

new-diskbusy1.PNG

new-diskbusy2.PNG
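
For the record, the command-line shape of that swap would be roughly the following (the gptid values are placeholders for da0's old and new partition labels; on FreeNAS the GUI normally drives the replace/resilver):
Code:
# take the suspect disk offline, swap the hardware, then replace and resilver
zpool offline DATASSD gptid/<old-da0-partition>
zpool replace DATASSD gptid/<old-da0-partition> gptid/<new-da0-partition>
zpool status DATASSD   # watch resilver progress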
 

chuegen

Cadet
Joined
Sep 11, 2017
Messages
6
Code:
kern.cam.da.7.delete_method: ATA_TRIM
kern.cam.da.5.delete_method: ATA_TRIM
kern.cam.da.0.delete_method: ATA_TRIM
kern.cam.da.4.delete_method: ATA_TRIM
kern.cam.da.2.delete_method: ATA_TRIM
kern.cam.da.1.delete_method: ATA_TRIM
kern.cam.da.6.delete_method: ATA_TRIM
kern.cam.da.3.delete_method: ATA_TRIM
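
(That's each disk's delete method, pulled with something along the lines of the command below; ATA_TRIM on all eight shows TRIM is in use across the pool.)
Code:
sysctl -a | grep delete_method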
 

Stux

MVP
Joined
Jun 2, 2016
Messages
4,419
I originally thought "firmware", but they're all running the same level - and there appears to be no firmware upgrades available for the EVO series.

The best way to check with Samsung drives is to connect one of them to a Windows machine and run Samsung Magician over it to check the firmware status.

Sounds like batch variation.... or a silent hardware revision :(
 

chuegen

Cadet
Joined
Sep 11, 2017
Messages
6
That was fun - didn't expect it to be such a PITA to get this thing connected to a Windows machine. Nonetheless, Magician says it's running the latest firmware.

I suspect silent hardware revision, myself. Thanks for the help!
 