Last Straw! ZFS/Disk Performance Issues

Status
Not open for further replies.

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,526
I know you CAN write to the device directly, but you can't if the drive has any mounted partitions. I saw this error once on a drive that had a ZFS pool on it. Try deleting your ZFS pool, then run the commands.
 

paleoN

Wizard
Joined
Apr 22, 2012
Messages
1,403
[root@freenas] ~# dd if=/dev/zero of=/dev/ada0 bs=2048k count=50k
dd: /dev/ada0: Operation not permitted

Looks like I can't write directly to the dev - not sure if this is normal or not.
As noobsauce80 said, delete your zpool first. The OS is protecting the GPT partition info, I believe.

Also, do the test with the drives connected directly to the motherboard SATA ports for a baseline.
 

slydog

Cadet
Joined
Aug 10, 2012
Messages
1
the whole point of ZFS is flawed if there is an underlying file system on the disk, which there is in ESXi.

Are you using a vmdk for your pool's disks? That would suck. I would pass them through directly to the virtual machine.
 

TravisT

Patron
Joined
May 29, 2011
Messages
297
I'm away from home at the moment, but I'll check out that link as well as delete the zpools and test the dev directly.

As for the vmdk question, I am using them on my current build. I am trying to set this up to avoid that. There is a way to pass local devices through to the VM, but it's a hack and not as straightforward as I'd like it to be. iSCSI can be passed through though, and that's what I'm leaning towards using.
 

TravisT

Patron
Joined
May 29, 2011
Messages
297
I had to clear the zpools, and do a quick wipe of the disks to be able to write directly to the device. I'm re-running the baseline dd tests (both read and write), and will post the results tomorrow.
 

TravisT

Patron
Joined
May 29, 2011
Messages
297
Ok, so it finished up in enough time to post these up tonight:

ada0:
Code:
dd if=/dev/zero of=/dev/ada0 bs=2048k count=25k
25600+0 records in
25600+0 records out
53687091200 bytes transferred in 469.029439 secs (114464225 bytes/sec)

dd if=/dev/ada0 of=/dev/null bs=2048k count=25k
25600+0 records in
25600+0 records out
53687091200 bytes transferred in 400.069497 secs (134194413 bytes/sec)


ada2:
Code:
dd if=/dev/zero of=/dev/ada2 bs=2048k count=25k
25600+0 records in
25600+0 records out
53687091200 bytes transferred in 430.287495 secs (124770280 bytes/sec)

dd if=/dev/ada2 of=/dev/null bs=2048k count=25k
25600+0 records in
25600+0 records out
53687091200 bytes transferred in 410.287855 secs (130852255 bytes/sec)


ada1 is the one that was giving me the SMART errors, and I just received another one with it plugged directly into the mobo. I think this drive will have to be RMA'd to Western Digital. The test on it was taking a very long time to complete, and because of the errors, I just cancelled it.

From what I can tell, these results are within the expected range for 7200RPM SATA II drives. Let me know if I'm out of line here.
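
For anyone double-checking the numbers, here is a minimal sketch (not part of the test procedure itself) that parses the FreeBSD-style dd summary lines above and recomputes the rate from the byte count and elapsed time. The names `parse_dd_summary` and `mb_per_sec` are made up for illustration; the sample lines are copied from the results in this post.

```python
import re

# Matches FreeBSD dd's summary line, for example:
# "53687091200 bytes transferred in 469.029439 secs (114464225 bytes/sec)"
DD_SUMMARY = re.compile(
    r"(\d+) bytes transferred in ([\d.]+) secs \((\d+) bytes/sec\)"
)

def parse_dd_summary(line):
    """Return (total_bytes, seconds, reported_bytes_per_sec) from a summary line."""
    match = DD_SUMMARY.search(line)
    if match is None:
        raise ValueError("not a FreeBSD dd summary line: %r" % line)
    total_bytes, seconds, rate = match.groups()
    return int(total_bytes), float(seconds), int(rate)

def mb_per_sec(total_bytes, seconds):
    """Decimal megabytes per second (1 MB = 1,000,000 bytes)."""
    return total_bytes / seconds / 1e6

# Summary lines copied from the ada0 and ada2 runs above.
results = {
    "ada0 write": "53687091200 bytes transferred in 469.029439 secs (114464225 bytes/sec)",
    "ada0 read": "53687091200 bytes transferred in 400.069497 secs (134194413 bytes/sec)",
    "ada2 write": "53687091200 bytes transferred in 430.287495 secs (124770280 bytes/sec)",
    "ada2 read": "53687091200 bytes transferred in 410.287855 secs (130852255 bytes/sec)",
}

for label, line in results.items():
    total, secs, reported = parse_dd_summary(line)
    print("%s: %.1f MB/s (dd reported %d bytes/sec)"
          % (label, mb_per_sec(total, secs), reported))
```

The recomputed rates agree with what dd printed, so ada0 works out to roughly 114.5 MB/s write and 134.2 MB/s read in decimal units.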
 

paleoN

Wizard
Joined
Apr 22, 2012
Messages
1,403
From what I can tell, these results are within the expected range for 7200RPM SATAII drives. Let me know if I'm out of line here.
They look fine to me. Wait, never mind. Someone changed the test. It's 50k, not 25k.
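
The two runs differ only in how much data is moved. A rough sketch of the arithmetic, assuming dd's `k` suffix multiplies by 1024 (which is how both FreeBSD and GNU dd treat it); the helper names are hypothetical:

```python
def parse_k(value):
    """Convert a dd-style number with an optional 'k' suffix (x1024) to an int."""
    if value.endswith("k"):
        return int(value[:-1]) * 1024
    return int(value)

def dd_total_bytes(bs, count):
    """Total bytes dd moves for a given block size and block count."""
    return parse_k(bs) * parse_k(count)

for count in ("25k", "50k"):
    total = dd_total_bytes("2048k", count)
    print("bs=2048k count=%s -> %d bytes (%d GiB)" % (count, total, total // 2**30))
```

That gives 50 GiB for count=25k and 100 GiB for count=50k, matching the byte totals dd printed. Since dd reports bytes per second, the rates from the two run lengths should still be broadly comparable.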

Raw before I created the mirror:
This will do terrible things to your array. Do not try it on a disk in any type of array.
Code:
# dd if=/dev/zero of=/dev/ada0 bs=2048k count=50k
51200+0 records in
51200+0 records out
107374182400 bytes transferred in 543.106874 secs (197703597 bytes/sec)

# dd if=/dev/ada0 of=/dev/null bs=2048k count=50k
51200+0 records in
51200+0 records out
107374182400 bytes transferred in 537.182353 secs (199884046 bytes/sec)

119 for your drives vs 188.5 for mine. Your drives seem a bit low.

Did you run both the dd tests concurrently or one at a time? Were ada0 and ada2 connected directly to the motherboard or in the drive bays?

Try running the test again, maybe 50k this time to not confuse me ;), against a single drive only if you haven't already.
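
A note on units: the 119 and 188.5 figures appear to be binary megabytes per second (MiB/s, where 1 MiB = 1,048,576 bytes) rather than the decimal MB/s found on a drive's spec sheet. That is an inference from the numbers, not something stated in the thread. A small sketch using the write rates dd reported, with `to_mib_per_sec` as a made-up helper:

```python
MIB = 1024 * 1024  # bytes in one binary megabyte

def to_mib_per_sec(bytes_per_sec):
    """Convert a dd bytes/sec figure to MiB/s."""
    return bytes_per_sec / MIB

# Write rates (bytes/sec) as reported by dd earlier in the thread.
write_rates = {
    "TravisT ada0": 114464225,
    "TravisT ada2": 124770280,
    "paleoN ada0": 197703597,
}

for label, rate in write_rates.items():
    print("%-13s %6.1f MiB/s  (%6.1f MB/s)"
          % (label, to_mib_per_sec(rate), rate / 1e6))
```

Under that assumption, 119 lines up with the ada2 write rate and 188.5 with the ada0 write rate from the mirror-less test; the other ada0 comes out nearer 109 MiB/s.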
 

TravisT

Patron
Joined
May 29, 2011
Messages
297
I did run both at the same time in separate terminal windows. I thought that may cause issues, but I thought that was recommended.

The drives were connected directly to the motherboard ports via a SATA cable - no drive bays.

I'll retest again at 50K counts one at a time and repost results.
 

paleoN

Wizard
Joined
Apr 22, 2012
Messages
1,403
I did run both at the same time in separate terminal windows. I thought that may cause issues, but I thought that was recommended.
You need to test both ways if there were any issues. In a zpool all the drives will be accessed concurrently.

The drives were connected directly to the motherboard ports via a SATA cable - no drive bays.
OK, got it.

I'll retest again at 50K counts one at a time and repost results.
If a single drive comes back with the same rate, then you don't need to test the 2nd one.
 

TravisT

Patron
Joined
May 29, 2011
Messages
297
Makes sense...

;)
 

TravisT

Patron
Joined
May 29, 2011
Messages
297
Results of first drive:

Code:
dd if=/dev/zero of=/dev/ada0 bs=2048k count=50k
51200+0 records in
51200+0 records out
107374182400 bytes transferred in 937.129852 secs (114577699 bytes/sec)

dd if=/dev/ada0 of=/dev/null bs=2048k count=50k
51200+0 records in
51200+0 records out
107374182400 bytes transferred in 815.340635 secs (131692421 bytes/sec)
 

paleoN

Wizard
Joined
Apr 22, 2012
Messages
1,403
Hmm, same speed. I guess your combination of drives and SATA controller is just a bit slower? You do have AHCI enabled in the BIOS, right?

Next is to test the drives in the bays. You should see the same speeds if the bays are working properly.
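
To put a number on "same speed", here is a quick sketch comparing ada0's rates from the concurrent 25k run with the solo 50k run. The bytes/sec figures are the ones dd reported earlier in the thread; `pct_change` is just an illustrative helper.

```python
def pct_change(before, after):
    """Percent change going from `before` to `after`."""
    return (after - before) / before * 100.0

# ada0 rates in bytes/sec, as reported by dd.
concurrent = {"write": 114464225, "read": 134194413}  # both drives tested at once
solo = {"write": 114577699, "read": 131692421}        # ada0 tested alone

for op in ("write", "read"):
    print("ada0 %s: %+.2f%% solo vs concurrent"
          % (op, pct_change(concurrent[op], solo[op])))
```

Both differences come out under 2%, which suggests that running the two tests together was not what held the numbers down.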
 

TravisT

Patron
Joined
May 29, 2011
Messages
297
Although I've performed this test before, I put the drives back in the drive bays and am testing now. I'm consistently getting 120-130 MB/s transfers connected directly.

I looked through the BIOS several times and only saw one setting for AHCI and it mentioned something about the onboard RAID controller. I set that to AHCI, but in the BIOS the disks still show as IDE. I'll look again in a little while and post the exact setting - and if the drive tests are complete, the results.
 

TravisT

Patron
Joined
May 29, 2011
Messages
297
Same deal. I/O errors as soon as the drives were inserted into the backplanes. Test completed, but performance was about 43MB/s write.

On a side note, I was finally able to get Supermicro tech support to perform a similar test (instead of copying a 3GB file in Windows and timing it). The results are as follows:

Code:
w/o backplane

[root@localhost ~]# dd if=/dev/zero of=/dev/sdb1 bs=2048k count=50k
51200+0 records in
51200+0 records out
107374182400 bytes (107 GB) copied, 999.778 seconds, 107 MB/s

[root@localhost ~]# dd if=/dev/sdb1 of=/dev/null bs=2048k count=50k
51200+0 records in
51200+0 records out
107374182400 bytes (107 GB) copied, 998.793 seconds, 108 MB/s


w/ backplane


[root@localhost ~]# dd if=/dev/zero of=/dev/sdb1 bs=2048k count=50k
51200+0 records in
51200+0 records out
107374182400 bytes (107 GB) copied, 999.573 seconds, 107 MB/s

[root@localhost ~]# dd if=/dev/sdb1 of=/dev/null bs=2048k count=50k
51200+0 records in
51200+0 records out
107374182400 bytes (107 GB) copied, 999.061 seconds, 107 MB/s


Maybe I'm wrong, but given their hardware, I would expect better performance either way. The write speeds seem a little on the low end (even lower than my "good" tests), but the read speeds are the same as the write... I thought these were typically significantly higher than the write speeds???

Their test hardware:

Code:
Motherboard: Supermicro H8QGi-F, south bridge is SP5100
HDD: two WD RE4 500GB, both connect from on-board SATA ports to CSE-M35T-1
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,526
Been watching the thread and haven't had much to contribute... couple of questions:

1. The WD drive that is failing SMART tests, do you know what test is failing? The reason I ask is because some drives have a parameter for communication errors. If you are getting tons of communication errors, that can cause wonky test/performance results and eventually a SMART failure. (Read below for my other idea.)
2. I can't seem to find a post mentioning the enclosure you are using. What model is your enclosure?

I'm running on a wild idea.. bear with me...

I'm wondering if you are having issues between SATA I and SATA II hardware. I used to own a CSE-M35T-1. I believe it came out before SATA II was even ratified. I'm just wondering if the low performance is because of noise on the SATA cables from the close proximity to the power cables on the enclosure. I did notice that Supermicro doesn't give a specification on the CSE-M35T-1 for SATA I, II, or III. The manual just says "Supports SATA and SAS" and was last revised in 2010. No speed given anywhere. That may hint that SATA I is the best you can hope for.

Is there a way to set your ports to SATA I only? I've seen hardware that was really flaky and would negotiate SATA II speeds but would be riddled with errors and performance problems that went away as soon as I forced 150MB/sec transfer rates. Could that be a problem here? If you were forced to use SATA I speeds, I'm not sure there would be any significant performance hit, because of the number of drives used. Your zpool could still dish out data much faster than gigabit LAN speeds. Resilvering might be slower though.
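
For reference, a rough sketch of the ceilings being discussed. It assumes the standard 8b/10b encoding used on SATA links (10 bits on the wire for every data byte) and ignores protocol overhead, so real-world throughput lands lower. The function and constant names are made up for illustration.

```python
def sata_payload_mb_per_sec(line_rate_mbit):
    """Upper bound on payload in MB/s for a SATA link.

    With 8b/10b encoding each data byte costs 10 bits on the wire,
    so the payload rate in MB/s is the line rate in Mbit/s divided by 10.
    """
    return line_rate_mbit // 10

# Line rates in Mbit/s for each SATA generation.
links = {
    "SATA I   (1.5 Gbit/s)": 1500,
    "SATA II  (3.0 Gbit/s)": 3000,
    "SATA III (6.0 Gbit/s)": 6000,
}

# Gigabit Ethernet: 1000 Mbit/s with 8 bits per byte, before any network overhead.
GIGABIT_LAN_MB_PER_SEC = 1000 // 8

for name, rate in links.items():
    print("%s -> up to %d MB/s" % (name, sata_payload_mb_per_sec(rate)))
print("Gigabit LAN -> up to %d MB/s" % GIGABIT_LAN_MB_PER_SEC)
```

The direct-attached dd results earlier in the thread (roughly 115 to 134 MB/s) sit under the 150 MB/s SATA I ceiling, so a single drive forced to SATA I would lose little, and even that link is a bit faster than a gigabit LAN can carry.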

Another idea... are you using a port multiplier? Does your enclosure require 5 SATA cables be connected to it for the 5 drives or is it 1 cable?
 

TravisT

Patron
Joined
May 29, 2011
Messages
297
Noobsauce80,

I appreciate your wild ideas. I'm all out on this end...

I have the same hot-swap SATA backplane that you had. I listed it previously here.

The release date shows 2010. I completely overlooked the fact that it didn't specify which SATA standard it was designed for. I didn't expect SATA III since it didn't specifically say so, but I did expect SATA II. Tech support has not mentioned any of the ideas that you did, and while your ideas may be a little wild... they're feasible. As for your theory, SATA I speeds would be fine, as my drives aren't performing past SATA I speeds when connected directly. 10-40MB/s, not so much. As for setting the SATA ports for SATA I (150MB/s), I've never noticed a setting for that. I'm also getting I/O errors and SMART errors when the drives are connected to the backplane, so that also is in line with your noise theory.

I can't for the life of me figure out why I have three of these backplanes doing the exact same thing, but when bypassed, the drives perform well. There are about a hundred reviews on Newegg, and none that I saw complained about speed problems. I have run across one person on SmallNetBuilder who used one and had a similar problem. He just returned it for an Icy Dock enclosure. With all the good reviews, I took my chances (three times).

Not using any port multipliers. Straight cable from onboard controller straight to the connectors on the backplane. One connector per drive.

The most recent SMART test failure I've had is Reallocated_Sector_Ct. One drive had this failure numerous times and was increasing steadily. I think it was on about 1224 or so when I gave up on it. I have that drive disconnected now.
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,526
Somehow I missed the one line with your model number. :P

I assume you are plugging in power to both of the power plugs.

Do you have another controller you can use temporarily that can force SATA I speeds? Some hard drives have a jumper that lets you force 1.5Gb/sec mode. If you could prove the enclosure works at 1.5Gb/sec at least you'd know how to get them to work correctly. I have a background in EE and I've seen a few situations like this in the past. It's extremely rare and very difficult to definitively prove.

Other than that I don't have any other good ideas. There's something bizarre with your 5 bay enclosures.

As an alternative for your situation you could try buying a Norco case like http://www.newegg.com/Product/Product.aspx?Item=N82E16811219033. They work pretty darn well and cost-wise they're about the same cost as 3 of your enclosures.

Edit: I found my old box in the basement. Bought it in 2005 for 119.99. Of course, in 2005 there was no SATA 2 or 3, so saying "SATA" was all that was needed.
 

TravisT

Patron
Joined
May 29, 2011
Messages
297
I'm plugging in both power plugs on the backplane. I don't have anything that I know of that will force SATA I speeds.

I'm really considering just buying some of the ~$20 4-in-3 drive bays from Cooler Master, throwing them in the case, and calling it a day. I don't have the server full of drives at this point anyway, but I was trying to future-proof this case.

A couple of things from Supermicro's tech support:

The backplanes are built to SATA II specs.
They are thinking that the mobo SATA signals may not be strong enough for the added circuitry of the backplane.

I'm no EE, but I've done a great deal of component level electrical diagnostic work, and admit that it's POSSIBLE. Not sure if it's probable though. I am considering buying the IBM M1015 card that would give me a greater port density. Maybe this would solve my problem? I don't want to throw any more money at this, but the M1015 would be usable (and eventually required) either way. If I don't get these backplanes working, I'll just have to try and sell them anyway. I want them to work, and I have to admit that I think it's unlikely that all three of them are bad. That may be the best option at this point.
 

paleoN

Wizard
Joined
Apr 22, 2012
Messages
1,403
I can't for the life of me figure out why I have three of these backplanes doing the exact same thing, but when bypassed, the drives perform well. There are about a hundred reviews on newegg, and none that I saw complained about speed problems. I have run across one person on smallnetbuilder that used one and had a similar problem.
They are thinking that the mobo SATA signals may not be strong enough for the added circuitry of the backplane.

You could also have a faulty cable. Not sure how likely it is to have 3 bad cables. If you can find someplace else to order it from besides SuperMicro, assuming you want to at this point, you could give that a shot.

I saw a couple of posts about speed problems with no resolution though. I saw a few that were quite happy as well.

The most recent SMART test failure I've had is Reallocated_Sector_Ct. One drive had this failure numerous times and was increasing steadily. I think it was on about 1224 or so when I gave up on it. I have that drive disconnected now.
Time to throw it in the trash.
 