Poor performance with 2TB/4k HDDs and Raid-Z


warri

Guru
Joined
Jun 6, 2011
Messages
1,193
Hello,

now that I've finally managed to set up my NAS correctly, I have a new problem: the write/read performance is really bad (locally and over FTP/CIFS). I'm using 4k-sector drives from different vendors:

Code:
ada0 at ahcich0 bus 0 scbus0 target 0 lun 0
ada0: <SAMSUNG HD204UI 09570115> ATA-7 SATA 2.x device
ada0: 300.000MB/s transfers (SATA 2.x, UDMA6, PIO 512bytes)
ada0: 1907729MB (3907029168 512 byte sectors: 16H 63S/T 16383C)
ada1 at ahcich0 bus 0 scbus0 target 1 lun 0
ada1: <WDC WD20EARS-00MVWB0 51.0AB51> ATA-8 SATA 2.x device
ada1: 300.000MB/s transfers (SATA 2.x, UDMA6, PIO 8192bytes)
ada1: Command Queueing enabled
ada1: 1907729MB (3907029168 512 byte sectors: 16H 63S/T 16383C)
ada2 at ahcich0 bus 0 scbus0 target 2 lun 0
ada2: <Hitachi HDS5C3020ALA632 ML6OA580> ATA-8 SATA 3.x device
ada2: 150.000MB/s transfers (SATA 1.x, UDMA6, PIO 8192bytes)
ada2: Command Queueing enabled
ada2: 1907729MB (3907029168 512 byte sectors: 16H 63S/T 16383C)
ada3 at ahcich0 bus 0 scbus0 target 3 lun 0
ada3: <ST2000DL003-9VT166 CC32> ATA-8 SATA 3.x device
ada3: 150.000MB/s transfers (SATA 1.x, UDMA6, PIO 8192bytes)
ada3: Command Queueing enabled
ada3: 1907729MB (3907029168 512 byte sectors: 16H 63S/T 16383C)
ada4 at ahcich1 bus 0 scbus1 target 0 lun 0
ada4: <WDC WD2500BEKT-00PVMT0 01.01A01> ATA-8 SATA 2.x device
ada4: 300.000MB/s transfers (SATA 2.x, UDMA6, PIO 8192bytes)
ada4: Command Queueing enabled
ada4: 238475MB (488397168 512 byte sectors: 16H 63S/T 16383C)
ada5 at ahcich2 bus 0 scbus3 target 0 lun 0
ada5: <WDC WD20EARS-00MVWB0 51.0AB51> ATA-8 SATA 2.x device
ada5: 300.000MB/s transfers (SATA 2.x, UDMA6, PIO 8192bytes)
ada5: Command Queueing enabled
ada5: 1907729MB (3907029168 512 byte sectors: 16H 63S/T 16383C)
ada6 at ahcich3 bus 0 scbus4 target 0 lun 0
ada6: <WDC WD20EARS-00MVWB0 51.0AB51> ATA-8 SATA 2.x device
ada6: 300.000MB/s transfers (SATA 2.x, UDMA6, PIO 8192bytes)
ada6: Command Queueing enabled
ada6: 1907729MB (3907029168 512 byte sectors: 16H 63S/T 16383C)


All six 2 TB drives (ada0-ada3, ada5, ada6) are part of a RAID-Z array; only ada4 (a 250GB 2.5" HDD) is separate. When testing the array, I get speeds of ~33 MB/s write and ~34 MB/s read:

Code:
freenas# dd if=/dev/zero of=/mnt/tank1/ddfile bs=1024k count=2000
2000+0 records in
2000+0 records out
2097152000 bytes transferred in 59.536223 secs (35224808 bytes/sec)
freenas# dd if=/mnt/tank1/ddfile of=/dev/zero bs=1024k count=2000
2000+0 records in
2000+0 records out
2097152000 bytes transferred in 57.885041 secs (36229602 bytes/sec)


Testing the 2,5" HDD results in write speeds of ~45 MB/s and read speeds of ~90 MB/s, which is totally fine for this drive.

Now I'm wondering if that is related to the 4k sectors on the big drives. They report 512-byte sectors via diskinfo and seem to be aligned for 512-byte sectors, too (ashift=9 in zpool.cache).
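For reference, this is roughly how I checked the reported sector size and the pool's ashift (standard FreeBSD commands; device and pool names are mine):

Code:
# what the drive reports (these 4k drives usually claim 512-byte sectors)
diskinfo -v /dev/ada1 | grep sectorsize
# ashift of the existing pool, from the cached config (9 = 512b, 12 = 4k)
zdb | grep ashift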

Can I realign them, and would that result in better performance?
Any other ideas why the performance could be this bad?

CPU usage is at ~25% while writing/reading, and RAM usage is at ~500 MB.

Here are my system specs:

NM10-DTX WiFi with Intel Atom D525 (1.8 GHz dual-core)
Realtek 1 Gbps NIC
2 GB RAM (some old DDR2)
6x 2TB drives (RAID-Z, see above)
1x 250GB 2.5" drive (see above)
FreeNAS-8.0-RELEASE-i386 (there were no amd64 drivers for the Realtek NIC...)
 

William Grzybowski

Wizard
iXsystems
Joined
May 27, 2011
Messages
1,754

Hello, you could give 8.0.1-BETA a shot; it has an option to enable a 4k sector size for ZFS when creating the volume.
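If you want to do it by hand instead, the usual shell trick (which, as far as I know, is what that GUI option automates; device and pool names here are just examples) looks like this:

Code:
# fake a 4k-sector provider on top of one member disk
gnop create -S 4096 /dev/ada1p2
# ZFS derives ashift=12 for the whole raidz vdev from that one member
zpool create tank1 raidz ada1p2.nop ada0p2 ada2p2 ada3p2 ada5p2 ada6p2
# remove the gnop layer again; the pool keeps ashift=12
zpool export tank1
gnop destroy /dev/ada1p2.nop
zpool import tank1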
 

warri

Guru
Joined
Jun 6, 2011
Messages
1,193
I've updated to 8.0.1-BETA and already got a huge performance improvement with my old 512b tank.
Here is the new benchmark (right after the upgrade to the BETA, still with the old tank):

Code:
freenas# dd if=/dev/zero of=/mnt/tank1/ddfile bs=1024k count=2000
2000+0 records in
2000+0 records out
2097152000 bytes transferred in 47.186508 secs (44443891 bytes/sec)
freenas#
freenas# dd if=/mnt/tank1/ddfile of=/dev/zero bs=1024k count=2000
2000+0 records in
2000+0 records out
2097152000 bytes transferred in 23.264608 secs (90143448 bytes/sec)


42 MB/s write and 86 MB/s read!

Then I destroyed all my tanks and rebuilt them with 4k alignment:

Code:
freenas# dd if=/dev/zero of=/mnt/tank1/ddfile bs=1024k count=2000
2000+0 records in
2000+0 records out
2097152000 bytes transferred in 25.146174 secs (83398453 bytes/sec)
freenas# dd if=/mnt/tank1/ddfile of=/dev/zero bs=1024k count=2000
2000+0 records in
2000+0 records out
2097152000 bytes transferred in 21.293656 secs (98487174 bytes/sec)


Wow! I wasn't expecting this... ~80 MB/s write and ~94 MB/s read.


But apparently tank creation is still somewhat buggy: after adding another tank, you sometimes can't access one or both of the tanks (they are displayed as ONLINE, but show "Can't determine total space" or something similar). Timezones are also messed up, as you can see in the log below.
When that happens, there is an error in the tank creation step while unmounting /mnt:

Code:
Jun  7 08:24:13 freenas freenas[1820]: Executing: zpool export test
Jun  7 17:24:13 freenas freenas: cannot unmount '/mnt': Device busy
Jun  7 08:24:13 freenas freenas[1820]: Executing: gnop destroy /dev/ada0p2.nop
Jun  7 17:24:13 freenas freenas: gnop: Cannot destroy device ada0p2.nop (error=16).
Jun  7 08:24:13 freenas freenas[1820]: Executing: zpool import -R /mnt test
Jun  7 17:24:15 freenas freenas: cannot import 'test': no such pool available


So that's probably easy to fix? Maybe the issue can be resolved by a fresh installation instead of an upgrade, but I wasn't able to test that.
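In case someone wants to dig into it, something like this should show what is holding /mnt busy and let you finish the steps by hand (standard FreeBSD tools; untested on my box):

Code:
# list processes with files open on the filesystem mounted at /mnt
fstat -f /mnt
# then retry what the GUI gave up on
zpool export test
gnop destroy /dev/ada0p2.nop
zpool import -R /mnt test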
 

William Grzybowski

Wizard
iXsystems
Joined
May 27, 2011
Messages
1,754
So that's probably easy to fix? Maybe the issue can be resolved by a fresh installation instead of an upgrade, but I wasn't able to test that.

I am not sure what is happening here; I cannot reproduce it on trunk. Perhaps you might want to wait for BETA2...
 

warri

Guru
Joined
Jun 6, 2011
Messages
1,193
OK, I gave the BETA another try, now on amd64. My NIC worked without hassle this time!
And: another speed improvement (115 MB/s read, 100 MB/s write now).

But then the system started to behave strangely, producing checksum mismatches everywhere.
I created several test pools with different HDDs and configurations, but the checksum errors remain.

shell:
Code:
freenas# zpool status
  pool: tank1
 state: ONLINE
 scrub: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        tank1       ONLINE       0     0     0
          mirror    ONLINE       0     0     0
            ada0p2  ONLINE       0     0     0
            ada1p2  ONLINE       0     0     0

errors: No known data errors

freenas# dd if=/dev/zero of=/mnt/tank1/ddtest bs=1024k count=2000
2000+0 records in
2000+0 records out
2097152000 bytes transferred in 34.979439 secs (59953849 bytes/sec)
freenas# dd of=/dev/zero if=/mnt/tank1/ddtest bs=1024k count=2000
dd: /mnt/tank1/ddtest: Input/output error
451+0 records in
451+0 records out
472907776 bytes transferred in 8.459542 secs (55902290 bytes/sec)

freenas# zpool status
  pool: tank1
 state: ONLINE
status: One or more devices has experienced an error resulting in data
        corruption.  Applications may be affected.
action: Restore the file in question if possible.  Otherwise restore the
        entire pool from backup.
   see: http://www.sun.com/msg/ZFS-8000-8A
 scrub: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        tank1       ONLINE       0     0     1
          mirror    ONLINE       0     0     2
            ada0p2  ONLINE       0     0     2
            ada1p2  ONLINE       0     0     2

errors: 1 data errors, use '-v' for a list


dmesg:
Code:
Jun  7 21:18:00 freenas root: ZFS: checksum mismatch, zpool=tank1 path=/dev/ada1p2 offset=477936128 size=131072
Jun  7 21:18:00 freenas root: ZFS: checksum mismatch, zpool=tank1 path=/dev/ada0p2 offset=477936128 size=131072
Jun  7 21:18:00 freenas root: ZFS: checksum mismatch, zpool=tank1 path=/dev/ada1p2 offset=477936128 size=131072
Jun  7 21:18:00 freenas root: ZFS: checksum mismatch, zpool=tank1 path=/dev/ada0p2 offset=477936128 size=131072
Jun  7 21:18:00 freenas root: ZFS: zpool I/O failure, zpool=tank1 error=86


Rebooting doesn't help, nor does changing HDDs and configurations (mirror, raidz, stripe, 512b, 4k alignment).
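For the record, these are the standard commands I used between attempts to inspect and reset the errors:

Code:
zpool status -v tank1   # -v lists the files affected by the checksum errors
zpool clear tank1       # reset the error counters
zpool scrub tank1       # re-read and verify everything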

Probably the best solution is to wait for the next beta :(
A shame, because the speed improvements from BETA1, 64-bit, and 4k alignment are huge!

[edit]
Seems to be a faulty memory stick; memtest is throwing a lot of errors...

[edit2]
Yes, bad luck. It apparently just died. Well, I had a spare one, and everything is working again without any checksum errors.
 

warri

Guru
Joined
Jun 6, 2011
Messages
1,193
Performance tests

So, here is a quick summary of all the performance tests I did:

System
NM10-DTX WiFi
Intel Atom D525 (1.8 GHz dual-core)
2 GB DDR2 RAM
Realtek 1 Gbps NIC (onboard)
1x Lexar JD FireFly 1100 4GB (system flash drive)

raidz array
3x Western Digital Caviar Green (WD20EARS) 2TB 5400rpm
1x Samsung EcoGreen F4 (HD204UI) 2TB 5400rpm
1x Hitachi 5K3000 (HDS5C30) 2TB 5400rpm
1x Seagate Barracuda Green (ST2000DL003) 2TB 5400rpm
as RaidZ1

Test
Write:
Code:
dd if=/dev/zero of=/mnt/tank1/ddtest bs=1024k count=2000

Read:
Code:
dd if=/mnt/tank1/ddtest of=/dev/zero bs=1024k count=2000


Results
FreeNAS-8.0-RELEASE-i386
512b sectors
Read: ~34 MB/s
Write: ~33 MB/s

FreeNAS-8.0.1-BETA1-i386
512b sectors
Read: ~86 MB/s
Write: ~42 MB/s

4k sectors
Read: ~94 MB/s
Write: ~80 MB/s

FreeNAS-8.0.1-BETA1-amd64
512b sectors
Read: ~87 MB/s
Write: ~78 MB/s

4k sectors
Read: ~102 MB/s
Write: ~123-134 MB/s

Conclusion
Go for 64-bit, and use 4k sectors with Advanced Format drives!
 

jenksdrummer

Patron
Joined
Jun 7, 2011
Messages
250
For what it's worth, I'm getting really good performance using CIFS with 4 WD20EARS drives in RAID-Z. I built the array using the web interface, and I also jumpered the drives, which for Western Digital gives them compatibility with older operating systems. I noticed that the web GUI in FreeNAS 8.0-RELEASE doesn't have the same 4K-support option that 7.3 had (don't remember the subversion, but it was the last one posted before 8.0-RELEASE)...

I'm able to saturate a gigabit Ethernet link both on download and upload, so I'm pretty happy. I couldn't do that with Windows Server 2008 R2 and software RAID-5... even after it finished syncing the drives almost a week later.

The system is an HP ProLiant MicroServer with 8GB of Kingston ECC RAM and 4 WD20EARS drives. It boots off a SanDisk Cruzer, but that's about to change to a new 2GB USB flash drive that doesn't have junkware preloaded on it. ;)

Anyhow, my point is that for decent ZFS performance on 8.0-RELEASE with 4K drives, I think they need to be set to compatibility mode and wiped/reconfigured.
 

jenksdrummer

Patron
Joined
Jun 7, 2011
Messages
250
Just for comparison...

Here are my stats running FreeNAS 8.0-RELEASE:

Code:
freenas# zpool status
  pool: RAID_Z
 state: ONLINE
 scrub: none requested
config:

        NAME          STATE     READ WRITE CKSUM
        RAID_Z        ONLINE       0     0     0
          raidz1      ONLINE       0     0     0
            gpt/ada0  ONLINE       0     0     0
            gpt/ada1  ONLINE       0     0     0
            gpt/ada2  ONLINE       0     0     0
            gpt/ada3  ONLINE       0     0     0

errors: No known data errors
freenas# dd if=/dev/zero of=/mnt/RAID_Z/ddtest bs=1024k count=2000
2000+0 records in
2000+0 records out
2097152000 bytes transferred in 8.385048 secs (250106137 bytes/sec)
freenas# dd of=/dev/zero if=/mnt/RAID_Z/ddtest bs=1024k count=2000
2000+0 records in
2000+0 records out
2097152000 bytes transferred in 1.803769 secs (1162649913 bytes/sec)


That's with 4x WD20EARS with the compatibility jumpers in place. :)
 
Joined
May 27, 2011
Messages
566
ZFS does some major caching with your system memory; working with a 2 GB file will not give you an accurate benchmark.

For example, on my system with 8GB of memory and an 8-disk raidz2 pool, I can write a 200MB file at 1.5GB/s, a 2GB file at 350MB/s, and a 20GB file at 250MB/s.

If I read a file twice, as much of it as possible ends up in the cache: I can read the 200MB file at 6GB/s, the 2GB file at 6GB/s, and the 20GB file at 320MB/s.

Benchmarking small files will give you inaccurate results.
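To make that concrete, something like this (sizes and paths are just examples, assuming ~2 GB of RAM as in the system above) keeps the ARC mostly out of the picture:

Code:
# ~20 GB test file, ten times the RAM, so reads can't be served from cache
dd if=/dev/zero of=/mnt/tank1/ddtest bs=1024k count=20000
dd if=/mnt/tank1/ddtest of=/dev/null bs=1024k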
 

jenksdrummer

Patron
Joined
Jun 7, 2011
Messages
250
ZFS does some major caching with your system memory; working with a 2 GB file will not give you an accurate benchmark.

Since I know hardly anything about FreeBSD/Unix or anything non-Windows, I used what was posted earlier. :)

I was wondering how I was getting over a GB/sec from 4x 5400RPM drives in a fairly complex (high-overhead) software RAID system. That said, I've seen better performance when writing/reading large files vs. small ones... at least in terms of how much the wire gets saturated.
 

jenksdrummer

Patron
Joined
Jun 7, 2011
Messages
250
Hey, not bad...

Code:
freenas# dd if=/dev/zero of=/mnt/RAID_Z/ddtest bs=1024k count=20000
20000+0 records in
20000+0 records out
20971520000 bytes transferred in 91.021817 secs (230401026 bytes/sec)
freenas# dd of=/dev/zero if=/mnt/RAID_Z/ddtest bs=1024k count=20000
20000+0 records in
20000+0 records out
20971520000 bytes transferred in 64.676075 secs (324254680 bytes/sec)
 

Crispin

Explorer
Joined
Jun 8, 2011
Messages
85
Hello, you could give 8.0.1-BETA a shot; it has an option to enable a 4k sector size for ZFS when creating the volume.

Sorry, noob here - where is this option? I can't find anything in the GUI, and the walkthroughs I've found for BSD seem more complicated than what's suggested above.

Cheers,
Crispin
 

William Grzybowski

Wizard
iXsystems
Joined
May 27, 2011
Messages
1,754
Sorry, noob here - where is this option? I can't find anything in the GUI, and the walkthroughs I've found for BSD seem more complicated than what's suggested above.

Cheers,
Crispin

The 4k sector size option will be available as soon as you select ZFS during volume creation (from 8.0.1-BETA1 onward).
 

Crispin

Explorer
Joined
Jun 8, 2011
Messages
85
The 4k sector size option will be available as soon as you select ZFS during volume creation (from 8.0.1-BETA1 onward).

Sorry - I can't see it? When I select the disks and then ZFS, I get options to add other disks as cache or ZIL, but nothing about sector size.
Using 8.0-RC, 64-bit.

Cheers,
Crispin
 

warri

Guru
Joined
Jun 6, 2011
Messages
1,193
You need to use at least 8.0.1-BETA1.
8.0 RC < 8.0-RELEASE < 8.0.1-BETA1
 

Tekkie

Patron
Joined
May 31, 2011
Messages
353
Here are the numbers for my little mixed-vendor disk setup:

Code:
%dd if=/dev/zero of=/mnt/storage/movies/ddtest bs=1024k count=20000
20000+0 records in
20000+0 records out
20971520000 bytes transferred in 89.074932 secs (235436834 bytes/sec)
%dd of=/dev/zero if=/mnt/storage/movies/ddtest bs=1024k count=20000
20000+0 records in
20000+0 records out
20971520000 bytes transferred in 50.796097 secs (412856917 bytes/sec)
 

warri

Guru
Joined
Jun 6, 2011
Messages
1,193
Here are the numbers for my little mixed-vendor disk setup:


What system is this running on? CPU, mainboard, RAM, etc.?
Will upgrading from 2GB to 4GB of RAM improve the speed?
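In the meantime, you can watch how much memory the ARC actually uses via sysctl (FreeBSD; values are in bytes):

Code:
sysctl kstat.zfs.misc.arcstats.size   # current ARC size
sysctl vfs.zfs.arc_max                # configured ceiling, if set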
 

Bohs Hansen

Guest
The 4k sector size option will be available as soon as you select ZFS during volume creation (from 8.0.1-BETA1 onward).


I might be looking in the wrong place... but I can't seem to find it either with BETA2.

(attached screenshots: zfs1.jpg, zfs2.jpg)
 

Tekkie

Patron
Joined
May 31, 2011
Messages
353
I might be looking the wrong place .. but can't seem to find it either with beta2

I had exactly the same problem when I first upgraded from 8.0 -> 8.0.1-BETA. I thought it had to do with me setting the swap partition to 1GB per drive on new volumes, but I can't reproduce the problem anymore.

You may want to try clearing your browser cache, or doing another reboot?
 