Write speed issues with new build - any tips?

Status
Not open for further replies.

Robert Marley

Dabbler
Joined
Sep 11, 2014
Messages
29
New server details:
Supermicro 6047R-E1R36N 36-bay
Motherboard: X9DRi-LN4F+
2x 2.0GHz 6-core CPUs
64GB DDR3 ECC RAM
Myricom 10GbE + 4x 1GbE onboard (Intel i350)
Option between LSI 9300-8i and 9207-8i HBAs
Edit: AFP protocol used for sharing

So far my test RAIDZ2 of 12x 2TB 7200rpm SATA drives has had lackluster write speeds: around 12-50MB/s tops.
Even though the goal of the project is maximum MTTDL with RAIDZ3, I'm positive I can reach higher write speeds with all that RAM; I'm just not sure how.

My current test RAIDZ2 has no dedicated ZIL/SLOG SSD, and dedupe is turned on, but writes seemed fairly lame when I had dedupe off as well. I noticed that wired RAM usage climbed to 50GB early in the big data move and stayed there. The move consists of several million photos between 11MB and 50MB each.

The source of the data is an Xserve, also on 10Gb fiber, with a DAS hardware RAID6 that I have already seen do 650MB/s reads.

I am open to purchasing SSDs if they can be shown to be a sure win, but at these speeds I'll have to look into other options like Win2k12 R2 Storage Spaces or hardware RAID :(
[Attachment: Screen Shot 2014-09-22 at 12.43.30 PM.png]
[Attachment: Screen Shot 2014-09-22 at 12.43.53 PM.png]


All told, this server will need to hold over 100TB, some of which could really benefit from compression and dedupe.
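
A rough way to gauge the compression win before committing the full 100TB would be to enable lz4 on a scratch dataset, copy a representative sample in, and read back the achieved ratio (the dataset name here is a placeholder):

    zfs set compression=lz4 tank/photos-test
    # copy a representative sample of the photos in, then:
    zfs get compressratio tank/photos-test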

Thanks in advance for any advice or links.
 

mav@

iXsystems
iXsystems
Joined
Sep 29, 2011
Messages
1,428
Do I understand right that you are using Windows/Samba shares for the writing? It was found that up to FreeNAS 9.2.1.7, Samba+ZFS in that case issues a small synchronous transaction on every file creation. That can dramatically reduce speed when storing zillions of small files. The bug should be fixed in the next FreeNAS version. To test that theory, you may temporarily set sync=disabled, run the copy again, and then set it back.
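
From the shell, that toggle would look something like this (the pool/dataset name is a placeholder):

    # Temporarily disable synchronous writes on the dataset under test:
    zfs set sync=disabled tank/photos
    # ...rerun the copy test, then restore the default behavior:
    zfs set sync=standard tank/photos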
 

Robert Marley

Dabbler
Joined
Sep 11, 2014
Messages
29
Solid question with a not entirely clear answer.
When I made the dataset, I selected Apple, like so:

[Attachment: Screen Shot 2014-09-22 at 1.55.54 PM.png]


But now that I check back in on it, it has switched itself back to Windows:

[Attachment: Screen Shot 2014-09-22 at 12.43.53 PM.png]


However, if you're just referring to the share protocol, I don't even have CIFS services running. It's AFP only.
 

mav@

iXsystems
iXsystems
Joined
Sep 29, 2011
Messages
1,428
You hadn't mentioned the AFP share protocol in your question, so I had to guess. Never mind.
 

Robert Marley

Dabbler
Joined
Sep 11, 2014
Messages
29
Does making more datasets help with dedupe performance? That is, if dedupe is turned off at the pool level and not inherited, is running dedupe on two 8TB datasets less resource-intensive than running it on one 16TB dataset?
Especially if I'm only writing data to one dataset at a time? We have 5-8 major brands; duplicate data is very unlikely across them, but very likely within each. Re-cutting the data this way would be easy if you guys think it will help with performance.
But with only 4TB on the array and 55GB of RAM now wired, I have no idea why it's running so slow. Even against the dire warnings of 5GB of RAM per 1TB of deduped data, I'm under that threshold at the moment... and getting DroboFS-like speeds :(
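
For what it's worth, one way to see what the dedup table actually costs on the live pool is zdb's DDT report (the pool name is a placeholder, and the per-entry size is a rough rule of thumb):

    # Print dedup table (DDT) statistics for the pool:
    zdb -DD tank
    # Each DDT entry takes roughly 320 bytes in core, so the total
    # entry count x ~320 bytes approximates the table's RAM footprint.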
 

solarisguy

Guru
Joined
Apr 4, 2014
Messages
1,125
The link I provided explains why it is 20GB/1TB and not 5GB/1TB (it is due to other limits).
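
In rough numbers, and assuming a 64KB average block size, the arithmetic behind those two figures looks like this:

    # 1TB / 64KB blocks            = ~16 million blocks
    # 16M DDT entries x ~320 bytes = ~5GB of dedup table per TB of data
    # By default only about 1/4 of the ARC may hold metadata (which is
    # where the DDT lives), so keeping 5GB of DDT resident implies
    # roughly 4 x 5GB = 20GB of system RAM per TB.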
 

Robert Marley

Dabbler
Joined
Sep 11, 2014
Messages
29
solarisguy said:
"Whatever you are doing, do not use deduplication. I think you might be better off recreating from scratch without deduplication.

If you had taken the time to read about it, you would not have used it. So I am telling you: you should plan for at least 20GB of system RAM per TB of pool data if you want to keep the dedup table in RAM."

Apologies. I've read various guides, and they always seem to say 5GB of RAM per 1TB of data. I was hoping I could stay under that by slicing and dicing up clever datasets and not hitting them simultaneously.
Happy to try without dedupe, but that would negate a lot of the benefit of going such an exotic route rather than the corporate Win2k12 way.
 

solarisguy

Guru
Joined
Apr 4, 2014
Messages
1,125
P.S. Try other tests too.
  • Have two separate pools, 6 disks each.
  • Did you try a local write? (With two pools that would be easy.)
  • Test pure network speed with iperf first, then FTP, etc. (see the sketch below).
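
For the iperf step, something along these lines would isolate the network (the address is a placeholder for the FreeNAS box's IP):

    # On the FreeNAS box:
    iperf -s
    # On the Xserve, a 30-second run with 4 parallel streams:
    iperf -c 10.0.0.5 -t 30 -P 4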
 

Robert Marley

Dabbler
Joined
Sep 11, 2014
Messages
29
Cool, so it sounds like dedupe is still a viable option if I get a 512GB SSD (at 5GB per 1TB) for L2ARC cache... ? And these newer versions of FreeNAS/ZFS supposedly don't nuke all your data when that SSD fails :) So the L2ARC doesn't need to be redundant?

I was able to get closer to gigabit line rate with a "RAID0"-type stripe across all drives (no parity). I'm just not sure how to figure out what my specific bottleneck is.
Is parity calculation a single-threaded process? Is it possible I'm hitting the limit of a single 2.0GHz core while the other 11 cores just don't get used for parity?
Running top via SSH while writing data doesn't show an obvious parity process like mds or mdraid.
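
One way to take the network out of the picture entirely is a local sequential write straight to the pool (the path is a placeholder):

    # Write ~10GB locally to the pool:
    dd if=/dev/zero of=/mnt/tank/ddtest bs=1m count=10000
    # Caveat: with compression enabled, zeroes compress to almost nothing,
    # so run this against a dataset with compression=off for an honest number.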
 

Robert Marley

Dabbler
Joined
Sep 11, 2014
Messages
29
Or is there a way to cheat and get the best of both worlds? RAIDZ3, no dedupe, no compression, no datasets.
Make a zvol instead, share it via iSCSI, mount and format it as NTFS on some Windows server, and dedupe at night?

Is that possible? Would I be losing some essential data-integrity features of ZFS at that point? Would I still be getting triple parity and bit-rot protection? What would I be losing?
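
If that were tried, the ZFS side would be a zvol exported as an iSCSI extent, something like this (name and size are placeholders):

    # Create a sparse 10TB volume to back the iSCSI extent:
    zfs create -s -V 10T tank/ntfs-lun

ZFS would still checksum, store parity for, and self-heal the blocks underneath NTFS; what is lost is file-level visibility, so snapshots become crash-consistent block images rather than browsable file trees.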
 

titan_rw

Guru
Joined
Sep 1, 2012
Messages
586
Wow. Dedupe turned on with 12T of data, and plans to go to 100T of data? Wow.

When you turned off dedupe, did you destroy and recreate the pool to completely get rid of dedupe? Otherwise the test isn't that valid.
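
Worth spelling out: setting dedup=off only affects new writes; blocks already written stay in the DDT. A clean no-dedupe test means starting over (pool and disk names are placeholders):

    # Turning dedup off does not shrink the existing dedup table:
    zfs set dedup=off tank
    # For a truly dedupe-free comparison, rebuild the pool:
    zpool destroy tank
    zpool create tank raidz2 da0 da1 da2 da3 da4 da5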
 

Robert Marley

Dabbler
Joined
Sep 11, 2014
Messages
29
There will be plenty of time to make fun of my newb-ness afterwards, but I really need to focus on options at the moment.
I've spent the last 10 years of my career hearing about what file systems don't do.
I want to explore what ZFS does do, and whether it has any use in extending MTTDL and/or eventually phasing out LTO5.

I haven't destroyed my deduped pool yet because nobody has weighed in on my suggested re-configs. So there is no "when" at the moment, titan.
 

solarisguy

Guru
Joined
Apr 4, 2014
Messages
1,125
ZFS does block-level deduplication.

I do not understand what deduplication on a Windows server would be. Would it be making hard links? You can make those on the FreeNAS server and serve AFP from there.
 

Robert Marley

Dabbler
Joined
Sep 11, 2014
Messages
29
@solarisguy The big picture is pretty much summed up by:
extend MTTDL with RAIDZ3 so far out as to be competitive with LTO tape backups, while staying completely vendor-agnostic.
Everything else (snapshots, dedupe, compression, speed) would be an added bonus.
Eventually I'll have to go even bigger, adding a shelf of another 36 4-6TB drives, so capacity will be reaching toward 200TB quickly.

Spectra Logic has an n-Verde product that does this with ZFS for about $50-75k per shelf of 144TB raw. If I can beat that here with you guys, that's awesome; otherwise we'll just have to bite the bullet.
 

Robert Marley

Dabbler
Joined
Sep 11, 2014
Messages
29
solarisguy said:
"ZFS does block-level deduplication.
I do not understand what deduplication on a Windows server would be. Would it be making hard links? You can make those on the FreeNAS server and serve AFP from there."

@solarisguy Win2k12 has a dedupe function that runs out-of-band, during server idle time. Once the data has been there for a few weeks, it passes through at night and performs a similar block-level dedupe. But because it's not real-time, it works with 2GB of RAM. So my idea, which could be bogus and hare-brained, would be to share iSCSI out of FreeNAS to a Win2k12 server, format it as NTFS over there, and sort of cheat my way to the benefits of both: ZFS bit-rot protection and Win2k12 non-real-time dedupe...
 

gpsguy

Active Member
Joined
Jan 22, 2012
Messages
4,472
You might want to contact iXsystems and look at their TrueNAS products.
 

solarisguy

Guru
Joined
Apr 4, 2014
Messages
1,125
I do not have iSCSI experience of that type, so I cannot comment.

Once you drop deduplication, you should be fine.

The only potential problem I see is that you will be managing rather large data sets with tools you might not be comfortable with.
 

Robert Marley

Dabbler
Joined
Sep 11, 2014
Messages
29
@solarisguy Haha, true! You caught me. I'm an everything-but-Linux/Unix kinda guy :)
In a perfect world, ReFS would support extended attributes and dedupe, and Windows Storage Spaces would support thin provisioning and dual parity at the same time.
But without all that 21st-century hotness, I have to branch out.
 