I have a write performance problem, need help :-(

Status
Not open for further replies.

skyyxy

Contributor
Joined
Jul 16, 2016
Messages
136
Hi everybody here, I have a server running the latest FreeNAS, 11.1-U6. The hardware list is:
E5-1620v3
Supermicro X10SRL-F
32GB or 64GB registered ECC DDR4-2400
LSI 9211-8i HBA in IT mode
Seagate 3TB desktop HDD x 18 (I also have Exos enterprise drives)
Intel X520 dual-port 10GbE NIC, plus 2x 1GbE onboard
Samsung 1TB 850 EVO SATA3 for L2ARC
Samsung 1TB SM961 NVMe for ZIL

I'm using Adobe Premiere Pro, After Effects, and DaVinci Resolve under macOS and connect to the server over AFP. Reading files or importing clips into the applications works great and is smooth. But when I write files to the data zpool, disk usage gets very high (80% busy for the desktop-HDD zpool, 40-50% for the Exos enterprise HDD pool), and at that moment playback in Premiere, After Effects, and Resolve on the other workstations becomes very slow. I even set up a 1TB NVMe drive as a ZIL today (there was no ZIL before), but reads show less than 10% usage. I have no idea why this is. Can anybody help me? Thanks a lot.

In the picture, red is write and green is read.
 

Attachments

  • 2.pic.jpg (219.5 KB)

m0nkey_

MVP
Joined
Oct 27, 2015
Messages
2,739
Samsung 1TB 850 EVO SATA3 for L2ARC
Samsung 1TB SM961 NVMe for ZIL
Remove these, you'll see a performance increase. A 1TB L2ARC is ridiculous with only "32 or 64GB" of RAM. You'll be storing all the L2ARC pointers in RAM with nothing left for ARC. The ZIL won't even be touched because AFP doesn't even do synchronous writes.

You've wasted a perfectly good 850EVO and NVMe SSD. They're better suited to other tasks.
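
If you'd rather do it from the shell than the GUI, removing them and confirming the share really is asynchronous looks roughly like this (just a sketch; "tank" and the gptid labels are placeholders for whatever zpool status actually shows on your box):

    # find the cache (L2ARC) and log (SLOG) entries in the pool layout
    zpool status tank
    # detach them from the pool (use the exact labels shown by zpool status)
    zpool remove tank gptid/xxxx-l2arc-device
    zpool remove tank gptid/xxxx-slog-device
    # confirm the dataset is not forcing sync writes (standard = the application decides)
    zfs get sync tank/yourdataset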
 

Allan Jude

Dabbler
Joined
Feb 6, 2014
Messages
22
I'm using Adobe Premiere Pro, After Effects, and DaVinci Resolve under macOS and connect to the server over AFP. Reading files or importing clips into the applications works great and is smooth. But when I write files to the data zpool, disk usage gets very high (80% busy for the desktop-HDD zpool, 40-50% for the Exos enterprise HDD pool), and at that moment playback in Premiere, After Effects, and Resolve on the other workstations becomes very slow. I even set up a 1TB NVMe drive as a ZIL today (there was no ZIL before), but reads show less than 10% usage. I have no idea why this is. Can anybody help me? Thanks a lot.

In the picture, red is write and green is read.

What is the problem? When you write to the disks, the usage SHOULD be high, that means the disks are doing the work you have asked them to do...

Are you not getting the speeds you expect? I think you can find a different graph that will show the actual write speed in megabytes per second instead of as a 'busy %'.

It may be possible you are seeing a lot of 'read-modify-write' or something if the recordsize of your data set doesn't match up with what the application is doing, but I wouldn't jump to that conclusion right away.
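
For a quick look from the FreeNAS shell, something like this shows the real throughput and the dataset's recordsize (the pool and dataset names here are only placeholders):

    # per-vdev read/write bandwidth and IOPS, refreshed every second
    zpool iostat -v tank 1
    # recordsize of the dataset the workstations are writing to
    zfs get recordsize tank/video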
 

skyyxy

Contributor
Joined
Jul 16, 2016
Messages
136
What is the problem? When you write to the disks, the usage SHOULD be high, that means the disks are doing the work you have asked them to do...

Are you not getting the speeds you expect? I think you can find a different graph that will show the actual write speed in megabytes per second instead of as a 'busy %'.

It may be possible you are seeing a lot of 'read-modify-write' or something if the recordsize of your data set doesn't match up with what the application is doing, but I wouldn't jump to that conclusion right away.

Thanks for your reply. How can I fix it? When somebody writes to the server (copying files or exporting a clip from Adobe Premiere), the other workstations get very slow at things like playback in Premiere or DaVinci Resolve. Or do I need to use a RAID card like the LSI 9260-8i with an SSD as L2ARC instead of the HBA (currently an LSI 9211-8i)? Thanks a lot.
 
Joined
Jan 26, 2015
Messages
8
skyyxy,
(1) Could CPU load be the bottleneck? Check CPU usage while you write to the pool and it feels "slow".
(2) While you're writing to the pool, how do you determine that the pool is "slow"? Additional writes being slow, or reads?
(3) How are the 18 disks organized in your pool?
(4) How full is your pool?
 

skyyxy

Contributor
Joined
Jul 16, 2016
Messages
136
skyyxy,
(1) Could CPU load be the bottleneck? Check CPU usage while you write to the pool and it feels "slow".
(2) While you're writing to the pool, how do you determine that the pool is "slow"? Additional writes being slow, or reads?
(3) How are the 18 disks organized in your pool?
(4) How full is your pool?

Thanks for your reply.
1: My CPU is a Xeon E5-1620v2, 3.7GHz, 4 cores and 8 threads, so I don't think it is too weak for a NAS.
2: I tested again today. Three workstations copied big clips (200-600GB files) to the FreeNAS server at the same time; one of them is on the 10GbE network (my server has two 10GbE ports and two 1GbE onboard), the other two are on 1GbE. At that moment a fourth workstation running Premiere or DaVinci Resolve dropped a clip onto the timeline, and playback started to stutter or stop (playback is actually reading from the server). Then I closed Premiere and Resolve and ran the AJA System Test; the read speed was just 5-10MB/s, sometimes a little higher or worse.
3: I organized the 18 disks as a single RAIDZ2, and use a Samsung 860 EVO SATA3 1TB SSD as the L2ARC for this pool.
4: There is still 18% free space. Actually it has always had this issue since I built it.

Today I removed the ZIL and it is still like that. :)
Tomorrow I will build another server to test the write performance with a RAID card (I have an LSI 9260-8i and a HighPoint 2720); hopefully I get lucky.
Yesterday a friend brought over a commercial NAS built on Linux with the Btrfs filesystem. It uses an LSI RAID card and no SSD cache (no bcache yet). He brought the server to my office, we tested the same project, and there was no problem. That's why I want to try the RAID card tomorrow.

Thanks again. :) I really hope this can be fixed, because I really like FreeNAS; it's a great and very powerful system.
 
Joined
Jan 26, 2015
Messages
8
Hi skyyxy,

Are you aware that a RAIDZ2 vdev of 18 disks has roughly the write IOPS of *one* disk? That applies to randomly distributed writes, but with a pool that has only 18% space left and has surely accumulated quite some fragmentation, a lot of your write traffic is in fact fairly random even when you are "streaming" writes of a few large files. That's what you get with a copy-on-write filesystem plus little free space: ZFS has to work hard to find the largest contiguous chunks of free space, which means disk head movement.
So at the moment your bottleneck is very likely your disk mechanics.
Obviously your L2ARC, if it helps at all, helps with reads only (it might relieve the spinning disks of some read requests and the head movement they cause).
The separate ZIL (actually the "SLOG") should not have any effect on asynchronous writes, which is what you mostly seem to have, as I understand from this thread.

So to improve the situation, you should keep your pool less filled up, but more importantly organize your disks in a different way. Using 2 RAIDZ2 vdevs of 9 disks each, or even 3 vdevs of 6 disks, should improve your pool's write performance by roughly a factor of 2 or 3, respectively. And yes, I am aware you cannot squeeze as much usable space out of that equipment.
[Also, 18 disks in a single RAIDZ2 vdev is unusually many ... IIRC lower disk counts per vdev are recommended.]
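
To illustrate, a pool laid out as three 6-disk RAIDZ2 vdevs would be created roughly like this (only a sketch; "tank" and the da0..da17 device names are placeholders, and on FreeNAS you would normally build this through the GUI rather than by hand):

    # ZFS stripes across vdevs, so random write IOPS scale with the vdev count
    zpool create tank \
      raidz2 da0 da1 da2 da3 da4 da5 \
      raidz2 da6 da7 da8 da9 da10 da11 \
      raidz2 da12 da13 da14 da15 da16 da17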

Sorry for the bad news. Since I had a similar experience on my first test pool when I started with ZFS, I have always been IOPS-aware when planning my pools' layouts (even going to striped mirrors for particularly IOPS-hungry applications like DB or VM backing stores), plus limiting usable pool capacity via quota. Thanks to this forum, by the way.

If you argue that a COW filesystem like ZFS makes you lose a lot of space, you're right. That's what you pay for the feature set you get and for the fact that you can mostly avoid expensive high-RPM drives. :)



 

skyyxy

Contributor
Joined
Jul 16, 2016
Messages
136
Hi skyyxy,

Are you aware that a RAIDZ2 vdev of 18 disks has roughly the write IOPS of *one* disk? That applies to randomly distributed writes, but with a pool that has only 18% space left and has surely accumulated quite some fragmentation, a lot of your write traffic is in fact fairly random even when you are "streaming" writes of a few large files. That's what you get with a copy-on-write filesystem plus little free space: ZFS has to work hard to find the largest contiguous chunks of free space, which means disk head movement.
So at the moment your bottleneck is very likely your disk mechanics.
Obviously your L2ARC, if it helps at all, helps with reads only (it might relieve the spinning disks of some read requests and the head movement they cause).
The separate ZIL (actually the "SLOG") should not have any effect on asynchronous writes, which is what you mostly seem to have, as I understand from this thread.

So to improve the situation, you should keep your pool less filled up, but more importantly organize your disks in a different way. Using 2 RAIDZ2 vdevs of 9 disks each, or even 3 vdevs of 6 disks, should improve your pool's write performance by roughly a factor of 2 or 3, respectively. And yes, I am aware you cannot squeeze as much usable space out of that equipment.
[Also, 18 disks in a single RAIDZ2 vdev is unusually many ... IIRC lower disk counts per vdev are recommended.]

Sorry for the bad news. Since I had a similar experience on my first test pool when I started with ZFS, I have always been IOPS-aware when planning my pools' layouts (even going to striped mirrors for particularly IOPS-hungry applications like DB or VM backing stores), plus limiting usable pool capacity via quota. Thanks to this forum, by the way.

If you argue that a COW filesystem like ZFS makes you lose a lot of space, you're right. That's what you pay for the feature set you get and for the fact that you can mostly avoid expensive high-RPM drives. :)
Thank you very much for your help. Now I have a question: will the RAID card help write performance? I would build a RAID6 in the LSI 9260-8i BIOS, so FreeNAS would just see one disk in the system. Hahaha, I will do the test.
 
Joined
Jan 26, 2015
Messages
8
Hi skyyxy,

well, if you're using a battery-backed write cache (or an equivalent flash cache), I could imagine that's helpful.
On the other hand, ZFS loses direct access to the drives, making its built-in optimizations for dealing with spinning disks meaningless, and in particular you lose access to the drives' SMART info and have to rely on the RAID controller and its management software to learn about problematic disks ...
Generally this is not a recommended configuration, if you search the forum.

However, give it a try - ultimately it has to work for YOU :)

Good luck!

 

skyyxy

Contributor
Joined
Jul 16, 2016
Messages
136
Hi skyyxy,

well, if you're using a battery-backed write cache (or an equivalent flash cache), I could imagine that's helpful.
On the other hand, ZFS loses direct access to the drives, making its built-in optimizations for dealing with spinning disks meaningless, and in particular you lose access to the drives' SMART info and have to rely on the RAID controller and its management software to learn about problematic disks ...
Generally this is not a recommended configuration, if you search the forum.

However, give it a try - ultimately it has to work for YOU :)

Good luck!
Thanks a lot. I will start the test tomorrow. Hope it works...
 

danb35

Hall of Famer
Joined
Aug 16, 2011
Messages
15,504
I would build a RAID6 in the LSI 9260-8i BIOS, so FreeNAS would just see one disk in the system.
Yes, and ZFS will then be nearly useless to you. Hardware RAID controllers are strongly discouraged. But then, so are 18-disk vdevs.
 
Joined
Jan 26, 2015
Messages
8
Danb35, indeed part of ZFS's feature set will be useless, and there are definitely compelling reasons to give ZFS full access to single disks. The disadvantages of such a crippled ZFS setup have been discussed ad nauseam in this forum.

What the OP does not lose are the features resulting from ZFS's copy-on-write nature: cheap snapshotting and snapshot replication to remote pools. There are more advanced features that stay, like ZFS custom properties.

skyyxy:
- Going with a HW RAID controller means you have to make sure you use that controller's mechanisms for identifying and replacing defective disks (which IMO are inferior to what ZFS can do). So install the RAID controller's management software and make sure you're alerted by mail once a drive shows problems.
- Create at least 2 RAID-6 arrays and stripe over them (RAID-60). A single large RAID-6 has lousy write performance on classical, non-ZFS RAID setups as well.
- Danb35 is kind of right, but as long as you're cautious and know what you're doing, at least your data should be OK. Give it a try and let us know how things perform.

Cheers ...
 

skyyxy

Contributor
Joined
Jul 16, 2016
Messages
136
Danb35, indeed part of ZFS's feature set will be useless, and there are definitely compelling reasons to give ZFS full access to single disks. The disadvantages of such a crippled ZFS setup have been discussed ad nauseam in this forum.

What the OP does not lose are the features resulting from ZFS's copy-on-write nature: cheap snapshotting and snapshot replication to remote pools. There are more advanced features that stay, like ZFS custom properties.

skyyxy:
- Going with a HW RAID controller means you have to make sure you use that controller's mechanisms for identifying and replacing defective disks (which IMO are inferior to what ZFS can do). So install the RAID controller's management software and make sure you're alerted by mail once a drive shows problems.
- Create at least 2 RAID-6 arrays and stripe over them (RAID-60). A single large RAID-6 has lousy write performance on classical, non-ZFS RAID setups as well.
- Danb35 is kind of right, but as long as you're cautious and know what you're doing, at least your data should be OK. Give it a try and let us know how things perform.

Cheers ...

Thanks a lot to everybody here. I bought the battery today and just finished setting up a new small FreeNAS server (6 disks, because I can't touch my old server, too much data), and will test the difference between the HBA and the RAID card for my workload. Actually I really want to keep using the HBA like you guys said, but the issue really leaves me confused about whether and how it can be fixed.
 

skyyxy

Contributor
Joined
Jul 16, 2016
Messages
136
Finally I solved the problem.
I have a 40GbE network adapter (Intel XL710 with a QSFP+ port) connected to a 40GbE switch (1x 40GbE port and 24x 1GbE ports). When I switch to the 10GbE network everything works great, even with 5 people writing at the same time, but it is still very slow when I go back to the 40GbE network. Does anybody have optimized tunable parameters for the Intel XL710 40GbE? Thanks a lot.
 