CIFS Slow Performance Analysis

Status
Not open for further replies.

dropline

Dabbler
Joined
Aug 29, 2015
Messages
25
Hi guys!

I have a few problems with our file server and am not really sure where to start addressing them. I hope I posted in the correct subforum. I am looking forward to getting your input on this issue :)

We are a photography studio and use our FreeNAS server as our main file server.
We transfer our photoshoots from a dedicated machine onto the FreeNAS via CIFS. Shoots are between
25 and 60 GB each, all RAW images. The FreeNAS is connected via a wired gigabit connection. We probably do up to 10 to 12 shoots a day.


On top of this, we have photographers and production selecting and editing the images on this machine,
using Bridge and Photoshop. I would say we can have up to 10 people working on this server, editing images,
converting images, selecting them, or uploading them to the FreeNAS. 95% of all the editing happens with JPGs,
after the RAWs were converted. It all happens via CIFS shares.

We do have problems with the performance of this machine: slowdowns occur, and I am having a hard
time figuring out what exactly causes them.

I am not sure if we just have too many users/operations going on, or if there is some other problem.


Some details about the machine:


Freenas Version:

Version: FreeNAS-9.2.1.5-RELEASE-x64
Processor: Intel(R) Xeon(R) CPU E3-1220 V2 @ 3.10GHz
Memory: 16087MB
Disks: 4x 3.0 TB WD RED
Motherboard: Intel S1200BTS E98683-352

I am happy to provide more info; whatever you need, just let me know how to export it from the shell. I tried to
export some info on the load that the server faces, but could not find out how to export it.

Thanks a lot for your help in advance!

Michael
 

SweetAndLow

Sweet'NASty
Joined
Nov 6, 2013
Messages
6,421
How full is your pool? What do you mean by performance issues? Do file transfers go slow? Is latency bad? Listing directories slow?

 

dropline

Dabbler
Joined
Aug 29, 2015
Messages
25
Thanks for your reply.

The pool is 8 TB, of which 1.5 TB is free.
Latency is fine, and directories list fine as well.

It is mainly that the photo edits and selects go slow or very slow. So I guess you could break that down to low transfer speeds. Sorry, I could have been clearer on this.
 

SweetAndLow

Sweet'NASty
Joined
Nov 6, 2013
Messages
6,421
The fuller your pool gets, the slower things will get. You are at 81% full, so I hope you are looking to expand your pool soon.

You should figure out how to measure your performance. Just saying things feel slow is not good enough.
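
One way to get numbers instead of impressions is to time a test file on the mounted share. The following is a minimal Python sketch, not a proper benchmark: the function name, file size, and chunk size are assumptions for illustration, and the read figure can be inflated by client-side caching, so treat it as a rough upper bound.

```python
import os
import tempfile
import time


def measure_throughput(directory, size_mb=64, chunk_mb=4):
    """Write, then read back, a test file in `directory`.

    Returns (write_MB_per_s, read_MB_per_s). The file is deleted afterwards.
    """
    # Random data, so compression on the server cannot flatter the result.
    chunk = os.urandom(chunk_mb * 1024 * 1024)
    chunks = size_mb // chunk_mb
    fd, path = tempfile.mkstemp(dir=directory, suffix=".bench")
    try:
        start = time.perf_counter()
        with os.fdopen(fd, "wb") as f:
            for _ in range(chunks):
                f.write(chunk)
            f.flush()
            os.fsync(f.fileno())  # ask the OS to push the data out before stopping the clock
        write_s = max(time.perf_counter() - start, 1e-9)

        start = time.perf_counter()
        read_bytes = 0
        with open(path, "rb") as f:
            while True:
                block = f.read(len(chunk))
                if not block:
                    break
                read_bytes += len(block)
        read_s = max(time.perf_counter() - start, 1e-9)
    finally:
        os.remove(path)

    total_mb = chunks * chunk_mb
    if read_bytes != total_mb * 1024 * 1024:
        raise RuntimeError("read back fewer bytes than were written")
    return total_mb / write_s, total_mb / read_s
```

Called as `measure_throughput("/path/to/mounted/share", size_mb=2048)` (the path is a placeholder), it gives a write and a read figure in MB/s that can be compared between a quiet evening and a busy afternoon.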

 

tvsjr

Guru
Joined
Aug 29, 2015
Messages
959
And if those 10 people are simultaneously reading or writing big files, those drives will spend more time seeking than they will reading data. If there's a lot of data going back and forth all the time, it may be time to consider a higher performance solution (probably striped mirrors rather than RAIDZ).

Pool needs to be expanded for sure. You didn't list your motherboard, but being an E3, you can only go to 32GB RAM... but that would be a worthwhile upgrade. You may find that it's time to upgrade the box and go to an E5 where you can throw lots of RAM at it... that would give you far better caching.

And, you're waaaay behind on your FreeNAS version. 9.2.1.5 was released April 2014. 9.10.2 is current. An update won't dramatically help things, but certainly wouldn't hurt.
 

SweetAndLow

Sweet'NASty
Joined
Nov 6, 2013
Messages
6,421
It could hurt. With all the changes that have happened it might cause a little downtime.

 

tvsjr

Guru
Joined
Aug 29, 2015
Messages
959
True... I suppose I should say "couldn't hurt beyond what you might normally deal with during an upgrade" - my brain just factors that part in naturally, having been through many "seamless! totally hands-off!" upgrades that weren't.
 

dropline

Dabbler
Joined
Aug 29, 2015
Messages
25
Thanks for all the participation, I appreciate your efforts to help me with this!

The theory that we simply have too much traffic going on makes a lot of sense to me. My guess would be that transferring the whole shoots onto the server takes the biggest chunk out of performance. Like I said, one shoot can be between 25 and 60 GB, roughly. Photographers and production work mainly with high-res JPGs, which are considerably smaller. A shoot, after converting to JPG, comes down to 5-15 GB, a massive difference.

To confirm this, I would like to run some tests and export the logs afterwards, so I can put some numbers to this theory. I think logs for the disks and LAN would be the most interesting, but RAM and CPU could also be helpful.
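
A rough back-of-the-envelope check of those numbers, as a Python sketch. It assumes the theoretical gigabit ceiling of 125 MB/s (1000 Mbit/s divided by 8) and decimal gigabytes; real CIFS throughput will be lower, so these are best-case times, not measurements.

```python
# Theoretical ceiling of a gigabit link; an assumption, not a measured value.
GIGABIT_MB_PER_S = 1000 / 8


def transfer_minutes(size_gb, mb_per_s=GIGABIT_MB_PER_S):
    """Best-case minutes needed to move size_gb (decimal GB) at mb_per_s."""
    return size_gb * 1000 / mb_per_s / 60


def daily_volume_gb(shoots_per_day, gb_per_shoot):
    """Total RAW data uploaded per day, in GB."""
    return shoots_per_day * gb_per_shoot


# Lower and upper ends of the ranges given in the thread.
for label, shoots, gb in (("light day", 10, 25), ("heavy day", 12, 60)):
    total = daily_volume_gb(shoots, gb)
    print(f"{label}: {total} GB/day, "
          f"{transfer_minutes(gb):.1f} min per shoot, "
          f"{transfer_minutes(total) / 60:.1f} h of link time")
```

Under these assumptions a 60 GB shoot needs at least 8 minutes on the wire, and a heavy day of 12 such shoots keeps the link fully busy for about 1.6 hours, before any editing traffic is counted.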

How can I export that out of the shell? Is there a guide here, on how to export these logs?

Also, is there a way to display the model of the Mainboard? It is a Supermicro, but I don't know the model.
 

tvsjr

Guru
Joined
Aug 29, 2015
Messages
959
FN 9.10.2 gives you many more graphs under Reporting, including stats on drive utilization. That would answer most of your questions. RAM and CPU should exist under Reporting of even an older version like you're running.

"dmidecode" from the shell should give you the motherboard info - it should be fairly close to the top.

I suspect you're simply out of IOPS, combined with a pool that is quite full and likely highly fragmented. Copy/paste the output of "zpool list" and "zpool status" here... include it in CODE tags so it's readable.

Just out of curiosity, are you sure you don't have a failing drive? Email alerts configured, no alerts in the GUI, etc.? That would cause a substantial performance hit.
 

dropline

Dabbler
Joined
Aug 29, 2015
Messages
25
@tvsjr I have scheduled the upgrade to 9.10. As I am not on location, I will have to do this remotely, so I will do it on Saturday night; in case something goes wrong, I have time to fly down on Sunday and fix things while the studio is closed.

Board is actually an Intel:

Intel S1200BTS E98683-352


"zpool list":
Code:

NAME     SIZE  ALLOC   FREE  CAP  DEDUP  HEALTH  ALTROOT
-           -      -      -    -      -       -  -
-           -      -      -    -      -       -  -
studio  10.9T  8.72T  2.15T  80%  1.00x  ONLINE  /mnt



"zpool status":
Code:

  pool: Galaxy1
 state: UNAVAIL
status: One or more devices are faulted in response to IO failures.
action: Make sure the affected devices are connected, then run 'zpool clear'.
   see: http://illumos.org/msg/ZFS-8000-JQ
  scan: scrub repaired 0 in 13h37m with 0 errors on Sun Jan 17 13:37:48 2016
config:

        NAME                    STATE     READ WRITE CKSUM
        Galaxy1                 UNAVAIL      0     0     0
          15599678839288462067  REMOVED      0     0     0  was /dev/gptid/e73ea7d3-f3b6-11e4-afcf-001e67aa856f

errors: 4 data errors, use '-v' for a list

  pool: Galaxy2
 state: UNAVAIL
status: One or more devices are faulted in response to IO failures.
action: Make sure the affected devices are connected, then run 'zpool clear'.
   see: http://illumos.org/msg/ZFS-8000-JQ
  scan: scrub repaired 0 in 36h56m with 0 errors on Mon Jan 23 12:57:02 2017
config:

        NAME                   STATE     READ WRITE CKSUM
        Galaxy2                UNAVAIL      0     0     0
          9196685175823004860  REMOVED      0     0     0  was /dev/gptid/c23eedfd-9eeb-11e5-a0f4-001e67aa856f

errors: 3 data errors, use '-v' for a list

  pool: studio
 state: ONLINE
status: The pool is formatted using a legacy on-disk format.  The pool can
		still be used, but some features are unavailable.
action: Upgrade the pool using 'zpool upgrade'.  Once this is done, the
		pool will no longer be accessible on software that does not support feature
		flags.
  scan: scrub repaired 0 in 15h5m with 0 errors on Sun Jan 29 15:05:48 2017
config:

        NAME        STATE     READ WRITE CKSUM
        studio      ONLINE       0     0     0
          raidz1-0  ONLINE       0     0     0
            ada1p2  ONLINE       0     0     0
            ada0p2  ONLINE       0     0     0
            ada2p2  ONLINE       0     0     0
            ada3p2  ONLINE       0     0     0

errors: No known data errors



The pools "Galaxy1" and "Galaxy2" are USB hard drives that are not connected at the moment. "Galaxy2"
might have some drive errors; see how long the scrub took. These drives were used to archive some data and were not used in production at all. I am looking into the possibility that one of these drives was defective and, if so, whether it has anything to do with the slowdowns.

I have no alerts in the GUI, but I need to check whether the email alerts are set up. I did not set this server up, but I will check and either set the alerts up or change them so they are sent to me.
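
The `zpool list` output above can also be checked by script rather than by eye. This is a minimal Python sketch that assumes the whitespace-separated column layout shown in this post; the 80% threshold and the function names are illustrative assumptions, not FreeNAS settings.

```python
def parse_zpool_list(text):
    """Parse plain `zpool list` output into a list of dicts keyed by column header."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    header = lines[0].split()
    pools = []
    for ln in lines[1:]:
        fields = ln.split()
        if len(fields) != len(header):
            continue  # skip anything that does not match the header layout
        pools.append(dict(zip(header, fields)))
    return pools


def pools_at_or_above(pools, threshold_pct=80):
    """Return the names of pools whose CAP column is at or above threshold_pct."""
    flagged = []
    for pool in pools:
        cap = pool["CAP"]
        # Unavailable pools report "-" instead of a percentage and are ignored.
        if cap.endswith("%") and int(cap[:-1]) >= threshold_pct:
            flagged.append(pool["NAME"])
    return flagged


# The output pasted in this post.
SAMPLE = """\
NAME     SIZE  ALLOC   FREE  CAP  DEDUP  HEALTH  ALTROOT
-           -      -      -    -      -       -  -
-           -      -      -    -      -       -  -
studio  10.9T  8.72T  2.15T  80%  1.00x  ONLINE  /mnt
"""

print(pools_at_or_above(parse_zpool_list(SAMPLE)))
```

Run against the pasted output, only the `studio` pool is flagged; the two disconnected pools are skipped because they report no capacity.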
 

SweetAndLow

Sweet'NASty
Joined
Nov 6, 2013
Messages
6,421
Whoa, I would run away from this server if I were you. It wasn't set up using FreeNAS, and the removed drives would freak me out.

 

dropline

Dabbler
Joined
Aug 29, 2015
Messages
25
What do you mean by "it wasn't set up using freenas"? It's just two USB drives that are disconnected at the moment...
 

SweetAndLow

Sweet'NASty
Joined
Nov 6, 2013
Messages
6,421
FreeNAS uses gptids for the disk labels; your pool was created with just device labels.

You also have RAIDZ1, which isn't a good idea in today's world of multi-terabyte disks. You can read the noob guide if you want to know more.
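
To see which naming scheme a pool's members use, the device names from `zpool status` can be sorted into the two styles mentioned here. This is a minimal Python sketch; the `gptid/` prefix test is an assumption based on the `/dev/gptid/...` paths shown earlier in the thread, and the function name is illustrative.

```python
def label_style(device_name):
    """Classify a pool member name as 'gptid' or plain 'device' label."""
    name = device_name.strip()
    if name.startswith("/dev/"):
        name = name[len("/dev/"):]
    return "gptid" if name.startswith("gptid/") else "device"


# Member names of the studio pool, as pasted earlier in the thread.
members = ["ada1p2", "ada0p2", "ada2p2", "ada3p2"]
for member in members:
    print(member, "->", label_style(member))
```

All four members of the `studio` pool come out as plain device labels, while the paths recorded for the two USB pools would be classified as gptid.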

 

dropline

Dabbler
Joined
Aug 29, 2015
Messages
25
Very interesting. The server is quite old by now, six-plus years I guess. So, this might be a bit of a vague question, but in terms of updating to 9.10, are the device labels going to be a problem here?
 

Evi Vanoost

Explorer
Joined
Aug 4, 2016
Messages
91
No, ZFS doesn't care about your device labels. I've upgraded a pool from Solaris to OpenIndiana to Linux to FreeNAS; they all used different labeling methods, and ZFS assembled the pools without any issues.
 

dropline

Dabbler
Joined
Aug 29, 2015
Messages
25
Thank you. Are there any guides or documentation about doing a big update step from 9.2 to 9.10?
9.2 does not even have the Update section yet. I have to go to Settings -> Advanced -> Firmware Update, download the 9.10 image, save it there, and then update this way. I am a bit anxious about it, to be honest, as stuff could go wrong. Is there a routine or checklist for me to minimize possible errors here?
 

SweetAndLow

Sweet'NASty
Joined
Nov 6, 2013
Messages
6,421
Save your config, save your encryption keys, and have a verified backup.

With an upgrade this big, I would just use a new boot device. I would not upgrade; I would just boot the new version. This way, if it doesn't work, you can just put the old boot device back in and be on your way.

 