New 32 4TB HDD Server

Status
Not open for further replies.

cougarmaster

Dabbler
Joined
Jul 7, 2014
Messages
18
Hi guys,

I have been using FreeNAS 8 for quite a while now and am quite happy with it; it runs quite predictably. I have now installed a new server with the specs below, but since I am away from it I am not quite sure of some of the particulars:

1] LGA 2011 E5-2600 v2 cpu x 1
2] 64GB ECC RAM (16GB x 4)
3] LSI 9211-8i flashed to IT mode P16 HBA with breakout cables
4] WD Enterprise 4TB x 36
5] Supermicro 4U 36 bays 1280W redundant power supply
6] 4GB USB Kingston for boot
7] FreeNAS 9.2.1.6

This is basically a very loud server in the server room. I have set up the RAID-Z as follows:

1] Raid Z2 - 10 Drives
2] Raid Z2 - 10 Drives extended to first raid
3] Raid Z2 - 10 Drives extended to first raid - Total = 81.5TB
4] Raid Z2 - 6 Drives Separate partition - Total = 16TB
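
As a rough cross-check of the layout above, the nominal capacity can be sketched as below. This assumes each RAID-Z2 vdev gives up two drives' worth of space to parity and ignores ZFS metadata, padding and reserved space, so the totals a real system reports will differ from these. The function name is only illustrative.

```python
TB = 10**12    # decimal terabyte, as used on drive labels
TIB = 2**40    # binary tebibyte, as used by most reporting tools


def raidz2_usable_bytes(drives, drive_tb=4):
    """Nominal usable bytes of one RAID-Z2 vdev: (drives - 2) data drives."""
    if drives < 4:
        raise ValueError("this sketch assumes at least 4 drives per RAID-Z2 vdev")
    return (drives - 2) * drive_tb * TB


# Three 10-drive vdevs striped together, plus the separate 6-drive vdev.
main_pool = sum(raidz2_usable_bytes(n) for n in (10, 10, 10))
second_pool = raidz2_usable_bytes(6)

print(f"main pool:   {main_pool / TB:.0f} TB = {main_pool / TIB:.1f} TiB nominal")
print(f"second pool: {second_pool / TB:.0f} TB = {second_pool / TIB:.1f} TiB nominal")
```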

My problem is that I am getting a lot of error messages about the LSI 9211-8i, "IOC fault, resetting", and the maximum write speed I can get is 55MB/s. I also see in top that the ARC is using nearly all the RAM, leaving about 1-2GB free. I have searched everywhere about the resetting issue but have not been able to find a suitable solution.

The server is running and backing up, and since it is not giving any critical errors I am forced to keep using it in this state, but it seems extremely slow. All the drives are in good condition. I have not done any tuning or changed any settings other than the login password and the shares, which use CIFS and AFP. The only machines accessing it are servers. Please give me some direction in rectifying this problem.

Tks Eric
 

Ericloewe

Server Wrangler
Moderator
Joined
Feb 15, 2014
Messages
20,194
First of all, you need a lot more RAM if you want decent performance. You have less than 512MB per TB of storage. You'll want to try 128GB of RAM first, probably.
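
The ratio can be worked out as in the sketch below. It assumes the raw capacity of all 36 drives from the spec list is the figure that counts (usable capacity would give a slightly better number), and the helper name is made up for illustration.

```python
def ram_mb_per_tb(ram_gb, raw_tb):
    """RAM available per TB of raw storage, in MB."""
    return ram_gb * 1024 / raw_tb


raw_tb = 36 * 4                           # 36 drives of 4TB each
current = ram_mb_per_tb(64, raw_tb)       # RAM installed today
suggested = ram_mb_per_tb(128, raw_tb)    # the 128GB suggested above

print(f"raw capacity: {raw_tb} TB")
print(f"64GB RAM:  {current:.0f} MB per TB")
print(f"128GB RAM: {suggested:.0f} MB per TB")
```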
 

cougarmaster

Dabbler
Joined
Jul 7, 2014
Messages
18
Hi Ericloewe,

Thanks for the reply. This server is used only for backing up about 3 other servers. So if I don't increase the RAM, will that cause the HDDs to write at only 55MB/s? Also, is the "IOC fault, resetting" problem caused by RAM?

Tks Eric
 

Ericloewe

Server Wrangler
Moderator
Joined
Feb 15, 2014
Messages
20,194
The latter sounds like an unrelated problem, but slow transfers are frequently caused by a shortage of RAM. What protocol are you using for file transfers?
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,525
I'm with Eric on this. Your IOC errors either don't mean anything or mean your hardware is probably failing. But they are likely unrelated to your performance problem. Your performance problem is almost certainly related to your quantity of RAM in relation to pool size.
 

cougarmaster

Dabbler
Joined
Jul 7, 2014
Messages
18
Hi cyberjock,

Thanks for the reply. What I don't understand is this: does RAM affect performance even when testing with dd?

Tks Eric
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,525
Yes. ZFS uses your RAM as its cache, and without enough cache it will be slow. Just like the manual and my guide state, the more RAM the better.
 

titan_rw

Guru
Joined
Sep 1, 2012
Messages
586
I'd be interested in arc_summary.py output after the server's been up a while and you've done a fair bit of I/O on the pool.

gstat output during the throughput test is sometimes useful too.
 

cougarmaster

Dabbler
Joined
Jul 7, 2014
Messages
18
Thanks cyberjock, I appreciate your time. Are there any other tweaks I could do to increase my HDD speed? I am tied down and cannot purchase any more RAM. Please help.

Tks Eric
 

aufalien

Patron
Joined
Jul 25, 2013
Messages
374
Well, arcstat.py is your friend and will tell you if you are ARC starved. For example I've 192GB for my ~100TB usable but am only using ~15TB (CORRECTION, I have 15GB) of ARC at any one time. In fact the only time I use 150GB of ARC is during a replication but I digress.

The IOC errors can be a failing HBA, but first reflash the firmware and reseat the card in a different slot. If the errors persist, reflash with a newer version of the firmware to see what happens. And if the errors still persist, contact LSI support as it may be a bad card.

By the way, have you run any SMART tests yet? I would just for curiosity. I've seen drive errors manifest in the strangest of places.
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,525
@aufalien-

You are my f'in hero if you have 15TB of ARC! I will marry you! ;)
 

cougarmaster

Dabbler
Joined
Jul 7, 2014
Messages
18
Thanks aufalien, I will reflash the card and seat it in a different slot. I will need to do it next week since I am away from the company. I am sorry, but I am not familiar with how to run things in FreeBSD, so please let me know how to execute arcstat.py.

Tks Eric
 

aufalien

Patron
Joined
Jul 25, 2013
Messages
374
:) Well, I wish the systems used more since I bought so much, but it is what it is. That's why you and others have said use case is first and foremost. The money would have been better spent on a crazy ZIL, but you live and learn.

However, at least it's being used during replication, so it's not all a waste. And L2ARC in my case is completely a waste, so no need for it.
 

aufalien

Patron
Joined
Jul 25, 2013
Messages
374
Eh, no worries. It's been covered on the forums; I know Josh Paetzel has a good primer on it, so do a search. If this is for professional use, it would be a good idea to contract CJ on the side for a once-over. I'm always considering it myself, just a thought. You know, have someone kick the tires, check the oil, etc.
 

solarisguy

Guru
Joined
Apr 4, 2014
Messages
1,125
@cougarmaster, on the left-hand side of the GUI click on Shell, and in the window that appears type arcstat.py and press Enter. Post the results using CODE tags, if possible.

P.S.
S.M.A.R.T. is important
 

Mr_N

Patron
Joined
Aug 31, 2013
Messages
289
You should probably reflash your HBA to at least P18, given the driver in current FreeNAS is v16 and I was told to keep ahead of it with the HBA firmware version...

Also, isn't that a lot of drives, bandwidth-wise, for the single HBA you're using?
 

Ericloewe

Server Wrangler
Moderator
Joined
Feb 15, 2014
Messages
20,194
Who told you that? You're supposed to keep the driver and firmware at the same version.
 

aufalien

Patron
Joined
Jul 25, 2013
Messages
374
Also, isn't that a lot of drives, bandwidth usage wise, for the single HBA you're using?

That's 4Gb/s per lane theoretical; for estimation's sake, take 75% of that for a decent guesstimate. That card has 8 lanes of PCIe 2.0.

So he could throw a lot more drives at that card without issue. I think he'd be stoked to get 6Gb/s sustained for the overall array.
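
The estimate above can be sketched as follows. It assumes PCIe 2.0 signalling at 5 GT/s per lane with 8b/10b encoding (which is where 4Gb/s per lane comes from) and covers only the host side of the card, not the SAS links to the drives or the backplane. The 75% factor is the rule of thumb from the post, not a measured value.

```python
RAW_GT_PER_LANE = 5.0     # PCIe 2.0 raw signalling rate per lane, in GT/s
ENCODING = 8 / 10         # 8b/10b encoding: 8 data bits per 10 bits on the wire
LANES = 8                 # the card uses an x8 slot
DERATING = 0.75           # rule-of-thumb factor from the post (assumption)

per_lane_gbps = RAW_GT_PER_LANE * ENCODING      # theoretical Gb/s per lane
theoretical_gbps = per_lane_gbps * LANES        # theoretical Gb/s for the slot
estimate_gbps = theoretical_gbps * DERATING     # derated Gb/s estimate
estimate_mb_per_s = estimate_gbps * 1000 / 8    # same estimate in MB/s

observed_mb_per_s = 55                          # write speed reported in the thread

print(f"per lane: {per_lane_gbps:.0f} Gb/s, slot: {theoretical_gbps:.0f} Gb/s")
print(f"estimate: {estimate_gbps:.0f} Gb/s, about {estimate_mb_per_s:.0f} MB/s")
print(f"observed writes use {observed_mb_per_s / estimate_mb_per_s:.1%} of that")
```

On these assumptions the reported 55MB/s is a tiny fraction of what the slot can carry, which fits the view that the HBA's bus bandwidth is not the bottleneck.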
 

cougarmaster

Dabbler
Joined
Jul 7, 2014
Messages
18
Hi Solarisguy,

I don't know how to post code tags so I typed it out here.

time read miss miss% dmis dm% pmis pm% mmis mm% arcsz c
0 0 0 0 0 0 0 0 0 51G 51G
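
For anyone reading along, a small sketch of how that row can be interpreted. It assumes the last two fields are arcsz (current ARC size) and c (ARC target size), that the size suffixes are binary units, and that the box has the 64GB of RAM listed in the first post. parse_size is a made-up helper, not part of arcstat.py.

```python
UNITS = {"K": 2**10, "M": 2**20, "G": 2**30, "T": 2**40}


def parse_size(text):
    """Convert a value such as '51G' to bytes, assuming binary suffixes."""
    text = text.strip()
    if text and text[-1].upper() in UNITS:
        return float(text[:-1]) * UNITS[text[-1].upper()]
    return float(text)


row = "0 0 0 0 0 0 0 0 0 51G 51G"   # the row as typed in the post
fields = row.split()

arcsz = parse_size(fields[-2])      # assumed: current ARC size
target = parse_size(fields[-1])     # assumed: ARC target size (c)
ram = 64 * 2**30                    # 64GB of RAM from the spec list

print(f"ARC size:   {arcsz / 2**30:.0f} GiB")
print(f"ARC target: {target / 2**30:.0f} GiB")
print(f"ARC is using {arcsz / ram:.0%} of RAM")
```

A single sample with all-zero read and miss counters only says the pool was idle at that moment, so it shows how full the ARC is but nothing about hit rates.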

Hope this helps
Tks,
Eric
 