Looking for suggestions

Joined
Jan 3, 2024
Messages
4
Good day. I was hired as a system administrator about 4 months ago, and we're using TrueNAS Core as our NAS. In those 4 months we have experienced different kinds of problems with the NAS, including a slow file server caused by a NAS problem that shows the message "Connecting to TrueNAS ... Make sure the TrueNAS system is powered on and connected to network." After a few minutes it responds again and the file server becomes accessible again. Can someone suggest what upgrade we can do, or what the possible reason for this problem is? Here are the specs of the NAS.

CPU: Intel(R) Xeon(R) Silver 4112 CPU @ 2.60GHz
Memory: 39.6 GB
Interface: ixl0


Note: We already tried aggregating two NICs, but the problem still persists.
 

NugentS

MVP
Joined
Apr 16, 2020
Messages
2,947
Also, are you using dedupe?
And how many people are using the NAS? Some details about your use case please
 
Joined
Jan 3, 2024
Messages
4
Hey @markthebuilder

Can you post further details about your system specs, including your motherboard, storage controller, drive count and model, and network card?
Sorry for the incomplete details. Here are the specs.

Motherboard: Lenovo ThinkSystem SR550 Server (Xeon SP Gen 1 / Gen 2)
RAID Controller: Lenovo RAID 930-16i
Drive Count/Model: 12 Bay
Network Card: No idea what model
 
Joined
Jan 3, 2024
Messages
4
Also, are you using dedupe?
And how many people are using the NAS? Some details about your use case please
Yes, we're using dedup on our NAS. I think around 50 computers access the NAS storage. Some of them access a large amount of files.
 

Patrick M. Hausen

Hall of Famer
Joined
Nov 25, 2013
Messages
7,776
This is a recipe for disaster. Sorry to have to put it so bluntly.

- 50 people - so a highly loaded, business-critical production system
- RAID controller - strongly advised against and known to lead to data loss if you have a bad day
- dedup - strongly advised against unless you 100% know what you are doing and also have a crapton of memory

I would inform the boss or board of directors ... that you have a large risk here and should take appropriate measures before lightning strikes. One could for example build a new NAS according to recommendations and move the data, then repurpose the old server.
 

HoneyBadger

actually does care
Administrator
Moderator
iXsystems
Joined
Feb 6, 2014
Messages
5,112
Well, you've got a bit of a pickle as mentioned by @Patrick M. Hausen - the presence of the RAID controller and use of deduplication are likely acting as a one-two punch here. Let's see how far we can unwind things. Short-term, you may be able to band-aid over some of this with additional RAM.

For a deduplication check, enable the SSH service (and allow root logins), connect remotely, and enter the following command to gather some stats on your deduplication tables and their RAM consumption.

zpool status -D YourPoolNameHere

Please paste the results here inside of CODE tags. We can interpret this to show the RAM impact and actual results of deduplication.
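As a rough illustration of how that output can be read: the summary line normally reports the number of DDT entries plus a per-entry size on disk and in core, and multiplying entries by the in-core size gives an approximate RAM footprint. The sketch below assumes a summary line shaped like the sample string (the exact wording can differ between ZFS versions), and the function name `ddt_footprint` and the sample numbers are made up for the example.

```python
import re

# Assumed shape of the dedup summary line in `zpool status -D` output.
# The wording may differ between ZFS versions; adjust the pattern if needed.
DDT_LINE = re.compile(
    r"DDT entries\s+(\d+),\s*size\s+([\d.]+)([BKMG]?)\s+on disk,"
    r"\s*([\d.]+)([BKMG]?)\s+in core"
)

# Multipliers for an optional unit suffix on the per-entry sizes.
UNIT = {"": 1, "B": 1, "K": 1024, "M": 1024**2, "G": 1024**3}


def ddt_footprint(status_text):
    """Estimate total DDT size on disk and in RAM from the summary line.

    Returns None when no matching summary line is found.
    """
    match = DDT_LINE.search(status_text)
    if match is None:
        return None
    entries = int(match.group(1))
    per_entry_disk = float(match.group(2)) * UNIT[match.group(3)]
    per_entry_core = float(match.group(4)) * UNIT[match.group(5)]
    return {
        "entries": entries,
        "disk_bytes": entries * per_entry_disk,
        "ram_bytes": entries * per_entry_core,
    }


# Made-up sample line, not taken from any real pool.
sample = " dedup: DDT entries 1000000, size 300B on disk, 150B in core\n"

result = ddt_footprint(sample)
if result is not None:
    print(f"DDT entries:  {result['entries']}")
    print(f"Approx. RAM:  {result['ram_bytes'] / 1024**2:.1f} MiB")
```

This is only an estimate of the table itself; it says nothing about how much of the remaining RAM is left for the ARC.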

Regarding the RAID controller, we need to see if you used the hardware-level RAID, or passed the drives through relatively untouched. The SSH command above will show this as well, but you can also open the Storage / Pools page, then from the actions dropdown (the gear) choose Status and share a screenshot of the layout here.

Other thoughts - your network card should be an Intel one, based on the ixl driver being in use. Your SR550 also uses non-expanding backplanes, so any replacement HBA (if it's possible to use one) would need to have enough ports to directly connect to the existing cables (looks like 1x SFF-8643 and 1x SFF-8654), or you would need to put an expander card in the middle of the chain.
 

NugentS

MVP
Joined
Apr 16, 2020
Messages
2,947
Yes, we're using dedup on our NAS. I think around 50 computers access the NAS storage. Some of them access a large amount of files.
Yeah - I suspected so, based on the symptoms - which is why I asked.

As @Patrick M. Hausen & @HoneyBadger have indicated - you have a problem. Whoever put that box together did not do it in a sensible manner, and fixing it is not going to be simple. RAM (a lot) will probably help - but it is not the cure.

Do you have a good backup? I think any solution is going to need one, as each of the following requires a backup:
1. Unwind the dedupe
2. Do dedupe properly
3. Eliminate hardware RAID (if it is indeed hardware RAID)

The Lenovo RAID Controller mentioned is based on the LSI 9460 MegaRAID adapter. Not a good choice for ZFS, I am afraid. The good news is that it's easy to replace. The bad news is that doing so MAY require a complete rebuild of the array (depending on how it's been done in the first place). If the array is in JBOD mode - that might be good news (well, fair news).
 