NFS crashes after running for a while and other stability issues

Joined
Sep 12, 2021
Messages
5
Hi all

I've been running SCALE since the alpha, and it's been a bumpy ride trying to keep my server stable. I can't figure it out myself, so I'm hoping to get some help here.
A few months back I decided to try my hand at building a storage server to get rid of the QNAPs I had, and went with TrueNAS because I wanted to keep ZFS.

At the time I wanted to use TrueNAS Core because it was more stable, but I had to go with SCALE because my hardware was too new (the NIC wasn't recognized in Core).

Now fast forward a few months and I've slowly fixed most issues (some were fixed by the patches as well) and I can almost sleep well at night xD.

The issue I'm having now is with the Ganesha NFS service: it keeps crashing and stopping, which breaks the virtual machines that rely on the shares. A reboot usually fixes the problem, but it always comes back after hours or days.

After a reboot this morning, the web UI now won't load any data. (Fix one issue and a new one arises xD)

I attached some of the Ganesha logs.

I should note that I'm currently on 21.08 and did a full reinstall last Friday (I did import my old config).

Hopefully someone can point me in the right direction.

Thanks in advance!
Jens Sels
 

Attachments

  • Ganesha.zip
    210.3 KB · Views: 134

morganL

Captain Morgan
Administrator
Moderator
iXsystems
Joined
Mar 10, 2018
Messages
2,694
Can you eliminate networking as the issue? Is the network functional when things crash?

Is the CPU running with ECC memory?
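
One rough way to tell a network outage apart from a crashed service is to probe the relevant TCP ports from a client while the problem is occurring. The sketch below assumes the standard ports (2049 for NFS, 445 for SMB, 443 for the web UI) and uses a placeholder host name, `truenas.local`. A successful connect only shows that something is listening, not that the service is healthy.

```python
import socket

# Standard TCP ports for the services mentioned in this thread.
# Adjust if the server uses non-default ports.
SERVICE_PORTS = {"nfs": 2049, "smb": 445, "https": 443}


def probe(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


def probe_services(host, ports=None, timeout=3.0):
    """Probe every named port on host and return {name: reachable}."""
    ports = SERVICE_PORTS if ports is None else ports
    return {name: probe(host, port, timeout) for name, port in ports.items()}


if __name__ == "__main__":
    # "truenas.local" is a placeholder; replace it with the server's address.
    for name, reachable in probe_services("truenas.local").items():
        print(f"{name}: {'reachable' if reachable else 'NOT reachable'}")
```

If SMB and the web UI answer while NFS does not, the network itself is probably fine and the problem is more likely in the NFS service.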
 
Joined
Sep 12, 2021
Messages
5
Networking keeps working. I can still access my shares over SMB, my VM, and the web GUI (although it freezes a lot).

I have ECC memory and the web gui shows this too.
 

morganL

Captain Morgan
Administrator
Moderator
iXsystems
Joined
Mar 10, 2018
Messages
2,694
I'd suggest reporting a bug... capture the debugs. It seems you've eliminated any hardware issues.
 
Joined
Sep 12, 2021
Messages
5
I haven't had much time this weekend, so sorry for the late reply.

It seems like something is causing these freezes. I've checked the logs, and the middleware service keeps timing out, which causes the freezes. The NFS logs also point to timeouts.

I think they are probably related, so I'm going to open an issue on Jira.
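
For anyone who wants to do the same kind of check, here is a minimal sketch that pulls timeout-related lines out of a log file. It makes no assumption about the Ganesha or middleware log format; it only matches the words "timeout", "timed out" and similar, and the log path is whatever you pass on the command line.

```python
import re
import sys

# Matches "timeout", "time out", "timed out", "times out" (case-insensitive).
TIMEOUT_PATTERN = re.compile(r"time[ds]?[ -]?out", re.IGNORECASE)


def find_timeouts(lines, pattern=TIMEOUT_PATTERN):
    """Return (line_number, text) for every line that mentions a timeout."""
    return [
        (number, line.rstrip("\n"))
        for number, line in enumerate(lines, start=1)
        if pattern.search(line)
    ]


if __name__ == "__main__":
    # Usage: python find_timeouts.py <path-to-log-file>
    with open(sys.argv[1], errors="replace") as log_file:
        for number, text in find_timeouts(log_file):
            print(f"{number}: {text}")
```

Running it against both the NFS log and the middleware log and comparing when the matches occur is one way to see whether the two sets of timeouts line up.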

Thanks for the help!
 

LarsR

Guru
Joined
Oct 23, 2020
Messages
719
Do those "freezes" appear after around 2-3 days of uptime?
From your signature, you're using a Ryzen processor, and there are some BIOS settings you have to change; otherwise the advanced power management features that are enabled by default will cause problems.

If the freezes appear after that timeframe, try disabling the following BIOS settings:

AMD Cool&Quiet
ERP-Ready
Global C-States

At least that helped me when I built my Ryzen-based system last year.
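
After changing the BIOS settings, one way to see what the OS side looks like is to list the CPU idle states the Linux kernel exposes through sysfs. This is only a sketch: it reads the standard `cpuidle` directory for `cpu0`, and whether disabling Global C-States actually changes this list on a given board is an assumption to verify, not a guarantee.

```python
from pathlib import Path

# Standard Linux sysfs location for the idle states of the first CPU.
CPUIDLE_DIR = Path("/sys/devices/system/cpu/cpu0/cpuidle")


def list_idle_states(cpuidle_dir=CPUIDLE_DIR):
    """Return the idle state names under cpuidle_dir, ordered by state index."""
    cpuidle_dir = Path(cpuidle_dir)
    if not cpuidle_dir.is_dir():
        # No cpuidle driver loaded, or not running on Linux.
        return []
    state_dirs = [
        entry for entry in cpuidle_dir.glob("state*")
        if entry.name[len("state"):].isdigit()
    ]
    state_dirs.sort(key=lambda entry: int(entry.name[len("state"):]))
    names = []
    for state_dir in state_dirs:
        name_file = state_dir / "name"
        if name_file.is_file():
            names.append(name_file.read_text().strip())
    return names


if __name__ == "__main__":
    states = list_idle_states()
    print("Idle states:", ", ".join(states) if states else "none exposed")
```

Comparing the output before and after the BIOS change gives a quick sanity check that the setting took effect.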
 
Joined
Sep 12, 2021
Messages
5
Yeah, they usually happen randomly after the server has been running for a while. Since I do have a Ryzen, I'm going to try that. Thanks for the suggestion!
 