ZFS crashing when writing for too long on SSD

Davvo

MVP
Joined
Jul 12, 2022
Messages
3,222

sretalla

Powered by Neutrality
Moderator
Joined
Jan 1, 2016
Messages
9,700
What about with k3s stopped?
systemctl stop k3s.service

(it may well just restart itself though... it's designed to be unkillable).
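To see whether it really stays down, something like the rough sketch below could poll the unit after stopping it. `watch_unit` is a made-up helper name and the 30 x 10 s window is an arbitrary choice; the only real call it relies on is `systemctl is-active`.

```shell
# Sketch: poll a systemd unit and report whether it comes back on its own.
# watch_unit is a hypothetical helper, not part of systemd or TrueNAS.
watch_unit() {
    unit="$1"; checks="$2"; delay="$3"
    i=0
    while [ "$i" -lt "$checks" ]; do
        # is-active exits non-zero for anything but "active", hence || true
        state=$(systemctl is-active "$unit" || true)
        echo "$(date +%T) $unit: $state"
        case "$state" in
            active|activating) return 1 ;;   # the unit came back
        esac
        i=$((i + 1))
        sleep "$delay"
    done
    return 0   # stayed down for the whole window
}

# Usage (as root on the NAS):
#   systemctl stop k3s.service
#   watch_unit k3s.service 30 10   # 30 checks, 10 s apart, about 5 minutes
```

A return status of 0 means the unit was never seen active during the window; 1 means something restarted it.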
 

nooblard

Dabbler
Joined
Jul 23, 2023
Messages
13
Davvo said:
"Where is your system dataset? On the boot pool?

I'd check the pod logs."
The system dataset containing the logs is on the boot pool.

sretalla said:
"What about with k3s stopped?
systemctl stop k3s.service

(it may well just restart itself though... it's designed to be unkillable)."
[Attachment: 1690291033144.png]

It hasn't restarted for 5 minutes. I killed k3s :cool:
 

sretalla

Powered by Neutrality
Moderator
Joined
Jan 1, 2016
Messages
9,700
Or were you talking about the zfs pod in k3s crashing?

Almost 100% sure that's all about memory consumption (i.e. you don't have enough of it to run your containers and have the NAS do other work).
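A rough way to test the memory theory is to compare the ZFS ARC size with what the kernel reports as available while the containers are running. The sketch below assumes the usual Linux/OpenZFS locations (`/proc/spl/kstat/zfs/arcstats` and `/proc/meminfo`); the function names are made up for illustration, and neither number is conclusive on its own.

```shell
# Sketch: compare ZFS ARC size with memory the kernel reports as available.
# Default paths are the usual Linux/OpenZFS ones; override them if yours differ.
ARCSTATS="${ARCSTATS:-/proc/spl/kstat/zfs/arcstats}"
MEMINFO="${MEMINFO:-/proc/meminfo}"

# Print the current ARC size in MiB (arcstats rows are: name type data).
arc_size_mib() {
    awk '$1 == "size" { printf "%d\n", $3 / 1048576 }' "$1"
}

# Print MemAvailable in MiB (/proc/meminfo reports kB).
mem_available_mib() {
    awk '$1 == "MemAvailable:" { printf "%d\n", $2 / 1024 }' "$1"
}

# Only print when both files are readable (i.e. on a Linux box with ZFS loaded).
if [ -r "$ARCSTATS" ] && [ -r "$MEMINFO" ]; then
    echo "ARC size:      $(arc_size_mib "$ARCSTATS") MiB"
    echo "Mem available: $(mem_available_mib "$MEMINFO") MiB"
fi
```

Running it a few times during a large copy should show whether available memory drops close to zero before the pod dies.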
 

nooblard

Dabbler
Joined
Jul 23, 2023
Messages
13
sretalla said: "Or were you talking about the zfs pod in k3s crashing?"
I was talking about the zfs pod in k3s crashing.


sretalla said: "OK; and does ZFS still crash with the copy now?"
No more heavy I/O wait, but I still have an issue: the TrueNAS reporting continued to crash for a few seconds. Maybe it's just a coincidence.
[Attachment: 1690319879538.png]



sretalla said: "Almost 100% sure that's all about memory consumption (i.e. you don't have enough of it to run your containers and have the NAS do other work)."
You might be right then. But I'm surprised it only showed up recently rather than from the beginning (almost 6 months ago).



EDIT: I got 2 blanks.
[Attachment: 1690320420282.png]
 

sretalla

Powered by Neutrality
Moderator
Joined
Jan 1, 2016
Messages
9,700
There may be an obvious explanation for that which I can't see from my external viewpoint, but it's likely to be something that changed... upgraded version of some component or the OS kernel which handles memory differently, for example.
 

nooblard

Dabbler
Joined
Jul 23, 2023
Messages
13
sretalla said: "There may be an obvious explanation for that which I can't see from my external viewpoint, but it's likely to be something that changed... upgraded version of some component or the OS kernel which handles memory differently, for example."
Well, shoot.

Still, thanks for your time.

If I find a solution without changing components (which almost certainly won't happen), I'll post it.


Davvo said:
"Where is your system dataset? On the boot pool?

I'd check the pod logs."
I'll have a look; if I find something weird, I'll post a reply.
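For reference, a minimal sketch of what checking the pod logs could look like. It assumes `kubectl` is reachable as `k3s kubectl` (as on TrueNAS SCALE releases that ship k3s) and that `get pods -A` prints the usual NAMESPACE/NAME/READY/STATUS/RESTARTS/AGE columns; `flag_pods` is a made-up helper, and `<namespace>` and `<pod>` are placeholders to fill in from its output.

```shell
# Sketch: list pods that are unhealthy or have restarted, reading the
# table printed by `kubectl get pods -A` on stdin.
# flag_pods is a hypothetical helper, not a kubectl feature.
flag_pods() {
    awk 'NR > 1 && (($4 != "Running" && $4 != "Completed") || $5 + 0 > 0) {
        print $1 "/" $2, "status=" $4, "restarts=" $5
    }'
}

# Usage on the NAS:
#   k3s kubectl get pods -A | flag_pods
#   k3s kubectl logs -n <namespace> <pod> --previous   # logs from the crashed container
```

The `--previous` flag asks for the logs of the last terminated container, which is usually the interesting one after a crash.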
 