TrueNAS Scale keeps crashing when moving files to SMB share

bdpyle

Cadet
Joined
Dec 23, 2022
Messages
4
I recently built a server for home use and installed TrueNAS on it. I got a bunch of things working, like Nextcloud, Pi-hole, and Jellyfin. Because I wanted to get all my movies into Jellyfin so I could watch them remotely, I created an SMB share to hold them. Unfortunately, I am only occasionally able to get a movie fully moved over to the share before TrueNAS crashes. Usually I can get them to move over after a couple of tries, but I don't want my NAS to restart every time I move a file to the SMB share.

I am using the SMB share as an example, but the crashing also happens when I am doing other things, such as syncing all my phone's photos to Nextcloud.

Here are the specs of my server:
CPU: Ryzen 5 5600g
MOBO: Gigabyte b550 Gaming Gen3
RAM: 32GB DDR4 Non-ECC 3200 MHz
Storage: 1TB NVMe boot drive and 2 4TB Seagate drives mirrored

Here are some screenshots of my cpu, ram, and network usage during the reboots:
Screenshot 2023-01-07 121729.png
Screenshot 2023-01-07 121752.png
Screenshot 2023-01-07 121811.png

The way I have my SMB share set up is that I have a folder which is shared through SMB. Inside that folder is the media folder for Jellyfin, which is not itself shared through SMB. I can still access it through the share, though, which I think is a bug, but that's the only way I was able to get it to work.
 

morganL

Captain Morgan
Administrator
Moderator
iXsystems
Joined
Mar 10, 2018
Messages
2,694
Crashing is unusual. Can you identify the specific version of SCALE you are running?

However, you are using non-ECC RAM. It's very difficult to diagnose whether this is part of the cause.

Let's see if anyone reports or confirms a similar issue.
 

HoneyBadger

actually does care
Administrator
Moderator
iXsystems
Joined
Feb 6, 2014
Messages
5,112
CPU: Ryzen 5 5600g
MOBO: Gigabyte b550 Gaming Gen3
Welcome!

Have you checked your BIOS for any power-saving settings such as AMD "Cool'n'Quiet" and the "C6" power state? Both of these have been shown to cause issues with TrueNAS CORE, so it might be a good place to start for SCALE as well.

2 4TB Seagate drives mirrored

Can you also post the model numbers of these drives? SMR drives usually manifest just as "slow performance" but in extreme cases, they've been slow enough to respond that the ZFS "deadman" timer fires, declares the system unresponsive, and triggers a kernel panic.
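
If it helps with collecting the model numbers, here is a minimal sketch that reads the model strings Linux exposes under `/sys/block`. It assumes a Linux system such as SCALE with Python 3 available; the function name `list_drive_models` is made up for this example, and devices without a `device/model` file are simply skipped.

```python
from pathlib import Path


def list_drive_models(sys_block="/sys/block"):
    """Return {device_name: model_string} for block devices that expose a model file.

    Assumption: the kernel publishes the model at <sys_block>/<dev>/device/model,
    which is the usual location for SATA and NVMe drives on Linux.
    """
    root = Path(sys_block)
    if not root.is_dir():
        # Not a Linux sysfs layout (or sysfs is not mounted): nothing to report.
        return {}

    models = {}
    for dev in sorted(root.iterdir()):
        model_file = dev / "device" / "model"
        if model_file.is_file():
            models[dev.name] = model_file.read_text().strip()
    return models


if __name__ == "__main__":
    for name, model in list_drive_models().items():
        print(f"{name}: {model}")
```

The printed model strings can then be checked against the manufacturer's own CMR/SMR listings.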
 

bdpyle

Cadet
Joined
Dec 23, 2022
Messages
4
I am using version 22.12.
 

jwong858

Dabbler
Joined
Nov 25, 2022
Messages
28
By the way, I'm using an ASRock X570M PRO4 motherboard. I think this could be causing the problem.
 

jwong858

Dabbler
Joined
Nov 25, 2022
Messages
28
I'm getting hardware errors:

Aug 17 00:06:09 truenas kernel: mce: [Hardware Error]: Machine check events logged
Aug 17 00:06:09 truenas kernel: mce: [Hardware Error]: CPU 4: Machine Check: 0 Bank 0: bc00080001010135
Aug 17 00:06:09 truenas kernel: mce: [Hardware Error]: TSC 0
Aug 17 00:06:09 truenas kernel: mce: [Hardware Error]: PROCESSOR 2:a20f12 TIME 1692255957 SOCKET 0 APIC 8 microcode a20120a
Aug 17 00:23:04 truenas kernel: mce: [Hardware Error]: Machine check events logged
Aug 17 00:23:04 truenas kernel: mce: [Hardware Error]: CPU 5: Machine Check: 0 Bank 0: bc00080001010135
Aug 17 00:23:04 truenas kernel: mce: [Hardware Error]: TSC 0 ADDR 1ba8ea440 MISC d012000000000000 IPID 1000b000000000
Aug 17 00:23:04 truenas kernel: mce: [Hardware Error]: PROCESSOR 2:a20f12 TIME 1692256973 SOCKET 0 APIC a microcode a20120a
Aug 17 20:49:29 truenas kernel: mce: [Hardware Error]: Machine check events logged
Aug 17 20:49:29 truenas kernel: mce: [Hardware Error]: CPU 5: Machine Check: 0 Bank 0: bc00080001010135
Aug 17 20:49:29 truenas kernel: mce: [Hardware Error]: TSC 0 ADDR 2109f6700 MISC d012000000000000 IPID 1000b000000000
Aug 17 20:49:29 truenas kernel: mce: [Hardware Error]: PROCESSOR 2:a20f12 TIME 1692330557 SOCKET 0 APIC a microcode a20120a
Aug 18 03:51:15 truenas kernel: mce: [Hardware Error]: Machine check events logged
Aug 18 03:51:15 truenas kernel: mce: [Hardware Error]: CPU 0: Machine Check: 0 Bank 27: faa000000000080b
Aug 18 03:51:15 truenas kernel: mce: [Hardware Error]: TSC 0 MISC d012000200000000 SYND 5d000000 IPID 1002e00000500
Aug 18 03:51:15 truenas kernel: mce: [Hardware Error]: PROCESSOR 2:a20f12 TIME 1692355863 SOCKET 0 APIC 0 microcode a20120a
Aug 18 04:35:05 truenas kernel: mce: [Hardware Error]: Machine check events logged
Aug 18 04:35:05 truenas kernel: mce: [Hardware Error]: CPU 5: Machine Check: 0 Bank 1: bc800800060c0859
Aug 18 04:35:05 truenas kernel: mce: [Hardware Error]: TSC 0 ADDR 1dce3f4640 MISC d012000000000000 IPID 100b000000000
Aug 18 04:35:05 truenas kernel: mce: [Hardware Error]: PROCESSOR 2:a20f12 TIME 1692358494 SOCKET 0 APIC a microcode a20120a
Aug 18 14:29:25 truenas kernel: mce: [Hardware Error]: Machine check events logged
Aug 18 14:29:25 truenas kernel: mce: [Hardware Error]: CPU 5: Machine Check: 0 Bank 1: fc800800060c0859
Aug 18 14:29:25 truenas kernel: mce: [Hardware Error]: TSC 0 ADDR 1e22215340 MISC d012000000000000 IPID 100b000000000
Aug 18 14:29:25 truenas kernel: mce: [Hardware Error]: PROCESSOR 2:a20f12 TIME 1692394154 SOCKET 0 APIC a microcode a20120a
Aug 18 14:34:50 truenas kernel: mce: [Hardware Error]: Machine check events logged
Aug 18 14:34:50 truenas kernel: mce: [Hardware Error]: CPU 0: Machine Check: 0 Bank 27: faa000000000080b
Aug 18 14:34:50 truenas kernel: mce: [Hardware Error]: TSC 0
Aug 18 14:34:50 truenas kernel: mce: [Hardware Error]: PROCESSOR 2:a20f12 TIME 1692394479 SOCKET 0 APIC 0 microcode a20120a
Aug 18 14:59:50 truenas kernel: mce: [Hardware Error]: Machine check events logged
Aug 18 14:59:50 truenas kernel: mce: [Hardware Error]: CPU 4: Machine Check: 0 Bank 0: bc00080001010135
Aug 18 14:59:50 truenas kernel: mce: [Hardware Error]: TSC 0 ADDR 433afba40 MISC d012000000000000 IPID 1000b000000000
Aug 18 14:59:50 truenas kernel: mce: [Hardware Error]: PROCESSOR 2:a20f12 TIME 1692395979 SOCKET 0 APIC 8 microcode a20120a
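
For readers who want to group entries like the ones above, here is a rough sketch that tallies them by CPU, bank, and status word, and decodes only the top flag bits (63 down to 57) of each status word. It assumes the standard x86 machine-check status layout for those bits; the lower bits are model-specific and are deliberately left uninterpreted. The names `decode_status` and `summarize` are made up for this example.

```python
import re
from collections import Counter

# Flag bits 63..57 of a machine-check status word (assumed standard x86 layout).
# Everything below bit 57 is model-specific and is not decoded here.
STATUS_FLAGS = {
    63: "VAL",    # the register holds a valid error
    62: "OVER",   # an earlier error was overwritten
    61: "UC",     # the error was not corrected by hardware
    60: "EN",     # reporting was enabled for this error
    59: "MISCV",  # the MISC register is valid
    58: "ADDRV",  # the ADDR register is valid
    57: "PCC",    # processor context may be corrupt
}

# Matches the "CPU n: Machine Check: 0 Bank m: <hex status>" part of a log line.
MCE_LINE = re.compile(r"CPU (\d+): Machine Check: \d+ Bank (\d+): ([0-9a-f]+)")


def decode_status(status_hex):
    """Return the names of the flag bits set in a hex status word."""
    value = int(status_hex, 16)
    return [name for bit, name in STATUS_FLAGS.items() if (value >> bit) & 1]


def summarize(log_text):
    """Count occurrences of each (cpu, bank, status) triple in the log text."""
    counts = Counter()
    for cpu, bank, status in MCE_LINE.findall(log_text):
        counts[(int(cpu), int(bank), status)] += 1
    return counts


if __name__ == "__main__":
    sample = (
        "Aug 17 00:06:09 truenas kernel: mce: [Hardware Error]: "
        "CPU 4: Machine Check: 0 Bank 0: bc00080001010135\n"
    )
    for (cpu, bank, status), n in summarize(sample).items():
        print(cpu, bank, status, n, decode_status(status))
```

Under that reading, `bc00080001010135` decodes to VAL, UC, EN, MISCV, ADDRV, and every status word in the log above has the UC bit set. This only labels the flags; it does not identify which component is at fault.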

I ran a memory check for 3 days and it found no memory errors.

I also get "broken BIOS" messages:

Aug 17 00:06:10 truenas kernel: ccp 0000:0d:00.1: enabling device (0000 -> 0002)
Aug 17 00:06:10 truenas kernel: ccp 0000:0d:00.1: ccp: unable to access the device: you might be running a broken BIOS.
Aug 17 00:23:05 truenas kernel: ccp 0000:0d:00.1: enabling device (0000 -> 0002)
Aug 17 00:23:05 truenas kernel: ccp 0000:0d:00.1: ccp: unable to access the device: you might be running a broken BIOS.
Aug 17 00:35:58 truenas kernel: ccp 0000:0d:00.1: enabling device (0000 -> 0002)
Aug 17 00:35:58 truenas kernel: ccp 0000:0d:00.1: ccp: unable to access the device: you might be running a broken BIOS.
Aug 17 20:49:30 truenas kernel: ccp 0000:0d:00.1: enabling device (0000 -> 0002)
Aug 17 20:49:30 truenas kernel: ccp 0000:0d:00.1: ccp: unable to access the device: you might be running a broken BIOS.
Aug 18 03:51:16 truenas kernel: ccp 0000:0d:00.1: enabling device (0000 -> 0002)
Aug 18 03:51:16 truenas kernel: ccp 0000:0d:00.1: ccp: unable to access the device: you might be running a broken BIOS.
Aug 18 04:35:06 truenas kernel: ccp 0000:0d:00.1: enabling device (0000 -> 0002)
Aug 18 04:35:07 truenas kernel: ccp 0000:0d:00.1: ccp: unable to access the device: you might be running a broken BIOS.
Aug 18 14:25:57 truenas kernel: ccp 0000:0d:00.1: enabling device (0000 -> 0002)
Aug 18 14:25:57 truenas kernel: ccp 0000:0d:00.1: ccp: unable to access the device: you might be running a broken BIOS.
Aug 18 14:29:26 truenas kernel: ccp 0000:0d:00.1: enabling device (0000 -> 0002)
Aug 18 14:29:26 truenas kernel: ccp 0000:0d:00.1: ccp: unable to access the device: you might be running a broken BIOS.
Aug 18 14:34:51 truenas kernel: ccp 0000:0d:00.1: enabling device (0000 -> 0002)
Aug 18 14:34:51 truenas kernel: ccp 0000:0d:00.1: ccp: unable to access the device: you might be running a broken BIOS.
Aug 18 14:59:51 truenas kernel: ccp 0000:0d:00.1: enabling device (0000 -> 0002)
Aug 18 14:59:51 truenas kernel: ccp 0000:0d:00.1: ccp: unable to access the device: you might be running a broken BIOS.

I talked to the motherboard manufacturer, ASRock, and they said they don't support Linux.
 

NugentS

MVP
Joined
Apr 16, 2020
Messages
2,947
@jwong858 - please post your own thread, and don't hijack someone else's.
@bdpyle - Can we have a hardware kit list as per forum rules? In particular: motherboard make and model, network interface card, and HDD make and model.
 

jwong858

Dabbler
Joined
Nov 25, 2022
Messages
28
The mv command crashes TrueNAS, but the cp command seems to work without any issue. Any idea why?
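
One thing that may be worth checking when comparing the two: within a single filesystem mv is just a rename, while across filesystems (each ZFS dataset counts as its own filesystem) it copies the data and then deletes the source. A small sketch for telling which case applies, assuming Python 3 is available; `same_filesystem` is a made-up helper name and the paths in the usage line are placeholders:

```python
import os


def same_filesystem(src, dst_dir):
    """Return True if src and dst_dir report the same device ID (st_dev).

    Same device ID: a move can be done as a rename.
    Different device ID: a move has to copy the data and then remove the source.
    """
    return os.stat(src).st_dev == os.stat(dst_dir).st_dev
```

For example, `same_filesystem("/path/to/movie.mkv", "/path/to/share")` returning False would mean the move is really a copy followed by a delete. This does not explain the crash by itself; it only narrows down what mv is actually doing.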
 

jwong858

Dabbler
Joined
Nov 25, 2022
Messages
28

NugentS, I didn't mean to hijack anyone's thread. I thought it was the same issue, so I continued in the same thread.

 