REALLY SLOW READING/WRITING SPEED - TRUENAS SCALE 23.10.1.3

simple_Robi

Cadet
Joined
Feb 15, 2024
Messages
6
Hi everyone, and thanks to anyone who has the patience to help me.

I built my brand new NAS a couple of months ago (see below for my current hardware and software setup).
Everything went smoothly during the build, and so did the installation.
Until a couple of weeks ago, everything worked perfectly. I think the event that broke my system was a blackout that cut the power for about 2 hours while I was sleeping, and my UPS died.

Here's the first main problem I've been having lately: after my server has been on for some days, I get very bad read and write speeds over SMB. For the first 3/4 seconds I write/read at 280ish MB/s (I have a 2.5 Gb/s connection between my server and my PC), but then it falls to 2/3 MB/s, or sometimes even 100/150 KB/s, and it never comes back up.
The file size doesn't matter (I tried 5/10/100/1000 GB), and neither does the device (I tried with my PC, both Linux and Windows, and got the same results with my Mac, my iPhone and my iPad).
I can't seem to notice a pattern, except that it happens after the server has been on for a couple of days.

Sometimes it fixes itself after just one reboot, sometimes I have to reboot it several times. One time I had to shut it off completely to make it work.

But here comes the second problem: the Apps service never starts properly. I have to try different workarounds, like unsetting and setting the pool, rebooting, or disabling the SMB service multiple times, before I can get it to work again.
Sometimes I get a Kubernetes error, and every time it is different from the previous one.

I did search the forums for a solution, but nothing I found helped.

The only thing out of place is my pool status check: I noticed that my boot pool is degraded.
Here is the result of 'zpool status':
root@truenas[~]# zpool status
  pool: Sanji-SAN
 state: ONLINE
  scan: scrub repaired 0B in 00:29:46 with 0 errors on Sun Jan 28 00:29:48 2024
config:

        NAME                                      STATE     READ WRITE CKSUM
        Sanji-SAN                                 ONLINE       0     0     0
          raidz1-0                                ONLINE       0     0     0
            4529bb42-9a64-47bf-84ee-ca8f26b30520  ONLINE       0     0     0
            15ee06d0-27b7-4a9b-87aa-ff406409becb  ONLINE       0     0     0
            1a7f08f1-a551-429a-b1df-ee50ab7910be  ONLINE       0     0     0
            ab8081da-6abc-4e6c-a3bd-e07a73398011  ONLINE       0     0     0
            820e7a70-3168-4b49-a83e-8c390cf8e1e8  ONLINE       0     0     0
        cache
          bb28a548-06a6-42c7-90b7-a87ec6b7b14c    ONLINE       0     0     0

errors: No known data errors

  pool: boot-pool
 state: DEGRADED
status: One or more devices has experienced an unrecoverable error. An
        attempt was made to correct the error. Applications are unaffected.
action: Determine if the device needs to be replaced, and clear the errors
        using 'zpool clear' or replace the device with 'zpool replace'.
   see: https://openzfs.github.io/openzfs-docs/msg/ZFS-8000-9P
  scan: scrub repaired 0B in 00:00:47 with 0 errors on Fri Feb 16 03:45:48 2024
config:

        NAME         STATE     READ WRITE CKSUM
        boot-pool    DEGRADED     0     0     0
          sdd3       DEGRADED    14     0    42  too many errors

errors: No known data errors
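
For anyone who wants to spot the failing device in output like this without reading it row by row, here is a minimal Python sketch. It is only an illustration: the `devices_with_errors` helper and the `SAMPLE` excerpt are made up for this post, and the regex assumes plain integer counters (zpool can abbreviate large counts, which this sketch does not handle).

```python
import re

# Excerpt of the boot-pool rows from the 'zpool status' output above.
SAMPLE = """\
NAME STATE READ WRITE CKSUM
boot-pool DEGRADED 0 0 0
sdd3 DEGRADED 14 0 42 too many errors
"""

# A device row: name, a known state keyword, then READ / WRITE / CKSUM counters.
ROW = re.compile(
    r"^\s*(\S+)\s+(ONLINE|DEGRADED|FAULTED|OFFLINE|UNAVAIL|REMOVED)"
    r"\s+(\d+)\s+(\d+)\s+(\d+)"
)

def devices_with_errors(status_text):
    """Return (name, read, write, cksum) for every row with a non-zero counter."""
    flagged = []
    for line in status_text.splitlines():
        match = ROW.match(line)
        if not match:
            continue  # header rows, blank lines, 'cache' labels, etc.
        name = match.group(1)
        read, write, cksum = (int(match.group(i)) for i in (3, 4, 5))
        if read or write or cksum:
            flagged.append((name, read, write, cksum))
    return flagged

print(devices_with_errors(SAMPLE))
```

With the excerpt above, only sdd3 is flagged, with its 14 read errors and 42 checksum errors.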

I think (and I hope) that my problems are caused by the degraded boot-pool.

Should I reinstall the OS on my boot SSD? Is it safe and easy?
My HDDs hold tons of movies and some photos. I can save my photos on my PC's SSD and format everything, but I really don't want to lose all the movies and anime that I downloaded.
If I install the OS again, can I import my old pool without losing my data? Is there a guide?

If you need me to check anything else to help you troubleshoot, just ask (like testing my HDD write speed or something else).

Thank you.
Robi. :)

P.S.
My PC has no issues downloading from the internet at 2.5 Gb/s. I don't know whether you need this info or not.
And sorry for my English, my first language is Italian so please be kind to me.


Hardware:
Motherboard: B550I AORUS PRO AX
CPU: AMD Ryzen 5 5600G
RAM: 2 x 16 GB Crucial Pro DDR4 3200 MHz (CP2K16G4DFRA32A)
Boot SSD: Kingston A400 SSD 240GB 2.5" SATA (SA400S37/240G)
Cache SSD: Samsung 980 250 GB NVMe M.2 (MZ-V8V250BW)
HDD for DATA: 5 x 2 TB Seagate BarraCuda (SMR - ST2000DM008) - I know they're not the best but I found that out too late.
Power Supply: Cooler Master V750 SFX Gold


Software:
OS Version: TrueNAS-SCALE-23.10.1.3

Configuration:
Data VDEVs: 1 x RAIDZ1 (with 5 x 2TB HDD)
Cache VDEVs: 1 x 250 GB SSD NVMe M.2
Capacity used: 2.09 TiB (29.1%)
 

chuck32

Guru
Joined
Jan 14, 2023
Messages
623
First of all, since you said you could save your photos on your PC, do that now. To be clear, your data is not at risk due to a degraded boot pool. But you should have backups of your most important stuff. Do it now.

Create a backup of your configuration (the second thing you should always do, since a boot pool can always die). You'll find it under System Settings > General > Manage Configuration.

Reinstall the OS and import your config, and your pool will be back up.
I doubt the degraded pool will lead to a corrupt config export, but if that does not work, you could reinstall without reusing your config. In that case, export the pool before reinstalling (don't check the option to destroy data) and import it after the reinstall.
I assume the SSD will have survived the crash.

Does your UPS not have USB? If it does, it should have gracefully shut down your server before dying, provided you set up the UPS service.

Lastly, I would lean towards your SMR drives being the culprit for your bad performance. If possible, replace them with CMR drives. SMR drives will eventually lead to tears. There's a lot of information on the forums.

Also, RAIDZ1 with SMR drives and no backups is a recipe for disaster. With 5 drives you should go for RAIDZ2. At 2 TB they're not huge, but you will still have no redundancy during resilvering, which stresses the drives.
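
To put a rough number on what RAIDZ2 would cost in space, here is a back-of-the-envelope sketch in Python. The `raidz_usable_tb` helper is hypothetical and deliberately simplified: it ignores ZFS metadata, padding and the TB/TiB difference, so treat the results as approximations only.

```python
def raidz_usable_tb(disks, disk_tb, parity):
    """Rough usable capacity of one RAIDZ vdev: data disks times disk size.

    Simplification: ignores ZFS metadata, padding and TB vs TiB.
    """
    if disks <= parity:
        raise ValueError("need more disks than the parity level")
    return (disks - parity) * disk_tb

raidz1 = raidz_usable_tb(5, 2, parity=1)  # current layout: survives one disk failure
raidz2 = raidz_usable_tb(5, 2, parity=2)  # suggested layout: survives two disk failures
print(f"RAIDZ1: ~{raidz1} TB usable, RAIDZ2: ~{raidz2} TB usable")
```

Under these assumptions, moving from RAIDZ1 to RAIDZ2 with the same five 2 TB disks trades about 2 TB of space for a second disk of redundancy.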

An L2ARC (your cache drive) is not needed with only 32 GB of RAM. It says here that you should have no less than 32 GB; in other versions it was/is 64 GB. It can also actually slow things down.
 

simple_Robi

Cadet
Joined
Feb 15, 2024
Messages
6
Thanks for your advice!

Well, I have my photos on my NAS, my phone, my iCloud and my GDrive, so I am not worried about losing them now.

So to sum it up, now I have to delete my cache VDEV and reinstall my OS on my boot SSD with the config file backup. I'll check a guide online on how to do it.

Then what? How can I change my RAIDZ1 configuration to RAIDZ2 without losing data?
Is there a way to swap my SMR HDDs for CMR HDDs without losing data and without needing a third storage device?
 

simple_Robi

Cadet
Joined
Feb 15, 2024
Messages
6
Another question: how can SMR drives perform so badly? Here I read about people dropping 50/60% of their performance, but I lost like 99% of my performance. :(
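
As a rough sanity check on that 99% figure, here is a small Python sketch using the numbers from my first post (2.5 Gb/s link, about 280 MB/s at the start, about 2 MB/s afterwards). The helper names are made up for this example, and the conversion assumes decimal units (1 Gb = 1000 Mb, 1 GB = 1000 MB).

```python
LINK_GBPS = 2.5    # link speed between server and PC
FAST_MBPS = 280.0  # initial transfer rate in MB/s
SLOW_MBPS = 2.0    # low end of the "2/3 MB/s" rate after the drop

def line_rate_mb_per_s(gbps):
    """Theoretical maximum throughput of the link in MB/s."""
    return gbps * 1000 / 8

def drop_percent(before, after):
    """Relative loss of throughput in percent."""
    return (before - after) / before * 100

def hours_to_copy(size_gb, rate_mb_per_s):
    """Hours needed to copy size_gb gigabytes at a constant rate."""
    return size_gb * 1000 / rate_mb_per_s / 3600

print(f"Line rate: {line_rate_mb_per_s(LINK_GBPS):.1f} MB/s")
print(f"Drop: {drop_percent(FAST_MBPS, SLOW_MBPS):.1f}%")
print(f"100 GB at {SLOW_MBPS} MB/s: {hours_to_copy(100, SLOW_MBPS):.1f} hours")
```

So the initial 280 MB/s is close to what the link can carry at all, and the drop to 2 MB/s really is a loss of about 99%.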
 

chuck32

Guru
Joined
Jan 14, 2023
Messages
623
So to sum it up, now I have to delete my cache VDEV
You don't have to, I'm just pretty confident it doesn't do you any favors.

reinstall my OS on my boot SSD with the config file backup. I'll check a guide online on how to do it.
It's really all in the documentation.

Then what? How can I change my RAIDZ1 configuration to RAIDZ2 without losing data?
You can't. You need to recreate the pool with the new raidz layout, and this will delete all data on the pool.

Is there a way to swap my SMR HDDs for CMR HDDs without losing data and without needing a third storage device?
If you stay at raidz1, you can replace the disks one by one (replace a disk, resilver, replace the next disk...).

If you want to change your layout, just create a raidz2 pool and copy your data there.
If you have enough SATA ports, keep everything connected and just replicate from your old raidz1 pool to your new raidz2 pool. (This assumes you will create the new pool with new CMR drives.) You did not mention it, but what expansion card do you use? Your board has only 4 SATA ports, yet you run more SATA disks. The card may be at fault as well.

Here I read about people dropping 50/60% of their performance, but I lost like 99% of my performance. :(
Just do not use SMR drives.
 

simple_Robi

Cadet
Joined
Feb 15, 2024
Messages
6
You don't have to, I'm just pretty confident it doesn't do you any favors.
Ok, just removed it.
If you want to change your layout, just create a raidz2 pool and copy your data there.
If you have enough SATA ports, keep everything connected and just replicate from your old raidz1 pool to your new raidz2 pool. (This assumes you will create the new pool with new CMR drives.) You did not mention it, but what expansion card do you use? Your board has only 4 SATA ports, yet you run more SATA disks. The card may be at fault as well.
I can't seem to find the model, I'll post you the link here.

I think I'll stay with raidz1 but I'll slowly change my drives. It's too bad because I just bought them and they are brand new.

Thank you
 

chuck32

Guru
Joined
Jan 14, 2023
Messages
623
I think I'll stay with raidz1 but I'll slowly change my drives.
See "Multiply your problems with SATA Port Multipliers and cheap SATA controllers". Adding the language barrier, I'm not sure how to place your card into the mix.

I myself use a cheap card, but only for a single SSD that is part of a mirror and does not contain data I cannot lose. So I know about the risk (I am planning to get a proper HBA sometime in the future too). But I wouldn't trust my data pool with it.

I'd definitely look into using a proper HBA if I were you. Used HBAs by LSI are often recommended around here.

It's too bad because I just bought them and they are brand new.
Maybe you are still within the return window for the drives? I am always hesitant to send back working stuff, because I'm afraid that it goes to the bin rather than being sold again as used, but ultimately this is up to you and what the return policy is ;)
 

simple_Robi

Cadet
Joined
Feb 15, 2024
Messages
6
Just finished reinstalling the OS. Now the pool is not degraded anymore.

But the problem persists: after a couple of seconds, my speed drops to 1 MB/s. Even Windows Explorer has trouble while I navigate my SMB folder.
I can't even seem to install new apps.
And I am still not seeing any errors at all in the logs.

I think I'll buy an LSI HBA and 5 IronWolf drives and give it a try, though I am not sure that doing this will help at all.

Thank you anyway for your advice, chuck32.
 

chuck32

Guru
Joined
Jan 14, 2023
Messages
623
I think I'll buy an LSI HBA and 5 IronWolf drives and give it a try, though I am not sure that doing this will help at all.
It's really hard to tell, but using recommended / compatible hardware helps a ton.

What you could do to verify: purchase just 1 or 2 new drives and create a test pool. If the problem goes away, you know your card / SMR drives were the culprit.

Here I read about people dropping 50/60% of their performance, but I lost like 99% of my performance. :(
I don't want to steer people into spending unnecessary cash. However, since you did your reading about CMR vs SMR, you will agree that CMR drives are the better option.

Good luck to you! I really hope this solves your problem. Look into the recommended hardware guide and, when in doubt, ask before ordering new drives / an HBA. As I mentioned, the HBA can be purchased used on eBay or the like. There are a lot of reputable sellers.
 

simple_Robi

Cadet
Joined
Feb 15, 2024
Messages
6
Thanks.
I'll be back as soon as I've tried.

Anyway, it seems that even my onboard NIC has some problems (being a Realtek).

I hope I don't have to find another motherboard.
 