My TrueNAS backup server "crashed?" and is acting strange, I'm not sure what it's even doing

devedse

Dabbler
Joined
Nov 7, 2022
Messages
14
Introduction

Hello all,

I've been using TrueNAS for a bit now as a secondary backup server. I love the concept of ZFS so I wanted to play with it.
I'm running a server with 26x 1TB disks: 2 run as a ZFS mirror for the operating system, and the other 24 are all in one pool with 3 disks of redundancy.
All disks are connected to an HBA slotted into one of the server's PCIe slots. I did this because when I initially connected them to the internal RAID controller (which could be flashed into HBA mode), I had some strange issues, and eventually ended up in a scenario where the operating system seemed to have become corrupted.

For this new setup, the performance was great and everything worked fine.

After a while I chose to simply turn off the server when I'm not using it and turn it back on every 2-3 weeks to refresh my backup (last time I also ran a scrub and kept it on for a few days, because I heard that's good for ZFS health). I have a more up-to-date backup on a secondary Synology NAS.

One of the disks is currently broken, but I was planning to replace it soon. (Again, no really important data on this thing.)

The crash today

Today I turned the server on again and tried removing some snapshots because the disks were almost full (95%); that seemed to work fine. I also went to the Shell and deleted a directory that was taking up a lot of space.

When I came back a few hours later, the web frontend was unavailable and the server itself was showing some strange information. I managed to trigger a shutdown, which got stuck after unmounting 4 things (I don't remember the exact name), so after about 4 hours I did a cold reset.

The server is doing things

Now the server is doing all kinds of things, but I have no idea whether they're good or bad. Again, the data on the server isn't crucial, but I'm curious what went wrong, how to fix it, and what else I can do.

I attached screenshots 1 and 2 to show what the server displays now. I believe screenshot 3 is from a few minutes earlier (it's harder to read, since it's an iDRAC screenshot).

Does anyone have an idea what's going on here?
 

Attachments

  • 1.png (661.6 KB)
  • 2.png (537.2 KB)
  • 3.png (69.3 KB)

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
Welcome to the forums.

Sorry to hear you're having trouble. Please take a few moments to review the Forum Rules, conveniently linked at the top of every page in red, and pay particular attention to the section on how to formulate a useful problem report, especially including a detailed description of your hardware.

You've basically given no one anything to work with, so the responses will tend to be random guesses rather than anything useful.
 

HoneyBadger

actually does care
Administrator
Moderator
iXsystems
Joined
Feb 6, 2014
Messages
5,112
I've got an inkling here, but I'll need some info to confirm it - please dump your complete hardware configuration, paying special attention to RAM quantity and HBA make/model, as well as any firmware changes made to it (reflash, crossflash, etc)

Suspecting you're on an R720 or R730XD based on drive descriptions and iDRAC reference.

But if I can cut to the chase here - are you using deduplication, by any chance?

I won't bury you under the ZFS-specific details, but the short version is that your disks are likely very busy, and having one of them (da20, based on the third screenshot) broken or misbehaving isn't going to help. I'd suggest, if you have a way to positively ID it by serial number (through iDRAC, perhaps), that you pull it out or otherwise offline it - but only do this if you can 100% positively ID the disk with the matching serial number as shown in the third screenshot.
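
If you'd rather offline it from the shell than pull it physically, something like this should do it - treat it as a sketch: the da20 name is just what your screenshot suggests, and the pool name is taken from your screenshots too, so verify both first:

Code:
# check the serial number behind da20 before touching anything
smartctl -i /dev/da20 | grep -i serial
# if it matches the failing disk, take it out of the pool
zpool offline DevePool da20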
 

devedse

Dabbler
Joined
Nov 7, 2022
Messages
14
Hello, as you all requested, here are the hardware specifications of my server:

PowerEdge R720xd
Motherboard: Not sure, the default in an R720xd I assume
Processor: 2x Intel(R) Xeon(R) CPU E5-2630L v2 @ 2.40GHz (Model 62 Stepping 4)
RAM: 4x 16GB DDR3 @ 800MHz
Hard drives:
2x SEAGATE 1TB in a mirror for the boot pool
24x SEAGATE 1TB in RAIDZ3 as a single vdev
Hard disk controller: LSI SAS 9207-8i HBA
Network cards: just using one of the onboard ports
 

devedse

Dabbler
Joined
Nov 7, 2022
Messages
14
@HoneyBadger,

I am indeed using deduplication. I'll see if I can replace the disk sometime soon to make sure that's not a problem anymore. I'm not exactly sure what was happening, though. I can understand the disks being busy, but TrueNAS itself was completely unreachable (through the WebUI) yesterday. On top of that, the server itself was showing errors like:
`Low water mark reached. Dropping 100% of metrics.`

After about 4-5 hours of "booting up" (screenshots 1 and 2), TrueNAS seems to have come back online and appears to be working. Is there any way to figure out what it was doing?
 

devedse

Dabbler
Joined
Nov 7, 2022
Messages
14
A problem I'm running into now is that the system seems to be "hanging" again. I tried removing a whole directory tree by calling `rm -v -R /mnt/..../blah` which worked fine for a few minutes but seems to be hanging now (screenshot 1).

I tried to open a new shell and simply `cat` a file, but this also isn't working anymore (screenshot 2).
 

Attachments

  • truenasstuck.png (49.1 KB)
  • truenasstuck2.png (42.3 KB)

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
I am indeed using deduplication.
24x SEAGATE 1TB in RAIDZ3 as a single vdev
RAM: 4x 16GB DDR3 @ 800MHz

Mmm, so the memory suggestions for dedup are that you should have between 5GB and 10GB of RAM per 1TB of deduplicated disk space. You might not need that much; on a pool that is optimally suited to dedup, it might be only a few GB per TB. But you are showing classic signs of dedup stress.

So by my math, you currently have 64GB of RAM, but you have 24TB of storage, and should have allowed for up to 240GB of ARC for that.

Is there any way to figure out what it was doing?

It's thrashing around trying to bring in the stuff it needs to do dedup from the pool.
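
You can watch it happen, too. A quick sketch (pool name taken from your screenshots; the 5 is the refresh interval in seconds):

Code:
# per-vdev I/O statistics - look for a steady grind of small reads
zpool iostat -v DevePool 5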
 

HoneyBadger

actually does care
Administrator
Moderator
iXsystems
Joined
Feb 6, 2014
Messages
5,112
@HoneyBadger,

I am indeed using deduplication ...

After about 4-5 hours of "booting up" (screenshot 1 and 2) TrueNAS seems to have come online again and seems to be working. Is there any way to figure out what it was doing?

Basically, you've run out of memory. Instead of being able to do lookups in RAM at a speed of "gigabytes per second", you've been reduced to lookups on disk at a probable speed of "kilobytes per second".

Your disks are being overwhelmed by very small I/O as they update the deduplication metadata ("blocks X, Y, and Z used to be copies of block Q - block Z got deleted, so we need to update") - spinning disks are not the best at handling that even in an optimal scenario, and the single 24-disk-wide RAIDZ3 layout exacerbates it, because all twenty-four disks need to rotate to the right spot.

In screenshots 1+2 you can see that the txg (transaction group) number is incrementing very slowly for your main pool "DevePool" - it starts at txg 396575, which reports 17824 blocks freed in 64132ms; the next transaction group, txg 396576, shows a huge number of metaslab loads and unloads as it tries to pull your deduplication table into memory and make space to work on it. Finally you reach the end of 396576, where 17408 blocks are freed in 49483ms.

As a yardstick, a transaction group is normally allowed to exist for a maximum of 5s with the default tuning. So if your system needs to spend an extra minute doing "housekeeping" for every 5 seconds of writes put into it ... it's going to have a hard time keeping up and appearing responsive.
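
You can check that yardstick from the shell if you're curious - this is just for inspection (FreeBSD sysctl), not an invitation to tune it blindly:

Code:
# transaction group timeout, in seconds (default: 5)
sysctl vfs.zfs.txg.timeout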

If you can safely shut the system down and add more RAM (I would try to double it, at a minimum), that may make the system somewhat more responsive, as it will no longer need to evict metadata that it then has to re-read - but ultimately, the way out of this will likely involve removing deduplication, and probably a pool re-design into multiple smaller RAIDZ2 vdevs to regain a little performance and split up the large 24-wide Z3.
 

devedse

Dabbler
Joined
Nov 7, 2022
Messages
14
So is the dedup mainly getting overloaded during deletions? Because when doing my backups I didn't seem to have any problems.

And if the dedup is the problem, is it possible to easily disable this? Or do you need to create a whole new pool?
 

devedse

Dabbler
Joined
Nov 7, 2022
Messages
14
My memory usage seems to be quite okay now:
truenasmem.png


Or is this not an accurate graph?
 

MrGuvernment

Patron
Joined
Jun 15, 2017
Messages
268
Overall, you do need to redesign your entire deployment: the way you have it now is not optimal for performance at all, and as noted above, you are not meeting the minimum requirements for dedup to work effectively.

Yes, dedup is dragging your system to a crawl. You can either leave the server on and untouched for several days, or even weeks :(, and let it work through what it has to do (putting high strain on all your disks), or, as @HoneyBadger noted, buy some more RAM if you can afford it - that would at least speed things up, alleviate some of the load, and keep the table in memory.
 

HoneyBadger

actually does care
Administrator
Moderator
iXsystems
Joined
Feb 6, 2014
Messages
5,112
So is the dedup mainly getting overloaded during deletions? Because when doing my backups I didn't seem to have any problems.

It's likely the snapshot removal and the underlying consolidation that's really causing the issue, especially if there are multiple snaps in the chain. If the new data written during your backup jobs was successfully deduplicating (and didn't suffer metadata thrashing in RAM before hitting a matching hash entry), then it may have worked quite well, as it was effectively just writing "N+1" to each of the DDT records, and actual disk activity was minimal.

And if the dedup is the problem, is it possible to easily disable this? Or do you need to create a whole new pool?

Data that's already been deduplicated would need to be re-hydrated by being copied to a non-dedup dataset, or off the pool entirely. As mentioned, re-designing the pool to use a larger number of narrower vdevs (such as 3x 8-wide Z2, or even 2x 12-wide Z2) might also benefit general ingest speeds.
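
For the record, turning dedup off is just a property change, but it only affects new writes - existing blocks stay deduplicated until they're rewritten. A rough sketch, with hypothetical dataset names:

Code:
# stop deduplicating future writes to this dataset
zfs set dedup=off DevePool/backups
# re-hydrate by copying into a fresh, non-dedup dataset
zfs create DevePool/backups-new
cp -a /mnt/DevePool/backups/. /mnt/DevePool/backups-new/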

Question - if you have an interactive shell, can you issue the command zpool list and post the output (preferably inside of [CODE][/CODE] blocks for formatting), as well as zpool status -D DevePool? I'm looking to see just how much you're saving with deduplication. Using SSH will make it easier to copy and paste the results.
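
That is:

Code:
zpool list
zpool status -D DevePool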
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
So is the dedup mainly getting overloaded during deletions? Because when doing my backups I didn't seem to have any problems.

Deletions would seem to involve doing a bunch of metadata operations in short order. Backups would seem to imply streaming data in over the network, which might seem fast to you as a human being, but is actually incredibly slow. It doesn't necessarily take a huge pause in between actions to make a difference.

Back in the day, when I was writing high performance USENET code, one of my clients came to me with a steaming pile of poo written by some goober who knew ... was it perl? I don't recall. It was an indexer that would run through a newsgroup, download all the overviews, sort them, pull out any "complete" articles, and stick the results into a database. It was slow, ugly, CPU-hungry, slow, inefficient, and did I mention slow?

One of the interesting things programmers often fail to do is to optimize their code at the design phase. To me, I saw an obvious issue here, which is that while the overviews were downloading, that was a very slow process, and it was stupid to store that in a temporary file because that just generated even more I/O. Instead, "indx" built a data structure in memory that was a combination of linked list with a hash table entrance, and "indx" would read from the overviews coming in over the network (slow operation), would hash the record just read, use that to insert the necessary bits into the linked list, and then went on to read the next record. By doing most of the work WHILE it was waiting for more network input, it actually achieved wirespeed processing of the overviews, and behaved like an O(n) algorithm even though it really wasn't.

ZFS dedup is very sensitive to the way you implement it. You can't just "turn it on" and hope that it will work. Understanding what is going on underneath it all is very important, and you should definitely read up on the experience @Stilez had building a dedup platform fast enough to be realistically usable.

Oh yeah hi to all you Newshosting and Highwinds users from back in the day.
 

devedse

Dabbler
Joined
Nov 7, 2022
Messages
14
Alright, that makes sense :). For now I'm thinking of just letting it run for a bit, as I've now been able to remove the specific folder I wanted. Again, my use case is just a secondary backup server.

Here's the output:
Code:
  pool: DevePool
 state: DEGRADED
status: One or more devices could not be opened.  Sufficient replicas exist for
        the pool to continue functioning in a degraded state.
action: Attach the missing device and online it using 'zpool online'.
   see: https://openzfs.github.io/openzfs-docs/msg/ZFS-8000-2Q
  scan: scrub repaired 0B in 13:33:05 with 0 errors on Thu Sep  8 12:38:52 2022
config:

        NAME                                            STATE     READ WRITE CKSUM
        DevePool                                        DEGRADED     0     0     0
          raidz3-0                                      DEGRADED     0     0     0
            gptid/6da2232a-d3cf-11ec-93c3-c81f66ee8705  ONLINE       0     0     0
            gptid/6eca21b8-d3cf-11ec-93c3-c81f66ee8705  ONLINE       0     0     0
            gptid/6c7f7004-d3cf-11ec-93c3-c81f66ee8705  ONLINE       0     0     0
            gptid/6dabe622-d3cf-11ec-93c3-c81f66ee8705  ONLINE       0     0     0
            gptid/6eaafcdb-d3cf-11ec-93c3-c81f66ee8705  ONLINE       0     0     0
            gptid/7072fe93-d3cf-11ec-93c3-c81f66ee8705  ONLINE       0     0     0
            gptid/708e130f-d3cf-11ec-93c3-c81f66ee8705  ONLINE       0     0     0
            gptid/6d946554-d3cf-11ec-93c3-c81f66ee8705  ONLINE       0     0     0
            gptid/6ec28e06-d3cf-11ec-93c3-c81f66ee8705  ONLINE       0     0     0
            gptid/6c901286-d3cf-11ec-93c3-c81f66ee8705  ONLINE       0     0     0
            gptid/6c727fc2-d3cf-11ec-93c3-c81f66ee8705  ONLINE       0     0     0
            gptid/6eb86e37-d3cf-11ec-93c3-c81f66ee8705  ONLINE       0     0     0
            gptid/707bfb57-d3cf-11ec-93c3-c81f66ee8705  ONLINE       0     0     0
            gptid/702cc9eb-d3cf-11ec-93c3-c81f66ee8705  ONLINE       0     0     0
            gptid/6c877d6c-d3cf-11ec-93c3-c81f66ee8705  ONLINE       0     0     0
            gptid/6db6b97d-d3cf-11ec-93c3-c81f66ee8705  ONLINE       0     0     0
            gptid/7084b1bf-d3cf-11ec-93c3-c81f66ee8705  ONLINE       0     0     0
            gptid/71934f01-d3cf-11ec-93c3-c81f66ee8705  ONLINE       0     0     0
            gptid/71ac81d5-d3cf-11ec-93c3-c81f66ee8705  ONLINE       0     0     0
            gptid/71a45ba3-d3cf-11ec-93c3-c81f66ee8705  ONLINE       0     0     0
            da25                                        ONLINE       0     0     0
            14269871488979767539                        UNAVAIL      0     0     0  was /dev/gptid/721a7e76-d3cf-11ec-93c3-c81f66ee8705
            gptid/722d2966-d3cf-11ec-93c3-c81f66ee8705  ONLINE       0     0     0
            gptid/72237655-d3cf-11ec-93c3-c81f66ee8705  ONLINE       0     0     0

errors: No known data errors

 dedup: DDT entries 127552436, size 2.14K on disk, 221B in core

bucket              allocated                       referenced
______   ______________________________   ______________________________
refcnt   blocks   LSIZE   PSIZE   DSIZE   blocks   LSIZE   PSIZE   DSIZE
------   ------   -----   -----   -----   ------   -----   -----   -----
     1     111M   13.8T   13.0T   13.1T     111M   13.8T   13.0T   13.1T
     2    9.54M   1.16T   1.02T   1.04T    21.0M   2.55T   2.24T   2.27T
     4     666K   73.1G   59.7G   61.7G    2.86M    317G    258G    267G
     8    36.5K   2.08G   1.11G   1.41G     362K   20.5G   10.7G   13.8G
    16    12.0K    710M    309M    419M     251K   14.1G   5.97G   8.27G
    32    4.34K    215M    117M    157M     191K   10.8G   6.51G   8.10G
    64      861   44.6M   18.8M   26.4M    70.4K   3.61G   1.54G   2.16G
   128      234   9.86M   4.19M   6.47M    38.3K   1.54G    620M   1009M
   256      131   10.4M   2.84M   4.12M    42.3K   3.23G    965M   1.35G
   512       28    668K    164K    473K    19.1K    447M   90.6M    303M
    1K       15    690K   97.5K    243K    20.4K    876M    141M    338M
    2K        4    198K     41K   76.7K    11.5K    602M    157M    249M
    4K        2      1K      1K   25.6K    12.2K   6.08M   6.08M    155M
    8K        1    512B    512B   12.8K    13.1K   6.56M   6.56M    168M
   32K        1    128K      4K   12.8K    35.7K   4.46G    143M    456M
  512K        1    512B    512B   12.8K     557K    279M    279M   6.96G
 Total     122M   15.0T   14.1T   14.2T     137M   16.7T   15.6T   15.7T


zpool list shows the following:
Code:
NAME        SIZE  ALLOC   FREE  CKPOINT  EXPANDSZ   FRAG    CAP  DEDUP    HEALTH  ALTROOT
DevePool   21.8T  18.4T  3.37T        -         -    48%    84%  1.10x  DEGRADED  /mnt
boot-pool   912G  3.86G   908G        -         -     0%     0%  1.00x    ONLINE  -


So basically a dedup ratio of 1.1 (which for me is quite useful, as it's the difference between the data fitting and not fitting :) )
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
So basically a dedup ratio of 1.1 (which for me is quite useful, as it's the difference between the data fitting and not fitting :) )

1.1? That's terrible. You'd be better off maximizing compression or something like that.
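
To put a number on it: your own zpool status -D output shows 127,552,436 DDT entries at 221 bytes each in core - that's roughly 28GB of RAM just to hold the table once it's fully loaded, out of your 64GB total, all to save about 1.5T of space (15.7T referenced vs 14.2T allocated).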
 

devedse

Dabbler
Joined
Nov 7, 2022
Messages
14
Another error I'm running into: when I try to shut down the system, it now hangs on the following:
AnotherError.png


What could that be?
 

devedse

Dabbler
Joined
Nov 7, 2022
Messages
14
1.1? That's terrible. You'd be better off maximizing compression or something like that.
Well, compression is already activated and set to one of the higher levels. Again, for my use case the backup fits with dedup enabled and doesn't with it disabled :).
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
What could that be?

I expect that if you wait long enough, it'll finish whatever it's doing. You could also try ^T - on FreeBSD, Ctrl+T sends SIGINFO, which makes the foreground process report what it's currently doing.

Basically, the dedup, the super-wide RAIDZ3, and the degraded RAIDZ3 are conspiring to kill you.
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
Again for my usecase my backup fits with dedup enabled and doesn't with it disabled :).

Since you're well past the 80% mark, this statement would appear to be categorically false.
 