Seemingly slow Resilver performance on Spanned 2xvdev with 8x10TB HDDs Each

schmidtd0039

Dabbler
Joined
Feb 6, 2024
Messages
13
Hello All,

I'm hoping to get some help on a topic I know comes up frequently, but I'm having trouble understanding how it's impacting my setup based on other online results. I had a drive failure today and I'm in the process of resilvering the replacement, but I'm seeing a MAX of 18MB/s writing to the new drive, and while the resilver percentage is climbing for now, the ETA is only increasing, rather dramatically at that.

System Details:
TrueNAS-13.0-U4
Intel E5-1660 v3, 8C 16T
128GB DDR4 ECC
2x 10Gb Intel X520-DA2
2x Toshiba 128GB SATA SSDs for boot
16x HUH721010AL42C0 10TB SAS HDDs
(No cache, just spinners)
SuperMicro mobo X10SRH-CF
Supermicro 36 Bay Chassis; SAS-3 backplane, onboard raid card flashed to IT Mode

Pool #1
Encryption on
Sync Standard
no Compression, no DeDup
2 vDevs, each made up of 8x 10TB drives
100TB usable pool size, with 40TB in use and 60TB free

Recent SMART tests, long and short, have all passed on all drives (except the bad one I swapped); shorts run daily and longs weekly. The last scrub job was timed poorly and unfortunately timed out due to the bad drive, so one hasn't been run in ~30 days (I plan to run one once the resilver catches up).

I also noticed I had never set a resilver priority before this started, so I initially thought that was the cause. However, even after updating the priority to all-day, the speed hasn't changed and has rested at 18MB/s consistently for the last 10 hours. Disk latency is normal, as are temps (29C). The system was freshly rebooted as part of the disk replacement, and I don't see any concerning messages in dmesg. Is there something I missed, or something I can do to improve resilver times? I know it will take at least half a day to a day in an optimal environment, but my concern is that if the 18MB/s is true and accurate, I'm looking at about a week, which doesn't seem right. I disabled automated scrub and SMART tests for now so they don't conflict, and my pool is 99% used for reads, infrequent ones at that. It's been idle for the 10 hours it's been rebuilding at this performance so far.

To note, there are also no other notable performance issues with the pool. I can consistently read from it at nearly full 10Gb line rate over SMB, and writes to the array rest around 300MB/s once the RAM cache runs out. Overall I'm very happy with it, just concerned that my newly discovered resilver times are leaving me anxious.

Any advice or input on experience would be appreciated, and thank you in advance! Hoping it magically speeds up overnight while I'm sleeping - else this anxiety caused by rebuild time alone is going to push me to do mirrors next NAS iteration. :D
 

Attachments

  • resilver priority.png
    resilver priority.png
    18 KB · Views: 46
  • Disk IO.png
    Disk IO.png
    485.9 KB · Views: 35
  • latency.png
    latency.png
    754.6 KB · Views: 34
  • pool 1.png
    pool 1.png
    538.5 KB · Views: 42
  • zpool status.png
    zpool status.png
    30.2 KB · Views: 42

Ericloewe

Server Wrangler
Moderator
Joined
Feb 15, 2014
Messages
20,194
no Compression, no DeDup
Why? That's one thing that would contribute positively to (scrub) performance.
but I'm seeing MAX 18MB/s writing to the new drive,
Part of the issue is that you're focusing on a number that doesn't mean much. Your pool is only half-done with the scan phase, which is heavily IOPS limited, but only reads metadata. Meanwhile, as it issues sequential operations to actually read (and write) big chunks of data in one go, you have a not-unhealthy 450 MB/s. So focus on what zpool status tells you, as it's a far better estimate than whatever a disk's I/O measurements are saying.
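For reference, ballpark math on the issued figures looks like this. The TiB counts and elapsed hours below are invented placeholders, not this pool's actual numbers; on a real system you'd read the scanned/issued totals straight out of `zpool status`:

```shell
# Hypothetical figures standing in for the totals `zpool status` reports
# during a resilver (all three values below are made up):
issued_tib=9.2      # data issued so far
total_tib=20.0      # total data to resilver
hours_elapsed=10
awk -v i="$issued_tib" -v t="$total_tib" -v h="$hours_elapsed" 'BEGIN {
  rate = i / h                                     # effective TiB/hour
  printf "%.1f TiB/h, ~%.0f h remaining\n", rate, (t - i) / rate
}'
```

The point is that the issued rate over the whole run, not one disk's instantaneous write speed, is what actually predicts completion.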
 

schmidtd0039

Dabbler
Joined
Feb 6, 2024
Messages
13
Why? That's one thing that would contribute positively to (scrub) performance.

Part of the issue is that you're focusing on a number that doesn't mean much. Your pool is only half-done with the scan phase, which is heavily IOPS limited, but only reads metadata. Meanwhile, as it issues sequential operations to actually read (and write) big chunks of data in one go, you have a not-unhealthy 450 MB/s. So focus on what zpool status tells you, as it's a far better estimate than whatever a disk's I/O measurements are saying.
I opted out of compression as the data on the pool wouldn't compress well or at all, so I figured I would simplify by leaving it off.

Are you saying at some point the status will change from scanning to resilvering, and at that point it will start copying raw data?

This error was also spamming overnight - it's net-new, I've never seen it before (at least not without a specific disk ID attached to it). Going to work now but will continue researching from there.
 

Attachments

  • Screenshot 2024-02-07 074659.png
    Screenshot 2024-02-07 074659.png
    298.4 KB · Views: 35

Ericloewe

Server Wrangler
Moderator
Joined
Feb 15, 2014
Messages
20,194
Are you saying at some point the status will change from scanning to resilvering, and at that point it will start copying raw data?
No, it's already going on, ZFS keeps a buffer of read block pointers to assemble large I/O operations to issue, so as to avoid reading every single block in whatever order they show up in. The buffer gets emptied as large enough chunks show up or the buffer fills up.
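As a toy illustration of that idea (not ZFS code, just the principle): scattered block offsets get sorted, and runs of adjacent blocks collapse into one large sequential I/O:

```shell
# Invented 8-unit block offsets arriving in scattered order; sort them and
# merge adjacent runs into single large "issue" operations, roughly what
# the resilver does internally with its sorted block-pointer buffer.
printf '%s\n' 40 8 80 16 0 48 | sort -n | awk '
  NR == 1   { start = $1; end = $1 + 8; next }
  $1 == end { end = $1 + 8; next }            # contiguous: extend the run
            { print "issue " start "-" end; start = $1; end = $1 + 8 }
  END       { print "issue " start "-" end }
'
```

Six scattered blocks come out as three sequential operations, which is why per-disk MB/s during the scan phase says little about overall progress.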
This error was also spamming overnight - it's net-new, I've never seen it before (at least not without a specific disk ID attached to it). Going to work now but will continue researching from there.
Yeah, that's not good. Definitely double check the SMART data and make sure the HBA is not overheating.
 

schmidtd0039

Dabbler
Joined
Feb 6, 2024
Messages
13
No, it's already going on, ZFS keeps a buffer of read block pointers to assemble large I/O operations to issue, so as to avoid reading every single block in whatever order they show up in. The buffer gets emptied as large enough chunks show up or the buffer fills up.

Yeah, that's not good. Definitely double check the SMART data and make sure the HBA is not overheating.
Do you have any suggestions on checking the HBA temp on an onboard LSI controller on my motherboard? As far as I can tell it doesn't have a pollable temp sensor, and others have had to use an external temp gauge to find the temps.

To note, this system is in a rackmount chassis with thorough airflow, and ambient temps of the general space sit around 40F, so if there are thermal issues I'd have to assume the heatsink needs repasting or has a similar physical fault, as otherwise, even under load, I'd expect it to receive enough cooling in its native chassis and cool ambient temps.

What is the danger of shutting down the server while a resilver is running? Without temp sensors, I have no good way to check if the HBA is running hot or not without shutting it down and popping open the chassis.

The other thing I just thought of, and it's a bad thing on my part - the old failed disk is still in the chassis and seated, though not in a pool. This might also be causing issues for the controller, depending on the gunk and commands it might be sending even in its otherwise expected idle state. I've seen that before on Storage Spaces Direct, where a faulty drive still connected to a controller but not in a pool can still cause impact.

When I did the drive replacement, I technically had the replacement drive in the pool as a hot spare, but the pool was refusing to take it automatically, so I ended up removing the spare from the pool, offlining the bad drive, and then, from the bad drive, selecting "Replace" and choosing the former hot spare. TrueNAS then swapped it out and started the resilver. But the dying drive is still there. I would remove it live, except I also cannot find a way in TrueNAS to toggle drive LEDs, so I would rather shut down the system to hunt for the correct drive while it's down. So it ties back to my question on the safety of shutting down/interrupting a resilver. From what I see online, it should be fine and the resilver will start where it left off, but it's also not recommended.

I ask because at this point the rebuild has slowed down dramatically. It only progressed 8% through my 8 hours of bad sleep, and just a few more percent while I've been at work, so I'm at 35% after almost 24 hours - and again, that ETA timer is only going up and progress is only slowing down.
 

Ericloewe

Server Wrangler
Moderator
Joined
Feb 15, 2014
Messages
20,194
Do you have any suggestions on checking the HBA temp on an onboard LSI controller on my motherboard?
Try storcli /c0 show all | grep -i temperature
What is the danger of shutting down the server while a resilver is running?
You lose a bit of progress, at most, but it's safe.
The other thing I just thought of, and it's a bad thing on my part - the old failed disk is still in the chassis and seated, though not in a pool. This might also be causing issues for the controller, depending on the gunk and commands it might be sending even in its otherwise expected idle state. I've seen that before on Storage Spaces Direct, where a faulty drive still connected to a controller but not in a pool can still cause impact.
Could be, especially if TrueNAS is configured to poll all disks for SMART data.
 

schmidtd0039

Dabbler
Joined
Feb 6, 2024
Messages
13
Thank you for the prompt answers and suggestions, Ericloewe! It's greatly appreciated. I had to head back to work for a fire there, so I'm going to leave it chewing as-is until I get back home later today, and then I'll try checking the controller temps with storcli. Worst case, if progress is still slogging, I might just shut it down, pull that dying drive, and start it back up to see if that helps at all - there's no need for it to be there. It did die during a SMART long run, and the SMART status of that drive is stuck at "running", which makes me think that even though I disabled SMART tasks from starting, TrueNAS might still be trying to contact that drive and pull info from its existing bad SMART run (I rebooted between then and now, and even stopped SMART services, but that drive still thinks it's running a test, even through the reboot).
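One idea for that stuck "running" state: smartctl can usually abort an in-progress self-test with its -X flag. A sketch follows - da15 is this box's device name, the abort itself is commented out since it needs the actual hardware, and the status line is a canned stand-in for real `smartctl -c` output:

```shell
# Abort a stuck self-test on the dying drive (commented out; needs hardware):
#   smartctl -X /dev/da15
# Then re-check the execution-status line from `smartctl -c /dev/da15`.
# Canned sample line standing in for the real output:
status='Self-test execution status:  ( 25) Self-test routine in progress...'
case "$status" in
  *"in progress"*) echo "test still running - try smartctl -X" ;;
  *)               echo "no test running" ;;
esac
```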

To note also, even while the pool/drive is resilvering at its current pace, performance on the pool otherwise feels normal from my quick tests, so it's not like the pool is being stressed (at least not in a way that noticeably impacts conventional SMB I/O).
 

schmidtd0039

Dabbler
Joined
Feb 6, 2024
Messages
13
Home from work. Now at the 27th hour, give or take, of the resilver, and it's at the 55% mark, so quite a jump since lunch. Additionally, the ETA has actually gone down: whereas it started at 8 hours initially and rose slowly to 2 days by lunch today, it's now down to 21 hours remaining - which, based on current pacing, may actually be pretty accurate.

However, there are a bunch more of those aborted targets in the console since this afternoon... :/

Even so, given this marginal improvement and 20% progress in the last 6 hours, I might just wait it out one more sleepless night/day to let the pool return to full health on its own. Then I'll shut it down and start checking the controller heatsink and SAS cable connections, rip out that bad da15 drive for good, boot it up, do some quick perf tests on SMB, and then scrub it.
Try storcli /c0 show all | grep -i temperature
To follow up on controller temperature... my TrueNAS oddly says it has 0 controllers installed, and the temp command fails because of this (see image).

Additionally, this thing's been running for a few years now, so I was trying to remember which firmware revision my onboard LSI 3008 was on, and it looks like it's a tad old: 16.0.0.10.0.0 vs the 16.0.0.12.0.0 I see some people running. Unhelpfully, there's also not much info on the 3008 in the "What's all the noise HBA vs Raid Card" thread, just that it should be above v13. So I need to do more digging here, but a firmware update might not be a bad idea, given that thread lists controller timeout fixes in the 16.0.0.12 branch (granted, only for SATA, whereas I have SAS, but still).

Doubly weird, the sas3flash utility does list the SAS3008 controller as C0... so I was able to get some details there, but not temps.
I'm also seeing dozens of posts online from others with LSI 3008s reporting high ambient temps, so this has my attention as a possible issue I need to verify in my setup - it's entirely possible I just haven't stressed my array/controller to this degree before.

Getting deeper into the weeds: while I still can't find temp info, I was able to browse dmesg via "dmesg | grep mpr" and find which target was causing timeouts and aborts on the controller - it is indeed my good friend, dead/dying da15. So, a relief that it's not a completely different new drive. Yanking that drive tomorrow will clear/stop those errors.
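For anyone else chasing this, a quick way to tally which target is producing the noise. The log lines in the here-doc are invented samples in the rough shape of FreeBSD mpr driver messages; on a real box you'd pipe `dmesg | grep mpr` in instead:

```shell
# Count aborted commands per target from mpr driver log lines.
# Sample lines below are made up; real usage replaces the here-doc with:
#   dmesg | grep mpr | awk ...
awk '/Aborting command/ { n[$NF]++ }
     END { for (t in n) print "target " t ": " n[t] " aborts" }' <<'EOF' | sort
mpr0: Aborting command for target 15
mpr0: Aborting command for target 15
mpr0: Aborting command for target 3
EOF
```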
 

Attachments

  • dmesg grep mpr.png
    dmesg grep mpr.png
    137.8 KB · Views: 31
  • logs.png
    logs.png
    178.8 KB · Views: 50
  • Screenshot 2024-02-07 180320.png
    Screenshot 2024-02-07 180320.png
    57.5 KB · Views: 24
  • Screenshot 2024-02-07 174817.png
    Screenshot 2024-02-07 174817.png
    42 KB · Views: 23

schmidtd0039

Dabbler
Joined
Feb 6, 2024
Messages
13
AHA! And there we have it. With "mprutil show adapter", I'm able to pull the temp of my onboard HBA.

A spicy 69C, despite being in my 45F garage with server-case fans blowing through the chassis. Looks like I'll need to add some active cooling to the chip itself once I crack it open tomorrow.
 

Attachments

  • Temp.png
    Temp.png
    89.9 KB · Views: 29

schmidtd0039

Dabbler
Joined
Feb 6, 2024
Messages
13
Probably my last update, at least until the current resilver finishes tomorrow so I can crack open the case. I was able to locate my forgotten dying drive by using:
sas3ircu 0 display
to list all drives and their enclosure/slot locations. Find the SN that matches the bad drive, then use
sas3ircu 0 locate 2:23 ON
to toggle the identifier LED. 2 is the enclosure ID and 23 is the slot in this case.
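With 36 bays, grepping the display output for the serial beats scrolling. Here's a sketch that maps a serial number to its Enclosure:Slot; the stanza in the here-doc is a simplified, invented sample of sas3ircu's output format, and XYZ123 is a made-up serial:

```shell
# Map a drive serial number to Enclosure:Slot for `sas3ircu 0 locate`.
# Replace the here-doc with real output:  sas3ircu 0 display | awk ...
awk -v sn="XYZ123" '
  /Enclosure #/ { enc  = $NF }
  /Slot #/      { slot = $NF }
  /Serial No/   { if ($NF == sn) print enc ":" slot }
' <<'EOF'
Enclosure #                             : 2
Slot #                                  : 23
Serial No                               : XYZ123
EOF
```

The printed `2:23` is exactly the argument the locate command wants.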

On my SuperMicro chassis, this triggered the red LED, and that was indeed the bad drive. I pulled it and will monitor whether that guy being gone helps resilver times at all. At minimum, it should drop controller load a tad by removing those repeated bad retries/errors - though in the last 20 minutes, the pace looks about the same.

Plan of action once my resilver is done:
1. Graceful shutdown
2. Check/repaste LSI controller heatsink, and assess janky-fan addition
3. Physically move Drive DA16 to DA15's chassis location, for cleanliness/OCD
4. CREATE A DRIVE TABLE IN EXCEL to eventually label drive caddies. Location LED's working saved my bacon, but an extra point of sanity checking will be good if that ever fails or stops working.
5. Boot back up, run a much needed manual scrub of the pool (probably will take another day's time based on historical scrubs).
6. Re-enable SMART and SCRUB schedules
7. Since I'm already running IT mode on a recent version and this is a minor update, it looks like I can update the SAS firmware with the sas3flash built into TrueNAS? Are there dangers to doing this vs a bootable DOS stick? If not: copy the firmware over with WinSCP and trigger the LSI 3008 firmware update from 16.0.0.10 to 16.0.0.12 via the TrueNAS CLI command sas3flash -o -f SAS9300_8i_IT.bin, wait, and reboot after completion to apply. Verify with sas3flash -listall.
 

Ericloewe

Server Wrangler
Moderator
Joined
Feb 15, 2014
Messages
20,194
run a much needed manual scrub of the pool
No point, that's ultimately what a resilver is already doing.

Since I'm already running IT mode on a recent version and this is a minor update, it looks like I can update the SAS firmware with the sas3flash built into TrueNAS? Are there dangers to doing this vs a bootable DOS stick? If not: copy the firmware over with WinSCP and trigger the LSI 3008 firmware update from 16.0.0.10 to 16.0.0.12 via the TrueNAS CLI command sas3flash -o -f SAS9300_8i_IT.bin, wait, and reboot after completion to apply. Verify with sas3flash -listall.
The big caveat is that you can't do this with an online pool, so it's safer to do this from the UEFI shell (not DOS, that's likely to not even work).
 

schmidtd0039

Dabbler
Joined
Feb 6, 2024
Messages
13
The big caveat is that you can't do this with an online pool, so it's safer to do this from the UEFI shell (not DOS, that's likely to not even work).
Ah, you're right, it was from the UEFI shell - it's been a long while since I last had to do a sas3flash, so I couldn't quite remember, and I haven't got that far yet. Thanks for the correction!

No point, that's ultimately what a resilver is already doing.
Valid!

To update: 50 hours into the resilver, I'm at 92% complete. Speeds and performance have held steady based on the stats/charts, so it's just a waiting game. I still feel this resilver is taking 2x as long as I'd expect given the normal performance of the pool, so I'll continue trying the marginal improvements I listed earlier, and at the very end (but probably before the controller update) I'll take fresh external backups and attempt an upgrade to 13.0-U6.1 as well.

Fun note that could be coincidence: ever since removing that bad drive, my SAS controller has rested at 64C through today instead of yesterday's 69C - a very small improvement that didn't seem to change much.

I did a manual perusal of my drives' SMART data, and all my drives seem relatively fine. Most have between 0-20 read errors corrected by ECC, two have ~50 corrected reads, and da4 has a concerning 246,781 corrected read errors (it's on the other vDev from this resilver) - which screams bad/loose cable or connection? Another thing to check once I pop it open. But 0 corrected write errors and 0 uncorrected errors on all disks.
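For sweeping all disks for that, a sketch that flags outsized corrected-read counts. The "read:" row in the here-doc mimics the shape of smartctl's SCSI error-counter log with invented numbers, the real loop would run `smartctl -a /dev/daN` per disk, and the 1000 threshold is an arbitrary assumption:

```shell
# Flag disks whose total ECC-corrected read count (5th column of the
# "read:" row in smartctl's SCSI "Error counter log") looks outsized.
# Numbers below are invented; real usage:
#   for d in /dev/da*; do smartctl -a "$d" | awk ...; done
awk '$1 == "read:" && $5 > 1000 {
       print "suspicious: " $5 " corrected reads" }' <<'EOF'
read:   246000      781         0    246781     246781      35874.123           0
EOF
```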
 

Attachments

  • DA4.png
    DA4.png
    62.8 KB · Views: 31

schmidtd0039

Dabbler
Joined
Feb 6, 2024
Messages
13
Updates on this: DA4 was indeed bad. Just days after the resilver completed on the pool, DA4 started exhibiting some insane latency times (50k ms), and I was watching the total corrected errors climb in real time. I had to pull it from the pool and start another resilver... this one took 65 hours.

After that wrapped up, I managed to get the SAS card firmware updated from EFI without issue, and I'm running the in-GUI upgrade from 13.0-U4 to 13.0-U6.1.

Another thing I did: I opened up the whole chassis and poked around at the SAS3008 chip on the mobo. While attempting to cleverly zip-tie a 40mm fan to it, I knocked the darn heatsink loose, so I had to take the whole mobo out to re-paste it.

Much to my displeasure, this wasn't just an easy repaste... that stuff was like cement. However, I finally got it cleaned up pretty well and repasted. I ended up using the only thing I had on hand so I could get the system back online, which was Arctic MX-4. That said, it doesn't seem like a great long-term choice: the heatsink can "drift" and slide on the IHS when nudged, since it's only secured by two opposing clips/springs and not a full screw setup, and the MX-4 is fairly thin compared to what was on it. So I think I need to order some thicker paste and take it apart to paste it again - as little as I want to... MX-6 or Thermal Grizzly looks to be thicker and should last a bit longer, I'd think, especially in a 24x7 system.

Do you have any recommendations for pasting something like this?

That said, even with the Arctic MX-4 + Noctua 40mm fan, I'm getting 49C vs the previous ~66C average temps, so a significant improvement for now. Whether that means an actual performance improvement is yet to be seen, since other than the resilver, my performance has been acceptable/great.
 

Attachments

  • PXL_20240220_050001279.jpg
    PXL_20240220_050001279.jpg
    199.6 KB · Views: 24
  • PXL_20240220_045958898.jpg
    PXL_20240220_045958898.jpg
    330.5 KB · Views: 39