Uncorrectable I/O failure after hotplugging disk

Abdruck

Cadet
Joined
Jul 3, 2022
Messages
5
Hello everyone,

First of all, here are my system specs:
  • Motherboard: ASRock Rack C236M WS
  • CPU: Pentium G4600
  • RAM: 16 GB ECC
  • Hard drives: 2x IronWolf 4 TB in a mirrored configuration
  • Boot drive: Samsung SSD 870 EVO 250 GB
  • TrueNAS version: 13.0
I recently built a TrueNAS server for the first time, so I'm pretty new to the whole NAS business. I had no problems whatsoever until today.

When I built the server, I added a hot-swap bay I got from Amazon so that I can do cold backups. Yesterday I inserted a simple 4 TB Barracuda drive, set up a pool named 'cold_backup', and added a replication task that runs every Monday. After setting up the pool, I tried exporting and importing the pool a few times, and everything worked. The idea was to insert the drive every Sunday evening so that the replication could run on Monday.

So today I inserted the drive and wanted to import my pool, but I couldn't log in to the web UI. After trying a few times, I connected a monitor and keyboard directly to the server but got no picture, so I had to force a shutdown. After rebooting, I could see the console again, so I tried reinserting the backup drive and got the error shown in the attached picture. I already searched the forum for similar problems but couldn't really find anything, so I decided to post this thread. Does anybody know what caused this error and what I have to do to get the cold backup working?

If the drive is inserted before the system boots, it is recognized and the pool can be imported without problems.
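
For reference, the weekly cycle described above could be sketched roughly as follows. This is only an illustration in Python: the pool name is taken from the post, the helper name cold_backup_cycle is made up, and on TrueNAS the import and export would normally be done through the web UI rather than with raw zpool commands.

```python
import shlex

POOL = "cold_backup"  # pool name from the post


def cold_backup_cycle(pool):
    """Return the commands for one import / check / export cycle.

    Hypothetical helper: it only builds the command lines, it does not run them.
    """
    return [
        ["zpool", "import", pool],        # make the pool on the inserted disk available
        ["zpool", "status", "-x", pool],  # report whether the pool has problems
        # ... the replication task would run at this point ...
        ["zpool", "export", pool],        # cleanly release the pool before pulling the disk
    ]


for cmd in cold_backup_cycle(POOL):
    print(shlex.join(cmd))
```

The point of the sketch is the order: the pool is exported before the disk is removed, so that no imported pool loses its only device.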

I want to apologize in advance if there are spelling mistakes and grammatical errors, as I am not a native English speaker.
 

Attachments

  • Error.jpg

Abdruck

Just after posting this, I noticed that the backup disk that is connected right now has the designation ada1, which is usually used by one of my storage pool drives. Could this error be caused by the order of the SATA connectors I used when connecting the hot-swap bay?

If I remember correctly, I used the following layout:
SATA_0: boot drive
SATA_1: hot swap bay
SATA_2: storage drive 1
SATA_3: storage drive 2
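
The name change can be illustrated with a small model of the layout above. This is a simplified sketch in Python, assuming that the adaN names are handed out at boot in port order to whichever ports hold a disk; the real FreeBSD probe logic may differ, and the function name names_at_boot is made up for the example.

```python
# Simplified model (an assumption, not the actual FreeBSD probe logic):
# at boot, occupied SATA ports are numbered ada0, ada1, ... in port order.
LAYOUT = {
    "SATA_0": "boot drive",
    "SATA_1": "hot swap bay",
    "SATA_2": "storage drive 1",
    "SATA_3": "storage drive 2",
}


def names_at_boot(layout, empty_ports=()):
    """Return {device name: role} for the ports that hold a disk at boot."""
    occupied = [role for port, role in sorted(layout.items())
                if port not in empty_ports]
    return {f"ada{i}": role for i, role in enumerate(occupied)}


bay_empty = names_at_boot(LAYOUT, empty_ports=("SATA_1",))
bay_filled = names_at_boot(LAYOUT)

print("bay empty at boot: ", bay_empty)
print("bay filled at boot:", bay_filled)
```

In this model, ada1 refers to a storage drive when the bay is empty at boot and to the backup disk when the bay is occupied, which would match the observation above.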
 

HoneyBadger

actually does care
Administrator
Moderator
iXsystems
Joined
Feb 6, 2014
Messages
5,112
When I built the server, I added a hot-swap bay I got from Amazon
Can you please link to the Amazon page for this backplane? I am concerned that it may not properly support hotplug (which is what caused the web UI to become unresponsive), or that your system/swap partition was located on that disk.

The changing of drive order also concerns me. How is the backplane connected? A 1:1 cabling of SATA ports to drive bays shouldn't cause reordering of active drives, and ZFS isn't concerned with how the devices are attached, provided that they don't change while mounted (but that could be the issue here).


Yesterday I inserted a simple 4 TB Barracuda drive
Side note: the ST4000DM004 is an SMR (Shingled Magnetic Recording) drive, and will not perform well under sustained random writes. This may not be an issue for a single-drive pool that is primarily receiving a backup workload though.
 

Abdruck

Thank you for the reply.

Here is the Amazon link:
Bay

As far as I can tell, it is simply an adapter.

Also, the boot drive and this bay share a single SATA power cable. Is it possible that there is some interference when the backup disk spins up?
 

HoneyBadger

Ahh, it is a single hotswap tray. I thought you had bought a multi-drive backplane. This shouldn't act any differently than a directly connected drive then.

Can you confirm that your SATA ports are set to AHCI mode in the motherboard BIOS? If they are set to IDE emulation or a Fake RAID this may also cause problems.
 

Abdruck

Can you confirm that your SATA ports are set to AHCI mode in the motherboard BIOS? If they are set to IDE emulation or a Fake RAID this may also cause problems.
Yes, I already checked, they are set to AHCI.

I also looked at the picture I attached again and noticed that it says the boot drive was detached, even though I only removed the backup drive.
 

Abdruck

Just now I got a bunch of e-mail notifications about failed snapshot tasks. The error message was "dataset already exists, no snapshots were created". When I looked at my periodic snapshot tasks, there were a bunch that I didn't create, so I think it's safe to assume that my config was damaged. Fortunately, I had a backup of said config. Should I delete the wrongly taken snapshots, or should I just wait until they are deleted automatically in two weeks?
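
To get an idea of when the snapshots would age out on their own, something like the following Python sketch could be used. It assumes the snapshot names follow the naming schema auto-%Y-%m-%d_%H-%M and a lifetime of two weeks; the dataset and snapshot names in the example are made up, and the function name expiry is hypothetical.

```python
from datetime import datetime, timedelta

SCHEMA = "auto-%Y-%m-%d_%H-%M"  # assumed naming schema of the snapshot task
LIFETIME = timedelta(weeks=2)    # retention mentioned in the post


def expiry(snapshot):
    """Return when a snapshot such as 'tank/data@auto-2022-07-04_00-00' ages out."""
    _, _, snap_name = snapshot.partition("@")
    return datetime.strptime(snap_name, SCHEMA) + LIFETIME


# Made-up example names, not taken from the actual system.
snapshots = [
    "tank/data@auto-2022-07-04_00-00",
    "tank/data@auto-2022-07-05_12-30",
]

for snap in snapshots:
    print(snap, "expires", expiry(snap).isoformat(sep=" ", timespec="minutes"))
```

This only estimates the dates from the names; whether the snapshots are actually removed depends on the periodic snapshot task that created them still existing in the config.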
 