SOLVED VDEVs randomly became "not assigned" on the pool. Worried I'll lose my data

Azaz

Cadet
Joined
Jan 6, 2023
Messages
7
So my VDEVs have all randomly become unassigned from my disks, and when I try to add them back, it sounds like it will wipe all of the data on them, unless I am mistaken?

I did not change anything that I am aware of that would have caused this. I went to access my files the other day, and it suddenly said I had 5 unassigned disks.

When I try to reassign them, I get a warning that I might lose my data. I'm wondering whether I actually will, since I'm just adding them back to the same pool they should already have been on in the first place.

Please let me know what I can do or if anyone can point me in the right direction!

Getting this critical error:
"Pool NelsonNAS state is OFFLINE: None"

Here are some screenshots of what I am seeing:
Nas1.png
Nas2.png
Nas3.png
Nas4.png


Here are my system Specs:

  • Motherboard: ASRock B550 Phantom Gaming-ITX/ax
  • CPU: AMD Ryzen 5 5600G
  • RAM: Crucial 32GB (2 x 16GB) DDR4 3200 CT2K16G4DFRA32A
  • SSD (Boot): Samsung 870 EVO 250GB SATA 2.5" (MZ-77E250)
    • Using the IO CREST M.2 B+M Key to SATA III 2 Ports Expansion Card
  • SSD (Cache): Intel M.2 22 x 80mm Optane Memory 32GB + SSD 512GB PCIe NVMe 3.0 x4 HBRPEKNX0202A01
  • HDD: 4x WD Red Pro WD161KFGX 16TB
  • Case: JONSBO N1 Mini-ITX NAS Chassis
 

Samuel Tai

Never underestimate your own stupidity
Moderator
Joined
Apr 24, 2020
Messages
5,399
No, don't try to add them back. Try to export the pool with the boxes for "Destroy data" and "Delete configuration" unchecked and the box for "Confirm export" checked. Then try to import the pool back.
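
If the export button in the UI misbehaves, a minimal sketch of the same export step from a root shell would be the following (pool name taken from your alert; this detaches the pool but does not destroy data):

# Cleanly export the pool without destroying anything on the disks
zpool export NelsonNAS

# List pools that are visible and importable, along with their device status
zpool import

The re-import itself can then be attempted from the pool import screen in the UI.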
 

Azaz

Cadet
Joined
Jan 6, 2023
Messages
7
No, don't try to add them back. Try to export the pool with the boxes for "Destroy data" and "Delete configuration" unchecked and the box for "Confirm export" checked. Then try to import the pool back.
I've done that. Now, when I try to import the pool back, I get this error message:

Error: concurrent.futures.process._RemoteTraceback:
"""
Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/middlewared/plugins/zfs.py", line 425, in import_pool
    zfs.import_pool(found, pool_name, properties, missing_log=missing_log, any_host=any_host)
  File "libzfs.pyx", line 1262, in libzfs.ZFS.import_pool
  File "libzfs.pyx", line 1290, in libzfs.ZFS.__import_pool
libzfs.ZFSException: cannot import 'NelsonNAS' as 'NelsonNAS': I/O error

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/lib/python3.9/concurrent/futures/process.py", line 243, in _process_worker
    r = call_item.fn(*call_item.args, **call_item.kwargs)
  File "/usr/lib/python3/dist-packages/middlewared/worker.py", line 115, in main_worker
    res = MIDDLEWARE._run(*call_args)
  File "/usr/lib/python3/dist-packages/middlewared/worker.py", line 46, in _run
    return self._call(name, serviceobj, methodobj, args, job=job)
  File "/usr/lib/python3/dist-packages/middlewared/worker.py", line 40, in _call
    return methodobj(*params)
  File "/usr/lib/python3/dist-packages/middlewared/worker.py", line 40, in _call
    return methodobj(*params)
  File "/usr/lib/python3/dist-packages/middlewared/schema.py", line 1288, in nf
    return func(*args, **kwargs)
  File "/usr/lib/python3/dist-packages/middlewared/plugins/zfs.py", line 431, in import_pool
    self.logger.error(
  File "libzfs.pyx", line 465, in libzfs.ZFS.__exit__
  File "/usr/lib/python3/dist-packages/middlewared/plugins/zfs.py", line 429, in import_pool
    raise CallError(f'Failed to import {pool_name!r} pool: {e}', e.code)
middlewared.service_exception.CallError: [EZFS_IO] Failed to import 'NelsonNAS' pool: cannot import 'NelsonNAS' as 'NelsonNAS': I/O error
"""
The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/middlewared/job.py", line 426, in run
    await self.future
  File "/usr/lib/python3/dist-packages/middlewared/job.py", line 461, in __run_body
    rv = await self.method(*([self] + args))
  File "/usr/lib/python3/dist-packages/middlewared/schema.py", line 1284, in nf
    return await func(*args, **kwargs)
  File "/usr/lib/python3/dist-packages/middlewared/schema.py", line 1152, in nf
    res = await f(*args, **kwargs)
  File "/usr/lib/python3/dist-packages/middlewared/plugins/pool.py", line 1425, in import_pool
    await self.middleware.call('zfs.pool.import_pool', guid, opts, any_host, use_cachefile, new_name)
  File "/usr/lib/python3/dist-packages/middlewared/main.py", line 1306, in call
    return await self._call(
  File "/usr/lib/python3/dist-packages/middlewared/main.py", line 1263, in _call
    return await self._call_worker(name, *prepared_call.args)
  File "/usr/lib/python3/dist-packages/middlewared/main.py", line 1269, in _call_worker
    return await self.run_in_proc(main_worker, name, args, job)
  File "/usr/lib/python3/dist-packages/middlewared/main.py", line 1184, in run_in_proc
    return await self.run_in_executor(self.__procpool, method, *args, **kwargs)
  File "/usr/lib/python3/dist-packages/middlewared/main.py", line 1169, in run_in_executor
    return await loop.run_in_executor(pool, functools.partial(method, *args, **kwargs))
middlewared.service_exception.CallError: [EZFS_IO] Failed to import 'NelsonNAS' pool: cannot import 'NelsonNAS' as 'NelsonNAS': I/O error
 

Samuel Tai

Never underestimate your own stupidity
Moderator
Joined
Apr 24, 2020
Messages
5,399
We'll have to try CLI at this point, as the UI isn't working. Try zpool import -f -F -R /mnt NelsonNAS.
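
For anyone landing here later, here is a rough annotation of that command; the flag descriptions are my paraphrase of the zpool-import man page:

# -f       force the import even if the pool appears to be in use or was not cleanly exported
# -F       recovery mode: discard the last few transactions if that is what it takes to make the pool importable
# -R /mnt  set the altroot so the pool's datasets mount under /mnt, where TrueNAS expects them
zpool import -f -F -R /mnt NelsonNAS

Afterwards, zpool status NelsonNAS is a quick way to confirm the pool and its vdevs are back online.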
 

Azaz

Cadet
Joined
Jan 6, 2023
Messages
7
We'll have to try CLI at this point, as the UI isn't working. Try zpool import -f -F -R /mnt NelsonNAS.

Ah, yes! That got it working. Thank you so much.

I also found this post while looking for more possible solutions. I didn't see it beforehand, but it's basically the same issue and solution.

You're the boss!
 

bajang14

Cadet
Joined
Aug 13, 2022
Messages
2
Hi, sorry to bump this thread again. This is my first time using TrueNAS SCALE after upgrading from TrueNAS CORE, and I ran into the same problem: one of my disks suddenly became unassigned. I tried exporting and re-importing the pool without error, but the disk remains unassigned. Is my drive ruined? All the files are backed up, so I could always recreate the pool, but I want to know how to solve this problem if it ever occurs again in the future.

I'm using a non-NAS self-build from scratch:

Mobo: GA B85M DS3H A
RAM: 16 GB
Drives: 6x WD Blue 2 TB, configured into 1 pool, RaidZ1

I'm using this mainly for managing large files for my work (mostly design and photography) before the files go into backup, and to learn about NAS and IT in general along the way.

I still don't fully understand how to read the SMART test results, but it seems the drive is pre-failing. I still believe it's working, at least for the moment. Here's the test result:

root@truenas[~]# smartctl -a /dev/sdc
smartctl 7.2 2020-12-30 r5155 [x86_64-linux-5.15.107+truenas] (local build)
Copyright (C) 2002-20, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Device Model: WDC WD20EZBX-00AYRA0
Serial Number: WD-WXB2A618U3D1
LU WWN Device Id: 5 0014ee 2bf1c287e
Firmware Version: 01.01A01
User Capacity: 2,000,398,934,016 bytes [2.00 TB]
Sector Sizes: 512 bytes logical, 4096 bytes physical
Rotation Rate: 7200 rpm
Form Factor: 3.5 inches
TRIM Command: Available, deterministic
Device is: Not in smartctl database [for details use: -P showall]
ATA Version is: ACS-3 T13/2161-D revision 5
SATA Version is: SATA 3.1, 6.0 Gb/s (current: 1.5 Gb/s)
Local Time is: Thu Oct 5 07:42:35 2023 WIB
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status: (0x00) Offline data collection activity
was never started.
Auto Offline Data Collection: Disabled.
Self-test execution status: ( 0) The previous self-test routine completed
without error or no self-test has ever
been run.
Total time to complete Offline
data collection: (52860) seconds.
Offline data collection
capabilities: (0x7b) SMART execute Offline immediate.
Auto Offline data collection on/off support.
Suspend Offline collection upon new
command.
Offline surface scan supported.
Self-test supported.
Conveyance Self-test supported.
Selective Self-test supported.
SMART capabilities: (0x0003) Saves SMART data before entering
power-saving mode.
Supports SMART auto save timer.
Error logging capability: (0x01) Error logging supported.
General Purpose Logging supported.
Short self-test routine
recommended polling time: ( 2) minutes.
Extended self-test routine
recommended polling time: ( 178) minutes.
Conveyance self-test routine
recommended polling time: ( 2) minutes.
SCT capabilities: (0x3035) SCT Status supported.
SCT Feature Control supported.
SCT Data Table supported.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME           FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate      0x002f   200   200   051    Pre-fail  Always       -       0
  3 Spin_Up_Time             0x0027   184   152   021    Pre-fail  Always       -       1791
  4 Start_Stop_Count         0x0032   100   100   000    Old_age   Always       -       218
  5 Reallocated_Sector_Ct    0x0033   200   200   140    Pre-fail  Always       -       0
  7 Seek_Error_Rate          0x002e   200   200   000    Old_age   Always       -       0
  9 Power_On_Hours           0x0032   082   082   000    Old_age   Always       -       13246
 10 Spin_Retry_Count         0x0032   100   100   000    Old_age   Always       -       0
 11 Calibration_Retry_Count  0x0032   100   100   000    Old_age   Always       -       0
 12 Power_Cycle_Count        0x0032   100   100   000    Old_age   Always       -       217
192 Power-Off_Retract_Count  0x0032   200   200   000    Old_age   Always       -       128
193 Load_Cycle_Count         0x0032   200   200   000    Old_age   Always       -       92
194 Temperature_Celsius      0x0022   105   095   000    Old_age   Always       -       38
196 Reallocated_Event_Count  0x0032   200   200   000    Old_age   Always       -       0
197 Current_Pending_Sector   0x0032   200   200   000    Old_age   Always       -       0
198 Offline_Uncorrectable    0x0030   100   253   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count     0x0032   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate    0x0008   200   200   000    Old_age   Offline      -       0

SMART Error Log Version: 1
ATA Error Count: 1656 (device log contains only the most recent five errors)
CR = Command Register [HEX]
FR = Features Register [HEX]
SC = Sector Count Register [HEX]
SN = Sector Number Register [HEX]
CL = Cylinder Low Register [HEX]
CH = Cylinder High Register [HEX]
DH = Device/Head Register [HEX]
DC = Device Command Register [HEX]
ER = Error register [HEX]
ST = Status register [HEX]
Powered_Up_Time is measured from power on, and printed as
DDd+hh:mm:SS.sss where DD=days, hh=hours, mm=minutes,
SS=sec, and sss=millisec. It "wraps" after 49.710 days.

Error 1656 occurred at disk power-on lifetime: 13246 hours (551 days + 22 hours)
When the command that caused the error occurred, the device was active or idle.

After command completion occurred, registers were:
ER ST SC SN CL CH DH
-- -- -- -- -- -- --
04 61 e0 20 02 00 e0 Device Fault; Error: ABRT 224 sectors at LBA = 0x00000220 = 544

Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
-- -- -- -- -- -- -- -- ---------------- --------------------
c8 00 e0 20 02 00 e0 08 11:16:37.723 READ DMA
ef 10 02 00 00 00 a0 08 11:16:37.708 SET FEATURES [Enable SATA feature]

Error 1655 occurred at disk power-on lifetime: 13246 hours (551 days + 22 hours)
When the command that caused the error occurred, the device was active or idle.

After command completion occurred, registers were:
ER ST SC SN CL CH DH
-- -- -- -- -- -- --
04 61 e0 20 02 00 e0 Device Fault; Error: ABRT 224 sectors at LBA = 0x00000220 = 544

Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
-- -- -- -- -- -- -- -- ---------------- --------------------
c8 00 e0 20 02 00 e0 08 11:16:37.652 READ DMA
ef 10 02 00 00 00 a0 08 11:16:37.641 SET FEATURES [Enable SATA feature]

Error 1654 occurred at disk power-on lifetime: 13246 hours (551 days + 22 hours)
When the command that caused the error occurred, the device was active or idle.

After command completion occurred, registers were:
ER ST SC SN CL CH DH
-- -- -- -- -- -- --
04 61 e0 20 02 00 e0 Device Fault; Error: ABRT 224 sectors at LBA = 0x00000220 = 544

Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
-- -- -- -- -- -- -- -- ---------------- --------------------
c8 00 e0 20 02 00 e0 08 11:16:37.590 READ DMA
ef 10 02 00 00 00 a0 08 11:16:37.575 SET FEATURES [Enable SATA feature]

Error 1653 occurred at disk power-on lifetime: 13246 hours (551 days + 22 hours)
When the command that caused the error occurred, the device was active or idle.

After command completion occurred, registers were:
ER ST SC SN CL CH DH
-- -- -- -- -- -- --
04 61 e0 20 02 00 e0 Device Fault; Error: ABRT 224 sectors at LBA = 0x00000220 = 544

Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
-- -- -- -- -- -- -- -- ---------------- --------------------
c8 00 e0 20 02 00 e0 08 11:16:37.523 READ DMA
ef 10 02 00 00 00 a0 08 11:16:37.508 SET FEATURES [Enable SATA feature]

Error 1652 occurred at disk power-on lifetime: 13246 hours (551 days + 22 hours)
When the command that caused the error occurred, the device was active or idle.

After command completion occurred, registers were:
ER ST SC SN CL CH DH
-- -- -- -- -- -- --
04 61 e0 20 02 00 e0 Device Fault; Error: ABRT 224 sectors at LBA = 0x00000220 = 544

Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
-- -- -- -- -- -- -- -- ---------------- --------------------
c8 00 e0 20 02 00 e0 08 11:16:37.457 READ DMA
ef 10 02 00 00 00 a0 08 11:16:37.442 SET FEATURES [Enable SATA feature]

SMART Self-test log structure revision number 1
Num  Test_Description    Status                   Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Extended offline    Completed without error        00%              774  -

SMART Selective self-test log data structure revision number 1
SPAN MIN_LBA MAX_LBA CURRENT_TEST_STATUS
1 0 0 Not_testing
2 0 0 Not_testing
3 0 0 Not_testing
4 0 0 Not_testing
5 0 0 Not_testing
Selective self-test flags (0x0):
After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.

And the notification alerts show:

Critical

Pool MOMO state is DEGRADED: One or more devices could not be used because the label is missing or invalid. Sufficient replicas exist for the pool to continue functioning in a degraded state.
The following devices are not healthy:
  • Disk WDC_WD20EZBX-00AYRA0 WD-WXB2A618U3D1 is UNAVAIL
  • Disk WDC_WD20EZBX-00AYRA0 WD-WXB2A613A29Y is DEGRADED



Let me know what you think

Thanks in advance
 

bajang14

Cadet
Joined
Aug 13, 2022
Messages
2
Update on my situation:

So I tried again to export/disconnect my pool. This time, however, I disconnected all of my HDDs and turned on my NAS without any of them connected. Then I turned it off, reconnected all of the drives, and imported the pool. All of the drives were recognized as before, but this time the storage status is degraded: the drive that was previously reported as unassigned is now FAULTED, and another drive is reported as DEGRADED. I guess it really is faulted and needs to be replaced.
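
For reference, a rough sketch of the ZFS commands involved in confirming and replacing a faulted member looks like this (the pool name MOMO comes from the alert above; the device paths are placeholders, not the actual disks, and the replacement can also be done from the Storage UI):

# Show which vdev members are FAULTED/DEGRADED and any read/write/checksum errors
zpool status -v MOMO

# After installing a spare/new disk, replace the faulted member with it
# (both device paths below are placeholders)
zpool replace MOMO /dev/disk/by-id/OLD-FAULTED-DISK /dev/disk/by-id/NEW-DISK

# Watch the resilver progress until it completes
zpool status MOMO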

Here's the screenshot:
Screen Shot 2023-10-05 at 20.57.52.png



Let me know what you think

Thanks in advance
 

xVertigo

Cadet
Joined
Nov 19, 2023
Messages
1
We'll have to try CLI at this point, as the UI isn't working. Try zpool import -f -F -R /mnt NelsonNAS.
Jumping into this thread. First of all THANK YOU Samuel, saved my pool.
Secondly, how can a RAIDZ2 pool just go poof like that? The disks are OK, and there was a scrub operation a few hours before the pool disappeared.
This is a new NAS that is supposed to replace my old trusty RAID-card-based NAS, and I've lost all my faith in it...

Machine:
Asrock B550 Steel Legend motherboard
AMD R9-5950X (one CCD disabled, so 8C16T)
4x 16GB DDR4 3200 non-ECC (passed 4 memtest passes)
RaidZ2:
5x Seagate Ironwolf 8TB ST8000VN0022-2EL112
1x Intel DC P3700 SSDPEDMD020T4 2TB as cache
 

Seyude

Cadet
Joined
Jan 17, 2024
Messages
3
We'll have to try CLI at this point, as the UI isn't working. Try zpool import -f -F -R /mnt NelsonNAS.
Hi @Samuel Tai, I have the same problem here and tried to use this solution, which is working for others. I'm running into a problem when using this command: I get an error saying "Namespace zpool not found" in the CLI.

I'm trying this from the Shell page in the web UI. When I open it, I type cli to get there. I also tried sudo cli, but got the same response, and zpool import by itself gives me the same thing. I think I'm doing something wrong but can't figure out what. Can you help me?
 

Patrick M. Hausen

Hall of Famer
Joined
Nov 25, 2013
Messages
7,776
I'm trying this from the Shell page in the web UI. When I open it, I type cli to get there. I also tried sudo cli, but got the same response.
zpool is a shell command, not a TrueNAS CLI command. So, from the shell, invoke zpool ... directly; don't type cli first.

Also, rather than the web UI shell, it's better to use SSH and a terminal application.
 

Seyude

Cadet
Joined
Jan 17, 2024
Messages
3
zpool is a shell command, not a TrueNAS CLI command. So, from the shell, invoke zpool ... directly; don't type cli first.

Also, rather than the web UI shell, it's better to use SSH and a terminal application.
Thanks, but when I do that, I get the reply: zsh: command not found: zpool. Does it have something to do with my admin account not being the default root user? I've read that in some posts here, but can't find a workaround.

I'm clearly no expert in Linux...
 

Patrick M. Hausen

Hall of Famer
Joined
Nov 25, 2013
Messages
7,776
You need to be root, so sudo su - first.
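
For completeness, the whole session looks roughly like this (the hostname, username, and pool name below are placeholders for your own values):

# From your own computer, connect over SSH (enable the SSH service in TrueNAS first)
ssh admin@truenas.local

# Become root so zpool is on your PATH and you have the required privileges
sudo su -

# Then run the import command from earlier in the thread
zpool import -f -F -R /mnt YOURPOOL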
 

ilias991

Cadet
Joined
Feb 18, 2024
Messages
8
1708877929119.png
 

Patrick M. Hausen

Hall of Famer
Joined
Nov 25, 2013
Messages
7,776
@ilias991 and your question is?

Please open a new thread instead of posting random images in a thread that is probably not even related to your problem.
Second, please post command output by copying and pasting text instead of images. Enclose text in CODE tags.
Third, post all details regarding your hardware: system, controller if present, disk drives or SSDs, how they are connected, ...

You have an I/O error ... so something is definitely broken. Disk drive, cable, controller, ... but without all the details nobody will be able to even have a guess.
 