Upgraded from 13.0-U2 to 13.0-U6.1 - Pool Offline

wittend

Dabbler
Joined
Oct 10, 2022
Messages
26
My System Partition (and all of my stored data) is Offline and I do not know how to restore it.
I see many reports of problems like mine, but usually there is hardware RAID, Proxmox, or other factors that don't apply to me.

I am not especially experienced with TrueNAS, but it has been working well for me for many months. I tried to follow recommendations (as well as I could understand them) when I set this system up, and tried not to make things complicated. Now, after upgrading to TrueNAS-13.0-U6.1, things have fallen apart for me. I believe that I upgraded from TrueNAS-13.0-U2.

My setup is:

16 GB RAM.
An older Intel i7 motherboard with 4 SATA connections plus a two-port PCIe SATA card.
Two small SATA SSD drives for TrueNAS itself (attached to the PCIe card).
4 WD Red 4TB SATA drives attached to the motherboard SATA connectors.
No RAID, software or hardware.
Straightforward BIOS setup - no frills, twiddles or tweaks.
All drives visible at the BIOS level, no error reports concerning hardware.
1 Gb Ethernet on the motherboard, not an add-in card.

All NAS data is in one ZFS pool called (unimaginatively) BigPool.
I backed up the system information to another machine before I applied the update.
The system continued to work well for a while; now it doesn't.

The Dashboard shows everything as normal except:

BigPool
Total Disks: Unknown
Pool Status: OFFLINE
Used Space: Unknown

Under Storage / Pools:

BigPool (System Dataset Pool) OFFLINE [and a button] Export/Disconnect

No other pools to export to, no clue where to go from here.

Help!
 
winnielinnie

Joined
Oct 22, 2019
Messages
3,641
Are you familiar with using SSH + command-line?

Can you provide the output of the following "zpool" and "zfs" commands:
Code:
zpool status -v

zpool list

zfs list -t filesystem | grep "\.system"

zfs mount

zpool import
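
If you run these over SSH rather than the web shell, you'll get the full, untruncated output. A minimal sketch, assuming SSH is enabled on your box; "truenas.local" is just a placeholder for whatever address your NAS answers on:
Code:
# placeholder address -- substitute your NAS's IP or hostname
ssh root@truenas.local

# then run the commands above, e.g.
zpool status -v
zpool import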
 

wittend

Dabbler
Joined
Oct 10, 2022
Messages
26
Yes, I am familiar with SSH, but I have the Web interface up and from the shell window I get:

zpool status -v
pool: boot-pool
state: ONLINE
scan: scrub repaired 0B in 00:00:17 with 0 errors on Tue Jan 9 03:45:17 2024
config:

NAME STATE READ WRITE CKSUM
boot-pool ONLINE 0 0 0
mirror-0 ONLINE 0 0 0
ada0p2 ONLINE

zpool list
NAME       SIZE   ALLOC  FREE  CKPOINT  EXPANDSZ  FRAG  CAP  DEDUP  HEALTH  ALTROOT
boot-pool  103G   3.03G  100G  -        -         0%    2%   1.00x  ONLINE  -

zfs list -t filesystem | grep "\.system"
[nothing, just a newline - was I supposed to request a particular filesystem?]

zfs mount
boot-pool/ROOT/13.0-U6.1 /

zpool import
pool: BigPool
id: 12719548412266318029
state: ONLINE
status: One or more devices were being resilvered.
action: The pool can be imported using its name or numeric identifier.

Config:
BigPool ONLINE
raidz1-0 ONLINE
gptid/e4b02a68-4cb0-11ed-a5e8-bc5ff4664310 ONLINE

Is that good enough?
 

wittend

Dabbler
Joined
Oct 10, 2022
Messages
26
Without the 'grep' and as a non-root user I can enter:

bear% zfs list -t filesystem
NAME USED AVAIL REFER MOUNTPOINT
boot-pool 3.03G 96.8G 96K none
boot-pool/ROOT 3.02G 96.8G 96K none
boot-pool/ROOT/13.0-U6.1 3.02G 96.8G 1.51G /
boot-pool/ROOT/Initial-Install 8K 96.8G 1.50G legacy
boot-pool/ROOT/default 292K 96.8G 1.50G legacy
 
winnielinnie

Joined
Oct 22, 2019
Messages
3,641
Config:
BigPool ONLINE
raidz1-0 ONLINE
gptid/e4b02a68-4cb0-11ed-a5e8-bc5ff4664310 ONLINE
Only one member disk is available for a RAIDZ1 pool, or was the output cut off?


zfs mount
boot-pool/ROOT/13.0-U6.1 /
zfs list -t filesystem | grep "\.system"
[nothing, just a newline - was I supposed to request a particular filesystem?]
There is no System Dataset, mounted or otherwise? :oops:

Are you able to (temporarily) move your System Dataset to your boot-pool?


zpool import
pool: BigPool
id: 12719548412266318029
state: ONLINE
status: One or more devices were being resilvered.
action: The pool can be imported using its name or numeric identifier.
Were you aware your BigPool was being resilvered?


In the GUI -> Storage -> Disks, do you see all member disks for your BigPool pool?
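
If the GUI isn't cooperating, a rough equivalent from an SSH session (a sketch, assuming stock TrueNAS CORE on FreeBSD) is:
Code:
# list every disk the OS currently sees on the SATA bus
camcontrol devlist

# map gptid/... labels (like the one in your zpool import output)
# back to the underlying adaX partitions
glabel status | grep gptid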
 

wittend

Dabbler
Joined
Oct 10, 2022
Messages
26
Sorry about my ignorance, but I spent much of the time since my last post figuring out what 'resilvering' is in this context.
I have not been concerned directly with system admin tasks and terminology for many years.
I finally had to ask Google what TrueNAS means by this. Now I know. A cute metaphor. But a bit opaque, too.

Is this why BigPool can't be mounted? It can take a very long time to 'resilver', I guess.
Or have I lost something vital?

There is nothing that currently seems to correspond to "\.system", either in the directory /boot where this was run or in / itself. /BigPool exists, but it just seems to be a currently unpopulated mountpoint.

And yes, in GUI -> Storage -> Disks all six of the drives in the system are visible. I'm not sure whether the information there is just stored or whether it is actually looking at them.

Am I hosed yet?

Thanks,
Dave
 
winnielinnie

Joined
Oct 22, 2019
Messages
3,641
There is nothing that currently seems to correspond to "\.system", either in the directory /boot where this was run or in / itself. /BigPool exists, but it just seems to be a currently unpopulated mountpoint.
Moving the System Dataset to the boot-pool is done via the GUI. The "friendly" name is "System Dataset" (not ".system").
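
If you want to double-check where the middleware thinks the System Dataset lives after you change it, I believe this middleware call reports it (treat it as a sketch, and run it over SSH rather than the web shell):
Code:
# should show a "pool" field naming the pool hosting the System Dataset
midclt call systemdataset.config | python3 -m json.tool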
 

wittend

Dabbler
Joined
Oct 10, 2022
Messages
26
Ok, I hope that I understood correctly. In the Web GUI, under System->System Dataset I was able to 'Configure System Dataset' to 'boot-pool' and click 'Save'. There was no indication of an error.
I hope that was what you meant.
 

Davvo

MVP
Joined
Jul 12, 2022
Messages
3,222
You should not use the WebUI Shell; it's broken and cuts off text.
I would try zpool import BigPool.
I suggest reading the following resource.
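
If you do try it from the command line, it may be gentler to import without mounting anything and under /mnt as the altroot, the way TrueNAS mounts pools itself. A sketch using standard zpool flags (nothing TrueNAS-specific):
Code:
# import without mounting any datasets, rooted under /mnt
zpool import -N -R /mnt BigPool

# then check its health before touching any data
zpool status -v BigPool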
 

Davvo

MVP
Joined
Jul 12, 2022
Messages
3,222
When you say you have “No Raid” does that mean your pool is striped with all four disks being used as space?
He likely means no hardware RAID. A striped pool does not resilver.
 
winnielinnie

Joined
Oct 22, 2019
Messages
3,641
There was no indication of an error.
Now that you have a (hopefully) working System Dataset, can you reboot and try to import BigPool using the standard GUI functions?
 

wittend

Dabbler
Joined
Oct 10, 2022
Messages
26
@Davvo,
I mean I have no 'hardware' that inflicts RAID on my physical disks.

@winnielinnie,
Yes, I seem to have a working System Dataset,
but no, attempting to import BigPool fails, returning a whole bunch of Python errors (a traceback, really).
FWIW I can show you those.
But I had pretty much come to the conclusion that I am well and truly hosed.
Since this box was the backup for a workstation that also had to be rebuilt, it is going to be an interesting tax season.

Dave
 

wittend

Dabbler
Joined
Oct 10, 2022
Messages
26
Results from GUI->Storage->Pools->Import Pool:
Code:
Error: concurrent.futures.process._RemoteTraceback:
"""
Traceback (most recent call last):
  File "/usr/local/lib/python3.9/concurrent/futures/process.py", line 246, in _process_worker
    r = call_item.fn(*call_item.args, **call_item.kwargs)
  File "/usr/local/lib/python3.9/site-packages/middlewared/worker.py", line 111, in main_worker
    res = MIDDLEWARE._run(*call_args)
  File "/usr/local/lib/python3.9/site-packages/middlewared/worker.py", line 45, in _run
    return self._call(name, serviceobj, methodobj, args, job=job)
  File "/usr/local/lib/python3.9/site-packages/middlewared/worker.py", line 39, in _call
    return methodobj(*params)
  File "/usr/local/lib/python3.9/site-packages/middlewared/worker.py", line 39, in _call
    return methodobj(*params)
  File "/usr/local/lib/python3.9/site-packages/middlewared/schema.py", line 985, in nf
    return f(*args, **kwargs)
  File "/usr/local/lib/python3.9/site-packages/middlewared/plugins/zfs.py", line 352, in import_pool
    self.logger.error(
  File "libzfs.pyx", line 402, in libzfs.ZFS.__exit__
  File "/usr/local/lib/python3.9/site-packages/middlewared/plugins/zfs.py", line 346, in import_pool
    zfs.import_pool(found, new_name or found.name, options, any_host=any_host)
  File "libzfs.pyx", line 1175, in libzfs.ZFS.import_pool
  File "libzfs.pyx", line 1203, in libzfs.ZFS.__import_pool
libzfs.ZFSException: I/O error
"""

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/usr/local/lib/python3.9/site-packages/middlewared/job.py", line 355, in run
    await self.future
  File "/usr/local/lib/python3.9/site-packages/middlewared/job.py", line 391, in __run_body
    rv = await self.method(*([self] + args))
  File "/usr/local/lib/python3.9/site-packages/middlewared/schema.py", line 981, in nf
    return await f(*args, **kwargs)
  File "/usr/local/lib/python3.9/site-packages/middlewared/plugins/pool.py", line 1462, in import_pool
    await self.middleware.call('zfs.pool.import_pool', pool['guid'], {
  File "/usr/local/lib/python3.9/site-packages/middlewared/main.py", line 1283, in call
    return await self._call(
  File "/usr/local/lib/python3.9/site-packages/middlewared/main.py", line 1248, in _call
    return await self._call_worker(name, *prepared_call.args)
  File "/usr/local/lib/python3.9/site-packages/middlewared/main.py", line 1254, in _call_worker
    return await self.run_in_proc(main_worker, name, args, job)
  File "/usr/local/lib/python3.9/site-packages/middlewared/main.py", line 1173, in run_in_proc
    return await self.run_in_executor(self.__procpool, method, *args, **kwargs)
  File "/usr/local/lib/python3.9/site-packages/middlewared/main.py", line 1156, in run_in_executor
    return await loop.run_in_executor(pool, functools.partial(method, *args, **kwargs))
libzfs.ZFSException: ('I/O error',)

I'm thinking that I should give up and start over. Right?
 
winnielinnie

Joined
Oct 22, 2019
Messages
3,641
At work. Can’t respond. Hang tight.
 

Davvo

MVP
Joined
Jul 12, 2022
Messages
3,222

wittend

Dabbler
Joined
Oct 10, 2022
Messages
26
It seems that I have three options in system->boot:

Name             Active      Created              Space     Keep
13.0-U6.1        Now/Reboot  2024-01-08 11:30:00  1.51 GiB  No
default          -           2022-10-15 07:34:00  4.29 MiB
Initial-Install  -           2022-10-15 07:37:00  3.80 MiB  Yes

I'm not certain what this is telling me because the Active image is so much larger (1.51 GiB) than the other two.
I'm not wanting to do any (more) rash things right now.

Dave
 
winnielinnie

Joined
Oct 22, 2019
Messages
3,641
My hunch, assuming that your BigPool vdev's underlying disks are fine, is that the middleware doesn't like two sets of System Datasets. (However, the I/O error is concerning.) Otherwise, if the RAIDZ1 vdev is "failed", then your data is pretty much gone.

First, check that your System Dataset (and its children) do indeed exist on the boot-pool now, and that they are currently mounted.
Code:
zfs list -t filesystem -r -o name,mountpoint boot-pool/.system
zfs mount | grep "boot-pool/.system"


If they exist and are currently mounted, then continue.


Now check (with the full output) if all member disks are available for BigPool, and if it's "importable".
Code:
zpool import


Now see if you can at least import the pool without mounting any datasets (by using the "-N" flag).
Code:
zpool import -N BigPool


What is the status for BigPool?
Code:
zpool status -v BigPool


Do you see your datasets for BigPool?
Code:
zfs list -t filesystem -r BigPool


Can you at least get this far? (If so, don't try to "use" the pool, nor TrueNAS in general.)
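
And if the import fails with the same "I/O error" again, it may be worth glancing at the kernel log and the drives' SMART data before going any further. A rough sketch; the adaX device name below is a placeholder, so adjust it to match your system:
Code:
# recent kernel messages -- look for CAM / ada read or write errors
dmesg | tail -n 50

# SMART health for one of the BigPool member disks;
# repeat for each data disk, adjusting the device name
smartctl -a /dev/ada1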
 