Pool Offline

ForMyDemons · Mar 28, 2024

Hi,

I don't use a UPS, and my NAS lost power. When I powered it back on, my pool was listed as degraded and one drive was marked UNAVAIL.

I tried a scrub, zpool clear, reboots, and onlining the drive. Nothing worked, so I wiped that one disk, hoping it would resilver.

Then I turned off the machine, checked the cables and reseated them. Maybe something had come loose?

I started the PC again and now my data pool is listed as offline.

zpool import won't work.

What can I do?

chuck32 · Mar 28, 2024

Did you actually replace the drive, or did you just wipe it and hope it would start to resilver? You need to start the replacement process manually.

zpool replace

Although I'm not convinced it's fruitful to do that unless you verified the drive is okay and didn't suffer any permanent damage from the blackout.

Do you have backups? Do you need to access the pool for data recovery first? In that case you could try and import as read only.

I wonder how you would import the pool in a degraded state anyway, maybe force import? The link provided in the message does not explain that, but it does list your options. In your case I'd think replace is the way to go, since you already wiped the drive.

ForMyDemons · Mar 28, 2024

I used the quick wipe option, and of course I don't have any backups, and yes, some data would be nice to access... I only wiped 1 drive out of 5, which is why I'm so weirded out that the whole thing is set to offline.

ForMyDemons · Mar 29, 2024

chuck32 said:
Did you actually replace the drive, or did you just wipe it and hope it would start to resilver? You need to start the replacement process manually.

zpool replace

Although I'm not convinced it's fruitful to do that unless you verified the drive is okay and didn't suffer any permanent damage from the blackout.

Do you have backups? Do you need to access the pool for data recovery first? In that case you could try and import as read only.

I wonder how you would import the pool in a degraded state anyway, maybe force import? The link provided in the message does not explain that, but it does list your options. In your case I'd think replace is the way to go, since you already wiped the drive.

How can I replace the single drive? With zpool status I don't see the pool; only with zpool import do I see the degraded Datengrab. But zpool replace wants the old disk and a new disk. I don't have a new disk, and the gptids all look the same. How can I figure out which device I want to replace?

Also, is there a way to copy and paste the shell output here? Ctrl+C won't work.

chuck32 · Mar 29, 2024

I'm on mobile. Start an SSH session; that's better than using the built-in shell. Hit Ctrl + V and it will display a message listing the keyboard shortcuts for copy and paste.

When you wiped the disk, you should have ended up with one unused disk that you can use as the replacement. Did you follow the link in the error message?

I assume importing read-only may work to retrieve the data. I think the command is zpool import -o readonly=on Datengrab; double-check with the documentation if in doubt.

Better to post your outputs here for better guidance on replacing if you are unsure.

ForMyDemons · Mar 29, 2024

chuck32 said:
I'm on mobile. Start an SSH session; that's better than using the built-in shell. Hit Ctrl + V and it will display a message listing the keyboard shortcuts for copy and paste.

When you wiped the disk, you should have ended up with one unused disk that you can use as the replacement. Did you follow the link in the error message?

I assume importing read-only may work to retrieve the data. I think the command is zpool import -o readonly=on Datengrab; double-check with the documentation if in doubt.

Better to post your outputs here for better guidance on replacing if you are unsure.

You mean the error message regarding the "new zfs version"? No, I didn't look at it since I have no clue why that's there.

I used

Code:

zpool import -o readonly=on Datengrab

and now zpool status gives me this:

Code:

root@truenas[~]# zpool status
  pool: Datengrab
 state: DEGRADED
status: One or more devices could not be opened.  Sufficient replicas exist for
        the pool to continue functioning in a degraded state.
action: Attach the missing device and online it using 'zpool online'.
   see: https://openzfs.github.io/openzfs-docs/msg/ZFS-8000-2Q
  scan: scrub repaired 0B in 00:27:34 with 0 errors on Thu Mar 28 18:57:10 2024
config:

        NAME                                            STATE     READ WRITE CKSUM
        Datengrab                                       DEGRADED     0     0     0
          raidz1-0                                      DEGRADED     0     0     0
            gptid/737222a9-9d92-11ee-b6fe-38d547b73974  ONLINE       0     0     1
            ada1p2                                      ONLINE       0     0     2
            gptid/7392fda0-9d92-11ee-b6fe-38d547b73974  ONLINE       0     0     1
            gptid/73ac4dde-9d92-11ee-b6fe-38d547b73974  ONLINE       0     0     1
            4329784629172535182                         UNAVAIL      0     0     0  was /dev/ada5p2

errors: No known data errors

  pool: boot-pool
 state: ONLINE
status: Some supported and requested features are not enabled on the pool.
        The pool can still be used, but some features are unavailable.
action: Enable all features using 'zpool upgrade'. Once this is done,
        the pool may no longer be accessible by software that does not support
        the features. See zpool-features(7) for details.
  scan: scrub repaired 0B in 00:00:20 with 0 errors on Fri Mar 29 03:45:20 2024
config:

        NAME        STATE     READ WRITE CKSUM
        boot-pool   ONLINE       0     0     0
          ada4p2    ONLINE       0     0     0

errors: No known data errors

It's listed as degraded but I can't access it from Windows.

And it's weird that I can import it as read-only but not normally...

I rebooted the system and started again from the beginning:

zpool status

Code:

 zpool status
  pool: boot-pool
 state: ONLINE
status: Some supported and requested features are not enabled on the pool.
        The pool can still be used, but some features are unavailable.
action: Enable all features using 'zpool upgrade'. Once this is done,
        the pool may no longer be accessible by software that does not support
        the features. See zpool-features(7) for details.
  scan: scrub repaired 0B in 00:00:20 with 0 errors on Fri Mar 29 03:45:20 2024
config:

        NAME        STATE     READ WRITE CKSUM
        boot-pool   ONLINE       0     0     0
          ada4p2    ONLINE       0     0     0

errors: No known data errors

zpool import

Code:

zpool import
   pool: Datengrab
     id: 11785441521838410442
  state: DEGRADED
status: One or more devices are missing from the system.
 action: The pool can be imported despite missing or damaged devices.  The
        fault tolerance of the pool may be compromised if imported.
   see: https://openzfs.github.io/openzfs-docs/msg/ZFS-8000-2Q
 config:

        Datengrab                                       DEGRADED
          raidz1-0                                      DEGRADED
            gptid/737222a9-9d92-11ee-b6fe-38d547b73974  ONLINE
            ada1p2                                      ONLINE
            gptid/7392fda0-9d92-11ee-b6fe-38d547b73974  ONLINE
            gptid/73ac4dde-9d92-11ee-b6fe-38d547b73974  ONLINE
            4329784629172535182                         UNAVAIL  cannot open

zpool import Datengrab

Code:

 zpool import Datengrab
cannot import 'Datengrab': one or more devices is currently unavailable

When I run zpool import Datengrab -F -n, nothing is returned and nothing gets imported.

Now I really don't know how to replace "4329784629172535182 UNAVAIL cannot open". I guess I need to replace that ID, 4329784629172535182, with the ID of the wiped disk, and how do I obtain that?

Sorry for asking so many noob questions =/
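
For anyone reading along with the same question: a minimal sketch of how the missing member can be picked out of a zpool status listing, assuming the config table has the same layout as the paste above. The helper names parse_config and missing_members are made up for illustration, and counters abbreviated like "1.2K" are not handled.

```python
import re

# Config table copied from the first `zpool status` paste in this thread.
STATUS_TEXT = """\
        NAME                                            STATE     READ WRITE CKSUM
        Datengrab                                       DEGRADED     0     0     0
          raidz1-0                                      DEGRADED     0     0     0
            gptid/737222a9-9d92-11ee-b6fe-38d547b73974  ONLINE       0     0     1
            ada1p2                                      ONLINE       0     0     2
            gptid/7392fda0-9d92-11ee-b6fe-38d547b73974  ONLINE       0     0     1
            gptid/73ac4dde-9d92-11ee-b6fe-38d547b73974  ONLINE       0     0     1
            4329784629172535182                         UNAVAIL      0     0     0  was /dev/ada5p2
"""

# name, state, three integer counters, optional "was /dev/..." note.
# Assumption: counters are plain integers, so the header row never matches.
ROW = re.compile(
    r"^\s*(?P<name>\S+)\s+(?P<state>[A-Z]+)"
    r"\s+(?P<read>\d+)\s+(?P<write>\d+)\s+(?P<cksum>\d+)"
    r"(?:\s+was\s+(?P<was>\S+))?\s*$"
)

# Device states that mean the member is not contributing to the pool.
NOT_USABLE = {"UNAVAIL", "FAULTED", "REMOVED", "OFFLINE"}


def parse_config(text):
    """Return one dict per pool/vdev/device row of the config table."""
    rows = []
    for line in text.splitlines():
        m = ROW.match(line)
        if m is None:
            continue  # header row or unrelated line
        rows.append({
            "name": m["name"],
            "state": m["state"],
            "cksum": int(m["cksum"]),
            "was": m["was"],  # None when there is no "was ..." note
        })
    return rows


def missing_members(rows):
    """Rows whose state says the device is not usable right now."""
    return [r for r in rows if r["state"] in NOT_USABLE]


rows = parse_config(STATUS_TEXT)
for r in missing_members(rows):
    print(r["name"], r["state"], "was", r["was"])
for r in rows:
    if r["cksum"] > 0:
        print(r["name"], "has", r["cksum"], "checksum error(s)")
```

For the paste above this reports the member 4329784629172535182 together with its old path /dev/ada5p2, plus the four online members that logged checksum errors. The long number is the GUID ZFS shows for a member it cannot open; as far as I know it can be given as the old device to zpool replace, but check the man page before relying on that.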

ForMyDemons · Mar 29, 2024

Update: I shut down the machine again and reseated everything again; now the pool shows up again.

Code:

zpool status
  pool: Datengrab
 state: DEGRADED
status: One or more devices could not be opened.  Sufficient replicas exist for
        the pool to continue functioning in a degraded state.
action: Attach the missing device and online it using 'zpool online'.
   see: https://openzfs.github.io/openzfs-docs/msg/ZFS-8000-2Q
  scan: scrub repaired 0B in 00:27:34 with 0 errors on Thu Mar 28 18:57:10 2024
config:

        NAME                                            STATE     READ WRITE CKSUM
        Datengrab                                       DEGRADED     0     0     0
          raidz1-0                                      DEGRADED     0     0     0
            gptid/737222a9-9d92-11ee-b6fe-38d547b73974  ONLINE       0     0     0
            gptid/7389e3b1-9d92-11ee-b6fe-38d547b73974  ONLINE       0     0     0
            gptid/7392fda0-9d92-11ee-b6fe-38d547b73974  ONLINE       0     0     0
            gptid/73ac4dde-9d92-11ee-b6fe-38d547b73974  ONLINE       0     0     0
            4329784629172535182                         UNAVAIL      0     0     0  was /dev/ada5p2

errors: No known data errors

  pool: boot-pool
 state: ONLINE
status: Some supported and requested features are not enabled on the pool.
        The pool can still be used, but some features are unavailable.
action: Enable all features using 'zpool upgrade'. Once this is done,
        the pool may no longer be accessible by software that does not support
        the features. See zpool-features(7) for details.
  scan: scrub repaired 0B in 00:00:20 with 0 errors on Fri Mar 29 03:45:20 2024
config:

        NAME        STATE     READ WRITE CKSUM
        boot-pool   ONLINE       0     0     0
          ada4p2    ONLINE       0     0     0

errors: No known data errors

But when I use the GUI to replace the disk and tick the force box, I get this:

Code:

Error: concurrent.futures.process._RemoteTraceback:
"""
Traceback (most recent call last):
  File "/usr/local/lib/python3.9/site-packages/middlewared/plugins/zfs.py", line 283, in replace
    target.replace(newvdev)
  File "libzfs.pyx", line 402, in libzfs.ZFS.__exit__
  File "/usr/local/lib/python3.9/site-packages/middlewared/plugins/zfs.py", line 283, in replace
    target.replace(newvdev)
  File "libzfs.pyx", line 2147, in libzfs.ZFSVdev.replace
libzfs.ZFSException: /dev/gptid/b3baba12-edeb-11ee-a450-38d547b73974 is busy, or device removal is in progress

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/lib/python3.9/concurrent/futures/process.py", line 246, in _process_worker
    r = call_item.fn(*call_item.args, **call_item.kwargs)
  File "/usr/local/lib/python3.9/site-packages/middlewared/worker.py", line 111, in main_worker
    res = MIDDLEWARE._run(*call_args)
  File "/usr/local/lib/python3.9/site-packages/middlewared/worker.py", line 45, in _run
    return self._call(name, serviceobj, methodobj, args, job=job)
  File "/usr/local/lib/python3.9/site-packages/middlewared/worker.py", line 39, in _call
    return methodobj(*params)
  File "/usr/local/lib/python3.9/site-packages/middlewared/worker.py", line 39, in _call
    return methodobj(*params)
  File "/usr/local/lib/python3.9/site-packages/middlewared/schema.py", line 985, in nf
    return f(*args, **kwargs)
  File "/usr/local/lib/python3.9/site-packages/middlewared/plugins/zfs.py", line 285, in replace
    raise CallError(str(e), e.code)
middlewared.service_exception.CallError: [EZFS_BADDEV] /dev/gptid/b3baba12-edeb-11ee-a450-38d547b73974 is busy, or device removal is in progress
"""

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/usr/local/lib/python3.9/site-packages/middlewared/job.py", line 355, in run
    await self.future
  File "/usr/local/lib/python3.9/site-packages/middlewared/job.py", line 391, in __run_body
    rv = await self.method(*([self] + args))
  File "/usr/local/lib/python3.9/site-packages/middlewared/schema.py", line 981, in nf
    return await f(*args, **kwargs)
  File "/usr/local/lib/python3.9/site-packages/middlewared/plugins/pool_/replace_disk.py", line 91, in replace
    await self.middleware.call('zfs.pool.replace', pool['name'], options['label'], new_devname)
  File "/usr/local/lib/python3.9/site-packages/middlewared/main.py", line 1283, in call
    return await self._call(
  File "/usr/local/lib/python3.9/site-packages/middlewared/main.py", line 1248, in _call
    return await self._call_worker(name, *prepared_call.args)
  File "/usr/local/lib/python3.9/site-packages/middlewared/main.py", line 1254, in _call_worker
    return await self.run_in_proc(main_worker, name, args, job)
  File "/usr/local/lib/python3.9/site-packages/middlewared/main.py", line 1173, in run_in_proc
    return await self.run_in_executor(self.__procpool, method, *args, **kwargs)
  File "/usr/local/lib/python3.9/site-packages/middlewared/main.py", line 1156, in run_in_executor
    return await loop.run_in_executor(pool, functools.partial(method, *args, **kwargs))
middlewared.service_exception.CallError: [EZFS_BADDEV] /dev/gptid/b3baba12-edeb-11ee-a450-38d547b73974 is busy, or device removal is in progress

UPDATE:

Instead of replace, I used the "online" function in the GUI and, whoosh, the disk came online. Now the system is resilvering and seems fine again... Don't ask me why, but I guess the sudden power loss combined with the StarTech expansion card for 2 SATA drives caused the issue. (Yes, I know... I'm already planning to get an LSI HBA in the future, and later to upgrade to a DDR5 system with a more reliable board, etc.)

But it's still weird to me: even if one drive is simply broken, that's what RAID is for. It shouldn't take down the whole system like that... Weird, but I'm still learning a lot about this.

chuck32 · Mar 29, 2024

ForMyDemons said:
You mean the error message regarding the "new zfs version"? No, I didn't look at it since I have no clue why that's there.

No I meant
see: https://openzfs.github.io/openzfs-docs/msg/ZFS-8000-2Q

ForMyDemons said:
It's listed as degraded but I can't access it from Windows.

Did you check your share settings? I'd assume the readonly pool maybe got mounted differently and thus wasn't picked up by the smb sharing service. Windows is peculiar about its error messages in my experience.

ForMyDemons said:
middlewared.service_exception.CallError: [EZFS_BADDEV] /dev/gptid/b3baba12-edeb-11ee-a450-38d547b73974 is busy, or device removal is in progress

Weird. If this was indeed the new drive, it shouldn't have been busy with anything after a reboot.

It would have been nice to see a screenshot of the GUI, to see which drive you picked as the replacement (though I would assume the GUI picked up the unused drive correctly).
You could have tried via the CLI.

ForMyDemons said:
with the ID of the wiped disk, and how do I obtain that?

On CORE, I think glabel status should work.
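
A rough sketch of how that output could be turned into a label-to-partition lookup, assuming the usual three-column layout (Name, Status, Components). The sample text is hypothetical: the labels and the pairing with ada0/ada5 are made up for illustration and are not from the poster's machine, and label_map and labels_on_disk are illustrative names.

```python
import re

# Hypothetical `glabel status` output; labels and device pairing are invented.
GLABEL_TEXT = """\
                                      Name  Status  Components
gptid/11111111-0000-0000-0000-000000000001     N/A  ada0p2
gptid/22222222-0000-0000-0000-000000000002     N/A  ada5p1
gptid/33333333-0000-0000-0000-000000000003     N/A  ada5p2
"""


def label_map(text):
    """Map each label (e.g. "gptid/...") to the partition it sits on."""
    mapping = {}
    for line in text.splitlines():
        parts = line.split()
        # Expect exactly three columns and skip the header row.
        if len(parts) == 3 and parts[0] != "Name":
            name, _status, component = parts
            mapping[name] = component
    return mapping


def labels_on_disk(mapping, disk):
    """Labels whose partition belongs to `disk`; "ada5" matches "ada5p2" but not "ada50p2"."""
    pattern = re.escape(disk) + r"p\d+"
    return sorted(
        label for label, part in mapping.items()
        if re.fullmatch(pattern, part)
    )


mapping = label_map(GLABEL_TEXT)
print(labels_on_disk(mapping, "ada5"))
```

With a lookup like this, the "was /dev/ada5p2" hint from zpool status can be matched against the gptid labels that currently exist on that disk.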

ForMyDemons said:
Instead of replace, I used the "online" function in the GUI and, whoosh, the disk came online. Now the system is resilvering and seems fine again...

You onlined this drive:
4329784629172535182 UNAVAIL 0 0 0 was /dev/ada5p2

Maybe I'm fundamentally wrong, but your newly wiped drive should not have been associated with that entry, especially since it says UNAVAIL and not OFFLINE. Was your replacement disk offline?

ForMyDemons said:
Don't ask me why, but I guess the sudden power loss combined with the StarTech expansion card for 2 SATA drives caused the issue. (Yes, I know... I'm already planning to get an LSI HBA in the future, and later to upgrade to a DDR5 system with a more reliable board, etc.)

Should have mentioned that in the OP. I'm not always asking for hardware details, they should be provided anyway ;) In this case the power loss drew me away from a general hardware problem.

What hardware do you currently use? DDR5 is probably not necessary.

ForMyDemons said:
But it's still weird to me: even if one drive is simply broken, that's what RAID is for. It shouldn't take down the whole system like that...

Well, the system wasn't broken. It behaved slightly differently from what I would have expected, but basically the data was there, as expected. You were able to import the pool read-only, and I'd be happy to learn from some other user how it would have been possible to import the pool in the degraded state other than read-only.

I'm glad this played out for you. And now that nothing was lost:

ForMyDemons said:
of course I don't have any backups, and yes, some data would be nice to access...

Let this be a scare for you; find a backup solution immediately.

ForMyDemons · Mar 29, 2024

I didn't change anything hardware-wise on my system, except for replugging the SATA cables into the same HDDs.

I use a StarTech 2P6G-PCIE-SATA-CARD for 2 drives, and from the datasheet I see it uses the ASMedia ASM1061 chipset. It's an ASRock board with a 6th-gen Intel i5 CPU and 16 GB of RAM (it was my girlfriend's old system, in a little Fractal nano case).

I looked at https://openzfs.github.io/openzfs-docs/msg/ZFS-8000-2Q and tried what it describes, but it didn't work, and I didn't know how to replace that disk from the shell with the replace command.

And yes, I just copied everything off as a backup.
