SOLVED Probably coincidence: zpool import failure on update

Status
Not open for further replies.

pjc

Contributor
Joined
Aug 26, 2014
Messages
187
I just used the GUI to update from be89d82 to 67b83a7.

On reboot, /mnt/tank wasn't mounted. It's listed in the sidebar, but not in the "View Volumes" list. If I try to import it, I get the message "You already have a volume with same name".

The alert says "WARNING: The volume tank (ZFS) status is UNKNOWN".

zdb -C shows the pool. zfs list doesn't. A reboot didn't fix it. Rolling back to be89d82 didn't fix it. A power cycle didn't fix it.

zdb -C tank says:
WARNING: can't open objset for tank/timemachine
zdb: can't open 'tank': Input/output error

zpool import tank says:
cannot import 'tank': I/O error
Destroy and re-create the pool from
a backup source.

Ew. Not even complaints about bad metadata. These were just test drives (striped), so I'm not actually out any significant data, but it's a little puzzling.

SMART isn't showing any errors for either drive. It's a little surprising to me that there would be such a sudden failure.

Could an unclean reboot nuke the volume like this? (I don't see how it would result in I/O errors...) Is there any way to get more information about which drive is causing problems? Short self-tests passed, and I don't see any detail in /var/log/messages or dmesg about I/O errors anywhere.
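For anyone else chasing a failure like this on FreeBSD/FreeNAS, a few read-only diagnostics can narrow down which drive is involved without touching the pool. The device names below (da0/da1) are just examples; substitute whatever `camcontrol devlist` shows for your controller.

```shell
# Enumerate the disks behind the HBA so you know which devices to check
camcontrol devlist

# Per-drive SMART detail (the summary "PASSED" can hide pending/reallocated sectors)
smartctl -a /dev/da0 | egrep -i 'error|pending|realloc'
smartctl -a /dev/da1 | egrep -i 'error|pending|realloc'

# Read the pool configuration straight from the on-disk labels,
# without importing the pool (-e = exported/unimported pool)
zdb -e -C tank
```

None of these write to the pool, so they're safe to run while deciding on recovery steps.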
 

joeschmuck

Old Man
Moderator
Joined
May 28, 2011
Messages
10,994
It's worth submitting a bug report.
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,526
One thing about I/O errors: in the context ZFS uses the term, it's akin to the file system saying "go to location X" when location X doesn't exist because it's an LBA past the end of the disk. I'd say there's definite corruption (duh!?), and if possible I wouldn't do anything with the pool for a few days so iX can have a chance to look at it more closely. Something was obviously very weird.

Just curious, what hardware were you using? You using recommended hardware and stuff?
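If the "LBA past the end of the disk" theory were right, it should be visible by comparing each disk's reported size against its partition table. A quick sanity check on FreeBSD (da0 is a hypothetical device name; run it for each pool member):

```shell
# Media size and sector count as the disk itself reports them
diskinfo -v /dev/da0

# Partition start/size in sectors; the last partition's start + size
# must not exceed the sector count reported by diskinfo above
gpart show da0
```

A mismatch there would point at a geometry/translation problem (e.g., a controller reporting a different size) rather than media failure.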
 

pjc

Contributor
Joined
Aug 26, 2014
Messages
187
@jkh already tagged the bug report as "cannot reproduce", so I'm not sure they're interested in looking at the pool more closely.

Then again, I'm not sure anyone actually followed the steps and tried to reproduce it. Is there any way to make zpool import a bit more verbose?

For hardware, my signature has most of the crucial bits. Controller is an LSI SAS 9201-16i (IT mode), chassis is a Supermicro 836BA, and the drives were both Seagate.
 

pjc

Contributor
Joined
Aug 26, 2014
Messages
187
A little more information, and it's slightly puzzling:

zpool import -F -n tank says:

Nothing. Absolutely nothing. It doesn't say there's an error, it doesn't say it's unrecoverable, nothing. What does that mean?

If the folks at iX don't want anything from me, I'll give recovery mode a shot and see what happens. Or am I better off force-mounting read-only?
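For what it's worth, `-n` makes `-F` a dry run: it reports what a recovery-mode rewind *would* do, and silence generally means ZFS found no earlier transaction group it would roll back to, i.e., it considers the most recent txg the one to import. That's consistent with the failure being something other than a torn last write. Before a destructive recovery attempt, a read-only import is the gentler option (assuming standard ZFS import syntax; `-R /mnt` is the usual FreeNAS altroot):

```shell
# Dry-run rewind: prints what -F would discard, writes nothing
zpool import -F -n tank

# Import without writing anything, so the on-disk state stays intact for diagnosis
zpool import -o readonly=on -f -R /mnt tank
```

If the read-only import succeeds, the data can be copied off before trying anything riskier.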
 

pjc

Contributor
Joined
Aug 26, 2014
Messages
187
Things got more interesting. It's looking like it's an issue with 9.3-BETA and non-upgraded pools. The pool magically worked when I booted back to 9.2.1.9, and datasets that I modified within 9.2.1.9 were then visible in 9.3-BETA (but the others weren't). See the bug report for more details.
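Since the pool behaves differently under the two OS releases, one thing worth comparing is what each side's ZFS supports versus what's actually on disk. A hedged sketch (run under each boot environment):

```shell
# Features/versions this kernel's ZFS implementation supports
zpool upgrade -v

# The pool's on-disk version, where the pool imports (i.e., under 9.2.1.9)
zpool get version tank

# If import fails, read the version straight from the labels instead
zdb -e -C tank | grep -i version
```

A non-upgraded v28 pool should be readable by both, so any discrepancy here would be useful to attach to the bug report.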
 

pjc

Contributor
Joined
Aug 26, 2014
Messages
187
Yes, it's consistent as I go back and forth between 9.2.1.9 and 9.3-BETA.

I'm holding off writing to the other datasets within 9.2.1.9 in case there's any diagnostic information that we can coax out of ZFS to see what's going on.
 

sef

Guest
No, I asked if it happens across reboots: you upgrade to 9.3, get the error, and then if you reboot the 9.3 system does it stay that way?
 

pjc

Contributor
Joined
Aug 26, 2014
Messages
187
Yes, multiple 9.3 reboots, updates, etc., and it still fails to mount the pool. Swapping boot devices back to 9.2.1.9, it mounts happily across multiple reboots. Swapping back to 9.3-BETA, it fails to mount the pool, again across reboots.

(For clarity, the initial event wasn't a 9.2.x -> 9.3 GUI update. It was a GUI update within the 9.3-BETA train. My initial 9.3-BETA install was to a clean mirrored boot device, leaving my 9.2.1.9 untouched.)

See the bug report for gory details: https://bugs.freenas.org/issues/6810
 

pjc

Contributor
Joined
Aug 26, 2014
Messages
187
It turns out there was a ZFS bug. (See issues #6848 and #6868.) It's been fixed. Thanks, Xin!

No data loss after all.
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,526
YAY! Glad this got figured out. This one had my butt-hole puckered up a little.
 