Disks from volume "disappeared", even though the system still sees them.

Status
Not open for further replies.

dmt0

Dabbler
Joined
Oct 28, 2011
Messages
47
Hi All!

Please help. I installed FreeNAS 8.02.
Created a ZFS RAIDZ1 volume from 4 identical drives.
Copied a few TB of data to the volume.
Could read all the data without issues.
Rebooted the machine, and 2 of the 4 drives are missing.
The volume status in the GUI is now "WARNING: The volume MainOne (ZFS) status is " [blank]
Space shows as: "Error getting available space", "Error getting total space"
When I click View Disks, the serial numbers of two of the disks show as "UNKNOWN".
The disks do show up in dmesg:


Oct 28 17:47:02 freenas kernel: ada0: 7279MB (14909328 512 byte sectors: 16H 63S/T 14791C)
Oct 28 17:47:02 freenas kernel: ada1 at ata2 bus 0 scbus1 target 0 lun 0
Oct 28 17:47:02 freenas kernel: ada1: <WDC WD20EARS-00MVWB0 51.0AB51> ATA-8 SATA 2.x device
Oct 28 17:47:02 freenas kernel: ada1: 150.000MB/s transfers (SATA, UDMA5, PIO 8192bytes)
Oct 28 17:47:02 freenas kernel: ada1: 1907729MB (3907029168 512 byte sectors: 16H 63S/T 16383C)
Oct 28 17:47:02 freenas kernel: ada2 at ata2 bus 0 scbus1 target 1 lun 0
Oct 28 17:47:02 freenas kernel: ada2: <WDC WD20EARS-22MVWB0 51.0AB51> ATA-8 SATA 2.x device
Oct 28 17:47:02 freenas kernel: ada2: 150.000MB/s transfers (SATA, UDMA5, PIO 8192bytes)
Oct 28 17:47:02 freenas kernel: ada2: 1907729MB (3907029168 512 byte sectors: 16H 63S/T 16383C)
Oct 28 17:47:02 freenas kernel: ada3 at ata3 bus 0 scbus2 target 0 lun 0
Oct 28 17:47:02 freenas kernel: ada3: <WDC WD20EARS-00MVWB0 51.0AB51> ATA-8 SATA 2.x device
Oct 28 17:47:02 freenas kernel: ada3: 150.000MB/s transfers (SATA, UDMA5, PIO 8192bytes)
Oct 28 17:47:02 freenas kernel: ada3: 1907729MB (3907029168 512 byte sectors: 16H 63S/T 16383C)
Oct 28 17:47:02 freenas kernel: ada4 at ata3 bus 0 scbus2 target 1 lun 0
Oct 28 17:47:02 freenas kernel: ada4: <WDC WD20EARS-00MVWB0 51.0AB51> ATA-8 SATA 2.x device
Oct 28 17:47:02 freenas kernel: ada4: 150.000MB/s transfers (SATA, UDMA5, PIO 8192bytes)
Oct 28 17:47:02 freenas kernel: ada4: 1907729MB (3907029168 512 byte sectors: 16H 63S/T 16383C)


I went to the terminal:

# zpool status
no pools available
# zpool import
...
STATE: UNAVAIL
STATUS: One or more of your devices are missing from the system.
...
and the two drives are UNAVAIL and CANNOT OPEN


So I deleted the volume, created it again, copied all the data onto it, rebooted the machine, and got exactly the same result: those two drives are missing.

The data is currently backed up, but I have to return the backup hard drive soon.
Am I bumping into a known issue? How can I bring the volume back to life and get it to survive a reboot?
Please help!
 

dmt0

One more thing that might be relevant. I see the following in the log:

Oct 28 17:47:02 freenas kernel: GEOM: raid3/MainRaidZraid3: corrupt or invalid GPT detected.
Oct 28 17:47:02 freenas kernel: GEOM: raid3/MainRaidZraid3: GPT rejected -- may not be recoverable.
 

ProtoSD

MVP
Joined
Jul 1, 2011
Messages
3,348
dmt0 said: "One more thing that might be relevant. I see the following in the log:"

I was just reading something about this not more than an hour ago, but there wasn't a solution. I don't remember what the link was. It was FreeBSD related. It was definitely related to the GPT being corrupt.

Were there any existing partitions? Can you wipe the disks? Here's the info from the FAQ on wiping.

20) I'm trying to re-use a disk from an old array and get "Error getting used space", "Operation not permitted", and some other errors, what's wrong?

This can also happen if you export your pool from the command line and the GUI thinks it's still online. Do not wipe your disk if this is what you did.

With the GPT partitioning scheme, the partition table is stored at the beginning (primary) and end (backup) of the disk. So if you don't zero out the WHOLE disk, the backup GPT table gets loaded.

To fix the above problem, run both of these commands, replacing ada1 with your disk/device name:

dd if=/dev/zero of=/dev/ada1 bs=512 count=2
dd if=/dev/zero of=/dev/ada1 bs=512 seek=sectors

sectors = the length of your disk in sectors as reported by fdisk or 'gpart show' (usually the big long number), minus 1 or 2.
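Since these commands are destructive, it may help to rehearse the offset arithmetic on a scratch image file first. The following is only a sketch: the image path, the 2048-sector "disk" and the variable names are made up for illustration, and conv=notrunc is needed only because the target is a regular file. Note that dd's seek= operand offsets the output, whereas skip= offsets the input.

```shell
# Practice run on a scratch image file, NOT on a real disk.
# IMG, SECTORS and TAIL are illustrative values, not taken from the FAQ.
IMG="${TMPDIR:-/tmp}/gpt-wipe-demo.img"
SECTORS=2048   # pretend disk length in 512-byte sectors
TAIL=2         # number of trailing sectors to zero

# Fill the image with 'X' bytes so the zeroed regions are easy to measure.
dd if=/dev/zero bs=512 count="$SECTORS" 2>/dev/null | tr '\0' 'X' > "$IMG"

# Zero the first two sectors (protective MBR and primary GPT header).
dd if=/dev/zero of="$IMG" bs=512 count=2 conv=notrunc 2>/dev/null

# Zero the last TAIL sectors, where the backup GPT header lives.
START=$((SECTORS - TAIL))
dd if=/dev/zero of="$IMG" bs=512 seek="$START" count="$TAIL" conv=notrunc 2>/dev/null

# Bytes that are still 'X': everything except the four zeroed sectors.
REMAINING=$(tr -d '\0' < "$IMG" | wc -c | tr -d ' ')
echo "$REMAINING"

rm -f "$IMG"
```

Only four sectors (2048 bytes) end up zeroed here, so the rest of the image is untouched. For the 2 TB drives in this thread, dmesg reports 3907029168 sectors, so "minus 2" would give 3907029166.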

EDIT: Found the link:

Mysterious issue involving GPT and ZFS
 

ProtoSD

I would check the BIOS settings. This may not be part of the problem, but those drives should be doing better than 150MB/s. The "UDMA5, PIO 8192bytes" part should be different too, though I don't know what choices your BIOS offers. The specs for the motherboard say:
4 SATAII ports with transfer rate up to 3Gb/s.
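
For reference, the 150MB/s figure in dmesg lines up with a first-generation 1.5Gb/s SATA link, while a 3Gb/s port should allow about twice that. A quick sketch of the arithmetic, assuming the usual 8b/10b line encoding (10 line bits per data byte); the function name is made up for illustration:

```shell
# Usable MB/s of a SATA link, given its line rate in Mbit/s.
# With 8b/10b encoding every data byte costs 10 bits on the wire.
sata_mb_per_s() {
    echo $(( $1 / 10 ))
}

sata_mb_per_s 1500   # SATA 1.5Gb/s link
sata_mb_per_s 3000   # SATA 3Gb/s link
```

These are link ceilings, not what a single WD Green will sustain, but a drive negotiating the lower rate hints that the controller is running in a legacy mode.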
 

dmt0

ProtoSD said: "Were there any existing partitions? Can you wipe the disks? Here's the info from the FAQ on wiping."

Here's the deal. All the drives were bought new and have never been used except in this FreeNAS setup.
Initially I had 3 drives in RAIDZ1 with FreeNAS 8.0. Then I decided to add the fourth drive, in the process deleting the original pool and doing a fresh install of 8.02.
The drives are WD Green 2 TB.
The two drives that "disappeared" are one from the initial setup and the new one.
Both times the same ones have "disappeared". I'm really curious as to why specifically these two!
I'm not sure if they come partitioned from the store. I kind of assumed that FreeNAS will just 0 them out by default. Does it?

Now, do you think there is any chance to restore the thing without me having to wipe out all the drives? It took days to copy the 3 TB of stuff onto them (and I've done that twice), so it would be really awesome if that could be avoided.
 

ProtoSD

dmt0 said: "I kind of assumed that FreeNAS will just 0 them out by default. Does it?"
NO

dmt0 said: "...is there any chance to restore the thing without me having to wipe out all the drives?"

Not that I'm aware of. There *might* be some trick you can do with gpart to fix the partition table, but unless you know what caused it, it could just happen again. Did you read the link I found/posted above? It seems related, although it doesn't have any clear answers.

Can you do a 'gpart show' and post the results here?

I still think there's something suspicious about the BIOS settings for the disks.
 

dmt0

ProtoSD said: "Not that I'm aware of, there *might* be some trick you can do with gpart to fix the partition table, but unless you know what caused it, it could just happen again. Did you read the link I found/posted above?"
Yes, I did skim through those posts. I didn't know what GPT was until today, so I'd need a lot more background to understand what is being said. And as you mentioned, there doesn't seem to be any solution there, except for zeroing out the drives.
ProtoSD said: "Can you do a 'gpart show' and post the results here?"
Yes:
=>        6  732558325  da0  GPT  (2.7T)
          6    4194304    1  freebsd-swap  (16G)
    4194310  728364021    2  freebsd-ufs  (2.7T)

=>      63  14909265  ada0  MBR  (7.1G)
        63   1930257     1  freebsd  [active]  (943M)
   1930320        63        - free -  (32K)
   1930383   1930257     2  freebsd  (943M)
   3860640      3024     3  freebsd  (1.5M)
   3863664     41328     4  freebsd  (20M)
   3904992  11004336        - free -  (5.2G)

=>        34  3907029101  ada4  GPT  (1.8T)
          34          94        - free -  (47K)
         128     4194304     1  freebsd-swap  (2.0G)
     4194432  3902834703     2  freebsd-zfs  (1.8T)

=>      0  1930257  ada0s1  BSD  (943M)
        0       16          - free -  (8.0K)
       16  1930241       1  !0  (943M)

Having posted that, I have to give you the whole story. ada1 and ada3 are the "disappeared" drives. ada2 does not show up because I've just replaced it with a new one (the old one had unreadable blocks and stuff). Just as I got ready to resilver it, I found that the other two drives are not there. That's the ada2 story.

ProtoSD said: "I still think there's something suspicious about the BIOS settings for the disks."
I've examined the BIOS options. Everything that concerns drives is set to AUTO, and the mobo sets everything to the highest available value.
One interesting thing that's probably unrelated: 2 of the drives were set as the second and third boot devices. The BIOS screen doesn't show enough info to figure out whether those were the same two "disappeared" drives...
 

ProtoSD

Wow, that's pretty interesting, and it contradicts the dmesg output. ada1 and ada3 should be there if dmesg is showing them. Try doing a 'camcontrol devlist' and see which drives are there.
 

dmt0

# camcontrol devlist
<CF Card Ver5.04> at scbus0 target 1 lun 0 (ada0,pass0)
<WDC WD20EARS-00MVWB0 51.0AB51> at scbus1 target 0 lun 0 (ada1,pass1)
<WDC WD20EARS-22MVWB0 51.0AB51> at scbus1 target 1 lun 0 (ada2,pass2)
<WDC WD20EARS-00MVWB0 51.0AB51> at scbus2 target 0 lun 0 (ada3,pass3)
<WDC WD20EARS-00MVWB0 51.0AB51> at scbus2 target 1 lun 0 (ada4,pass4)
<WD My Book 1130 1012> at scbus3 target 0 lun 0 (da0,pass5)


Everything is there.
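
To make the mismatch explicit, the two listings can be compared mechanically. This is just a rough sketch: the helper names are invented, the parsing assumes the "(adaN,passN)" ordering shown above, and only the header lines of the earlier 'gpart show' output are pasted in as sample data.

```shell
# Device names from `camcontrol devlist` text on stdin, e.g. "(ada1,pass1)" -> ada1.
cam_devices() {
    sed -n 's/.*(\([a-z]*[0-9]*\),pass[0-9]*).*/\1/p'
}

# Provider names from `gpart show` text on stdin: 4th field of each "=>" line.
gpart_devices() {
    awk '$1 == "=>" { print $4 }'
}

# Print every name from the first list that does not appear in the second.
missing_from_gpart() {
    for dev in $1; do
        found=no
        for g in $2; do
            if [ "$dev" = "$g" ]; then found=yes; fi
        done
        if [ "$found" = no ]; then echo "$dev"; fi
    done
}

cam_sample='<CF Card Ver5.04> at scbus0 target 1 lun 0 (ada0,pass0)
<WDC WD20EARS-00MVWB0 51.0AB51> at scbus1 target 0 lun 0 (ada1,pass1)
<WDC WD20EARS-22MVWB0 51.0AB51> at scbus1 target 1 lun 0 (ada2,pass2)
<WDC WD20EARS-00MVWB0 51.0AB51> at scbus2 target 0 lun 0 (ada3,pass3)
<WDC WD20EARS-00MVWB0 51.0AB51> at scbus2 target 1 lun 0 (ada4,pass4)
<WD My Book 1130 1012> at scbus3 target 0 lun 0 (da0,pass5)'

gpart_sample='=> 6 732558325 da0 GPT (2.7T)
=> 63 14909265 ada0 MBR (7.1G)
=> 34 3907029101 ada4 GPT (1.8T)
=> 0 1930257 ada0s1 BSD (943M)'

cam=$(printf '%s\n' "$cam_sample" | cam_devices)
gp=$(printf '%s\n' "$gpart_sample" | gpart_devices)
missing=$(echo $(missing_from_gpart "$cam" "$gp"))
echo "$missing"
```

With the sample data above this lists ada1, ada2 and ada3: the kernel sees those disks, but GEOM has no partition table for them.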
 

ProtoSD

Well, it seems like those 2 disks have completely lost their partitioning information. I'm kind of stumped. The only thing I would be curious to try is either booting from a Linux Live CD like Ubuntu and seeing what partition info it can find on those disks, or taking the 'disappeared' disks, putting them in another system, and seeing if you can find any partition info there. If you find something, that still doesn't explain why FreeNAS isn't seeing them, especially after they already worked, but it might give some sort of clue.
 

dmt0

So I booted a GParted Live USB. It sees GPT partition tables on all three drives, and two partitions on each: one of 2GB and one of 1.82TB.
The partition label on the second partition is also there, just as I set it during installation.
 

ProtoSD

Even more weird....

dmt0 said: "It sees GPT partitions tables on all three drives"

Shouldn't there be 4 drives?

dmt0 said: "and two partitions on each - a 2GB, and 1.82TB."

Those would be the swap partition, and the ZFS partition.

I'm tempted to move this over to the Bugs section; it just doesn't make sense. You did a great job collecting all the info, so let's see if anyone else has any ideas.
 

ProtoSD

I found this link. It would be interesting to see what it reports, since gpart show doesn't display any info for the missing disks. I'm still interested to know if you meant 4 drives above.

HOWTO/LILO-crash-rescue-HOWTO/disk_partition_rescue

run "gpart /dev/<your disk>", e.g. "gpart /dev/hdc". Without any options, gpart performs a standard scan, and merely looks if it can guess a consistent primary partition table.

....after the check-phase it says Ok, you should check the proposed partition table very carefully. After that you may write back the guessed table by calling "gpart -W /dev/hdc /dev/hdc" (exchange /dev/hdc with your disk device). When gpart has successfully written the new primary partition table, cross your fingers and reboot....
 

dmt0

ProtoSD said: "Shouldn't there be 4 drives?"
The fourth drive is the replacement I just bought for a failed one, and it was meant to be resilvered. So it never got formatted.

Again, full story:
ada1, ada3 disappeared
ada2 just got replaced with a brand new one
ada4 the only one still alive
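
Put as a rule of thumb, a raidzN vdev carries N disks' worth of parity, so RAIDZ1 can ride out one unavailable member but not three. A tiny sketch of that reasoning, with a made-up function name and informal state labels rather than exact zpool output:

```shell
# usage: raidz_outlook PARITY_LEVEL UNAVAILABLE_MEMBERS
# A raidz vdev stays readable while unavailable members <= parity level.
raidz_outlook() {
    if [ "$2" -eq 0 ]; then
        echo healthy
    elif [ "$2" -le "$1" ]; then
        echo degraded
    else
        echo unavailable
    fi
}

raidz_outlook 1 1   # only ada2 swapped out: still importable, needs a resilver
raidz_outlook 1 3   # ada1 and ada3 missing plus blank ada2: cannot be imported
```

That is consistent with 'zpool import' reporting UNAVAIL: until ada1 and ada3 are readable again, there is nothing to resilver the new ada2 from.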
 

dmt0

From the same page:
"Currently the following filesystem types are known to gpart (listed by module names) : beos, bsddl, ext2, fat, hmlvm, lswap, minix, ntfs, qnx4, rfs, s86dl, xfs."
My version of gpart told me it doesn't know what the partitions are. It doesn't know ZFS. So I don't feel very lucky doing that one.
When I run gpart on FreeNAS, I can't run it without options to perform a standard scan. Can it do the same thing? Would you have the command handy?
 

ProtoSD

dmt0 said: "When I run gpart on FreeNAS, I can't run it without options to perform standard scan. Can it do the same thing? Would you have the command handy?"

gpart /dev/ada1 or /dev/ada3
 

dmt0

# gpart /dev/ada1
gpart: Unknown command: /dev/ada1

# gpart show /dev/ada1
no such geom /dev/ada1

hmm...
 

ProtoSD

It seems like it has to be the disk controller. The only thing left I can think of is to try another disk controller if you can.

EDIT: If you do that, you shouldn't need to recreate your volume. If it works, the volume should be importable, which is one of the advantages of ZFS.
 