Trouble with auto-import

Status
Not open for further replies.

Starpulkka

Contributor
Joined
Apr 9, 2013
Messages
179
Well, I'm in almost the same situation as you, except I can mount my pool normally and fetch data fine. I had one drive which kept disappearing (six times, with six resilvers) until FreeNAS got fed up with that drive and finally put it into a failed state. If I had raidz1, I would be sweating now.

I am a little curious: can you test whether the pool mounts normally if you leave that failing SATA port empty? Try this only after you have copied your data safely to some other hdd. If the pool mounts normally in the GUI, then the other hdds are fine; if it does not mount, then the other hdds might have problems and need thorough testing.
Edit: Oh, and one last thing: I have been taught to export the pool and then import it on the new machine if you know you are going to move it to another machine (I have done this 3 times). But I guess no one recommends it, because 99% of people would destroy their data trying the export, through human error.
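For reference, the export/import dance is usually something like the sketch below. This is only a rough outline, and the pool name "tank" is a placeholder for your own:

zpool export tank    # on the old machine: cleanly unmounts the pool and releases it
zpool import         # on the new machine: lists pools that are visible and importable
zpool import tank    # imports the pool by name once it shows up in that list

In FreeNAS itself the GUI auto-import is the supported path, so treat the commands above as what happens underneath rather than a recipe to follow blindly.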
 

SweetAndLow

Sweet'NASty
Joined
Nov 6, 2013
Messages
6,421
I think all of your issues come from not fully meeting the hardware requirements and from poor maintenance.

1. Overheating
2. Not running SMART tests
3. No ECC RAM
4. Not enough RAM
5. Not following the drive replacement procedure

And in my opinion, every problem you have had, you have only half-fixed. You never rebuilt your pool so you could have full protection again, you rebuilt your server but without ECC RAM, and you tried to reuse the same bad drive that FreeNAS said had failed.

You keep calling ZFS volatile, but when I list out the things you have failed to do, it's obvious the blame lies elsewhere. I suggest copying your data off to two places if you can. Currently it sounds like you are copying it to a ZFS format. I would do that as well as copy to another filesystem like NTFS or ext. That way, if ZFS turns out to be too complicated, you have a backup in a format you can plug into Linux or Windows and access your data. Good luck!
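If it helps, a second copy onto a non-ZFS disk can be as simple as an rsync run from the FreeNAS shell. A minimal sketch, assuming a hypothetical dataset at /mnt/tank/data and a second drive already formatted and mounted at /mnt/backup:

rsync -avh /mnt/tank/data/ /mnt/backup/data/    # -a preserves permissions and timestamps, -v is verbose, -h prints human-readable sizes

Running the same command a second time is a cheap sanity check: a clean pass that copies nothing suggests the first pass completed.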
 

titan_rw

Guru
Joined
Sep 1, 2012
Messages
586
I slightly disagree with cyberjock here. If drives are in good health prior to storage, there should be no issue with turning them off and letting them sit, even for long periods of time. Magnetic degradation isn't going to be an issue in only 18 months; maybe after 18 years. I recently powered up an old IDE drive last used in 2002 or so, and everything on it was still perfectly accessible.

The key point is that the drive was in good health when it went into hibernation. We don't really know what the state of health of these drives was when they were shut off, unless you saved smartctl -a output from them at the time.
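Saving that output is a one-liner per drive, and the snapshot is handy to compare against later. A sketch, where the device name ada0 and the output path are placeholders (camcontrol devlist will show your actual devices):

smartctl -a /dev/ada0 > /mnt/tank/smart_ada0_before_shutdown.txt    # full SMART attributes, error log and self-test log for one drive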

But we do know that the drives are not in perfect health now, as cyberjock points out. If my math is right, one of the drives reached a temperature of 56C at one point. That's extremely hot. They haven't had SMART tests run regularly, even when they were powered on.

Metadata corruption can mean part of the pool won't be accessible, so I'd definitely copy off the important stuff first, then start a more global copy and see what happens. If I remember right, all pool metadata is stored in triplicate, independent of any raidz redundancy, so something has to go quite wrong to get problems with metadata.
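One hedged way to gauge how bad it is before copying: zpool status -v lists any files the pool already knows have permanent errors, which can help prioritize what to copy first ("tank" below is a placeholder pool name):

zpool status -v tank    # -v prints files with known permanent errors, plus per-device read/write/checksum error counts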
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,526
My concern isn't with magnetic degradation. My concern is with the electronics, the motor, the bearings. Stuff that has moving parts doesn't like to sit idle for long periods of time.
 

Tired_

Dabbler
Joined
Feb 17, 2013
Messages
29
I'd like to address these individually.

I think all of your issues come from not fully meeting the hardware requirements and from poor maintenance.

1. Overheating

Yes, overheating was my initial problem. I attempted to resolve this problem by purchasing a proper server case. I ended up getting this one: http://www.newegg.com/Product/Product.aspx?Item=N82E16811147164 I honestly don't know what more could be done. How do you cool your drives?

2. Not running SMART tests

That's valid. In the four months or so that it ran in the first place, I was still pretty new to FreeNAS, and I didn't know how to do that.

3. No ECC RAM

That, too, is valid. I understand the benefits of ECC RAM, but I had to make a tradeoff in the cost department. I am not a wealthy software developer; I am a disabled man unable to work. Is ECC a hard-and-fast requirement? If so, why doesn't the installer block the install when it is not present? Are there any other users who do not use ECC, or am I the only one?

4. Not enough RAM

I went to 8GB. I'm pretty sure that's what was recommended. What am I missing there?

5. Not following the drive replacement procedure

How do you propose I follow that? It requires an imported pool.
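(For context, the replacement procedure being referenced runs against an imported, online pool; in the FreeNAS GUI it is typically done from Volume Status via the Replace action. A rough CLI sketch of the same idea, with a placeholder pool name and placeholder device labels, would be:

zpool offline tank gptid/OLD_LABEL                  # take the failed member offline
zpool replace tank gptid/OLD_LABEL gptid/NEW_LABEL  # resilver onto the replacement disk
zpool status tank                                   # watch resilver progress

None of that is possible until the pool imports, which is exactly the sticking point here.)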

And in my opinion, every problem you have had, you have only half-fixed. You never rebuilt your pool so you could have full protection again, you rebuilt your server but without ECC RAM, and you tried to reuse the same bad drive that FreeNAS said had failed.

That drive was wiped by a destructive SeaTools test (which it passed). From my reading, a transient error can be resilvered over. Are you saying that the drive is somehow 'contaminated' by having been wiped by SeaTools? And I very much wanted to rebuild my pool. The point of getting it working was so I could copy the data off of it onto an ext volume, then blow it away and rebuild with new drives and the remaining old ones that were healthy. We haven't gotten that far yet. Your comment is like complaining that the cake is too runny before the baker has even baked it.

You keep calling ZFS volatile, but when I list out the things you have failed to do, it's obvious the blame lies elsewhere. I suggest copying your data off to two places if you can.

The data is gone. I can make the pool mount with the commands I was given earlier, but trying to enter a directory causes the system to hang.
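(For anyone following along: the usual low-risk way to attempt this kind of salvage mount is a read-only import. I can't say whether that matches the commands referenced above, but as a sketch, with "tank" as a placeholder pool name:

zpool import -o readonly=on -R /mnt -f tank    # read-only, mounted under /mnt; -f may be needed if the pool looks like it was last used elsewhere

Read-only keeps anything further from being written to suspect disks while data is copied off.)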

Currently it sounds like you are copying it to a ZFS format. I would do that as well as copy to another filesystem like NTFS or ext. That way, if ZFS turns out to be too complicated, you have a backup in a format you can plug into Linux or Windows and access your data. Good luck!

I don't know where you got that idea from. I was attempting to copy it to a brand new 3TB drive I had formatted to ext. That, of course, didn't work since I couldn't get into the directories.

The reason I keep calling ZFS volatile is because that is what CyberJock told me.
CyberJock said:
I'd never have let the box sit unpowered for 18 months with no diagnostics and expect the box to still have my data safe and sound.
CyberJock said:
leaving it unpowered for 18 months was basically abandoning your data.
I don't see how that can be interpreted any other way. The act of doing nothing at all cost me all my data...that's volatility. How would you describe a filesystem that can't handle sitting on a shelf for 18 months?


For background, consider reading this thread from the beginning, and the original thread, located here: http://forums.freenas.org/index.php?threads/i-messed-up-my-array-and-i-dont-know-what-to-do.11392/
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,526
I don't see how that can be interpreted any other way. The act of doing nothing at all cost me all my data...that's volatility. How would you describe a filesystem that can't handle sitting on a shelf for 18 months?

For background, consider reading this thread from the beginning, and the original thread, located here: http://forums.freenas.org/index.php?threads/i-messed-up-my-array-and-i-dont-know-what-to-do.11392/

I looked at that thread, and it doesn't look like ZFS is to blame. In fact, if you look at that thread, paleoN basically told you what you did...

STOP, and fix your temperature problem. At least two, most likely more, drives exceeded their maximum operating temperatures in the past. This can damage the drives.

You got the drives so hot that they exceeded their rated operating temperature. No joke, that's beyond warranty limits: if you had filed a warranty claim it would have been rejected for "abuse of the hardware". I'm not even kidding.

I do *not* see ZFS being volatile, or your simply shelving the system for 18 months, as the problem. I see drives that were overheated to the extreme. In my time on this forum I've only seen temperatures exceed the warranty limit twice, and you are #2. Both cases caused catastrophic data loss. Am I even remotely surprised at your end result? Not one bit.

Whether you had used ZFS or any other setup, the result would likely have been the same. Heat causes magnetic domains to randomize, so it's not surprising you've lost data. In fact, there's a good chance that drives overheated to the extent we are discussing will never reliably store large quantities of data again.

What you should take away from this is to set up SMART monitoring and SMART testing just like I describe in my noobie guide. You'd have known something was wrong before you literally cooked the data out of your drives.
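For anyone reading later, the mechanics are simple. FreeNAS can schedule these tests from the GUI (under Tasks), or you can kick them off by hand from a shell; the device name below is a placeholder:

smartctl -t short /dev/ada0    # quick electrical/mechanical self-test, a few minutes
smartctl -t long /dev/ada0     # full surface scan, several hours on a large drive
smartctl -a /dev/ada0          # afterwards, check the self-test log plus temperature and reallocated-sector attributes

A weekly short test, a monthly long test, and email alerts on failure are a common baseline.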

I am sorry for your loss.
 

Tired_

Dabbler
Joined
Feb 17, 2013
Messages
29
Your story keeps changing, even when confronted with your own direct quote. It is clear to me that you have no idea what you are talking about. I wish I hadn't gotten involved with this project at all...I was banking on the good name of FreeNAS from days past, but it is clear that the name is the only thing that's still here from before. Good luck to you all, I'm off to find a real solution.
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,526
My story hasn't changed, but you are failing to recognize the bigger problems.

Good luck. I'm sure you are done with this project, and that's probably for the best. Not that the project is to blame, but people who can't recognize their own mistakes are likely to lose their data with operating systems that are this powerful and that require you to take responsibility for your own actions.
 