Data corruption on USB flash drive

Status
Not open for further replies.

lumbric

Dabbler
Joined
Dec 26, 2012
Messages
10
I have been using FreeNAS for 10 months on two different servers. Both are HP ProLiant MicroServer N40L machines, and I used Corsair Flash Voyager Slider USB 3 flash drives. The first problem occurred after several months. Strange things were going on. I saw an I/O error somewhere, so I decided to reflash the FreeNAS image and start from scratch (using backups of the config) with a new USB flash drive of the same type.

One hardware error might be bad luck. But now I have discovered that the same problem seems to have occurred on both servers again. So in total there are 3 different USB flash drives with corrupt data in 10 months.

One of the two servers simply crashed and doesn't boot anymore. The other one seems to be fine, but Django crashed in a way very similar to the problem described here:

http://forums.freenas.org/threads/login-eror-8-2-beta.6621/

I checked the file, but I couldn't find anything wrong in it. It complained about a character \xf4.

I am using FreeNAS-8.3.0-RELEASE-p1-x64.
 

gpsguy

Active Member
Joined
Jan 22, 2012
Messages
4,472
I've had good luck with a Patriot flash drive (albeit USB 2.0) on my N40L.

Why did your server crash? Are you having power failures and not using a UPS? If so, that's probably what's causing the corruption. Get a UPS and configure it to shut down your servers if there's a power failure.
 

lumbric

Dabbler
Joined
Dec 26, 2012
Messages
10
Hm yes, you are probably right. Data corruption might be caused by power failures. I am not using a UPS, but now, after Googling for "freenas power failure" and "freenas power outage", I am starting to realise that using a UPS might be a good idea... Do you know what file system is used on the flash drive? What kind of database is used to store the settings? How come it gets corrupted that easily?
 

gpsguy

Active Member
Joined
Jan 22, 2012
Messages
4,472
The flash drive is formatted as UFS.

The configuration file is a SQLite database. It's located here: /data/freenas-v1.db

If you're not doing it now, consider creating a cron job that will back up the file automatically.
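
A minimal sketch of such a backup job in Python, assuming Python 3.7 or newer is available on the machine running it (the Python bundled with FreeNAS 8.3 may be older and lack `Connection.backup`). The source path is the one given above; the function name `backup_config` and the destination directory are hypothetical placeholders.

```python
import os
import sqlite3
import time


def backup_config(src="/data/freenas-v1.db", dest_dir="/mnt/tank/config-backups"):
    """Copy the SQLite settings database to a timestamped file; return its path.

    dest_dir is a made-up example location, not a FreeNAS default.
    """
    os.makedirs(dest_dir, exist_ok=True)
    stamp = time.strftime("%Y%m%d-%H%M%S")
    dest = os.path.join(dest_dir, "freenas-v1-%s.db" % stamp)
    # Open the source read-only so the backup can never modify the live config.
    source = sqlite3.connect("file:%s?mode=ro" % src, uri=True)
    target = sqlite3.connect(dest)
    try:
        # Connection.backup() copies a consistent snapshot of the database,
        # even if another process writes to it while the copy is running.
        source.backup(target)
    finally:
        target.close()
        source.close()
    return dest
```

Saved as a script that calls `backup_config()`, this could be scheduled from cron, for example once a day. Writing the copies to the data pool or to another machine, rather than to the same flash drive, keeps them out of reach of a failing boot device.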


 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,526
How come it gets corrupted that easily?

It's getting corrupted because of the unexpected shutdown. That should be prevented on servers whenever possible. So a UPS should be pretty high on your shopping list. ;)
 

lumbric

Dabbler
Joined
Dec 26, 2012
Messages
10
It's getting corrupted because of the unexpected shutdown. That should be prevented on servers whenever possible. So a UPS should be pretty high on your shopping list. ;)

Well, yes. Of course shutting down servers before unplugging the electricity is a good thing... :)
A UPS is also definitely a good thing to have. Still, it's nice if things do not get destroyed in case of a power outage. I thought that since the advent of journaling file systems, a hard power-off was not a big deal anymore.

I did not lose any data. I always had a backup of the settings (exported manually using the web interface; there is probably no need to back up the settings regularly because they do not change). But I am trying to understand what happens and why. Googling showed that I'm not the only one who is using FreeNAS without a UPS and having problems, so it might be worthwhile to collect some answers for others too, and maybe even warn users (in the FAQ) who never experienced any problems on their desktop computers after power outages.

I have not had time yet to dig that deep into FreeNAS, and I do not fully understand what exactly gets corrupted and why. As far as I can see, SQLite is quite robust against corruption. Wikipedia says* that UFS also uses a journal. And as far as I have understood so far, changes are written to the flash drive only on shutdown/reboot? I suppose that is to increase the lifetime of the flash drive, right? (Sounds like a good idea.)

So what goes wrong exactly if I pull out the power cable?

*Wikipedia: Since FreeBSD 7.0, UFS also supports filesystem journaling using the gjournal GEOM provider
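
One way to narrow this down is to ask SQLite itself whether the settings database is intact. The sketch below runs `PRAGMA integrity_check` against a copy of the file. The helper name `check_sqlite` is made up for illustration, and it assumes Python 3 on whatever machine the copy is inspected on. An empty result would suggest that the database survived and that the damage lies elsewhere on the flash drive.

```python
import sqlite3


def check_sqlite(path):
    """Return the problems SQLite reports for `path`; an empty list means 'ok'."""
    try:
        # Read-only, so the check cannot alter an already damaged file.
        conn = sqlite3.connect("file:%s?mode=ro" % path, uri=True)
    except sqlite3.Error as exc:
        return ["cannot open: %s" % exc]
    try:
        rows = conn.execute("PRAGMA integrity_check").fetchall()
    except sqlite3.DatabaseError as exc:
        # Raised, for example, when the file header itself is damaged.
        return [str(exc)]
    finally:
        conn.close()
    messages = [row[0] for row in rows]
    return [] if messages == ["ok"] else messages
```

For example, `check_sqlite("freenas-v1.db")` on a copy pulled from the flash drive returns `[]` for a healthy database and a list of error strings otherwise.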
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,526
So what goes wrong exactly if I pull out the power cable?

That, I think, is open for debate and interpretation. ZFS on Solaris was designed so that on a loss of power the worst that would happen is that any transactions in progress would roll back. ZFS on FreeBSD is based on the Solaris code, but is not a copy/paste. So there might be small differences.

From what I've seen in the forums (and I read every post), some people have problems recovering from a loss of power. Often those people also did other things wrong, like using a RAID array for ZFS, shorting their system on RAM by a large margin, not using ECC RAM, etc. So it's hard to put a finger on it and say that the loss of power was solely responsible for the problem. Many users also don't know what they are talking about and deny that some things are the way they are, despite providing logs that disagree with them.

I've had to do some hard power cycling and never had a problem. In fact, most users have had no problems with a loss of power with FreeNAS. But.. a few users do.

Also worth considering is that many users are not running server-grade parts. They aren't buying high-quality server-grade components like the hardware Solaris ran on, and are instead looking for the absolute cheapest motherboard, CPU, RAM, SATA controller, and other components they can find. That's just plain stupid if you actually value your data. Often those cheap components aren't reliable and shouldn't ever be used in a server. But they do it anyway, and often the symptom is performance and reliability issues with no smoking gun for the exact component that didn't perform as expected.
 

lumbric

Dabbler
Joined
Dec 26, 2012
Messages
10
Just to clarify: my ZFS data drives recovered perfectly without any problem.
Only the flash drive seemed to be corrupted. This happened 3 times in about 10 months with two different servers (but similar hardware and similar USB flash drives). In the beginning I blamed the hardware. But now I'm not sure anymore. Of course it could be something else too, and I'm not using really professional hardware (but neither do I when building desktop computers).
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,526
For one, comparing building desktops to servers is comparing apples to oranges. Additionally, if a desktop breaks it's not a big deal, you just replace it. But if a server goes down the consequences are far more serious. I've had plenty of people message me in IRC asking me to help them recover their data because their business will be ruined otherwise and 10 people's lives and jobs depend on this server working.

I'd say if you aren't using server-grade stuff, you shouldn't expect server-level reliability. For years computer geeks have enjoyed bending that rule. But FreeNAS isn't for those amateurs who don't want to go to the big leagues and aren't willing to buy the proper hardware. You can certainly use desktop hardware. Just don't be shocked when it doesn't go well for you.

Aside from that, you've really provided little evidence to point to any one thing as the culprit. You didn't use server-grade hardware, but that's not necessarily a deal breaker. It often is, but there's no guarantee of that. It's also possible you purchased your USB sticks from somewhere that sold you fake Corsair sticks. I'm not saying that is your problem or that I have any evidence that is the case. But it is possible. There are lots of possibilities you can consider if you really want to think about it. Unfortunately, only you can effectively determine where the fault is.

You really have provided no evidence of anything being wrong. So it's hard to help you. Can you post the actual error message you are getting? It could be that you are interpreting something as being wrong when that error is normal. Some errors on bootup are completely normal. And what version of FreeNAS are you using?
 

lumbric

Dabbler
Joined
Dec 26, 2012
Messages
10
Hm yes, you are right, I didn't describe the errors in much detail. I saw many different errors, but let's begin with the one I could investigate in the most detail: the server was up and running, SMB shares and SSH were working, but the web interface was broken.
As mentioned already in my first post, I found problems with Django in some of the logs. Also, when restarting Django, I always got the same error message:
Code:
 SyntaxError: Non-ASCII character '\xf4' in file [SOME DJANGO FILE FOR WEB INTERFACE]


I couldn't find anything wrong with the file. I suppose a single byte was corrupted (turned into a non-ASCII character) and Vim simply doesn't show it, while Python can't deal with source files containing non-ASCII characters. That's why I assume some file (or file system) corruption. Otherwise, why should a Python file change?

Before I could continue my investigation (or set up a new flash drive), the server stopped working. I got a call that the server had suddenly stopped working (meaning the SMB shares). When I arrived, I just saw "Boot:" on the screen, with no SSH and no reaction to the keyboard. Same thing after resetting and trying to boot: after the BIOS there was just "Boot:" on the screen. I replaced the USB flash drives with different ones (USB 2.0, different model) and it worked again. We will see for how long.
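
Regarding the SyntaxError above: Python 2 raises exactly this error when a source file without an encoding declaration contains any byte outside the ASCII range. A hedged sketch for locating such bytes when an editor hides them follows; the function name `find_non_ascii` is made up for illustration, and the code assumes Python 3 on the machine doing the inspection. As a side note, 0xf4 differs from ASCII `t` (0x74) by a single bit, which would be consistent with a one-bit error but does not prove it.

```python
def find_non_ascii(path):
    """Return (offset, line, column, byte) for every byte >= 0x80 in the file."""
    with open(path, "rb") as handle:
        data = handle.read()
    hits = []
    line, line_start = 1, 0
    for offset, byte in enumerate(data):
        if byte == 0x0A:  # newline: the next byte starts a new line
            line += 1
            line_start = offset + 1
        elif byte >= 0x80:
            # Column is 1-based and counted in bytes from the start of the line.
            hits.append((offset, line, offset - line_start + 1, byte))
    return hits
```

Running this on the file named in the error message, and comparing the reported position against a pristine copy of the same file from the FreeNAS image, would show whether one byte changed or a whole region of the file is damaged.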

I couldn't find any hardware issues with the USB flash drive I used. I formatted it as ext4 and copied a 200 MB file of /dev/urandom data to the flash drive repeatedly until no space was left. The MD5 sums were correct, so at least simple copy and read-back works fine.
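
A rough Python equivalent of that write-and-verify test, as a sketch only. Unlike the manual test it writes a fixed number of files rather than filling the drive, and the function name `write_and_verify` and the file naming are made up. One caveat applies to both versions: a read-back done right after writing may be served from the operating system's cache rather than from the flash cells, so unmounting and re-plugging the drive before verifying gives a more meaningful result.

```python
import hashlib
import os


def write_and_verify(directory, file_size, count, chunk=1024 * 1024):
    """Write `count` files of random bytes, read them back, return mismatching paths."""
    expected = {}
    for index in range(count):
        path = os.path.join(directory, "burnin-%04d.bin" % index)
        digest = hashlib.md5()
        remaining = file_size
        with open(path, "wb") as handle:
            while remaining > 0:
                block = os.urandom(min(chunk, remaining))
                handle.write(block)
                digest.update(block)
                remaining -= len(block)
            handle.flush()
            os.fsync(handle.fileno())  # push data out of Python and OS write buffers
        expected[path] = digest.hexdigest()

    bad = []
    for path, want in expected.items():
        digest = hashlib.md5()
        with open(path, "rb") as handle:
            # Read in chunks so large files do not need to fit in memory.
            for block in iter(lambda: handle.read(chunk), b""):
                digest.update(block)
        if digest.hexdigest() != want:
            bad.append(path)
    return bad
```

For a drive mounted at a hypothetical `/mnt/usb`, `write_and_verify("/mnt/usb", 200 * 1024 * 1024, 10)` would write ten 200 MB files and return the paths of any whose checksum no longer matches.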

If I find time, I'll set up remote logging, so I can read the last words of a dying FreeNAS server and its USB flash drive.
 