Perplexed by very quick failure of 2 flash drives

Status
Not open for further replies.

hansmuff

Dabbler
Joined
Mar 13, 2015
Messages
22
I put together my FreeNAS machine about 5 weeks ago and it's been great.

HW:
Supermicro X10SL7-F latest BIOS
Xeon 1231 v3
32GB Kingston ECC
(4TB Hitachi drives as the pool)

Boot drives:
2x CORSAIR Flash Voyager GO 16GB USB 3.0 OTG Flash Drive Model CMFVG-16GB-NA
Plugged into USB 2.0 ports

SW:
FreeNAS 9.3 STABLE all updates

After the latest round of updates, I received a warning:
"Boot Volume Condition: DEGRADED One or more devices has experienced an unrecoverable error. An attempt was made to correct the error. Applications are unaffected."

The pool is fine, but the boot drives report the following:
da4p2: Read 0 Write 0 Checksum 55
da3p2: Read 0 Write 0 Checksum 25

I did a "Boot scrub" but the results haven't changed.
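For anyone who wants to pull those per-device counters out of `zpool status` text rather than reading them off the GUI, here is a minimal Python sketch. The sample text is made up to match the numbers reported above (it was not captured from this system), and the row layout is an assumption; real output may abbreviate large counts (for example "1.2K"), which this simple pattern would skip.

```python
import re

# Illustrative text shaped like the config section of `zpool status`.
# The counters mirror the numbers reported above; everything else is assumed.
SAMPLE_STATUS = """\
  pool: freenas-boot
 state: DEGRADED
config:

        NAME          STATE     READ WRITE CKSUM
        freenas-boot  DEGRADED     0     0     0
          mirror-0    DEGRADED     0     0     0
            da3p2     DEGRADED     0     0    25
            da4p2     DEGRADED     0     0    55
"""

# name, state, then three plain integer counters
ROW = re.compile(r"^\s*(\S+)\s+([A-Z]+)\s+(\d+)\s+(\d+)\s+(\d+)\s*$")


def devices_with_errors(status_text):
    """Return {name: (read, write, cksum)} for rows with a non-zero counter."""
    found = {}
    for line in status_text.splitlines():
        match = ROW.match(line)
        if not match:
            continue  # header, blank, or non-table line
        name, _state, read, write, cksum = match.groups()
        counters = (int(read), int(write), int(cksum))
        if any(counters):
            found[name] = counters
    return found


print(devices_with_errors(SAMPLE_STATUS))
```

With the sample above, only da3p2 and da4p2 are reported, since the pool and mirror rows carry all-zero counters.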

I already ordered two new flash drives, this time Lexar ones that people seem to really like.
I am a little puzzled, however: I know flash drives in this price range (~$12 each) are not high-quality media that last forever, but I didn't expect both to show errors after such a short time.

Are these checksum errors indicative of anything else I need to be looking at? The motherboard is stock configuration, no tinkering or messing with timings was done.

Also, in this configuration, should I replace the drives one by one and resilver the new ones one at a time, or should I do a complete re-install and restore from the saved config? Thanks!
 
dlavigne

Guest
Personally, I'd do a fresh install of the latest STABLE. It sounds like you had a mirrored boot pool and you can select both during install time to create a new mirror. Resilvering is hard on devices, especially USB ones.
 

Ericloewe

Server Wrangler
Moderator
Joined
Feb 15, 2014
Messages
20,194
That's odd, Corsair drives are usually good.
 

DrKK

FreeNAS Generalissimo
Joined
Oct 15, 2013
Messages
3,630
This is becoming more and more of a thing. I have built a few systems in the past 4 months where the thumb drives began throwing checksum errors in the first 2 weeks. There is only one type of thumb drive that has not done that for me: the SanDisk Cruzer Fit. The worst, by far, were the Kingston Micro DTs, where I had a 100% failure rate across something like 8 devices sourced from 3 vendors.

I suspect that many/most of us had corrupted USB drives for long periods of time before we had ZFS on the boot pool to detect it. I think all of these devices (cheap-ass consumer-grade USB thumb drives) are much crappier than people realize.
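To illustrate why a checksumming mirror surfaces damage that used to go unnoticed, here is a toy Python model. It is only a sketch of the general idea (store a checksum, verify every copy against it, repair a bad copy from a good one); it is not how ZFS is implemented, and SHA-256 is just a stand-in hash chosen for the example.

```python
import hashlib


def checksum(data: bytes) -> str:
    """Stand-in checksum for the toy model (SHA-256 chosen arbitrarily)."""
    return hashlib.sha256(data).hexdigest()


def scrub(copies, expected):
    """Verify each mirror copy against the stored checksum.

    Returns (errors, repaired): errors[i] is True when copy i failed
    verification; a failed copy is replaced by a good one if any exists.
    """
    errors = [checksum(copy) != expected for copy in copies]
    good = next((c for c, bad in zip(copies, errors) if not bad), None)
    repaired = [
        good if bad and good is not None else copy
        for copy, bad in zip(copies, errors)
    ]
    return errors, repaired


block = b"boot environment data"
stored = checksum(block)

# The second copy has been silently corrupted on its device.
mirror = [block, b"boot environment dat\x00"]

errors, mirror = scrub(mirror, stored)
print(errors)   # which copies failed verification
print(mirror)   # copies after repair from the good side
```

Without the stored checksum there would be nothing to compare against, so the corrupted copy could be read back as if it were fine.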
 

hansmuff

Dabbler
Joined
Mar 13, 2015
Messages
22
So I can report that the replacement of the boot flash was easy as pie and works great!

Put the new drive into an empty USB port, went to System>Boot>Status, then clicked on faulty drive #1 and clicked the "Replace" button. Up popped the new USB device as a candidate and I let it do its business. Now the boot pool shows up as ONLINE, the old flash drive no longer appears but the new one does. All I have to do is pull the old one out and I'm done.
Then rinse and repeat with faulty drive #2.

That's really pretty cool
 

DrKK

FreeNAS Generalissimo
Joined
Oct 15, 2013
Messages
3,630
I would caution you, sir, that it may not be quite that simple. You may have to grub-install the bootable partitions. Please have a look at this:

https://bugs.freenas.org/issues/6993#note-20
 

hansmuff

Dabbler
Joined
Mar 13, 2015
Messages
22
Thank you, DrKK. I decided that's all too much trouble, re-installed from scratch with the 2 new drives in mirror and then uploaded the saved config.
Worked like a dream.

Now to see how long those Lexar drives live without any errors...
 

DrKK

FreeNAS Generalissimo
Joined
Oct 15, 2013
Messages
3,630
It is my pleasure to assist you, sir.
 

hansmuff

Dabbler
Joined
Mar 13, 2015
Messages
22
I thought I'd give an update. The Lexar drives have worked very well for over a month now, both still healthy.

The model is S-73 (LJDS73-16GASBNA).
 

joeschmuck

Old Man
Moderator
Joined
May 28, 2011
Messages
10,996
I still question the whole idea of mirrored USB flash sticks as boot devices. With as much as FreeNAS 9.3 writes to the boot device these days, I still feel that a small SSD (any SSD that costs less than $30 USD) is a better match and capable of handling the write load. And it's not about faster boot times because, in reality, boot times are not much faster with an SSD. Of course, this is just my opinion, and you know everyone has one, like ....

My flash drive is a Patriot 8GB that I've had for a long time and re-purposed for my NAS. Well, it's one of two flash drives I use for my NAS. I also have a SanDisk Cruzer 8GB, but it has an older version of FreeNAS on it right now; I'll likely upgrade it to FreeNAS 10 Beta when it comes out. Sometimes I also use an older Adata PD7 4GB flash drive; it's fast and very reliable. Most of today's stuff is junk.

Glad the Lexar drives are holding their own.
 

DrKK

FreeNAS Generalissimo
Joined
Oct 15, 2013
Messages
3,630
Joe,

I just checked the past 48 hours of use. Aside from directly around boot-time, there have been precisely zero bytes written to my boot flash drives. Zero. Bytes. I'm not sure what you mean by lamenting "how much FreeNAS 9.3 writes to the boot device". If you take the reporting db off the boot drives, which I thought everyone did, I think you'll have almost no activity whatsoever on the boot devices, read OR write.

Am I missing something?
 

joeschmuck

Old Man
Moderator
Joined
May 28, 2011
Messages
10,996
Maybe I'm missing something; I doubt it's you. So if I look at my System Dataset and it's on my pool, then I should have nothing reading/writing to my boot device? I thought routine maintenance occurred each night at 1 AM and maybe 3 AM? I have several read operations but, more importantly, some write operations during those hours. They are not huge (~800 bytes to 5 KB). I do have other read/write operations throughout the day; I didn't notice a pattern, but I didn't look hard either. Even a small write operation reduces the life of the USB flash drive because it's at least a physical 4K block, if not more. And of course there's the weekly scrub. I am running the current version of FreeNAS. If, besides the weekly scrub, I shouldn't have any other read/write operations, let me know and I'll have to investigate this further, as I must be doing something wrong.
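As a rough illustration of the "at least a physical 4K block" point, here is a small Python sketch that rounds each logical write up to whole pages. The 4096-byte page size is simply the figure from the post; actual flash page and erase-block sizes vary by device and are often larger, so treat the numbers as a lower bound under that assumption.

```python
import math

PAGE_BYTES = 4096  # the "4K block" from the post; real flash pages may differ


def physical_bytes(logical_bytes, page_bytes=PAGE_BYTES):
    """Bytes the device must program if writes are rounded up to whole pages."""
    return math.ceil(logical_bytes / page_bytes) * page_bytes


# The two write sizes mentioned in the post: ~800 bytes and ~5 KB.
for size in (800, 5 * 1024):
    written = physical_bytes(size)
    print(f"{size} B logical -> {written} B physical ({written / size:.1f}x)")
```

So an 800-byte write costs one full page, while a 5 KB write spills into a second page.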
 

DrKK

FreeNAS Generalissimo
Joined
Oct 15, 2013
Messages
3,630
Well, having looked a little closer, I see about 95% of 30-minute windows have no reads or writes. Very occasionally, in the early morning hours, there are very inconsequential writes and/or reads, most of the time of single 4K pages or less.

Scrubbing, assuming no errors on the drive, is purely read-only.

To my mind, the total wear and tear per week on these thumb drives is less than you'd have transferring one picture per month to them from a camera, and obviously, they should withstand way more than that.
 

joeschmuck

Old Man
Moderator
Joined
May 28, 2011
Messages
10,996
So I wasn't totally crazy. I agree that the amount of data being written shouldn't shorten the life of the USB Flash drive much, yet many of the drives out there don't last very long in a FreeNAS environment.

I still like the idea of an SSD instead of a USB flash drive, but maybe my argument doesn't hold water.
 

fta

Contributor
Joined
Apr 6, 2015
Messages
148
If RRDTool is to be believed, I see 20-40MB of writes per week to the boot drive. On weeks where a system update was applied, I see 0.5-4GB of writes. System dataset, syslog, and RRD are all sent to my pool.
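To put those figures in perspective, here is a back-of-the-envelope Python sketch that turns the weekly numbers into a yearly total and compares it with the capacity of a 16GB boot drive like the ones in the first post. The number of update weeks per year is a hypothetical value picked for illustration, decimal units are assumed, and the result says nothing about write amplification or how a given controller spreads writes across the flash.

```python
MB, GB = 10**6, 10**9
DRIVE_BYTES = 16 * GB  # 16GB boot drives from the first post (decimal GB assumed)

UPDATE_WEEKS = 12  # hypothetical: roughly one system update per month


def yearly_writes(normal_week, update_week, update_weeks_per_year):
    """Total bytes written over 52 weeks, given per-week write volumes."""
    normal_weeks = 52 - update_weeks_per_year
    return normal_weeks * normal_week + update_weeks_per_year * update_week


# Low and high ends of the ranges quoted above.
low = yearly_writes(20 * MB, 500 * MB, UPDATE_WEEKS)
high = yearly_writes(40 * MB, 4 * GB, UPDATE_WEEKS)

for label, total in (("low", low), ("high", high)):
    print(f"{label}: {total / GB:.1f} GB/year, "
          f"{total / DRIVE_BYTES:.2f} full-drive writes/year")
```

Under these assumptions the update weeks dominate the total; the routine 20-40MB weeks add up to under 2 GB a year.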
 

mjws00

Guru
Joined
Jul 25, 2014
Messages
798
Heh. Chuckled when I saw that one.
 

Ericloewe

Server Wrangler
Moderator
Joined
Feb 15, 2014
Messages
20,194
How in the light of the above discussion should I view the following enhancement that is coming to us shortly?
It turns out we DO want to allow the .system dataset to live on the freenas-boot pool
https://bugs.freenas.org/issues/9353
I understand it to be optional, intended for those who use SSDs as the boot devices and as a way around some problems with TrueNAS' HA features.
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
joeschmuck said:
So I wasn't totally crazy. I agree that the amount of data being written shouldn't shorten the life of the USB Flash drive much, yet many of the drives out there don't last very long in a FreeNAS environment.

I still like the idea of an SSD instead of a USB flash drive, but maybe my argument doesn't hold water.

Hell, I've been saying this for years now, and people kept telling me I was wrong. But now, with ZFS exercising the boot device a little more heavily, more people are seeing the issue more quickly.

One of the reasons I like virtualization is that the FreeNAS boot device sits on a RAID-backed redundant datastore.
 

solarisguy

Guru
Joined
Apr 4, 2014
Messages
1,125
I know that is not what is written, but my eyes keep reading

I like virtualization since ZFS pool devices sit on a RAID-backed redundant datastore :smile:
 