Eating boot drives

Urumiko

Dabbler
Joined
Jul 2, 2017
Messages
32
I've been using FreeNAS/TrueNAS for a few years now. I run it on an HP Gen 9 microserver with 16 GB of RAM, and use it as a basic file server with a Plex jail.
It sees very light use, with maybe 1-2 evenings of Plex viewing a week.

Back in the day I had the boot drive on USB sticks and they would fail within a month. My last drive (a used Samsung SSD) seemed to last a fair while by these standards (I think I may have disabled swap on it; not sure if this would have helped). When it failed recently I thought I'd buy a new cheap SSD and give that a go, and it has failed within about 2 weeks.

I appreciate a good-quality enterprise SSD is probably the way to go, but come on. Unless this SSD is a dodgy Chinese knock-off with a USB key inside, that seems a bit extreme.
Are there any steps I can take to stop it eating drives like there's no tomorrow, or at least wear them evenly?
Are there any preferred models of drive?
How long should a cheaper reputable-brand desktop drive like Samsung or Intel last?
 

Jailer

Not strong, but bad
Joined
Sep 12, 2014
Messages
4,977
How long should a cheaper reputable-brand desktop drive like Samsung or Intel last?
As long as, if not longer than, the useful life of your server. I've got a 16 GB SATA DOM that's been in service for about 7 years without a hiccup.

Maybe try picking up a used enterprise SSD off eBay. I've got an 80 GB Intel SSD in another server that I picked up for around 10 bucks.
 

Whattteva

Wizard
Joined
Mar 5, 2013
Messages
1,824
When it failed recently I thought I'd buy a new cheap SSD and give that a go, and it has failed within about 2 weeks.
Seems like you got a dud. You probably just got unlucky and got a bad production batch. I've used SSDs from Crucial, SanDisk, Samsung, Kingston and Inland, ranging from early-days capacities of 60 GB all the way to 1 TB. All of them still work. In fact, I use the 60 GB one on TrueNAS right now, and that one is probably over 10 years old by now.

Honestly, using an expensive or enterprise SSD for the boot drive is a huge waste in my opinion.
 

Urumiko

Dabbler
Joined
Jul 2, 2017
Messages
32
Thanks guys. Yeah, actually, looking at the reviews of the cheap Patriot I bought, loads of people are reporting them dying within 2 weeks, so it must be a bad product or batch. I completely understand people wanting enterprise drives if they can't bear the thought of downtime and like to do things properly, but when you can get a normal SSD new for 1/10th of the cost, I'm happy to pay less and prime a new drive when it fails. That's why I use this OS in the first place: quick drive swap, upload the config and you are back online. Whole NAS dies? Import the ZFS pool elsewhere. Love it.
 

rvassar

Guru
Joined
May 2, 2018
Messages
972
I've been running a pair of Patriot "Blaze" 60 GB drives in a mirror config for probably 4 years now. But these were retail / local purchases when those drives were still in production. I've had some good luck with Intel enterprise SSDs on eBay. There was a seller in Nevada that had a source of SATA 120 GB S3x00 units that were going for $20/ea about a year ago. Apparently they're used in casino slot machines and routinely swapped out when they hit 85% life remaining. They're not particularly fast by modern standards, but they do have PFP caps.

Keep in mind, you can mirror disparate devices. I originally mirrored my USB thumb drives, and as they started getting eaten up, replaced one with a SATA SSD attached via a USB adapter, which then moved to motherboard SATA port 0, and eventually a second SSD replaced the remaining thumb drive. ZFS tracks devices by a kind of UUID. You can scramble the cabling every reboot, and even move drives from your HBA to motherboard ports, etc... It just figures it out and stitches the pool back together at boot. The only requirement on the boot pool is that one component of it be a "bootable" device, and I believe the GUI only allows boot pool mirror configs.
 

Whattteva

Wizard
Joined
Mar 5, 2013
Messages
1,824
Keep in mind, you can mirror disparate devices. I originally mirrored my USB thumb drives, and as they started getting eaten up, replaced one with a SATA SSD attached via a USB adapter, which then moved to motherboard SATA port 0, and eventually a second SSD replaced the remaining thumb drive. ZFS tracks devices by a kind of UUID. You can scramble the cabling every reboot, and even move drives from your HBA to motherboard ports, etc... It just figures it out and stitches the pool back together at boot. The only requirement on the boot pool is that one component of it be a "bootable" device, and I believe the GUI only allows boot pool mirror configs.
If you want redundant booting, it's actually better to use a HW RAID controller for this. It's what it's designed to do, and it'll boot regardless of which drive fails.
 

Etorix

Wizard
Joined
Dec 30, 2020
Messages
2,134
Mandatory link to the appropriate resource from our Resident Grinch:
 

rvassar

Guru
Joined
May 2, 2018
Messages
972
If you want redundant booting, it's actually better to use a HW RAID controller for this. It's what it's designed to do, and it'll boot regardless of which drive fails.
Indeed, and HW RAID may provide other features, but many motherboard RAID solutions are primitive mirror or simple parity only, which is all that's really required. A couple of vendors sell dual M.2 SATA SSD PCIe cards with an onboard primitive RAID solution as server boot devices. The RAID functions are usually tied to the vendor's LOM/sideband management though.

Regardless, my point was you can "walk" to a better boot solution from the old thumb drive config. This includes mirroring your HW RAID boot device to the thumb drive, and then detaching the thumb drive.
 

Whattteva

Wizard
Joined
Mar 5, 2013
Messages
1,824
Regardless, my point was you can "walk" to a better boot solution from the old thumb drive config. This includes mirroring your HW RAID boot device to the thumb drive, and then detaching the thumb drive.
But why deal with that pain at all when you can just not use a thumb drive, use a regular SSD and be done with it? I don't mirror my boot drive. I just use a 10-year-old 60 GB SSD and it's been working for the last decade without me having to shuffle and swap around thumb drives. It just gets connected once and sits there collecting dust for years. I mean, it's not like old SSDs are expensive. I just use the one I used in my gaming PC years and years ago, back when SSDs were in their infancy (hence why it's only 60 GB in capacity). It's too small for my current PC, but works perfectly as a TrueNAS boot device.

Also honestly... if you have a setup that has a HW RAID controller... I really don't understand why you'd be fiddling with USB thumb drives AT ALL... It's like spending extra money to get better boot availability, but then sabotaging that reliability yourself with the thumb drives.
 

rvassar

Guru
Joined
May 2, 2018
Messages
972
But why deal with that pain at all when you can just not use a thumb drive, use a regular SSD and be done with it? I don't mirror my boot drive. I just use a 10-year-old 60 GB SSD and it's been working for the last decade without me having to shuffle and swap around thumb drives. It just gets connected once and sits there collecting dust for years. I mean, it's not like old SSDs are expensive. I just use the one I used in my gaming PC years and years ago, back when SSDs were in their infancy (hence why it's only 60 GB in capacity). It's too small for my current PC, but works perfectly as a TrueNAS boot device.

Also honestly... if you have a setup that has a HW RAID controller... I really don't understand why you'd be fiddling with USB thumb drives AT ALL... It's like spending extra money to get better boot availability, but then sabotaging that reliability yourself with the thumb drives.

We're "agreeing past each other"... I'm using old 60 GB SATA SSD units as well. I mirror them because they're old, and I have a third lying around in a drawer that I could actually swap in. I may get bitten by an accessible but corrupt/non-bootable device one day (see jgreco's doc), but all I have to do is swap the cables and I'm up. As an added bonus, I get to review the logs preceding the failure, and don't have to go find my backup config.

The OP is using the old, formerly suggested USB boot. You are correct: you can back up the config, install on the SSD, and restore the config. But that results in downtime. The ability to pair to a more permanent boot device may be important to some for production uptime reasons. You can pair an SSD in a USB enclosure in the middle of the week to get back to redundant, and then move the device to a permanent SATA port during scheduled maintenance. In general I avoid USB storage for anything but backups. But as they say, "any port in a storm"... :smile:
 

Urumiko

Dabbler
Joined
Jul 2, 2017
Messages
32
I've never been sold on doing RAID for the boot drive, my logic being that whatever is wearing out the drive so quickly is just going to be mirrored to the 2nd drive and wear that out also. I'm working on the assumption that the boot drive is being used for both volatile and non-volatile storage, when I just need the non-volatile stuff backed up. I don't know if there's any kind of fancy cold-standby boot option that would let me have a 2nd boot drive, but in many ways just having 2 entirely separate boot drives would probably work for me.
 

Whattteva

Wizard
Joined
Mar 5, 2013
Messages
1,824
The ability to pair to a more permanent boot device may be important to some for production uptime reasons.
I totally agree with this. It's just that I've been noticing a trend in here and on Reddit of people advocating it to basically anyone regardless of context, when I'm fairly sure the majority of users who visit these places are home users who aren't running a mission-critical setup that requires it. In fact, if they were running such setups, chances are they'd already know what to do and wouldn't be asking questions on here. We certainly don't see iXsystems enterprise customers asking such questions here.

I'm certainly not saying that people who need those setups don't visit here, but how often they do doesn't seem to match how often people recommend mirrored boot drives.
 

Whattteva

Wizard
Joined
Mar 5, 2013
Messages
1,824
I've never been sold on doing RAID for the boot drive, my logic being that whatever is wearing out the drive so quickly is just going to mirror to the 2nd drive and wear that out also.
Yeah, that's basically how mirroring works. It really is overkill and unnecessary for the vast majority of users, especially since reinstalling and restoring the config file is such a trivial operation that takes less than 30 minutes. And as someone who's just running the bare essentials (no VMs/jails), I don't even really bother to save my config file, as I can just recreate the config from scratch, which only takes me about an extra 15 minutes.
 

Urumiko

Dabbler
Joined
Jul 2, 2017
Messages
32
Well... the drama is not over.
The new Integral drive lasted all of about a week before going AWOL again.

It seems that on boot the drive is not detected.

I tried powering off and updating the BIOS via iLO, and upon boot the drive was showing again.
I suspect that was due more to the cold boot than the upgrade.

I would like to upgrade the SATA controller firmware, but I think I'm going to have to use an Ubuntu live USB to do this.

Meanwhile I booted my install media once more and did a fresh install. This seems to go through OK, but then on reboot it freezes on "Attempting to boot from C:" indefinitely...

Any thoughts? Can I be that unlucky with SSDs?

Are there any console commands I can run from the install media?
 

Whattteva

Wizard
Joined
Mar 5, 2013
Messages
1,824
Well... the drama is not over.
The new Integral drive lasted all of about a week before going AWOL again.

It seems that on boot the drive is not detected.

I tried powering off and updating the BIOS via iLO, and upon boot the drive was showing again.
I suspect that was due more to the cold boot than the upgrade.

I would like to upgrade the SATA controller firmware, but I think I'm going to have to use an Ubuntu live USB to do this.

Meanwhile I booted my install media once more and did a fresh install. This seems to go through OK, but then on reboot it freezes on "Attempting to boot from C:" indefinitely...

Any thoughts? Can I be that unlucky with SSDs?
The issue is happening so consistently across different drives that it makes me think your problem is not the SSDs but maybe the SATA ports or even the motherboard itself. If you have a spare machine, maybe try the drive in that and see if it behaves differently. I understand this is a tough ask though.

Are there any console commands I can run from the install media?
Well, that depends on what you want to do, but if I remember correctly, the install media does allow you to drop into CLI at the end of the installation.
 

Urumiko

Dabbler
Joined
Jul 2, 2017
Messages
32
[Attached image: 1676494890650.png]
 

Urumiko

Dabbler
Joined
Jul 2, 2017
Messages
32
Well, that depends on what you want to do, but if I remember correctly, the install media does allow you to drop into CLI at the end of the installation.
It does, but I have little to zero console experience with this OS, so I wouldn't know what to do.
Happy to pop the drive in a Windows machine and run a scan tool on it.
 

Whattteva

Wizard
Joined
Mar 5, 2013
Messages
1,824
Those errors look like your controller might be the issue. Is it a port on your motherboard? Can you try a different port? A different cable?
 

Whattteva

Wizard
Joined
Mar 5, 2013
Messages
1,824
It does, but I have little to zero console experience with this OS, so I wouldn't know what to do.
Happy to pop the drive in a Windows machine and run a scan tool on it.
You can probably try to run smartctl on the drive through the install media.
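
For anyone who isn't comfortable reading raw smartctl output, here is a rough Python sketch of that check. It assumes smartmontools 7.0 or newer (for the `-j` JSON output), a SATA drive that reports the usual ATA attribute table, and a device name such as `/dev/ada0`, which is only a guess for this system. The list of watched attributes is an illustrative pick, not a vendor-specific one.

```python
import json
import subprocess

# SMART attribute IDs worth a first look on a SATA SSD.
# This is an illustrative selection, not an exhaustive or vendor-specific list.
WATCHED = {
    5: "Reallocated_Sector_Ct",
    197: "Current_Pending_Sector",
    198: "Offline_Uncorrectable",
    199: "UDMA_CRC_Error_Count",  # often points at the cable or port, not the drive
}


def suspicious_attributes(report):
    """Return {attribute name: raw value} for watched attributes with a non-zero raw value."""
    table = report.get("ata_smart_attributes", {}).get("table", [])
    hits = {}
    for attr in table:
        attr_id = attr.get("id")
        raw = attr.get("raw", {}).get("value", 0)
        if attr_id in WATCHED and raw > 0:
            hits[attr.get("name", WATCHED[attr_id])] = raw
    return hits


def read_report(device):
    """Run smartctl with JSON output and parse the result into a dict."""
    # smartctl uses a bit-mask exit status, so a non-zero return code
    # is deliberately not treated as a fatal error here.
    proc = subprocess.run(
        ["smartctl", "-a", "-j", device],
        capture_output=True,
        text=True,
    )
    return json.loads(proc.stdout)
```

Calling `suspicious_attributes(read_report("/dev/ada0"))` would then list any of the watched counters that are non-zero. A climbing CRC error count with otherwise clean attributes would fit the cable/port/controller theory better than a string of bad SSDs.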
 
Joined
Jun 15, 2022
Messages
674
If you want redundant booting, it's actually better to use a HW RAID controller for this. It's what it's designed to do, and it'll boot regardless of which drive fails.
Except you lose S.M.A.R.T. reporting because TrueNAS or smartctl will poll the array instead of the drives.

Instead, set up the [triple] mirror through TrueNAS. Set the Primary and Secondary boot devices in the [server] BIOS (or alternatively the HBA BIOS, though not as good an option).

BUT, before doing that run a smartctl report and find the source of the drive killer. (I run System Rescue off a Ventoy USB thumb drive which has a decent toolset, including smartctl.*)

---
*Don't ask how many sockets I've stuck my fob into.
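
If the suspicion is write wear rather than a flaky port, two smartctl reports taken some days apart can be turned into a rough write rate. The sketch below is only back-of-envelope arithmetic in Python. It assumes the drive exposes a Total_LBAs_Written counter (commonly attribute 241) that counts 512-byte units, which varies by vendor, and that the TBW rating comes from the drive's datasheet. Both function names are made up for this example.

```python
def daily_write_rate_gb(lbas_start, lbas_end, days, bytes_per_lba=512):
    """Average GB written per day between two Total_LBAs_Written readings.

    bytes_per_lba=512 is an assumption; some vendors report this
    counter in other units, so check the drive's documentation.
    """
    if days <= 0:
        raise ValueError("days must be positive")
    return (lbas_end - lbas_start) * bytes_per_lba / 1e9 / days


def years_until_rated_endurance(tbw_rating_tb, written_tb, gb_per_day):
    """Years until the vendor's TBW rating is reached at a constant write rate."""
    if gb_per_day <= 0:
        return float("inf")
    remaining_gb = max(tbw_rating_tb - written_tb, 0) * 1000
    return remaining_gb / gb_per_day / 365
```

Feed it the two counter readings and the number of days between them, then the rated TBW and the amount already written. If the result comes out in years rather than weeks, wear is probably not what is killing the drives, and the hardware path deserves the attention instead.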
 