Older firmware for LSI 9305-16i

Octopuss

Patron
Joined
Jan 4, 2019
Messages
461
What do you mean by the virtual disks in practical terms? What storage exactly are you talking about? Like the data one stores on the NAS, I guess?
 

artlessknave

Wizard
Joined
Oct 29, 2016
Messages
1,506
the grinch is the VM master. I tried it once, on esxi, with my backup server....I hated it, though part of that was because it was an experimental esxi server, so any time I needed to reboot it, I had to stop all my replications (which at the time took about 8 clicks in the interface for each one).
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
What do you mean by the virtual disks in practical terms?

A virtual disk is some sort of disk abstraction that is stored by your hypervisor somehow on real physical storage; on ESXi, for example, that would be a ".vmdk" file (or set of files) usually stored on a VMFS6 or NFS filesystem.

The problem comes in because of this:

Let's say you have a basic hypervisor with an HDD in it, and you build a virtual disk on it for a FreeBSD VM. You then install a FAMP stack on it to act as a web server. Years later, the disk gives out with a massive run of disk errors. You wake up to find your website down and the VM stalled with a bunch of "Read error" messages. This has two subforks: one is where the mpt driver is timing out with the hack that mav@ did some years back, and the other is where the VM is actually stalled by the hypervisor, which can especially happen when something bad happens like running out of disk space on the datastore. In both cases, the VM functionally stops working. Most people, upon hearing this, say "Of course."

But now think about ZFS. You have a mirror or RAIDZ vdev, made up of some vmdk's. One day an underlying datastore blows its brains out somehow. Disk full, disk errors, whatever. What happens to the NAS VM? If you don't say "it stalls of course" then you need to re-read the paragraph above. Because the NAS is "just another VM" and will stall "just like any other VM". There are a few ways to mitigate this, such as by using a datastore that has redundancy, but really it is best just to give TrueNAS the raw access to the disk controller that it really wants and needs.
 

Octopuss

Patron
Joined
Jan 4, 2019
Messages
461
That makes sense, but even with my borderline zero knowledge about this stuff, I somehow feel like using virtual disks for NAS storage is a really bad idea. After all, that's why we use HBAs and physical disks, isn't it?
I am basically commenting on the 2nd half of your post.
 

artlessknave

Wizard
Joined
Oct 29, 2016
Messages
1,506
that's why we use HBAs and physical disks
yes, but some of the problems with virtualisation will still be problems as long as you have that abstraction layer present.

for example, sometimes it seems like using TrueNAS for VM storage is a good idea (which it usually is), and someone will try to put their TrueNAS VM's storage on their virtualized TrueNAS....which is a circular dependency; the TrueNAS needs to be online before the TrueNAS can be online, but the TrueNAS isn't online, so the TrueNAS cannot be brought online...


going back to the original questions, knowing how YOUR virtualization is set up can at least let someone, like jgreco, verify that it should work as is, or tell you where the problems are and what you need to fix to have it working correctly. this is basically the same as with hardware, except now you have added a house of cards on top of the hardware holding up your VM.

normally I would note that RAIDz1 is less than ideal, but it looks like you have it with SSDs which should be far less of a problem. you will still have no redundancy for a resilver, but that resilver should be pretty fast. also, 2TB SSDs :cool:
 

Octopuss

Patron
Joined
Jan 4, 2019
Messages
461
What's wrong with RAIDz1? I made it like that, because the chance of more than one disk dying at once is close to zero, or at least I believe so.
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
What's wrong with RAIDz1? I made it like that, because the chance of more than one disk dying at once is close to zero, or at least I believe so.

So what happens when one of your disks dies, and then in the process of repairing it, you inadvertently yank the wrong disk, or a cable is finicky, or there's a disk read error on one of the other disks? With RAIDZ1, your redundancy is lost when the one disk fails, and any other problems are potentially pool-killers. Will they? Who knows. But it is much safer to retain the redundancy property.
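To make the capacity/redundancy tradeoff concrete, here is a small back-of-the-envelope sketch (the disk count and size are made-up example numbers, not from this thread; it also ignores ZFS metadata and padding overhead):

```shell
#!/bin/bash
# Rough usable-capacity math for a single RAIDZ vdev.
# Hypothetical example: 6 disks of 2 TB each.
disks=6
size_tb=2

for parity in 1 2 3; do
    # Each RAIDZ level reserves 'parity' disks' worth of space,
    # and tolerates that many simultaneous disk failures.
    usable=$(( (disks - parity) * size_tb ))
    echo "RAIDZ${parity}: ${usable} TB usable, survives ${parity} disk failure(s)"
done
```

With these example numbers, RAIDZ1 gives 10 TB usable but a single dead disk leaves zero remaining redundancy, while RAIDZ2 gives 8 TB and still has one disk's worth of redundancy left during a resilver.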
 

Ericloewe

Server Wrangler
Moderator
Joined
Feb 15, 2014
Messages
20,194
More than RAIDZ1 being bad, think of it as not being enough for the typical crowd around here who cares about their data and prefers RAIDZ2.
 

artlessknave

Wizard
Joined
Oct 29, 2016
Messages
1,506
More than RAIDZ1 being bad, think of it as not being enough for the typical crowd around here who cares about their data and prefers RAIDZ2.
it's not bad, it's just the wrong tool for the job most of the time, especially since many people use raidz as their backup plan, so when the raidz fails, poof! data is gone.

if you have a backup, and are aware of the risks? sure.
as a scratch pool for something like video compression, authoring, etc? sure.
on SSDs, where the rebuild time is dramatically faster? probably fine. should still have a backup of anything important though.
as the only copy of valuable data on spinner disks? no.
 

Octopuss

Patron
Joined
Jan 4, 2019
Messages
461
Funnily enough, the person who sold me the HBA and whom I returned it to has just told me it doesn't detect any drives for him either, so it might not have been a cable problem after all.
 

Joined
Jun 15, 2022
Messages
674
What's wrong with RAIDz1? I made it like that, because the chance of more than one disk dying at once is close to zero, or at least I believe so.

After getting a for-test-purposes TrueNAS system working, I bought more drives to set up RAIDz3. After you've been through the "lost data" wringer enough times (and if there's enough data), Z3 is cheap insurance.

If all your disks are about the same age, they'll die "about the same time." Errors on an unused sector aren't detected until you try to read from that sector, like during a RAID rebuild, which is also hard on the disks and heats them up beyond what they've previously seen, making previously unknown errors show up. smartctl -t long may not find errors that are just under the surface...I had a drive test "OK" with only a few remapped sectors, but a badblocks -wp 2 run found 18,000+ errors on the drive (and it's still climbing). I *think* they may have been correctable at this point; I'm not sure and will look when the test completes. That's probably an extreme example, but it happens, hence most people (from what I read, anyway) settling on Z2, and in my case Z3.
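For reference, the long self-test is started with smartctl -t long /dev/sdX and the results read later with smartctl -a. Since nobody can run that against real hardware in a post, here is a sketch that parses a canned, hypothetical attribute line of the kind smartctl -a prints (the device name, attribute values, and sector count below are all made up for illustration):

```shell
#!/bin/bash
# Start a SMART long self-test, then later pull the reallocated-sector count
# out of the report. The actual commands would be:
#   smartctl -t long /dev/sda    # kicks off the test; it runs on the drive itself
#   smartctl -a /dev/sda         # read attributes/results once it finishes

# Hypothetical attribute line as it appears in 'smartctl -a' output:
sample="  5 Reallocated_Sector_Ct   0x0033   100   100   010    Pre-fail  Always       -       18000"

# The raw value is the last whitespace-separated field on the line.
reallocated=$(echo "$sample" | awk '/Reallocated_Sector_Ct/ {print $NF}')
echo "Reallocated sectors: $reallocated"
```

The point of the thread stands: a drive can pass the self-test with a small reallocated count and still fall apart under a full-surface badblocks write pass.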

Anyway, as has been mentioned we are here to help, some patience is appreciated because this isn't home-gamer "it's close enough" land, it's more like The Perfectionist Zone with a good slathering of reality. (IMHO)
 

artlessknave

Wizard
Joined
Oct 29, 2016
Messages
1,506
Errors on an unused sector aren't detected until you're trying to read from that sector, like during a RAID rebuild,
this is why ZFS scrubs exist. they read and verify data periodically. RAID does not have this.

heats them up beyond what they've previously seen
this is not really likely. any scrub will use the disk about the same. the difference is a scrub isn't trying to rebuild from parity, merely checking that checksums match.
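(on TrueNAS the scrub schedule is set from the web UI; on a plain ZFS box it is commonly just a cron entry. a minimal sketch, assuming a pool named "tank" — the pool name and schedule are placeholders:)

```
# crontab fragment: scrub pool "tank" at 03:00 on the 1st and 15th of each month
0 3 1,15 * * /sbin/zpool scrub tank
```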
The Perfectionist Zone with a good slathering of reality
Perfect Realism(TM)
 
Joined
Jun 15, 2022
Messages
674
Exactly right on drive scrubbing finding issues before something like a drive rebuild is needed, which is a bad time to find corrupt data.
[drive heating] this is not really likely. any scrub will use the disk about the same. the difference is a scrub isn't trying to rebuild from parity, merely checking that checksums match.
I wrote a script to log drive temperatures because:
A.) I had a fan go out and drive temps went from 37C to 48C, which pushed one drive that was already on the verge of failing into throwing errors in spectacular fashion (found with badblocks -n).
B.) Some systems are located in unstable environments with large temperature fluctuations.

Logging showed the more a drive is used the more it heats up, which is pretty dramatic with SSDs.
Code:
#!/bin/bash
# Record S.M.A.R.T. drive temperatures to a timestamped log file.

outfile="/mnt/log/$(date +%F_%H%M%S)_drvtemp.txt"

echo "S.M.A.R.T. temperature information for drives:"
for drive in /dev/sd?; do
    # -x/--xall prints everything, including the current temperature line.
    echo "$drive : $(smartctl --xall "$drive" | grep 'Current' | grep 'Temperature')" | tee --append "$outfile"
done
echo "Log file saved. Done."
 

artlessknave

Wizard
Joined
Oct 29, 2016
Messages
1,506
I had a fan go out and drive temps went from 37C to 48C
large temperature fluctuations
drive is used the more it heats up
yes, drives heat up. my point was that reading drives for a scrub, reading drives for a resilver, and writing to drives are all going to heat the drive up about the same, counter to your claim that reading a drive for a resilver somehow heats it up more than it has ever seen, which is just not true.
heats them up beyond what they've previously seen
if you have poor cooling, obviously everything is going to be hot, but the amount of heat will be relatively consistent (unless you lose a fan, but that changes the cooling profile, not the heat the disk generates)


additionally, SSDs and HDDs are very different. SSDs heat up more, but as they have no physically moving parts, this doesn't matter as much; plus, they tend to read and write so much faster that the whole performance profile is very different. RAIDz1, for example, is generally much less risky on SSDs, since the rebuild is fast due to both smaller drive sizes and dramatically faster reads/writes.
 

Octopuss

Patron
Joined
Jan 4, 2019
Messages
461
New card arrived and it works like a charm. Doesn't even need molex power connectors to power the SSDs up.
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
Let us know how it goes! It's unusual to have such major problems with an HBA and I think you'll enjoy your new one.
 

Octopuss

Patron
Joined
Jan 4, 2019
Messages
461
Well, it just works. Like the original card was supposed to, heh.
 