SOLVED z2 12*6T capacity problem

qq8554650

Dabbler
Joined
Feb 7, 2022
Messages
23
TrueNAS-12.0-U6/U7/U8
TrueNAS-SCALE-22.02-RC.1/2
have the same problem

DELL R720XD
Vendor: SEAGATE
Product: ST6000NM0034
Revision: MS2E
PERC H310 Mini (NON_RAID)

6TB * 12 disks, Z2
6TB = 5.46 TiB
5.46 * (12-2) = 54.6 TiB = Estimated raw capacity
1. Why is this not equal to the Pool Size?
2. About the Pool Size:
why does going from 11 to 12 disks only add 49.74 - 45.59 = 4.15 TiB,
but going from 5 to 6 disks adds 21.68 - 16.01 = 5.67 TiB?
3. In VMware, the pool size = 54.69 TiB ^^


disks   Estimated raw capacity   Pool Size
  5     16.37 TiB                16.01 TiB
  6     21.84 TiB                21.68 TiB
  7     27.28 TiB                25.31 TiB
  8     32.74 TiB                30.9 TiB
  9     38.19 TiB                37.27 TiB
 10     43.65 TiB                41.42 TiB
 11     49.1 TiB                 45.59 TiB
 12     54.56 TiB                49.74 TiB
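
Here is the arithmetic behind my "Estimated raw capacity" column (just my own sketch, not something TrueNAS runs):

Code:
# naive estimate: (disks - 2 parity) * the TiB size of one "6TB" drive
# 6 TB = 6e12 bytes = about 5.46 TiB
disk_tib = 6e12 / 2**40
for disks in range(5, 13):
    print(disks, round((disks - 2) * disk_tib, 2), "TiB")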
 

Attachments

  • zfs_size.png (86.1 KB)

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
DELL R720XD
Vendor: SEAGATE
Product: ST6000NM0034
Revision: MS2E
PERC H310 Mini (NON_RAID)

That PERC H310 is clearly in RAID mode as it is showing up under the MFI driver. Please crossflash the card to IT mode.


6TB * 12 disks, Z2
6TB = 5.46 TiB
5.46 * (12-2) = 54.6 TiB = Estimated raw capacity
1. Why is this not equal to the Pool Size?
2. About the Pool Size:
why does going from 11 to 12 disks only add 49.74 - 45.59 = 4.15 TiB,
but going from 5 to 6 disks adds 21.68 - 16.01 = 5.67 TiB?
3. In VMware, the pool size = 54.69 TiB ^^

RAIDZ2 is not a good choice for block storage. Due to the way RAIDZ parity works, it consumes a variable amount of storage to manage parity, and in worst case scenarios can double (or more) the amount of space needed to provide storage.


You haven't really given an explanation of what your math represents or what your question is.
 

qq8554650

Dabbler
Joined
Feb 7, 2022
Messages
23
That PERC H310 is clearly in RAID mode as it is showing up under the MFI driver. Please crossflash the card to IT mode.




RAIDZ2 is not a good choice for block storage. Due to the way RAIDZ parity works, it consumes a variable amount of storage to manage parity, and in worst case scenarios can double (or more) the amount of space needed to provide storage.


You haven't really given an explanation of what your math represents or what your question is.
Because my English is not good, I am using Google Translate.
I have already flashed the H310 to IT mode (03:00.0 RAID bus controller: Broadcom / LSI SAS2008 PCI-Express Fusion-MPT SAS-2 [Falcon] (rev 03)).
1. Take 12 disks as an example: when creating the pool it shows 54.56 TiB of space, but after mounting, the available space is only 49.74 TiB. Almost a whole disk is missing.
2. With only 5 disks there is basically no difference (16.37 TiB vs 16.01 TiB), but as the number of disks increases, the gap gets bigger and bigger. Is this normal? Is there a way to fix it?
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
Because my English is not good, I am using Google Translate.

I already guessed that. :smile:

I have already flashed the H310 to IT mode

The image above says otherwise. In your first attachment, it shows disks with mfisyspdX devices. That is an MFI-driver based card. Card is not crossflashed to IT mode.

I do not have 12 6TB hard drives laying around to experiment with. However, I do have oodles of hypervisor capacity. So I have created a virtual machine with an array of 12 hard drives that are 5723166MB, which is the standard real size of a so-called "6TB HDD". I'm unclear on what your complaint is so I'm trying to look at the numbers you've provided to see where things might be unclear.

Let's walk through this.

vdevs1.jpg


So there's your 54.56TiB. This is literally just ten times the disk size. That's actually a tragic estimate, because that is only in the best case scenario. And I think this is what you're getting at, but it was buried in some other confusing numbers.
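
A quick sanity check on where that per-disk number comes from (my own arithmetic, treating the hypervisor's MB figure as MiB):

Code:
# 5723166 "MB" (MiB) per virtual disk is almost exactly a marketing "6TB" drive
disk_bytes = 5723166 * 2**20
print(disk_bytes)               # ~6.0e12 bytes
print(disk_bytes / 2**40)       # ~5.46 TiB per disk
print(10 * disk_bytes / 2**40)  # ~54.6 TiB, the "minus two parity disks" estimate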

This 10 x 5.46TiB number is the MAXIMUM amount of data that could EVER be stored on ANY RAIDZ2 device under the BEST circumstances. It does not represent what you will actually be able to store.

Forget RAIDZ2 and consider RAIDZ1 for a moment, because I have a picture handy to help illustrate. For discussion purposes I am assuming 4KB sector sizes, but this applies the same to 512B.

RAIDZ-small.png


This is parity and data storage for data blocks stored by RAIDZ1. If we look at the tan blocks beginning at LBA0, this is what everyone THINKS happens with RAIDZ1. You have 5 disks, one parity, you get four disks of stored data and one for parity. 20% overhead for parity.

But if we store a shorter ZFS block, such as a 12KB block, such as the yellow or green in LBA 2, the data must still be protected by parity, so ZFS writes a parity sector for these blocks, making the overhead 25% for parity. Next comes the humdinger. We have a tiny block, whether this is metadata or a 1KB file or whatever, the red block in LBA 3. This too has to be protected by parity, so it works out to 50% overhead for parity. Moving down to LBA 7 and 8, we also see a different behaviour, which is where ZFS has an odd number of sectors to write, and pads them to an even number (the "X"'d out sectors). All of this works to eat away at your theoretical maximum RAIDZ1 capacity.

It gets worse with RAIDZ2. I don't have a picture for this. But just consider the case where you have a single 4K block to write. It needs two parity sectors, so that single 4K data block consumes three sectors of raw space, triple what the data alone needs, and because RAIDZ2 allocations are rounded up to a multiple of three sectors, slightly larger blocks pick up padding sectors on top of that.
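
If it helps to see the accounting, here is a minimal sketch, assuming 4K sectors and the usual OpenZFS rule that a RAIDZ allocation is rounded up to a multiple of (parity + 1) sectors; the raw_sectors helper is just for illustration, not anything ZFS or TrueNAS actually exposes:

Code:
import math

SECTOR = 4096  # assuming ashift=12 (4K sectors)

def raw_sectors(block_bytes, total_disks, parity):
    """Raw sectors consumed by one ZFS block on a RAIDZ vdev (sketch)."""
    data = math.ceil(block_bytes / SECTOR)                  # data sectors
    rows = math.ceil(data / (total_disks - parity))         # stripe rows needed
    total = data + rows * parity                            # one parity set per row
    return math.ceil(total / (parity + 1)) * (parity + 1)   # pad to multiple of parity+1

print(raw_sectors(128 * 1024, 5, 1))   # 40: 32 data + 8 parity (20% of the allocation)
print(raw_sectors(12 * 1024, 5, 1))    #  4: 3 data + 1 parity (25%)
print(raw_sectors(4 * 1024, 5, 1))     #  2: 1 data + 1 parity (50%)
print(raw_sectors(4 * 1024, 12, 2))    #  3: 1 data + 2 parity on 12-wide RAIDZ2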

So, if we circle back around to our test box, we see that the pool is created:

pool1.jpg



And there's your 49.74TiB. This number isn't being generated by TrueNAS, but rather by ZFS itself. ZFS is giving you an approximation of how much usable space it thinks there could be, and it isn't as optimistic as the TrueNAS middleware, which, as discussed, simply assumes the loss of two whole disks to parity; in my opinion that is a dumb assumption.

Code:
root@truenas[~]# zfs list foo
NAME   USED  AVAIL     REFER  MOUNTPOINT
foo   21.5M  49.7T      219K  /mnt/foo
root@truenas[~]#


There's nothing wrong here. The amount of space on a RAIDZ2 volume is not a guaranteed number, because of how parity works. You are just seeing two different estimates.
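
For the curious, you can get pretty close to that 49.7T figure by assuming ZFS sizes its estimate for 128 KiB blocks (roughly what the OpenZFS "deflate ratio" does, as far as I understand it); this is only a back-of-the-envelope sketch, not what ZFS literally computes:

Code:
import math

TIB = 2**40
disk_tib = 5723166 * 2**20 / TIB      # one "6TB" drive, ~5.46 TiB
raw_tib = 12 * disk_tib               # ~65.5 TiB of raw space across the vdev

data = (128 * 1024) // 4096                    # 32 data sectors per 128K block
parity = 2 * math.ceil(data / (12 - 2))        # 8 parity sectors
alloc = math.ceil((data + parity) / 3) * 3     # padded to a multiple of 3 -> 42

print(raw_tib * data / alloc)   # ~49.9 TiB, in the ballpark of the 49.7T zfs reports
print(10 * disk_tib)            # ~54.6 TiB, the naive "minus two disks" estimate

The small remaining difference is presumably down to partition overhead and ZFS's own reserved space.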
 