BUILD Doing a ~$14 000 build, input on harddrive choice (and the rest of the components)

Christer · Mar 26, 2015

The compellent solution we have is very old (like 4 years), and as depasseg wrote it was actually SMC hardware (don't recall if it says Compellent or Dell on the stickers on the bezels...). Dual 10 Gbps switches, dual controllers and 3 drive chassis.
So I don't know how Compellent measure up today, my knowledge is "outdated", but in the beginning it worked very well, since then we have also more than doubled the performance capacity of the virtual machine hosts. My college and I aren't certified on this stuff. But a big complement goes to the US-based co-pilot support, I think they're awesome.

We saw the warning diagrams and got some tickets from co-pilot but a few months ago we had allocated like 98% or 99% of the space. Means that the whole Tier 1 of 15k drives are totally full and no data progression seemed to happen so all writes end up at a handful 7200 rpm drives in RAID 6 or something like that. I don't know if/how these controllers do any writecaching but let me tell you, the speed for VM:s was horrible. We use compellent for storing comapany-internal VM:s, the few customer databases (with very limited SLA) we have are hosted on a server with only SSD storage so they were unaffected.

Had the our good reseller send some certified ESXi and Compellent technicians to do a review of our environment and we got a quite a few recommendation on things to correct and since then we have gain better performance (but still not impressed, perhaps we've grown out of the shoes here).
What aggravates me is that we have several free slots in the tier 3 chassis but haven't been able to buy drives because they're sold out since a year or so. The last ones they've stockpiled for warranty replacements. Thus I really desire for this ZFS build to have empty slots for new drives and use non-firmware locked drives so I can expand it (and as we've agreed I should double up the RAM if I decide to do so).

Just to be on the very safe side on hard drives I've decided to buy 4 spares. Don't really know how it's today but earlier the size of drives from different manufacturers/models that were labeled they same could differ a little bit. I don't want ZFS to refuse a drive replacement with a different model that lacks the last 0.0001% of space.

For replacing the Compellent later we're looking at some options. I really believe very smart storage solutions are the future. The time for adding more and more cabinets of disk for performance is in the past. Had a good sales and technology brief from Nimble Storage recently. But performance costs, but also space. At certain levels stepping down one level of raw disk capacity could pay for the cost of this SMC machine... (and definitely comparing per-terabyte cost). Which will be around $13000 now I think.

Christer · Mar 26, 2015

One more thing. I wrote that I didn't plan for a mirrored SLOG setup. I haven't gotten any reactions to that. For VM storage I guess the ZIL write operations are mostly like random writes, not huge chunks of data.

jgreco · Mar 26, 2015

Christer said:
Just to be on the very safe side on hard drives I've decided to buy 4 spares. Don't really know how it's today but earlier the size of drives from different manufacturers/models that were labeled they same could differ a little bit. I don't want ZFS to refuse a drive replacement with a different model that lacks the last 0.0001% of space.

Mostly this hasn't been a problem in recent years. If it is, there's a little bit of slop space allocated to the swap partition. You could allocate a larger swap if you're truly fearful, otherwise, just get a few different candidate drives and check. A heterogeneous drive environment is a fine thing with ZFS. Also usually by the time a disk fails, you could be replacing it with a larger drive anyways.

For replacing the Compellent later we're looking at some options. I really believe very smart storage solutions are the future. The time for adding more and more cabinets of disk for performance is in the past. Had a good sales and technology brief from Nimble Storage recently. But performance costs, but also space. At certain levels stepping down one level of raw disk capacity could pay for the cost of this SMC machine... (and definitely comparing per-terabyte cost). Which will be around $13000 now I think.

Yeah, it's also becoming crazy as the price of SSD continues to plummet. It is already totally practical to have different storage tiers and have vCenter monitor performance and Storage vMotion stuff around as needed. This works even better if your VM designers segregate stuff onto multiple disks based on what the workload is. Still some work though.

Christer said:
One more thing. I wrote that I didn't plan for a mirrored SLOG setup. I haven't gotten any reactions to that. For VM storage I guess the ZIL write operations are mostly like random writes, not huge chunks of data.

SLOG writes are effectively sequential, but may happen in very small increments. As a result, a very fast device is desirable but it can be pretty small.

depasseg · Mar 26, 2015

Re: Mirrored SLOG. Based on my reading, I'm comfortable using a non-mirrored device. There used to be a possibility of data loss if the SLOG failed at the same time as a loss of server power. My understanding is that it's no longer the case. Hence my willingness to accept only using 1 SLOG device.

jgreco · Mar 26, 2015

depasseg said:
Re: Mirrored SLOG. Based on my reading, I'm comfortable using a non-mirrored device. There used to be a possibility of data loss if the SLOG failed at the same time as a loss of server power. My understanding is that it's no longer the case. Hence my willingness to accept only using 1 SLOG device.

That's correct. However, if the SLOG device fails, do be aware that ZFS will fall back to using the in-pool ZIL for sync writes, which will pound the livin' crap out of the pool. This can, of course, be immediately remedied in a crisis by disabling sync writes, but that's operator intervention, which some people will find acceptable and others won't.

The additional cost and performance hit of a mirrored SLOG is of questionable value.

Christer · Mar 26, 2015

jgreco said:
Mostly this hasn't been a problem in recent years. If it is, there's a little bit of slop space allocated to the swap partition. You could allocate a larger swap if you're truly fearful, otherwise, just get a few different candidate drives and check. A heterogeneous drive environment is a fine thing with ZFS. Also usually by the time a disk fails, you could be replacing it with a larger drive anyways.

Yeah, it's also becoming crazy as the price of SSD continues to plummet. It is already totally practical to have different storage tiers and have vCenter monitor performance and Storage vMotion stuff around as needed. This works even better if your VM designers segregate stuff onto multiple disks based on what the workload is. Still some work though.

SLOG writes are effectively sequential, but may happen in very small increments. As a result, a very fast device is desirable but it can be pretty small.

Haha forgot about the last part. Yeah if I stood without a replacement drive on the shelf I would of course prefer to order a disk twice the size, especially if it happens in a few years when 5, 6 or 8 TB drives would cost less than the 4 TB costs now.

We have vMotion and HA but I haven't looked into Storage vMotion. With the current setup it hasn't been any need for it (Compellent should handle that).

So I will probably be fine with a 200 GB S3710. As said it has quite lower sequential write performance but the IOPS are spec:ed the same to the bigger 400. Maybe the numbers aren't even valid after massive change to the overprovisioning.

Christer · Mar 26, 2015

jgreco said:
That's correct. However, if the SLOG device fails, do be aware that ZFS will fall back to using the in-pool ZIL for sync writes, which will pound the livin' crap out of the pool. This can, of course, be immediately remedied in a crisis by disabling sync writes, but that's operator intervention, which some people will find acceptable and others won't.

The additional cost and performance hit of a mirrored SLOG is of questionable value.

For this system I'm fine with saying to the users - "Sorry, a part broke down. Write performance will be very poor until tomorrow. Or the day after that".

jgreco · Mar 26, 2015

Christer said:
Haha forgot about the last part. Yeah if I stood without a replacement drive on the shelf I would of course prefer to order a disk twice the size, especially if it happens in a few years when 5, 6 or 8 TB drives would cost less than the 4 TB costs now.

We have vMotion and HA but I haven't looked into Storage vMotion. With the current setup it hasn't been any need for it (Compellent should handle that).

So I will probably be fine with a 200 GB S3710. As said it has quite lower sequential write performance but the IOPS are spec:ed the same. Maybe the numbers aren't even valid after massive change to the overprovisioning.

As you look forward towards a post-Compellent future, it would probably be interesting to play with the Storage vMotion features. The real downside is that in order to make best use of them, you actually need to build your VM's with some forethought that there are different tiers, and therefore you need to install the OS on one disk, logs on another, database on a third, or whatever makes sense for varying levels of workload.

For the SLOG, I should also note that a lot of people make the mistake of worrying about write speeds during maintenance. By definition, during planned maintenance, you could turn off sync writes and gain a lot of speed during the time you really need it, without also wearing the SSD unnecessarily, and then turn sync writes back on.

Important Announcement for the TrueNAS Community.

BUILD Doing a ~$14 000 build, input on harddrive choice (and the rest of the components)

Christer

Dabbler

Christer

Dabbler

jgreco

Resident Grinch

depasseg

FreeNAS Replicant

jgreco

Resident Grinch

Christer

Dabbler

Christer

Dabbler

jgreco

Resident Grinch

Similar threads

Important Announcement for the TrueNAS Community.

BUILD Doing a ~$14 000 build, input on harddrive choice (and the rest of the components)

Dabbler

Dabbler

Resident Grinch

FreeNAS Replicant

Resident Grinch

Dabbler

Dabbler

Resident Grinch

Important Announcement for the TrueNAS Community.

Related topics on forums.truenas.com for thread: "Doing a ~$14 000 build, input on harddrive choice (and the rest of the components)"

Similar threads