FreeNAS Backup repository server - need a sanity check

Status
Not open for further replies.

C2Opps

Dabbler
Joined
Jun 11, 2018
Messages
25
I'm going to second Inxsible here. I've had nothing but bad luck with USB thumb drive boot devices, and I've only been running at home for a couple of months. It's not just the lack of wear levelling; the USB host controller (HCI) support also appears to interact poorly with ZFS.

Small 60 GB SATA SSDs can be found for $30 USD.
On this build the chassis has space for 2x 2.5" SSDs (which I want to use for a redundant SLOG); the rest of the slots are for mechanical drives for ZFS, or spare space for additional vdevs later if required (24 slots, but only 12 filled initially with 2x 6-drive vdevs, meaning we could later add one or two more 6-drive vdevs).
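To illustrate how that growth path would look at the command line (just a sketch - the RAIDZ2 layout and the daN device names are my assumptions here, and in practice we'd build the pool through the FreeNAS GUI):

Code:
# hypothetical: pool built from two 6-disk RAIDZ2 vdevs
zpool create tank raidz2 da0 da1 da2 da3 da4 da5 raidz2 da6 da7 da8 da9 da10 da11
# later growth: add a third 6-disk RAIDZ2 vdev to the same pool
zpool add tank raidz2 da12 da13 da14 da15 da16 da17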

Surely would be ok if we got two super high quality USB drives and had them mirrored? - I thought USB stick was recommended boot drive?
 

pro lamer

Guru
Joined
Feb 16, 2018
Messages
626
- I thought USB stick was recommended boot drive?
But the system dataset has to be moved off them to the HDDs (during FreeNAS installation).
The answer is crap USB drives. Unless you have your system dataset on your boot pool, in which case the solution is to move it to the main pool.
I guess to prevent wear.
two super high quality USB drives
Pendrives, I guess? (as compared to USB-attached SSDs/HDDs).

Noob's question: is a high-endurance SD card an option? (Transcend and SanDisk make such cards.) They are not cheap, but neither are industrial-grade pendrives...
EDIT: SSDs may be cheaper. I guess it's also a matter of free SATA connectors.

On the other hand, some are happy (and some aren't) with consumer-grade thumb drives...
 
Last edited:

rvassar

Guru
Joined
May 2, 2018
Messages
972
Surely would be ok if we got two super high quality USB drives and had them mirrored? - I thought USB stick was recommended boot drive?

I went with this initially. I have a storage QA background. I did my initial testing and familiarization on a test server, and then built a system for home use. I bought new SanDisk thumb drives and configured them in a mirror. I've been "in production" here at home for ~2 months. Both thumb drives dropped out of the pool on write failures within 35 days. I replaced one with a 60 GB SSD in a USB 3.0 enclosure. The other I replaced with a Samsung FIT (unfortunately 64 GB, as it needed to match the other SSD...). So far this has held for a week. I bought a SAS HBA, and am testing it in preparation for the next glitch.

The thing is... Both the failed SanDisk thumb drives format & test fine on my Linux box. They appear to still be functional. The 60 GB SSD has been in use for ~2 weeks, but it is on a PCIe USB 3.0 host controller, not the motherboard's USB 2.0 one. I'm waiting for the FIT drive to drop out to prove to myself there's something amiss with the USB 2.0 HCI drivers. @Chris Moore warned me I'd see random pool device drop-outs. He's not kidding!

Using a SATA SSD for boot will get you:
- Real wear levelling. USB thumb drives have none.
- A proper attachment bus that works.
- Devices that have rich error reporting & handling that the ZFS drivers understand. ZFS talking to a USB thumb drive is like a college professor arguing with a two year old.
 

Chris Moore

Hall of Famer
Joined
May 2, 2015
Messages
10,080
I went with this initially. I have a storage QA background. I did my initial testing and familiarization on a test server, and then built a system for home use. I bought new SanDisk thumb drives and configured them in a mirror. I've been "in production" here at home for ~2 months. Both thumb drives dropped out of the pool on write failures within 35 days. I replaced one with a 60 GB SSD in a USB 3.0 enclosure. The other I replaced with a Samsung FIT (unfortunately 64 GB, as it needed to match the other SSD...). So far this has held for a week. I bought a SAS HBA, and am testing it in preparation for the next glitch.

The thing is... Both the failed SanDisk thumb drives format & test fine on my Linux box. They appear to still be functional. The 60 GB SSD has been in use for ~2 weeks, but it is on a PCIe USB 3.0 host controller, not the motherboard's USB 2.0 one. I'm waiting for the FIT drive to drop out to prove to myself there's something amiss with the USB 2.0 HCI drivers. @Chris Moore warned me I'd see random pool device drop-outs. He's not kidding!

Using a SATA SSD for boot will get you:
- Real wear levelling. USB thumb drives have none.
- A proper attachment bus that works.
- Devices that have rich error reporting & handling that the ZFS drivers understand. ZFS talking to a USB thumb drive is like a college professor arguing with a two year old.
It makes me think that there might be an issue with the USB controller, or a bad interaction with the driver, that is the source of the premature USB flash drive failures.


 

C2Opps

Dabbler
Joined
Jun 11, 2018
Messages
25
Ah, @Chris Moore, just the man - hi!

I'd seen you in another thread regarding SLOG devices and wondered if you had any thoughts on me using the Samsung SM863a 120GB as a SLOG - I'd seen the page linked in that thread (https://www.servethehome.com/what-is-the-zfs-zil-slog-and-what-makes-a-good-one/)
but what wasn't clear was the actual SLOG performance for a given drive, i.e. what is actually being written to it.
This drive is rated at up to 450 MB/sec sequential write (128 KB writes) and 10K IOPS (4 KB writes). If that gave us 450 MB/sec write speed to our pool it would be fine for our needs, but I'm second-guessing myself at the moment, as I can't see from the above site what type of writes a SLOG device actually experiences (for us it will be NFS writes from Veeam), or whether it is completely variable?

Also, I've found we can use 2x SATA DOMs for the OS, so I'm going to do that (no spare space in the case for additional SSDs).
 

Chris Moore

Hall of Famer
Joined
May 2, 2015
Messages
10,080
seen you in another thread regarding SLOG devices and wondered if you had any thoughts on me using the Samsung SM863a 120GB as a SLOG
I went back to the first post, and seeing as you are using this as a backup server, I am not sure you even need SLOG drives.
From your first post:
SLOG device: 2x 240 GB Samsung SM863a SATA 2.5" - has power-loss protection, and mirrored just in case of device failure; rated at 450 MB/sec write speed. I'm guessing this would limit us to around 480 MB/sec write speed for our pool, which wouldn't be a limiting factor, as 2x 1 GbE will max out around 240 MB/sec (240 only when using 2 separate network transport streams, otherwise half that). We don't currently have 10 GbE, and adding it probably isn't going to be a possibility for the moment - but if we manage 480 MB/sec writes that should be enough. I was also going to overprovision them, as I'd read you can only use a max of around 16 GB / 5 seconds of writes?
I have never used Veeam, so I can't say how it works, but I would not expect it to be doing sync writes. If it is synchronous, then you would want a SLOG; if not, then there is probably no point in it.
Since you are buying a new server, I would suggest that you get an NVMe / PCIe connected SSD instead of a SATA connected drive. The speed is just so much better.
This forum thread has some benchmark numbers that you can look at:
https://forums.freenas.org/index.ph...-and-finding-the-best-slog.63521/#post-459239
I will go ahead and say that an Intel DC S3500, a SATA connected SSD, scored 56847; whereas a Hitachi SAS connected SSD scored 15591; and an Intel DC P3700, a PCIe connected SSD, scored 7465... Those numbers are μs/IO (microseconds per I/O), and lower is better. You might want to look at the thread, because there are some other drives tested there, and I hope to be testing an Optane drive later this year; a different one than the one in the referenced thread.
It comes down to your budget and the need. If you know that the writes will be sync writes, then get a SLOG of some kind, but keep in mind that it can be a speed limit on performance.
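One way to check rather than guess: if your FreeNAS install includes the zilstat script (it should be there on recent versions, as far as I know), watch ZIL activity from the shell while a test Veeam job runs. A rough sketch:

Code:
# report ZIL (sync write) activity every 10 seconds while a test backup is running;
# counters that stay at zero mean the workload is effectively async and a SLOG won't help
zilstat 10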
 

Chris Moore

Hall of Famer
Joined
May 2, 2015
Messages
10,080
Also, I've found we can use 2x SATA DOMs for the OS, so I'm going to do that (no spare space in the case for additional SSDs).
If the vendor says that there is no room in the case for internal SSDs, I would question their knowledge of the Supermicro product line. There is an optional bracket that allows you to mount drives internally in the 4U 24-bay chassis.
Seeing is believing:
[Attached images: Supermicro SC846E16-R1200B boot drives.JPG, SuperMicro JBOD.PNG]
Let me know if you need the part number.
Here is a link to it instead:
https://www.newegg.com/Product/Product.aspx?Item=N82E16816101828
 
Last edited:

C2Opps

Dabbler
Joined
Jun 11, 2018
Messages
25
I went back to the first post, and seeing as you are using this as a backup server, I am not sure you even need SLOG drives.
From your first post:
I have never used Veeam, so I can't say how it works, but I would not expect it to be doing sync writes. If it is synchronous, then you would want a SLOG; if not, then there is probably no point in it.
Since you are buying a new server, I would suggest that you get an NVMe / PCIe connected SSD instead of a SATA connected drive. The speed is just so much better.
This forum thread has some benchmark numbers that you can look at:
https://forums.freenas.org/index.ph...-and-finding-the-best-slog.63521/#post-459239
I will go ahead and say that an Intel DC S3500, a SATA connected SSD, scored 56847; whereas a Hitachi SAS connected SSD scored 15591; and an Intel DC P3700, a PCIe connected SSD, scored 7465... Those numbers are μs/IO (microseconds per I/O), and lower is better. You might want to look at the thread, because there are some other drives tested there, and I hope to be testing an Optane drive later this year; a different one than the one in the referenced thread.
It comes down to your budget and the need. If you know that the writes will be sync writes, then get a SLOG of some kind, but keep in mind that it can be a speed limit on performance.

Thanks for all the detailed info, Chris! I will read up more on the thread, and on the method Veeam Backup uses for writes when it is using a Linux server and storing data via NFS, tomorrow, as it's a bit late here.

Just one other quick question though, regarding "the SLOG can be the speed-limiting factor": if I find out the Veeam writes are asynchronous, could forcing them to be synchronous (with a decent SLOG) improve write performance, or will async writes always be quicker left as they are than forced to be sync, no matter what SLOG performance you have?

Also, the ESXi <----> FreeNAS (via NFS) traffic that we have will definitely be synchronous, although this isn't the primary function of the FreeNAS box.

To note, I think the case you posted is the one the vendor had for this build - the 2 external drive bays were in my initial plan for the SLOG drives. I'll ask the vendor about internal drives tomorrow too.
 

kdragon75

Wizard
Joined
Aug 7, 2016
Messages
2,457
Async will always be faster.
 

Chris Moore

Hall of Famer
Joined
May 2, 2015
Messages
10,080
if I find out the Veeam writes are asynchronous
If they are async, there is no need to spend the money to buy a SLOG device. The SLOG device is only needed when you are trying to improve the speed of sync writes. Like @kdragon75 said, async is faster, so forcing it to be sync if it does not need to be is not helping your situation.
Also, the ESXi <----> FreeNAS (via NFS) traffic that we have will definitely be synchronous, although this isn't the primary function of the FreeNAS box.
If you have a sync workload, then you would need a SLOG device. I have seen other people show as much as a 5x improvement in write speed. Depending on the exact nature of the work, your results will vary.
To note, I think the case you posted is the one the vendor had for this build - the 2 external drive bays were in my initial plan for the SLOG drives. I'll ask the vendor about internal drives tomorrow too.
There is actually a space to put a second one of the internal brackets, but it can be a tight fit depending on what else is in the case.
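And if you do end up needing the SLOG, attaching a mirrored log vdev to an existing pool is a one-line operation. Just a sketch - "tank" and the da1/da2 device names are placeholders, and on FreeNAS you would normally do this through the GUI (Storage) rather than the shell:

Code:
# add a mirrored SLOG to an existing pool
zpool add tank log mirror da1 da2
# verify: a "logs" section containing the mirror should now show up
zpool status tank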
 

rvassar

Guru
Joined
May 2, 2018
Messages
972
I'll just make one comment here: NFS should for the most part be sync (aka O_SYNC per the standard C library); anything else is asking for trouble. That's kind of a long-standing rule going way back to NFSv2 and the original NFS accelerator, the Sun X1021A SBus Prestoserve card. I understand that some more modern filesystems accommodate failure better (a la VMFS5), but in most cases you are saving the hypervisor's virtual file system while throwing the individual VMs' filesystem consistency under the bus.

The SLOG is kind of the modern version of a Prestoserve card.
 

Chris Moore

Hall of Famer
Joined
May 2, 2015
Messages
10,080
I'll just make one comment here: NFS should for the most part be sync (aka O_SYNC per the standard C library); anything else is asking for trouble.
The SLOG is kind of the modern version of a Prestoserve card.
Great.
Just one other quick question though, regarding "the SLOG can be the speed-limiting factor": if I find out the Veeam writes are asynchronous, could forcing them to be synchronous (with a decent SLOG) improve write performance, or will async writes always be quicker left as they are than forced to be sync, no matter what SLOG performance you have?
That means you do need a SLOG device, and the faster it is, the better the performance of the storage will be.
 

Chris Moore

Hall of Famer
Joined
May 2, 2015
Messages
10,080
Just one other quick question though, regarding "the SLOG can be the speed-limiting factor"
When I say it can be a speed-limiting factor... here is what I am thinking. The maximum throughput you could ever get (after overhead) with a SATA SSD would be about equivalent to 4 Gig network speed. That would likely be fine if you only have 1 Gig networking, but if you go to 10 Gig networking, you will be limited. The speed of the IO is more of a factor if you are trying to run a VM from the storage, because the VM's responsiveness would be slowed by the storage. With your use for backups, that may not be nearly as much of a concern.
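Rough numbers behind that comparison: SATA III signals at 6 Gbit/sec on the wire, and after 8b/10b encoding and protocol overhead a good SATA SSD tops out at roughly 500-550 MB/sec of payload, which works out to about 4 to 4.5 Gbit/sec - comfortably above a 1 Gig link, but well short of 10 Gig line rate.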
 

rvassar

Guru
Joined
May 2, 2018
Messages
972
That means you do need a SLOG device and the faster it is, the better the performance of the storage will be.

You can adjust the ZFS sync property. That is an option. If you have a solid UPS, and are wired up to shut down on power fail, you might be willing to accept a little risk.

Code:
zfs set sync=disabled <pool>/<dataset>


I would be very unlikely to run such a system in general production. But it does have its use cases.
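If you do try it, it is easy to check what a dataset is currently set to and put it back to the default afterwards (same placeholder names as above):

Code:
zfs get sync <pool>/<dataset>            # reports standard, always, or disabled
zfs set sync=standard <pool>/<dataset>   # restore the default sync behavior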
 

C2Opps

Dabbler
Joined
Jun 11, 2018
Messages
25
the most part be sync (aka O_SYNC per the standard C library); anything else is asking for trouble. That's kind of a long-standing rule going way back to NFSv2 and the original NFS accelerator, the Sun X1021A SBus Prestoserve card. I understand that some more modern filesystems accommodate failure better (a la VMFS5), but in most cases you are saving the hypervisor's virtual file system while throwing the individual VMs' filesystem consistency under the bus.

The SLOG is kind of the modern version of a Prestoserve card.
Hmm, Veeam have said that with their Linux data mover / repository server saving data to NFS storage, the writes are asynchronous. They also didn't say what would happen during a storage server reboot or loss of network connectivity, other than that the backup could be at risk and they would suggest a health-check job. Are you saying that even though Veeam would save asynchronously by default, I should force it to be synchronous?
Having said that, even if I did force sync for Veeam using FreeNAS's setting, I'm not sure that would mean the backup files would be OK in the case of a FreeNAS failure or loss of network connectivity, as I think Veeam jobs don't keep a log of what they are doing between runs to fix a previously broken run (or whether that would even be possible).
 

rvassar

Guru
Joined
May 2, 2018
Messages
972
Hmm, Veeam have said that with their Linux data mover / repository server saving data to NFS storage, the writes are asynchronous. They also didn't say what would happen during a storage server reboot or loss of network connectivity, other than that the backup could be at risk and they would suggest a health-check job. Are you saying that even though Veeam would save asynchronously by default, I should force it to be synchronous?

I suspect we have a nomenclature mix-up here. The backup job is asynchronous, meaning it does not happen in parallel with whatever writes are occurring on the primary machine. A la... a 3am backup is asynchronous with respect to the work done 9 to 5 the day before.

NFS is generally O_SYNC by default. That's a specific flag setting on a filehandle operation in the standard C library. It tells the OS to not optimize the write operations. Don't delay, don't aggregate them together, in fact don't even return to my code execution until it has been safely committed to the spinning platter of a disk. It is a safety flag used to ensure data security & consistency. Now, in NFS, that storage device is on another machine, on a network that could be in the adjacent rack, or halfway across the country. A write is committed, a packet gets formed, checksummed, placed in the network queue, transmitted some distance, received, checksummed again, verified for auth / permissions, etc... And scheduled on to a device that must be available, ready, rotate into position, and finally the data arrives safely on the storage medium. It is not until this point in that NFS write call that the O_SYNC flag has been satisfied. The server then generates a NFS "ack" packet and transmits it back to the client host, which receives it, and can finally return to code execution.

That's why NFS is so slow. That's why, even 25 years ago, Sun was inventing battery-backed RAM write-log cards to accelerate the process. These days things are more tunable. You can adjust the O_SYNC behavior of a ZFS filesystem if the degree of risk suits you. For a nightly backup, with a snapshot on the receiving host prior to start, it's probably entirely reasonable to disable it. For the ZFS dataset hosting a bank's Oracle DB, not acceptable.

-Rob
 

C2Opps

Dabbler
Joined
Jun 11, 2018
Messages
25
I suspect we have a nomenclature mix-up here. The backup job is asynchronous, meaning it does not happen in parallel with whatever writes are occurring on the primary machine. A la... a 3am backup is asynchronous with respect to the work done 9 to 5 the day before.

NFS is generally O_SYNC by default. That's a specific flag setting on a filehandle operation in the standard C library. It tells the OS to not optimize the write operations. Don't delay, don't aggregate them together, in fact don't even return to my code execution until it has been safely committed to the spinning platter of a disk. It is a safety flag used to ensure data security & consistency. Now, in NFS, that storage device is on another machine, on a network that could be in the adjacent rack, or halfway across the country. A write is committed, a packet gets formed, checksummed, placed in the network queue, transmitted some distance, received, checksummed again, verified for auth / permissions, etc... And scheduled on to a device that must be available, ready, rotate into position, and finally the data arrives safely on the storage medium. It is not until this point in that NFS write call that the O_SYNC flag has been satisfied. The server then generates a NFS "ack" packet and transmits it back to the client host, which receives it, and can finally return to code execution.

That's why NFS is so slow. That's why, even 25 years ago, Sun was inventing battery-backed RAM write-log cards to accelerate the process. These days things are more tunable. You can adjust the O_SYNC behavior of a ZFS filesystem if the degree of risk suits you. For a nightly backup, with a snapshot on the receiving host prior to start, it's probably entirely reasonable to disable it. For the ZFS dataset hosting a bank's Oracle DB, not acceptable.

-Rob
Hi Rob, thanks for the clear explanation of the NFS path. I would have assumed he meant ASYNC as in the NFS writes, not that the backup happens at 3am when the data was saved by workers during the day. Below is a direct quote from Veeam support:
"By design Veeam Data Mover (target agent) uses asynchronous mode to send data to NFS export mounted on Linux server."

I will further clarify whether he definitely means that O_SYNC is set to off for this (I'm assuming here that Veeam can mount an NFS share and set that O_SYNC option to off - or is it something that FreeNAS has set that can't be changed by an NFS client?)

I have the feeling that, with either SYNC or ASYNC, if the job is interrupted for whatever reason the Veeam backup chain risks corruption (as it is modifying the previous backup 'restore points' in the chain for the VM it is currently backing up whilst backing up fresh data).

-C2
 
Last edited:

rvassar

Guru
Joined
May 2, 2018
Messages
972
I will further clarify whether he definitely means that O_SYNC is set to off for this (I'm assuming here that Veeam can mount an NFS share and set that O_SYNC option to off - or is it something that FreeNAS has set that can't be changed by an NFS client?)

In FreeNAS it is a property of the ZFS dataset, not under control of the client. When set to the default, FreeNAS should not send the NFS "ack" until the data is committed. Whether the client cares or not, and runs ahead with a bunch of outstanding unacknowledged operations, is anyone's guess.

It's more likely they're building a backup "blob" of data, in whatever format they use, in a way that does not halt execution. Pause VM -> snapshot -> unpause VM -> generate backup from snapshot. That's essentially asynchronous to the VM.
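And since sync is a per-dataset property, you could relax it only on the backup target and leave anything ESXi touches at the default. A sketch with made-up dataset names:

Code:
# hypothetical dataset names - adjust to your pool layout
zfs set sync=disabled tank/veeam-repo        # backup target: trade safety for speed
zfs get sync tank/veeam-repo tank/esxi-nfs   # the ESXi datastore stays at "standard"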

-Rob
 

C2Opps

Dabbler
Joined
Jun 11, 2018
Messages
25
In FreeNAS it is a property of the ZFS dataset, not under control of the client. When set to the default, FreeNAS should not send the NFS "ack" until the data is committed. Whether the client cares or not, and runs ahead with a bunch of outstanding unacknowledged operations, is anyone's guess.

It's more likely they're building a backup "blob" of data, in whatever format they use, in a way that does not halt execution. Pause VM -> snapshot -> unpause VM -> generate backup from snapshot. That's essentially asynchronous to the VM.

-Rob
Ah OK - that's the critical bit of info then - NFS writes are synchronous by default, and therefore we need a SLOG :smile:
 