fuzzbawl
Cadet
- Joined
- Apr 5, 2014
- Messages
- 5
My adventure started when I needed to build a storage array for work. We needed at least 3TB of storage for our current needs plus some room to expand, it needed to be reasonably fast (we're using ESXi), and it needed to be reliable (ZFS scrubs were very attractive for that, along with RAID-Z2/Z3). So, armed with my best information-gathering tools (my eyeballs), I set out to do what any good server administrator does when deploying something they've never used before: read the forums.
After several days of poring over forum posts, guides, PDF files, and who knows what else, I decided on a design:
- Supermicro X9DRD-7LN4F-JBOD (Has an LSI SAS2308 in IT mode) motherboard
- Supermicro CSE-846BE16-R1K28B chassis
- 64GB of ECC DDR3 RAM
- 2x Kingston KC300 60GB SSD
- 2x Intel X520-DA2 dual-port 10-Gigabit SFP+ Ethernet NICs
- 8x Seagate Constellation 7200rpm SAS2 drives with 64MB cache (to start with)
My intent was to use NFS instead of iSCSI. There are a multitude of reasons I want to avoid iSCSI, mainly the potential for data loss if something completely out-of-the-ordinary were to happen in the data center and cause an improper shutdown, plus a bunch of other reasons I won't list right now. I fired up FreeNAS and configured the array as RAID-Z2 (4 drives per vdev, 2 vdevs). I fired up ESXi, connected my hosts, and attached the shiny new NFS export. I ran some initial tests and checked "zilstat" to make sure I actually needed a SLOG. The screen was full of numbers (no zeros), so as I understand it, yes, I needed a SLOG. I added the SSDs to the chassis and configured them as a mirrored SLOG in FreeNAS. Everything was going fine, but things seemed a little slow. I did a bunch more reading, got worried that jgreco or cyberjock would somehow jump through the screen and slap me for even considering sync=disabled as a production solution, and decided to do more reading instead. After a couple of hours my head was swimming with information. I tried a few other things, but nothing changed my results.
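For anyone who wants the layout above in concrete terms, it would look roughly like this from the command line. This is only a sketch: the pool name and device names (da0–da7, ada0/ada1) are placeholders I made up, and in practice the FreeNAS GUI does the equivalent for you.

```shell
# Sketch of the pool described above: 2x RAID-Z2 vdevs of 4 drives each.
# "tank" and the da*/ada* device names are hypothetical.
zpool create tank \
    raidz2 da0 da1 da2 da3 \
    raidz2 da4 da5 da6 da7

# Mirrored SLOG on the two SSDs
zpool add tank log mirror ada0 ada1

# Leave sync writes honest for the NFS export (this is the default anyway)
zfs set sync=standard tank
```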
My bonnie++ tests were coming back at roughly 22 MB/s block writes, 25 MB/s rewrites, 394 MB/s reads, and 2440 random seeks. I decided to try something: I removed the SLOG and ran the test again. That dropped my writes to the 5-7 MB/s everyone else was seeing without a SLOG, so I thought maybe something else was going on. I tested my Ethernet cards with iperf and got 9.89 Gbit/s on both short 10-second tests and full 1-hour tests. The cards are fine. So then it just had to be a problem with ZFS, I thought. I took another day and read more material, poring over post after post until the middle of the night. Then I woke up and spent yet a third day going over documentation and forums.
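In case anyone wants to reproduce the network check, this is roughly what I ran. The IP address is a placeholder for the storage interface on my setup:

```shell
# On the FreeNAS box: start an iperf server
iperf -s

# On the client side: a short 10-second run, then a full 1-hour run
# (192.168.1.50 is a made-up address standing in for the storage NIC)
iperf -c 192.168.1.50 -t 10
iperf -c 192.168.1.50 -t 3600 -i 60
```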
What I learned is that I have much more to learn about ZFS. But I found a few utilities and decided to run them to see what was going on. I fired up my ESXi VM and ran bonnie++ again, this time while watching "zpool iostat -v 5" and "gstat". I noticed that the SSDs averaged 80% busy at fewer than 1000 ops/s. If I understand everything I've read correctly, that would indicate I chose my SSDs poorly. I thought the Kingston KC300 would be a decent choice, but it appears the Intel DC S3700 or the Samsung 840 Pro would have been better.
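For anyone wanting to watch the same numbers, this is the sort of monitoring I had running in separate sessions during the bonnie++ pass. Pool and SSD device names are placeholders for my setup:

```shell
# Per-vdev bandwidth and IOPS, refreshed every 5 seconds
# ("tank" is a placeholder pool name)
zpool iostat -v tank 5

# Per-disk %busy and ops/s; the filter matches the SLOG SSDs
# (ada0/ada1 are placeholder device names)
gstat -f 'ada[01]'
```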
So, my question is this: am I on the right track here? I definitely don't want to set sync=disabled, but I'm pretty confident I just need a faster SSD. I would opt for a Fusion-io (or similar) card, but it's just not in the budget this time around. Once we get everyone addicted to FreeNAS, then perhaps we can upgrade :)
I've attached my "zpool iostat -v" output.