striped-mirrors, striped-mirrors, striped-mirrors, striped-mirrors...
yes, I know: striped mirrors!
Now that we are clear on that:
I have a pool of two striped 6-disk RAIDZ vdevs, used as XenServer shared storage.
Each drive is a 300 GB SAS disk, and the server has 24 GB of RAM.
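For clarity, here is a sketch of how such a topology is laid out at creation time; the pool name ("tank") and the device names are hypothetical placeholders, not my actual ones:

    # Two 6-disk RAIDZ vdevs in one pool; ZFS stripes across the vdevs.
    # Pool and device names are made up for illustration.
    zpool create tank \
        raidz sda sdb sdc sdd sde sdf \
        raidz sdg sdh sdi sdj sdk sdl

    # Verify the layout:
    zpool status tank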
It's been working great until now. All the VMs are shared-hosting solutions, hosting thousands of websites, mostly open-source CMSs (WordPress, Joomla, Drupal, etc.).
Why do I mention this? To emphasise that the workload is mainly random-read intensive.
The heavy load actually comes at backup time, split into two main actions (a rough sketch of both follows the list):
- Each VM (not necessarily at the same time) makes a tar of each USER's /home/USER directory; again, lots of small reads (significantly more than the actual write of the tar.gz file). Multiply this by the number of USERs (and a few VMs), and it becomes a very long (read-heavy) process. These tar.gz files are stored on an NFS mount (outside of the VM's VDI) but still on the same ZFS pool; this is done to have a bit more of an "unlimited" storage.
- [Mostly not at the same time] each VM's backup (from the NFS mount) is rsync'd to an external ZFS server (a different server, but still on the same local network).
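To make the two actions concrete, here is roughly what they amount to; the paths, host name, and the per-user loop are hypothetical stand-ins, not my exact scripts:

    # Action 1 (inside each VM): tar each user's home onto the NFS mount.
    # /backup is a placeholder for the NFS mount point.
    for dir in /home/*/; do
        user=$(basename "$dir")
        tar czf "/backup/${user}.tar.gz" -C /home "$user"
    done

    # Action 2 (later): push the finished archives to the external ZFS box.
    # Host name and destination path are made up.
    rsync -a /backup/ backupserver:/tank/backups/vm1/

Note that reading the archives back for the rsync hits the same pool that is still serving the VMs, so both actions are read-heavy on the one pool.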
As it seems, outside of backup time everything works great, and yes, the RAIDZ does provide enough IOPS/throughput for the VMs. But at backup time, the VMs seem to need a little more juice.
Yesterday and today I tested in the 'normal', not-under-stress situation, and I get 80-110 MB/s (as I should from a 1 Gbps connection).
I also created a new VM with a minimal CentOS on different hosts connected to the same storage, and I get the same good results (80-110 MB/s)!
Any idea what I can do (or at least test) at backup time so the pool can actually handle the stress?
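If it helps the diagnosis, this is the kind of data I can capture during the next backup window and post here; "tank" is again a placeholder for the pool name:

    # Per-vdev throughput/IOPS, sampled every 5 seconds during the backup:
    zpool iostat -v tank 5

    # Physical-disk view (flags vary by OS: 'iostat -x 5' on Linux,
    # 'iostat -xn 5' on Solaris/illumos):
    iostat -x 5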