Web server dies


orddie

Contributor
Joined
Jun 11, 2016
Messages
104
Hi all!

After about a week of use, the server's management web interface stops responding and I'm no longer able to ping anything outside of the storage VLAN.

I'm also noticing increased storage latency in VMware's reporting: what used to be < 10 ms is now around 50 ms, with spikes into the 300 ms range.

All storage is presented as iSCSI targets to VMware.
Local storage is 4x 1TB WD Reds in pool 1 and 2x 1TB WD Blues in pool 2.

Pool 1 is 80% full; pool 2 is 50% full.

The server has 32GB of RAM.
Total usable storage is 2TB after RAID.
The install was done onto a USB stick in the server. Is it possible the flash has gone bad? This is the second time this has happened.
 

Spearfoot

He of the long foot
Moderator
Joined
May 13, 2015
Messages
2,478
Your ASRock Z170M Pro4S is more of a gamer/prosumer system and isn't particularly well-suited to use with FreeNAS. It doesn't support ECC memory, for one (big) thing...

You may have a USB stick problem, though I don't see how the system would exhibit the symptoms you describe -- GUI doesn't respond but it still serves files -- if that were the case. What brand and size of USB stick are you using? Are you plugging it into a USB 2.0 port? (Because USB 3.0 can be 'bad juju' on FreeNAS/FreeBSD).

Also, iSCSI performance drops drastically when it doesn't have plenty of free space to work with, and your pools are 80% and 50% full. This alone probably explains the increased storage delay you're seeing.
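
If you want to put a number on that, pool capacity (and fragmentation) is quick to check from the shell; something like this lists it for every pool at a glance:

Code:
[root@freenas] ~# zpool list -o name,size,alloc,free,cap,frag,health

The CAP column is the one to watch; block storage (iSCSI zvols) needs far more free-space headroom than a plain file share does.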

You haven't told us how your pools are configured: mirrors? RAIDZ1,2,3? Are your disks running off the motherboard's SATA ports? Or an HBA?

You mentioned a storage VLAN... can you give us more details about your network setup?

In general, the more detailed information you provide, the more likely you are to get a response from the experts.
 

orddie

Contributor
Joined
Jun 11, 2016
Messages
104
Thanks for your responses thus far.

Would you recommend NFS over iSCSI?

RAIDZ2 is the config I went with for the pool of four 1TB disks. The disks are running off the motherboard SATA ports, with AHCI turned on.

The flash drive is currently in a USB 3.0 port (I will move it), but USB has been downclocked to USB 2.0 via the BIOS, as I understand it.

FreeNAS is connected via an access port on VLAN 80.
VMware is connected to the same switch and has an IP on the same VLAN, though VMware is tagging the traffic.
 

orddie

Contributor
Joined
Jun 11, 2016
Messages
104
Okay. I shut down all the VMs and rebooted the host. Lots of errors about a hard drive. When I logged in, I got the following error:

Code:
CRITICAL: June 11, 2016, 4:36 p.m. - The volume Vmware (ZFS) state is ONLINE: One or more devices has experienced an unrecoverable error. An attempt was made to correct the error. Applications are unaffected


Now a single disk's activity light is lit solid, and VMware is reporting drive latency of over 4500 ms.

Code:
[root@freenas] ~# zpool status -v
  pool: Vmware
state: ONLINE
status: One or more devices has experienced an unrecoverable error.  An
        attempt was made to correct the error.  Applications are unaffected.
action: Determine if the device needs to be replaced, and clear the errors
        using 'zpool clear' or replace the device with 'zpool replace'.
   see: http://illumos.org/msg/ZFS-8000-9P
  scan: resilvered 26.8M in 0h0m with 0 errors on Fri Jun 10 17:44:44 2016
config:

        NAME                                            STATE     READ WRITE CKSUM
        Vmware                                          ONLINE       0     0     0
          raidz2-0                                      ONLINE       0     0     0
            gptid/ea5df295-29cb-11e6-9efb-d05099822d44  ONLINE       0     0     0
            gptid/ebd1c10c-29cb-11e6-9efb-d05099822d44  ONLINE       0     0     0
            gptid/ec872aa9-29cb-11e6-9efb-d05099822d44  ONLINE       0     0   562
            gptid/ed407a2d-29cb-11e6-9efb-d05099822d44  ONLINE       0     0     0

errors: No known data errors

  pool: Workstations
state: ONLINE
  scan: none requested
config:

        NAME                                            STATE     READ WRITE CKSUM
        Workstations                                    ONLINE       0     0     0
          mirror-0                                      ONLINE       0     0     0
            gptid/9c00c1fa-2d0e-11e6-be1a-d05099822d44  ONLINE       0     0     0
            gptid/9cb2dfce-2d0e-11e6-be1a-d05099822d44  ONLINE       0     0     0

errors: No known data errors

  pool: freenas-boot
state: ONLINE
  scan: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        freenas-boot  ONLINE       0     0     0
          da0p2     ONLINE       0     0     0

errors: No known data errors
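
(Side note for anyone following along: to work out which physical disk that gptid belongs to, something along these lines works; the ada2 device name is only an example, use whatever the first command reports.)

Code:
[root@freenas] ~# glabel status | grep ec872aa9    # maps the gptid with the CKSUM errors to a device such as adaXp2
[root@freenas] ~# smartctl -i /dev/ada2            # then confirm that device's model and serial number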
 

gpsguy

Active Member
Joined
Jan 22, 2012
Messages
4,472
Please post the results of zpool status -v in code tags.
 

orddie

Contributor
Joined
Jun 11, 2016
Messages
104
I issued
Code:
[root@freenas] ~# zpool clear Vmware gptid/ec872aa9-29cb-11e6-9efb-d05099822d44


and started a scrub.

Code:
[root@freenas] ~# zpool status -v
  pool: Vmware
state: ONLINE
status: One or more devices has experienced an unrecoverable error.  An
        attempt was made to correct the error.  Applications are unaffected.
action: Determine if the device needs to be replaced, and clear the errors
        using 'zpool clear' or replace the device with 'zpool replace'.
   see: http://illumos.org/msg/ZFS-8000-9P
  scan: scrub in progress since Sat Jun 11 17:13:27 2016
        12.7M scanned out of 602G at 12.4K/s, (scan is slow, no estimated time)
        168K repaired, 0.00% done
config:

        NAME                                            STATE     READ WRITE CKSUM
        Vmware                                          ONLINE       0     0     0
          raidz2-0                                      ONLINE       0     0     0
            gptid/ea5df295-29cb-11e6-9efb-d05099822d44  ONLINE       0     0     0
            gptid/ebd1c10c-29cb-11e6-9efb-d05099822d44  ONLINE       0     0     0
            gptid/ec872aa9-29cb-11e6-9efb-d05099822d44  ONLINE       0     0   476  (repairing)
            gptid/ed407a2d-29cb-11e6-9efb-d05099822d44  ONLINE       0     0     0

errors: No known data errors

  pool: Workstations
state: ONLINE
  scan: none requested
config:

        NAME                                            STATE     READ WRITE CKSUM
        Workstations                                    ONLINE       0     0     0
          mirror-0                                      ONLINE       0     0     0
            gptid/9c00c1fa-2d0e-11e6-be1a-d05099822d44  ONLINE       0     0     0
            gptid/9cb2dfce-2d0e-11e6-be1a-d05099822d44  ONLINE       0     0     0

errors: No known data errors

  pool: freenas-boot
state: ONLINE
  scan: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        freenas-boot  ONLINE       0     0     0
          da0p2     ONLINE       0     0     0

errors: No known data errors
 

Spearfoot

He of the long foot
Moderator
Joined
May 13, 2015
Messages
2,478
Yikes! Lots of checksum errors on that third disk. It looks like it's going to take a long time to finish the scrub, so you may as well shut down any running VMs and turn off the CIFS, NFS, and iSCSI services until it completes.

It would be a good idea to run an extended SMART test on the drives after the scrub finishes.
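
A minimal sketch of that, assuming the suspect disk turns out to be ada2 (substitute whatever device the gptid actually maps to):

Code:
[root@freenas] ~# smartctl -t long /dev/ada2    # start the extended self-test; expect it to take a few hours on a 1TB drive
[root@freenas] ~# smartctl -a /dev/ada2         # afterwards, review the self-test log and the reallocated/pending sector counts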
 

orddie

Contributor
Joined
Jun 11, 2016
Messages
104
Yikes! Lots of checksum errors on that third disk. It looks like it's going to take a long time to finish the scrub, so you may as well shut down any running VMs and turn off the CIFS, NFS, and iSCSI services until it completes.

It would be a good idea to run an extended SMART test on the drives after the scrub finishes.


How does one drive make my entire SAN take a dive? I thought RAIDZ2 would allow a disk to fail and still keep on going.
 

Spearfoot

He of the long foot
Moderator
Joined
May 13, 2015
Messages
2,478
How does one drive make my entire SAN take a dive? I thought RAIDZ2 would allow a disk to fail and still keep on going.
The system is running, so it is still going, right? So your RAIDZ2 pool is behaving as expected. It looks like you might have a bad disk, but we won't know for certain without further tests.

I suggested stopping VMs and services because it seems to be taking a long time to finish the scrub you started; taking a load off the NAS might quicken the process.

EDIT: As far as your NAS 'taking a dive', that may be due to how full your pools are, as I mentioned above.

You might consider replacing the 4 x 1TB drives with larger capacity disks (one at a time per the disk replacement instructions) to increase the usable space in your pool. Because running at 80% utilization is killing performance.
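
(For what it's worth, the docs recommend doing replacements through the GUI so the FreeNAS config database stays in sync with the pool, but the raw ZFS steps underneath look roughly like this; the new-device name is a placeholder.)

Code:
[root@freenas] ~# zpool offline Vmware gptid/ec872aa9-29cb-11e6-9efb-d05099822d44   # take the suspect disk out of service
# ...physically swap in the larger drive, then resilver onto it...
[root@freenas] ~# zpool replace Vmware gptid/ec872aa9-29cb-11e6-9efb-d05099822d44 ada2
[root@freenas] ~# zpool set autoexpand=on Vmware   # lets the pool grow once every disk in the vdev has been upsized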
 

orddie

Contributor
Joined
Jun 11, 2016
Messages
104
The system is running, so it is still going, right? So your RAIDZ2 pool is behaving as expected. It looks like you might have a bad disk, but we won't know for certain without further tests.

I suggested stopping VMs and services because it seems to be taking a long time to finish the scrub you started; taking a load off the NAS might quicken the process.

EDIT: As far as your NAS 'taking a dive', that may be due to how full your pools are, as I mentioned above.

You might consider replacing the 4 x 1TB drives with larger capacity disks (one at a time per the disk replacement instructions) to increase the usable space in your pool. Because running at 80% utilization is killing performance.

I'm looking at adding more disks to the array. The disk finally went into REMOVED status.

I guess what I WAS expecting was that the RAID array would lose a disk and still keep working without a performance hit. CPU and disk I/O were still very low, so I'm not sure where the issue was coming from.

Would NFS be a better option for when the array becomes full, or is this a general RAIDZ2 thing?
 

Spearfoot

He of the long foot
Moderator
Joined
May 13, 2015
Messages
2,478
I'm looking at adding more disks to the array. The disk finally went into REMOVED status.
Well, you can't add disks to a RAIDZ2 array. You can replace them, like I described, to get more space.

orddie said:
I guess what I WAS expecting was that the RAID array would lose a disk and still keep working without a performance hit. CPU and disk I/O were still very low, so I'm not sure where the issue was coming from.
I thought I had explained that your full pool would cause performance to suffer.

orddie said:
Would NFS be a better option for when the array becomes full, or is this a general RAIDZ2 thing?
I'm not sure about the answer to that question; I can't say for certain that NFS won't take as much of a performance hit as iSCSI on a full pool. I use NFS for my VM datastore, but my needs are simple, and my pool isn't anywhere near being full. NFS is easier to administer, and I didn't see any performance difference when I compared it with iSCSI about a year ago when I built my first all-in-one.

The full pool problem isn't related to the pool topology, so RAIDZ2 isn't to blame in that regard. But using mirrors instead of RAIDZn will give the best performance for serving virtual machines.
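
(If you do want to try NFS for comparison, the ESXi side is a one-liner once you've shared a dataset over NFS from the FreeNAS GUI; the IP address, export path, and datastore name below are made up for illustration.)

Code:
~ # esxcli storage nfs add -H 10.0.80.5 -s /mnt/Vmware/nfs-datastore -v FreeNAS-NFS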
 

orddie

Contributor
Joined
Jun 11, 2016
Messages
104
Okay. Based on the advice here, I have moved off all the data, destroyed the old pool, replaced the bad drive, added 2 more drives, and created a new storage array with RAIDZ (which I understood to be a mirror). This gave 3.6 TB of usable space. I then created a new datastore for VMware via iSCSI with a max size of 2TB, leaving 1.5TB unused in the array.

Does this sound correct?
 

gpsguy

Active Member
Joined
Jan 22, 2012
Messages
4,472
Please do a "zpool status" (w/o quotes) and post the results in code tags.


Sent from my iPhone using Tapatalk
 

Spearfoot

He of the long foot
Moderator
Joined
May 13, 2015
Messages
2,478
Okay. Based on the advice here, I have moved off all the data, destroyed the old pool, replaced the bad drive, added 2 more drives, and created a new storage array with RAIDZ (which I understood to be a mirror). This gave 3.6 TB of usable space. I then created a new datastore for VMware via iSCSI with a max size of 2TB, leaving 1.5TB unused in the array.

Does this sound correct?
No, sir. I'm afraid you may have created a RAIDZ1 array, which is not mirrors. I'm pretty sure @gpsguy asked for the output of 'zpool status' to confirm.

This isn't a big deal, since you've backed up your data and are reconstructing your pool. If necessary, you can just destroy it (if it is indeed RAIDZ1) and recreate it as mirrors.
 

orddie

Contributor
Joined
Jun 11, 2016
Messages
104
Please do a "zpool status" (w/o quotes) and post the results in code tags.


Sent from my iPhone using Tapatalk
Code:
[root@freenas] ~# zpool status
  pool: HostStorage
state: ONLINE
  scan: none requested
config:

        NAME                                            STATE     READ WRITE CKSUM
        HostStorage                                     ONLINE       0     0     0
          raidz1-0                                      ONLINE       0     0     0
            gptid/64d90c17-30b6-11e6-b3aa-d05099822d44  ONLINE       0     0     0
            gptid/6577b9fe-30b6-11e6-b3aa-d05099822d44  ONLINE       0     0     0
            gptid/66649d83-30b6-11e6-b3aa-d05099822d44  ONLINE       0     0     0
          raidz1-1                                      ONLINE       0     0     0
            gptid/670a1ea5-30b6-11e6-b3aa-d05099822d44  ONLINE       0     0     0
            gptid/67bd61ad-30b6-11e6-b3aa-d05099822d44  ONLINE       0     0     0
            gptid/68bef85f-30b6-11e6-b3aa-d05099822d44  ONLINE       0     0     0

errors: No known data errors

  pool: freenas-boot
state: ONLINE
  scan: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        freenas-boot  ONLINE       0     0     0
          da0p2     ONLINE       0     0     0

errors: No known data errors


If I did this wrong, I have no idea how to make the mirror.
 

Spearfoot

He of the long foot
Moderator
Joined
May 13, 2015
Messages
2,478
Code:
[root@freenas] ~# zpool status
  pool: HostStorage
state: ONLINE
  scan: none requested
config:

        NAME                                            STATE     READ WRITE CKSUM
        HostStorage                                     ONLINE       0     0     0
          raidz1-0                                      ONLINE       0     0     0
            gptid/64d90c17-30b6-11e6-b3aa-d05099822d44  ONLINE       0     0     0
            gptid/6577b9fe-30b6-11e6-b3aa-d05099822d44  ONLINE       0     0     0
            gptid/66649d83-30b6-11e6-b3aa-d05099822d44  ONLINE       0     0     0
          raidz1-1                                      ONLINE       0     0     0
            gptid/670a1ea5-30b6-11e6-b3aa-d05099822d44  ONLINE       0     0     0
            gptid/67bd61ad-30b6-11e6-b3aa-d05099822d44  ONLINE       0     0     0
            gptid/68bef85f-30b6-11e6-b3aa-d05099822d44  ONLINE       0     0     0

errors: No known data errors

  pool: freenas-boot
state: ONLINE
  scan: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        freenas-boot  ONLINE       0     0     0
          da0p2     ONLINE       0     0     0

errors: No known data errors


If I did this wrong, I have no idea how to make the mirror.
Ah ha! You made a pair of RAIDZ1 vdevs in your pool; what you want is 3 mirrored vdevs, each vdev comprising 2 drives.
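
(Rough capacity arithmetic for 6 x 1TB disks: two 3-disk RAIDZ1 vdevs each give up one disk to parity, so 2 x (3 - 1) x 1TB = 4TB, roughly the 3.6TiB you saw; three 2-way mirrors store only one copy's worth, so 3 x 1TB = 3TB, roughly 2.7TiB.)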

Here's a link to the relevant section of the documentation:
http://doc.freenas.org/9.10/freenas_storage.html#volume-manager

I know it can be confusing and frustrating! Ask me how I know... :)

The basic process is to create a volume (pool), giving it a name (typically 'tank', or 'HostStorage' in your case), and using only 2 drives to create a mirror in the 'Volume Layout' dropdown. (You will have to destroy your existing pool to do this.)

Then you would modify/extend the newly created volume, giving its name in the 'Volume to extend' dropdown, and then selecting 2 more disks in a 'mirror' layout. Repeat this process again for the last 2 of your 6 disks, and you should now have a volume/pool made up of 3 mirrored pairs.
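
(Purely as a cross-check of what the end result should look like: you'd build it through the Volume Manager as above so the FreeNAS middleware knows about it, but the equivalent raw ZFS layout would be created like this, with the daX device names as placeholders.)

Code:
[root@freenas] ~# zpool create HostStorage mirror da1 da2    # first mirrored pair
[root@freenas] ~# zpool add HostStorage mirror da3 da4       # extend with a second pair
[root@freenas] ~# zpool add HostStorage mirror da5 da6       # and a third, for a pool of three 2-way mirrors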

At this point it doesn't matter if you goof something up and have to start over, so don't be afraid to give it a try!
 

orddie

Contributor
Joined
Jun 11, 2016
Messages
104
Ah ha! You made a pair of RAIDZ1 vdevs in your pool; what you want is 3 mirrored vdevs, each vdev comprising 2 drives.

Here's a link to the relevant section of the documentation:
http://doc.freenas.org/9.10/freenas_storage.html#volume-manager

I know it can be confusing and frustrating! Ask me how I know... :)

The basic process is to create a volume (pool), giving it a name (typically 'tank', or 'HostStorage' in your case), and using only 2 drives to create a mirror in the 'Volume Layout' dropdown. (You will have to destroy your existing pool to do this.)

Then you would modify/extend the newly created volume, giving its name in the 'Volume to extend' dropdown, and then selecting 2 more disks in a 'mirror' layout. Repeat this process again for the last 2 of your 6 disks, and you should now have a volume/pool made up of 3 mirrored pairs.

At this point it doesn't matter if you goof something up and have to start over, so don't be afraid to give it a try!


Okay, I did this and got 2.7 TB of usable space.

I created a 2TB zvol (77% usage) and presented it to VMware as iSCSI. Do you think I'll run into the same issue again because of the 77% usage?
Code:
[root@freenas] ~# zpool status
  pool: HostStorage
state: ONLINE
  scan: none requested
config:

        NAME                                            STATE     READ WRITE CKSUM
        HostStorage                                     ONLINE       0     0     0
          mirror-0                                      ONLINE       0     0     0
            gptid/65e9177d-30bf-11e6-b3aa-d05099822d44  ONLINE       0     0     0
            gptid/66aef58f-30bf-11e6-b3aa-d05099822d44  ONLINE       0     0     0
          mirror-1                                      ONLINE       0     0     0
            gptid/7744bf5b-30bf-11e6-b3aa-d05099822d44  ONLINE       0     0     0
            gptid/77e87ba8-30bf-11e6-b3aa-d05099822d44  ONLINE       0     0     0
          mirror-2                                      ONLINE       0     0     0
            gptid/877cd9e3-30bf-11e6-b3aa-d05099822d44  ONLINE       0     0     0
            gptid/88283e93-30bf-11e6-b3aa-d05099822d44  ONLINE       0     0     0

errors: No known data errors

  pool: freenas-boot
state: ONLINE
  scan: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        freenas-boot  ONLINE       0     0     0
          da0p2     ONLINE       0     0     0

errors: No known data errors
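
(To keep an eye on that as the datastore fills, these two commands show the pool's overall utilization and how much space the 2TB zvol is actually consuming.)

Code:
[root@freenas] ~# zpool list HostStorage
[root@freenas] ~# zfs list -r -o name,used,avail,refer,volsize HostStorage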
 