voyager529
Dabbler
Joined: Jul 27, 2011
Messages: 36
Hey guys,
I wasn't sure whether this should go in the hardware forum or someplace else, so apologies to the mods if they need to move it...
At present, my FreeNAS box is a Gigabyte mobo with 6GB of RAM and an AMD Sempron 140 (I think) processor. The motherboard has 8 onboard SATA ports, and they're all occupied with 500GB Western Digital Caviar Blue drives. The whole setup is roughly six months old, except the mobo, which I got in the last month (it replaced a mobo with 4 SATA ports and a PCI SATA adapter). All eight drives are in the same storage pool, which has six datasets in it. I'm running the RTM release of FreeNAS 8.0 and anxiously awaiting a non-beta release of 8.0.1...well, that and buckling down to purchase a 2GB USB drive for the task.
I noticed today that my Windows XP machine at home wasn't communicating with the FreeNAS. At first I thought it simply didn't like the network drive mapping, so I tried remapping the drive. That didn't work, so I tried getting to the NAS via the UNC path. Still nothing. I had equal success (or lack thereof) when attempting to access the web UI. My first thought was "hey, maybe the machine got powered off somehow." That theory was debunked when I was able to ping it successfully, so I fired up Xshell and SSH'd into the box without a problem. My next thought was the same as any predominantly-Windows user's - when in doubt, reboot. If nothing else, I figured it was the simplest way to restart both the CIFS and lighttpd servers. The box came back up, I SSH'd in again, but still no web UI or CIFS, though I could successfully view and transfer files via SFTP (the one where you FTP over SSH; I can never keep SFTP and FTPS straight).
I finally figured out that a drive was bad when I tried running a zpool scrub and it said the pool was in a degraded state because one of the drives wasn't available. The scrub itself did come back clean, though. Admittedly I haven't been home yet to verify that it isn't simply a loose cable; that seems rather unlikely, and even if it is, I'd like to treat this as a training exercise so I know what I'm doing when a drive *does* fail. So yes, if I get home and it's a loose cable, I'll take all the ZoMg N00b RtFm!!111 flak you'd like to give me. Until then, let's assume that a disk has legitimately failed...
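For anyone following along, this is roughly what I ran over SSH (I'm using "tank" as the pool name here since I'm going from memory):

zpool status -v tank   # reported the pool DEGRADED with one device unavailable
zpool scrub tank       # kicked off the scrub
zpool status tank      # scrub completed without errors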
1) Is there a known correlation between a failed disk and losing all access except SSH? A search through the forums for 'failed disk' and 'failed lighttpd' didn't yield anything obviously useful, and restarting the lighttpd daemon didn't seem to help. Is there a separate set of steps needed to get back into the web UI, or am I stuck at SSH-only until the drive is replaced?
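For what it's worth, this is how I restarted lighttpd; I'm assuming the rc script lives in the standard FreeBSD spot, so if FreeNAS keeps it elsewhere, that may be part of my problem. I figure the Samba script would work the same way, though I haven't tried it yet:

/usr/local/etc/rc.d/lighttpd restart   # web UI still unreachable afterward
/usr/local/etc/rc.d/samba restart      # haven't tried this one; script name is a guess on my part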
2) I've heard rumors that 8.0 RTM doesn't swap out failed disks very gracefully and that I'll need 8.0.1 to do it right. Given that I'm REALLY not a fan of running with one failed disk in the array until 8.0.1 officially releases, is there any word from the beta users out there on whether 8.0.1 is "stable enough" to be used, given the situation?
2a) Everything I've been reading indicates that I'd need a 2GB flash drive to do the update; I'm presently using a 1GB drive. The upgrade procedures indicate that my best bet is to export my config, do a fresh install on the new stick, then re-import the config. If that's the case, how do I export it from SSH, and how happy will the new install be importing a degraded ZFS volume?
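In case it helps frame the question, here's what I'm picturing for the export step. I'm assuming the config lives in the SQLite database at /data/freenas-v1.db - that path is a guess on my part, so please correct me if it's wrong:

scp root@<nas-ip>:/data/freenas-v1.db ./freenas-config-backup.db   # pull the config DB off the NAS over SSH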
3) How do I display the serial numbers of the drives from the command line? I can show the device names using the zpool status command, but not the serials - since they're all identical drives, I'll need the serials to tell them apart.
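From some searching it looks like smartctl might do the trick; is something like this loop roughly right? (I'm assuming the drives show up as ada0 through ada7, which may not match my system.)

for disk in ada0 ada1 ada2 ada3 ada4 ada5 ada6 ada7; do
  echo "== $disk =="
  smartctl -i /dev/$disk | grep -i 'serial number'   # -i prints the drive's identity info, including the serial
done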
4) Another post about a failed drive seemed to indicate that the most desirable course of action is to plug in a new disk alongside the failed one, let the pool rebuild, and then remove the bad disk - but I don't have a spare SATA port to do that at present. My two options are to either pull the bad disk, put the new disk in its place, and let the array rebuild (as is the case with most enterprise-grade storage devices), or to pull my PCI SATA card out of retirement, install the new drive on it, let it rebuild, then pull out both the SATA card and the bad drive together. Which is more desirable?
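If it matters for the answer, here's the ZFS side of option one as I understand it; "tank" and "ada3" are stand-ins, since I don't yet know which device actually died:

zpool offline tank ada3   # take the dead disk out of service (if ZFS hasn't done so already)
# power down and physically swap the bad drive for the new one on the same port
zpool replace tank ada3   # resilver onto the new disk sitting in the same slot
zpool status tank         # keep an eye on the resilver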
I have the disk on order from Newegg; I'll likely have it either tomorrow or the following day. Obviously I intend to do a warranty swap on the old disk and leave the replacement on the shelf as a spare in the event this happens again. Thanks in advance for your help; I'll be sure to document what I do and whether it works or not.
Joey