So, a little back story. I got a new HBA (an M1015 in IT mode), yay. I shut down all services relying on the pool and exported it. I then shut down the box, installed the new card, and hooked my drives up to the new HBA. Fired it back up and did an Auto Import. The pool seemed to be exactly the way it was before. Did some testing and was getting noticeably worse performance than before. Fast forward to today: while searching for the performance issue, I noticed that my cache device looks like it's no longer a cache device.
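For reference, the export/import dance was roughly this (the GUI Auto Import does the import step for you; pool name as in the status output below):
Code:
# stop services using the pool, then:
zpool export Storage
# shut down, swap the HBA, boot, then (or just use Auto Import in the GUI):
zpool import Storage
Anyway, here's where the pool stands now: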
Code:
[root@freenas] ~# zpool status -v
  pool: Storage
 state: ONLINE
  scan: resilvered 682G in 7h6m with 0 errors on Fri Feb 6 05:38:27 2015
config:

        NAME                                            STATE     READ WRITE CKSUM
        Storage                                         ONLINE       0     0     0
          gptid/c3a1dbf7-255c-11e4-bcf1-0015175b2e90    ONLINE       0     0     0
          raidz3-1                                      ONLINE       0     0     0
            gptid/c4442745-255c-11e4-bcf1-0015175b2e90  ONLINE       0     0     0
            gptid/c4fd106d-255c-11e4-bcf1-0015175b2e90  ONLINE       0     0     0
            gptid/ba1b044f-adb0-11e4-a29c-0015175b2e90  ONLINE       0     0     0
            gptid/c666e636-255c-11e4-bcf1-0015175b2e90  ONLINE       0     0     0
            gptid/c716e721-255c-11e4-bcf1-0015175b2e90  ONLINE       0     0     0
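From what I can tell from that output, the gptid/c3a1dbf7 line sitting directly under Storage, at the same level as raidz3-1 instead of under a cache heading, means ZFS is treating it as a top-level data (stripe) vdev. To double-check which physical device that gptid maps to, something like this should work on FreeNAS:
Code:
glabel status | grep c3a1dbf7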
So, I did some googling and found another thread with someone who had a similar issue (thread). If you don't want to read it, the gist is that it has you flip a GEOM debug sysctl so you can re-add the device. (Still a noob at this stuff.)
Code:
[root@freenas] ~# sysctl kern.geom.debugflags=0x10
kern.geom.debugflags: 0 -> 16
[root@freenas] ~# zpool add Storage cache da0
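(For completeness: setting that flag back to its default afterwards should just be the following. Here's the status after the add:)
Code:
sysctl kern.geom.debugflags=0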
Code:
[root@freenas] ~# zpool status -v
  pool: Storage
 state: ONLINE
  scan: resilvered 682G in 7h6m with 0 errors on Fri Feb 6 05:38:27 2015
config:

        NAME                                            STATE     READ WRITE CKSUM
        Storage                                         ONLINE       0     0     0
          gptid/c3a1dbf7-255c-11e4-bcf1-0015175b2e90    ONLINE       0     0     0
          raidz3-1                                      ONLINE       0     0     0
            gptid/c4442745-255c-11e4-bcf1-0015175b2e90  ONLINE       0     0     0
            gptid/c4fd106d-255c-11e4-bcf1-0015175b2e90  ONLINE       0     0     0
            gptid/ba1b044f-adb0-11e4-a29c-0015175b2e90  ONLINE       0     0     0
            gptid/c666e636-255c-11e4-bcf1-0015175b2e90  ONLINE       0     0     0
            gptid/c716e721-255c-11e4-bcf1-0015175b2e90  ONLINE       0     0     0
        cache
          da0                                           ONLINE       0     0     0
So, I decided to run a scrub. Yeah, that didn't go so hot: it was up to 2,900 errors when I stopped it. So far the affected files aren't anything important. Honestly, if I lost the whole pool I wouldn't be devastated, but if I don't have to, bonus. Anyway, I removed da0 from the cache. While looking up how to fix the cache device showing up the way it does above, I took a look in the interface and saw something even weirder: the drive is now showing up as part of a stripe (see attached). At this point I'm cursing myself out. I tried every way I could think of to remove the drive from the pool (roughly the commands below), but it all failed, which I'm assuming is for good reason. My guess at this point is that I probably need to copy everything I can to another location outside the zpool and just kill it with fire. But again, if I don't need to, awesome.
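For reference, the relevant commands were roughly these; the last one is the removal attempt that fails, which I gather is because ZFS can't remove a top-level data vdev from a pool (only cache, log, and spare devices can be pulled):
Code:
zpool scrub -s Storage        # stop the in-progress scrub
zpool remove Storage da0      # worked: cache devices are removable
zpool remove Storage gptid/c3a1dbf7-255c-11e4-bcf1-0015175b2e90   # fails: top-level vdev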
Also, I still have access to all the data (except for my VM dataset), so... it's still alive. Help appreciated.
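If it does come to the copy-and-rebuild route, my rough plan is a recursive snapshot plus send/receive to another pool (Backup here is just a placeholder name), though with the checksum errors a plain rsync of the readable files might be more forgiving:
Code:
zfs snapshot -r Storage@evac
zfs send -R Storage@evac | zfs receive -F Backup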
Please feel free to call me a moron.