Can't seem to replace failed drive

Status
Not open for further replies.

Jables

Cadet
Joined
Aug 20, 2016
Messages
8
Hi! I'm fairly new to FreeNAS and am having an issue replacing a failing drive - the new one doesn't seem to want to join the pool and step in.

Quick stats: I've got a 20-drive zpool made up of two 10-drive raidz2 vdevs, so there's plenty of parity and I'm OK running degraded for a bit, but I'd love it if someone more experienced had some ideas to try. I offlined, physically pulled, and swapped the suspect drive as per the docs, but when I try a "replace" with the new drive in its place, I'm confronted with a blank replace dialog box and an error message like this one:
----
"Replacing disk 12557778013931841925
Member disk:[blank, no pulldown option] !This field is required!"
----
I see mentions of this in previous threads but I've tried everything mentioned without success. No one seems to have the correct recipe in my case, but I'm dying to fix this and avoid a copy and rebuild from backups - seems like I should be able to pop a fresh drive in off the shelf, resilver and be on my merry way. Any ideas welcomed!

I'm not sure, but if I had to guess, it looks like my USB key boot drive may have taken over the device ID that my 20th drive was on (da19). Maybe there's some way to give the fresh drive a new ID that isn't in use (da20)? Or get the USB key off da19 somehow and free it back up? Just a blind guess at this point; I'm not sure how much the IDs really matter.
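That guess can be checked from the shell. The sketch below is only an illustration: the sample lines are copied from the `zpool status -v` and `camcontrol devlist` output in this post, and the function names `missing_disks` and `current_holder` are made up for the example. It pulls the old device name out of the "was /dev/..." note and looks up what currently answers to that name. Keep in mind that daN names are handed out in probe order at boot, so a reused name on its own does not necessarily explain the failed replace.

```shell
#!/bin/sh
# Sample lines copied from the outputs in this post. On a live system you
# would pipe `zpool status spr` and `camcontrol devlist` in instead.
zpool_sample='  da18p1  ONLINE  0  0  0
  12557778013931841925  UNAVAIL  0  0  0  was /dev/da19p1'

cam_sample='<AMCC 9650SE-24M DISK 4.10>  at scbus0 target 18 lun 0 (pass18,da18)
<Kanguru FlashBlu 30 PMAP>  at scbus8 target 0 lun 0 (pass22,da19)'

# Print the disk name (e.g. da19) that ZFS last saw each UNAVAIL member on.
missing_disks() {
    awk '$2 == "UNAVAIL" && $(NF-1) == "was" {
        dev = $NF
        sub(/^\/dev\//, "", dev)   # strip the /dev/ prefix
        sub(/p[0-9]+$/, "", dev)   # strip the partition suffix
        print dev
    }'
}

# Print the inquiry string of whatever device currently holds the given name.
current_holder() {
    awk -v disk="$1" '
        match($0, /\([^)]*\)$/) {
            names = substr($0, RSTART + 1, RLENGTH - 2)   # e.g. "pass22,da19"
            n = split(names, part, ",")
            for (i = 1; i <= n; i++)
                if (part[i] == disk) {
                    desc = $0
                    sub(/>.*$/, "", desc)
                    sub(/^</, "", desc)
                    print desc
                }
        }'
}

for disk in $(printf '%s\n' "$zpool_sample" | missing_disks); do
    holder=$(printf '%s\n' "$cam_sample" | current_holder "$disk")
    echo "$disk is now: ${holder:-not present}"
done
```

With the sample lines above, the loop reports that da19 is currently the Kanguru USB key, which matches the guess.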

The pertinent info:

zpool status -v:
Code:
  pool: freenas-boot
state: ONLINE
  scan: scrub repaired 0 in 0h0m with 0 errors on Mon Apr 24 03:45:28 2017
config:

  NAME  STATE  READ WRITE CKSUM
  freenas-boot  ONLINE  0  0  0
    gptid/3a77e50e-b810-11e6-aad8-003048d37eaa  ONLINE  0  0  0

errors: No known data errors

  pool: spr
state: DEGRADED
status: One or more devices could not be opened.  Sufficient replicas exist for
  the pool to continue functioning in a degraded state.
action: Attach the missing device and online it using 'zpool online'.
  see: http://illumos.org/msg/ZFS-8000-2Q
  scan: scrub repaired 97.8G in 11h30m with 0 errors on Sun May 21 11:30:38 2017
config:

  NAME  STATE  READ WRITE CKSUM
  spr  DEGRADED  0  0  0
    raidz2-0  ONLINE  0  0  0
      da0p1  ONLINE  0  0  0
      gptid/910affd0-69f1-4b4e-9e57-b7012cbafb81  ONLINE  0  0  0
      da2p1  ONLINE  0  0  0
      da3p1  ONLINE  0  0  0
      da4p1  ONLINE  0  0  0
      gptid/85b2854d-687a-e94b-97d2-7dcc6a9c8481  ONLINE  0  0  0
      da6p1  ONLINE  0  0  0
      da7p1  ONLINE  0  0  0
      da8p1  ONLINE  0  0  0
      da9p1  ONLINE  0  0  0
    raidz2-1  DEGRADED  0  0  0
      da10p1  ONLINE  0  0  0
      da11p1  ONLINE  0  0  0
      da12p1  ONLINE  0  0  0
      gptid/e049bc26-eb1a-b647-876b-aa89df92ee61  ONLINE  0  0  0
      da14p1  ONLINE  0  0  0
      da15p1  ONLINE  0  0  0
      da16p1  ONLINE  0  0  0
      da17p1  ONLINE  0  0  0
      da18p1  ONLINE  0  0  0
      12557778013931841925  UNAVAIL  0  0  0  was /dev/da19p1
  logs
    ada2p1  ONLINE  0  0  0  block size: 512B configured, 4096B native
  cache
    gptid/48141b73-e42b-c14a-b310-0caf7f434225  ONLINE  0  0  0
    gptid/7305ec4b-c099-ec42-820c-3e9ff9255ef9  ONLINE  0  0  0

errors: No known data errors



camcontrol devlist

Code:
<AMCC 9650SE-24M DISK 4.10>  at scbus0 target 0 lun 0 (pass0,da0)
<AMCC 9650SE-24M DISK 4.10>  at scbus0 target 1 lun 0 (pass1,da1)
<AMCC 9650SE-24M DISK 4.10>  at scbus0 target 2 lun 0 (pass2,da2)
<AMCC 9650SE-24M DISK 4.10>  at scbus0 target 3 lun 0 (pass3,da3)
<AMCC 9650SE-24M DISK 4.10>  at scbus0 target 4 lun 0 (pass4,da4)
<AMCC 9650SE-24M DISK 4.10>  at scbus0 target 5 lun 0 (pass5,da5)
<AMCC 9650SE-24M DISK 4.10>  at scbus0 target 6 lun 0 (pass6,da6)
<AMCC 9650SE-24M DISK 4.10>  at scbus0 target 7 lun 0 (pass7,da7)
<AMCC 9650SE-24M DISK 4.10>  at scbus0 target 8 lun 0 (pass8,da8)
<AMCC 9650SE-24M DISK 4.10>  at scbus0 target 9 lun 0 (pass9,da9)
<AMCC 9650SE-24M DISK 4.10>  at scbus0 target 10 lun 0 (pass10,da10)
<AMCC 9650SE-24M DISK 4.10>  at scbus0 target 11 lun 0 (pass11,da11)
<AMCC 9650SE-24M DISK 4.10>  at scbus0 target 12 lun 0 (pass12,da12)
<AMCC 9650SE-24M DISK 4.10>  at scbus0 target 13 lun 0 (pass13,da13)
<AMCC 9650SE-24M DISK 4.10>  at scbus0 target 14 lun 0 (pass14,da14)
<AMCC 9650SE-24M DISK 4.10>  at scbus0 target 15 lun 0 (pass15,da15)
<AMCC 9650SE-24M DISK 4.10>  at scbus0 target 16 lun 0 (pass16,da16)
<AMCC 9650SE-24M DISK 4.10>  at scbus0 target 17 lun 0 (pass17,da17)
<AMCC 9650SE-24M DISK 4.10>  at scbus0 target 18 lun 0 (pass18,da18)
<LITEONIT LCS-512M6S 2.5 7mm 512GB DC9110E>  at scbus1 target 0 lun 0 (pass19,ada0)
<LITEONIT LCS-512M6S 2.5 7mm 512GB DC9110D>  at scbus4 target 0 lun 0 (pass20,ada1)
<Corsair CSSD-F60GB2-A 2.1b>  at scbus6 target 0 lun 0 (pass21,ada2)
<Kanguru FlashBlu 30 PMAP>  at scbus8 target 0 lun 0 (pass22,da19)



glabel status

Code:
  Name  Status  Components
gptid/48141b73-e42b-c14a-b310-0caf7f434225  N/A  ada0p1
gptid/93a43714-be7c-3446-88ce-5e8ca7ca7a72  N/A  ada0p9
gptid/7305ec4b-c099-ec42-820c-3e9ff9255ef9  N/A  ada1p1
gptid/580ae8e3-413d-fd42-8963-4eaf2c4f00eb  N/A  ada1p9
gptid/ba6cf680-736b-8e48-9069-d3a5f0bd1f0e  N/A  ada2p9
gptid/07b79ac1-a9a4-5e4e-a2d9-03397d39a5ce  N/A  da0p9
gptid/910affd0-69f1-4b4e-9e57-b7012cbafb81  N/A  da1p1
gptid/021b5b30-c515-0f43-91b9-5cb340b75b7c  N/A  da1p9
gptid/32041579-2db7-484f-aa27-132b748f0122  N/A  da2p9
gptid/a18b216e-5dfc-f748-85d1-b80e0d6259b2  N/A  da3p9
gptid/144620b5-6b26-e441-a9e0-6ee7ce88e67c  N/A  da4p9
gptid/85b2854d-687a-e94b-97d2-7dcc6a9c8481  N/A  da5p1
gptid/be702907-7bbc-d44a-b482-58e9fbc48ea7  N/A  da5p9
gptid/a8b7e6e5-b16a-9d44-a6df-0f4956997833  N/A  da6p9
gptid/40566bc3-ed30-1f46-af6c-6c6ee8992f84  N/A  da7p9
gptid/63da1a39-6ae7-bb43-8f8e-e6ec9a0031d2  N/A  da8p9
gptid/cbfe2bae-adf2-bb47-9d67-8aba826c151a  N/A  da9p9
gptid/eddf6ff0-b036-de4d-8c24-294949231ff2  N/A  da10p9
gptid/e3d325cc-1e84-3746-8868-3874f70a4c45  N/A  da11p9
gptid/467012a6-3143-e947-bf5c-fa77415abf63  N/A  da12p9
gptid/e049bc26-eb1a-b647-876b-aa89df92ee61  N/A  da13p1
gptid/bbb8fc78-734e-8344-a380-1bd0e38d7327  N/A  da13p9
gptid/c8744b21-3111-434b-b1c0-e5fd7d0c9588  N/A  da14p9
gptid/e1fc392c-1be2-3a4d-995a-2ae14f663f47  N/A  da15p9
gptid/fa7a61a6-ef2b-7e4e-a421-13337db94df6  N/A  da16p9
gptid/3c298fef-b7de-2a4a-84d8-643badf82df1  N/A  da17p9
gptid/e6633e06-f0e6-d944-9c40-cde2bf8c487e  N/A  da18p9
gptid/3a5db658-b810-11e6-aad8-003048d37eaa  N/A  da19p1
gptid/3a77e50e-b810-11e6-aad8-003048d37eaa  N/A  da19p2



gpart show

Code:
=>  34  1000215149  ada0  GPT  (477G)
  34  2014  - free -  (1.0M)
  2048  1000196096  1  !6a898cc3-1dd2-11b2-99a6-080020736631  (477G)
  1000198144  16384  9  !6a945a3b-1dd2-11b2-99a6-080020736631  (8.0M)
  1000214528  655  - free -  (328K)

=>  34  1000215149  ada1  GPT  (477G)
  34  2014  - free -  (1.0M)
  2048  1000196096  1  !6a898cc3-1dd2-11b2-99a6-080020736631  (477G)
  1000198144  16384  9  !6a945a3b-1dd2-11b2-99a6-080020736631  (8.0M)
  1000214528  655  - free -  (328K)

=>  34  117231341  ada2  GPT  (56G)
  34  2014  - free -  (1.0M)
  2048  117211136  1  !6a898cc3-1dd2-11b2-99a6-080020736631  (56G)
  117213184  16384  9  !6a945a3b-1dd2-11b2-99a6-080020736631  (8.0M)
  117229568  1807  - free -  (904K)

=>  34  3906228157  da0  GPT  (1.8T)
  34  2014  - free -  (1.0M)
  2048  3906207744  1  !6a898cc3-1dd2-11b2-99a6-080020736631  (1.8T)
  3906209792  16384  9  !6a945a3b-1dd2-11b2-99a6-080020736631  (8.0M)
  3906226176  2015  - free -  (1.0M)

=>  34  3906228157  da1  GPT  (1.8T)
  34  2014  - free -  (1.0M)
  2048  3906207744  1  !6a898cc3-1dd2-11b2-99a6-080020736631  (1.8T)
  3906209792  16384  9  !6a945a3b-1dd2-11b2-99a6-080020736631  (8.0M)
  3906226176  2015  - free -  (1.0M)

=>  34  3906228157  da2  GPT  (1.8T)
  34  2014  - free -  (1.0M)
  2048  3906207744  1  !6a898cc3-1dd2-11b2-99a6-080020736631  (1.8T)
  3906209792  16384  9  !6a945a3b-1dd2-11b2-99a6-080020736631  (8.0M)
  3906226176  2015  - free -  (1.0M)

=>  34  3906228157  da3  GPT  (1.8T)
  34  2014  - free -  (1.0M)
  2048  3906207744  1  !6a898cc3-1dd2-11b2-99a6-080020736631  (1.8T)
  3906209792  16384  9  !6a945a3b-1dd2-11b2-99a6-080020736631  (8.0M)
  3906226176  2015  - free -  (1.0M)

=>  34  3906228157  da4  GPT  (1.8T)
  34  2014  - free -  (1.0M)
  2048  3906207744  1  !6a898cc3-1dd2-11b2-99a6-080020736631  (1.8T)
  3906209792  16384  9  !6a945a3b-1dd2-11b2-99a6-080020736631  (8.0M)
  3906226176  2015  - free -  (1.0M)

=>  34  3906228157  da5  GPT  (1.8T)
  34  2014  - free -  (1.0M)
  2048  3906207744  1  !6a898cc3-1dd2-11b2-99a6-080020736631  (1.8T)
  3906209792  16384  9  !6a945a3b-1dd2-11b2-99a6-080020736631  (8.0M)
  3906226176  2015  - free -  (1.0M)

=>  34  3906228157  da6  GPT  (1.8T)
  34  2014  - free -  (1.0M)
  2048  3906207744  1  !6a898cc3-1dd2-11b2-99a6-080020736631  (1.8T)
  3906209792  16384  9  !6a945a3b-1dd2-11b2-99a6-080020736631  (8.0M)
  3906226176  2015  - free -  (1.0M)

=>  34  3906228157  da7  GPT  (1.8T)
  34  2014  - free -  (1.0M)
  2048  3906207744  1  !6a898cc3-1dd2-11b2-99a6-080020736631  (1.8T)
  3906209792  16384  9  !6a945a3b-1dd2-11b2-99a6-080020736631  (8.0M)
  3906226176  2015  - free -  (1.0M)

=>  34  3906228157  da8  GPT  (1.8T)
  34  2014  - free -  (1.0M)
  2048  3906207744  1  !6a898cc3-1dd2-11b2-99a6-080020736631  (1.8T)
  3906209792  16384  9  !6a945a3b-1dd2-11b2-99a6-080020736631  (8.0M)
  3906226176  2015  - free -  (1.0M)

=>  34  3906228157  da9  GPT  (1.8T)
  34  2014  - free -  (1.0M)
  2048  3906207744  1  !6a898cc3-1dd2-11b2-99a6-080020736631  (1.8T)
  3906209792  16384  9  !6a945a3b-1dd2-11b2-99a6-080020736631  (8.0M)
  3906226176  2015  - free -  (1.0M)

=>  34  3906228157  da10  GPT  (1.8T)
  34  2014  - free -  (1.0M)
  2048  3906207744  1  !6a898cc3-1dd2-11b2-99a6-080020736631  (1.8T)
  3906209792  16384  9  !6a945a3b-1dd2-11b2-99a6-080020736631  (8.0M)
  3906226176  2015  - free -  (1.0M)

=>  34  3906228157  da11  GPT  (1.8T)
  34  2014  - free -  (1.0M)
  2048  3906207744  1  !6a898cc3-1dd2-11b2-99a6-080020736631  (1.8T)
  3906209792  16384  9  !6a945a3b-1dd2-11b2-99a6-080020736631  (8.0M)
  3906226176  2015  - free -  (1.0M)

=>  34  3906228157  da12  GPT  (1.8T)
  34  2014  - free -  (1.0M)
  2048  3906207744  1  !6a898cc3-1dd2-11b2-99a6-080020736631  (1.8T)
  3906209792  16384  9  !6a945a3b-1dd2-11b2-99a6-080020736631  (8.0M)
  3906226176  2015  - free -  (1.0M)

=>  34  3906228157  da13  GPT  (1.8T)
  34  2014  - free -  (1.0M)
  2048  3906207744  1  !6a898cc3-1dd2-11b2-99a6-080020736631  (1.8T)
  3906209792  16384  9  !6a945a3b-1dd2-11b2-99a6-080020736631  (8.0M)
  3906226176  2015  - free -  (1.0M)

=>  34  3906228157  da14  GPT  (1.8T)
  34  2014  - free -  (1.0M)
  2048  3906207744  1  !6a898cc3-1dd2-11b2-99a6-080020736631  (1.8T)
  3906209792  16384  9  !6a945a3b-1dd2-11b2-99a6-080020736631  (8.0M)
  3906226176  2015  - free -  (1.0M)

=>  34  3906228157  da15  GPT  (1.8T)
  34  2014  - free -  (1.0M)
  2048  3906207744  1  !6a898cc3-1dd2-11b2-99a6-080020736631  (1.8T)
  3906209792  16384  9  !6a945a3b-1dd2-11b2-99a6-080020736631  (8.0M)
  3906226176  2015  - free -  (1.0M)

=>  34  3906228157  da16  GPT  (1.8T)
  34  2014  - free -  (1.0M)
  2048  3906207744  1  !6a898cc3-1dd2-11b2-99a6-080020736631  (1.8T)
  3906209792  16384  9  !6a945a3b-1dd2-11b2-99a6-080020736631  (8.0M)
  3906226176  2015  - free -  (1.0M)

=>  34  3906228157  da17  GPT  (1.8T)
  34  2014  - free -  (1.0M)
  2048  3906207744  1  !6a898cc3-1dd2-11b2-99a6-080020736631  (1.8T)
  3906209792  16384  9  !6a945a3b-1dd2-11b2-99a6-080020736631  (8.0M)
  3906226176  2015  - free -  (1.0M)

=>  34  3906228157  da18  GPT  (1.8T)
  34  2014  - free -  (1.0M)
  2048  3906207744  1  !6a898cc3-1dd2-11b2-99a6-080020736631  (1.8T)
  3906209792  16384  9  !6a945a3b-1dd2-11b2-99a6-080020736631  (8.0M)
  3906226176  2015  - free -  (1.0M)

=>  34  30965693  da19  GPT  (15G)
  34  1024  1  bios-boot  (512K)
  1058  6  - free -  (3.0K)
  1064  30964656  2  freebsd-zfs  (15G)
  30965720  7  - free -  (3.5K)



cat /etc/version

Code:
FreeNAS-9.10.2-U3 (e1497f269)



thanks in advance.
 

danb35

Hall of Famer
Joined
Aug 16, 2011
Messages
15,504
There's a lot of strangeness going on here, though I'm not sure how much any of it has to do with your immediate problem. You're using a hardware RAID controller, which is always discouraged unless it can pass the raw devices directly through to the OS (which yours isn't doing). Your disks are (mostly) identified in the pool by daN device names rather than by gptid. And though they're partitioned (which generally isn't recommended for ZFS, though FreeNAS does it anyway), they weren't partitioned by FreeNAS. So how did you build this pool to begin with? And what hardware are you using?
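To see at a glance how a pool refers to its members, something like the following can tally them. It is a rough sketch rather than an official tool: the sample rows are a subset copied from the `zpool status -v` output in the first post, and `count_by_reference` is a name invented for the example.

```shell
#!/bin/sh
# Sample member rows copied from the `zpool status -v` output in the first post.
members='  da0p1  ONLINE  0  0  0
  gptid/910affd0-69f1-4b4e-9e57-b7012cbafb81  ONLINE  0  0  0
  da2p1  ONLINE  0  0  0
  12557778013931841925  UNAVAIL  0  0  0  was /dev/da19p1'

# Count members by how the pool refers to them: gptid label, raw device
# name (daN/adaN, with or without a partition suffix), or bare numeric GUID.
count_by_reference() {
    awk '
        $1 ~ /^gptid\//               { gptid++; next }
        $1 ~ /^a?da[0-9]+(p[0-9]+)?$/ { devname++; next }
        $1 ~ /^[0-9]+$/               { guid++; next }
        END { printf "gptid=%d devname=%d guid-only=%d\n", gptid, devname, guid }'
}

printf '%s\n' "$members" | count_by_reference
```

On a live system the same function could be fed `zpool status spr` directly, since header and vdev rows match none of the three patterns.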
 

Jables

Cadet
Joined
Aug 20, 2016
Messages
8
Yes, the card is a 3ware 9650SE-24M8. Pool was originally built (by a friend) with ZFS for Linux on Debian, I think it was. I had thought we set that up where it was doing pass through (or JBOD?), but perhaps not - what is it that tips you off, and how does it look when properly configured? Assuming it's the camcontrol devlist should just be showing individual drives as if they were SATA and not 20 entries from the card, right? Hmmm. I don't think I can change that over to JBOD without nuking and rebuilding.

Hmmm, if it's being controlled by the raid card... would I need to add it to the raid during BIOS startup before it will properly register? Is that what's happening? yarg, that means hooking up a monitor and the like. But if this is the case and it is being RAID controlled, then I think I probably need to tear it down and redo it. (might consider swapping to a more ideal HBA too)

Any other tips in this case? When you ask what hardware, what do you want to know? The gist: older dual Xeon L5420 with 48 GB of ECC RAM on an old Supermicro motherboard. Mellanox x2 10G NIC. Cheaper 20-bay case, forget the brand (Norco, I think). I have a pair of 512 GB SSDs that are meant to be an L2ARC and a 60 GB SSD for the ZIL. I can provide more specifics if it's helpful.

Well, I can check this RAID thing and report back. Hoping I can get it going and not have to do a total teardown but yes, it does look like some things might be amiss...
 

danb35

Hall of Famer
Joined
Aug 16, 2011
Messages
15,504
Jables said:
Assuming it's the camcontrol devlist should just be showing individual drives as if they were SATA and not 20 entries from the card, right?
Pretty much. It should be showing make and model for the drives. The biggest problem with what the card is doing is that it gets in the way of reading SMART data, which FreeNAS does to monitor drive health. I don't know if that card can be used in true HBA mode, but that's what you want to do. If it can't be used as a HBA, better to replace it with a card that can; the LSI 9200/9300 series are most popular here.
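One quick way to spot this from the shell is to count how many devices share the same inquiry string: directly attached disks report their own make and model, while many devices sharing one string suggests the controller is answering for them. The sketch below assumes the `camcontrol devlist` layout shown earlier in the thread; the sample lines are copied from that output and `count_by_model` is a name invented for the example.

```shell
#!/bin/sh
# Sample lines copied from the `camcontrol devlist` output earlier in the thread.
cam_sample='<AMCC 9650SE-24M DISK 4.10>  at scbus0 target 0 lun 0 (pass0,da0)
<AMCC 9650SE-24M DISK 4.10>  at scbus0 target 1 lun 0 (pass1,da1)
<AMCC 9650SE-24M DISK 4.10>  at scbus0 target 2 lun 0 (pass2,da2)
<Corsair CSSD-F60GB2-A 2.1b>  at scbus6 target 0 lun 0 (pass21,ada2)
<Kanguru FlashBlu 30 PMAP>  at scbus8 target 0 lun 0 (pass22,da19)'

# Count devices per inquiry string (the text between < and >),
# most frequent string first.
count_by_model() {
    awk '{
        model = $0
        sub(/>.*$/, "", model)
        sub(/^</, "", model)
        seen[model]++
    }
    END { for (m in seen) printf "%d %s\n", seen[m], m }' | sort -rn
}

printf '%s\n' "$cam_sample" | count_by_model
```

smartmontools also has a `-d 3ware,N` device type for addressing disks behind 3ware cards; whether FreeNAS's own monitoring can use it with this controller is something to verify rather than assume.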

Other than that, really, more recent hardware would be very helpful. A SuperMicro X8-series motherboard would be the minimum to avoid a FSB, and would give you a significant boost in performance/watt. I'm looking around eBay for something that would work as a relatively turn-key server that you could plug your existing drives into, and the best I can find so far is this. At a little over $1k shipped, it's more expensive than I was hoping for, but it's the least expensive I've seen so far with SAS2 expander backplanes (SAS1 expander backplanes are believed to not support drive capacities > 2 TB). OTOH, it would give you 36 bays, a known-solid HBA, and much-newer CPUs. It would work out to be very similar to my system, which works quite well.

If you aren't using, and don't intend to use, drives larger than 2 TB, you can get a ready-to-run system for around $400. Here's one example.
 

DrKK

FreeNAS Generalissimo
Joined
Oct 15, 2013
Messages
3,630
Yeah this is pretty much a cringer. Suggest you not use this "friend" in the future, we'll be happy to help you on a build.

Hardware RAID, in particular, is the scariest part.
 

Jables

Cadet
Joined
Aug 20, 2016
Messages
8
Thanks for the recs, all things I'll consider if/when I scrap and rebuild.

Oh, I'm to blame for this - I put together the hardware, and a friend only did the software side since I was too busy at the time to get it up and running. It was all done super low budget and wasn't necessarily meant to be a FreeNAS box from the start - just a light-duty file server that is meant to eventually become a nightly backup box or a secondary server to a new "real" server once we get more work/larger-budget projects. I guess I'm just trying to squeeze some ZFS goodness out of what was really just meant to be a beater Linux RAID 5 style server initially. We don't pay for our power, it's included in rent. ;)

The FSB bit is new info to me - really good to know, had no idea. Will look at E3/E5 based post-FSB setups for the next round.

FreeNAS seems great. Great docs, and it's great to be able to hit the boards up like this for help. Thanks!
 

Jables

Cadet
Joined
Aug 20, 2016
Messages
8
Yep - hooked up a monitor and got into the 9650SE BIOS and sure enough, there was the fresh drive hanging out, unconfigured. Configured it, and noticed while I was in there that all drives are set to export as single disks instead of JBOD. Hrm.

On reboot, FreeNAS recognized it and I was finally able to do the drive replace. Resilvering now. I'm back in business, Mortimer!
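For keeping an eye on the pool while it resilvers, a small filter over `zpool status` can reduce the output to the pool state plus any row that is not ONLINE. This is a sketch under assumptions: the sample text is trimmed from the pre-replacement output in the first post (not from the resilver itself), and `pool_health` is a name invented for the example.

```shell
#!/bin/sh
# Sample trimmed from the `zpool status -v` output in the first post,
# taken before the replacement. Live use: zpool status spr | pool_health
status_sample='  pool: spr
 state: DEGRADED
config:

  NAME  STATE  READ WRITE CKSUM
  spr  DEGRADED  0  0  0
  raidz2-1  DEGRADED  0  0  0
  da18p1  ONLINE  0  0  0
  12557778013931841925  UNAVAIL  0  0  0  was /dev/da19p1

errors: No known data errors'

# Print the pool state, then every config row whose state is not ONLINE.
pool_health() {
    awk '
        $1 == "state:" { print "state: " $2; next }
        # Config rows have a state in column 2 and numeric error counters after it.
        NF >= 5 && $3 ~ /^[0-9]+$/ && $2 != "ONLINE" {
            print "not online: " $1 " (" $2 ")"
        }'
}

printf '%s\n' "$status_sample" | pool_health
```

Once the resilver finishes and the pool is healthy again, the filter should print only the state line.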

So that gets me back to whatever clunky setup I had, and now I have a good track on what I need to be thinking about next time I can take it down and put a little more $ into it. Thanks again everyone!
 