Please ID drives via Controller channel ID rather than drive ID at boot time.

Status
Not open for further replies.

bforest

Dabbler
Joined
Jun 18, 2011
Messages
10
[FreeNAS v/p (64 bit) - various ver 8.x from 8.0.x to 8.2 (current)
General hardware info - various AMD/Intel; ASUS, Gigabyte; 8 GB RAM
Specific hardware info - No RAID controller, or hw RAID disabled;
                         various, usually the SIL3124 chipset
]

I have three FreeNAS systems with multiple drive bays (20+) and multiple controllers.

FreeNAS references these drives as "adaX". If I add or remove drives, then depending on where I plug in the drive, the "adaX" references can change at reboot for some or all drives. This renders the "adaX" references worthless.

I document the Controller ID and Controller-channel for each system because this information will not change. (Maybe it will if I replace a controller but I have not experienced that situation)

Below are the steps I currently have to take to replace a failed drive. Most of the work is locating the actual failed physical drive (because I cannot trust the name FreeNAS uses to reference the device):

REPLACING A DRIVE (Failed or Not)

In this case we want to replace a smaller drive (80G) with a larger drive (250G).

You should ONLY replace ONE drive at a time (which can take a few hours to be fully “accepted” by the system)

If you log in via SSH, you can find the actual physical drive based on which SATA “siisch” port it is connected to. The ada# values depend on the order in which the drives are discovered at boot, so they can only change at boot time. In FreeNAS v8.2.0 you can also get this information from the “Shell” menu function in the Web GUI.

Here is a listing of my current drives:

[user@nas2] /> cat /var/run/dmesg.boot | grep ada | grep siisch

ada0 at siisch0 bus 0 scbus0 target 0 lun 0
ada1 at siisch4 bus 0 scbus4 target 0 lun 0
ada2 at siisch8 bus 0 scbus8 target 0 lun 0
ada3 at siisch12 bus 0 scbus12 target 0 lun 0
ada4 at siisch16 bus 0 scbus22 target 0 lun 0
ada5 at siisch17 bus 0 scbus23 target 0 lun 0
ada6 at siisch20 bus 0 scbus26 target 0 lun 0
ada7 at siisch21 bus 0 scbus27 target 0 lun 0

Use the siisch number to locate the physical drive from the system drive configuration diagram.
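The lookup above can also be scripted. Below is a minimal sketch, assuming the attach lines look exactly like the ones in my listing; the names DMESG and channel_map are made up for illustration and are not part of FreeNAS.

```python
import re

# Attach lines copied from the dmesg.boot listing in this post.
DMESG = """\
ada0 at siisch0 bus 0 scbus0 target 0 lun 0
ada1 at siisch4 bus 0 scbus4 target 0 lun 0
ada2 at siisch8 bus 0 scbus8 target 0 lun 0
ada3 at siisch12 bus 0 scbus12 target 0 lun 0
ada4 at siisch16 bus 0 scbus22 target 0 lun 0
ada5 at siisch17 bus 0 scbus23 target 0 lun 0
ada6 at siisch20 bus 0 scbus26 target 0 lun 0
ada7 at siisch21 bus 0 scbus27 target 0 lun 0
"""

# One match per "adaN at siischM ..." line.
ATTACH_RE = re.compile(r"^(ada\d+) at (siisch\d+) ", re.MULTILINE)


def channel_map(dmesg_text):
    """Return {channel: device}, keyed by the channel because that is the stable part."""
    return {m.group(2): m.group(1) for m in ATTACH_RE.finditer(dmesg_text)}


if __name__ == "__main__":
    mapping = channel_map(DMESG)
    # Sort numerically by channel number so siisch4 comes before siisch12.
    for channel in sorted(mapping, key=lambda c: int(c[len("siisch"):])):
        print(f"{channel:<9} -> {mapping[channel]}")
```

On a live system the text would come from /var/run/dmesg.boot instead of the pasted string.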


[user@nas2] /> cat /var/run/dmesg.boot | grep ada | grep sectors

ada0: 238475MB (488397168 512 byte sectors: 16H 63S/T 16383C)
ada1: 238475MB (488397168 512 byte sectors: 16H 63S/T 16383C)
ada2: 76319MB (156301488 512 byte sectors: 16H 63S/T 16383C)
ada3: 238475MB (488397168 512 byte sectors: 16H 63S/T 16383C)
ada4: 238475MB (488397168 512 byte sectors: 16H 63S/T 16383C)
ada5: 76319MB (156301488 512 byte sectors: 16H 63S/T 16383C)
ada6: 238475MB (488397168 512 byte sectors: 16H 63S/T 16383C)
ada7: 238475MB (488397168 512 byte sectors: 16H 63S/T 16383C)


[user@nas2]

I am interested in the smaller drives. I will choose the 76319MB (80G) drive on ada5. I can see from the first listing that it is connected to siisch17.
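Cross-referencing the two listings by hand is the error-prone part, so here is a rough sketch of doing it in one step. It assumes the two line formats shown in my listings; small_drives and the 100000 MB cutoff are illustrative choices of mine, not anything FreeNAS provides.

```python
import re

# Both listings copied from this post.
ATTACH = """\
ada0 at siisch0 bus 0 scbus0 target 0 lun 0
ada1 at siisch4 bus 0 scbus4 target 0 lun 0
ada2 at siisch8 bus 0 scbus8 target 0 lun 0
ada3 at siisch12 bus 0 scbus12 target 0 lun 0
ada4 at siisch16 bus 0 scbus22 target 0 lun 0
ada5 at siisch17 bus 0 scbus23 target 0 lun 0
ada6 at siisch20 bus 0 scbus26 target 0 lun 0
ada7 at siisch21 bus 0 scbus27 target 0 lun 0
"""

SIZES = """\
ada0: 238475MB (488397168 512 byte sectors: 16H 63S/T 16383C)
ada1: 238475MB (488397168 512 byte sectors: 16H 63S/T 16383C)
ada2: 76319MB (156301488 512 byte sectors: 16H 63S/T 16383C)
ada3: 238475MB (488397168 512 byte sectors: 16H 63S/T 16383C)
ada4: 238475MB (488397168 512 byte sectors: 16H 63S/T 16383C)
ada5: 76319MB (156301488 512 byte sectors: 16H 63S/T 16383C)
ada6: 238475MB (488397168 512 byte sectors: 16H 63S/T 16383C)
ada7: 238475MB (488397168 512 byte sectors: 16H 63S/T 16383C)
"""


def small_drives(attach_text, size_text, max_mb):
    """Return (device, size_mb, channel) for every drive no larger than max_mb."""
    # device -> channel, taken from the "adaN at siischM" lines
    channel = dict(re.findall(r"^(ada\d+) at (siisch\d+) ", attach_text, re.MULTILINE))
    # (device, size) pairs, taken from the "adaN: NNNNMB" lines
    sizes = re.findall(r"^(ada\d+): (\d+)MB", size_text, re.MULTILINE)
    return [
        (dev, int(mb), channel.get(dev, "unknown"))
        for dev, mb in sizes
        if int(mb) <= max_mb
    ]


if __name__ == "__main__":
    for dev, mb, ch in small_drives(ATTACH, SIZES, max_mb=100000):
        print(f"{dev}: {mb} MB on {ch}")
```

The channel it reports is what I then look up on the NAS diagram to find the slot.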

Using the NAS diagram, I pull the drive on channel 17 from the system. In my case it is in slot 3r3b.

Reboot the NAS and insert the replacement 250G drive into slot 3r3b.

I believe that if the GUI referenced the controller channel number, it would simplify the process of drive replacement (however infrequent it may be).

Thank you for your time!
Thank you for such a great product!

-bforest
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,525
I agree with you, but that's REALLY difficult. I have a 3ware controller in one FreeNAS server. William and I worked together to add serial numbers to the FreeNAS UI. There are lots of problems with trying to match physical connectors to logical devices. Ultimately I gave William SSH access to my server so he could come up with a way to make it work. Overall I think we got pretty lucky, because the controller doesn't readily provide that information. We had to run 3 different commands to compare the physical connection to the logical device, then cross-reference that to the serial.

So while I agree that it would be beneficial, I know that the truth of the matter is that it's just not going to happen.

Not to mention:

1. When a disk starts having problems with SMART, you get an error that adaX has problems.
2. When ZFS identifies a problem in the output of zpool status, it identifies the disk as adaX.
3. When a disk is disconnected from a hotswap supported controller the message states that disk adaX has been disconnected.
4. When wiping and testing disks from the terminal you use the /dev/adaX to identify the disk to work with.

I'm sure there are more examples. But the bottom line is that FreeBSD relates to disks by their /dev/adaX (or whatever device it is). The physical connector that the disk is attached to has almost no value in the FreeBSD world.

It's like comparing the physical connector to the "C" drive in Windows. It doesn't matter what physical connector is used in a running system. What's important is to know that "C" is your boot drive.

As the administrator it's your job to take the logical device and figure out what the physical device is. That's why I worked with William a few weeks ago to get serial numbers working for all 3ware controllers. At the end of the day, if you are an admin at some company with a server, that's why YOU get paid to be the admin. If the server would just say "replace disk 3" then girl scouts would be trained to do your job for minimum wage and you'd be unemployed.

I'm not sure what controller you have; you didn't specify what version of FreeNAS you are using, what hardware you are using, or anything (RT forum rules). While most likely this isn't true, the problem might just as easily have been that you have a non-standard controller that doesn't provide information to match the physical to the logical. 3ware doesn't provide any command that says "ada0 = phys01". You have to logically figure it out on your own.
 

paleoN

Wizard
Joined
Apr 22, 2012
Messages
1,402
Your setup doesn't allow hotswapping? Your steps show you pulling the drive and then rebooting?

As long as a given physical bay is always on the same siisch channel and the same scbus (for example, ada7 at siisch21 bus 0 scbus27 target 0 lun 0), you can use the following:

Kostik Belousov said:
I use the following stanza in /boot/device.hints for machine with intel
on-board ahci and siis in pcie:
hint.scbus.0.at="ahcich0"
hint.ada.0.at="scbus0"
hint.scbus.1.at="ahcich1"
hint.ada.1.at="scbus1"
hint.scbus.2.at="ahcich2"
hint.ada.2.at="scbus2"
hint.scbus.3.at="ahcich3"
hint.ada.3.at="scbus3"
hint.scbus.4.at="ahcich4"
hint.ada.4.at="scbus4"
hint.scbus.5.at="siisch0"
hint.ada.5.at="scbus5"
hint.scbus.6.at="siisch1"
hint.ada.6.at="scbus6"

You should get an idea from this.
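To adapt the quoted stanza to a longer channel list, the hint lines can be generated instead of typed. This is only a sketch that reproduces the pattern in the quote; device_hints is a made-up helper, the channel list is taken from the first post, and I have not tested the resulting hints on that hardware.

```python
def device_hints(channels):
    """Build /boot/device.hints lines that pin scbusN and adaN to each channel in order."""
    lines = []
    for n, channel in enumerate(channels):
        # Same two-line pattern as the quoted stanza: bus to channel, then disk to bus.
        lines.append(f'hint.scbus.{n}.at="{channel}"')
        lines.append(f'hint.ada.{n}.at="scbus{n}"')
    return lines


if __name__ == "__main__":
    # Channels as listed in the first post (assumption: one disk per channel).
    channels = ["siisch0", "siisch4", "siisch8", "siisch12",
                "siisch16", "siisch17", "siisch20", "siisch21"]
    print("\n".join(device_hints(channels)))
```

Note that this renumbers the buses sequentially, so siisch16 would end up on scbus4 rather than the scbus22 shown in the first post. An empty bay keeps its number reserved, which is the point of the exercise.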
 

bforest

Dabbler
Joined
Jun 18, 2011
Messages
10
I agree with you, but that's REALLY difficult. . . .

Thank you for not saying it was impossible.

I have recently found another issue between the GUI and the CLI when listing drives using the name that FreeNAS does use.

Please keep your eye on: ada8

Code:
Build		FreeNAS-8.2.0-RELEASE-p1-x64 (r11950)
Platform	AMD Athlon(tm) II X2 255 Processor
Memory		8175MB
System Time	Sun Oct 07 13:57:40 EDT 2012

Web GUI - Storage; Volumes; View Disks

EDIT       NAME  SERIAL                DESCRIPTION              TMODE  HDDS  APM  AcousticL  SMART
Edit Wipe  ada0  WD-WXHX08936746       Member of dpool1 raidz2  Auto   60    127  Disabled   true
Edit Wipe  ada1  WD-WXA1A6117672       Member of dpool1 raidz2  Auto   60    127  Disabled   true
Edit Wipe  ada2  091105BB6100WAKA3ZBF  Member of dpool1 raidz2  Auto   60    127  Disabled   true
Edit Wipe  ada3  WD-WXE908CZ7894       Member of dpool1 raidz2  Auto   60    127  Disabled   true
Edit Wipe  ada4  WD-WXC1AC0Y1876       Member of dpool1 spare   Auto   60    127  Disabled   true
Edit Wipe  ada5  6VCRVHSX              Member of dpool1 raidz2  Auto   60    127  Disabled   true
Edit Wipe  ada6  100524PBN2043SDSGSXT  Member of dpool1 spare   Auto   60    127  Disabled   true
Edit Wipe  ada7  6VCRJH17              Member of dpool1 raidz2  Auto   60    127  Disabled   true
Edit Wipe  ada8  100610PBN2083SE59X6T                           Auto   60    127  Disabled   true


[root@homenas2 ~]# zpool status                                                 
  pool: dpool1                                                                  
 state: ONLINE                                                                  
 scrub: none requested                                                          
config:                                                                         
                                                                                
        NAME                                          STATE     READ WRITE CKSUM
        dpool1                                        ONLINE       0     0     0
          raidz2                                      ONLINE       0     0     0
            ada0p2                                    ONLINE       0     0     0
            ada1p2                                    ONLINE       0     0     0
            ada3p2                                    ONLINE       0     0     0
            ada4p2                                    ONLINE       0     0     0
            ada6p2                                    ONLINE       0     0     0
            ada8p2                                    ONLINE       0     0     0
        spares                                                                  
          gptid/1072fef0-53ef-11e1-a5db-6cf049e52a26  AVAIL                     
          gptid/1109da35-53ef-11e1-a5db-6cf049e52a26  AVAIL                     
                                                                                
errors: No known data errors


The GUI view (top) lists ada8 as not a part of the RAIDz2 but the CLI view (bottom) lists ada8 as the last drive in the RAIDz2.

I believe there needs to be a method of "re-syncing" the GUI database to the "actual" configuration of the system, as well as changing the reference name of the drives in the GUI (as suggested in the first post in this thread).

-Ben
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,525
There is something else going on and not just with ada8. The top list shows that dpool1 includes drives 0,1,2,3,5 and 7. But the bottom is 0,1,3,4,6 and 8.
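For anyone who wants to check that comparison without eyeballing it, here is a small sketch. The GUI descriptions and the raidz2 member list are typed in from the two listings in the previous post; membership_diff is an illustrative name, not a FreeNAS tool.

```python
import re

# Role shown in the GUI "Description" column for each disk (empty = no description).
GUI_ROLES = {
    "ada0": "raidz2", "ada1": "raidz2", "ada2": "raidz2",
    "ada3": "raidz2", "ada4": "spare", "ada5": "raidz2",
    "ada6": "spare", "ada7": "raidz2", "ada8": "",
}

# raidz2 members as printed by zpool status (partition names).
ZPOOL_RAIDZ2 = ["ada0p2", "ada1p2", "ada3p2", "ada4p2", "ada6p2", "ada8p2"]


def membership_diff(gui_roles, zpool_partitions):
    """Return (disks only the GUI calls raidz2, disks only zpool calls raidz2)."""
    gui = {disk for disk, role in gui_roles.items() if role == "raidz2"}
    # Strip the trailing partition suffix, e.g. ada8p2 -> ada8.
    cli = {re.sub(r"p\d+$", "", part) for part in zpool_partitions}
    return sorted(gui - cli), sorted(cli - gui)


if __name__ == "__main__":
    only_gui, only_cli = membership_diff(GUI_ROLES, ZPOOL_RAIDZ2)
    print("raidz2 in GUI only:  ", only_gui)
    print("raidz2 in zpool only:", only_cli)
```

With the data above, three disks disagree in each direction, not just ada8.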

Have you tried rebooting your machine? This discrepancy might clear up with a reboot.
 

William Grzybowski

Wizard
iXsystems
Joined
May 27, 2011
Messages
1,754
The disk description, as it stands, is a legacy thing and should not be taken seriously. You probably have it because you upgraded from 8.0.x.

Enterprise systems use enclosures with LEDs, so you can identify the disks by pressing buttons in the UI, and/or making them blink when a disk fails.
Associating the disk number with the controller slot ID is not going to happen in FreeNAS.
 