SOLVED Upgrade from 9.2.1 to 9.3-BETA fails

Status
Not open for further replies.

lastcmaster

Dabbler
Joined
May 5, 2013
Messages
20
Trying to upgrade one machine using the http://download.freenas.org/9.3/BETA/x64/FreeNAS-9.3-BETA-aaa86f0-x64.GUI_Upgrade.txz file, I end up in a constant reboot loop. I can get back to the 9.2.1 image by selecting the F1 NanoBSD image.
The errors looked like a missing /rescue/sh when trampoline.rc was run. It also looks like boot0cfg has not set the default boot image to the new image but still points to F1.

Anyhow, this is the s2a slice:
<pre>
[root@balder] /mnt/system/home/admglz# mount /dev/ufs/FreeNASs
FreeNASs1a% FreeNASs3% FreeNASs4%
[root@balder] /mnt/system/home/admglz# mount /dev/da0
da0% da0s1% da0s1a% da0s2% da0s2a% da0s3% da0s4%
[root@balder] /mnt/system/home/admglz# mount /dev/da0s2a /tmp/mnt
[root@balder] /mnt/system/home/admglz# cd /tmp/mnt
[root@balder] /tmp/mnt# ls
./ .snap/ gui-install-environment.tar
../ boot/ gui-packages.tar
[root@balder] /tmp/mnt# ls boot/
./ brand.4th frames.4th loader.help modules/ userboot.so
../ cdboot gptboot loader.rc pmbr version.4th
beastie.4th check-password.4th gptzfsboot loader.rc.local pmbr-datadisk zfs/
boot color.4th grub/ mbr pxeboot zfsboot
boot0 defaults/ kernel/ menu-commands.4th screen.4th zfsloader*
boot0sio delay.4th loader* menu.4th shortcuts.4th
boot1 device.hints loader.4th menu.rc support.4th
boot2 firmware/ loader.conf menusets.4th trampoline.ufs.gz
</pre>

and I don't see any /rescue there. Should there be?

Sorry about the minimal information, but I can find no boot log or console log, and I lost what was in the screen buffer.
 

lastcmaster

Dabbler
Joined
May 5, 2013
Messages
20
Did a new run with IPMI screen capture, but it still misses the initial error. All we get is that da0s2a does not seem to be mounted on the memory disk's /mnt.
 

sef

Guest
/rescue is actually in boot/trampoline.ufs.gz -- if you look at loader.conf it should set
mfsroot_name=/boot/trampoline.ufs

The slice is selected by using "gpart set -a active" before it reboots. That part seems to be happening.
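
To check this on your own system, something like the following might help. It is only a rough sketch: the function names mfsroot_name and list_rescue are made up for illustration, list_rescue assumes FreeBSD, root privileges, and that the trampoline image is a plain gzip-compressed UFS image sitting next to the name given in loader.conf.

```shell
#!/bin/sh
# Print the mfsroot_name value from a loader.conf file (quotes stripped).
mfsroot_name() {
    sed -n 's/^mfsroot_name="*\([^"]*\)"*[[:space:]]*$/\1/p' "$1"
}

# FreeBSD only, run as root (assumption): list /rescue inside the
# trampoline image. $1 is where the slice is mounted, e.g. /tmp/mnt.
list_rescue() {
    slice=$1
    image=$(mfsroot_name "$slice/boot/loader.conf")   # e.g. /boot/trampoline.ufs
    work=$(mktemp -d /tmp/trampoline.XXXXXX) || return 1
    # The file on disk is compressed, so unpack a copy first.
    gunzip -c "$slice$image.gz" > "$work/image.ufs" || return 1
    # Attach the copy as a memory disk; mdconfig prints the name, e.g. md1.
    md=$(mdconfig -a -t vnode -f "$work/image.ufs") || return 1
    mkdir "$work/mnt"
    if mount -o ro "/dev/$md" "$work/mnt"; then
        ls "$work/mnt/rescue"
        umount "$work/mnt"
    fi
    mdconfig -d -u "$md"
    rm -f "$work/image.ufs"
}
```

With the slice mounted on /tmp/mnt as in the first post, `list_rescue /tmp/mnt` should show whether sh and init are really in the image.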
 

jkh

Guest
This is https://bugs.freenas.org/issues/6761

We are still mystified by it as we are unable to reproduce this problem. If you can, please append information on your hardware configuration (particularly the boot device(s) you are using) to that bug and we'll see if that narrows things down. Thanks.
 

sef

Guest
In 9.2, on the affected system, what does "sysctl kern.disks" show?
 

lastcmaster

Dabbler
Joined
May 5, 2013
Messages
20
<pre>
[root@balder] /var/tmp/mnt# cat boot/loader.conf
autoboot_delay="0"
beastie_disable="YES"

mfsroot_load="YES"
mfsroot_type="md_image"
mfsroot_name="/boot/trampoline.ufs"

init_path="/rescue/init"
init_shell="/rescue/sh"
init_script="/trampoline.rc"
tmpfs_load="YES"
zfs_load="YES"
</pre>

And yes, the active-slice part is done.
 

lastcmaster

Dabbler
Joined
May 5, 2013
Messages
20
Uploaded camcontrol output for all attached devices, fdisk and gpart output for the boot disk, and the dmesg.boot of the whole system.
 

Attachments

  • balder.txt
    2.5 KB · Views: 445
  • dmesg.boot.txt
    14.3 KB · Views: 401

sef

Guest
Well, that eliminates one hypothesis -- you've only got one da device, so something else didn't grab its place. That unfortunately leaves me mystified -- as I said, you're booting off the right thing, but for some reason it's not seeing /dev/da0s2a.
 

lastcmaster

Dabbler
Joined
May 5, 2013
Messages
20
Is there any way to enable the console.log and then not reboot when it fails? As it is now, the reboot after the failure clears the terminal screen, so I can't scroll back in the screen buffer. Even just blocking the reboot would be a good thing.
 

sef

Guest
Unfortunately not -- debugging the trampoline installer is hard. (I was using a VM and recording the screen.)

Well, no. You can grab the upgrade image, untar it, edit the 0200 script to do what you want. But that's about it.
 

lastcmaster

Dabbler
Joined
May 5, 2013
Messages
20
Could it be failing to see /dev/da0s2a because the USB mass storage driver has not detected the device yet? I remember that the kernel message about da0 came AFTER the boot failure. So could the boot be delayed, or a loop added that waits for the device before continuing?
 

lastcmaster

Dabbler
Joined
May 5, 2013
Messages
20
Looks like it was that! I hacked trampoline.rc to wait for da0, and now it's extracting. The only odd thing I see is the "[: =: unexpected operator" message, but it seems to have cleared the boot device and is working on the base install.
reboot.png was taken just before the reboot after the install.
The host is coming up now and looks just fine.
Uploaded the hacked trampoline.rc. I could not find any sleep in rescue, so I used an ls that would take some time and still not block the system if only one CPU was active.
 

Attachments

  • boot.png
    boot.png
    56.5 KB · Views: 378
  • reboot.png
    reboot.png
    52.9 KB · Views: 368
  • trampoline.rc.txt
    1 KB · Views: 360

sef

Guest
Okay, I put a slightly-modified GUI upgrade file at:

http://download.freenas.org/test/FreeNAS-9.3-BETA-0826e5c-x64.GUI_Upgrade.txz

SHA256 checksum 1e177ff9e6bfe66257d18638675ffdd6f00c5c7233fa43f85a2290d18c39b834

This is for debugging purposes only, and will go away after a short period.

If it boots off the thumb drive, but then can't find it, it'll drop into the shell. Hopefully, anyway; I can't reproduce it.
 

lastcmaster

Dabbler
Joined
May 5, 2013
Messages
20
Thanks, sef, but as you saw I managed to get it running by waiting for the USB device to be detected.
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,526
@sef or @jkh

You guys see the fix? Just want to make sure this isn't dropped off the radar. ;)
 

lastcmaster

Dabbler
Joined
May 5, 2013
Messages
20
For me, the 'while [ ! -c /dev/da0s2a ]' loop I patched into trampoline.rc was OK, but I assume more thought is needed so that it cannot hang, and maybe we need sleep in rescue. If we had sleep, I would suggest a counted 'sleep 1' to break out of the while loop and give a proper error message like 'Data device xxx didn't appear before timeout' or something like that.
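
Roughly what I mean by a counted sleep, as a sketch only. It assumes a sleep binary is available (which was not the case in /rescue when I tried), and the function name wait_for_device and the 30 second default are just picked for illustration, not anything from the real trampoline.rc.

```shell
#!/bin/sh
# Wait until a character device node exists, or give up after a timeout.
# $1 = device path, $2 = timeout in seconds (default 30, an arbitrary choice).
wait_for_device() {
    dev=$1
    remaining=${2:-30}
    while [ ! -c "$dev" ]; do
        if [ "$remaining" -le 0 ]; then
            echo "Data device $dev didn't appear before timeout" >&2
            return 1
        fi
        sleep 1                          # assumes sleep exists in the environment
        remaining=$((remaining - 1))
    done
    return 0
}

# Hypothetical use in a boot script:
# wait_for_device /dev/da0s2a 30 || exit 1
```

The caller can then decide what to do on failure (drop to a shell, print more diagnostics) instead of hanging forever or rebooting straight away.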

And look: https://svnweb.freebsd.org/changeset/base/275435
Nice, guys.
 