Trying to browse the web UI just stops at the shark logo - no login prompt ever comes up.

Team503

Cadet
Joined
Jul 21, 2020
Messages
8
I built this FreeNAS box a month or two ago as a 9.3 box and noticed the web UI would become unresponsive, so I upgraded to 11.3 (I think U3, but I don't know of a way to tell now) in the hope that it was a resolved issue. It seemed to be for a few weeks, and then I ended up here.
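(I'm guessing the exact build can be read from the console along these lines, but I haven't verified the path on my box:)

Code:
# Guessing at the stock location of the build string - unverified on my box
cat /etc/version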

I can't ever get to the login prompt. The console seems responsive (but I don't really know what I'm doing there). Refreshing the page takes forever, but I never get the login screen, just the shark logo and nothing else.

It's a dual-socket SuperMicro 1156 board (I don't remember the CPUs specifically) with 32GB of ECC RAM, using an Intel 10GbE NIC to connect to my network. I did at one point set the MTU to 9000 (which works fine with my VMware hosts). I've tried multiple machines with multiple browsers, including Win7 and Win10, Chrome, MSIE, and even Edge.

The funny thing is that the volume I set up (6x8TB in RAIDZ2 hanging off an LSI HBA) shares out just fine via SMB until you start trying to browse it, then it times out. I just added a PERC H310 flashed to IT mode and connected to 4x1TB SSDs that I need to set up for iSCSI and share out to vSphere so I can host VMs off it, but I can't do anything since I can't log into the web GUI.

EDIT: top shows no more than 3-5% CPU usage and 27G of memory free, with collectd and python3.7 using most of the CPU. Obviously, the console and shell are responsive.
EDIT2: There are no VMs, and the RAIDZ2 array is configured for SMB sharing only. Nothing should be taxing memory or CPU.

Really hoping for some strong troubleshooting tips or links - I'm passable with Debian but I don't know BSD at all. Help?
 

Team503

Cadet
Joined
Jul 21, 2020
Messages
8
I can't ever get to the login prompt. The console seems responsive (but I don't really know what I'm doing there). Refreshing the page takes forever, but I never get the login screen, just the shark logo and nothing else.

It's a dual-socket 1156 board (I don't remember the CPUs specifically) with 32GB of ECC RAM, using an Intel 10GbE NIC to connect to my network. I did at one point set the MTU to 9000 (which works fine with my VMware hosts). I've tried multiple machines with multiple browsers, including Win7 and Win10, Chrome, MSIE, and even Edge.

I just added a PERC H310 flashed to IT mode and connected to 4x1TB SSDs that I need to set up for iSCSI and share out to vSphere so I can host VMs off it, but I can't do anything since I can't log into the web GUI.

Setup:

SuperMicro X8DT3 with dual Xeon W5590s and 30GB of RAM
6x8TB SATA in RAIDZ2 on an LSI HBA
4x1TB SSD in a striped vdev on a PERC H310 flashed to IT mode
2x120GB SSD plugged into the motherboard's onboard SATA controller as boot devices
Dual 10GbE NIC (only one DAC connected)

Fresh install of 11.3-U3 STABLE downloaded from the site today; I was able to import the existing volume and create the striped vdev before the web UI stopped responding (just the shark logo again).

I have tried:
  1. Restarting server
  2. Reseating RAM
  3. Changing SATA ports for boot drives
  4. Reseating cables
  5. Fresh install that formatted the two drives
  6. Checking zpool status - all vdevs report healthy
  7. Restarting the nginx and django services (rough commands sketched just after this list)
  8. Disabling jumbo frames (set MTU to 1500)
  9. Running top - CPU stays under 5% with 27G of memory free
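For what it's worth, here is roughly what I used from the console shell for step 7; the service names are my best guess for a stock 11.3 install (the UI backend seems to be nginx plus middlewared these days), so correct me if they're wrong:

Code:
# Restart the web UI stack from the console shell
# (service names assumed for a stock FreeNAS 11.3 install)
service nginx restart
service middlewared restart
# Check that both came back up
service nginx status
service middlewared status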
LOGS:
I've been looking through the console logs, and the only obvious errors I can find are these:

Code:
(probe5:mps1:0:3:0): INQUIRY.  CDB:  12 00 00 00 24 00
(probe5:mps1:0:3:0): CAM status: CCB request completed with an error
(probe5:mps1:0:3:0): Retrying command
(probe5:mps1:0:3:0): Error 5, Retries exhausted
...
/etc/rc: WARNING: failed to start rrdcached


EDIT: top shows no more than 3-5% CPU usage and 27G of memory free, with collectd and python3.7 using most of the CPU. Obviously, the console and shell are responsive.
EDIT2: There are no VMs, and the RAIDZ2 array is configured for SMB sharing only. Nothing should be taxing memory or CPU.
EDIT3: They're Xeon W5590s running at 3.3 GHz. I'm flashing the BIOS in case this is some weird hardware thing.
EDIT4: After a successful BIOS flash, I reinstalled FreeNAS. It runs on a pair of 120GB SSDs (PNY, I think) attached to the onboard SATA controller. It worked for 5-10 minutes, then the page started timing out again. The only things I did were configuring the NICs - two 10GbE interfaces on different subnets (192.168.1.0/24 and 192.168.99.0/24) - and creating a striped vdev of the SSDs and adding a volume to it. I had the MTU at 9000, but I dropped it back down to 1500 to see if that made a difference. Same failure. I disconnected the second NIC in case it was some odd collision thing, but no joy.
EDIT5: I'm now completely unable to access the web GUI, right back where I started. Still really hoping for someone to point me to wherever the various logs are kept so I can try to find something applicable; the locations I've been guessing at so far are sketched at the end of this post.
EDIT6: Really hoping for some strong troubleshooting tips or links - I'm passable with Debian but I don't know BSD at all. Help?
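For reference, these are the log locations I've been poking at from the console shell; the paths are my guesses based on stock FreeBSD/FreeNAS defaults, so please correct me if the important ones live elsewhere:

Code:
# Kernel and general system messages (standard FreeBSD locations)
tail -f /var/log/messages
dmesg | tail -n 50
# Middleware log - path assumed from a stock 11.3 install
tail -f /var/log/middlewared.log
# nginx logs for the web UI - directory assumed, may differ
ls -l /var/log/nginx/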
 

Samuel Tai

Never underestimate your own stupidity
Moderator
Joined
Apr 24, 2020
Messages
5,399
The errors indicate one of your HBAs may be going bad. Do you get the same behavior with the HBAs removed from the system?
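Before pulling the cards, it may be worth recording which devices hang off each controller so you can compare afterwards. Something along these lines from the shell should do it (standard FreeBSD tools; the da0 device name below is only an example):

Code:
# List every CAM-attached device along with the controller and bus it hangs off
camcontrol devlist -v
# Optionally check SMART health on a suspect drive (da0 is just an example name)
smartctl -a /dev/da0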
 

Team503

Cadet
Joined
Jul 21, 2020
Messages
8
I'll remove them and try tomorrow. Out of curiosity, how did you determine that?

It would have to be the LSI controller, since this behavior was happening prior to the installation of the PERC H310. Forgive my ignorance, but I know that in a hardware RAID, the array config is stored both on the controller and on each disk, allowing the replacement of the controller without losing the array (and its data). Is it the same in ZFS? If I replace the controller with the same model, will the RAIDZ2 configured on those disks still be available? Can I replace it with a different controller (another PERC H310, cuz man they're cheap)?

Thanks again for your help!
 

Team503

Cadet
Joined
Jul 21, 2020
Messages
8
I pulled both controllers and experienced the same behavior and error messages after reboot.

The only thing I can think of is that the two 120GB SSDs I installed FreeNAS on are connected to the motherboard's onboard SATA controller. I'll order another breakout cable and see if moving them to the PERC H310 in IT mode makes any difference, but it'll be a week before it's done.
 

Patrick M. Hausen

Hall of Famer
Joined
Nov 25, 2013
Messages
7,776
It would have to be the LSI controller, since this behavior was happening prior to the installation of the PERC H310. Forgive my ignorance, but I know that in a hardware RAID, the array config is stored both on the controller and on each disk, allowing the replacement of the controller without losing the array (and its data). Is it the same in ZFS? If I replace the controller with the same model, will the RAIDZ2 configured on those disks still be available? Can I replace it with a different controller (another PERC H310, cuz man they're cheap)?
With ZFS the information lives only on the disk drives. So as long as you did not use a RAID card that presents virtual drives to the OS, but flashed everything to IT mode, you can use any technology you like to connect the drives and your pool will import. For example (not that it would make much sense), you could put each drive into an external USB enclosure, hook them all up to your server, and import the pool.
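In practice the move is just an export and an import. A minimal sketch from the shell, with tank standing in for your actual pool name (on FreeNAS you would normally do this through the UI's import wizard so the middleware stays in sync, but that is obviously not an option while the UI is down):

Code:
# Cleanly export the pool before moving the drives (tank is a placeholder name)
zpool export tank
# ...power down, move the drives to the new controller, boot...
# Show pools that are available for import
zpool import
# Import the pool by name
zpool import tank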
 

Samuel Tai

Never underestimate your own stupidity
Moderator
Joined
Apr 24, 2020
Messages
5,399
Out of curiosity, how did you determine that?

Code:
(probe5:mps1:0:3:0): INQUIRY.  CDB:  12 00 00 00 24 00
(probe5:mps1:0:3:0): CAM status: CCB request completed with an error
(probe5:mps1:0:3:0): Retrying command
(probe5:mps1:0:3:0): Error 5, Retries exhausted


The mps kernel driver corresponds to one of your HBAs, in PCI slot 0:3:0.
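If you want to double-check which physical card probed as mps1, the boot messages normally spell it out:

Code:
# Show how the mps controllers (mps0, mps1, ...) attached at boot
dmesg | grep -i mps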
 

Team503

Cadet
Joined
Jul 21, 2020
Messages
8
With ZFS the information lives only on the disk drives. So as long as you did not use a RAID card that presents virtual drives to the OS, but flashed everything to IT mode, you can use any technology you like to connect the drives and your pool will import. For example (not that it would make much sense), you could put each drive into an external USB enclosure, hook them all up to your server, and import the pool.

Awesome, thanks!
 

Team503

Cadet
Joined
Jul 21, 2020
Messages
8
Code:
(probe5:mps1:0:3:0): INQUIRY.  CDB:  12 00 00 00 24 00
(probe5:mps1:0:3:0): CAM status: CCB request completed with an error
(probe5:mps1:0:3:0): Retrying command
(probe5:mps1:0:3:0): Error 5, Retries exhausted


The mps kernel driver corresponds to one of your HBAs, in PCI slot 0:3:0.

I had wondered what that address was! Thanks for explaining. Any thoughts on why the error doesn't change when both cards are removed? Bad slot, perhaps?
 

Samuel Tai

Never underestimate your own stupidity
Moderator
Joined
Apr 24, 2020
Messages
5,399
Look at lspci -tv to see what's at that address. If that's the on-board controller, then you unfortunately have a bad motherboard.
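If lspci isn't on the box, FreeBSD's built-in pciconf gives the same kind of view; a quick sketch:

Code:
# FreeBSD-native listing of PCI devices with their attached drivers
pciconf -lv
# Narrow to the storage controllers (mps = LSI SAS2 HBAs, ahci = onboard SATA)
pciconf -lv | grep -A 2 -E 'mps|ahci'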
 

Team503

Cadet
Joined
Jul 21, 2020
Messages
8
Look at lspci -tv to see what's at that address. If that's the on-board controller, then you unfortunately have a bad motherboard.

If that's the case, would moving the drives off the on-board controller and just not using it be a workaround? Dual proc Xeon motherboards aren't cheap, unfortunately.
 

Samuel Tai

Never underestimate your own stupidity
Moderator
Joined
Apr 24, 2020
Messages
5,399
Yes, that could be a work-around. I'd also disable the controller in the BIOS, if feasible.
 