Intermittent freezing issue

Status
Not open for further replies.

eiton

Cadet
Joined
Dec 11, 2017
Messages
9
Hi guys,

I've been having an issue lately with parts of my system freezing, for lack of a better term. The problem most often shows up as the web GUI being inaccessible (attempting to connect times out). When I try to ssh into the server, I get a "%" prompt after typing in my password, but after I type "bash", nothing happens.

All my apps in jails continue to work fine and I can ssh into my jails without issue.

On rare occasion, I'll have issues trying to access files in my zpool through smb and I'll also notice the previous two issues when that happens.

The only way I've managed to resolve the issue is a hard reboot.

I'm sure there are some logs and similar things you guys would need to see to understand what's going on. I have no clue which ones those would be, but I'd love to find out and share them with you.

Machine specs:
2x Xeon E5-2650
old Supermicro board
64 GB RAM
4x 4 TB HDD (RAIDZ2)
FreeNAS 11 U4

/var/log/messages output (not close to the point where the freeze occurs)
Code:
Dec 12 08:13:33 freenas collectd[3261]: utils_vl_lookup: The user object callback failed with status 2.
Dec 12 08:31:13 freenas collectd[3261]: aggregation plugin: Unable to read the current rate of "freenas.local/cpu-9/cpu-user".

This message appears for multiple CPUs

Seeing as my hardware is second hand off ebay, I wouldn't be too surprised if one of those components was the root of the problem.
 

Ericloewe

Server Wrangler
Moderator
Joined
Feb 15, 2014
Messages
20,194
Start by looking at the local console and tell us what's going on there.
 

eiton

Cadet
Joined
Dec 11, 2017
Messages
9
Anything in /var/log/messages when this happens?
Unfortunately, my logs don't seem to go back far enough to where it last froze yesterday. However, I do see a lot of entries like this on various cpus.
Code:
Dec 12 08:13:33 freenas collectd[3261]: utils_vl_lookup: The user object callback failed with status 2.
Dec 12 08:31:13 freenas collectd[3261]: aggregation plugin: Unable to read the current rate of "freenas.local/cpu-9/cpu-user".


Start by looking at the local console and tell us what's going on there.
For that, I see the same messages as above (from what I recall). Picking the shell option and entering any command results in the same unresponsiveness I mentioned earlier.
 

Ericloewe

Server Wrangler
Moderator
Joined
Feb 15, 2014
Messages
20,194
So you get to the shell and then it becomes unresponsive?

Unfortunately, my logs don't seem to go back far enough to where it last froze yesterday. However, I do see a lot of entries like this on various cpus.
Code:
Dec 12 08:13:33 freenas collectd[3261]: utils_vl_lookup: The user object callback failed with status 2.
Dec 12 08:31:13 freenas collectd[3261]: aggregation plugin: Unable to read the current rate of "freenas.local/cpu-9/cpu-user".
That's just noise, which is fixed in 11.1.
 

eiton

Cadet
Joined
Dec 11, 2017
Messages
9
Connecting from SSH: After entering my password, I'm presented with a "%" and a blinking cursor. No commands work after this point. I see the line on the screen and nothing happens.

Connecting to a local terminal via IPMI: Selecting the shell option and pressing Enter does nothing. I'll see my choice on the screen but get no response from the machine after that.
 

Ericloewe

Server Wrangler
Moderator
Joined
Feb 15, 2014
Messages
20,194
Okay, that makes more sense. Still doesn't give any specific clues, but it's less surreal.
 

eiton

Cadet
Joined
Dec 11, 2017
Messages
9
Poking around in my /var/log directory I noticed these files.
Code:
-rw-r--r--  1 root  wheel	 19654 Dec 12 00:00 messages.0.bz2
-rw-r--r--  1 root  wheel	 21429 Dec  1 00:00 messages.1.bz2


The last time this problem occurred was yesterday, the 11th. Might messages.0.bz2 contain the missing logs?
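
A rotated log like messages.0.bz2 can be searched without unpacking it to disk. Here is a minimal sketch in Python, assuming the usual syslog line prefix seen in the excerpts above (for example "Dec 11 17:55:20"). The function name `lines_for_day` is only illustrative.

```python
import bz2


def lines_for_day(path, day_prefix):
    """Return the lines of a bz2-compressed log that start with day_prefix.

    day_prefix is the syslog date as it appears at the start of a line,
    e.g. "Dec 11". Assumption: classic syslog timestamps, where
    single-digit days are padded with a space ("Dec  1").
    """
    matches = []
    # Mode "rt" decompresses on the fly and yields text lines.
    # errors="replace" keeps the scan going past stray non-UTF-8 bytes.
    with bz2.open(path, "rt", encoding="utf-8", errors="replace") as log:
        for line in log:
            if line.startswith(day_prefix):
                matches.append(line.rstrip("\n"))
    return matches
```

Calling `lines_for_day("/var/log/messages.0.bz2", "Dec 11")` would return only the entries from the 11th, which is usually a small enough slice to read through by hand.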
 

eiton

Cadet
Joined
Dec 11, 2017
Messages
9
I found the logs from the freeze in the messages.0.bz2 file, and what jumps out at me is that the external drive the OS lives on disconnected.

Here are the logs right up to what I believe is the reboot:
Code:
Dec 11 17:55:20 freenas (da4:umass-sim0:0:0:0): READ(10). CDB: 28 00 01 43 f9 0e 00 00 55 00
Dec 11 17:55:20 freenas (da4:umass-sim0:0:0:0): CAM status: CCB request completed with an error
Dec 11 17:55:20 freenas (da4:umass-sim0:0:0:0): Retrying command
Dec 11 17:55:20 freenas ugen1.3: <Western Digital External HDD> at usbus1 (disconnected)
Dec 11 17:55:20 freenas umass0: at uhub2, port 5, addr 3 (disconnected)
Dec 11 17:55:20 freenas (da4:umass-sim0:0:0:0): READ(10). CDB: 28 00 01 43 f9 0e 00 00 55 00
Dec 11 17:55:20 freenas (da4:umass-sim0:0:0:0): CAM status: CCB request completed with an error
Dec 11 17:55:20 freenas (da4:umass-sim0:0:0:0): Retrying command
Dec 11 17:55:20 freenas (da4:umass-sim0:0:0:0): READ(10). CDB: 28 00 01 43 f9 0e 00 00 55 00
Dec 11 17:55:20 freenas (da4:umass-sim0:0:0:0): CAM status: CCB request completed with an error
Dec 11 17:55:20 freenas (da4:umass-sim0:0:0:0): Retrying command
Dec 11 17:55:20 freenas (da4:umass-sim0:0:0:0): READ(10). CDB: 28 00 01 43 f9 0e 00 00 55 00
Dec 11 17:55:20 freenas (da4:umass-sim0:0:0:0): CAM status: CCB request completed with an error
Dec 11 17:55:20 freenas (da4:umass-sim0:0:0:0): Retrying command
Dec 11 17:55:20 freenas (da4:umass-sim0:0:0:0): READ(10). CDB: 28 00 01 43 f9 0e 00 00 55 00
Dec 11 17:55:20 freenas (da4:umass-sim0:0:0:0): CAM status: CCB request completed with an error
Dec 11 17:55:20 freenas (da4:umass-sim0:0:0:0): Error 5, Retries exhausted
Dec 11 17:55:20 freenas (da4:umass-sim0:0:0:0): READ(10). CDB: 28 00 00 00 06 38 00 00 10 00
Dec 11 17:55:20 freenas (da4:umass-sim0:0:0:0): CAM status: CCB request completed with an error
Dec 11 17:55:20 freenas (da4:umass-sim0:0:0:0): Retrying command
Dec 11 17:55:20 freenas (da4:umass-sim0:0:0:0): READ(10). CDB: 28 00 00 00 06 38 00 00 10 00
Dec 11 17:55:20 freenas (da4:umass-sim0:0:0:0): CAM status: CCB request completed with an error
Dec 11 17:55:20 freenas (da4:umass-sim0:0:0:0): Retrying command
Dec 11 17:55:20 freenas (da4:umass-sim0:0:0:0): READ(10). CDB: 28 00 00 00 06 38 00 00 10 00
Dec 11 17:55:20 freenas (da4:umass-sim0:0:0:0): CAM status: CCB request completed with an error
Dec 11 17:55:20 freenas (da4:umass-sim0:0:0:0): Retrying command
Dec 11 17:55:20 freenas (da4:umass-sim0:0:0:0): READ(10). CDB: 28 00 00 00 06 38 00 00 10 00
Dec 11 17:55:20 freenas (da4:umass-sim0:0:0:0): CAM status: CCB request completed with an error
Dec 11 17:55:20 freenas (da4:umass-sim0:0:0:0): Retrying command
Dec 11 17:55:20 freenas (da4:umass-sim0:0:0:0): READ(10). CDB: 28 00 00 00 06 38 00 00 10 00
Dec 11 17:55:20 freenas (da4:umass-sim0:0:0:0): CAM status: CCB request completed with an error
Dec 11 17:55:20 freenas (da4:umass-sim0:0:0:0): Error 5, Retries exhausted
Dec 11 17:55:20 freenas da4 at umass-sim0 bus 0 scbus8 target 0 lun 0
Dec 11 17:55:20 freenas da4: <WD 2500BEV External 1.05> s/n 575845383038535537383637 detached
Dec 11 17:55:20 freenas (da4:umass-sim0:0:0:0): Periph destroyed
Dec 11 17:55:20 freenas umass0: detached
Dec 11 17:55:28 freenas ugen1.3: <Western Digital External HDD> at usbus1
Dec 11 17:55:28 freenas umass0 numa-domain 0 on uhub2
Dec 11 17:55:28 freenas umass0: <Western Digital External HDD, class 0/0, rev 2.00/1.05, addr 3> on usbus1
Dec 11 17:55:28 freenas umass0:  SCSI over Bulk-Only; quirks = 0x0108
Dec 11 17:55:28 freenas umass0:8:0: Attached to scbus8
Dec 11 17:55:28 freenas da4 at umass-sim0 bus 0 scbus8 target 0 lun 0
Dec 11 17:55:28 freenas da4: <WD 2500BEV External 1.05> Fixed Direct Access SPC-2 SCSI device
Dec 11 17:55:28 freenas da4: Serial Number 575845383038535537383637
Dec 11 17:55:28 freenas da4: 40.000MB/s transfers
Dec 11 17:55:28 freenas da4: 238475MB (488397168 512 byte sectors)
Dec 11 17:55:28 freenas da4: quirks=0x2<NO_6_BYTE>
Dec 11 17:56:21 freenas ZFS: vdev state changed, pool_guid=4172507297089948200 vdev_guid=16449666988256648440
Dec 11 18:02:29 freenas syslog-ng[1585]: syslog-ng starting up; version='3.7.3'


Would I be correct in assuming there's something up with that old USB drive and that I should try replacing it with a new one? My freeze happened while moving files around through SMB in this particular instance.
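
To check whether other freezes line up with the same drive dropping off the bus, the log can be scanned for detach events. This is a rough sketch, assuming kernel messages shaped like the ones in the excerpt above (lines ending in "detached" or "(disconnected)"). The regular expression is my own reading of this excerpt, not a documented format, and `find_drops` is a made-up name.

```python
import re

# Pattern guessed from the excerpt above (assumption, not an official format):
# "<Mon> <day> <hh:mm:ss> <host> <device>: ... detached" or "... (disconnected)"
EVENT = re.compile(
    r"^(?P<time>\w{3}\s+\d+ \d\d:\d\d:\d\d) \S+ "
    r"(?P<device>[\w.]+): (?P<detail>.*?)\s*"
    r"(?P<kind>detached|\(disconnected\))$"
)


def find_drops(lines):
    """Return (time, device, kind) for each detach/disconnect line."""
    events = []
    for line in lines:
        m = EVENT.match(line.rstrip("\n"))
        if m:
            # Strip the parentheses so both kinds read as plain words.
            events.append((m.group("time"), m.group("device"), m.group("kind").strip("()")))
    return events
```

If the same device name shows up before every freeze, that points at the drive, its cable, or the USB port rather than at the rest of the system.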
 

wblock

Documentation Engineer
Joined
Nov 14, 2014
Messages
1,506
Use diskinfo -v da to be sure which drive it is. Even if you have only one USB drive, always check.
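
One way to tie the log entry to a physical drive is the serial number. The long serial in the log above looks like hex-encoded ASCII, which some USB bridges appear to report (that reading is an assumption on my part, not something the log states). Decoding it gives a string that can be compared with the label on the enclosure or with whatever serial diskinfo -v reports. The helper name `decode_usb_serial` is hypothetical.

```python
def decode_usb_serial(serial):
    """Decode a hex-encoded ASCII serial; return it unchanged otherwise.

    Assumption: the serial is ASCII text written out as hex digits.
    A serial that is not valid hex, or that does not decode to printable
    ASCII, is returned as-is.
    """
    try:
        text = bytes.fromhex(serial).decode("ascii")
    except ValueError:  # also covers UnicodeDecodeError
        return serial
    return text if text.isprintable() else serial


# Serial copied from the da4 lines in the log above.
print(decode_usb_serial("575845383038535537383637"))  # prints WXE808SU7867
```

Note that a genuinely numeric serial of even length would also decode to something, so treat the result as a hint to check against the drive label, not as proof.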
 