SOLVED FreeNAS 11.1 Disappearing RAM?

fracai · Dec 20, 2017

First, I apologize for the lack of details. I'm remote to the system right now and can't get all the system details. I'll post again later today with further details and results from any suggestions. Thanks.

Now then, I upgraded from 9.10 to 11.1 yesterday afternoon. I haven't experienced any issues with my existing jails, save for an iohyve VM that lost network access, which I've since destroyed and started rebuilding. Only afterwards did I see the thread with instructions on potentially reusing the disk zvol. Oh well.

Anyway, everything has been fine until today when I noticed that I lost around 2GB of RAM last night. The system info page reports 16GB, but the reporting page shows a drop of almost 3GB around 0100 EST. It then bumps up by a bit more than 500 MB at 0120. Around 1030 it dropped another GB. I'm sitting just about exactly 3GB down from what I expect.

If I look at the FreeNAS web ui I see 16 GB reported. 'sysctl hw.physmem' shows 16 GB as well. The reporting page shows the drop in RAM.

If it was a bad stick I'd have expected a system halt or to see the total drop by 8GB. This looks more like something is taking up the RAM, but not being included in the numbers for active, inactive, wired, free, and cache. I do have the new VM running, which I gave about 3GB, but I'd expect that to be included in the reporting. When I get the chance I'll try killing the VM to see if the RAM comes back.

So, any suggestions on what I can do to identify what is using the missing RAM? Or could something actually be wrong with a stick, with individual RAM chips failing over time?

I'll be back as soon as possible to update this with my actual system config and any results from further testing.

Thanks for your help.

edited to add system configuration:

ASRock E3C226D2I
Intel Xeon E3-1220 v3 @ 3.10GHz
16 GB ECC
RAIDZ2: 6x Seagate 4TB ST4000DM000

danb35 · Dec 20, 2017

fracai said:
I noticed that I lost around 2GB of RAM last night.

How, exactly, did you notice that you "lost around 2GB of RAM"?

fracai · Dec 20, 2017

I send the collectd stats and some others to Librato (see attached image) but the charts on the Reporting tab (data from collectd) show the same drop.

The gap in the image is when I rebooted for the upgrade to 11.1.

Xelas · Dec 20, 2017

If it's an older system, maybe it's getting senile? My total RAM is dropping with age as well, and the ECC functionality failed right about when my kids were born and I was sleeping for 3-4 hours per night. It never came back. I get memory errors all the time. I actually couldn't recall my boss's full name for a few seconds on a conference call earlier today!

Now, to get back on topic - have you tried restarting the collectd daemon? It could just be a glitch with the daemon.
What if you side-step collectd and query the system more thoroughly through sysctl? Here is a page with some scripts you may find helpful, including a fairly nifty shell script:
https://www.cyberciti.biz/faq/freebsd-command-to-get-ram-information/

Also, this thread was helpful - you can apparently query the health of ECC RAM from within FreeBSD (and I assume by extension FreeNAS) if you doubt the health of your RAM:
https://lists.freebsd.org/pipermail/freebsd-performance/2012-April/004585.html
The danger here is that nothing being reported could mean your RAM is fine, or it could be that ECC errors aren't being reported or logged correctly. If you have a known-bad RAM module you can test with, it might be useful to test the ECC functionality. Maybe one way to do this is to intentionally damage a single RAM package on a RAM module that is too small to be useful? Or do you have spares lying around you can use as a test specimen? I've never done that, so I have no idea if that will work or not!

fracai · Dec 20, 2017

Code:

> sudo mcelog --ascii --file /var/log/messages
>

Looks good. Either my RAM is fine or ECC isn't detected properly.

Code:

> sysctl hw.physmem
hw.physmem: 17070518272

Yep, 16GB.

Code:

> sysctl hw | egrep 'hw.(phys|user|real)'
hw.physmem: 17070518272
hw.usermem: 6129745920
hw.realmem: 17985175552

Not sure how this works out exactly, but looks OK.

Code:

> grep memory /var/run/dmesg.boot
real memory  = 17985175552 (17152 MB)
avail memory = 16517664768 (15752 MB)

Looks reasonable. I'm not sure why available is so high yet still different from real, but it's not too far off.
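As a sanity check on those byte counts, here is a small Python sketch that converts the values quoted above to MiB and GiB. The helper names are made up for illustration and are not part of any FreeBSD tool; the numbers are copied from the sysctl and dmesg output in this post.

```python
# Convert the byte counts quoted in this post into MiB / GiB.
# Helper names are illustrative only.

MIB = 1024 ** 2
GIB = 1024 ** 3

def to_mib(n_bytes):
    """Whole MiB, truncated."""
    return n_bytes // MIB

def to_gib(n_bytes):
    """GiB as a float, rounded to two decimals."""
    return round(n_bytes / GIB, 2)

# Values copied from the output above.
values = {
    "hw.physmem": 17070518272,
    "hw.realmem": 17985175552,
    "avail memory": 16517664768,
}

for name, n in values.items():
    print(f"{name:>12}: {to_mib(n):>6} MiB  ({to_gib(n)} GiB)")
```

The MiB results line up with the MB figures dmesg prints in parentheses, so the numbers are at least internally consistent.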

Code:

> /usr/local/bin/perl ./freebsd-memory.pl
SYSTEM MEMORY INFORMATION:
mem_wire:       10978406400 (  10469MB) [ 66%] Wired: disabled for paging out
mem_active:  +    737406976 (    703MB) [  4%] Active: recently referenced
mem_inactive:+   1292951552 (   1233MB) [  7%] Inactive: recently not referenced
mem_cache:   +            0 (      0MB) [  0%] Cached: almost avail. for allocation
mem_free:    +    306196480 (    292MB) [  1%] Free: fully available for allocation
mem_gap_vm:  +   3307835392 (   3154MB) [ 19%] Memory gap: UNKNOWN
-------------- ------------ ----------- ------
mem_all:     =  16622796800 (  15852MB) [100%] Total real memory managed
mem_gap_sys: +    447721472 (    426MB)        Memory gap: Kernel?!
-------------- ------------ -----------
mem_phys:    =  17070518272 (  16279MB)        Total real memory available
mem_gap_hw:  +    109350912 (    104MB)        Memory gap: Segment Mappings?!
-------------- ------------ -----------
mem_hw:      =  17179869184 (  16384MB)        Total real memory installed

SYSTEM MEMORY SUMMARY:
mem_used:       15580721152 (  14858MB) [ 90%] Logically used memory
mem_avail:   +   1599148032 (   1525MB) [  9%] Logically available memory
-------------- ------------ ----------- ------
mem_total:   =  17179869184 (  16384MB) [100%] Logically total memory

Ah, here we go. That mem_gap_vm line is what is missing from the memory reporting. It's odd that it's listed as UNKNOWN, while also identifying it as mem_gap_vm. Does the 'vm' not refer to Virtual Machine? Stopping the VM cut the mem_gap_vm line in half, but it didn't go away.
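To make the arithmetic behind that line explicit: judging by the output, mem_gap_vm is simply what is left of mem_all after subtracting the queues the script knows about. A quick Python check of that assumption, using only the numbers printed above (the variable names are made up for this sketch):

```python
# Reproduce the mem_gap_vm figure from the script output above.
# All byte counts are copied from that output; nothing here queries the system.

mem_all = 16622796800  # "Total real memory managed"

known_queues = {
    "wire": 10978406400,
    "active": 737406976,
    "inactive": 1292951552,
    "cache": 0,
    "free": 306196480,
}

accounted = sum(known_queues.values())
gap = mem_all - accounted

print(f"accounted for: {accounted} bytes")
print(f"unaccounted:   {gap} bytes ({gap // 1024**2} MB)")
```

The remainder matches the mem_gap_vm line, which suggests the "gap" is memory sitting in some category the script does not read, rather than memory that has physically vanished.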

So I'm not sure if this is a reporting bug, or a memory leak.
I've created #27356 to track the issue.


Xelas · Dec 20, 2017

I don't think that's RAM taken up by VMs. I think the "VM" is in reference to Virtual Memory. There could be some complex interplay with the page file? It wouldn't make sense for RAM allocated to virtual machines to show up in a RAM report at the kernel level.
What if you query swap file use at the same time? Just curious if there is a correlation.

fracai · Dec 21, 2017

That's a good point, though it's interesting that the gap dropped by half when I shut down the VM.
I think I remember swap use being very low, but I don't have the numbers. I'll post those later today. I'll try starting the VM up again today and I'll post that impact as well.

Xelas · Dec 21, 2017

fracai said:
That's a good point, though it's interesting that the gap dropped by half when I shut down the VM.
I think I remember swap use being very low, but I don't have the numbers. I'll post those later today. I'll try starting the VM up again today and I'll post that impact as well.

The gap drop could be explained if stopping the virtual machine impacted how much memory was being swapped. That would also explain why there was a correlation, but not 1:1. Shutting down the virtual machine means less RAM used, so the system moves some stuff around but doesn't completely page in what it had paged out into the swap file. If "vm" = "virtual machine", then you would expect shutting down the virtual machine to free all of the RAM up.

fracai · Dec 21, 2017

Alexander Motin figured it out in the ticket. It looks like the Cache category has been replaced with Laundry. I modified the freebsd-memory.pl script to include a line for Laundry and the gap disappears.
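For illustration only, here is a minimal sketch of the same idea in Python rather than Perl. It assumes the extra page queue is exposed as `vm.stats.vm.v_laundry_count` alongside the other `vm.stats.vm.v_*_count` counters, and that `hw.pagesize` gives the page size. The function names and the sample page counts are hypothetical; they are not taken from the patched script or from this system.

```python
import subprocess

# Page queues summed to account for managed memory. "laundry" is the
# addition; the sysctl names built from these are assumed for FreeBSD 11.1.
QUEUES = ("wire", "active", "inactive", "laundry", "cache", "free")

def read_sysctl(name):
    """Return an integer sysctl value (only works on FreeBSD)."""
    out = subprocess.check_output(["sysctl", "-n", name])
    return int(out.decode().strip())

def memory_gap(page_counts, total_pages, page_size):
    """Bytes of managed memory not covered by any queue in page_counts."""
    accounted = sum(page_counts.get(q, 0) for q in QUEUES)
    return (total_pages - accounted) * page_size

def live_memory_gap():
    """Query the running system and compute the gap."""
    counts = {q: read_sysctl(f"vm.stats.vm.v_{q}_count") for q in QUEUES}
    total = read_sysctl("vm.stats.vm.v_page_count")
    return memory_gap(counts, total, read_sysctl("hw.pagesize"))

# Hypothetical page counts, purely to show the effect of the extra queue.
sample = {"wire": 600, "active": 100, "inactive": 150,
          "laundry": 120, "cache": 0, "free": 30}
without_laundry = {k: v for k, v in sample.items() if k != "laundry"}

print(memory_gap(without_laundry, 1000, 4096))  # uncounted pages appear as a gap
print(memory_gap(sample, 1000, 4096))           # gap closes once they are counted
```

On a FreeBSD box, `live_memory_gap()` would do the real query; the sample data only demonstrates that pages in an unread queue show up as an unexplained gap.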

Thanks all.

Xelas · Dec 21, 2017

Excellent - and thank you for wrapping up the thread with the solution. I HATE it when the OP fixes the issue and just disappears, or even worse, writes "Fixed it" and disappears.

Looks like they'll also need to modify the config for collectd so that it gets reported properly as well.
