TrueCommand 1.3 is Available

Joined
Jan 4, 2014
Messages
1,644
The SMR drive alerts were much easier to do in TrueCommand. It's got a more powerful 2-factor alerting engine, and we wanted to make the alerts available ASAP and for users with previous releases of the software, not just 12.0. If you only have one system, there is the option of manually checking the drives.

According to this StackExchange thread, dmidecode -t 17 can be used to determine whether or not ECC RAM is being used on a FreeBSD system. In the same way that TC 1.3 is able to alert the end-user about SMR drives, I wonder if it's useful for a future version of TC (or TrueNAS) to alert the end-user that non-ECC RAM is being used in a system? Like an SMR drive, there's nothing to prevent its use, but it's strongly discouraged. Of course, the CPU may not support ECC RAM either, but at least there's an alert that FreeNAS/TrueNAS is not being run in a recommended setting.
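
To make the idea concrete, here is a minimal Python sketch of how such a check might read the output of dmidecode -t 17. It rests on an assumption that is not confirmed anywhere in this thread: that ECC modules report a Total Width larger than their Data Width (typically 72 bits against 64). The function names are hypothetical, and this is not how TC or TrueNAS does anything today.

```python
import re
import subprocess


def ecc_from_dmidecode(text):
    """Guess ECC use from `dmidecode -t 17` text.

    Returns True if every populated module reports more total bits than
    data bits, False if at least one does not, and None if no widths
    could be read. The width comparison is a heuristic, not a guarantee.
    """
    results = []
    # Each "Memory Device" record describes one DIMM slot.
    for record in text.split("Memory Device")[1:]:
        total = re.search(r"Total Width:\s*(\d+)\s*bits", record)
        data = re.search(r"Data Width:\s*(\d+)\s*bits", record)
        # Empty slots report "Unknown" widths and are skipped.
        if total and data:
            results.append(int(total.group(1)) > int(data.group(1)))
    if not results:
        return None
    return all(results)


def read_dmidecode():
    """Run dmidecode for memory devices (normally needs root)."""
    done = subprocess.run(
        ["dmidecode", "-t", "17"], capture_output=True, text=True, check=True
    )
    return done.stdout
```

On a live system, ecc_from_dmidecode(read_dmidecode()) would give the guess. A result of None means the widths were not reported, which should be treated as "unknown" rather than "non-ECC".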
 
Joined
Jan 4, 2014
Messages
1,644
I suspect this is a valid arrangement with multiple management terminals...

screenshot.364.png


...but this is not...

screenshot.361.png


My question is... does upgrading to a newer version of TC result in changes to the TC database so that a rollback to an earlier version of TC is no longer possible?
 

kenmoore

TrueCommand Project Lead
iXsystems
Joined
May 1, 2019
Messages
51
@Basil Hendroff : In answer to your questions (I will tag them with numbers below)

1. Multiple TC instances/versions using the same database
Answer: No, that is not recommended. I have seen a couple of instances of database corruption when I start up multiple containers using the same data directory. I think it is related to how Docker does the directory->container passthrough, but I am not sure about the exact cause.

2. Rolling a 1.3 TC container back to 1.2
Answer: This is actually possible, and I do this regularly when testing out nightly/production versions of TrueCommand locally. We are very careful not to introduce breaking changes in the TC database layout, specifically so that it stays backwards compatible as much as possible. I would not rely on this, though: we can never predict the future, and a breaking change might be needed in some later version.
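
Since rollback compatibility is not guaranteed, one cautious habit is to copy the data directory before switching versions. The following is only a sketch under assumptions: the function name and both paths are hypothetical, and it presumes the container has been stopped first so that nothing else is writing to the directory (see point 1).

```python
import shutil
import time
from pathlib import Path


def backup_data_dir(data_dir, backup_root):
    """Copy data_dir to a timestamped folder under backup_root.

    Returns the path of the new copy. Stop the container first so the
    database files are not changing while they are copied.
    """
    source = Path(data_dir)
    if not source.is_dir():
        raise FileNotFoundError(f"data directory not found: {source}")
    stamp = time.strftime("%Y%m%d-%H%M%S")
    target = Path(backup_root) / f"{source.name}-{stamp}"
    # copytree creates the target and refuses to overwrite an existing one.
    shutil.copytree(source, target)
    return target
```

For example, backup_data_dir("/path/to/tc-data", "/path/to/backups") would be called with whatever host directory is mapped into the container. If a later version ever did change the database layout, the copy could be put back in place before starting the older version.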

3. ECC RAM alert
Answer: This is an interesting possibility, so I went ahead and created a ticket for this request here: https://jira.ixsystems.com/browse/TC-1477
We will talk about this internally and see if this is something we can put together for you in a future version.

4. Memory Statistics
Answer: This is a very tricky topic, as you can tell by all the various ways memory usage gets reported across different platforms. Basically it comes down to the question "What are you using your system for?" If you are using it purely as a file server, then you typically want as much of your memory used at all times (preferably by the ZFS cache, with a high cache hit rate). If you are using it as a VM/application platform, or some kind of hybrid mix, then that picture gets all muddied and people have a tendency to get alarmed over memory stats which are not actually a problem.
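
As a rough illustration of what "a high cache hit rate" means in numbers, here is a small Python sketch. It assumes you already have cumulative hit and miss counters for the ZFS cache (on FreeBSD-based systems these are usually read from the kstat.zfs.misc.arcstats.hits and kstat.zfs.misc.arcstats.misses sysctl values). The function names are hypothetical and this is not how TrueCommand calculates anything.

```python
def arc_hit_rate(hits, misses):
    """Return the cache hit rate as a percentage, or None with no lookups."""
    total = hits + misses
    if total == 0:
        return None
    return 100.0 * hits / total


def interval_hit_rate(earlier, later):
    """Hit rate between two (hits, misses) samples.

    The counters are cumulative since boot, so the difference between
    two samples describes recent behaviour better than the raw totals.
    """
    return arc_hit_rate(later[0] - earlier[0], later[1] - earlier[1])
```

For example, arc_hit_rate(90, 10) gives 90.0. Taking two samples some minutes apart and passing them to interval_hit_rate shows how the cache is doing under the current workload rather than averaged over the whole uptime.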

What we have found so far with TrueCommand is that the memory statistics on the dashboard typically caused more questions for people than answers, because those memory statistics do not easily translate into actionable information (aside from the total memory size, which you referenced earlier). After working with experienced sysadmins and IT teams, we decided to shift the dashboard metrics to a layered framework based on priorities:
1. Multi-system dashboard card: Show the top-priority information for the system - things that can result in production down situations or degraded performance.
2. Single-system dashboard: Open up the next layer of the system metrics for enhanced diagnostics (Example: Is the high CPU utilization a temporary state, or has it been this way for a while?)
3. Detailed system analysis via reports (typically as the result of an alert): Basically inspect any/all of the metrics about the system surrounding the time of the alert in order to gain insight into the causes/solutions for the issue.

We found that the memory statistics were more often used as part of the system analysis or post-alert inspection rather than as a good source of top-level information for system admins, so we dropped them from the dashboard cards in 1.3 to prevent them from causing confusion. I do think a case could be made to add them back into the expanded dashboard metrics (probably as another time chart, similar to the storage growth or usage charts), so I went ahead and made an improvement ticket to track this change for a future version of TrueCommand: https://jira.ixsystems.com/browse/TC-1478

Sorry for the long reply, but I hope this answered all your questions!
 
Joined
Jan 4, 2014
Messages
1,644
@kenmoore Thank you for your considered response. You've clarified a number of points for me and given me some insight into the design of TC 1.3. Thank you also for considering my suggestions and for raising the relevant ticket requests.

Basically it comes down to the question "What are you using your system for?" If you are using it purely as a file server, then you typically want as much of your memory used at all times (preferably by the ZFS cache, with a high cache hit rate).
Thanks for this useful tip! I'll bear this in mind.

When I first saw the colour scheme for TC 1.3, like @Patrick M. Hausen in post #3, my first thought was 'WTF!', but I now understand the clever use of subdued colours. Using brighter colours for issues makes it visually very easy to identify a server experiencing problems within a cluster of servers.
 

morganL

Captain Morgan
Administrator
Moderator
iXsystems
Joined
Mar 10, 2018
Messages
2,691
According to this StackExchange thread, dmidecode -t 17 can be used to determine whether or not ECC RAM is being used on a FreeBSD system. In the same way that TC 1.3 is able to alert the end-user about SMR drives, I wonder if it's useful for a future version of TC (or TrueNAS) to alert the end-user that non-ECC RAM is being used in a system? Like an SMR drive, there's nothing to prevent its use, but it's strongly discouraged. Of course, the CPU may not support ECC RAM either, but at least there's an alert that FreeNAS/TrueNAS is not being run in a recommended setting.

An SMR drive issue could cause a VDEV and pool failure and the loss of a lot of data... it could be catastrophic and predictable. We built the logic so that in future we can flag any other bad drive issues.

An ECC RAM issue is very rare on small systems and may cause a file corruption or a system reboot. It can waste a lot of time diagnosing the issue and so we recommend and support systems with ECC. If you have this issue, let us know and make the suggestion. There are a lot of issues that we would like to detect early, but we are prioritizing based on events we see in deployed systems.
 
Joined
Jan 4, 2014
Messages
1,644
An ECC RAM issue is very rare on small systems and may cause a file corruption or a system reboot.
I believe you meant to say 'A non-ECC RAM issue is very rare on small systems and may cause a file corruption or a system reboot.'

It can waste a lot of time diagnosing the issue and so we recommend and support systems with ECC.
I wholeheartedly agree and herein lies the problem for the community. Many novices begin their journey with FreeNAS on h/w that isn't server-grade. While experimenting with FreeNAS, that's okay. The issues start when they begin to depend on that h/w to run FreeNAS. It's a potential showstopper for forum support. Consider the following conversation:

'FreeNAS has corrupted my file!' The conversation goes one of two ways now... 'It's possible that your use of non-ECC RAM has caused the corruption so I'm not going to waste my time diagnosing the issue any further until you address this.' or... 'I see you're using ECC RAM. Let's investigate further.'

There are a lot of issues that we would like to detect early, but we are prioritizing based on events we see in deployed systems.

The SMR drive issue caught the ZFS community by surprise, and iXsystems are to be commended for making it a priority to detect SMR drives in deployed TrueNAS systems. It would indeed be surprising if deployed systems that iXsystems were involved with used anything but ECC RAM. From this perspective, early detection of non-ECC RAM use would not even get a look-in. However, it's not surprising to see FreeNAS community builds on non-server-grade hardware.

The ability of TC 1.3 to detect SMR drives got me thinking, from a community perspective, that the ability to warn the user that non-server grade components (such as non-ECC RAM) were in use, might be an interesting consideration for TC (or TrueNAS Core) in the future.
 
Joined
Jan 4, 2014
Messages
1,644
I thought it might be a theme thing, and I thought I remembered a theme setting, but in a little bit of clicking around and checking the manual (which has yet to be updated, BTW) I didn't find it.
The colour palette is in the top left-hand corner.
screenshot.378.png
 

KevDog

Patron
Joined
Nov 26, 2016
Messages
462
This is useful. TC 1.3 picked up that I had an SMR drive on one of my servers.

Disk is nowhere to be seen on the Resources % tile on any of my servers. Also, are flat lines expected on an active system because that's what I'm observing?

View attachment 39953

On all systems with SMB and NFS active, the Clients tile shows SMB zero and NFS is nowhere to be seen.

View attachment 39954

Just scratching my head -- based on what you posted with your graphs? How did you conclude you had an SMR drive?
 
Joined
Jan 4, 2014
Messages
1,644
Just scratching my head -- based on what you posted with your graphs? How did you conclude you had an SMR drive?
Not from the graphs, but from an alert. The ability to detect SMR drives appears to be built into TC.
screenshot.824.png
 

Don Dayton

Cadet
Joined
Jan 7, 2021
Messages
9
I had 4 Supermicro servers, each with 8 hot-swap 2TB or 4TB SAS drives, a 6-core Xeon and 64GB of ECC RAM, running FreeNAS 11.1-RELEASE. I had been following users who had started adopting TrueNAS, and it seemed that TrueNAS 12.0U1 was ready to deploy, so I upgraded 2 of the 4 and that went just fine. The other thing I wanted to implement was TrueCommand.

The first thing to know is that none of our servers have internet access. We can get to the internet via our WAN connection to the corporate datacenter, as long as we log in with an AD domain account that belongs to an external internet access allow group, and then only via a browser with a defined proxy server.

I downloaded the TrueCommand vmdk and what I got was version 1.2. I installed it using VirtualBox on my laptop and got it working, but it seemed like I had to retry the initial sign-up several times before I finally got past the login. Once I was in, I had no trouble adding the 2 servers. The next day I was able to get to the webUI login but unable to log in. I checked the service and it was stopped in a failed state. If I started it, the service would just stop again.

I signed up for the ixportal and used it to get the latest vmdk, which was TrueCommand-2020-1.3.2-VMDK. I was able to install it on VirtualBox and get logged in, but every time I attempt to add one of the servers that I had registered before, the service crashes and restarts. The same thing happens if I try to add another user.

I decided to try running the vmdk on my VMware ESXi 6.7U3 host. TrueCommand 1.2 runs, but the service crashes every time I try the sign-up part of the login. I tried 1.3.2 and it won't work, saying it doesn't have write access to the disk. I checked the protection and both versions had the same rw- --- ---. I tried changing it to rw- rw- rw- but still had no write access. In VMware the disks had to be defined as IDE; SCSI and SATA would not work with the vmdk.

On my laptop at home, where I am connected to the internet, I used VirtualBox to set up a VM of TrueNAS Core 12.0U1 and a VM of a TrueCommand 1.3.2 server from a vmdk. This all worked without any issue. It is the same laptop, but at home I'm using wifi only, with a bridged adapter to connect to the hardware, while at work the adapter was the gigabit adapter in the HP docking station. In all cases I can ping all of the devices from each other, so I know there is good network connectivity. At work everything is connected via Cisco enterprise gigabit switches in the same subnet and VLAN.
 

1610084943070.png
 

Don Dayton

Cadet
Joined
Jan 7, 2021
Messages
9
I used timedatectl to set TrueCommand to the correct time zone. I didn't see this setting in the webUI. It had the correct time for the default time zone it was in, so I assumed it may be using NTP or something similar internally to initialize the clock.
 