Is Truenas stable and reliable?

nemesis1782

Contributor
Joined
Mar 2, 2021
Messages
105
I've been working towards setting up my first Truenas (Scale) setup. I've run Synologies and fully custom setups without many issues.

After the initial setup and taking it into "production" I'm starting to doubt if relying on Truenas is a good idea...

I've run into multiple issues out of the box. Many of the "features" do not work on a clean system and need fiddling to get them running. Now I'm not against fiddling; however, that is NOT the point of this system. It also shouldn't be the case for production-ready software.

Up to this point I've run into:
- A lack of proper error logging, context and information in alert messages
- Apps (Kubernetes) not working without removing and re-adding the dataset multiple times
- UPS connection not working properly. I have yet to resolve this, and the UI does not seem to provide any information about the UPS. From what I understand it's a permissions issue
- LAG - LACP is not working, still need to resolve this. I've set up tons of LAGs and never had ANY issues
-> I've resolved this. For some reason NIC 1 was listening for DHCP, NIC 3 got an IP address as this was the one that was connected.

Many of these are to be expected from a beta or alpha. However, I'm running a release version.

Some will probably argue it's free and I shouldn't complain; complaining, however, is not my intention. I need something stable, so I pose the question "Is it stable and reliable?".

As for the choice of SCALE over CORE: this is because the feature set fits my use cases better. I could get the same result with CORE, but it would take a lot of fiddling. The more I fiddle, the more chances of reliability issues.

With kind regards and hoping for some useful replies,

Davy Vaessen

Updates
 

Davvo

MVP
Joined
Jul 12, 2022
Messages
3,222

SCALE is less mature than CORE; in both cases you need to learn how to use it, it has never been a works-out-of-the-box solution.

If you don't give us more information regarding your issues we can't help you. Start by sharing your version and your hardware please.
 

nemesis1782


SCALE is less mature than CORE; in both cases you need to learn how to use it, it has never been a works-out-of-the-box solution.

If you don't give us more information regarding your issues we can't help you. Start by sharing your version and your hardware please.
A few things:
- The question is not to get help with the issues, it is whether Truenas is stable and reliable
- The issues are examples explaining why I post this question; for assistance I would usually create separate threads. I will add the info to this thread though, since you requested it.
- If there is an option to add a LAG in the UI, for example, and someone with 20+ years of networking experience runs into issues with something as simple as this, it has little to do with "understanding the system or learning how to use it". Same goes for a UPS service not working.

What I'm trying to determine is whether Truenas will be viable long term.

Updates: Textual, rewrote point 2 according to new reality
 

nemesis1782

A bit of information as requested by Davvo:

[Server]
Chassis: Dell R730XD
CPU: 2x Intel(R) Xeon(R) CPU E5-2690 v3 @ 2.60GHz
Memory: 125.8GiB
Storage: a combination of M2 SSDs, SATA SSDs and SATA HDDs
NIC: 1x idrac 1GB, 2x daughter card 1GB, 2x daughter card 10GB (daughter card is Intel)
Connection 0, not available to Truenas: management on idrac NIC
Connection 1: 1x daughter card for setup (No VLAN)
Connection 2: LAG with a bunch of VLANS, not operational as of yet

[Switch]
Switch to be connected to: Netgear GS724T, will migrate to a XS712T-100NES on Monday

[TrueNAS SCALE]
TrueNAS-SCALE-22.12.3.3 (Up to post #16)
TrueNAS-SCALE-22.12.4 (From post #17)
 

nemesis1782

Update: This issue has been resolved and had nothing to do with the LAG. It was the result of another issue where the DHCP client on NIC 1 can influence NIC 3. A fresh install of TrueNAS Scale 22.12.3.3 has this "issue". Creating a LAG on NIC 1 and 2 removes the DHCP client on NIC 1, which removed the DHCP IP on NIC 3 as a result.

LAG issue information:
- Network MTU for LAG = 1500, being the default and what is set for the network
- Tested all connections with "Layer2", "Layer2+3", "Layer3+4"

- Tested with "bond0", "bond1" and "bond2" as name, since apparently "bond0" can cause issues
- Tested with LAG timeout slow and fast, of course configuration on switch always matches configuration in Truenas

Connection 2 try 1: 1x LAG (eno1 and eno2 10GB ports) with a number of tagged VLANS
Connection 2 try 2: 1x LAG (eno1 and eno2 10GB ports) without VLANS tagged in vlan1 which of course is limited to the switch
Connection 2 try 3: 1x LAG (eno1 and eno2 10GB ports) without VLANS tagged in vlan1 which of course is limited to the switch
Connection 2 try 4: 1x LAG (eno1 and eno2 10GB ports) without VLANS tagged in vlan10 the management VLAN, enabled and disabled DHCP in management VLAN, of course would usually not run a DHCP in management vlan

The weird thing is that TrueNAS is complaining about the partner not being 802.3ad compatible. The switch however is:
Oct 7 18:02:41 truenas kernel: bond2: Warning: No 802.3ad response from the link partner for any adapters in the bond
1696688191167.png

After which the interface comes up:
Oct 7 18:02:41 truenas kernel: bond2: active interface up!
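The kernel warning above can be cross-checked against the bond's status file. What follows is only a minimal sketch, assuming the usual Linux bonding layout where `/proc/net/bonding/<bond>` contains a `Partner Mac Address:` line under the aggregator info (this section may only be visible to root). The function name `lacp_partner_status` and the sample text are hypothetical and not taken from the system in this thread. An all-zero partner MAC generally means no LACPDUs were received from the switch.

```python
import re

def lacp_partner_status(bonding_text: str) -> dict:
    """Summarise LACP negotiation from the text of /proc/net/bonding/<bond>."""
    mode = re.search(r"^Bonding Mode:\s*(.+)$", bonding_text, re.MULTILINE)
    partner = re.search(
        r"^\s*Partner Mac Address:\s*([0-9a-fA-F:]{17})",
        bonding_text,
        re.MULTILINE,
    )
    partner_mac = partner.group(1).lower() if partner else None
    return {
        "mode": mode.group(1).strip() if mode else None,
        "partner_mac": partner_mac,
        # An all-zero MAC means the link partner never answered with LACPDUs.
        "partner_responding": partner_mac not in (None, "00:00:00:00:00:00"),
    }

# Hypothetical sample, modelled on the usual file layout (not real output).
SAMPLE_BONDING = """\
Bonding Mode: IEEE 802.3ad Dynamic link aggregation
MII Status: up

802.3ad info
LACP rate: slow
Active Aggregator Info:
    Aggregator ID: 1
    Number of ports: 1
    Partner Mac Address: 00:00:00:00:00:00
"""

if __name__ == "__main__":
    print(lacp_partner_status(SAMPLE_BONDING))
```

On a live system the text would come from something like `open("/proc/net/bonding/bond2").read()`, with the bond name adjusted to match.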

Switch config (for relevant LAG):
1696688418721.png

1696688455324.png

1696688488667.png

1696688514257.png


For some weird reason I see that eno4, my setup NIC, disconnects when creating the LAG and re-enables afterwards. One interface should not affect the other.
Oct 7 18:26:00 truenas kernel: igb 0000:07:00.1 eno4: igb: eno4 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX
Oct 7 18:26:00 truenas kernel: IPv6: ADDRCONF(NETDEV_CHANGE): eno4: link becomes ready
 

nemesis1782

I have tested disabling STP, Admin mode and Link trap. No joy though.
 

nemesis1782

As for the UPS:
1696689275251.png


1696689330875.png


The worst thing is that there is NO WAY to see in the UI if the UPS connection is actually operational or if it works! The only reason I noticed there might be an issue is because I went into the command line...
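For anyone wanting to check the UPS from the command line, here is a rough sketch. It assumes the UPS service is backed by Network UPS Tools (NUT), whose `upsc` client prints one `key: value` pair per line, and that `ups.status` contains `OL` while on line power. The helper names and the sample output are hypothetical, and the UPS identifier passed to `upsc` depends on how the service was configured.

```python
def parse_upsc(output: str) -> dict:
    """Parse `upsc <ups>@<host>` output ("key: value" per line) into a dict."""
    values = {}
    for line in output.splitlines():
        key, sep, value = line.partition(":")
        if sep:  # skip lines without a "key: value" shape
            values[key.strip()] = value.strip()
    return values

def ups_is_online(values: dict) -> bool:
    """True when the status flags include OL (on line power)."""
    return "OL" in values.get("ups.status", "").split()

# Hypothetical sample output, not captured from the system in this thread.
SAMPLE_UPSC = """\
battery.charge: 100
ups.load: 23
ups.status: OL
"""

if __name__ == "__main__":
    values = parse_upsc(SAMPLE_UPSC)
    print(values["ups.status"], ups_is_online(values))
```

On a live system the input could come from `subprocess.run(["upsc", "ups@localhost"], capture_output=True, text=True).stdout`, where `ups` is a placeholder for the configured identifier. If `upsc` cannot reach the driver at all, that already points to the connection or permissions problem rather than the UPS itself.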

Truenas even though free for consumers, is a commercial system and THIS IS NOT OK in a commercial system!
 

Heracles

Wizard
Joined
Feb 2, 2018
Messages
1,401
- The question is not to help with the issues, it is if Truenas is stable and reliable
It is as stable as you understand and control it. If you do not understand and control it, it will not be that stable...
As for the choice of SCALE over CORE. This is because the feature set fits my use cases better.
And this is already strike 1 :
Scale is not as stable as Core because it does not have as much history.

This is because the feature set fits my use cases better.
And this is strike 2 :
TrueNAS is a storage appliance first. To run extra services on it is possible but you already step outside of its main path here.

Truenas even though free for consumers, is a commercial system and THIS IS NOT OK in a commercial system!
And here is strike 3 :
If you wish commercial-grade support, just buy it and open a ticket with iXsystems. They will be pleased to support you once you have your contract.


So Yes, TrueNAS can be rock solid. It is up to the owner / operator to be sure it will run within its limits to remain that solid.
 

nemesis1782

Just ran into another weird design decision, took me some time to figure it out.

Apparently you need to assign builtin_admins to a user you want to be able to log in to the WebUI. Which is EXTREMELY bad, obviously you'd want to disable the root account, which means you need to create a user and mark it as builtin.
 

anodos

Sambassador
iXsystems
Joined
Mar 6, 2014
Messages
9,554
Just ran into another weird design decision, took me some time to figure it out.

Apparently you need to assign builtin_admins to a user you want to be able to log in to the WebUI. Which is EXTREMELY bad, obviously you'd want to disable the root account, which means you need to create a user and mark it as builtin.

You can choose to use "admin" instead of "root" in the installer. It even has "recommended" in parentheses IIRC. We're working on RBAC for DragonFish, but it's still early days.
 

nemesis1782

It is as stable as you understand and control it. If you do not understand and control it, it will not be that stable...
How does that differ from ANY other product? It is also not an answer to my question.

The question is rather straightforward: is it stable and reliable? Let's just assume I have the knowledge to get and keep just about any Linux system running. I've read up on ZFS and understand the implications. I have 30+ years in IT as software engineer, network engineer, software architect and network architect. I've designed and assisted in the implementation of large systems spanning multiple datacenters.

And this is already strike 1 :
Scale is not as stable as Core because it does not have as much history.
Sorry, but that is not a correct correlation.

And this is strike 2 :
TrueNAS is a storage appliance first. To run extra services on it is possible but you already step outside of its main path here.
The services I am trying to run are default services required for any storage system, namely networking and UPS. Furthermore, the TrueNAS site and knowledge base would disagree with you.

As with any system, an increase in complexity may impact reliability. That's a given.

And here is strike 3 :
If you wish commercial-grade support, just buy it and open a ticket with iXsystems. They will be pleased to support you once you have your contract.
Well, as someone who has a home lab and works with similar systems for work, I am setting this up to evaluate it for business use as well. TrueNAS is not getting a good start here on the latter, from a quality perspective as well as a community perspective ;) Feels a bit like reddit here...

So Yes, TrueNAS can be rock solid. It is up to the owner / operator to be sure it will run within its limits to remain that solid.
This is a strange conclusion, as nothing you previously said leads up to it or supports it.

I ask you whether it is a stable system and you tell me that the user determines that. Well duh, not my question though!

If you do not have anything to add, then please don't post. This is neither good for the community nor good for TrueNAS! Reactions like these will only discourage new users from using the system. As TrueNAS is a commercial company, do you think fewer customers is what they want?
 

nemesis1782

You can choose to use "admin" instead of "root" in the installer. It even has "recommended" in parentheses IIRC. We're working on RBAC for DragonFish, but it's still early days.
I have admin indeed and disabled it for a less guessable user name once I figured out the quirks. Thnx.

I missed that text though, was surprised RBAC was missing seeing Apps (kubernetes) is a main selling point of SCALE.
 

anodos

I have admin indeed and disabled it for a less guessable user name once I figured out the quirks. Thnx.

I missed that text though, was surprised RBAC was missing seeing Apps (kubernetes) is a main selling point of SCALE.
RBAC is implemented at the middleware level and is nontrivial to do safely. It's best to have people understand that people using the webui have god-like powers and such access should be given out with very careful consideration.

Generally, considering the sorts of capabilities some apps are designed to grab (for instance full access to procfs), such RBAC might be a fig leaf if you grant some untrusted party admin rights in the app. It always behooves the admin to understand the technologies he or she is administering.
 

nemesis1782

RBAC is implemented at the middleware level and is nontrivial to do safely. It's best to have people understand that people using the webui have god-like powers and such access should be given out with very careful consideration.
I'm not arguing that. Still, this is an Enterprise-level system and BY DEFINITION you want multiple people to be educated and have access with DIFFERENT accounts.

For my use case I just want a non-default user for security, with a backup account in case I lock myself out. Mistakes happen and you'd better be prepared is my philosophy.

Generally, considering the sorts of capabilities some apps are designed to grab (for instance full access to procfs), such RBAC might be a fig leaf if you grant some untrusted party admin rights in the app. It always behooves the admin to understand the technologies he or she is administering.
Indeed, that is one of the issues with containerization. If this is a concern I'd recommend the use of a VM, since this does not share the same kernel space.

For my use case it's honestly not such an issue though. For any future business use cases that might be a different story of course. This however is an inherent risk of the technology and not the product itself.
 

Davvo

CORE is incredibly stable and reliable.
SCALE not yet as it's still in constant development (iX doesn't offer Enterprise contracts with it yet iirc), but I'm no SCALE user.

That being said, you must understand how to work with the appliance: having years of experience in the field and knowing how Linux works is not enough.

I haven't seen you posting about how to solve the issues you listed, so I'm getting a strange vibe about this thread (you started questioning right away the validity of the product instead of asking how to make things work).

You seem to understand very little about the product, the company, and its strategy, but your perceived tone is quite quarrelsome.

Also, I don't seem able to find which version of SCALE you are running.

EDIT: regardless, if you tell us what your requirements are there is a good chance we can tell you whether they are something compatible with SCALE or not; this is in order to help you, using your words, determine if Truenas will be viable long term.
 

nemesis1782

CORE is incredibly stable and reliable.
SCALE not yet as it's still in constant development (iX doesn't offer Enterprise contracts with it yet iirc), but I'm no SCALE user.
Yes, and it's said a lot. For which complex system is this not the case? It's an irrelevant comment which serves NO benefit.

CORE is incredibly stable and reliable.
SCALE not yet as it's still in constant development (iX doesn't offer Enterprise contracts with it yet iirc), but I'm no SCALE user.
Thank you, that actually answers my question somewhat. I was under the impression they were.

You seem to understand very little about the product, the company, and its strategy, but your perceived tone is quite quarrelsome.
Sorry if I sound quarrelsome. This is not my goal. I must admit that I find it irksome that most people nowadays explain away companies putting systems on the market that are FAR from ready to perform their designed function!

Your assumption about my knowledge of the company and its products I'll leave for what it is, since answering it in any useful way would result in a fair piece of text. I also question the relevance tbh.

I haven't seen you posting about how to solve the issues you listed, so I'm getting a strange vibe about this thread (you started questioning right away the validity of the product instead of asking how to make things work).
I asked the question that is important for my use case, which is: is it stable and reliable? Given the example of three core services which are fairly straightforward not working out of the box, my first concern is stability and reliability.

There are two comments in this thread on the specifics, where I added details at your request. But honestly, with what I've seen up till now, I'm deliberating whether TrueNAS is indeed the system for me. If it's not stable and reliable then, well, what's the point.

EDIT: regardless, if you tell us what your requirements are there is a good chance we can tell you whether it's something compatible with SCALE or not; this is in order to help you, using your words, determine if Truenas will be viable long term.
The requirement is in the question. Is Scale stable and Reliable. To which you just now provided an on topic answer.

Also, I don't seem able to find which version of SCALE you are running.
That's easily explained. I forgot... I'll add it now to the system comment.

I'll also verify there are no updates. Still my main concern at this point is the innate stability and reliability.
 

nemesis1782

As there is an updated version on the release train I'll upgrade to TrueNAS-SCALE-22.12.4 and retest.
 

nemesis1782

I've been able to set up a LAG. The issue was that I had the setup connection on NIC 3 and was using DHCP to make the setup easier.

NIC 3 would receive an IP; however, the DHCP client was running on NIC 1. Once NIC 1 became part of the LAG, its DHCP client was disabled, which removed the IP from NIC 3. With no network connection left, the system would of course revert the change.
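A mix-up like this might be easier to spot by listing which interfaces currently hold a DHCP-assigned address. The sketch below assumes iproute2's JSON output (`ip -j addr show`), where leased addresses normally carry a `"dynamic": true` flag. The function name, interface names and address in the sample are hypothetical. Note that it only shows where a lease landed, not which interface the DHCP client was configured on, so the result still has to be compared with the network settings in the UI.

```python
import json

def dynamic_addresses(ip_json: str) -> dict:
    """Map interface name -> IPv4 addresses flagged as dynamic (e.g. DHCP leases)."""
    result = {}
    for iface in json.loads(ip_json):
        leases = [
            f'{addr["local"]}/{addr["prefixlen"]}'
            for addr in iface.get("addr_info", [])
            if addr.get("family") == "inet" and addr.get("dynamic")
        ]
        if leases:
            result[iface["ifname"]] = leases
    return result

# Hypothetical sample shaped like `ip -j addr show` output (not real data).
SAMPLE_IP_JSON = json.dumps([
    {"ifname": "lo", "addr_info": [
        {"family": "inet", "local": "127.0.0.1", "prefixlen": 8},
    ]},
    {"ifname": "eno3", "addr_info": []},
    {"ifname": "eno4", "addr_info": [
        {"family": "inet", "local": "192.168.1.50", "prefixlen": 24, "dynamic": True},
    ]},
])

if __name__ == "__main__":
    print(dynamic_addresses(SAMPLE_IP_JSON))
```

On a live system the JSON could be obtained with `subprocess.run(["ip", "-j", "addr", "show"], capture_output=True, text=True).stdout`.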

So I'd say kudos to TrueNAS for not locking the user out with the only option being idrac or physical access. I do see some areas for improvement though.

First off, allowing a DHCP client for one NIC to act as DHCP client for another is a bad idea! I assume this is a bug. Is there a way for me to create an issue for that with reproduction steps?
Second, the error(s)\log(s) the system gives are at best confusing, since the issue had nothing to do with the LAG config. Is there a way to provide specific feedback to iXsystems?
 

Davvo

First off, allowing a DHCP client for one NIC to act as DHCP client for another is a bad idea! I assume this is a bug. Is there a way for me to create an issue for that with reproduction steps?
Second, the error(s)\log(s) the system gives are at best confusing. Is there a way to provide specific feedback to iXsystems?
The Report a Bug button.

iirc there should also be one directly in the WebUI.
 
