Setting up a 3-node cluster via TrueCommand: setup never finishes after configuring AD

vtravalja

Dabbler
Joined
Apr 25, 2022
Messages
27
Hi.

I am aware that SCALE 22.12 and TC 2.0.2 are experimental, but I did not expect them to be this buggy.
I have created three SCALE nodes in VMware ESXi, and TrueCommand runs on Ubuntu LTS (in Docker).

As I understood it, the requirements were a 20GB installation disk for SCALE and 8x 5GB SCSI drives, and I gave each node 16GB RAM (these are only test boxes to see how the cluster works). TC, on the other hand, has 8 cores, 16GB RAM and 100GB storage.

All 3 SCALE nodes have 3 LAN interfaces: one is on the cluster network (10.16.100.x/24) and another is on our public IP network, where our AD is located. As I don't want to disclose our network, for simplicity let's say that it is 123.123.123.0/23. As a first step, I set up all 3 SCALE boxes to be identical in terms of performance and hardware, and pointed them at exactly the same NTP source. One thing I did not do was create a pool on them, as I was not sure whether that is needed beforehand or whether I should do it after joining them into a cluster and then create the shared SMB.

After setting up the SCALE bricks, I moved on to TC. Ubuntu is a minimal installation with 1 NIC, which is on the same public network (123.123.123.0/23). Again, I don't know whether TC needs access to the cluster network or only needs to communicate with SCALE over the public network. After adding all 3 systems to TC via API keys, I could see that all 3 appeared.

One thing I noticed is that when I open the details of each box to get more information about its resources, the SCALE information takes forever to load (almost as if the browser did not send all the requests to SCALE). This happens in every browser I tried (Firefox, Chrome, Edge).

I then moved on to creating the cluster. I added all 3 machines to a cluster and that part went OK: I got a list of all 3 bricks in the cluster, with the cluster reported as "healthy". The next step was to finish the setup by configuring the interfaces and AD before creating the SMB share.

I entered all the info (Brick1 - int192 - 123.123.123.1/23, Brick2 - int192 - 123.123.123.2/23, Brick3 - int192 - 123.123.123.3/23), added the AD admin account and password, and chose an SMB share name. I clicked Finish, and this is when the weird stuff started to show up.
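For anyone who wants to rule out an addressing mistake before blaming the wizard, here is a minimal Python sketch that checks whether the three public addresses fall into one network. The addresses are the placeholders from this post, not real ones, and `shared_network` is just a helper name made up for the illustration.

```python
import ipaddress

# Placeholder addresses from the post; the real public network is not disclosed.
bricks = {
    "Brick1": "123.123.123.1/23",
    "Brick2": "123.123.123.2/23",
    "Brick3": "123.123.123.3/23",
}

def shared_network(addresses):
    """Return the single network all interfaces belong to, or None if they differ."""
    networks = {ipaddress.ip_interface(a).network for a in addresses}
    return networks.pop() if len(networks) == 1 else None

net = shared_network(bricks.values())
print(net)  # 123.123.122.0/23
```

Note that with a /23 mask the network address for these placeholders is 123.123.122.0, so "123.123.123.0/23" is only shorthand; with the real addresses the same check shows whether all bricks and the AD server really sit in one subnet.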

- After 30 minutes the wizard still had not finished.
- In another browser, I saw that two bricks had dropped from the cluster with the message "authentication has failed". Only after applying the API keys again, twice, were they added back. Even then, the SMB wizard did not finish creating a connection.

Something is wrong, but I have no clue where to start troubleshooting. Can someone give me some direction, or has anyone encountered a similar issue?
The logs in TC are no help, as they do not reveal any problems. The same goes for SCALE. It is almost as if the SMB wizard did not do anything...

Any clue how to troubleshoot this properly, and what additional info I could provide to you or to the developers so they can check it out?
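One possible starting point is to query each SCALE node directly from the TC host, bypassing TrueCommand. The sketch below is only an assumption of how that could look: it assumes the SCALE REST API answers at `/api/v2.0/system/info` and accepts the API key as a Bearer token, and the names `classify` and `probe` are invented for this example.

```python
import ssl
import urllib.error
import urllib.request

def classify(status):
    """Map an HTTP status (or None for no response) to a short diagnosis."""
    if status is None:
        return "unreachable"
    if status == 200:
        return "ok"
    if status in (401, 403):
        return "auth rejected"
    return f"unexpected HTTP {status}"

def probe(host, api_key, timeout=10):
    """Request system info from one SCALE node and classify the result."""
    # Assumed endpoint and auth scheme; adjust if your SCALE version differs.
    url = f"https://{host}/api/v2.0/system/info"
    request = urllib.request.Request(
        url, headers={"Authorization": f"Bearer {api_key}"}
    )
    # Test boxes usually have self-signed certificates, so verification is off here.
    context = ssl.create_default_context()
    context.check_hostname = False
    context.verify_mode = ssl.CERT_NONE
    try:
        with urllib.request.urlopen(request, timeout=timeout, context=context) as response:
            return classify(response.status)
    except urllib.error.HTTPError as error:
        # The server answered, but with an error status such as 401.
        return classify(error.code)
    except (urllib.error.URLError, OSError):
        # DNS failure, refused connection, timeout and similar.
        return classify(None)
```

Calling `probe("123.123.123.1", "<key for Brick1>")` from the Ubuntu host for each brick would show whether a node is unreachable or reachable but rejecting the key, which are two quite different problems.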

Looking forward to your reply.
Have a lovely weekend.
 

vtravalja

This is where it hangs: "Registering public addresses". The funny part is that I created a pool locally on every brick and joined Active Directory without issues, but in TC it hangs at this step.

Screenshot 2023-01-18 081247.jpg
 

vtravalja

What I also noticed, when connecting to TC in a new browser while adding AD to enable the SMB share, is that two out of 3 bricks suddenly drop and report "invalid auth credentials". It is almost as if the API keys get deleted. This looks like a bug in the system.
Screenshot-2 2023-01-18 081505.jpg


On top of that, when I check the UI of those controllers, I see this: it never comes up. It looks like it is stuck.

Screenshot-3 2023-01-18 083429.jpg
 

vtravalja

An update:

The API keys definitely get scrambled during the AD and SMB setup step for the cluster. For example, all bricks get disconnected. The funny thing is that when I paste the key for brick 2 (I had copied and saved all of them), it does not work. Then, for fun, I tried the key for brick 1 and it worked: it connected brick 2 (this is definitely a bug in the system). The same goes for brick 3: it works with the API key for brick 2, and also with the API key of brick 1. It is almost as if the wizard somehow scrambles the configuration.

Screenshot-4 2023-01-18 085403.jpg
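To document this kind of mix-up for a bug report, every saved key can be tried against every brick and the working pairs recorded. The sketch below is a hypothetical helper, not anything TrueCommand provides: `authenticates` is a callable you supply (for example a direct API check against the node), and the keys and results in the demo are invented purely to show the output shape.

```python
def match_keys(bricks, keys, authenticates):
    """Return {brick: [owners of the keys that authenticate against it]}."""
    return {
        brick: [owner for owner, key in keys.items() if authenticates(brick, key)]
        for brick in bricks
    }

def mismatches(matrix):
    """Keep only bricks whose accepted keys are not exactly their own key."""
    return {brick: owners for brick, owners in matrix.items() if owners != [brick]}

# Invented demo data: placeholder keys and a fake checker standing in for real API calls.
keys = {"Brick1": "KEY1", "Brick2": "KEY2", "Brick3": "KEY3"}
accepted = {("Brick1", "KEY1"), ("Brick2", "KEY1"), ("Brick3", "KEY3")}

matrix = match_keys(keys, keys, lambda brick, key: (brick, key) in accepted)
print(mismatches(matrix))  # {'Brick2': ['Brick1']}
```

An empty result means every brick accepts only its own key; anything else is a concrete table that can be attached to a ticket alongside the screenshots.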


The second problem is that the SMB and AD configuration step never ends, even after I reconnect the bricks to the cluster.

This TC is insanely buggy and definitely far from being usable.
 