Lost connection to Active Directory and cannot re-establish

NugentS

MVP
Joined
Apr 16, 2020
Messages
2,947
As per the title. I noticed that my TrueNAS had lost access to AD. I tried to re-establish the connection - but it seems to be failing with no error message.

I have 3 DCs: two primary (on ESXi, Server 2016) and one other (Server 2019) that is normally turned off and only gets turned on infrequently.

Time: Dashboard shows the same time as the DCs
DNS: Resolves correctly. The DCs respond, as does the domain
Active Directory seems to work outside the NAS. Other systems can log on just fine.

On the NAS:
Active Directory and LDAP are disabled
Attempting to enable AD produces a brief blue line at the top of the window and nothing happens except that Save is no longer clickable.

Is there a way of resetting the AD integration back to nothing, so I can start again completely (with the AD menu)? Something seems to have gone badly wrong. I have done some flailing around inside AD, including removing the computer object in an attempt to reset, but I think the NAS is still trying to use the same object. Rebooting does not help.

Or any other ideas?

The error seems!! to have occurred last night after I went to bed [I got an alert at 00:36]. I was playing with S3 and WebDAV, but not with anything else. I have turned WebDAV and S3 off for the moment
Alert was:
  • Attempt to connect to netlogon share failed with error: [ENOENT] (7, 'WBC_ERR_DOMAIN_NOT_FOUND: wbcPingDc2 failed', '../../nsswitch/py_wbclient.c:551').
 

anodos

Sambassador
iXsystems
Joined
Mar 6, 2014
Messages
9,554
This looks DNS-related. What is the output of `midclt call activedirectory.domain_info SENDARIAN.CO.UK`? Are you using the domain controllers as nameservers? Are there any non-AD nameservers?
 

anodos

Sambassador
iXsystems
Joined
Mar 6, 2014
Messages
9,554
NugentS said: "I have 3 DC's. Two primary (on ESXi - Server 2016) and one other thats turned off, gets turned on infrequently (Server 2019)"
I believe this is a more-or-less broken AD design. Your DNS will have SRV records pointing to a target that is almost always unavailable, and DCs will probably report replication errors.
 

NugentS

MVP
Joined
Apr 16, 2020
Messages
2,947
midclt call activedirectory.domain_info SENDARIAN.CO.UK

{"LDAP server": "192.168.38.10", "LDAP server name": "ZeusDC.sendarian.co.uk", "Realm": "SENDARIAN.CO.UK", "Bind Path": "dc=SENDARIAN,dc=CO,dc=UK", "LDAP port": 389, "Server time": 1673278239, "KDC server": "192.168.38.10", "Server time offset": 0, "Last machine account password change": 1654453545}
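The "Server time" and "Last machine account password change" fields in that output are Unix epoch seconds, which are hard to read at a glance. A minimal Python sketch that converts them, using the JSON pasted above; the `summarize` helper and its key names are made up for illustration and are not part of TrueNAS:

```python
import json
from datetime import datetime, timezone

# Output of `midclt call activedirectory.domain_info SENDARIAN.CO.UK`, as posted above.
raw = (
    '{"LDAP server": "192.168.38.10", "LDAP server name": "ZeusDC.sendarian.co.uk", '
    '"Realm": "SENDARIAN.CO.UK", "Bind Path": "dc=SENDARIAN,dc=CO,dc=UK", '
    '"LDAP port": 389, "Server time": 1673278239, "KDC server": "192.168.38.10", '
    '"Server time offset": 0, "Last machine account password change": 1654453545}'
)


def summarize(info: dict) -> dict:
    """Convert the epoch fields to UTC timestamps (hypothetical helper)."""
    server_time = datetime.fromtimestamp(info["Server time"], tz=timezone.utc)
    last_change = datetime.fromtimestamp(
        info["Last machine account password change"], tz=timezone.utc
    )
    return {
        "server_time_utc": server_time.isoformat(),
        "last_password_change_utc": last_change.isoformat(),
        # Whole days between the last password change and the DC's clock.
        "password_age_days": (server_time - last_change).days,
        "offset_seconds": info["Server time offset"],
    }


summary = summarize(json.loads(raw))
for key, value in summary.items():
    print(f"{key}: {value}")
```

For this output the machine account password was last changed about 217 days before the query. Whether that has anything to do with the failure is not established in this thread; it is just easier to see once converted.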

DNS servers are AD servers, and vice versa. There are no non-AD nameservers. The temp server is currently on. It's turned on every week or so.

The DCs are ZeusDC, HeraDC and AthenaDC (the temp one). The DNS order in the network settings is Zeus, Hera, Athena.

Active Directory seems to be working. Group Policy is applied. I even tested with a new drive map and that is working just fine (but obviously not to the NAS).
 

NugentS

MVP
Joined
Apr 16, 2020
Messages
2,947
@anodos Any thoughts?
When I try to add the NAS to the domain with the advanced section open, I get:
Failed to validate bind credentials: Client 'TRUENAS$@SENDARIAN.CO.UK' not found in Kerberos database while getting initial credentials

 

NugentS

MVP
Joined
Apr 16, 2020
Messages
2,947
Update: I have built a new test Scale box on my test server. This can join AD successfully. I have also removed the spare DC (it was serving a purpose I no longer need). Also, a variety of DC tests all come back without issue.

The original server still has the same issue.
It seems to be stuck in a half-in / half-out state from which I can neither retreat nor advance.

Is there a way of completely resetting the AD configuration so that I can start again from scratch with the domain name, and then get to enter the admin username and password? I think that will solve the issue.
 

NugentS

MVP
Joined
Apr 16, 2020
Messages
2,947
Well that was easy.
I was trying to reset the connection details, but was doing it the wrong way. Once given a clue, it was really simple.
 

NugentS

MVP
Joined
Apr 16, 2020
Messages
2,947
It was what I would describe as an ID10T error.
or
Problem between keyboard and chair

I was failing to operate the GUI properly.
 

EndreEndi

Cadet
Joined
Feb 27, 2023
Messages
7
Hi, how did you solve this? I have the same issue.
The notifications tab says two things:

1. Attempt to connect to netlogon share failed with error: [ENOENT] (7, 'WBC_ERR_DOMAIN_NOT_FOUND: wbcPingDc2 failed', '../../nsswitch/py_wbclient.c:551').

2. Domain validation failed with error: [EFAULT] Failed to retrieve machine account status: kerberos_kinit_password STORAGE$@ENDREENDI.LOCAL failed: Preauthentication failed


The output of `midclt call activedirectory.domain_info endreendi.local` is:
{"LDAP server": "10.10.1.212", "LDAP server name": "ee.endreendi.local", "Realm": "ENDREENDI.LOCAL", "Bind Path": "dc=ENDREENDI,dc=LOCAL", "LDAP port": 389, "Server time": 1684148075, "KDC server": "10.10.1.212", "Server time offset": -123, "Last machine account password change": 1684061760}
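One field worth a look in that output is "Server time offset": -123. Kerberos rejects authentication when the clocks differ by more than the allowed skew, which defaults to 5 minutes in most deployments. A rough Python sketch of that comparison follows; it assumes the offset is reported in seconds and that the domain uses the default tolerance. The `skew_status` helper and the one-third warning threshold are arbitrary choices for illustration, not anything TrueNAS does:

```python
# Common Kerberos default for maximum clock skew (5 minutes).
# Assumption: this domain has not changed it.
MAX_SKEW_SECONDS = 300


def skew_status(info: dict, max_skew: int = MAX_SKEW_SECONDS) -> str:
    """Classify the reported clock offset against the Kerberos tolerance.

    Assumes "Server time offset" is in seconds (hypothetical helper).
    """
    offset = abs(info["Server time offset"])
    if offset >= max_skew:
        return "exceeds tolerance"
    # Arbitrary early-warning threshold: one third of the allowed skew.
    if offset >= max_skew // 3:
        return "within tolerance but drifting"
    return "ok"


# Only the relevant field from the output posted above.
print(skew_status({"Server time offset": -123}))
```

A 123 second offset is inside the default tolerance, so on its own it should not cause the "Preauthentication failed" error, but it suggests the NAS and the DC are not syncing time from the same source.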



I don't really know what to do; I'm a newbie :(
 

NugentS

MVP
Joined
Apr 16, 2020
Messages
2,947
I was simply failing to operate the GUI properly. Once I was educated that I was an idiot - it worked properly

Go back to basics. Remove anything related to AD, then check:
1. Basic network connectivity. Make sure that DNS is using the AD servers
2. Time setting: is the server time correct?

Then try again
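The connectivity and DNS checks above could be sketched in Python roughly as follows. This is only an illustration: the helper names are made up, the port list covers the usual DNS, Kerberos, LDAP and SMB ports rather than everything AD can use, and the SRV names are the standard records AD clients look up to locate a DC.

```python
import socket

# Standard TCP ports a domain member usually needs to reach on a DC.
AD_PORTS = {"dns": 53, "kerberos": 88, "ldap": 389, "smb": 445}


def srv_names(domain: str) -> list[str]:
    """SRV records AD clients query to find DCs and KDCs for a domain."""
    d = domain.strip(".").lower()
    return [
        f"_ldap._tcp.dc._msdcs.{d}",
        f"_ldap._tcp.{d}",
        f"_kerberos._tcp.{d}",
    ]


def port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def check_dc(host: str) -> dict[str, bool]:
    """Probe each AD-related TCP port on one domain controller."""
    return {name: port_open(host, port) for name, port in AD_PORTS.items()}
```

For example, `check_dc("192.168.38.10")` would probe the first DC from earlier in this thread, and each name returned by `srv_names("SENDARIAN.CO.UK")` should resolve against the nameservers configured on the NAS. If any of them fail to resolve, the NAS is probably not using the AD DNS servers.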
 

EndreEndi

Cadet
Joined
Feb 27, 2023
Messages
7
I kind of resolved my issue. I did a fresh install of Scale 22.02.3, because AD worked for me on that version; after I upgraded to 22.12.x I had issues with it. I'm upgrading again to see if they're fixed now, because I'm afraid of importing the 22.12 pool into an older version. I don't want to break it and lose anything...
 

EndreEndi

Cadet
Joined
Feb 27, 2023
Messages
7
It's solved. My issue had something to do with my installation of TrueNAS Scale 22.12.x. Exporting the storage, deleting the install, installing 22.02.4 and joining the domain on that version, then upgrading back to 22.12.x and importing the storage seems to have solved it.
AD is on green and everything works now.
Fingers crossed it remains that way :)
 

ansh

Cadet
Joined
Jun 23, 2023
Messages
5
Hi, how did you solve this? I have the same issue.
WARNING
Attempt to connect to netlogon share failed with error:
[ENOENT] (7, 'WBC_ERR_DOMAIN_NOT_FOUND: wbcPingDc2 failed', '…/…/nsswitch/py_wbclient.c:552').
Joining a domain
_kerberos._tcp.DOMAIN.LAN: Nameserver 192.168.10.151 failed to resolve SRV record for domain DOMAIN.LAN : The DNS operation
 