[TrueNAS 12.0-U2] - Impossible to join AD domain?

anodos

Sambassador
iXsystems
Joined
Mar 6, 2014
Messages
9,543
Had the exact same issue, attempted TheUsD's process and didn't work for me with the same error.

I loaded 12.0-u1 and was able to join then updated to 12.0-u2 and all the reports are healthy but I am trying to now diagnose an issue where I can't assign AD users and groups into the ACLs. I didn't test on u1 (d'oh) before jumping ahead.
Some quick checks you can do to see where things are breaking down are as follows:
1. `wbinfo -t` - verify trust relationship with DC
if this succeeds:
2. `wbinfo -u` - verify that winbindd can enumerate users in AD
if this succeeds:
3. `getent passwd` and `getent group` - verify that your AD users and groups have proper passwd and group entries

If (3) fails, but (1)-(2) succeed, then you probably have an issue with idmap configuration or nsswitch configuration. /etc/nsswitch.conf should have `winbind` present in it. If it doesn't, then run the command `midclt call etc.generate nss`.

If (1) or (2) fail, then either winbindd is not running, samba is misconfigured, our time offset from KDC is too great, or our trust password is bad.
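
The checks above can be strung together in a small shell sketch. It assumes `wbinfo` and `getent` are on the PATH. The helper names `run_check` and `ad_quick_checks` are invented for this example, and step (3) rests on the assumption that the first user listed by `wbinfo -u` should also resolve through `getent passwd`.

```shell
#!/bin/sh
# Sketch only: wbinfo and getent are the commands named in the post.
# run_check and ad_quick_checks are hypothetical helper names.

# run_check LABEL COMMAND [ARGS...]
# Runs the command quietly and prints PASS or FAIL with the label.
run_check() {
    label=$1
    shift
    if "$@" >/dev/null 2>&1; then
        printf '%s\n' "PASS: $label"
    else
        printf '%s\n' "FAIL: $label"
        return 1
    fi
}

# Runs checks (1)-(3) in order and stops at the first failure.
ad_quick_checks() {
    if ! run_check "(1) wbinfo -t" wbinfo -t; then
        printf '%s\n' "Look at winbindd, the samba config, clock offset from the KDC, and the trust password."
        return 1
    fi
    if ! run_check "(2) wbinfo -u" wbinfo -u; then
        printf '%s\n' "Look at winbindd, the samba config, clock offset from the KDC, and the trust password."
        return 1
    fi
    # Assumption: a user that winbindd can enumerate should also resolve via NSS.
    first_user=$(wbinfo -u 2>/dev/null | head -n 1)
    if ! run_check "(3) getent passwd $first_user" getent passwd "$first_user"; then
        printf '%s\n' "Look at idmap and /etc/nsswitch.conf (it should mention winbind)."
        return 1
    fi
    printf '%s\n' "All three checks passed."
}
```

After loading the functions, call `ad_quick_checks` from a root shell. It stops at the first failing step and prints the matching hint.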

If (1)-(3) succeed, but you're getting errors setting an ACL with a user/group visible in `getent passwd` or `getent group` output, then there's a chance that the webui is passing invalid values to the setacl API. You can run the command `midclt call core.get_jobs | jq` to view the parameters being passed to the filesystem.setacl API.
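
As a rough sketch of narrowing that output down, the function below asks `core.get_jobs` only for `filesystem.setacl` jobs and prints a few fields of each. It assumes that `core.get_jobs` accepts a query filter as its argument and that each job entry carries `id`, `state`, `arguments` and `error` keys. `list_setacl_jobs` is a made-up name.

```shell
#!/bin/sh
# Sketch only: list_setacl_jobs is a hypothetical helper name.
# Assumption: core.get_jobs takes a query filter and each job has the keys below.
list_setacl_jobs() {
    midclt call core.get_jobs '[["method", "=", "filesystem.setacl"]]' |
        jq '.[] | {id, state, arguments, error}'
}
```

The `arguments` field is where the values sent by the webui should show up, so that is the part to compare against the `getent` output.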
 

Emile.Belcourt

Dabbler
Joined
Feb 16, 2021
Messages
23
I had run all 3 commands and they all seemed to come back OK, but then I made the mistake of unbinding AD and trying to rejoin, which dumped me back to the primary issue of this thread. I'm going to re-apply all the settings manually in U1 and test Samba there. If it works, great; I will take a config snapshot, then update to U2 and see if the issue returns.

I have now wiped my U2 installation and gone back to U1, and I will use the tests you've put up to double- and triple-check. If the issue with Samba permissions returns, I'll raise another thread so as not to hijack this one, which definitely has a bigger problem.
 

anodos

Sambassador
iXsystems
Joined
Mar 6, 2014
Messages
9,543
I haven't been able to reproduce the AD join issue being reported here. If one of you who is affected can send me a PM, we can arrange a TeamViewer session where I can see and investigate the problem in your environment. This will help diagnose the issue and get a fix to you faster.
 

Emile.Belcourt

Dabbler
Joined
Feb 16, 2021
Messages
23
Am a bit averse to reinstalling the server yet again, but if no one else proffers, I may be able to commit to doing this around 4.30pm GMT or tomorrow morning.
 

TheUsD

Contributor
Joined
May 17, 2013
Messages
116
anodos said: "I haven't been able to reproduce the AD join issue being reported here."

Which OS are you trying to reproduce from? Not questioning you, but looking at everyone's issues, so far I've only seen older Windows Server versions, which makes me wonder whether older forest functional levels are the root culprit. No evidence to back this up; just going with gut feeling.
 

Emile.Belcourt

Dabbler
Joined
Feb 16, 2021
Messages
23
Our forest functional level is 2012 R2, forgot to note.
 
Joined
Jan 21, 2021
Messages
4
In my case it is a very old SBS 2003, but it was working OK before the update to TrueNAS 12. By the way, the LDAP service is running OK.
 

Valorem

Cadet
Joined
Mar 29, 2017
Messages
2
In my case both forest and domain functional levels are at Server 2016.

For now I gave up and re-installed 12.0-U1.1, as I urgently need to move a few TB out of an old file server.

 

anodos

Sambassador
iXsystems
Joined
Mar 6, 2014
Messages
9,543
Emile.Belcourt said: "Am a bit averse to reinstalling the server yet again, but if no one else proffers, I may be able to commit to doing this around 4.30pm GMT or tomorrow morning."
Update regarding this user's problem: the issue was caused by missing idmap configuration due to AD and LDAP being enabled simultaneously. U3 will prevent users from enabling both at the same time.
 

anodos

Sambassador
iXsystems
Joined
Mar 6, 2014
Messages
9,543
SBS 2003 / Windows 2003 lack support for SMB2 protocol. In this situation, you need to lower the client minimum protocol to SMB1 via the following auxiliary parameter under services->SMB:

client min protocol = NT1
 
Joined
Feb 16, 2021
Messages
1
I am having this same issue. @anodos, you said that you would like to take a look at this issue over TeamViewer, could I help you with that? I am absolutely sure that I have not enabled both LDAP and Active Directory. This is a brand new install, can't join AD no matter what I do.
 
Joined
Jan 21, 2021
Messages
4
anodos said: "SBS 2003 / Windows 2003 lack support for SMB2 protocol. In this situation, you need to lower the client minimum protocol to SMB1 via the following auxiliary parameter under services->SMB: client min protocol = NT1"
Well done, excellent job. Running like a charm.
 

anodos

Sambassador
iXsystems
Joined
Mar 6, 2014
Messages
9,543
Okay. A community member graciously offered access to an affected server. The cause was a timing issue in one of the domain health validation checks, which we adjusted between U1 and U2 because on busy systems it was disrupting the winbindd netlogon connection in some rare edge cases. The fix is pretty simple: switch from our general domain health check to a specific one for excessive clock offset, which is of particular concern during AD setup. You can probably work around it by configuring everything for the AD service but _not_ checking Enable, then running the command `midclt call activedirectory.start` from the shell. This command returns a job_id since it's a backgrounded task, and you should be able to view the status of the AD start job in the task manager in the GUI. The fix (at least for the affected user) is here: https://github.com/truenas/middleware/pull/6453
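
For anyone who prefers to follow that job from the shell rather than the task manager, here is a minimal sketch. It assumes `midclt call activedirectory.start` prints the numeric job id, that `core.get_jobs` accepts a query filter, and that job entries have `id`, `state` and `error` keys. `job_filter` and `start_ad_and_show_job` are hypothetical helper names.

```shell
#!/bin/sh
# Sketch only: job_filter and start_ad_and_show_job are hypothetical helpers.

# job_filter ID: print a query filter that selects a single job by its id.
job_filter() {
    printf '[["id", "=", %d]]\n' "$1"
}

# Start the AD service (configured but not enabled in the form), then look up
# the job that the start call returned.
start_ad_and_show_job() {
    job_id=$(midclt call activedirectory.start) || return 1
    printf 'AD start job id: %s\n' "$job_id"
    # Assumption: each job entry exposes id, state and error.
    midclt call core.get_jobs "$(job_filter "$job_id")" | jq '.[0] | {id, state, error}'
}
```

Since the start runs in the background, the lookup may need to be repeated until the job leaves its running state.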
 

Elo

Contributor
Joined
Mar 11, 2012
Messages
122
I have two servers, both part of a domain and both running 12.0-U2. I had a problem joining the domain, and it was clearly flagged as a time sync problem (more than 3 minutes' difference). I never had problems before in the last two years, though. I have a "primary" ADC (Windows Server 2016) and a secondary on Windows Server 2019.

I did have to take both servers out of the domain and rejoin, then fixed the time sync (according to the domain hierarchy), using the ADC as the time server. Both TrueNAS servers now join the domain.
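
A rough way to look at the clock offset before attempting a join is sketched below. It assumes `ntpdate` is installed and that `ntpdate -q` prints lines of the form `server ..., stratum ..., offset ..., delay ...`. The function names are made up, `dc1.example.com` is a placeholder host, and the default limit of 180 seconds only mirrors the three minutes mentioned above; the threshold TrueNAS actually enforces is not stated in this thread.

```shell
#!/bin/sh
# Sketch only: all function names here are hypothetical.

# Read `ntpdate -q` output on stdin and print the first reported offset.
parse_ntpdate_offset() {
    sed -n 's/.*offset \([-+0-9.]*\),.*/\1/p' | head -n 1
}

# offset_within_limit OFFSET_SECONDS LIMIT_SECONDS
# Succeeds when the absolute offset is at most the limit.
offset_within_limit() {
    awk -v off="$1" -v lim="$2" 'BEGIN { if (off < 0) off = -off; exit !(off <= lim) }'
}

# check_clock_offset HOST [LIMIT_SECONDS]
check_clock_offset() {
    offset=$(ntpdate -q "$1" 2>/dev/null | parse_ntpdate_offset)
    if [ -z "$offset" ]; then
        printf '%s\n' "could not read an offset from $1"
        return 2
    fi
    if offset_within_limit "$offset" "${2:-180}"; then
        printf '%s\n' "offset ${offset}s is within the limit"
    else
        printf '%s\n' "offset ${offset}s exceeds the limit"
        return 1
    fi
}
```

Example call: `check_clock_offset dc1.example.com 180`, with the host replaced by the domain controller used as the time server.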
 

Samuel Tai

Never underestimate your own stupidity
Moderator
Joined
Apr 24, 2020
Messages
5,398
NTP sync instability is common enough that this thread may be helpful.

 

Elo

Contributor
Joined
Mar 11, 2012
Messages
122
The issue with AD is unfortunately not over for me. I also found an error concerning the FSMO roles on our main domain server (AD +++). The strange thing is that the server holds all the roles mentioned...

Critical Alert: FSMOCompliance is raised at OB-MSSERV-2016. <Title> License Error: FSMO Role Check. <Description> The FSMO Role Check detected a condition in your environment that is out of compliance with the licensing policy. The Management Server must hold the primary domain controller and domain naming master Active Directory roles. Please move the Active Directory roles to the Management Server now.. <AdditionalInfo> .

I have done so much back and forth that I tried to completely remove the AD connection from the TrueNAS server. I was not successful. When I try to reconnect I get the following errors:

Error: Traceback (most recent call last):
  File "/usr/local/lib/python3.8/site-packages/middlewared/plugins/activedirectory.py", line 769, in validate_credentials
    self.middleware.call_sync('kerberos.do_kinit', data)
  File "/usr/local/lib/python3.8/site-packages/middlewared/main.py", line 1254, in call_sync
    return self.run_coroutine(methodobj(*prepared_call.args))
  File "/usr/local/lib/python3.8/site-packages/middlewared/main.py", line 1294, in run_coroutine
    return fut.result()
  File "/usr/local/lib/python3.8/concurrent/futures/_base.py", line 432, in result
    return self.__get_result()
  File "/usr/local/lib/python3.8/concurrent/futures/_base.py", line 388, in __get_result
    raise self._exception
  File "/usr/local/lib/python3.8/site-packages/middlewared/plugins/kerberos.py", line 269, in do_kinit
    raise CallError(f"kinit for domain [{data['domainname']}] "
middlewared.service_exception.CallError: [EFAULT] kinit for domain [xxx.yyy.COM] with principal [OB-NAS-MAIN$@xxx.yyy.COM] failed: kinit: krb5_get_init_creds: Already tried ENC-TS-info, looping


HOW CAN I COMPLETELY REMOVE ALL TRACES OF THE FORMER AD CONNECTION in TrueNAS? As part of this I will also rebuild all my shares and check and update permissions on all the datasets involved.

Running: TrueNAS-12.0-U2.1
 

anodos

Sambassador
iXsystems
Joined
Mar 6, 2014
Messages
9,543
There's a leave button for AD in the advanced options of the AD form.
 

Elo

Contributor
Joined
Mar 11, 2012
Messages
122
Sorry for being unclear about the process I applied, but YES, that's the one I used. After I used it, it looks like this (with the domain I left shown as xxx.yyy.com??):
[Screenshot: 1614896174149.png]

It's not asking for the user and password of the domain, and it remembers the domain I left... When I check Enable, I get the errors stated in my previous post.
 