AD/SMB not working after system reboot

adocampo

Cadet
Joined
Aug 31, 2021
Messages
2
Same situation here. I guess it doesn't matter, but I've installed TN on a KVM VM.
Fresh installation of 12.0-U5, joined a 2016 AD without problems, and configured the shares without a glitch. Then AD stopped working after powering off and powering on (for some reason reboot doesn't work on my VM, but this is not a problem for me).
If I apply @Performance100's workaround, it finally rejoins the domain several minutes after it boots, but I agree that's a terrible workaround.
So I installed another VM with the same version, configured AD before creating the zpool, powered off and on, and... it was still connected to the AD.
I then added the disks for the zpool, but didn't create it yet, and powered off and on once again. AD was working just as expected.
Then I configured the zpool, creating just one with the disks I'd added previously, and powered off/on once again... and then the AD directory service failed to start.
Finally, I destroyed the zpool, wiped the disks, and powered off and on again, and... voilà, AD was working again.

So it seems the zpool is somehow interfering with AD.
 

adocampo

Cadet
Joined
Aug 31, 2021
Messages
2
Oh, I did more tests. I think I found a way to make it work (from scratch).
Before joining TrueNAS to the AD, you must first create the zpool, and only then join it to the AD. In this order, AD keeps working after powering the NAS back on.
 

tonci

Dabbler
Joined
Mar 14, 2013
Messages
18
Hello,
now it seems we have a new experience. It somehow started working (U5): after a restart the NAS gets rejoined correctly (very probably connected to what @adocampo figured out), wbinfo -u / wbinfo -g show AD objects as well, and shares are accessible from the local LAN, but there is one thing that is still confusing:
[Attachment: 1630575255444.png]

This monitor still shows AD-Faulted ...

So now we cannot determine whether something is still wrong (and we do not know what), or whether the AD service monitor just does not work ... (!?)

BR

Tonci
 

tonci

Dabbler
Joined
Mar 14, 2013
Messages
18
Update to previous post .... I should've been more patient :) ... Now it is OK .. after every restart ... (?!)
 

tonci

Dabbler
Joined
Mar 14, 2013
Messages
18
No, it isn't good ... there is no pattern ... after a few more restarts, everything works but the notification says AD-faulted .... for an hour already ...
 

S0ulib

Cadet
Joined
Sep 2, 2021
Messages
7
I have issues with the SMB service as well. It doesn't start, either manually or automatically after a reboot. Any suggestions?
 

oktaya

Cadet
Joined
Sep 16, 2021
Messages
4
There are various problems listed here, but I too have one similar to those. This is on the latest 12.x release. I just updated today, but Active Directory still does not work after a reboot. I have to leave the domain and rejoin. Right now wbinfo -u, wbinfo -g, and getent passwd all show only system info, nothing from the domain. Directory Services Monitor shows Active Directory: Healthy. I had the included Active Directory server on it before, if that makes a difference. Also, my domain is not a real Active Directory; it's Samba. I followed the instructions to migrate. It didn't work, but I got it working after some trial and error, mostly to do with RID numbers for the services.

At one point I remember finding a script here to restart Active Directory properly. It's not documented, but it is installed. I don't remember its path now.

I have seen FreeNAS staff asking for debug info here and, as far as I can see, not getting it. I'd be willing to provide it if you can point me to documentation on how to do it. I'd be happy to provide more info.

Thanks.
 

oktaya

Cadet
Joined
Sep 16, 2021
Messages
4
A few bits of info from the middlewared log.

This kerberos error seems relevant:

[2021/09/16 17:46:07] (WARNING) DirectoryServices.initialize():432 - Failed to start kerberos after directory service initialization. Services dependent on kerberos maynot work correctly.

Actual error is:

middlewared.service_exception.CallError: [EFAULT] Timed out hung kinit after [30] seconds

From the way it behaves (it doesn't work after rebooting, but works after you leave the domain and join again, or when people join manually via startup scripts after boot), it seems to me that something is not ready yet at the boot stage where this runs.

Is there a way to disable joining the domain during boot and then run the middlewared command manually?
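
If this really is a boot-time race, one way to test the theory is to measure when the KDC actually becomes reachable. Here is a minimal Python sketch, not anything TrueNAS ships: `wait_for_port` is a hypothetical helper, `dc1.example.com` in the usage note is a placeholder for your own domain controller, and port 88 is the standard Kerberos port. It only probes the network; it does not join the domain or call middlewared.

```python
import socket
import time


def wait_for_port(host, port=88, timeout=120.0, interval=5.0, connect_timeout=3.0):
    """Return True once a TCP connection to host:port succeeds.

    Returns False if no attempt succeeds before `timeout` seconds have passed.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            # Covers DNS failures, refused connections and connect timeouts,
            # all of which are subclasses of OSError.
            with socket.create_connection((host, port), timeout=connect_timeout):
                return True
        except OSError:
            pass
        # Give up if another sleep would take us past the deadline.
        if time.monotonic() + interval > deadline:
            return False
        time.sleep(interval)
```

Running something like `wait_for_port("dc1.example.com")` right after boot and noting how long it takes to return should show whether the network or the controller is simply not up yet when kinit is attempted.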
 

oktaya

Cadet
Joined
Sep 16, 2021
Messages
4
I think I got it working for my own scenario. I noticed my Kerberos Realm settings only contained the REALM name. Thinking this might cause a timeout while trying to look up the KDC, I added the domain controller name (not the IP, though maybe that would have worked too) in the KDC field. I rebooted, and for the first time wbinfo -u worked without leaving/rejoining the domain.

I don't think this is the whole story. The lookup should have worked anyway, but I seem to have a stale domain controller record somewhere. I have two Samba domains, both of which have the correct controllers listed for the domain, yet FreeNAS seems to be querying an old and defunct domain controller at boot. The error message in the middlewared log with DEBUG enabled looks like this:

(DEBUG) ActiveDirectoryService.port_is_listening():113 - connection to WIN-FN2O8GU81OO.domain.MYNAME.com failed with error: [Errno 64] Host is down

I checked the domain controllers on a Windows machine too, and this WIN-FN2O8GU81OO controller does not exist. In the logs there's also a domain controller IP I do not recognize. It's probably the old IP of the old domain controller, but I can't correlate them because they do not appear in the logs together.

I have no idea where FreeNAS is getting the stale data from or how to delete it. I checked a current config dump and that IP/name does not exist anywhere. If I remember correctly, I had to demote this domain controller without having access to it, which is known to cause problems in Samba. My theory is that trying to connect to it is what causes the timeout. I couldn't find where to increase that 30-second timeout either. Maybe that would have solved the problem too.

I am not too experienced with Samba, but even though some people in the thread are talking about smbd not starting at all, that might be another kind of timeout-related problem, so trying this method might be worthwhile, unless you already have the correct info in the settings.

Unfortunately I can't keep rebooting the server to see if it really worked, since it's resilvering right now. But I recommend that people having similar problems check whether they have similar errors in the log and/or make sure the REALM settings are filled in completely.
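
To make the log check easier, here is a rough Python sketch that filters log lines for the messages quoted in this thread. The function name `find_ad_boot_errors` is made up for illustration, and the patterns are only the substrings from the excerpts above, so other versions may word these messages differently.

```python
import re

# Substrings taken from the log excerpts quoted in this thread.
# Assumption: other releases may phrase these messages differently.
PATTERNS = [
    re.compile(r"Failed to start kerberos"),
    re.compile(r"Timed out hung kinit"),
    re.compile(r"port_is_listening\(\).*failed with error"),
]


def find_ad_boot_errors(lines):
    """Return the lines (without trailing newline) that match any known pattern."""
    return [
        line.rstrip("\n")
        for line in lines
        if any(p.search(line) for p in PATTERNS)
    ]
```

You would pass it an open file object for the middlewared log (I'm assuming `/var/log/middlewared.log`; adjust the path if yours differs) and print whatever it returns. The `port_is_listening` lines only show up with DEBUG enabled.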
 

oktaya

Cadet
Joined
Sep 16, 2021
Messages
4
This is being retrieved via DNS in your AD domain.
Thanks for the info. I have fixed the stale DNS records in my domain. I haven't tested a reboot yet but I don't expect anything to break. Also since I'd already gotten my issue resolved, I won't keep poking at it right now. I just wanted to share my findings in case anybody else might have similar issues.
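
For anyone who wants to check whether the controllers that DNS hands back are actually alive, here is a small Python sketch. It assumes you already have the list of (host, port) pairs, for example from the SRV records of your domain such as `_ldap._tcp.dc._msdcs.<your domain>` or `_kerberos._tcp.<your domain>`; the lookup itself is not done here. `probe_controllers` is a hypothetical name, and this is not how the middleware itself performs the check.

```python
import socket


def probe_controllers(targets, connect_timeout=3.0):
    """Map each (host, port) pair to None if it accepts a TCP connection,
    or to the error text if resolving or connecting fails."""
    results = {}
    for host, port in targets:
        try:
            with socket.create_connection((host, port), timeout=connect_timeout):
                results[(host, port)] = None
        except OSError as exc:
            # Includes unresolvable names, refused connections and timeouts.
            results[(host, port)] = str(exc)
    return results
```

Any entry with an error string is a candidate for a stale record like the one I had. Port 389 (LDAP) and port 88 (Kerberos) are the usual ones to try.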
 

juwani

Cadet
Joined
Sep 24, 2019
Messages
1
Hi, I have the same issue: fresh installation of TrueNAS-12.0-U5.1, I created a pool, and then joined TrueNAS to AD. It works fine until I restart the FreeNAS box. After a restart I see the status Active Directory Faulted, and I have to leave and rejoin AD to make it work.
Can you help me somehow? It is very annoying. On the other hand, I don't want to use the script, because it contains an AD password in plaintext.

PS. I can see the following error in the task manager:
sysdataset_update.pool: System dataset location may not be moved while the Active Directory service is enabled.
 