AD service Disabled on Reboot

Status
Not open for further replies.

pechkin000

Explorer
Joined
Jan 24, 2014
Messages
59
Hi, I was just wondering if anyone else is experiencing this, and whether it's a feature or a bug. Basically, after every update, on reboot I have to manually re-enable the AD service. It's not really a big deal, but it would be nice if I didn't have to do that every time.
Thanks!
 

dlavigne

Guest
Anything related in /var/log/messages or /var/run/dmesg.boot?
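If it helps, here is a minimal sketch of one way to narrow those logs down to lines that might concern the directory service. The keyword list is only a guess on my part (not an official list of what FreeNAS logs), and `ad_related` is a hypothetical helper name.

```python
import re

# Hypothetical keyword list; adjust it to whatever your system actually logs.
KEYWORDS = ("kinit", "kerberos", "activedirectory", "winbind")
PATTERN = re.compile("|".join(KEYWORDS), re.IGNORECASE)


def ad_related(lines):
    """Return the log lines that mention any of the keywords above."""
    return [line.rstrip("\n") for line in lines if PATTERN.search(line)]


# Example use (reading the log may need root):
#     with open("/var/log/messages", errors="replace") as log:
#         for hit in ad_related(log):
#             print(hit)
```

The same function can be pointed at /var/run/dmesg.boot. An empty result only means none of the guessed keywords appeared, not that nothing is wrong.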
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,525
Can you post a debug file from the machine?
 

pechkin000

Explorer
Joined
Jan 24, 2014
Messages
59
Hi Guys,
sorry I didn't get back to you right away; I ended up going away. @cyberjock - I am not sure which debug file you would need. Could you specify a location? Thanks. @dlavigne: I looked through those but can't really say what's relevant...
 

depasseg

FreeNAS Replicant
Joined
Sep 16, 2014
Messages
2,874
This just happened to me. I rebooted, system came up. AD wasn't enabled. I enabled it and then saved the debug. See attached. EDIT: I should clarify that this is FreeNAS Directory Services AD connecting to a Windows AD server (not FN acting like an AD server). If that wasn't what the OP meant, I can start a new thread.
 

Attachments

  • debug-freenas1-20150612223043.tgz
    710.5 KB · Views: 230

mayday175

Cadet
Joined
Jun 14, 2015
Messages
4
I have this issue too, and have had it for a while now. But today the power went out... when it came back on again, the AD service was disabled. I entered the password, checked Enable and saved. Everything was working again. I then applied available updates (9.3 stable) and rebooted, and the AD service was disabled again. I entered the password, checked Enable and saved. Everything was working again... again. Debug attached. Hope this is of use to someone.

Outages like this are a real problem at home... my TV server won't record or play back recorded TV when this is broken. This has a massive effect on WAF (wife approval factor)!

Edit: removed the attached log file.
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,525
I just looked through your debug. Here's what I saw that was totally weird:

- You have 4 mirrors of your slog device (why!?)
- You have errors that need serious investigation...
Example:
(ada6:ahcich10:0:0:0): WRITE_FPDMA_QUEUED. ACB: 61 80 80 1b e8 40 73 00 00 00 00 00
(ada6:ahcich10:0:0:0): CAM status: Uncorrectable parity/CRC error
(ada6:ahcich10:0:0:0): Retrying command

- You have kinit errors on bootup that seem very abnormal (likely a symptom of your problems, but may be the cause).
- You have mount requests to locations that don't exist
Example: Jul 14 19:49:08 nas1 mountd[2349]: mount request from 192.168.0.20 for non existent path /Raw
- There's evidence that at least one time the system crashed, rebooted spontaneously, or was powered off without a proper shutdown.
- You've got a lot of NT_STATUS_ACCESS_DENIED entries. That's not someone failing to authenticate (which is a different error), that's someone that is authenticating properly and being denied permission.
- The boot device is reporting errors on ZFS that are apparently undiagnosed and uncorrected.
- You are using dedup with just 32GB of RAM (gulp!)


I don't know you, but when I see all of these problems it does make me question how thorough the admin is at finding problems, identifying that they truly are problems, and then fixing them appropriately. So my gut feeling is that something has been tweaked, not set properly, or not set up at all, and that is responsible for your problems. As for what that problem is, I have no clue. There's lots to read in the logs and such, but no errors that I saw that would tell me what is wrong. It does look like your system isn't even trying to do any kind of AD connection on bootup. But I have no idea why it would behave that way if it was previously enabled. It kind of takes me back to the "what was tweaked, not set up properly, or not set up at all" thing.

But quite a few of those bullets are major no-nos and should have been investigated thoroughly when they first happened. Others seem to be the result of doing things "just because" and not for any reason except to add complexity for no value.
 

titan_rw

Guru
Joined
Sep 1, 2012
Messages
586
- You are using dedup with just 32GB of RAM (gulp!)

To be fair, there's only 213G in the deduped dataset. But yes, I'd be curious to see the resulting dedupe ratio, and how many blocks are in the DDT's.

"zdb -Dv Pool1" should do it. Might take a while to run depending on the number of deduped blocks.
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,525
To be fair, there's only 213G in the deduped dataset. But yes, I'd be curious to see the resulting dedupe ratio, and how many blocks are in the DDT's.

"zdb -Dv Pool1" should do it. Might take a while to run depending on the number of deduped blocks.

Yes, but do you realize that in a worst case scenario, just 213GB of deduped data would require something like 196GB of RAM? Just for the DDT. :P
 

titan_rw

Guru
Joined
Sep 1, 2012
Messages
586
Yes, but do you realize that in a worst case scenario, just 213GB of deduped data would require something like 196GB of RAM? Just for the DDT. :p

I assume this is with 512 byte sectors? With an ashift 12 pool I don't think that's possible, as the smallest allocatable size on the pool is 4k. But that would still be 24.5 gigs of RAM, which is excessive for a 32 gig system: assuming the default tuning of 1/4 of RAM for metadata, there'd be problems.

I very much doubt it's all 4k blocks though, which is why I said I'd be curious to see "zdb -Dv Pool1".

I average 110k block size across 569 GB, and 5.42M blocks. At 320 bytes per block, that's 1.6 gigs of ram for my DDT's.
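For anyone who wants to redo this arithmetic with their own numbers, here is a rough sketch. The 320 bytes per DDT entry is the rule-of-thumb figure used above, not something measured, and the function names are made up for illustration.

```python
GIB = 2**30
BYTES_PER_DDT_ENTRY = 320  # rule-of-thumb figure from this thread, not a measured value


def blocks_for(data_bytes, block_size):
    """Number of blocks if every block had exactly block_size bytes."""
    return data_bytes // block_size


def ddt_ram_bytes(block_count, bytes_per_entry=BYTES_PER_DDT_ENTRY):
    """Estimated RAM for the dedup table: one entry per block."""
    return block_count * bytes_per_entry


# The 5.42M blocks quoted above, taken as 5,420,000.
print(round(ddt_ram_bytes(5_420_000) / GIB, 1))  # 1.6

# Hypothetical worst cases for 213 GiB of data at a fixed block size.
for block_size in (512, 4096):
    blocks = blocks_for(213 * GIB, block_size)
    print(block_size, round(ddt_ram_bytes(blocks) / GIB, 1))
```

At 320 bytes per entry this comes out to roughly 133 GiB for 512-byte blocks and roughly 17 GiB for 4k blocks, so the larger worst-case figures quoted above presumably assume a higher per-entry cost. Either way, the estimate scales linearly with block count, which is why the average block size matters so much.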
 

mayday175

Cadet
Joined
Jun 14, 2015
Messages
4
Thank you for taking the time to examine my debug. That was so much more than I expected... I was only trying to help get to the bottom of the issues raised by the OP. With that in mind, I do not want to hijack this thread by going off topic, but I will respond briefly to many of the issues you identified...

- You have 4 mirrors of your slog device (why!?)
Are you saying this is a configuration error, or simply not required? What you don't know is that I am in the process of building my home ESX lab, so NFS shares on this NAS will be hit hard soon. I obtained 4 small mSATA SSDs (and a PCIe SATA card to install them onto) at a reasonable price to improve NFS write performance, so I'm simply trying to wear level the SSDs over as broad an area as possible. If the configuration is incorrect, I'm happy to be pointed in the right direction or tackle this issue in a dedicated thread.
- You have errors that need serious investigation...
Example:
(ada6:ahcich10:0:0:0): WRITE_FPDMA_QUEUED. ACB: 61 80 80 1b e8 40 73 00 00 00 00 00
(ada6:ahcich10:0:0:0): CAM status: Uncorrectable parity/CRC error
(ada6:ahcich10:0:0:0): Retrying command
I was not aware of this. This error has not shown up in the web GUI, nor was it sent to me as an email alert. I will investigate.
- You have kinit errors on bootup that seem very abnormal (likely a symptom of your problems, but may be the cause).
If this is on the right track, I hope others can confirm similar behaviour in order to help identify a pattern. That said, I have not noticed this error before, but I shall flex my google-fu and see what I can learn about it.
- You have mount requests to locations that don't exist
Example: Jul 14 19:49:08 nas1 mountd[2349]: mount request from 192.168.0.20 for non existent path /Raw
Eh? Yes it does... when AD is working, it mounts without issue. Perhaps there is a timing issue... this client was not affected by the power outage (on UPS) and so would be continuously requesting reconnection to /Raw before NAS1 had finished starting up. Since it works when AD is working, I am not concerned about this.
- There's evidence that at least one time the system crashed, rebooted spontaneously, or was powered off without a proper shutdown.
Yes. As mentioned in my previous post, the power went out (twice). I've just discovered it's not connected to the UPS. DOH!
- You've got a lot of NT_STATUS_ACCESS_DENIED entries. That's not someone failing to authenticate (which is a different error), that's someone that is authenticating properly and being denied permission.
The string of errors between 9am and 2pm is probably my TV server trying to reconnect to the SMB share to which it is recording/reading. With AD not working, no wonder. Since it works when AD is working, I am not concerned about this.
- The boot device is reporting errors on ZFS that are apparently undiagnosed and uncorrected.
I am aware of this one. Got it via an email alert and in the web GUI. I have taken note of the details, reset the error counters (after generating the debug) and will monitor over time.
- You are using dedup with just 32GB of RAM (gulp!)
32GB is all I can afford at the moment... and 32GB ECC is better than the 16GB non-ECC I used to have in it. I don't intend to abuse this feature and am storing only a small amount of data for testing purposes. Anyway, this is off topic.


I don't know you, but when I see all of these problems it does make me question how thorough the admin is at finding problems, identifying that they truly are problems, and then fixing them appropriately.
No, you don't know me. I am an IT Administrator with over 20 years' experience maintaining enterprise systems. Virtualisation (VMware), networking, and storage networks are my daily bread and butter. Running a 12TB NAS with 2 ESX hosts at home is small fry compared to what I work with... but at the same time it is different. SMB on non-Windows and NFS on non-EMC (or non fibre channel/LUN type storage) is new for me.
So my gut feeling is that something has been tweaked, not set properly, or not set up at all, and that is responsible for your problems.
Possible, but I have only ever used the WebGUI and have not tinkered with any CLI settings (except to investigate that boot drive/CRC error). I very carefully followed the FreeNAS Docs and read extensively about FreeNAS, Nexenta and ZFS in general, for many months before deploying my first "test" FreeNAS VM. I believe I educated myself more than the average noob before beginning this journey.
As for what that problem is, I have no clue. There's lots to read in the logs and such, but no errors that I saw that would tell me what is wrong. It does look like your system isn't even trying to do any kind of AD connection on bootup. But I have no idea why it would behave that way if it was previously enabled.
I agree, this problem does sound strange. But others have described this same problem, so I am happy that I am not alone in this regard. If only everyone would post logs/debugs etc. Simply saying "me too" doesn't help solve it.
It kind of takes me back to the "what was tweaked, not set up properly, or not set up at all" thing.
As mentioned, nothing tweaked outside the web GUI. Or, if the webGUI allowed me to make a configuration faux pas, I can't take full responsibility for it. I'm getting to the point where I'm thinking about re-installing from scratch and importing the existing zpools or moving to another platform (which I don't really want to do). Either way, I can't risk the declining WAF for much longer, just because things break after rebooting.

But quite a few of those bullets are major no-nos and should have been investigated thoroughly when they first happened. Others seem to be the result of doing things "just because" and not for any reason except to add complexity for no value.
Thank you for your time and effort examining my debug. I was only expecting it to be used in the research of the AD-disabled-after-reboot problem. You have gone way above and beyond what I expected. I have taken on board the factual issues raised and, sorry to say, have ignored the editorial commentary.
 

mayday175

Cadet
Joined
Jun 14, 2015
Messages
4
To be fair, there's only 213G in the deduped dataset. But yes, I'd be curious to see the resulting dedupe ratio, and how many blocks are in the DDT's.

"zdb -Dv Pool1" should do it. Might take a while to run depending on the number of deduped blocks.
Here's the output from that command:

Code:
[root@nas1] ~# zdb -Dv Pool1
DDT-sha256-zap-duplicate: 69412 entries, size 2412 on disk, 535 in core
DDT-sha256-zap-unique: 1523484 entries, size 778 on disk, 172 in core

DDT histogram (aggregated over all DDTs):

bucket              allocated                       referenced
______   ______________________________   ______________________________
refcnt   blocks   LSIZE   PSIZE   DSIZE   blocks   LSIZE   PSIZE   DSIZE
------   ------   -----   -----   -----   ------   -----   -----   -----
     1    1.45M    179G    139G    141G    1.45M    179G    139G    141G
     2    49.4K   5.72G   3.71G   3.78G     108K   12.5G   8.02G   8.18G
     4    14.2K   1.73G    905M    931M    60.6K   7.38G   3.79G   3.90G
     8    3.03K    369M    195M    200M    30.6K   3.66G   1.95G   2.01G
    16      758   93.2M   55.6M   56.8M    16.8K   2.05G   1.23G   1.25G
    32      351   43.8M   26.0M   26.6M    17.0K   2.12G   1.26G   1.29G
    64       39   4.88M   2.77M   2.82M    4.39K    562M    320M    326M
   128        8    642K    395K    413K    1.20K   82.7M   50.9M   54.1M
   256        1     512     512   5.81K      288    144K    144K   1.63M
   16K        2    256K      2K   11.6K    47.2K   5.91G   47.2M    275M
 Total    1.52M    187G    144G    146G    1.73M    213G    155G    158G

dedup = 1.08, compress = 1.37, copies = 1.02, dedup * compress / copies = 1.46

space map refcount mismatch: expected 222 != actual 202

Sorry, it doesn't mean much to me (but I will investigate that refcount mismatch). That said, could this be discussed elsewhere? I'd like this thread to stay on topic, i.e. the AD-service-disabled-on-reboot topic.
 

titan_rw

Guru
Joined
Sep 1, 2012
Messages
586
I wouldn't worry about the space map mismatch. Mine says that too.

That being said, you aren't gaining anything from dedupe. You have 158 gigs of data deduping down to 146 on disk. Strongly consider un-deduping this. The best way is usually to create a new dataset(s) with dedupe off, move all the data to the new dataset and delete the old.

You've got an average blocksize of 100k though, which isn't too bad, and the dedupe tables are using approximately 500 megs of RAM, which again isn't too bad, but there's no reason for it, as a dedupe ratio of 1.08 is not worth it.
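As a rough cross-check, the same figures can be recomputed from the zdb output posted above. This is only a sketch: the numbers are copied from that output, the 320 bytes per entry is the rule of thumb mentioned earlier in the thread, and I am reading the "in core" value as an average per-entry size in bytes, which is my understanding of how zdb reports it.

```python
MIB = 2**20

# Entry counts and "in core" sizes copied from the zdb -Dv Pool1 output above.
DDT_TABLES = {
    "duplicate": {"entries": 69412, "in_core": 535},
    "unique": {"entries": 1523484, "in_core": 172},
}


def total_entries(tables):
    return sum(t["entries"] for t in tables.values())


def in_core_bytes(tables):
    """Sum of entries * reported in-core size (assumed to be bytes per entry)."""
    return sum(t["entries"] * t["in_core"] for t in tables.values())


def dedup_ratio(referenced, allocated):
    """Referenced size divided by allocated size, as in zdb's 'dedup =' line."""
    return referenced / allocated


entries = total_entries(DDT_TABLES)
print(entries)                                   # 1592896
print(round(entries * 320 / MIB))                # rule of thumb, in MiB
print(round(in_core_bytes(DDT_TABLES) / MIB))    # from the "in core" figures, in MiB
print(round(dedup_ratio(158, 146), 2))           # DSIZE totals, in G
```

The rule of thumb lands near the 500 megs mentioned above, while the "in core" figures suggest something closer to 285 MiB. Both are small next to 32GB, and the ratio from the Total row matches the reported dedup = 1.08.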

As for the AD service, I don't have any experience with that aspect of FreeNAS, not having a domain at home.
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,525
Are you saying this is a configuration error, or simply not required? What you don't know is that I am in the process of building my home ESX lab, so NFS shares on this NAS will be hit hard soon. I obtained 4 small mSATA SSDs (and a PCIe SATA card to install them onto) at a reasonable price to improve NFS write performance, so I'm simply trying to wear level the SSDs over as broad an area as possible. If the configuration is incorrect, I'm happy to be pointed in the right direction or tackle this issue in a dedicated thread.

Yeah, it's not wear leveling over a broad area. It's taking the same data and writing the entirety of all applicable data to all 4 drives. You are literally wearing out 4 drives at the same time with the full workload. Basically your configuration is not doing what you think it's doing and is not doing anything useful for you (in fact, it's probably hurting you because your server has to write all of the data to all 4 drives before ZFS will acknowledge that the data has been written).
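To put numbers on that, here is an idealized sketch of how much each log device absorbs under the two layouts. The 400 GiB workload is a made-up figure, `per_device_writes` is a hypothetical helper, and real striped log devices will not share the load perfectly evenly.

```python
def per_device_writes(total_bytes, devices, layout):
    """Bytes each log device absorbs for a given amount of sync writes.

    Idealized model: a mirror writes every byte to every member, while
    separate (striped) log devices are assumed to share the load evenly.
    """
    if devices < 1:
        raise ValueError("need at least one device")
    if layout == "mirror":
        return [total_bytes] * devices
    if layout == "stripe":
        base, extra = divmod(total_bytes, devices)
        # Spread any remainder over the first few devices.
        return [base + (1 if i < extra else 0) for i in range(devices)]
    raise ValueError(f"unknown layout: {layout}")


GIB = 2**30
workload = 400 * GIB  # hypothetical amount of sync writes
for layout in ("mirror", "stripe"):
    print(layout, [b // GIB for b in per_device_writes(workload, 4, layout)])
```

In this model the 4-way mirror puts the full 400 GiB on every SSD, while four separate log devices would each take about a quarter. Keep in mind that dropping the mirror also drops the redundancy, so this only illustrates the wear point and is not a layout recommendation.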

Eh? Yes it does... when AD is working, it mounts without issue. Perhaps there is a timing issue... this client was not affected by the power outage (on UPS) and so would be continuously requesting reconnection to /Raw before NAS1 had finished starting up. Since it works when AD is working, I am not concerned about this.

Incorrect. There is no such location as /Raw. Go to the CLI of FreeNAS and try to go to /Raw. There is no location for that. Everything, and I mean everything, must be shared under /mnt/zpool/something. The WebGUI won't let you share anything else. So even if the location did exist, you wouldn't be able to share it. This limitation is by design. There is no /Raw under FreeNAS with any version past or present. I kind of figured you would say this because it's a common mistake that new users to FreeNAS make. But it is definitely a mistake.
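A quick way to sanity-check what a client is asking for is to test whether the requested path even sits under /mnt/<pool>/. The sketch below only looks at the shape of the path string, not at what exists on disk; `is_under_mnt` is a hypothetical helper, and /mnt/Pool1/Raw is a guess at what the full path might look like (Pool1 is the pool name from the zdb command earlier, the dataset name is assumed).

```python
from pathlib import PurePosixPath


def is_under_mnt(path):
    """True if path looks like /mnt/<pool>[/...], judged from the string only."""
    parts = PurePosixPath(path).parts
    return len(parts) >= 3 and parts[0] == "/" and parts[1] == "mnt"


for candidate in ("/Raw", "/mnt/Pool1/Raw"):
    print(candidate, is_under_mnt(candidate))
```

If the client's mount request is for /Raw, it fails this check, which matches the mountd message in the log. Pointing the client at the full path under /mnt is the thing to try.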

The string of errors between 9am and 2pm is probably my TV server trying to reconnect to the SMB share to which it is recording/reading. With AD not working, no wonder. Since it works when AD is working, I am not concerned about this.

Again, if you read what I said, it authenticated and accepted the credentials, but rejected the actual mounting or accessing of the share due to permissions. So no, this is not an issue from AD. If you are actually going to argue that it *is* an issue with AD then your server is way, way more misconfigured than I realized. Again, the server is accepting the login credentials, it is authenticating, and the server is rejecting the attempt to access the share due to permissions of the share and/or its files. But the username and password *are* being accepted. If they are being accepted when you aren't on the domain, you have other things misconfigured, and I really don't even want to contemplate how you'd end up with a server that is accepting connections that it should not be accepting.

No, you don't know me. I am an IT Administrator with over 20 years' experience maintaining enterprise systems. Virtualisation (VMware), networking, and storage networks are my daily bread and butter. Running a 12TB NAS with 2 ESX hosts at home is small fry compared to what I work with... but at the same time it is different. SMB on non-Windows and NFS on non-EMC (or non fibre channel/LUN type storage) is new for me.

To be honest, you're at more of a disadvantage than someone who has no experience. The first mistake many new users make is to assume their prior experience is going to be very on-topic and very correct. Unfortunately, the experience is often very off-topic and even more incorrect. This leads people like you, who may be very good with other NAS solutions, very off-track, and you will make many more mistakes than you realize because you've assumed that the knowledge you've relied on for a decade or more is applicable when it is not. It's not a personal fault on your part. It's just how we humans are. It's also why people run into problems when they post that they're having some major issue and try to demand that someone do a remote session with them and fix their problem for free. Then they throw the "but I got X years in the IT field and I know what I'm talking about" and that's when everyone here knows they are way more clueless than they will ever realize. You aren't in this category, but those of us with experience have seen it time and time again. Generally, the second someone pulls the "I got X years in IT.. how dare you tell me I don't know how to configure X" I tune out. There is no convincing them that they are in error, and it's not worth spending a week or more fighting them on it. I leave them be and either they figure out they were wrong the whole time or they use something else out of frustration. Anyway, it's better to assume everything you know about NASes is wrong with regards to FreeNAS. You'll be better off if you do. :P

Possible, but I have only ever used the WebGUI and have not tinkered with any CLI settings (except to investigate that boot drive/CRC error). I very carefully followed the FreeNAS Docs and read extensively about FreeNAS, Nexenta and ZFS in general, for many months before deploying my first "test" FreeNAS VM. I believe I educated myself more than the average noob before beginning this journey.

There are still plenty of ways to misconfigure things outside of the WebGUI. AD requires lots of other things to be properly set up and configured: DNS, potentially DHCP, proper networking, etc. Most of the time when I'm troubleshooting someone's AD problems it's that either their AD structures are corrupted (this is a real pain to fix and I never get involved with fixing that for obvious reasons) or they have improper settings that make FreeNAS not able to link to their AD controllers properly. It is rarely obvious as the error messages aren't clear as to what the problem is, and any given error often has multiple possible causes. My initial hunch is that your DNS is not properly configured or at least not working properly since you say that it works just fine if you try to enable AD services after bootup.
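One concrete DNS check is whether the SRV records that AD clients use to find domain controllers resolve from the NAS. The sketch below only builds the record names to look up; `ad_srv_records` is a hypothetical helper and example.local is a placeholder for your real AD domain.

```python
def ad_srv_records(domain):
    """Return common SRV record names used to locate AD domain controllers."""
    domain = domain.strip().rstrip(".").lower()
    if not domain:
        raise ValueError("domain must not be empty")
    return [
        f"_ldap._tcp.{domain}",
        f"_kerberos._tcp.{domain}",
        f"_ldap._tcp.dc._msdcs.{domain}",
    ]


# example.local is a placeholder; substitute your own AD domain.
for name in ad_srv_records("example.local"):
    print(name)
```

Each name can then be queried from the FreeNAS shell with whatever DNS tool is available, for example `host -t SRV <name>`. If the lookups fail from the NAS while they work from a Windows client, that would point at the NAS's DNS settings.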

Thank you for your time and effort examining my debug. I was only expecting it to be used in the research of the AD-disabled-after-reboot problem. You have gone way above and beyond what I expected. I have taken on board the factual issues raised and, sorry to say, have ignored the editorial commentary.

Unfortunately when dealing with complex issues like AD, the issues are almost never limited to just some issues with AD services not working. It's often the things that the AD services work with (DHCP, DNS, etc.) and those are often not properly configured, which is outside of the FreeNAS box itself. AD relies on many other things to work. If even one of those is off-kilter, AD won't work and you'll end up very frustrated because it just isn't "working correctly". Even troubleshooting AD issues myself is often frustrating because the errors don't exactly say things like "Check your DNS". They say things like "Error X has occurred" and the number of things that can cause that are often more than you can count on one hand. But you have to rule out each one, one at a time. :(

Hopefully you don't take this as a personal attack. That's not my intent and I'm not trying to attack you. I do hope you take this stuff to heart and do solve your problem. If you've got 20 years in NAS tech you'll probably find that once you start using FreeNAS and ZFS you never want to go back to that other stuff ever again. Many prior NAS admins have been converted to the ZFS love since I started frequenting the forums several years ago.
 

mayday175

Cadet
Joined
Jun 14, 2015
Messages
4
@pechkin000, @depasseg, @boston243, do you guys still have this issue?

I've just applied the latest FreeNAS updates... one of which included the description:
#6951 Bug Important ix-activedirectory is deleting computer object on service stop

Upon restarting after this update, I have AD configuration/connectivity active and working. Yeah! Problem solved :) While I will continue to monitor my NAS, I'm curious to know if anyone else with this issue has had it resolved recently...
 

depasseg

FreeNAS Replicant
Joined
Sep 16, 2014
Messages
2,874
I haven't had a chance to upgrade yet. But that sounds like good news.

 