BUG: NTP not syncing, stuck on .INIT. - workarounds described within

invar

Dabbler
Joined
Jan 23, 2021
Messages
36
Update: skip ahead to this post for the definitive workaround:
Post #10

And an explanation of what happened when I told iX about it:
Post #11

I'm on TrueNAS Core 12, but if memory serves me correctly, this used to happen to me in FreeNAS 11 as well. The default NTP servers are configured as 0.freebsd.pool.ntp.org, 1.freebsd.pool.ntp.org and 2.freebsd.pool.ntp.org.

Over time, my clock would slowly drift, which would eventually result in my not being able to log in due to Google Authenticator's time-based 2FA. Fortunately I had console access via SSH, and sometimes performing a "service ntpd restart" caused things to sync back up.

However, this was a nuisance and a temporary fix. Today I really dove into it.

I tried adding about a dozen servers, such as time.nist.gov and time.windows.com, and several others from *.pool.ntp.org.

Then I checked the status on the console with "ntpq -p" and found that a bunch of servers were stuck on .INIT. About 7 of 12 were synced initially, so I gave it 24 hours to see whether the rest would sync up. Instead, I ended up with only 3 servers synced; the rest had gone to .INIT.

Puzzled, I dove into the ntp.conf file by hand, and the first line I commented out was "restrict default ignore".

I performed a "service ntpd restart" and then a "ntpq -p" and to my delight it seemed almost every server had synced. Now I was making progress and had it narrowed down to an access restriction issue and nothing to do with my ISP or router or ports.
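For anyone who wants to check this without eyeballing the table, here is a minimal sketch that counts how many peers are still showing .INIT. as their refid. It assumes the usual "ntpq -pn" column layout (refid is the second whitespace-separated field; -n keeps remote names numeric so they contain no spaces). The function name count_init_peers is made up for illustration.

```shell
#!/bin/sh
# count_init_peers: read an ntpq peer table on stdin and print how many
# rows have ".INIT." in the refid column (second field).
# Hypothetical helper name; not part of ntpq or TrueNAS.
count_init_peers() {
    # The tally character (*, +, -, #) is glued to the remote name, so the
    # refid is still field 2. Header lines never match ".INIT." exactly.
    awk '$2 == ".INIT." { n++ } END { print n + 0 }'
}

# Live usage on the NAS (needs a running ntpd):
#   ntpq -pn | count_init_peers
```

A result of 0 a few minutes after a restart suggests every configured server has answered at least once; a non-zero count that never drops points at the restriction problem described here rather than at the network.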

After reading the ntp.conf docs here:

I discovered the following:
ALERT! You must use IP addresses on restrict statements.

And

You may use either a hostname or IP address on the server line. You must use an IP address on the restrict line.

A peek at the ntp.conf generated by TrueNAS showed that it clearly did not conform to that.

After deleting the default 3 servers via the web interface, I immediately started looking up the IP addresses of public NTP servers and added them in.

As of now, all servers I configured are synced.

Code:
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
 time-c-b.nist.g .NIST.           1 u    6   64   21   43.632   -1.116   0.313
-104.171.113.34  204.9.54.119     2 u   38   64  377   32.063   +5.971   0.521
*usnyc3-ntp-003. .GPSs.           1 u   49   64  377    2.851   -0.974   0.598
-dev.smatwebdesi 204.9.54.119     2 u   38   64  377   43.156   -1.664   0.276
#91.206.16.3 (tm 195.28.27.26     2 u  171   64  144  153.327   +3.700   1.222
-ntp2.as200552.n 202.70.69.81     2 u   46   64  377   72.704   -0.541   0.439
-ntp0.edu-zg.io  85.158.27.30     2 u   36   64  377   98.168   -3.028   0.294
-82.193.104.168  62.149.0.30      2 u   39   64  377  113.148   -8.644   6.818
-23.92.64.226    31.222.135.144   3 u   37   64  375   42.031   +2.459   0.629
+159.203.82.102  17.253.2.123     2 u   37   64  377    4.472   -0.661   0.209
-li116-100.membe 192.58.120.8     2 u   78   64  206   42.461   -1.849   0.959
+162.221.74.15 ( 185.140.51.3     2 u   28   64  377    8.201   -1.147   0.371

If anyone else is running into NTP issues, this might be a good idea for you to try to see if it resolves your issue. I'm in NYC, BTW, so my server choices are based on that. If you are in another country or even on a different coast, you should probably use different servers.
 

Samuel Tai

Never underestimate your own stupidity
Moderator
Joined
Apr 24, 2020
Messages
5,399
Your drifting has nothing to do with NTP. It's because your system is using TSC as the timecounter, but it's unstable.

 

invar
Samuel Tai said:
Your drifting has nothing to do with NTP. It's because your system is using TSC as the timecounter, but it's unstable.

Thank you but that's a separate issue for me to tackle. The issue I'm tackling here is a failure of ntpd to sync at all.
 

Samuel Tai
invar said:
Thank you but that's a separate issue for me to tackle. The issue I'm tackling here is a failure of ntpd to sync at all.

You've brute forced it by using multiple NTP IPs. On my TrueNAS installation, the restrict statements cause no problems. Fix your timecounter, and you'll have a stable system clock.
 

invar
What ntp servers are you using on your own install?

I appreciate the input. I do plan on fixing it, but again, that's not going to fix the problem of ntpd not being able to get out of .INIT., nor the generated config not conforming to the requirement of using an IP address on the restrict lines.
 

Samuel Tai
invar said:
What ntp servers are you using on your own install?

The defaults, with the default restrict statements. I had the same instability until I set kern.timecounter.hardware=HPET.
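A minimal sketch for checking whether HPET is even offered before switching to it, assuming a FreeBSD-based system where the kern.timecounter.choice sysctl lists the available counters as NAME(quality) pairs. The helper name has_timecounter is hypothetical, and the sample values in the comment are only illustrative of the format.

```shell
#!/bin/sh
# has_timecounter NAME: succeed if NAME appears in the timecounter list
# read from stdin. The list is expected in the kern.timecounter.choice
# format, for example (illustrative values only):
#   TSC-low(1000) ACPI-fast(900) HPET(950) i8254(0) dummy(-1000000)
# Hypothetical helper; not part of FreeBSD or TrueNAS.
has_timecounter() {
    # One counter per line, then look for an entry that starts with "NAME(".
    tr ' ' '\n' | grep -q "^$1("
}

# On the NAS itself:
#   sysctl -n kern.timecounter.hardware       # counter currently in use
#   sysctl -n kern.timecounter.choice | has_timecounter HPET \
#       && sysctl kern.timecounter.hardware=HPET   # lasts until reboot
```

Setting the sysctl from the shell does not survive a reboot; on TrueNAS Core the usual way to make it stick is a sysctl-type tunable added through the web interface, though that step is an assumption about your setup rather than something shown in this thread.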
 

invar
Also for reference I'm experiencing a drift of about 20 seconds behind over a period of half a year.
 

invar
My suspicion is it's DNS related somehow?

nslookup gives me the following for 0.freebsd.pool.ntp.org. The answer changes every time, which makes sense to me.

Code:
Non-authoritative answer:
Name:   0.freebsd.pool.ntp.org
Address: 23.92.64.226
Name:   0.freebsd.pool.ntp.org
Address: 185.102.185.67
Name:   0.freebsd.pool.ntp.org
Address: 139.143.5.30
Name:   0.freebsd.pool.ntp.org
Address: 194.100.206.70
> 0.freebsd.pool.ntp.org
Server:         8.8.8.8
Address:        8.8.8.8#53
Non-authoritative answer:
Name:   0.freebsd.pool.ntp.org
Address: 85.21.78.23
Name:   0.freebsd.pool.ntp.org
Address: 38.100.216.142
Name:   0.freebsd.pool.ntp.org
Address: 80.88.90.14
Name:   0.freebsd.pool.ntp.org
Address: 94.124.107.190
> 0.freebsd.pool.ntp.org
Server:         8.8.8.8
Address:        8.8.8.8#53
Non-authoritative answer:
Name:   0.freebsd.pool.ntp.org
Address: 23.92.64.226
Name:   0.freebsd.pool.ntp.org
Address: 185.102.185.67
Name:   0.freebsd.pool.ntp.org
Address: 139.143.5.30
Name:   0.freebsd.pool.ntp.org
Address: 194.100.206.70


See this thread from 2010, which describes an issue with using *.pool.ntp.org servers, and another thread regarding a "restrict source" configuration directive that still seems not to have been implemented:

Re: problem with "restrict default ignore"
dave...@gmail.com, 7/30/10, to J. Bakshi, ques...@lists.ntp.org
On Fri, Jul 30, 2010 at 07:11 UTC, J. Bakshi <joy...@infoservices.in> wrote:
> I like to secure my ntp daemon with "restrict default ignore"
You didn't ask, but my personal opinion is that is usually overkill
and just causes more pain than it's worth. I use "restrict default
limited kod notrap".

> but ntp stops synchronizing with this configuration; though I have restrict lines for ntp servers.

Yes, but your 'servers' are *.pool.ntp.org, which DNS names resolve to
a different handful of servers every few minutes. You don't mention
which version of ntpd, but I'll bet it is not recent enough to add a
restriction for each of several IP addresses a DNS name resolves to.
Instead, I suspect it is using only a single IP address for each
"restrict" line in ntp.conf. "ntpdc -c reslist" displays the
resulting restriction list.

Since running up-to-date ntpd is heresy to most, I'll first assume you
want to make it work with the version of ntpd you have already. One
way is to switch from using *.pool.ntp.org to hand-selected servers,
perhaps from:


Newer ntp-dev releases of ntpd (4.2.7p22 and beyond) have been
enhanced with this specific problem in mind, adding a "restrict
source" directive to configure blanket restrictions for servers listed
in "server", "pool", "manycastclient", and other directives which
configure associations. If you were to jump to the bleeding edge, you
could replace all your per-server restrict lines with a single
"restrict source notrap noquery".

If you do try ntp-dev, you might also kick the tires of the reworked
"pool" directive, by using it in place of "server" for *.pool.ntp.org
lines.

Cheers,
Dave Hart

Here's another thread inquiring about which versions of ntp have "restrict source" in it:


And as of now, I am on ntpd 4.2.8p15-a (1).

The man page for ntp.conf makes no mention of the "restrict source" ability either.
 

invar
Update:

I modified my ntp.conf manually via console to the following:
Code:
pool 0.north-america.pool.ntp.org iburst maxpoll 10 minpoll 6
pool 1.north-america.pool.ntp.org iburst maxpoll 10 minpoll 6
pool 2.north-america.pool.ntp.org iburst maxpoll 10 minpoll 6
pool 3.north-america.pool.ntp.org iburst maxpoll 10 minpoll 6
pool us.pool.ntp.org iburst maxpoll 10 minpoll 6
restrict default ignore
restrict -6 default ignore
restrict 127.0.0.1
restrict -6 ::1
restrict 127.127.1.0
restrict source nomodify notrap noquery


Restarted the ntpd service

Code:
service ntpd restart

And now "ntpq -p" returns the following:

Code:
 
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
 0.north-america .POOL.          16 p    -   64    0    0.000   +0.000   0.000
 1.north-america .POOL.          16 p    -   64    0    0.000   +0.000   0.000
 2.north-america .POOL.          16 p    -   64    0    0.000   +0.000   0.000
 3.north-america .POOL.          16 p    -   64    0    0.000   +0.000   0.000
 us.pool.ntp.org .POOL.          16 p    -   64    0    0.000   +0.000   0.000
+104.194.8.227   132.239.1.6      2 u   40   64    1   73.222   +4.251   0.548
+ip32.ip-149-56- 206.108.0.131    2 u   38   64    1   16.632   +1.357   1.460
*206.81.5.45     192.5.41.41      2 u   38   64    1    7.751   -0.304   0.894
-tick.srs1.ntfo. 206.55.64.77     3 u   38   64    1   69.859   +3.398   0.199
-t1.time.bf1.yah 98.139.133.62    2 u   35   64    1   13.703   -0.824   0.723
-44.190.40.123   132.163.96.1     2 u   35   64    1   68.981   +1.104   0.393
 t2.time.gq1.yah 98.137.249.214   2 u   46   64    1   66.414   +0.271   0.000
 backup01.mx.dat 66.70.119.39     3 u   32   64    1    6.432   -1.627   0.896



My conclusion is that this is how it SHOULD function when using POOL servers.

I submit that TrueNAS needs to be updated to take that into account.

A simple solution would be to have a checkbox in the NTP section of the WebGUI to indicate whether a listed address is part of a pool. If it is, the generated ntp.conf should use "pool" instead of "server" for it. Additionally, a "restrict source" directive should be added if any pool servers are being used.

For now, I've copied my ntp.conf file to one of my pools and added a startup command to tasks to simply copy my custom ntp.conf over the generated one and then restart the ntpd service.
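A minimal sketch of what such a startup command could look like. The saved-copy path is hypothetical (adjust it to your own pool and dataset), /etc/ntp.conf is assumed to be the file ntpd reads, and the function name apply_custom_ntp_conf is made up for illustration. Unlike a bare copy, it skips the restart when the live file already matches.

```shell
#!/bin/sh
# All three values can be overridden from the environment.
CUSTOM_CONF="${CUSTOM_CONF:-/mnt/tank/config/ntp.conf}"  # hypothetical saved copy
TARGET_CONF="${TARGET_CONF:-/etc/ntp.conf}"              # assumed live config path
RESTART_CMD="${RESTART_CMD:-service ntpd restart}"

# apply_custom_ntp_conf: copy the saved config over the generated one and
# restart ntpd, but only if the two files differ. Hypothetical helper.
apply_custom_ntp_conf() {
    # Refuse to clobber the live config if the saved copy is missing or empty.
    [ -s "$CUSTOM_CONF" ] || { echo "missing or empty: $CUSTOM_CONF" >&2; return 1; }
    # Nothing to do when the live file is already identical.
    if cmp -s "$CUSTOM_CONF" "$TARGET_CONF"; then
        return 0
    fi
    cp "$CUSTOM_CONF" "$TARGET_CONF" && $RESTART_CMD
}

# Call this from the startup task:
#   apply_custom_ntp_conf
```

Keep in mind that the middleware may regenerate ntp.conf whenever NTP settings are changed in the web interface, so the copy would need to be re-applied after such a change as well as at boot.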
 

invar
So just to update: I posted about this over on Jira. Someone closed out my ticket and basically said:
1) they don't consider my issue a bug; they consider it an enhancement request, and

2) they're not going to "enhance" it because they're moving on to TrueNAS SCALE, which uses something other than ntpd.

That is RIDICULOUS because from many accounts, this is a BUG. The ntp functionality does NOT function as it should and it's clear why it doesn't function as evidenced by documentation submitted above and also anecdotally from other people posting about ntp not working.

Frankly, the person brushing off the issue is an embarrassment.

Most people probably don't even realize there's an issue because they don't know to run "ntpq -p" to see the sync status, and maybe their clock doesn't drift much.
 

tolsen718

Cadet
Joined
Sep 29, 2021
Messages
1
I concur that this is a bad configuration issue. I tried Samuel's advice of changing which clock was used and had no luck with that.

I suspect the issue is that if you specify a hostname for restrict, ntp will do a reverse DNS lookup to see if it matches. This totally fails for any pool server, as an A record lookup just gives back some IPs from a pool, and their reverse PTR records point to something very different from the original hostname. That's why everything gets denied.

I looked at the ntp.conf file that gets generated in a jail, and look what we have here:

Code:
#
# Security:
#
# By default, only allow time queries and block all other requests
# from unauthenticated clients.
#
# The "restrict source" line allows peers to be mobilized when added by
# ntpd from a pool, but does not enable mobilizing a new peer association
# by other dynamic means (broadcast, manycast, ntpq commands, etc).
#
# See http://support.ntp.org/bin/view/Support/AccessRestrictions
# for more information.
#
restrict default limited kod nomodify notrap noquery nopeer
restrict source limited kod nomodify notrap noquery
#
# Alternatively, the following rules would block all unauthorized access.
#
#restrict default ignore
#
# In this case, all remote NTP time servers also need to be explicitly
# allowed or they would not be able to exchange time information with
# this server.
#
# Please note that this example doesn't work for the servers in
# the pool.ntp.org domain since they return multiple A records.
#
#restrict 0.pool.ntp.org nomodify nopeer noquery notrap
#restrict 1.pool.ntp.org nomodify nopeer noquery notrap
#restrict 2.pool.ntp.org nomodify nopeer noquery notrap
#
# The following settings allow unrestricted access from the localhost
restrict 127.0.0.1
restrict ::1

In particular, see the comment "Please note that this example doesn't work for the servers in the pool.ntp.org domain since they return multiple A records." Guess the TrueNAS developers missed that?

Here is what I am now using as my ntp.conf file with much success:

Code:
tos minclock 3 maxclock 6
pool 0.freebsd.pool.ntp.org iburst
restrict default limited kod nomodify notrap noquery nopeer
restrict source limited kod nomodify notrap noquery
restrict 127.0.0.1
restrict ::1
leapfile "/var/db/ntpd.leap-seconds.list"

This is basically a simplified version of what was in my jail (the FreeBSD default).

Hope this helps the next person who runs into this. It's sad that iXsystems is avoiding fixing bugs in TrueNAS by labelling them as "enhancement requests."
 