NTP not syncing

Constantin

Vampire Pig
Joined
May 19, 2017
Messages
1,829
Wonder if your ISP is limiting access or your IP address/block was flagged for excess NTP requests?
 

Mastakilla

Patron
Joined
Jul 18, 2019
Messages
203
fyi:
I've just discovered that my ISP (EDPnet in Belgium) was blocking port 123 in such a way that NTP client traffic didn't work (on Windows, TrueNAS, etc). 'ntpdate -u pool.ntp.org' did work though...

After contacting them, they opened the port for me and it started working again...
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
my ISP (EDPnet in Belgium) was blocking port 123 in such a way that NTP client traffic didn't work

Actually, you're exactly incorrect. :smile:

NTP *server* traffic was being blocked. That is, traffic to and from your local port 123.

Your local client, such as ntpdate -u, selects a random unprivileged UDP source port and reaches out to a remote server on port 123. This worked for you because it is the most common kind of NTP traffic, and your ISP was not blocking it.

Your local server, however, binds to port 123, and attempts to reach other servers at port 123. So ntpd itself is unable to sync, even though ntpdate is successful.
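
To make the source-port difference concrete, here is a minimal Python sketch. It is not what ntpd or ntpdate actually run internally; the function names are made up for illustration. It builds the 48-byte client request and opens a UDP socket either with an OS-chosen source port (the way 'ntpdate -u' behaves) or bound to the reserved port 123 (the way ntpd behaves, which needs root).

```python
import socket
import struct

NTP_PORT = 123

def build_client_request() -> bytes:
    """48-byte NTP request: LI=0, version=4, mode=3 (client), rest zeroed."""
    first_byte = (0 << 6) | (4 << 3) | 3
    return struct.pack("!B47x", first_byte)

def open_ntp_socket(use_reserved_port: bool = False) -> socket.socket:
    """Open a UDP socket the way each kind of program does (illustrative only).

    use_reserved_port=False: the OS picks the source port, like 'ntpdate -u'.
    use_reserved_port=True:  bind source port 123, like ntpd (requires root).
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind(("0.0.0.0", NTP_PORT if use_reserved_port else 0))
    return sock

if __name__ == "__main__":
    sock = open_ntp_socket()
    print("source port chosen by the OS:", sock.getsockname()[1])
    sock.close()
```

In both cases the destination port is 123; only the source port differs, and that is the port the reply comes back to.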

In many cases, a NAT gateway (*cough* "router" *cough*) rewrites the source port of outbound packets that leave with source port 123. Because the gateway randomizes the UDP source port, NTP often works even if the ISP has blocked 123<->123. Some NAT gateways, however, try to preserve UDP port numbers. In that case, the FIRST client on your local network to reach out via NTP will break, while the SECOND and any additional clients will not: port 123 on the NAT gateway is already in use for the first client, which forces a different port allocation for the others.
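
The port-preserving case can be sketched as a toy simulation. This assumes a heavily simplified NAT (one mapping table, no per-destination state) and assumes the ISP rule is "drop inbound UDP to customer port 123"; the class, function names, addresses, and the fallback port 40000 are all hypothetical.

```python
class PortPreservingNat:
    """Toy NAT that keeps the inside source port when it is free outside."""

    def __init__(self, first_fallback_port: int = 40000):
        self.mappings = {}  # (inside_ip, inside_port) -> outside_port
        self.next_fallback = first_fallback_port

    def outside_port(self, inside_ip: str, inside_port: int) -> int:
        key = (inside_ip, inside_port)
        if key in self.mappings:
            return self.mappings[key]
        in_use = set(self.mappings.values())
        port = inside_port
        if port in in_use:
            # Preferred port is taken, so fall back to the next free port.
            while self.next_fallback in in_use:
                self.next_fallback += 1
            port = self.next_fallback
        self.mappings[key] = port
        return port

    def expire(self, inside_ip: str, inside_port: int) -> None:
        """Drop a mapping, as happens when the translation entry times out."""
        self.mappings.pop((inside_ip, inside_port), None)

def reply_blocked_by_isp(outside_port: int) -> bool:
    """Assumed ISP rule: drop inbound UDP addressed to customer port 123."""
    return outside_port == 123

if __name__ == "__main__":
    nat = PortPreservingNat()
    for host in ("192.168.1.10", "192.168.1.11", "192.168.1.12"):
        port = nat.outside_port(host, 123)
        status = "blocked" if reply_blocked_by_isp(port) else "ok"
        print(f"{host}:123 -> outside port {port}: reply {status}")
```

Under these assumptions the first host keeps outside port 123 and loses its replies, while later hosts get remapped and sync fine until the first mapping expires and port 123 becomes free again.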

This explanation vaguely simplified for easier comprehension. :smile:
 

Mastakilla

Patron
Joined
Jul 18, 2019
Messages
203
Hi jgreco, thanks for the simplification, but I'm afraid I'm still having problems really understanding it ;)

Let me give you a bit more background on the issue I had (which was solved after my ISP opened port 123 for me).

  • NTP clients on my whole local network have been failing for more than a year. This includes multiple Windows 10 PCs and laptops and my TrueNAS server.
  • I've tried
    • Re-installing windows
    • All possible NTP settings and NTP-fix-tricks on Windows 10
    • All kinds of NTP servers from pool.ntp.org, to country specific, to specific NTP servers, both using FQDNs and IPs
    • Reconfiguring my network from double-NAT to bridged
    • Disabling Windows / router firewalls, Asus AiProtection and all other kinds of router settings
  • NTP works fine when I unplug my ethernet cable and connect to my cellphone's hotspot (which has a different ISP)
  • NTP works fine on my VM that connects to the internet via VPN
  • Only after my ISP unblocked port 123 (at least, that is what they said they did), NTP started working properly again (on all my Windows computers and my TrueNAS server)
  • On TrueNAS, I actually only started testing today. This was mostly to prove that the issue wasn't related to Windows settings. I found this thread with a similar issue, which is why I decided to post "my solution" here as well.
For TrueNAS specifically, this was the situation BEFORE my ISP unblocked port 123
Code:
data# ntpq -p
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
 ntp.cybertu.be  .INIT.          16 u    - 1024    0    0.000   +0.000   0.000
 ntp2.unix-solut .INIT.          16 u    - 1024    0    0.000   +0.000   0.000
 dns-rec-2-brudi .INIT.          16 u    - 1024    0    0.000   +0.000   0.000
 
data# service ntpd stop
Stopping ntpd.
Waiting for PIDS: 59836.

 data# ntpdate -v pool.ntp.org
 6 Dec 15:05:09 ntpdate[59268]: ntpdate 4.2.8p15-a (1)
 6 Dec 15:05:18 ntpdate[59268]: no server suitable for synchronization found

However, when using '-u' (still BEFORE my ISP unblocked port 123)
Code:
data# ntpdate -q -u -v -d -4 pool.ntp.org
 6 Dec 15:12:09 ntpdate[59400]: ntpdate 4.2.8p15-a (1)
transmit(162.159.200.1)
receive(162.159.200.1)
transmit(91.121.216.238)
receive(91.121.216.238)
transmit(78.129.36.63)
receive(78.129.36.63)
transmit(45.87.78.35)
receive(45.87.78.35)

server 162.159.200.1, port 123
stratum 3, precision -25, leap 00, trust 000
refid [10.78.8.138], root delay 0.006165, root dispersion 0.000351
reference time:      e5589866.53f772b0  Mon, Dec  6 2021 15:10:46.327
originate timestamp: e55898de.6ffa5451  Mon, Dec  6 2021 15:12:46.437
transmit timestamp:  e55898b9.8695b0ac  Mon, Dec  6 2021 15:12:09.525
delay 0.03613, dispersion 0.00000, offset +36.906350

server 91.121.216.238, port 123
stratum 2, precision -24, leap 00, trust 000
refid [193.190.230.37], root delay 0.004517, root dispersion 0.029449
reference time:      e55895b2.cdcb4ea7  Mon, Dec  6 2021 14:59:14.803
originate timestamp: e55898de.a50468dd  Mon, Dec  6 2021 15:12:46.644
transmit timestamp:  e55898b9.bb619eb0  Mon, Dec  6 2021 15:12:09.731
delay 0.03914, dispersion 0.00000, offset +36.905817

server 78.129.36.63, port 123
stratum 3, precision -22, leap 00, trust 000
refid [193.190.198.14], root delay 0.011200, root dispersion 0.115738
reference time:      e558981c.af5df9ab  Mon, Dec  6 2021 15:09:32.685
originate timestamp: e55898de.d90df016  Mon, Dec  6 2021 15:12:46.847
transmit timestamp:  e55898b9.ee94a91e  Mon, Dec  6 2021 15:12:09.931
delay 0.04633, dispersion 0.00000, offset +36.905506

server 45.87.78.35, port 123
stratum 2, precision -25, leap 00, trust 000
refid [109.68.160.223], root delay 0.000610, root dispersion 0.001068
reference time:      e55895ca.650e9eba  Mon, Dec  6 2021 14:59:38.394
originate timestamp: e55898df.08fd426e  Mon, Dec  6 2021 15:12:47.035
transmit timestamp:  e55898ba.1fa77a5c  Mon, Dec  6 2021 15:12:10.123
delay 0.03653, dispersion 0.00000, offset +36.905996

 6 Dec 15:12:10 ntpdate[59400]: step time server 45.87.78.35 offset +36.905996 sec

And this is the situation AFTER my ISP unblocked port 123
Code:
data# ntpq -p
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
+ntp.rack66.net  193.79.237.14    2 u   58   64  377   11.066   +7.634   0.301
+94-224-67-24.ac 192.53.103.108   2 u   25   64  377   21.906   +7.806   1.934
*ntp1.unix-solut 85.82.99.131     2 u   18   64  377   10.474   +7.758   0.473

data# service ntpd stop
Stopping ntpd.
Waiting for PIDS: 59836.

data# ntdate -v pool.ntp.org
-su: ntdate: command not found
data# ntpdate -v pool.ntp.org
 6 Dec 21:33:54 ntpdate[65916]: ntpdate 4.2.8p15-a (1)
 6 Dec 21:34:01 ntpdate[65916]: adjust time server 45.87.76.3 offset +0.008038 sec


As you can see, also on TrueNAS, things started magically working again (even without the '-u') after my ISP unblocked port 123.
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
working again (even without the '-u')

You can stop worrying about "the -u". It only applies to ntpdate, where it makes ntpdate act as an unprivileged client (using an unprivileged source port instead of 123), and it is therefore irrelevant to what I'm saying.

The point is that server-to-server traffic, such as from Windows time service, or UNIX ntpd, has a SOURCE port of 123.

Your ISP is(/was) blocking traffic addressed TO you with a destination port of 123. This is because there have been misconfigured NTP daemons that were acting as traffic amplifiers, and one of the easiest "fixes" is just to block destination port 123.

So when you SEND a packet from your 123 to a pool.ntp.org server's port 123, that traffic passes, but the return response from the remote server's port 123 to your port 123 is BLOCKED by your ISP.

So when you run either ntpd, which is ALWAYS a server, or ntpdate without the -u, which ALSO uses the reserved port 123, your return path traffic was being blocked by your ISP. This wasn't the case over your hotspot. Your ISP probably had Windows default time servers exempted as well, but using pool servers would be problematic.

However, the weird thing about such blocking is that NAT gateways often create ambiguity in it. If two of your clients send NTP requests in close temporal proximity, there will be a NAT entry for one that resides on port 123, and the other one will have to be mapped to a different port. So you get this really weird thing where "some" servers might work "once in a while" "for a little bit", until the NAT translation entry times out (usually 30s to 5min).
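
As a rough model of the round trip described above, here is a short Python sketch. It assumes the ISP rule is exactly "drop packets addressed to the customer's port 123" and ignores NAT; the type, the names, and the example port 34567 are illustrative, not anything the ISP or ntpd actually exposes.

```python
from dataclasses import dataclass

CUSTOMER = "you"

@dataclass(frozen=True)
class UdpPacket:
    src: str
    sport: int
    dst: str
    dport: int

    def reply(self) -> "UdpPacket":
        """The server's answer: addresses and ports swapped."""
        return UdpPacket(self.dst, self.dport, self.src, self.sport)

def isp_drops(pkt: UdpPacket) -> bool:
    """Assumed rule: drop packets addressed TO the customer's port 123."""
    return pkt.dst == CUSTOMER and pkt.dport == 123

def sync_succeeds(local_sport: int) -> bool:
    """A sync needs both the request and its reply to get through."""
    request = UdpPacket(CUSTOMER, local_sport, "pool-server", 123)
    return not isp_drops(request) and not isp_drops(request.reply())

if __name__ == "__main__":
    print("ntpd / ntpdate (source port 123):", sync_succeeds(123))
    print("ntpdate -u (source port 34567):  ", sync_succeeds(34567))
```

The request passes in both cases; only the reply to source port 123 is addressed to you:123 and gets dropped.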
 

Mastakilla

Patron
Joined
Jul 18, 2019
Messages
203
Ok now I think I understand what you're saying. And indeed, I can confirm that occasionally (very rarely though) a time sync did get through...

However, I don't think my ISP has whitelisted the Windows default time servers (e.g.: time.windows.com), as even they didn't work (even a freshly installed Windows didn't have working time sync)

I do wonder why I seem to be the only person who bumped into this issue with my ISP? Thousands of people must have exactly the same issue as me, no? And for many years already... (I think they started blocking port 123 around 2013)

Also, how are other ISPs solving this? As I understand it, my ISP was one of the last to block these ports, as before the DDoS abuse they had a strict never-block-anything policy.
I remember one of my older ISPs blocking everything under 1023. But time sync did work with them... Did they do this by whitelisting the default servers?

Well, although I'm still a bit confused, I am very happy that after dozens of hours of wasted time, I (or more precisely, my ISP) was finally able to solve this issue...
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
Well, speaking as the operator of a service provider who blocked NTP back in the '90's for various reasons, and actually unblocked it early last decade, the reasons for these things can be somewhat opaque to end users. As network operators, it is generally our desire to make networks do the right thing while also not doing the wrong thing with the least amount of impact on customer operations.

In 2013, a common NTP misconfiguration was being abused as a DDoS amplifier, which created a situation where a network operator would end up being the source of a large amount of traffic. Because bandwidth is a direct cost issue for ISPs, having dozens or hundreds of vulnerable customers upstreaming many megabits of bandwidth could result in significant additional transit costs for an ISP, and it may be deemed cheaper to convert this to a tech support cost by blocking the traffic and having affected customers contact the helpdesk.

Normally, an ISP wouldn't do this for every single possible use of NTP, and would carefully limit the scope of damage to the amount necessary to remediate the issue. In this case, for example, the ISP doesn't block *:123 entirely, but just you:123. Some additional analysis was probably done to identify how much customer impact this would have, and somebody at the ISP probably learned all about Windows timekeeping. I think Windows also has a fallback TCP clock setting protocol, which may be what time.windows.com uses if NTP doesn't seem to be doing the trick.
 

Mastakilla

Patron
Joined
Jul 18, 2019
Messages
203
Thanks for the clarification!

All my computers have been using the CMOS clock for more than a year. That includes my TrueNAS server and multiple Windows 10 machines, some with non-default NTP settings, some with default NTP settings.
Only after my ISP unblocked port 123, it started working again. So I guess that my ISP actually did block '*:123' instead of 'me:123'. They didn't carefully limit the scope as you did :wink:
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
Only after my ISP unblocked port 123, it started working again. So I guess that my ISP actually did block '*:123' instead of 'me:123'.

No, you're still not quite getting what I'm saying. If they had blocked *:123, your ntpdate -u wouldn't have worked either, because it sends a UDP packet from, let's say for example, you:34567->remotentp:123, and blocking *:123 would have blocked that. A blanket block like that would probably be unjustifiably paranoid and would break lots of stuff.
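
One way to check this against what was observed before the unblock is a small Python sketch. It is purely illustrative: the two "hypothesis" functions are simplified stand-ins for possible ISP rules, and 34567 is just the example port from above.

```python
# (description, local source port, did it work BEFORE the ISP unblocked?)
OBSERVATIONS = [
    ("ntpdate -u", 34567, True),
    ("ntpd", 123, False),
    ("ntpdate without -u", 123, False),
]

REMOTE_PORT = 123  # pool servers always answer from port 123

def works_if_any_123_blocked(local_port: int) -> bool:
    """Hypothesis '*:123': any packet with port 123 on either end is dropped."""
    return local_port != 123 and REMOTE_PORT != 123

def works_if_customer_123_blocked(local_port: int) -> bool:
    """Hypothesis 'you:123': only packets to the customer's port 123 are dropped."""
    return local_port != 123

def consistent(hypothesis) -> bool:
    """True if the hypothesis predicts every observation correctly."""
    return all(hypothesis(port) == worked for _, port, worked in OBSERVATIONS)

if __name__ == "__main__":
    print("*:123 blocked fits the observations:  ", consistent(works_if_any_123_blocked))
    print("you:123 blocked fits the observations:", consistent(works_if_customer_123_blocked))
```

Only the you:123 hypothesis fits, because a *:123 block predicts that ntpdate -u would have failed too.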

Sorry, but I tend to drag horses to water and then force them to drink. ;-)
 

Mastakilla

Patron
Joined
Jul 18, 2019
Messages
203
Then I don't know what exactly they did, but I'm glad they fixed it :)
 