Date and Uptime increment rapidly

Status
Not open for further replies.

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
It's quite likely that UPS monitoring problems are a symptom, and not the cause, of the problem here. Expect the NUT upgrade to be meaningless.

Most packages that are expecting an external device to do something will periodically wake up, time out, or otherwise take time-dependent action when an event doesn't happen. So, for example, if NUT thinks it hasn't heard from the UPS in 10 minutes, it may cry that there's a problem, try to re-establish the connection, and then go on.

But if the system's time is jumping for some reason, it could be that seconds are passing normally here in the real world, while to the processes on the computer, suddenly it's 10 minutes later. The processes running on your computer do not have an independent sense of time, and are reliant on the system to provide accurate time. If the system time is poor, or unreliable, problems will result. NUT will not realize that it's only been a real-world second since it last heard from the UPS, and would see a system time jump as being ten minutes since last communications. This wouldn't be a bug in NUT, but rather your system.

So I wouldn't bother worrying about NUT. Figure out getting stable time. NUT will "fix" itself (fsvo "fix" that means the symptoms go away, since nothing is technically wrong with NUT).
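
To make the point above concrete, here is a minimal Python sketch of a watchdog that decides a device has gone silent. This is not NUT's actual code: `SilenceWatchdog` and `FakeClock` are hypothetical names, and the clock is passed in as a parameter only so that a jumping system clock can be simulated.

```python
class FakeClock:
    """Simulated system clock (hypothetical helper) that can be made to jump."""

    def __init__(self, start=0.0):
        self.now = start

    def __call__(self):
        return self.now


class SilenceWatchdog:
    """Hypothetical watchdog: flags a problem after `limit` seconds of silence."""

    def __init__(self, clock, limit=600.0):
        self.clock = clock            # callable returning "now" in seconds
        self.limit = limit            # 600 s = the 10 minutes from the example
        self.last_heard = clock()

    def heard_from_device(self):
        # Record the time of the latest message from the device.
        self.last_heard = self.clock()

    def device_silent(self):
        # The watchdog only knows what the clock tells it.
        return self.clock() - self.last_heard >= self.limit


clock = FakeClock()
dog = SilenceWatchdog(clock)

clock.now += 1                 # one real second passes
print(dog.device_silent())     # False

clock.now += 1200              # the system clock jumps 20 minutes
print(dog.device_silent())     # True, although almost no real time has passed
```

A program that measures intervals with a monotonic clock (`time.monotonic()` in Python) is not affected by someone resetting the wall clock, but it still depends on the kernel keeping sane time underneath, so it would not necessarily survive the kind of fault discussed here.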
 

Z300M

Guru
Joined
Sep 9, 2011
Messages
882
I think you must be correct: I updated to 8.0.3RC1, but the problem remains. I disconnected the UPS and disabled the UPS service, but the time still jumped 40 minutes overnight.

But I think I have found previously that the time shown by FreeNAS does not necessarily correspond to the time displayed by the motherboard BIOS. I'll have to reboot and check again. But I also see that on many occasions when I have reset the time from the console, a few minutes later there is the message

"freenas smartd[1462]: System clock time adjusted to the past. Resetting next wakeup time."
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
That smartd message may be completely reasonable. It has noticed that time did something screwy - a fact many applications may ignore - and has updated its wakeup time. Think about it. A very simplistic smartd might have code that schedules a wakeup 3600 seconds after the previous wakeup. However, if you reset the time backwards, then that next wakeup is scheduled further out. If your system clock reads 12/25/2020, and you reset to 12/25/2011, your next wakeup might not come for many years, which probably isn't the behaviour you wish from smartd.
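
The scheduling problem described above can be sketched in a few lines of Python. This is an illustration of the idea only, not smartd's real logic; `next_wakeup` and `INTERVAL` are hypothetical names.

```python
from datetime import datetime, timedelta

INTERVAL = timedelta(seconds=3600)   # the 3600-second wakeup from the example


def next_wakeup(now, scheduled, interval=INTERVAL):
    """Return the next wakeup time, resetting it if the clock moved backwards."""
    if scheduled - now > interval:
        # The wakeup is further away than one full interval, so the clock
        # must have been set back. Reschedule relative to the new "now".
        return now + interval
    return scheduled


# Wakeup scheduled while the clock still read 12/25/2020.
scheduled = datetime(2020, 12, 25) + INTERVAL

# Clock is then reset to 12/25/2011: a naive daemon would wait this long.
naive_wait = scheduled - datetime(2011, 12, 25)
print(naive_wait.days, "days until the naive wakeup")

# With the check, the wakeup is pulled back to one interval from now.
print(next_wakeup(datetime(2011, 12, 25), scheduled))
```

Logging a message when the reset branch is taken is what would produce a line like the one quoted from smartd.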
 

Z300M

Guru
Joined
Sep 9, 2011
Messages
882
I have now changed three things all at about the same time, so I can't be sure which fixed the problem, but everything does now seem to be OK.

Changes:

1. Updated motherboard BIOS again.

2. Upgraded to FreeNAS 8.0.3RC3.

3. Changed the time zone from America/Detroit to Etc/UTC. It had finally sunk in, after far too long, that when the system reported that the required time correction was "insane", it was telling me to reset the time manually to the correct UTC (yes, UTC) time.

As I said, now everything is fine so far.
 

Z300M

Guru
Joined
Sep 9, 2011
Messages
882
20 minute jumps

(First, something strange is going on here: I received an email informing me of a response by gcooper, which I could read in the email but cannot find here.)

I "spoke" too soon: when I rebooted the machine after making the changes I noted above, the time was correct at Sun Jan 1 01:57:34 UTC 2012. I checked the time (at the console) from time to time, and it was still correct at Tue Jan 3 01:46:46 UTC 2012.

But then at Jan 3 03:46:12 there is a message that "time correction of -1200 seconds exceeds sanity limit (1000)", and now the time is 20 minutes ahead.

The advances are always 20 minutes or a multiple thereof now -- or maybe the apparent multiples of 20 minutes are simply because two or more separate 20-minute jumps occurred between checks. This is not simply the clock on the motherboard running fast. What could be causing jumps of 20 minutes at a time? And why was everything OK for 48 hours? And how can I change the automatic time correction so that -1200 seconds is not considered to be "insane," and perhaps have time checks/corrections frequently enough so that there should not be more than one 20-minute advance between checks?
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
I'm guessing that it is ntpd that's refusing to jump more than 1000 seconds. This is probably correct behaviour; Harlan and the NTP guys have lots of good reasons for the things they do. NTP is supposed to jump the clock at startup (traditionally via ntpdate, though I believe this may be integrated into ntpd itself these days) prior to beginning daemon operations, at which point ntpd is supposed to be able to keep track of time to a high degree of accuracy, and a jump is theoretically just not supposed to ever be needed, unless something is seriously wrong. ntpd should not be used to adjust an insane clock, and probably won't work as expected if you try to make it adjust an insane clock. You really need to sit down and identify the problem.
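
As a rough illustration of the check behind that log message, the Python sketch below parses the quoted line and applies the same kind of limit. It is not ntpd's code; `parse_sanity_message` and `within_sanity_limit` are hypothetical helpers, and the message format is taken from the line quoted in this thread.

```python
import re

# Matches the message quoted earlier in the thread.
SANITY_RE = re.compile(
    r"time correction of (-?\d+) seconds exceeds sanity limit \((\d+)\)"
)


def parse_sanity_message(line):
    """Return (correction_seconds, limit_seconds), or None if no match."""
    m = SANITY_RE.search(line)
    if m is None:
        return None
    return int(m.group(1)), int(m.group(2))


def within_sanity_limit(correction, limit=1000):
    """Sketch of the check: corrections larger than the limit are refused."""
    return abs(correction) <= limit


line = "time correction of -1200 seconds exceeds sanity limit (1000)"
correction, limit = parse_sanity_message(line)
print(correction, limit, within_sanity_limit(correction, limit))
```

A correction of -1200 seconds means the clock would have to be moved back 20 minutes, which matches the reported 20-minute advance. ntpd's documentation calls this limit the panic threshold, and it can be changed with `tinker panic` in ntp.conf, but raising it only hides the jumps rather than explaining them.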

I'm away from the office right now and just wasting some time, so I can't really look at any of our FreeNAS boxes easily. I don't have any great ideas, either, sorry. What you might try is killing off ntpd when this happens and see what is happening to your system time over the next few hours. Is the time staying consistently N.00 minutes ahead? Is it racing ahead at 1.5x speed or something like that?
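
One way to answer those two questions is to note the system time against a trusted reference (a watch, another machine) a few times and compare. The Python sketch below assumes you collect such pairs by hand; `classify_drift` and its `tolerance` parameter are hypothetical.

```python
def classify_drift(samples, tolerance=1.0):
    """Classify clock behaviour from (reference_seconds, system_seconds) pairs.

    `samples` must be ordered oldest first and contain at least two pairs
    taken at different reference times.

    Returns ("steady", offset) if the offset stays constant to within
    `tolerance` seconds, otherwise ("drifting", rate), where rate is the
    number of system seconds that elapsed per reference second.
    """
    offsets = [sys_t - ref_t for ref_t, sys_t in samples]
    if max(offsets) - min(offsets) <= tolerance:
        return "steady", offsets[-1]
    (ref0, sys0), (ref1, sys1) = samples[0], samples[-1]
    return "drifting", (sys1 - sys0) / (ref1 - ref0)


# Clock sits exactly 20 minutes ahead for two hours.
print(classify_drift([(0, 1200), (3600, 4800), (7200, 8400)]))

# Clock gains 30 minutes in one hour, i.e. runs at 1.5x speed.
print(classify_drift([(0, 0), (3600, 5400)]))
```

Note that a sudden step between two samples also shows up as "drifting" here, so more frequent samples are needed to tell a fast-running clock from discrete 20-minute jumps.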
 
gcooper

Guest
Another potential problem: the crystal that handles core/time synchronization on your motherboard / CPU is busted or isn't doing the right thing with FreeBSD (in which case you'll see erratic behavior scheduling interrupts and other generic forms of time keeping). Try a different timecounter as root, e.g.

Code:
# sysctl kern.timecounter.choice
kern.timecounter.choice: TSC(-100) HPET(900) ACPI-safe(850) i8254(0) dummy(-1000000)
# sysctl kern.timecounter.hardware=ACPI-safe
kern.timecounter.hardware: HPET -> ACPI-safe


If it fixes your issue, I would look for a BIOS firmware update as it might be a firmware bug that was fixed in later revisions; alternatively, I would talk to your CPU / motherboard manufacturer about running more in-depth hardware tests (the Dell diags for instance provide a timecounter hardware test, amongst other things that you can try).
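
The available timecounter names differ from machine to machine (for example `ACPI-fast` rather than `ACPI-safe`), so the value given to `kern.timecounter.hardware` has to come from that machine's own `kern.timecounter.choice` list. The Python sketch below parses that output and lists the alternatives to try. The helper names are hypothetical, and skipping negative-quality counters is an assumption made for this example.

```python
import re


def parse_timecounter_choice(output):
    """Parse `sysctl kern.timecounter.choice` output into {name: quality}."""
    pairs = re.findall(r"([\w-]+)\((-?\d+)\)", output)
    return {name: int(quality) for name, quality in pairs}


def candidates(output, current):
    """Names other than `current` with non-negative quality, best first."""
    choice = parse_timecounter_choice(output)
    usable = [n for n, q in choice.items() if q >= 0 and n != current]
    return sorted(usable, key=lambda n: choice[n], reverse=True)


sample = ("kern.timecounter.choice: TSC(-100) HPET(900) ACPI-safe(850) "
          "i8254(0) dummy(-1000000)")
print(parse_timecounter_choice(sample))
print(candidates(sample, current="HPET"))
```

Each candidate name can then be tried in turn with `sysctl kern.timecounter.hardware=NAME`, watching whether the jumps stop.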
 

Z300M

Guru
Joined
Sep 9, 2011
Messages
882
Another potential problem: the crystal that handles core/time synchronization on your motherboard / CPU is busted or isn't doing the right thing with FreeBSD (in which case you'll see erratic behavior scheduling interrupts and other generic forms of time keeping).

Would this account for the time sometimes remaining OK for a day or more (when I check at the console) but then jumping in increments of 20 minutes?

Try a different timecounter as root, e.g.

Code:
# sysctl kern.timecounter.choice
kern.timecounter.choice: TSC(-100) HPET(900) ACPI-safe(850) i8254(0) dummy(-1000000)
# sysctl kern.timecounter.hardware=ACPI-safe
kern.timecounter.hardware: HPET -> ACPI-safe

# sysctl kern.timecounter.choice

returns

kern.timecounter.choice: TSC(-100) HPET(900) ACPI-fast(1000) i8254(0) dummy(-1000000)

When I enter

kern.timecounter.choice: TSC(-100) HPET(900) ACPI-safe(900) i8254(0) dummy(-1000000)

I get

Badly placed (.

When I enter

sysctl kern.timecounter.hardware=ACPI-safe

I get

sysctl kern.timecounter.hardware=ACPI-safe: Invalid argument

When I enter

kern.timecounter.hardware: HPET -> ACPI-safe

I get

ACPI-safe: Read-only file system.


If it fixes your issue, I would look for a BIOS firmware update as it might be a firmware bug that was fixed in later revisions; alternatively, I would talk to your CPU / motherboard manufacturer about running more in-depth hardware tests (the Dell diags for instance provide a timecounter hardware test, amongst other things that you can try).

I do have the latest BIOS update, but I'll see what else I can find out.
 

Z300M

Guru
Joined
Sep 9, 2011
Messages
882
After updating to FreeNAS 8.0.4 and updating to the latest BIOS, it ran for weeks with the correct time. I was about to report that here when I thought that I should just check one more time, only to find that now there were messages with dates several days apart and that it was now a month ahead. As before, the increments always seem to be multiples of 20 minutes. I haven't been able to find a diagnostics tool.
 

Z300M

Guru
Joined
Sep 9, 2011
Messages
882
I updated to 8.0.4-RELEASE-p1, and it ran for weeks with the correct time, the two nightly reports always being received with the correct date and time (same time each night). Yesterday night's security run output came at the usual time, but the daily run output (including a scrub of two pools) did not come until 4 hours later. Now tonight's reports are both dated 06/08/12.
 

Z300M

Guru
Joined
Sep 9, 2011
Messages
882
It's happened again: everything was fine for 35 days -- until the next scrub operation on Saturday night/Sunday morning (06/16-17/2012). Now the date is 12/08/2012.
 

negozio

Cadet
Joined
Jun 11, 2012
Messages
7
I have the same problem with the same motherboard and FreeNAS 8.2.
The clock goes crazy: it jumps forward 20 minutes at a time when we copy files from a workstation to the FreeNAS box. Is updating the BIOS the solution?
 

negozio

Cadet
Joined
Jun 11, 2012
Messages
7
The BIOS update didn't work; the system still runs 20 minutes ahead.
 