That is a VERY huge chunk of missing time alright!
The act of waiting would only have been needed if your log "turned over", as I showed in the picture file I posted above.
Jun 24 09:00:44 freenas autorepl.py: [common.pipesubr:61] Popen()ing: /usr/bin/ssh -i /data/ssh/replication -o BatchMode=yes -o StrictHostKeyChecking=yes -o ConnectTimeout=7 -p 22 freenas-2.intranet.domain.com "zfs destroy -d 'trunk/Remote/jails@auto-20160617.0900-1w'"
Jun 24 09:00:45 freenas autorepl.py: [common.pipesubr:61] Popen()ing: /usr/bin/ssh -i /data/ssh/replication -o BatchMode=yes -o StrictHostKeyChecking=yes -o ConnectTimeout=7 -p 22 freenas-2.intranet.domain.com "zfs destroy -d 'trunk/Remote@auto-20160617.0900-1w'"
Jun 25 00:00:00 freenas syslog-ng[6885]: Configuration reload request received, reloading configuration;
Jun 25 03:30:01 freenas cachetool.py: [common.pipesubr:61] Popen()ing: klist
Jun 25 03:30:03 freenas cachetool.py: [common.pipesubr:61] Popen()ing: klist
Jun 25 09:00:03 freenas autosnap.py: [tools.autosnap:61] Popen()ing: /sbin/zfs snapshot -r "trunk@auto-20160625.0900-1y"
Everything is normal as of 9pm on the 7th, then at some unknown point before hour 21:20
Besides a huge chunk of missing time in the messages.0.bz2 file before the reboot, everything looked peachy:
Code:
Jul 7 09:00:22 freenas autorepl.py: [common.pipesubr:61] Popen()ing: /usr/bin/ssh -i /data/ssh/replication -o BatchMode=yes -o StrictHostKeyChecking=yes -o ConnectTimeout=7 -p 22 freenas-2.intranet.domain.com "zfs destroy -d 'trunk/Remote/MM@auto-20160630.0900-1w'"
Jul 7 09:00:23 freenas autorepl.py: [common.pipesubr:61] Popen()ing: /usr/bin/ssh -i /data/ssh/replication -o BatchMode=yes -o StrictHostKeyChecking=yes -o ConnectTimeout=7 -p 22 freenas-2.intranet.domain.com "zfs destroy -d 'trunk/Remote/Backups@auto-20160630.0900-1w'"
Jul 7 09:00:24 freenas autorepl.py: [common.pipesubr:61] Popen()ing: /usr/bin/ssh -i /data/ssh/replication -o BatchMode=yes -o StrictHostKeyChecking=yes -o ConnectTimeout=7 -p 22 freenas-2.intranet.domain.com "zfs destroy -d 'trunk/Remote/jails@auto-20160630.0900-1w'"
Jul 7 09:00:25 freenas autorepl.py: [common.pipesubr:61] Popen()ing: /usr/bin/ssh -i /data/ssh/replication -o BatchMode=yes -o StrictHostKeyChecking=yes -o ConnectTimeout=7 -p 22 freenas-2.intranet.domain.com "zfs destroy -d 'trunk/Remote@auto-20160630.0900-1w'"
Jul 7 21:20:23 freenas syslog-ng[1875]: syslog-ng starting up; version='3.6.4'
Jul 7 21:20:23 freenas Copyright (c) 1992-2016 The FreeBSD Project.
Jul 7 21:20:23 freenas Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
Jul 7 21:20:23 freenas The Regents of the University of California. All rights reserved.
Jul 7 21:20:23 freenas FreeBSD is a registered trademark of The FreeBSD Foundation.
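For anyone who wants to hunt for gaps like this without eyeballing the file, here is a rough Python sketch. It is only an illustration: the function name `find_gaps`, the 6-hour threshold, and the assumed year are my own choices, not anything FreeNAS ships. It reads syslog-style lines and reports consecutive entries that are further apart than the threshold. Note that the June excerpt in this thread also has quiet spells of many hours between normal entries, so a gap by itself may not prove anything.

```python
from datetime import datetime, timedelta


def find_gaps(lines, year=2016, threshold=timedelta(hours=6)):
    """Return (previous, current, delta) for consecutive log entries
    whose timestamps are more than `threshold` apart.

    Syslog timestamps carry no year, so `year` is an assumption.
    """
    gaps = []
    prev = None
    for line in lines:
        parts = line.split(None, 3)
        if len(parts) < 3:
            continue  # not enough fields for "Mon D HH:MM:SS"
        try:
            ts = datetime.strptime(
                f"{year} {parts[0]} {parts[1]} {parts[2]}",
                "%Y %b %d %H:%M:%S",
            )
        except ValueError:
            continue  # skip lines that do not start with a timestamp
        if prev is not None and ts - prev > threshold:
            gaps.append((prev, ts, ts - prev))
        prev = ts
    return gaps


# Timestamps taken from the excerpt above; the long ssh commands are shortened.
sample = [
    "Jul 7 09:00:24 freenas autorepl.py: [common.pipesubr:61] Popen()ing: ...",
    "Jul 7 09:00:25 freenas autorepl.py: [common.pipesubr:61] Popen()ing: ...",
    "Jul 7 21:20:23 freenas syslog-ng[1875]: syslog-ng starting up; version='3.6.4'",
]

for before, after, delta in find_gaps(sample):
    print(f"{before} -> {after} ({delta})")
```

To run it against a real file, pass the open file object instead of `sample`; for a rotated log such as messages.0.bz2, Python's `bz2.open(path, "rt")` should work the same way.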
I'm unclear how waiting for this to happen again will help things -- I'll be right back where I am now unless there's a way to halt the reboot to see the kernel panic info? Or are you saying that normally when there is a kernel panic, the system doesn't reboot?
If it was a power issue, we shouldn't see this time gap though, correct?
It's early in the debug process, but my
guess at this point is the power supply.
Please keep us posted so others can refer back to this thread and learn from your experience,
as you work towards a solution.
The PSU test came up clean using this device: http://www.thermaltake.com/products-model.aspx?id=C_00001777
I tested with both the 24-pin connector and what they call the peripheral connector. This particular PSU does not have a CPU plug.
Can you generate a debug file and send it to me via a PM in the forums? I will be able to tell you if the system crashed or not, but if there's a hardware failure the logs may not indicate that a failure occurred.
Done. In the meantime I've booted up Hiren's Boot CD 15.2 and am running Memtest86 v4.20. I plan on running it over the weekend. If you know of something more comprehensive, I'm game.
I have limited knowledge of advanced diagnostic testing of power supplies, but do know
that the tester you linked to doesn't put a proper load on the PSU and therefore only tests
basic voltage output which may not be affected until a load is placed on the components.
AFAIK, a load test will sometimes have to remain ongoing for hours until an issue occurs.
Maybe someone else might be able to expound on advanced PSU testing methods.
I have used the PassMark version the last few times I've tested RAM.
http://www.memtest86.com/features.htm
The page shows their claims of features compared to the older v4 software,
you can see the ECC differences listed there FWIW.
Do you mean the Pro version? I've just purchased it and am running it now. Fancy.
The free version was what I was referring to; I have not even priced the Pro version. I'm green with envy :D
Oooh, can you try injecting ECC errors? That would be extremely informational.