Greetings,
Stats:
cpu: intel i7 3930k
ram: g.skill 64GB kit (F3-12800CL10Q2-64GBZL)
motherboard: Asus Rampage IV Extreme
NIC: Intel x540-T2
HBAs: 2x Avago LSI 9300-16i, 1x Avago LSI 9300-8e
Storage:
0) 2x Norco DS24E, one populated w/24 IBM Deskstar 750GB SATA disks and the other w/24 Seagate ST4000VN000 4TB 'NAS' drives.
1) 2x Quantum Superloader 3 SAS LTO6 tape autoloaders.
I pulled the trigger on the 9.10 upgrade from 9.3 stable this last Sunday evening (20160925). I did something foolish wrt pkg updates and ended up hosing samba. I had been meaning to upgrade anyway, so this was a good excuse to do so. Silly me, it seems. The LSI card fw was at phase 9, so I had to update it to phase 12; after the upgrade to 9.10, the dashboard 'traffic light' was blinking yellow to flag that issue. Beyond that, the only other issue was that, w/9.10, uftdi is no longer either in the kernel or autoloading. I added the tunable to load the module and got my UPS monitoring working, then acquired the bits to perform the UEFI controller fw updates and did so.
I thought all was well, until the first backup job under the new version launched. That's when the badness started. It's worth noting that I did not elect to wipe my jails. I have 2 that I particularly care about: the P5 backup installation and the plex media server. Restoring the latter's meta metrics and corrected bad matches would be far more obnoxious than I cared to undertake. They both came up and work fine. I had thought P5 was the culprit because most of the punts (kernel panics) happened when I was attempting to run the backup, at the point data should be streaming to the tape. The two exceptions were a smartd run and a grep over a dumped ktrace of the P5 process.
Here are the various trap blocks for the dumps I have:
Code:
-- msgbuf_20160927_1439.txt --
Fatal trap 12: page fault while in kernel mode
cpuid = 3; apic id = 03
fault virtual address   = 0x8091d1368
fault code              = supervisor read data, page not present
instruction pointer     = 0x20:0xffffffff80da9ed4
stack pointer           = 0x28:0xfffffe104adfd430
frame pointer           = 0x28:0xfffffe104adfd440
code segment            = base 0x0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 5061 (smartd)

-- msgbuf_20160927_1727.txt --
Fatal trap 12: page fault while in kernel mode
cpuid = 4; apic id = 04
fault virtual address   = 0x817ffd268
fault code              = supervisor read data, page not present
instruction pointer     = 0x20:0xffffffff80da9ed4
stack pointer           = 0x28:0xfffffe104a11c3e0
frame pointer           = 0x28:0xfffffe104a11c3f0
code segment            = base 0x0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 0 (zio_read_intr_4_2)

-- msgbuf_20160927_1755.txt --
Fatal trap 12: page fault while in kernel mode
cpuid = 11; apic id = 0b
fault virtual address   = 0x8181dd268
fault code              = supervisor read data, page not present
instruction pointer     = 0x20:0xffffffff80da9ed4
stack pointer           = 0x28:0xfffffe1049dee400
frame pointer           = 0x28:0xfffffe1049dee410
code segment            = base 0x0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 0 (zio_read_intr_7_3)

-- msgbuf_20160927_1845.txt --
Fatal trap 12: page fault while in kernel mode
cpuid = 5; apic id = 05
fault virtual address   = 0x817fe2d68
fault code              = supervisor read data, page not present
instruction pointer     = 0x20:0xffffffff80da9ed4
stack pointer           = 0x28:0xfffffe104a1803e0
frame pointer           = 0x28:0xfffffe104a1803f0
code segment            = base 0x0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 0 (zio_read_intr_5_8)

-- msgbuf_20160927_2159.txt --
Fatal trap 12: page fault while in kernel mode
cpuid = 0; apic id = 00
fault virtual address   = 0x80938fb68
fault code              = supervisor read data, page not present
instruction pointer     = 0x20:0xffffffff80da9ed4
stack pointer           = 0x28:0xfffffe104af39a00
frame pointer           = 0x28:0xfffffe104af39a10
code segment            = base 0x0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 23949 (grep)
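For anyone wanting to compare the dumps side by side, here is a minimal Python sketch that pulls the fault address, instruction pointer, and process out of each trap block and groups them by instruction pointer. The names `parse_traps`, `TRAP_RE`, and `SAMPLE` are made up for illustration, and `SAMPLE` is trimmed to just the fields the regex needs; the values are copied from the dumps above.

```python
import re
from collections import Counter

# Trimmed copies of the five trap blocks above (only the fields used below).
SAMPLE = """\
-- msgbuf_20160927_1439.txt -- Fatal trap 12: page fault while in kernel mode
fault virtual address = 0x8091d1368
instruction pointer = 0x20:0xffffffff80da9ed4
current process = 5061 (smartd)
-- msgbuf_20160927_1727.txt -- Fatal trap 12: page fault while in kernel mode
fault virtual address = 0x817ffd268
instruction pointer = 0x20:0xffffffff80da9ed4
current process = 0 (zio_read_intr_4_2)
-- msgbuf_20160927_1755.txt -- Fatal trap 12: page fault while in kernel mode
fault virtual address = 0x8181dd268
instruction pointer = 0x20:0xffffffff80da9ed4
current process = 0 (zio_read_intr_7_3)
-- msgbuf_20160927_1845.txt -- Fatal trap 12: page fault while in kernel mode
fault virtual address = 0x817fe2d68
instruction pointer = 0x20:0xffffffff80da9ed4
current process = 0 (zio_read_intr_5_8)
-- msgbuf_20160927_2159.txt -- Fatal trap 12: page fault while in kernel mode
fault virtual address = 0x80938fb68
instruction pointer = 0x20:0xffffffff80da9ed4
current process = 23949 (grep)
"""

# One match per trap block; DOTALL lets ".*?" span line breaks, so the
# pattern works on both one-line and multi-line pastes of the dumps.
TRAP_RE = re.compile(
    r"--\s*(?P<file>\S+)\s*--"
    r".*?fault virtual address\s*=\s*(?P<addr>0x[0-9a-f]+)"
    r".*?instruction pointer\s*=\s*0x[0-9a-f]+:(?P<ip>0x[0-9a-f]+)"
    r".*?current process\s*=\s*(?P<pid>\d+)\s*\((?P<proc>[^)]+)\)",
    re.DOTALL,
)


def parse_traps(text):
    """Return one dict per trap block with file, addr, ip, pid and proc."""
    return [m.groupdict() for m in TRAP_RE.finditer(text)]


if __name__ == "__main__":
    traps = parse_traps(SAMPLE)
    # Count how many traps hit each instruction pointer.
    for ip, count in Counter(t["ip"] for t in traps).items():
        procs = ", ".join(t["proc"] for t in traps if t["ip"] == ip)
        print(f"{ip}: {count} trap(s) in {procs}")
```

Run against the five dumps above, this reports a single instruction pointer (0xffffffff80da9ed4) shared by every trap, whichever process was current at the time.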
The zio_read_intr* bit seems very troubling. Is there a driver/fw issue I missed in my admittedly non-exhaustive forum search?
I have _not_ performed a zfs upgrade as of yet and am on version 5, although I'm uncertain how that could manifest as an interrupt issue.
I had a bad disk in the smaller pool on the 21st but the replacement resilvered w/out issue.
Current status shows all to be well:
Code:
freenas910# zpool status -v
  pool: freenas-boot
 state: ONLINE
  scan: scrub repaired 0 in 0h0m with 0 errors on Sat Sep 17 03:45:29 2016
config:

    NAME          STATE     READ WRITE CKSUM
    freenas-boot  ONLINE       0     0     0
      ada0p2      ONLINE       0     0     0

errors: No known data errors

  pool: media00
 state: ONLINE
  scan: resilvered 440G in 2h48m with 0 errors on Wed Sep 21 00:15:15 2016
config:

    NAME                                            STATE     READ WRITE CKSUM
    media00                                         ONLINE       0     0     0
      raidz2-0                                      ONLINE       0     0     0
        gptid/5656324c-4c8f-11e5-a1d8-a0369f3c3d84  ONLINE       0     0     0
        gptid/3a27fc18-292c-11e5-8e99-a0369f3c3d84  ONLINE       0     0     0
        gptid/3a821ca3-292c-11e5-8e99-a0369f3c3d84  ONLINE       0     0     0
        gptid/3adaefd9-292c-11e5-8e99-a0369f3c3d84  ONLINE       0     0     0
        gptid/3b36c7cc-292c-11e5-8e99-a0369f3c3d84  ONLINE       0     0     0
        gptid/84a9250b-b8ea-11e5-ad0e-a0369f3c3d84  ONLINE       0     0     0
      raidz2-1                                      ONLINE       0     0     0
        gptid/d67cacf9-ba61-11e5-84d7-a0369f3c3d84  ONLINE       0     0     0
        gptid/3c39e5f3-292c-11e5-8e99-a0369f3c3d84  ONLINE       0     0     0
        gptid/3c9bab55-292c-11e5-8e99-a0369f3c3d84  ONLINE       0     0     0
        gptid/3cf4f8d7-292c-11e5-8e99-a0369f3c3d84  ONLINE       0     0     0
        gptid/d6ad4c24-519c-11e5-819c-a0369f3c3d84  ONLINE       0     0     0
        gptid/b6ba3e94-3e2d-11e5-8f50-a0369f3c3d84  ONLINE       0     0     0
      raidz2-2                                      ONLINE       0     0     0
        gptid/f3bdddad-b8e9-11e5-ad0e-a0369f3c3d84  ONLINE       0     0     0
        gptid/3e5dac7a-292c-11e5-8e99-a0369f3c3d84  ONLINE       0     0     0
        gptid/3eb88653-292c-11e5-8e99-a0369f3c3d84  ONLINE       0     0     0
        gptid/3f19d73a-292c-11e5-8e99-a0369f3c3d84  ONLINE       0     0     0
        gptid/3f6ef42f-292c-11e5-8e99-a0369f3c3d84  ONLINE       0     0     0
        gptid/b0c406d0-4216-11e5-8f50-a0369f3c3d84  ONLINE       0     0     0
      raidz2-3                                      ONLINE       0     0     0
        gptid/4025eba9-292c-11e5-8e99-a0369f3c3d84  ONLINE       0     0     0
        gptid/39ab45ed-7fb3-11e6-81fb-a0369f3c3d84  ONLINE       0     0     0
        gptid/40df0b78-292c-11e5-8e99-a0369f3c3d84  ONLINE       0     0     0
        gptid/4139e8df-292c-11e5-8e99-a0369f3c3d84  ONLINE       0     0     0
        gptid/4a0191ca-ddf5-11e5-85db-a0369f3c3d84  ONLINE       0     0     0
        gptid/41fcf5d7-292c-11e5-8e99-a0369f3c3d84  ONLINE       0     0     0

errors: No known data errors

  pool: media01
 state: ONLINE
  scan: scrub repaired 0 in 16h11m with 0 errors on Sun Sep 25 18:11:47 2016
config:

    NAME                                            STATE     READ WRITE CKSUM
    media01                                         ONLINE       0     0     0
      raidz2-0                                      ONLINE       0     0     0
        gptid/4ac35f13-2a53-11e5-95ce-a0369f3c3d84  ONLINE       0     0     0
        gptid/4b4b54c7-2a53-11e5-95ce-a0369f3c3d84  ONLINE       0     0     0
        gptid/4bdad04b-2a53-11e5-95ce-a0369f3c3d84  ONLINE       0     0     0
        gptid/4c6ba722-2a53-11e5-95ce-a0369f3c3d84  ONLINE       0     0     0
        gptid/4ce815db-2a53-11e5-95ce-a0369f3c3d84  ONLINE       0     0     0
        gptid/4d89bd83-2a53-11e5-95ce-a0369f3c3d84  ONLINE       0     0     0
      raidz2-1                                      ONLINE       0     0     0
        gptid/4e2b84dc-2a53-11e5-95ce-a0369f3c3d84  ONLINE       0     0     0
        gptid/e37e3682-b0e4-11e5-ac01-a0369f3c3d84  ONLINE       0     0     0
        gptid/d782e41f-8f68-11e5-9193-a0369f3c3d84  ONLINE       0     0     0
        gptid/4fca5239-2a53-11e5-95ce-a0369f3c3d84  ONLINE       0     0     0
        gptid/506651c0-2a53-11e5-95ce-a0369f3c3d84  ONLINE       0     0     0
        gptid/50f4c90d-2a53-11e5-95ce-a0369f3c3d84  ONLINE       0     0     0
      raidz2-2                                      ONLINE       0     0     0
        gptid/8c637bb5-2a53-11e5-95ce-a0369f3c3d84  ONLINE       0     0     0
        gptid/8cfc92e8-2a53-11e5-95ce-a0369f3c3d84  ONLINE       0     0     0
        gptid/8d927a19-2a53-11e5-95ce-a0369f3c3d84  ONLINE       0     0     0
        gptid/8e15216e-2a53-11e5-95ce-a0369f3c3d84  ONLINE       0     0     0
        gptid/8eb7c552-2a53-11e5-95ce-a0369f3c3d84  ONLINE       0     0     0
        gptid/8f3e1e22-2a53-11e5-95ce-a0369f3c3d84  ONLINE       0     0     0
      raidz2-3                                      ONLINE       0     0     0
        gptid/b67a2131-90f3-11e5-8dba-a0369f3c3d84  ONLINE       0     0     0
        gptid/b7468669-90f3-11e5-8dba-a0369f3c3d84  ONLINE       0     0     0
        gptid/b80eb362-90f3-11e5-8dba-a0369f3c3d84  ONLINE       0     0     0
        gptid/b8cfc9a3-90f3-11e5-8dba-a0369f3c3d84  ONLINE       0     0     0
        gptid/b99129c0-90f3-11e5-8dba-a0369f3c3d84  ONLINE       0     0     0
        gptid/ba7d52a8-90f3-11e5-8dba-a0369f3c3d84  ONLINE       0     0     0

errors: No known data errors
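With 48 data disks, eyeballing that output gets old quickly. As a rough sketch (not an official tool), the Python below scans saved `zpool status` text and lists any device row that is not ONLINE or has a non-zero READ/WRITE/CKSUM counter. The names `unhealthy_devices`, `DEVICE_RE`, and `SAMPLE` are invented for illustration, and the state list is the set of states I know of, so treat it as an assumption rather than an exhaustive list. `SAMPLE` holds only the freenas-boot rows from above.

```python
import re

# The freenas-boot portion of the status output above.
SAMPLE = """\
  pool: freenas-boot
 state: ONLINE
config:
    NAME          STATE     READ WRITE CKSUM
    freenas-boot  ONLINE       0     0     0
      ada0p2      ONLINE       0     0     0
errors: No known data errors
"""

# A device row is: name, state, then three counters. Counters may carry a
# unit suffix on large values, so they are compared as text, not as ints.
COUNT = r"[\d.]+[KMGT]?"
DEVICE_RE = re.compile(
    r"(?P<name>\S+)\s+"
    r"(?P<state>ONLINE|DEGRADED|FAULTED|OFFLINE|UNAVAIL|REMOVED)\s+"
    rf"(?P<read>{COUNT})\s+(?P<write>{COUNT})\s+(?P<cksum>{COUNT})"
)


def unhealthy_devices(status_text):
    """Return rows whose state is not ONLINE or whose counters are non-zero."""
    bad = []
    for match in DEVICE_RE.finditer(status_text):
        row = match.groupdict()
        counters = (row["read"], row["write"], row["cksum"])
        if row["state"] != "ONLINE" or any(c != "0" for c in counters):
            bad.append(row)
    return bad


if __name__ == "__main__":
    problems = unhealthy_devices(SAMPLE)
    if problems:
        for row in problems:
            print(row)
    else:
        print("all listed devices ONLINE with zero error counters")
```

Because the pattern keys on the name/state/counter sequence rather than on line breaks, it also works on a paste where the output has been flattened onto one line. Lines such as "state: ONLINE" are skipped because no counters follow them.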
Any assistance/insight would be grand.
Thanks!