Greetings,
Stats:
cpu: intel i7 3930k
ram: g.skill 64GB kit (F3-12800CL10Q2-64GBZL)
motherboard: Asus Rampage IV Extreme
NIC: Intel x540-T2
HBAs: 2x Avago LSI 9300-16i, 1x Avago LSI 9300-8e
Storage:
0) 2x Norco DS24E, one populated w/24 IBM Deskstar 750GB SATA disks and the other w/24 Seagate ST4000VN000 4TB 'NAS' drives.
1) 2x Quantum Superloader 3 SAS LTO6 tape autoloaders.
I pulled the trigger on the 9.10 upgrade from 9.3 stable this last Sunday evening (20160925). I did something foolish wrt pkg updates and ended up hosing samba. I had been meaning to upgrade anyway, so this was a good excuse to do so. Silly me, it seems. The LSI card fw was at phase 9, so I had to update to phase 12; after the upgrade to 9.10, the dashboard 'traffic light' was blinking yellow to flag that issue. Beyond that, the only other issue was that, w/9.10, uftdi is no longer either in the kernel or autoloading. I added the tunable to load the module and got my UPS monitoring working, then acquired the bits to perform the UEFI controller fw updates and did so.
I thought all was well. That is, until the first backup job under the new version launched, and that's when the badness started. It's worth noting that I did not elect to wipe my jails. I have 2 that I particularly care about: the P5 backup installation and the Plex media server. The latter would be far more obnoxious to rebuild (restoring the metadata and re-correcting the bad matches) than I care to undertake. They both came up and work fine. I had thought P5 was the culprit because most of the panics came while I was attempting to run the backup, at the point data should be streaming to the tape. The two exceptions were a smartd run and a grep over a dumped ktrace of the P5 process.
Here are the various trap blocks for the dumps I have:
Code:
-- msgbuf_20160927_1439.txt --
Fatal trap 12: page fault while in kernel mode
cpuid = 3; apic id = 03
fault virtual address = 0x8091d1368
fault code = supervisor read data, page not present
instruction pointer = 0x20:0xffffffff80da9ed4
stack pointer = 0x28:0xfffffe104adfd430
frame pointer = 0x28:0xfffffe104adfd440
code segment = base 0x0, limit 0xfffff, type 0x1b
= DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags = interrupt enabled, resume, IOPL = 0
current process = 5061 (smartd)
-- msgbuf_20160927_1727.txt --
Fatal trap 12: page fault while in kernel mode
cpuid = 4; apic id = 04
fault virtual address = 0x817ffd268
fault code = supervisor read data, page not present
instruction pointer = 0x20:0xffffffff80da9ed4
stack pointer = 0x28:0xfffffe104a11c3e0
frame pointer = 0x28:0xfffffe104a11c3f0
code segment = base 0x0, limit 0xfffff, type 0x1b
= DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags = interrupt enabled, resume, IOPL = 0
current process = 0 (zio_read_intr_4_2)
-- msgbuf_20160927_1755.txt --
Fatal trap 12: page fault while in kernel mode
cpuid = 11; apic id = 0b
fault virtual address = 0x8181dd268
fault code = supervisor read data, page not present
instruction pointer = 0x20:0xffffffff80da9ed4
stack pointer = 0x28:0xfffffe1049dee400
frame pointer = 0x28:0xfffffe1049dee410
code segment = base 0x0, limit 0xfffff, type 0x1b
= DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags = interrupt enabled, resume, IOPL = 0
current process = 0 (zio_read_intr_7_3)
-- msgbuf_20160927_1845.txt --
Fatal trap 12: page fault while in kernel mode
cpuid = 5; apic id = 05
fault virtual address = 0x817fe2d68
fault code = supervisor read data, page not present
instruction pointer = 0x20:0xffffffff80da9ed4
stack pointer = 0x28:0xfffffe104a1803e0
frame pointer = 0x28:0xfffffe104a1803f0
code segment = base 0x0, limit 0xfffff, type 0x1b
= DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags = interrupt enabled, resume, IOPL = 0
current process = 0 (zio_read_intr_5_8)
-- msgbuf_20160927_2159.txt --
Fatal trap 12: page fault while in kernel mode
cpuid = 0; apic id = 00
fault virtual address = 0x80938fb68
fault code = supervisor read data, page not present
instruction pointer = 0x20:0xffffffff80da9ed4
stack pointer = 0x28:0xfffffe104af39a00
frame pointer = 0x28:0xfffffe104af39a10
code segment = base 0x0, limit 0xfffff, type 0x1b
= DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags = interrupt enabled, resume, IOPL = 0
current process = 23949 (grep)
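For anyone who wants to compare the dumps side by side: all five traps above report the same instruction pointer (0x20:0xffffffff80da9ed4), whatever the current process was. Below is a minimal Python sketch of how that comparison could be scripted. It assumes the msgbuf text is laid out exactly as pasted above ("-- filename --" headers followed by "key = value" lines); the names `parse_traps` and `SAMPLE` are made up for illustration, and `SAMPLE` holds only a trimmed excerpt of two of the dumps.

```python
import re
from collections import Counter

# Section header as pasted above, e.g. "-- msgbuf_20160927_1439.txt --"
HEADER = re.compile(r"^-- (?P<name>\S+) --$")
# "key = value" lines; the key is lowercase words only, so the
# "Fatal trap 12: ..." line and the "= DPL 0, ..." continuation are skipped.
FIELD = re.compile(r"^(?P<key>[a-z ]+?)\s*=\s*(?P<value>.+)$")


def parse_traps(text):
    """Return one dict per '-- file --' section found in the text."""
    traps = []
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        header = HEADER.match(line)
        if header:
            current = {"file": header.group("name")}
            traps.append(current)
            continue
        if current is None:
            continue  # ignore anything before the first header
        field = FIELD.match(line)
        if field:
            # Keep the first value if a key shows up twice in one section.
            current.setdefault(field.group("key"), field.group("value"))
    return traps


# Trimmed excerpt of two of the dumps above (not the full text).
SAMPLE = """\
-- msgbuf_20160927_1439.txt --
Fatal trap 12: page fault while in kernel mode
cpuid = 3; apic id = 03
fault virtual address = 0x8091d1368
instruction pointer = 0x20:0xffffffff80da9ed4
code segment = base 0x0, limit 0xfffff, type 0x1b
= DPL 0, pres 1, long 1, def32 0, gran 1
current process = 5061 (smartd)
-- msgbuf_20160927_1727.txt --
Fatal trap 12: page fault while in kernel mode
cpuid = 4; apic id = 04
fault virtual address = 0x817ffd268
instruction pointer = 0x20:0xffffffff80da9ed4
current process = 0 (zio_read_intr_4_2)
"""

traps = parse_traps(SAMPLE)
# Count how many dumps faulted at each instruction pointer.
ips = Counter(t.get("instruction pointer") for t in traps)

for t in traps:
    print(t["file"], t.get("instruction pointer"), t.get("current process"))
print("distinct instruction pointers:", len(ips))
```

A single distinct instruction pointer across unrelated processes suggests the fault sits in one shared kernel code path rather than in any one userland program, though the dumps alone do not say which function that address belongs to.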
The zio_read_intr* bit seems very troubling. Is there a driver/fw issue I missed in my admittedly non-exhaustive forum search?
I have _not_ performed a zfs upgrade as of yet and am on version 5, although I'm uncertain how that could manifest as an interrupt issue.
I had a bad disk in the smaller pool on the 21st but the replacement resilvered w/out issue.
Current status shows all to be well:
Code:
freenas910# zpool status -v
  pool: freenas-boot
 state: ONLINE
  scan: scrub repaired 0 in 0h0m with 0 errors on Sat Sep 17 03:45:29 2016
config:

        NAME          STATE     READ WRITE CKSUM
        freenas-boot  ONLINE       0     0     0
          ada0p2      ONLINE       0     0     0

errors: No known data errors

  pool: media00
 state: ONLINE
  scan: resilvered 440G in 2h48m with 0 errors on Wed Sep 21 00:15:15 2016
config:

        NAME                                            STATE     READ WRITE CKSUM
        media00                                         ONLINE       0     0     0
          raidz2-0                                      ONLINE       0     0     0
            gptid/5656324c-4c8f-11e5-a1d8-a0369f3c3d84  ONLINE       0     0     0
            gptid/3a27fc18-292c-11e5-8e99-a0369f3c3d84  ONLINE       0     0     0
            gptid/3a821ca3-292c-11e5-8e99-a0369f3c3d84  ONLINE       0     0     0
            gptid/3adaefd9-292c-11e5-8e99-a0369f3c3d84  ONLINE       0     0     0
            gptid/3b36c7cc-292c-11e5-8e99-a0369f3c3d84  ONLINE       0     0     0
            gptid/84a9250b-b8ea-11e5-ad0e-a0369f3c3d84  ONLINE       0     0     0
          raidz2-1                                      ONLINE       0     0     0
            gptid/d67cacf9-ba61-11e5-84d7-a0369f3c3d84  ONLINE       0     0     0
            gptid/3c39e5f3-292c-11e5-8e99-a0369f3c3d84  ONLINE       0     0     0
            gptid/3c9bab55-292c-11e5-8e99-a0369f3c3d84  ONLINE       0     0     0
            gptid/3cf4f8d7-292c-11e5-8e99-a0369f3c3d84  ONLINE       0     0     0
            gptid/d6ad4c24-519c-11e5-819c-a0369f3c3d84  ONLINE       0     0     0
            gptid/b6ba3e94-3e2d-11e5-8f50-a0369f3c3d84  ONLINE       0     0     0
          raidz2-2                                      ONLINE       0     0     0
            gptid/f3bdddad-b8e9-11e5-ad0e-a0369f3c3d84  ONLINE       0     0     0
            gptid/3e5dac7a-292c-11e5-8e99-a0369f3c3d84  ONLINE       0     0     0
            gptid/3eb88653-292c-11e5-8e99-a0369f3c3d84  ONLINE       0     0     0
            gptid/3f19d73a-292c-11e5-8e99-a0369f3c3d84  ONLINE       0     0     0
            gptid/3f6ef42f-292c-11e5-8e99-a0369f3c3d84  ONLINE       0     0     0
            gptid/b0c406d0-4216-11e5-8f50-a0369f3c3d84  ONLINE       0     0     0
          raidz2-3                                      ONLINE       0     0     0
            gptid/4025eba9-292c-11e5-8e99-a0369f3c3d84  ONLINE       0     0     0
            gptid/39ab45ed-7fb3-11e6-81fb-a0369f3c3d84  ONLINE       0     0     0
            gptid/40df0b78-292c-11e5-8e99-a0369f3c3d84  ONLINE       0     0     0
            gptid/4139e8df-292c-11e5-8e99-a0369f3c3d84  ONLINE       0     0     0
            gptid/4a0191ca-ddf5-11e5-85db-a0369f3c3d84  ONLINE       0     0     0
            gptid/41fcf5d7-292c-11e5-8e99-a0369f3c3d84  ONLINE       0     0     0

errors: No known data errors

  pool: media01
 state: ONLINE
  scan: scrub repaired 0 in 16h11m with 0 errors on Sun Sep 25 18:11:47 2016
config:

        NAME                                            STATE     READ WRITE CKSUM
        media01                                         ONLINE       0     0     0
          raidz2-0                                      ONLINE       0     0     0
            gptid/4ac35f13-2a53-11e5-95ce-a0369f3c3d84  ONLINE       0     0     0
            gptid/4b4b54c7-2a53-11e5-95ce-a0369f3c3d84  ONLINE       0     0     0
            gptid/4bdad04b-2a53-11e5-95ce-a0369f3c3d84  ONLINE       0     0     0
            gptid/4c6ba722-2a53-11e5-95ce-a0369f3c3d84  ONLINE       0     0     0
            gptid/4ce815db-2a53-11e5-95ce-a0369f3c3d84  ONLINE       0     0     0
            gptid/4d89bd83-2a53-11e5-95ce-a0369f3c3d84  ONLINE       0     0     0
          raidz2-1                                      ONLINE       0     0     0
            gptid/4e2b84dc-2a53-11e5-95ce-a0369f3c3d84  ONLINE       0     0     0
            gptid/e37e3682-b0e4-11e5-ac01-a0369f3c3d84  ONLINE       0     0     0
            gptid/d782e41f-8f68-11e5-9193-a0369f3c3d84  ONLINE       0     0     0
            gptid/4fca5239-2a53-11e5-95ce-a0369f3c3d84  ONLINE       0     0     0
            gptid/506651c0-2a53-11e5-95ce-a0369f3c3d84  ONLINE       0     0     0
            gptid/50f4c90d-2a53-11e5-95ce-a0369f3c3d84  ONLINE       0     0     0
          raidz2-2                                      ONLINE       0     0     0
            gptid/8c637bb5-2a53-11e5-95ce-a0369f3c3d84  ONLINE       0     0     0
            gptid/8cfc92e8-2a53-11e5-95ce-a0369f3c3d84  ONLINE       0     0     0
            gptid/8d927a19-2a53-11e5-95ce-a0369f3c3d84  ONLINE       0     0     0
            gptid/8e15216e-2a53-11e5-95ce-a0369f3c3d84  ONLINE       0     0     0
            gptid/8eb7c552-2a53-11e5-95ce-a0369f3c3d84  ONLINE       0     0     0
            gptid/8f3e1e22-2a53-11e5-95ce-a0369f3c3d84  ONLINE       0     0     0
          raidz2-3                                      ONLINE       0     0     0
            gptid/b67a2131-90f3-11e5-8dba-a0369f3c3d84  ONLINE       0     0     0
            gptid/b7468669-90f3-11e5-8dba-a0369f3c3d84  ONLINE       0     0     0
            gptid/b80eb362-90f3-11e5-8dba-a0369f3c3d84  ONLINE       0     0     0
            gptid/b8cfc9a3-90f3-11e5-8dba-a0369f3c3d84  ONLINE       0     0     0
            gptid/b99129c0-90f3-11e5-8dba-a0369f3c3d84  ONLINE       0     0     0
            gptid/ba7d52a8-90f3-11e5-8dba-a0369f3c3d84  ONLINE       0     0     0

errors: No known data errors
Any assistance/insight would be grand.
Thanks!