NetBSD-Bugs archive


port-i386/57197: GENERIC kernel crash on pentium-III and earlier CPUs



>Number:         57197
>Category:       port-i386
>Synopsis:       GENERIC kernel crash on pentium-III and earlier CPUs
>Confidential:   no
>Severity:       critical
>Priority:       high
>Responsible:    port-i386-maintainer
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Tue Jan 24 04:10:00 +0000 2023
>Originator:     John D. Baker
>Release:        NetBSD 10.0_BETA
>Organization:
>Environment:
NetBSD 10.0_BETA (PLEXOR) #5: Sun Jan 22 18:35:43 CST 2023 sysop%plex760.technoskunk.fur@localhost:/r0/build/netbsd-10/obj/i386/sys/arch/i386/compile/PLEXOR
Architecture: i386
Machine: i386
>Description:

Booting the GENERIC kernel (or one that includes the GENERIC config)
on a system with a pentium-III or lesser CPU (VIA Samuel 2, Am5x86) crashes
with a fatal page fault in hardclock(), as follows:

pentium-III:
[   1.0000000] NetBSD 10.0_BETA (PLEXOR) #4: Wed Jan 18 21:10:13 CST 2023
[   1.0000000] sysop%plex760.technoskunk.fur@localhost:/r0/build/netbsd-10/obj/i386/sys/arch/i386/compile/PLEXOR
[   1.0000000] total memory = 510 MB
[   1.0000000] avail memory = 488 MB
[...]
[   1.0000030] cpu0 at mainbus0
[   1.0000030] cpu0: Intel 686-class, 936MHz, id 0x68a
[   1.0000030] cpu0: node 0, package 0, core 0, smt 0
[...]
[  30.4197804] fatal page fault in supervisor mode
[  30.4197804] trap type 6 code 0 eip 0xc0617718 cs 0x8 eflags 0x10246 cr2 0x1000003c ilevel 0x7 esp 0xc0a29500
[  30.4197804] curlwp 0xc16ac040 pid 0 lid 2 lowest kstack 0xd7c902c0
kernel: supervisor trap page fault, code=0
Stopped in pid 0.2 (system) at  netbsd:hardclock+0x23:  movl 3c(%esi),%eax
db{0}> bt
hardclock(10000000,d7a92ec4,c02864a1,0,10000000,c068d9f3,c027fc14,16b6000,7200,d7f375f0) at netbsd:hardclock+0x23
clockintr(0,10000000,c068d9f3,c027fc14,16b6000,7200,d7f375f0,0,c1761000,c010313a) at netbsd:clockintr+0x36
intr_kdtrace_wrapper(c1990540,d7a92ed4,6,d7a90010,c0620030,c16b0010,d7a90010,c16ac040,1,d7a92f34) at netbsd:intr_kdtrace_wrapper+0x21
Xintr_legacy0() at netbsd:Xintr_legacy0+0xda
--- interrupt ---
cx8_spllower(1,0,d7a92f6c,c02864a1,c186c800,c1761000,c1990540,7,0,c186ba00) at netbsd:cx8_spllower+0x14
intr_biglock_wrapper(c186be80,d7c92f10,0,0,0,0,0,0,0,0) at netbsd:intr_biglock_wrapper+0x68
--- switch to interrupt stack ---
Xintr_legacy5() at netbsd:Xintr_legacy5+0xda
--- interrupt ---
x86_stihlt(c16ac040,0,c0637e70,0,0,c168f100,c0a10100,c168f100,d7c90000,c168f100) at netbsd:x86_stihlt+0x5
idle_loop(c16ac040,cbb000,cc4000,0,c01005a8,0,0,0,0,0) at netbsd:idle_loop+0x153


Am5x86:
[   1.0000000] NetBSD 10.0_BETA (GENERIC) #4: Wed Jan 18 20:46:54 CST 2023
[   1.0000000] sysop%plex760.technoskunk.fur@localhost:/r0/build/netbsd-10/obj/i386/sys/arch/i386/compile/GENERIC
[   1.0000000] total memory = 65148 KB
[   1.0000000] avail memory = 38620 KB
[...]
[   1.0000040] cpu0: AMD 486-class, id 0x4f4
[   1.0000040] cpu0: node 0, package 0, core 0, smt 0
[...]
[   1.0000040] fatal page fault in supervisor mode
[   1.0000040] trap type 6 code 0 eip 0xc0d3d7d8 cs 0xc57b0008 eflags 0x10246 cr2 0x3c ilevel 0x7 esp 0
[   1.0000040] curlwp 0xc165a840 pid 0 lid 0 lowest kstack 0xc19f32c0
kernel: supervisor trap page fault, code=0
Stopped in pid 0.0 (system) at  netbsd:hardclock+0x23:  movl 3c(%esi),%eax
db{0}> bt
hardclock(0,0,c57bff6c,c04ac8f1,0,0,0,0,0,0) at netbsd:hardclock+0x23
clockintr(0,0,0,0,0,0,0,0,c1c72000,c010322a) at netbsd:clockintr+0x2a
intr_kdtrace_wrapper(c1c33b80,c19f5d9c,0,0,0,0,0,0,0,0) at netbsd:intr_kdtrace_wrapper+0x21
--- switch to interrupt stack ---
Xintr_legacy0() at netbsd:Xintr_legacy0+0xda
--- interrupt ---
outb(c16260c0,c1623f80,0,20,1,0,0,c16c5a80,c19f5e94,0) at netbsd:outb+0x9
intr_establish_xname(0,c16260c0,0,1,7,c04c96b5,0,0,c134f916,0) at netbsd:intr_establish_xname+0x2ba
isa_intr_establish_xname(0,0,1,7,c04c96b5,0,c134f916,c19f5f14,c04c9baf,0) at netbsd:isa_intr_establish_xname+0x91
isa_intr_establish(0,0,1,7,c04c96b5,0,c19f5f60,c0d3d19a,c04b6858,1000) at netbsd:isa_intr_establish+0x3c
i8254_initclocks(c04b6858,1000,3,c11b0770,c6020000,c601f000,c1670b40,0,c19f5f60,c0e5f527) at netbsd:i8254_initclocks+0x3a
initclocks(3,0,64,0,0,0,0,0,2a6a000,0) at netbsd:initclocks+0x1c
main(0,0,0,0,0,0,0,0,0,0) at netbsd:main+0x365


VIA "Samuel":
[   1.0000000] NetBSD 10.0_BETA (NEOWARE) #4: Wed Jan 18 21:19:15 CST 2023
[   1.0000000] 	sysop%plex760.technoskunk.fur@localhost:/r0/build/netbsd-10/obj/i386/sys/arch/i386/compile/NEOWARE
[   1.0000000] total memory = 959 MB
[   1.0000000] avail memory = 930 MB
[...]
[   1.0000040] cpu0 at mainbus0
[   1.0000040] cpu0: VIA Samuel 2, id 0x673
[   1.0000040] cpu0: node 0, package 0, core 0, smt 0
[...]
[   1.0011329] fatal page fault in supervisor mode
[   1.0011329] trap type 6 code 0 eip 0xc055c308 cs 0x8 eflags 0x10246 cr2 0x3c ilevel 0x7 esp 0x6
[   1.0011329] curlwp 0xc095f140 pid 0 lid 0 lowest kstack 0xc0bf92c0
kernel: supervisor trap page fault, code=0
Stopped in pid 0.0 (system) at  netbsd:hardclock+0x23:  movl    3c(%esi),%eax
db{0}> bt
hardclock(0,d95d2f6c,c022c031,0,0,0,0,0,0,0) at netbsd:hardclock+0x23
clockintr(0,0,0,0,0,0,0,0,c1ecc000,c010313a) at netbsd:clockintr+0x36
intr_kdtrace_wrapper(c213c500,c0bfbd9c,0,0,0,0,0,0,0,0) at netbsd:intr_kdtrace_wrapper+0x21
--- switch to interrupt stack ---
Xintr_legacy0() at netbsd:Xintr_legacy0+0xda
--- interrupt ---
outb(c0955480,c0953540,0,20,1,0,0,c099ffc0,c0bfbe94,0) at netbsd:outb+0x9
intr_establish_xname(0,c0955480,0,1,7,c0248305,0,0,c0813d2b,0) at netbsd:intr_establish_xname+0x2ba
isa_intr_establish_xname(0,0,1,7,c0248305,0,c0813d2b,c0bfbf14,c02487cc,0) at netbsd:isa_intr_establish_xname+0x91
isa_intr_establish(0,0,1,7,c0248305,0,c0bfbf60,c055bcca,c0235c38,1000) at netbsd:isa_intr_establish+0x3c
i8254_initclocks(c0235c38,1000,3,c0795264,da280000,da27f000,bfe8,c0bfbf60,c05c0824,c06797d7) at netbsd:i8254_initclocks+0x3a
initclocks(3,5,64,0,0,0,0,0,16800000,0) at netbsd:initclocks+0x1c
main(0,0,0,0,0,0,0,0,0,0) at netbsd:main+0x365
db{0}> 
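
One pattern in the traces above may help narrow this down: in all three,
the fault address in cr2 equals the first argument shown for hardclock()
plus 0x3c, the displacement in the faulting instruction. The Python sketch
below only redoes that arithmetic on values copied from the traces. It
assumes that ddb prints backtrace arguments in hexadecimal and that %esi
holds hardclock()'s first argument at that point; neither is confirmed in
this report, and the helper names are made up for illustration.

```python
import re

# Displacement in the faulting instruction "movl 3c(%esi),%eax" (hardclock+0x23).
DISPLACEMENT = 0x3C

# Trap line and hardclock() backtrace frame, copied from the three traces.
TRACES = {
    "pentium-III": (
        "trap type 6 code 0 eip 0xc0617718 cs 0x8 eflags 0x10246 "
        "cr2 0x1000003c ilevel 0x7 esp 0xc0a29500",
        "hardclock(10000000,d7a92ec4,c02864a1,0,10000000,c068d9f3,"
        "c027fc14,16b6000,7200,d7f375f0) at netbsd:hardclock+0x23",
    ),
    "Am5x86": (
        "trap type 6 code 0 eip 0xc0d3d7d8 cs 0xc57b0008 eflags 0x10246 "
        "cr2 0x3c ilevel 0x7 esp 0",
        "hardclock(0,0,c57bff6c,c04ac8f1,0,0,0,0,0,0) at netbsd:hardclock+0x23",
    ),
    "VIA Samuel 2": (
        "trap type 6 code 0 eip 0xc055c308 cs 0x8 eflags 0x10246 "
        "cr2 0x3c ilevel 0x7 esp 0x6",
        "hardclock(0,d95d2f6c,c022c031,0,0,0,0,0,0,0) at netbsd:hardclock+0x23",
    ),
}


def faulting_base(trap_line, displacement=DISPLACEMENT):
    """Return cr2 minus the displacement, i.e. the implied value of %esi."""
    cr2 = int(re.search(r"\bcr2\s+0x([0-9a-f]+)", trap_line).group(1), 16)
    return cr2 - displacement


def first_arg(frame_line):
    """Return the first backtrace argument, read as hex (an assumption)."""
    return int(re.match(r"\w+\(([0-9a-f]+)[,)]", frame_line).group(1), 16)


if __name__ == "__main__":
    for cpu, (trap_line, frame_line) in TRACES.items():
        base = faulting_base(trap_line)
        arg0 = first_arg(frame_line)
        print(f"{cpu}: implied %esi = {base:#x}, hardclock arg0 = {arg0:#x}, "
              f"match = {base == arg0}")
```

If those assumptions hold, the pointer being dereferenced is 0 on the
Am5x86 and VIA Samuel 2 and 0x10000000 on the pentium-III, which would
point at the argument handed down the interrupt path rather than at
hardclock() itself.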


Bisecting the source shows that the fault was introduced with:

  src/sys/arch/x86/x86/intr.c r1.163

Pentium-4 and later CPUs appear to be unaffected.

Curiously, on the Am5x86, the NET4501 kernel (or one derived from it)
boots without issues.

The code added between r1.162 and r1.163 looks innocuous enough, but
it apparently trips up these older/lesser CPUs.

>How-To-Repeat:
Boot a GENERIC (or GENERIC-derived) kernel on a system with a pentium-III
or earlier CPU.

>Fix:
Workaround:  revert "src/sys/arch/x86/x86/intr.c" to r1.162


