Current-Users archive


Re: 10.0_BETA/i386 GENERIC crash at boot on older CPUs



On Thu, 19 Jan 2023, John D. Baker wrote:

> On machines with older CPUs (pentium-III, Am5x86), a GENERIC kernel
> (or one based on GENERIC) dies during boot with:

> [...]
> [  30.4197804] fatal page fault in supervisor mode
> [  30.4197804] trap type 6 code 0 eip 0xc0617718 cs 0x8 eflags 0x10246 cr2 0x1000003c ilevel 0x7 esp 0xc0a29500
> [  30.4197804] curlwp 0xc16ac040 pid 0 lid 2 lowest kstack 0xd7c902c0
> kernel: supervisor trap page fault, code=0
> Stopped in pid 0.2 (system) at  netbsd:hardclock+0x23:  movl    3c(%esi),%eax
> db{0}> bt
> hardclock(10000000,d7a92ec4,c02864a1,0,10000000,c068d9f3,c027fc14,16b6000,7200,d7f375f0) at netbsd:hardclock+0x23
> clockintr(0,10000000,c068d9f3,c027fc14,16b6000,7200,d7f375f0,0,c1761000,c010313a) at netbsd:clockintr+0x36
> intr_kdtrace_wrapper(c1990540,d7a92ed4,6,d7a90010,c0620030,c16b0010,d7a90010,c16ac040,1,d7a92f34) at netbsd:intr_kdtrace_wrapper+0x21
> Xintr_legacy0() at netbsd:Xintr_legacy0+0xda
> --- interrupt ---
> cx8_spllower(1,0,d7a92f6c,c02864a1,c186c800,c1761000,c1990540,7,0,c186ba00) at netbsd:cx8_spllower+0x14
> intr_biglock_wrapper(c186be80,d7c92f10,0,0,0,0,0,0,0,0) at netbsd:intr_biglock_wrapper+0x68
> --- switch to interrupt stack ---
> Xintr_legacy5() at netbsd:Xintr_legacy5+0xda
> --- interrupt ---
> x86_stihlt(c16ac040,0,c0637e70,0,0,c168f100,c0a10100,c168f100,d7c90000,c168f100) at netbsd:x86_stihlt+0x5
> idle_loop(c16ac040,cbb000,cc4000,0,c01005a8,0,0,0,0,0) at netbsd:idle_loop+0x153
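
For reference, the faulting address in the quoted trap line can be checked
against the faulting instruction.  The sketch below is only arithmetic on
values copied from the log above; it assumes the "3c" that ddb prints in
"movl 3c(%esi),%eax" is a hexadecimal displacement.  The variable names
are illustrative and not taken from the kernel source.

```python
# Values copied from the trap and ddb lines quoted above.
cr2 = 0x1000003c          # faulting address reported in cr2
displacement = 0x3c       # from "movl 3c(%esi),%eax" (assumed to be hex)
first_value = 0x10000000  # first value ddb lists in the hardclock(...) frame

# The load reads from %esi + displacement, so the base register would be:
esi = cr2 - displacement

print(hex(esi))
print(esi == first_value)
```

Under that assumption, the base register equals the first value ddb lists
for the hardclock frame.  That would be consistent with hardclock being
handed a bogus pointer, but that reading is an inference from the log, not
something established here.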


I've bisected the source and have determined that the fault was introduced
with:

  /*      $NetBSD: intr.c,v 1.163 2022/10/29 13:59:04 riastradh Exp $     */


Reverting this file to r1.162 eliminates the crash on my pentium-III
and Am5x86 (GENERIC kernel) systems.

It is not clear why the NET4501 kernel (or a kernel derived from it) does
not exhibit the crash.  I thought it might be because these kernels never
enable "options DIAGNOSTIC", but disabling it in my GENERIC-derived kernel
did not avoid the crash.

I'll see what range of i386-class (i486 and 32-bit Pentium) CPUs I can
muster and try to find where the "go/no-go" line is.

-- 
|/"\ John D. Baker, KN5UKS               NetBSD     Darwin/MacOS X
|\ / jdbaker[snail]consolidated[flyspeck]net  OpenBSD            FreeBSD
| X  No HTML/proprietary data in email.   BSD just sits there and works!
|/ \ GPGkeyID:  D703 4A7E 479F 63F8 D3F4  BD99 9572 8F23 E4AD 1645

