Re: kern/51133: KASSERT on shutdown

To: kern-bug-people%netbsd.org@localhost,gnats-admin%netbsd.org@localhost,netbsd-bugs%netbsd.org@localhost,martin%NetBSD.org@localhost
Subject: Re: kern/51133: KASSERT on shutdown
From: Taylor R Campbell <riastradh%NetBSD.org@localhost>
Date: Fri, 14 Apr 2023 08:00:03 +0000 (UTC)

The following reply was made to PR kern/51133; it has been noted by GNATS.

From: Taylor R Campbell <riastradh%NetBSD.org@localhost>
To: Martin Husemann <martin%duskware.de@localhost>
Cc: Ryota Ozaki <ozaki-r%netbsd.org@localhost>,
	gnats-bugs%NetBSD.org@localhost
Subject: Re: kern/51133: KASSERT on shutdown
Date: Fri, 14 Apr 2023 07:59:17 +0000

 > panic: kernel diagnostic assertion "(boothowto & RB_HALT) == 0" failed: file "../../../../kern/subr_pserialize.c", line 174
 
 I think this is a red herring.  bge_detach -> if_detach always leads
 to pserialize_perform, and from your log earlier, bge_detach completed
 successfully before the original panic:
 
 > bge3: detached
 > bge2: detached
 > Skipping crash dump on recursive panic
 > panic: kernel diagnostic assertion "!cpu_intr_p()" failed: file "../../../../kern/subr_xcall.c", line 351
 
 The fact that it's on line 351, presumably from subr_xcall.c 1.18,
 shows that somehow the softint handler is running in hard interrupt
 context, according to cpu_intr_p (xc__highpri_intr is only ever used
 as a softint function):
 
    344  void
    345  xc__highpri_intr(void *dummy)
    346  {
    347          xc_state_t *xc = &xc_high_pri;
    348          void *arg1, *arg2;
    349          xcfunc_t func;
    350
    351          KASSERT(!cpu_intr_p());
 
 So this could be a buggy softint_dispatch vector.  That's consistent
 with the line from ps saying it's in softser:
 
 > 0    >   6 7   0       200          103b48840          softser/0
 
 Here's a wild guess (line numbers from locore.s 1.443, in HEAD):
 
 https://nxr.netbsd.org/xref/src/sys/arch/sparc64/sparc64/locore.s?r=1.433
 
    4661 	! Increment the per-cpu interrupt depth in case of hardintrs
    4662 	btst	SOFTINT_INT, %l3
    4663 	bnz,pn	%icc, sparc_intr_retry
    4664 	 sethi	%hi(CPUINFO_VA+CI_IDEPTH), %l1
    4665 	ld	[%l1 + %lo(CPUINFO_VA+CI_IDEPTH)], %l2
    4666 	inc	%l2
    4667 	st	%l2, [%l1 + %lo(CPUINFO_VA+CI_IDEPTH)]
    4668
    4669 sparc_intr_retry:
 ...
    4763 	/*
    4764 	 * Re-read SOFTINT to see if any new  pending interrupts
    4765 	 * at this level.
    4766 	 */
    4767 	mov	1, %l3			! Ack softint
    4768 	rd	SOFTINT, %l7		! %l5 contains #intr handled.
    4769 	sll	%l3, %l6, %l3		! Generate IRQ mask
    4770 	btst	%l3, %l7		! leave mask in %l3 for retry code
    4771 	bnz,pn	%icc, sparc_intr_retry
    4772 	 mov	1, %l5			! initialize intr count for next run
    4773
    4774 	! Decrement this cpu's interrupt depth in case of hardintrs
    4775 	btst	SOFTINT_INT, %l3
    4776 	bnz,pn	%icc, 1f
    4777 	 sethi	%hi(CPUINFO_VA+CI_IDEPTH), %l4
    4778 	ld	[%l4 + %lo(CPUINFO_VA+CI_IDEPTH)], %l5
    4779 	dec	%l5
    4780 	st	%l5, [%l4 + %lo(CPUINFO_VA+CI_IDEPTH)]
 
 When this re-reads SOFTINT, can it start invoking a new softint
 handler, before decrementing ci_idepth?
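
To make the question concrete, here is a toy C model of the ordering I'm
worried about.  It is only a sketch: every name in it is made up for
illustration, it assumes cpu_intr_p() is derived from ci_idepth with -1
meaning "no hard interrupt active", and whether the retry path can really
pick up a softint handler is exactly the open question.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: none of these names exist in the NetBSD sources. */
static int idepth = -1;          /* assumed: -1 means no hard interrupt active */
static bool softint_saw_intr_p;  /* a softint handler saw cpu_intr_p() true */

static bool
model_cpu_intr_p(void)
{
	return idepth >= 0;
}

static void
model_softint_handler(void)
{
	/* Stands in for KASSERT(!cpu_intr_p()) in xc__highpri_intr. */
	if (model_cpu_intr_p())
		softint_saw_intr_p = true;
}

static void
model_hardint_handler(void)
{
}

/*
 * One trip through a hypothetical dispatch routine.
 * first_is_soft: the interrupt that entered the routine is a softint.
 * retry_soft: the retry loop then picks up a softint handler.
 */
static void
model_dispatch(bool first_is_soft, bool retry_soft)
{
	if (!first_is_soft)
		idepth++;		/* entry: count hard interrupts only */

	if (first_is_soft)
		model_softint_handler();
	else
		model_hardint_handler();

	if (retry_soft)
		model_softint_handler();	/* retry: before the decrement */

	if (!first_is_soft)
		idepth--;		/* exit: undo the entry increment */
}

/* Reset the model, run one dispatch, report whether the check tripped. */
static bool
model_run(bool first_is_soft, bool retry_soft)
{
	idepth = -1;
	softint_saw_intr_p = false;
	model_dispatch(first_is_soft, retry_soft);
	assert(idepth == -1);	/* increment and decrement stay balanced */
	return softint_saw_intr_p;
}
```

In this model the check trips only for model_run(false, true): a hard
interrupt enters, and the retry runs a softint handler while the depth is
still raised.  The depth stays balanced in every case, so nothing would
look wrong until a softint handler asks cpu_intr_p().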
 
 I don't understand this stack trace:
 
 > netbsd:vpanic+0x16c(18806d8, 1ce6ac0, 18893d0, 1b026fc58, 1ce6bc0, 1c67000) fp = 1b026f261
 > netbsd:kern_assert+0x34(18893d0, 1b026fc58, 1ce5800, 1ce6ac0, 1ce6800, 4) fp = 1b026f311
 > netbsd:xc__highpri_intr+0xc4(18893d0, 18053c0, 1869640, 18893b0, 161, 17f8e08) fp = 1b026f3d1
 > netbsd:softint_dispatch+0xf8(1c9a000, 1c8b3a0, 50, 1c8b3a0, 276, aa670ae38cc28e39) fp = 1b026f4a1
 > netbsd:softint_fastintr+0x80(0, 4, 103b48840, 0, 1b018e178, 1b018e388) fp = 1b026f571
 > netbsd:softint_schedule+0x4(103b48840, 4, 1cdd800, 103b498c0, 0, 2014000) fp = 1b026f621
 
 softint_schedule+0x4 might be the call to kpreempt_disabled?  But I
 don't see how it could lead to softint_fastintr -- surely there should
 be an interrupt frame, or if nothing else, a frame with a return
 address pointing into sparc_interrupt?
