Re: kern/57226: NetBSD 10_BETA kernel crash
The following reply was made to PR kern/57226; it has been noted by GNATS.
From: Taylor R Campbell <riastradh%NetBSD.org@localhost>
To: PHO <pho%cielonegro.org@localhost>
Cc: gnats-bugs%NetBSD.org@localhost
Subject: Re: kern/57226: NetBSD 10_BETA kernel crash
Date: Sun, 25 Jun 2023 18:37:12 +0000
> Date: Mon, 26 Jun 2023 00:10:04 +0900
> From: PHO <pho%cielonegro.org@localhost>
>
> But what I found with a bunch of debugging logs is that:
>
> 1. cpu0 calls timerfd_create() to initialize an itimer.
>
> 2. The same CPU calls timerfd_settime() to schedule the timer.
>
> 3. After some ticks itimer_callout() gets invoked. It re-schedules the
> callout because it's not dying yet.
>
> 4. itimer_callout() releases the itimer lock right before returning.
>
> 5. But before itimer_callout() returns and callout_softclock()
> re-acquires the callout lock, cpu1 calls timerfd_destroy().
>
> 6. cpu1 can acquire the itimer lock (because it's now released), and
> calls itimer_poison(). It marks the itimer as dying, but it's too late.
> itimer_poison() invokes callout_halt(), which returns without waiting
> for the callout to finish because c->c_flags no longer contains
> CALLOUT_FIRED. The flag was cleared in step 3.
>
> 7. cpu1 then calls itimer_fini(), which in turn calls callout_destroy().
> cpu0 is still in callout_softclock() and trying to lock the callout,
> which makes cpu1 panic.
Ah! I understand. The problem is that there is a window between:
	/* itimer_callout */
	if (!it->it_dying)
		callout_schedule(&it->it_ch, ...);	/* (A) */
	itimer_unlock(it);
	return;

	/* callout_softclock */
	cc->cc_active = NULL;				/* (B) */
During that time, between (A) and (B), it->it_ch doesn't have
CALLOUT_FIRED set because callout_schedule clears it at (A).
So if another CPU calls itimer_poison and then itimer_fini,
itimer_poison won't wait for (B) to have happened, and itimer_fini can
run before (B), in which case callout_destroy crashes.
The bug is that callout_halt fails to wait for (B).
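For reference, here's the shape of the pre-patch logic in callout_halt
that misses the window (condensed from the old code; `flags' is the
snapshot taken under the callout lock):

	flags = c->c_flags;
	if ((flags & CALLOUT_PENDING) != 0)
		CIRCQ_REMOVE(&c->c_list);
	c->c_flags = flags & ~(CALLOUT_PENDING|CALLOUT_FIRED);
	/*
	 * Between (A) and (B) the handler is still running on cpu0,
	 * but callout_schedule cleared CALLOUT_FIRED at (A), so this
	 * test is false and we return without waiting for (B).
	 */
	if (__predict_false(flags & CALLOUT_FIRED)) {
		callout_wait(c, interlock, lock);
		return true;
	}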
Nice work! I agree this is the right analysis and correct solution.
Couple nits:
> callout(9): Obtain the lock before checking if a callout is destroyable
>
> One cannot safely observe the state of a callout without holding its lock,
> because c->c_flags and c->c_cpu might be getting mutated somewhere else.
Did you observe this to be necessary to fix the problem, or do you
have a theory for why this should be necessary? I don't think it
should be. I think if anything, it should be under #ifdef DEBUG (but
I'm not sure we should keep it at all). The comment you deleted in
this change,
	/*
	 * It's not necessary to lock in order to see the correct value
	 * of c->c_flags.  If the callout could potentially have been
	 * running, the current thread should have stopped it.
	 */
correctly reflects the callout(9) API contract, so if there is any
race here, it is either a bug in callout(9) or a user error -- and we
want to detect it and fail noisily in either case, not suppress it by
taking a lock.
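If we did keep it, what I mean is roughly this DEBUG-only sketch
(illustrative only, not a committed change):

	#ifdef DEBUG
	/*
	 * Diagnostic only: hold the callout lock so the KASSERTs
	 * can't race with a concurrent mutation of c->c_flags.  The
	 * callout(9) contract already requires the callout to be
	 * stopped before callout_destroy is called.
	 */
	kmutex_t *lock = callout_lock(c);
	#endif
	/* ... the existing KASSERTMSG checks ... */
	#ifdef DEBUG
	mutex_spin_exit(lock);
	#endif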
In the other change, which I think is the only one that should be
needed to solve the problem:
> @@ -593,7 +592,7 @@ callout_wait(callout_impl_t *c, void *interlock, kmutex_t *lock)
> * - the callout itself has called callout_halt() (nice!)
> */
> cc = c->c_cpu;
I think you need to make sure to load c->c_cpu exactly once here, not
once in callout_wait and then once again in
callout_running_somewhere_else.
It's possible that one can prove it is OK to reload c->c_cpu twice,
but let's keep the change minimal.
In fact, for now I would suggest just inlining it in callout_halt, so
that the fix is a one-line change both to commit initially and to pull
up; then we can tidy it up afterward with no-functional-change commits.
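Concretely, something like this is the shape I mean -- a sketch only,
reusing the cc_active/cc_lwp test from your helper, with c->c_cpu
loaded exactly once under the callout lock:

	struct callout_cpu *cc;	/* new local in callout_halt */
	...
	c->c_flags &= ~(CALLOUT_PENDING|CALLOUT_FIRED);
	cc = c->c_cpu;		/* load c->c_cpu exactly once */
	if (__predict_false(cc->cc_active == c && cc->cc_lwp != curlwp)) {
		callout_wait(c, interlock, lock);
		return true;
	}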
> - if (__predict_true(cc->cc_active != c || cc->cc_lwp == l))
> + if (__predict_false(!callout_running_somewhere_else(c)))
This should stay __predict_true. Callouts are supposed to be quick,
so by the time we've reloaded c->c_cpu from memory, it may have
changed again, even though the caller already determined
callout_running_somewhere_else returned true.
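That is, keep the original prediction when using the helper in
callout_wait, e.g.:

	/*
	 * Predict that the handler has already finished by the time
	 * we get here and reload c->c_cpu.
	 */
	if (__predict_true(!callout_running_somewhere_else(c)))
		break;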
(Attached are the changes I reviewed, for audit trail.)
From 736fd5151124eaf2b5de1e6b5f4f0d6c71edd445 Mon Sep 17 00:00:00 2001
From: PHO <pho%cielonegro.org@localhost>
Date: Sun, 25 Jun 2023 20:15:44 +0900
Subject: [PATCH] callout(9): Fix panic() in callout_destroy() (kern/57226)
The culprit was callout_halt(). "(c->c_flags & CALLOUT_FIRED) != 0" wasn't
the correct way to check if a callout is running. It failed to wait for a
running callout to finish in the following scenario:
1. cpu0 initializes a callout and schedules it.
2. cpu0 invokes callout_softclock() and fires the callout, setting the flag
CALLOUT_FIRED.
3. The callout invokes callout_schedule() to re-schedule itself.
4. callout_schedule_locked() clears the flag CALLOUT_FIRED, and releases
the lock.
5. Before the lock is re-acquired by callout_softclock(), cpu1 decides to
destroy the callout. It first invokes callout_halt() to make sure the
callout finishes running.
6. But since CALLOUT_FIRED has been cleared, callout_halt() thinks it's not
running and therefore returns without invoking callout_wait().
7. cpu1 proceeds to invoke callout_destroy() while it's still running on
cpu0. callout_destroy() detects that and panics.
---
sys/kern/kern_timeout.c | 26 +++++++++++++++++++-------
1 file changed, 19 insertions(+), 7 deletions(-)
diff --git a/sys/kern/kern_timeout.c b/sys/kern/kern_timeout.c
index 8c2b180c6dae..210f24dc942b 100644
--- a/sys/kern/kern_timeout.c
+++ b/sys/kern/kern_timeout.c
@@ -191,6 +191,7 @@ static struct callout_cpu ccb;
#ifndef CRASH /* _KERNEL */
static void callout_softclock(void *);
static void callout_wait(callout_impl_t *, void *, kmutex_t *);
+static bool callout_running_somewhere_else(callout_impl_t *);

static struct callout_cpu callout_cpu0 __cacheline_aligned;
static void *callout_sih __read_mostly;
@@ -375,7 +376,7 @@ callout_destroy(callout_t *cs)
KASSERTMSG((c->c_flags & CALLOUT_PENDING) == 0,
"pending callout %p: c_func (%p) c_flags (%#x) destroyed from %p",
c, c->c_func, c->c_flags, __builtin_return_address(0));
- KASSERTMSG(c->c_cpu->cc_lwp == curlwp || c->c_cpu->cc_active != c,
+ KASSERTMSG(!callout_running_somewhere_else(c),
"running callout %p: c_func (%p) c_flags (%#x) destroyed from %p",
c, c->c_func, c->c_flags, __builtin_return_address(0));
c->c_magic = 0;
@@ -543,7 +544,6 @@ callout_halt(callout_t *cs, void *interlock)
{
callout_impl_t *c = (callout_impl_t *)cs;
kmutex_t *lock;
- int flags;

KASSERT(c->c_magic == CALLOUT_MAGIC);
KASSERT(!cpu_intr_p());
@@ -553,11 +553,10 @@ callout_halt(callout_t *cs, void *interlock)
lock = callout_lock(c);
SDT_PROBE4(sdt, kernel, callout, halt,
c, c->c_func, c->c_arg, c->c_flags);
- flags = c->c_flags;
- if ((flags & CALLOUT_PENDING) != 0)
+ if ((c->c_flags & CALLOUT_PENDING) != 0)
CIRCQ_REMOVE(&c->c_list);
- c->c_flags = flags & ~(CALLOUT_PENDING|CALLOUT_FIRED);
- if (__predict_false(flags & CALLOUT_FIRED)) {
+ c->c_flags &= ~(CALLOUT_PENDING|CALLOUT_FIRED);
+ if (__predict_false(callout_running_somewhere_else(c))) {
callout_wait(c, interlock, lock);
return true;
}
@@ -593,7 +592,7 @@ callout_wait(callout_impl_t *c, void *interlock, kmutex_t *lock)
* - the callout itself has called callout_halt() (nice!)
*/
cc = c->c_cpu;
- if (__predict_true(cc->cc_active != c || cc->cc_lwp == l))
+ if (__predict_false(!callout_running_somewhere_else(c)))
break;

/* It's running - need to wait for it to complete. */
@@ -770,6 +769,19 @@ callout_ack(callout_t *cs)
mutex_spin_exit(lock);
}

+/*
+ * Check if the callout is currently running on an LWP that isn't curlwp.
+ */
+static bool
+callout_running_somewhere_else(callout_impl_t *c)
+{
+ struct callout_cpu *cc = c->c_cpu;
+
+ KASSERT(mutex_owned(cc->cc_lock));
+
+ return cc->cc_active == c && cc->cc_lwp != curlwp;
+}
+
/*
* callout_hardclock:
*
From dfe928e33245062241b936a0232d5f6b6a3e98a6 Mon Sep 17 00:00:00 2001
From: PHO <pho%cielonegro.org@localhost>
Date: Sun, 25 Jun 2023 20:28:31 +0900
Subject: [PATCH] callout(9): Obtain the lock before checking if a callout is
destroyable
One cannot safely observe the state of a callout without holding its lock,
because c->c_flags and c->c_cpu might be getting mutated somewhere else.
---
sys/kern/kern_timeout.c | 17 +++++++++--------
1 file changed, 9 insertions(+), 8 deletions(-)
diff --git a/sys/kern/kern_timeout.c b/sys/kern/kern_timeout.c
index 3f3338a67872..8c2b180c6dae 100644
--- a/sys/kern/kern_timeout.c
+++ b/sys/kern/kern_timeout.c
@@ -364,17 +364,14 @@ void
callout_destroy(callout_t *cs)
{
callout_impl_t *c = (callout_impl_t *)cs;
-
+#if defined(KDTRACE_HOOKS) || defined(DIAGNOSTIC)
+ kmutex_t *lock = callout_lock(c);
+#endif
SDT_PROBE1(sdt, kernel, callout, destroy, cs);

KASSERTMSG(c->c_magic == CALLOUT_MAGIC,
"callout %p: c_magic (%#x) != CALLOUT_MAGIC (%#x)",
c, c->c_magic, CALLOUT_MAGIC);
- /*
- * It's not necessary to lock in order to see the correct value
- * of c->c_flags. If the callout could potentially have been
- * running, the current thread should have stopped it.
- */
KASSERTMSG((c->c_flags & CALLOUT_PENDING) == 0,
"pending callout %p: c_func (%p) c_flags (%#x) destroyed from %p",
c, c->c_func, c->c_flags, __builtin_return_address(0));
@@ -382,6 +379,10 @@ callout_destroy(callout_t *cs)
"running callout %p: c_func (%p) c_flags (%#x) destroyed from %p",
c, c->c_func, c->c_flags, __builtin_return_address(0));
c->c_magic = 0;
+
+#if defined(KDTRACE_HOOKS) || defined(DIAGNOSTIC)
+ mutex_spin_exit(lock);
+#endif
}

/*
@@ -654,12 +655,12 @@ callout_bind(callout_t *cs, struct cpu_info *ci)
struct callout_cpu *cc;
kmutex_t *lock;

- KASSERT((c->c_flags & CALLOUT_PENDING) == 0);
- KASSERT(c->c_cpu->cc_active != c);
KASSERT(c->c_magic == CALLOUT_MAGIC);
KASSERT((c->c_flags & CALLOUT_MPSAFE) != 0);

lock = callout_lock(c);
+ KASSERT(c->c_cpu->cc_active != c);
+ KASSERT((c->c_flags & CALLOUT_PENDING) == 0);
cc = ci->ci_data.cpu_callout;
c->c_flags |= CALLOUT_BOUND;
if (c->c_cpu != cc) {