NetBSD-Bugs archive
kern/59692: itimer(9): possible infinite loop
>Number: 59692
>Category: kern
>Synopsis: itimer(9): possible infinite loop
>Confidential: no
>Severity: serious
>Priority: medium
>Responsible: kern-bug-people
>State: open
>Class: sw-bug
>Submitter-Id: net
>Arrival-Date: Sun Oct 05 14:45:01 +0000 2025
>Originator: Taylor R Campbell
>Release: current, 11, 10, 9, ...
>Organization:
The Clock of the Long BSD Foundation
>Environment:
>Description:
The algorithm run periodically for an itimer is roughly as
follows:
1. Check the clock, call it `now', with getnano(up)time.
2. Based on the time it previously fired, `last', and on `now',
compute
(a) how many overruns there have been, and
(b) at what struct timespec the itimer should fire next,
call it `next'.
3. Check the clock again, call it `now1'.
4. callout_schedule(tstohz(next - now1)).
Source references:
https://nxr.netbsd.org/xref/src/sys/kern/kern_time.c?r=1.228#839
https://nxr.netbsd.org/xref/src/sys/kern/kern_time.c?r=1.228#816
https://nxr.netbsd.org/xref/src/sys/kern/subr_time.c?r=1.41#65
Note that, barring clock changes, last < now < MIN(now1, next).
However, it is not guaranteed that now1 < next.  If step 2 takes
long enough to compute (and it may be unpredictably delayed by
interrupt processing -- possibly even by scheduling, if softints
are preemptible), it may happen that next < now1.  In that case,
tstohz will return 0 and callout_schedule will schedule the
callout to fire immediately, after zero ticks.
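To make the zero-tick case concrete, here is a minimal userland
sketch of the arithmetic in steps 2-4 (all names are hypothetical:
ts_sub stands in for timespecsub(9), ts_to_ticks for the current
tstohz(9) mapping, and hz=100 is assumed for illustration):

```c
#include <assert.h>
#include <time.h>

/* Hypothetical stand-in for timespecsub(9). */
static struct timespec
ts_sub(struct timespec a, struct timespec b)
{
	struct timespec d;

	d.tv_sec = a.tv_sec - b.tv_sec;
	d.tv_nsec = a.tv_nsec - b.tv_nsec;
	if (d.tv_nsec < 0) {
		d.tv_sec -= 1;
		d.tv_nsec += 1000000000L;
	}
	return d;
}

/*
 * Hypothetical stand-in for the current tstohz(9) mapping: any
 * timeout <=0 collapses to 0 ticks (expire immediately); any
 * timeout >0 maps to >=2 ticks (round up, plus one tick of slop).
 */
static int
ts_to_ticks(struct timespec ts)
{
	const long long hz = 100;	/* assumed hardclock frequency */
	long long t;

	if (ts.tv_sec < 0 || (ts.tv_sec == 0 && ts.tv_nsec <= 0))
		return 0;
	t = ts.tv_sec * hz + (ts.tv_nsec * hz + 999999999LL) / 1000000000LL;
	return (int)(t + 1);
}
```

With next at 1.5 sec on the timeline, a now1 of 1.6 sec (step 2
overran next) yields 0 ticks, i.e. an immediate reschedule, while a
now1 of 1.4 sec yields an ordinary positive tick count.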
If this happens repeatedly, the kernel will loop through
callout processing without returning from callout_softclock, the
callout softint handler at IPL_SOFTCLOCK.  This can lead to
heartbeat panics, because the heartbeat watchdog checks the
progress of a different softint handler at IPL_SOFTCLOCK.
I hypothesize this phenomenon may be one of the two causes of
the following PRs:
PR kern/59339: heartbeat watchdog fires since 10.99.14
https://gnats.NetBSD.org/59339
PR kern/59465: Recurring kernel panic with -current (10.99.14):
"heart stopped beating"
https://gnats.NetBSD.org/59465
PR kern/59679: Multiple "heart stopped beating" / "softints
stuck for 16 seconds" panics
https://gnats.NetBSD.org/59679
>How-To-Repeat:
Haven't found a compact reproducer yet.
>Fix:
Several options:
1. Change tvtohz/tstohz to map 0 sec to 1 tick.
Currently these map any timeout <=0 to 0 ticks, and any
timeout >0 to >=2 ticks -- this is because we don't know how
far away the next hardclock interrupt will be, and since it
could be arbitrarily small, we have to wait for two
hardclock interrupts to guarantee having waited at least
some nonzero time.
But, perhaps we should map any timeout <0 to 0 ticks, and a
zero timeout to 1 tick.  This will always wait some nonzero
duration, but there is no positive lower bound on how short
that duration may be, since the next hardclock interrupt may
arrive at any moment: for a kernel running at hardclock
frequency f, the callout will fire somewhere in the interval
(0, (1 sec)/f] -- that is, at the _next_ hardclock interrupt,
not immediately.
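A userland sketch of that proposed mapping (ts_to_ticks_new is a
hypothetical name and hz=100 is assumed for illustration; this is
not the actual patch against tvtohz/tstohz):

```c
#include <assert.h>
#include <time.h>

/*
 * Hypothetical sketch of option 1: a strictly negative timeout
 * still maps to 0 ticks, but a zero timeout now maps to 1 tick, so
 * the callout fires at the next hardclock interrupt rather than
 * immediately.  Positive timeouts keep the >=2 ticks behaviour.
 */
static int
ts_to_ticks_new(struct timespec ts)
{
	const long long hz = 100;	/* assumed hardclock frequency */
	long long t;

	if (ts.tv_sec < 0 || (ts.tv_sec == 0 && ts.tv_nsec < 0))
		return 0;		/* deadline already passed */
	if (ts.tv_sec == 0 && ts.tv_nsec == 0)
		return 1;		/* zero timeout: next tick */
	t = ts.tv_sec * hz + (ts.tv_nsec * hz + 999999999LL) / 1000000000LL;
	return (int)(t + 1);
}
```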
2. Stop sampling now1 and use tstohz(next - now) instead.
This may drift a little, but the itimer transition logic
will account for that the next time it fires -- it may
report an overrun, but it always aims for times value +
k*interval for integer k.
It may also be a little cheaper, particularly once we adopt
high-resolution timers, for which we will really want the
more precise but costlier nano(up)time, which queries the
timecounter directly, rather than the coarse and cheap
getnano(up)time, which just reads the last hardclock
interrupt's cached copy of the timecounter.
This is what the v6 patch at https://gnats.NetBSD.org/59339
(https://mail-index.NetBSD.org/netbsd-bugs/2025/05/12/msg089000.html)
does (in addition to fixing PR kern/59691: tstohz(9) fails
to round up on some inputs).
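The aiming rule can be sketched with whole-second arithmetic
(next_expiry is a hypothetical name, and plain integers stand in
for the kernel's struct timespec logic):

```c
#include <assert.h>
#include <stdint.h>

/*
 * Simplified sketch of the aiming rule in option 2: the next
 * expiry is always value + k*interval for the smallest integer k
 * that puts the expiry strictly after `now', so any drift
 * introduced by scheduling with `now' instead of `now1' is
 * reported as overruns rather than accumulating in the target
 * times.
 */
static int64_t
next_expiry(int64_t value, int64_t interval, int64_t now)
{
	int64_t k;

	if (now < value)
		return value;		/* initial expiry still pending */
	k = (now - value) / interval + 1;
	return value + k * interval;
}
```

However late the timer fires, the targets stay on the grid value,
value + interval, value + 2*interval, ...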
3. Change callout_softclock to grab a batch of callouts and
process that before returning.
This way other softints at IPL_SOFTCLOCK have a chance to
run even if more callouts are scheduled while
callout_softclock is running: if a new callout is scheduled,
it will be handled in the _next_ batch.
We do this for other kinds of queued processing like
workqueues:
174 for (;;) {
175 struct workqhead tmp;
176
177 SIMPLEQ_INIT(&tmp);
178
179 while (SIMPLEQ_EMPTY(&q->q_queue_pending))
180 cv_wait(&q->q_cv, &q->q_mutex);
181 SIMPLEQ_CONCAT(&tmp, &q->q_queue_pending);
182 SIMPLEQ_INIT(&q->q_queue_pending);
183
184 /*
185 * Mark the queue as actively running a batch of work
186 * by setting the generation number odd.
187 */
188 q->q_gen |= 1;
189 mutex_exit(&q->q_mutex);
190
191 workqueue_runlist(wq, &tmp);
192
193 /*
194 * Notify workqueue_wait that we have completed a batch
195 * of work by incrementing the generation number.
196 */
197 mutex_enter(&q->q_mutex);
198 KASSERTMSG(q->q_gen & 1, "q=%p gen=%"PRIu64, q, q->q_gen);
199 q->q_gen++;
200 cv_broadcast(&q->q_cv);
201 }
https://nxr.netbsd.org/xref/src/sys/kern/subr_workqueue.c?r=1.48#174
For callout(9), the operation corresponding to
SIMPLEQ_CONCAT is called CIRCQ_APPEND (with arguments
reversed).
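The same steal-the-queue pattern can be sketched in userland with
an ordinary mutex and a hand-rolled list (all names hypothetical;
the real callout wheel and CIRCQ machinery are more involved):

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical minimal stand-in for the pending-callout queue. */
struct item {
	struct item	*next;
	void		(*func)(struct item *);
};

static struct item *pending;		/* LIFO order, for brevity */
static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static int ran;				/* work counter for the demo */

static void
bump(struct item *it)
{
	(void)it;
	ran++;
}

static void
enqueue(struct item *it)
{
	pthread_mutex_lock(&lock);
	it->next = pending;
	pending = it;
	pthread_mutex_unlock(&lock);
}

/*
 * Process exactly the items that were pending on entry, like the
 * workqueue loop quoted above: steal the whole list under the
 * lock, drop the lock, then run the batch.  Anything queued while
 * the batch runs lands on `pending' again and waits for the _next_
 * invocation, so this function terminates even under a steady
 * stream of new work.
 */
static void
drain_batch(void)
{
	struct item *batch, *it;

	pthread_mutex_lock(&lock);
	batch = pending;
	pending = NULL;
	pthread_mutex_unlock(&lock);

	while ((it = batch) != NULL) {
		batch = it->next;
		(*it->func)(it);
	}
}
```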