port-arm: Race condition in generic soft interrupts code?

Subject: Race condition in generic soft interrupts code?
To: None <port-arm@netbsd.org, thorpej@wasabisystems.com>
From: Richard Earnshaw <rearnsha@arm.com>
List: port-arm
Date: 07/14/2003 22:50:08
I'm posting this here because I believe this problem may affect (at least 
in theory) all platforms based on Jason's generic soft-interrupts 
implementation for the ARM.

In trying to port the generic soft-ints code to the Integrator board I've 
been running into a race condition that can lead to particular interrupts 
becoming permanently blocked, even though the SPL level is 0 and the 
processor is sitting in the idle loop.  The main symptom is that 
XXX_ipending has a bit set, and the corresponding bit in intr_enabled is 
zero, so the interrupt is permanently blocked.  Here is how I think this 
can happen:

A high-priority interrupt (high SPL) occurs (lets say it's statclock)

The code enters XXX_intr_dispatch and blocks that interrupt. Eg:

	hwpend = XXX_iintsrc_read();

	/*
	 * Disable all the interrupts that are pending.  We will
	 * reenable them once they are processed and not masked.
	 */
	intr_enabled &= ~hwpend;
	XXX_set_intrmask();

It then enters the loop below that scans over hwpend bits.  While 
processing the statclock interrupt the SPL is effectively raised to 
IPL_STAT_CLOCK with

		current_spl_level |= iq->iq_mask;
		oldirqstate = enable_interrupts(I32_bit);

Interrupts are then enabled and the statclock handler code is called:

		for (ih = TAILQ_FIRST(&iq->iq_list); ih != NULL;
		     ih = TAILQ_NEXT(ih, ih_list)) {
			(void) (*ih->ih_func)(ih->ih_arg ? ih->ih_arg : frame);
		}

Now while this is being processed, another lower-priority interrupt occurs 
(say hardclock):  We again enter XXX_intr_dispatch and block that source 
of interrupt as before.  This time, however, we find that
	(pcpl & ibit)
is true for hardclock (since IPL_CLOCK < IPL_STAT_CLOCK), so we fall into 
the code

		if (pcpl & ibit) {
			/*
			 * IRQ is masked; mark it as pending and check
			 * the next one.  Note: the IRQ is already disabled.
			 */
			ifpga_ipending |= ibit;
			continue;
		}

and then there is nothing more to do (so we return to the previous 
interrupt invocation.

We now complete the handling of the statclock interrupt and restore the 
IPL level, re-enable the statclock interrupt and exit the loop (there was 
only one bit set in hwpend, remember):

		current_spl_level = pcpl;

		/* Re-enable this interrupt now that's it's cleared. */
		intr_enabled |= ibit;
		ifpga_set_intrmask();

At no time do we check for pending interrupts that need to be handled at 
this level.

It seems that the fix for this is trivial, simply insert the following 
line in the code that cleans up after running the handler list:

		restore_interrupts(oldirqstate);
		current_spl_level = pcpl;

+		hwpend |= (XXX_ipending & XXX_INTR_HWMASK) & ~pcpl;

		/* Re-enable this interrupt now that's it's cleared. */
		intr_enabled |= ibit;
		XXX_set_intrmask();

and this will cause the interrupt that was deferred to be handled in the 
current loop before returning.

This patch was certainly sufficient to correct the problem I've been 
encountering (where the main clock interrupt was getting completely lost, 
causing the system to hang during scsi-bus probing -- no clock, so no 
timeout).

Jason, does this look correct to you?  If so, then all ports derived from 
your code will need patching.

R.