Subject: Re: port-shark/22355 [was: Help needed to fix NetBSD/shark]
To: None <jmmv84@gmail.com>
From: Izumi Tsutsui <tsutsui@ceres.dti.ne.jp>
List: port-arm
Date: 08/06/2007 23:07:22
jmmv84@gmail.com wrote:

> >> So these patches are just hiding an underlying bug somewhere else by
> >> resetting the spl_mask after interrupt routines are called; however,
> >> the mask should have been reset by an splx call.
> >
> > The attached one also seems to fix the problem.
> > (use spl_masks[current_spl_level] rather than spl_mask in
> > irq_setmasks())
> 
> Indeed.  I've been running the machine under load (building some  
> packages) with your patch applied and it has been flawless.
> 
> The only thing that worries me (based on Chris' comments) is that  
> this might be hiding some other problem instead of fixing the real  
> root cause.  But I don't know enough to see if that's the case or not...

My guess is roughly the following scenario:
---

(1) in cpu_idle() (or other normal processes)
   -> current_spl_level =  0, spl_mask = 0xffffffff (spl_masks[SPL_NONE])
 
(2) clock interrupt occurs and irq_entry() is called
   -> current_spl_level = 10, spl_mask = 0xffffffff (spl_masks[SPL_NONE])

(3) clockintr() is called then splhigh() is called
   -> current_spl_level = 13, spl_mask = 0xffff2c45 (spl_masks[SPL_HIGH]) 
 
(4) splx(s) is called 
   -> current_spl_level = 10, spl_mask = 0xffff2c5d (spl_masks[SPL_CLOCK])
 
(5) returns from clockintr() to irq_entry()
   -> current_spl_level =  0, spl_mask = 0xffff2c5d (spl_masks[SPL_CLOCK])
 
(6) irq_setmasks() is called
   -> disabled_mask is updated per spl_mask (SPL_CLOCK here)

(7) returns from irq_entry() into cpu_idle()
   -> current_spl_level =  0, but clock and lower interrupts are disabled

---
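
To make the scenario above concrete, here is a minimal user-space model
of steps (1)-(7). It is only a sketch, not the kernel code: the function
names with a _model suffix, the array size, the 0xffffffff filler for
levels the scenario does not mention, and the assumption that
disabled_mask is simply the complement of spl_mask are all mine. Only
the three mask values and the level numbers come from the steps above.

```c
#include <assert.h>
#include <stdint.h>

/* Level numbers as they appear in the scenario. */
enum { SPL_NONE = 0, SPL_CLOCK = 10, SPL_HIGH = 13, NLEVELS = 14 };

uint32_t spl_masks[NLEVELS];
int      current_spl_level;
uint32_t spl_mask;
uint32_t disabled_mask;

void model_init(void)
{
    /* Filler for levels the scenario does not mention (assumption). */
    for (int i = 0; i < NLEVELS; i++)
        spl_masks[i] = 0xffffffffu;
    spl_masks[SPL_CLOCK] = 0xffff2c5du;
    spl_masks[SPL_HIGH]  = 0xffff2c45u;

    current_spl_level = SPL_NONE;
    spl_mask = spl_masks[SPL_NONE];
    disabled_mask = 0;
}

/* Models both splfoo() and splx(): sets level and mask together. */
int setspl_model(int level)
{
    int old = current_spl_level;

    assert(level >= 0 && level < NLEVELS);
    current_spl_level = level;
    spl_mask = spl_masks[level];
    return old;
}

void clockintr_model(void)
{
    int s = setspl_model(SPL_HIGH);   /* (3) splhigh() */
    (void)setspl_model(s);            /* (4) splx(s): back to SPL_CLOCK */
}

/* Assumption: everything not enabled in spl_mask ends up disabled. */
void irq_setmasks_model(void)
{
    disabled_mask = ~spl_mask;
}

uint32_t simulate_clock_irq(void)
{
    model_init();                     /* (1) idle at SPL_NONE */
    int saved = current_spl_level;
    current_spl_level = SPL_CLOCK;    /* (2) level raised, spl_mask untouched */
    clockintr_model();                /* (3), (4) */
    current_spl_level = saved;        /* (5) level restored, spl_mask is not */
    irq_setmasks_model();             /* (6) uses the stale spl_mask */
    return disabled_mask;             /* (7) nonzero while level is SPL_NONE */
}
```

In this model simulate_clock_irq() leaves current_spl_level at SPL_NONE
while disabled_mask is nonzero, which is the inconsistent state
described in (7).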

With my first patch, spl_mask is restored properly between (5) and (6).

With my latter patch, spl_mask is no longer used in irq_setmasks(),
so interrupts are not blocked. The spl_mask variable itself stays stale
until a later splfoo()/splx(s) pair restores it before switching to
cpu_idle(). (In that case nothing refers to spl_mask in the meantime anyway.)
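
The difference between the two patches can be sketched the same way.
Again this is a hypothetical user-space model, not the actual diffs:
the enum, the function name and the 0xffffffff filler are made up for
illustration, and the starting state is the one at step (5).

```c
#include <assert.h>
#include <stdint.h>

enum { SPL_NONE = 0, SPL_CLOCK = 10, NLEVELS = 14 };

enum fix {
    FIX_NONE,            /* unpatched behaviour */
    FIX_RESTORE_MASK,    /* first patch: restore spl_mask between (5) and (6) */
    FIX_INDEX_BY_LEVEL   /* latter patch: index spl_masks[] by current level */
};

/*
 * Returns the mask that irq_setmasks() would act on at step (6) and
 * stores the value spl_mask holds afterwards in *spl_mask_after.
 */
uint32_t mask_used_at_step6(enum fix fix, uint32_t *spl_mask_after)
{
    uint32_t spl_masks[NLEVELS];
    uint32_t used;

    /* Filler for levels the scenario does not mention (assumption). */
    for (int i = 0; i < NLEVELS; i++)
        spl_masks[i] = 0xffffffffu;
    spl_masks[SPL_CLOCK] = 0xffff2c5du;

    /* State at step (5): level restored, spl_mask still stale. */
    int current_spl_level = SPL_NONE;
    uint32_t spl_mask = spl_masks[SPL_CLOCK];

    assert(current_spl_level >= 0 && current_spl_level < NLEVELS);

    switch (fix) {
    case FIX_RESTORE_MASK:
        spl_mask = spl_masks[current_spl_level];   /* variable repaired */
        used = spl_mask;
        break;
    case FIX_INDEX_BY_LEVEL:
        used = spl_masks[current_spl_level];       /* variable ignored */
        break;
    case FIX_NONE:
    default:
        used = spl_mask;                           /* stale SPL_CLOCK mask */
        break;
    }

    *spl_mask_after = spl_mask;
    return used;
}
```

Both patched variants act on spl_masks[SPL_NONE] at step (6); only the
first one also leaves spl_mask itself corrected.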

I'm not sure why your first patch doesn't work though.
---
Izumi Tsutsui