Subject: Re: Can BSD be made to do a hard realtime tasks?
To: Warren Postma <warren.postma@sympatico.ca>
From: Richard Earnshaw <rearnsha@buzzard.freeserve.co.uk>
List: port-arm
Date: 09/28/2002 10:12:23
> I am evaluating a lot of RTOS options for a 200 MHz StrongARM embedded
> single board custom system.
> 
> I'd like to use BSD, as it's elegant, and I could build a multi-threaded
> multi-process system with full MMU Support
> (memory protection).
> 
> I have evaluated eCOS but it comes without MMU support, and they're still
> working on porting the beautiful BSD TCP/IP code to eCOS.
> 
> The hitch is my real-time requirement. Not all threads, or even most
> threads, have realtime consequences, but I have one tricky feature.
> 
> I need a kernel-space interrupt service routine, and the ability to have the
> ISR activate a thread/process that is non-interruptable until it completes
> its work.  Also, the high-priority thread/process must be activated less
> than 8 uSec from when the IRQ comes in.

Hmm, that's a bit tight.  At 200MHz you could execute a theoretical 
maximum of 1600 instructions in that time, but the real figure, once 
worst-case considerations are taken into account, will be significantly 
lower.  Let's assume that the main memory system is running at 66MHz.

1) We must allow time for the executing instruction to complete; it could 
be an LDM/STM to NCNB (non-cacheable, non-bufferable) I/O memory.  If I/O 
space is only running at half memory speed, then that could take up to 
120ns per register (I'm guessing that would be about 2 bus cycles per 
register).  Let's say such an access never exceeds 4 registers; then we've 
lost ~0.5us before we can respond (this is assuming interrupts are not 
blocked -- more on this later).

2) StrongARM doesn't have TLB or cache lock-down, so we can't guarantee 
that the vector page will be in either the TLB or the cache.  Taking the 
interrupt might therefore require both a TLB walk (maybe even two, since 
it's a two-level lookup) and cache-line fills.  That means that our 1600 
instruction limit has probably shrunk to much closer to 500 (i.e. we are 
likely to be executing most instructions in the interrupt handler at 
closer to memory-bus speed).  The CPI of StrongARM averages out at about 
1.4 IIRC, so real instructions are likely to be 350 -- and even that is 
assuming a memory system that can return one word per bus cycle.
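
The arithmetic in points 1 and 2 can be sketched roughly as follows.  
This is only a back-of-envelope calculation using the figures quoted 
above; the variable names are made up for illustration, and subtracting 
the LDM/STM stall before scaling to bus speed is one plausible reading 
of how the two steps combine, not a measured result.

```python
# Rough interrupt-latency budget; all inputs are the guesses from the text.
DEADLINE_NS = 8000      # 8us from IRQ to thread activation
CPU_MHZ = 200           # StrongARM core clock
BUS_MHZ = 66            # assumed main memory bus clock
IO_NS_PER_REG = 120     # guessed cost per register for LDM/STM to NCNB I/O
MAX_REGS = 4            # assumed longest LDM/STM to I/O space
CPI_TENTHS = 14         # average CPI of about 1.4, kept as an integer

def cycles(window_ns, mhz):
    """Whole clock cycles available in window_ns at the given clock."""
    return window_ns * mhz // 1000

# Theoretical ceiling: one instruction per core clock for the full 8us.
theoretical = cycles(DEADLINE_NS, CPU_MHZ)

# Step 1: wait for the instruction in flight to finish.
ldm_stall_ns = IO_NS_PER_REG * MAX_REGS
remaining_ns = DEADLINE_NS - ldm_stall_ns

# Step 2: with cold TLB and cache, run at roughly memory-bus speed.
bus_bound = cycles(remaining_ns, BUS_MHZ)

# Apply the average CPI to get a count of real instructions.
realistic = bus_bound * 10 // CPI_TENTHS

print(theoretical, ldm_stall_ns, bus_bound, realistic)
```

With these inputs the numbers come out at 1600, 480ns, 496 and 354, 
which is in line with the "closer to 500" and "350" estimates above.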

Now I'm almost certain that the generic interrupt handler in NetBSD/arm32 
won't be able to dispatch in that sort of time frame, since there's no 
hardware to vector interrupt dispatch, or any hardware assistance for 
priority encoding.

So, the only option would be to make use of a FIQ handler and to route 
your high priority interrupts there; problem is that NetBSD doesn't have 
any support for context switching on the FIQ -- the FIQ is really designed 
for a single very high priority interrupt source that can be serviced 
quickly and then normal operation resumed.

Finally, if you need to activate a user-mode process/thread in response to 
your interrupt, then I think there is no chance of making it work.  
StrongARM caches are virtually indexed, so they have to be completely 
flushed on each context switch.  That's up to 16K of data to be written 
out of the D cache -- a worst-case scenario of 4K bus cycles (assuming 1 
cycle per word @66MHz that's 62us!).
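
For what it's worth, the flush estimate can be checked with the same 
sort of sketch.  It assumes, as above, a fully dirty 16K D cache, 32-bit 
words, one word written per bus cycle and a 66MHz bus; the names are 
illustrative only.

```python
# Worst-case D-cache write-back time on a context switch (rough estimate).
DCACHE_BYTES = 16 * 1024   # StrongARM D cache size quoted above
WORD_BYTES = 4             # 32-bit words
BUS_MHZ = 66               # assumed memory bus clock
DEADLINE_US = 8            # the stated real-time requirement

# Every word in the cache is assumed dirty and costs one bus cycle.
flush_cycles = DCACHE_BYTES // WORD_BYTES

# Cycles divided by MHz gives microseconds.
flush_us = flush_cycles / BUS_MHZ

print(flush_cycles, round(flush_us, 1), round(flush_us / DEADLINE_US, 1))
```

That gives 4096 bus cycles and roughly 62us, i.e. nearly eight times 
the whole 8us budget before any other work is done.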

> 
> I would like to have both user-space and kernel threads as well as be able
> to run a few "non-realtime" processes as separate Executables, and the Real
> Time component would be kernel-space (an interrupt service routine, and a
> kernel-space thread). So far, I haven't found anything to indicate that
> people have done anything like this before on NetBSD on StrongARM.
> 
> Can anyone tell me if they've managed to build a realtime system in embedded
> NetBSD on a fast StrongARM core?
> 

I think it would be pushing it given the constraints you've mentioned.  
You might like to look at some of the other ARM chips which do have better 
facilities for supporting these types of requirements.  ARM920, for 
example, has both TLB and cache lock-down capabilities, so you could 
lock the interrupt dispatch code into the cache.  It also has the 
capability to make the caches run in write-through mode, which means that 
a cache flush would be much less expensive in terms of latency (though of 
course, you sacrifice normal run-time performance for that benefit).  The 
main problem is that the ARM code in the NetBSD kernel doesn't yet support 
cache/TLB lock-down.

Hope that gives you some clues/pointers.

R.