Subject: Re: SYSCALL_DEBUG
To: Mark Abene <phiber@radicalmedia.com>
From: Wayne Knowles <wdk@frc.niwa.cri.nz>
List: port-arc
Date: 02/13/2001 23:36:20
On Mon, 12 Feb 2001, Mark Abene wrote:

> > Mark Abene wrote:
> > 
> > > OK, so I enabled SYSCALL_DEBUG in netbsd-1.5R (I had to fix arch/mips/mips/

Hi Mark,

This is sometimes called a Heisenbug, after the well-known quantum
physics principle :-)

I have seen this several times, but there are many possible causes (from
compiler bugs to cache coherency bugs to changes in timing
characteristics).

> process to get so much further on.  I'm thinking along the lines of a memory
> corruption problem (though it doesn't seem evident from my debug output),
> or a possible dma problem causing the semi-random hangs, or even a TLB
> problem, though I'm not really sure what the TLB map is SUPPOSED to look like
> as compared to when I hang and do a "machine tlb" in DDB.
> I'd be hesitant to think there was a L2 cache problem, as I've gone over that
> code time and again in locore_mips3.S, and it's just too straightforward.
> The only oddity along these lines is if I set L2CachePresent to "0", I get
> a "TLB out of universe" panic right after autoconfig.

You cannot set L2CachePresent = 0 to disable the secondary cache.  All
that does is tell NetBSD that there is no secondary cache, which changes
the cache flush semantics to flush only the size of the smaller L1 cache
(from a brief glance at the sources).

FYI, most of the time you can declare the cache size larger than what is
physically present and it will just result in a performance impact, but
you can never declare it smaller.
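
To illustrate the point about declared versus physical cache size, here
is a toy model in C.  It is only a sketch and not the NetBSD flush code:
the names (toy_cache, toy_flush), the 32-byte line size, and the
assumption that flush indices simply wrap around the physical cache are
all made up for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_LINE_SIZE 32u    /* bytes per cache line (assumed for the model) */
#define TOY_MAX_LINES 1024u  /* upper bound on lines in the toy cache */

/* Toy cache: one dirty flag per physical line. */
struct toy_cache {
    size_t phys_lines;
    unsigned char dirty[TOY_MAX_LINES];
};

/* Mark every physical line dirty, as if the cache were full of
 * modified data. */
void toy_dirty_all(struct toy_cache *c, size_t phys_lines)
{
    assert(phys_lines > 0 && phys_lines <= TOY_MAX_LINES);
    c->phys_lines = phys_lines;
    memset(c->dirty, 0, sizeof c->dirty);
    memset(c->dirty, 1, phys_lines);
}

/* Flush by walking the size the OS *believes* the cache has.
 * Indices wrap at the physical size (a modelling assumption), so a
 * too-large declared size only repeats work.  Returns the number of
 * lines left dirty, i.e. stale data that was never written back. */
size_t toy_flush(struct toy_cache *c, size_t declared_bytes)
{
    size_t declared_lines = declared_bytes / TOY_LINE_SIZE;
    size_t still_dirty = 0;

    for (size_t i = 0; i < declared_lines; i++)
        c->dirty[i % c->phys_lines] = 0;
    for (size_t i = 0; i < c->phys_lines; i++)
        still_dirty += c->dirty[i];
    return still_dirty;
}
```

With a 256-line cache, flushing the true size or twice the true size
leaves nothing dirty (the latter just loops twice as long), while
flushing only 64 lines' worth leaves the rest of the cache holding
stale data.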

A brief glance at my various R4000 user manuals and programming book
doesn't reveal an "L2 Cache disable" bit in the System co-processor
registers.  I can see how to do it in the hardware initialization
sequence by setting NoSCMode=0, but that is a low level hardware thing
that only happens at CPU reset.  There appears to be no way of changing
it afterwards.

Basically, by disabling cache flushing between context switches you
have stale data in the L2 cache, which gives you the "TLB out of
universe" error(s).

Unfortunately for you, even though the R4000 series has a cache error
register, NetBSD makes no use of it!  If you did have a problem in that
area, it may not be reported to you in any form.

To me it sounds more like your serial UART is missing interrupts, or
there is some problem with the interrupt handling.  By sending out the
SYSCALL_DEBUG data you are effectively priming the UART again.

What could be happening in hardware is that the UART interrupt is being
shared with some other hardware device.  When an interrupt occurs from
both devices simultaneously, only one is being serviced, leaving the
UART without data.  The UART thinks it has requested the interrupt, but
the OS thinks it is still waiting for the output.
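
The shared-interrupt scenario can be sketched with a small simulation.
This is a guess at the failure mode, not the actual ARC or NetBSD
interrupt code; toy_dev and both dispatch functions are hypothetical
names invented for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* A device hanging off a shared interrupt line. */
struct toy_dev {
    bool pending;       /* device has raised its request */
    unsigned serviced;  /* how many times its handler ran */
};

/* Run the device's handler if it is requesting service. */
static bool toy_service(struct toy_dev *d)
{
    if (!d->pending)
        return false;
    d->pending = false;
    d->serviced++;
    return true;
}

/* Suspected failure mode: stop after the first device that claims the
 * interrupt.  A second device that raised its request at the same
 * time is left pending and is never serviced. */
size_t toy_dispatch_first_only(struct toy_dev *devs, size_t n)
{
    assert(devs != NULL);
    for (size_t i = 0; i < n; i++)
        if (toy_service(&devs[i]))
            return 1;
    return 0;
}

/* Safer variant: give every device on the line a chance to claim it. */
size_t toy_dispatch_all(struct toy_dev *devs, size_t n)
{
    size_t handled = 0;

    assert(devs != NULL);
    for (size_t i = 0; i < n; i++)
        if (toy_service(&devs[i]))
            handled++;
    return handled;
}
```

If both devices are pending when the line fires, the first-only
dispatcher services one and returns, and the other (the UART, in this
theory) sits there with its request outstanding until something else
pokes it.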

Console output from the OS polls the UART (I haven't had time to check
the ARC sources, but they could do the same), and in the time it writes
the SYSCALL_DEBUG strings it kicks the UART into action.
You could also have the UART not being serviced at interrupt level, so
that it ends up in an endless loop in the interrupt handler.
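
For reference, polled output on a 16550-style UART looks roughly like
the sketch below.  The register offsets (THR at 0, LSR at 5) and the
THRE bit (0x20) are the standard 16550 ones; the byte-wide register
spacing, the function name toy_polled_putc, and the spin limit are
assumptions for the example and are not taken from the NetBSD com
driver.

```c
#include <assert.h>
#include <stdint.h>

#define UART_THR      0     /* transmit holding register (write) */
#define UART_LSR      5     /* line status register */
#define UART_LSR_THRE 0x20  /* transmit holding register empty */

/* Write one character without relying on the transmit interrupt:
 * spin until the UART reports the holding register empty, then write
 * the byte.  The spin count is bounded so a dead UART cannot hang the
 * caller.  Returns 0 on success, -1 on timeout. */
int toy_polled_putc(volatile uint8_t *regs, uint8_t c, unsigned max_spins)
{
    assert(regs != 0);
    while ((regs[UART_LSR] & UART_LSR_THRE) == 0) {
        if (max_spins-- == 0)
            return -1;
    }
    regs[UART_THR] = c;
    return 0;
}
```

Because this path writes to the transmitter directly, it works even
when the transmit interrupt has been lost, which would fit with the
debug output getting the UART moving again.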

Perhaps your machine is just having a few problems telling you it is all
up and running.  I'm assuming, from what little I know of the ARC
machines, that they run on a graphics console and not over a serial
line, so you may be attempting something new.

One hint from the dmesg output:

  com0 at jazzio0 addr 0xe0006000 intr 9: ns16550a, working fifo
  com0: txfifo disabled
  com0: console

To me 'txfifo disabled' is not a good thing...

Good luck!

Wayne
-- 
  _____	   	Wayne Knowles,  Systems Manager
 / o   \/   	National Institute of Water & Atmospheric Research Ltd
 \/  v /\   	P.O. Box 14-901 Kilbirnie, Wellington, NEW ZEALAND
  `---'     	Email:   w.knowles@niwa.cri.nz