port-sparc: LX vs SS20?

Subject: LX vs SS20?
To: None <port-sparc@netbsd.org>
From: der Mouse <mouse@Rodents.Montreal.QC.CA>
List: port-sparc
Date: 08/13/2003 01:53:28

What are the differences between an LX and an SS20?

Specifically, on these two machines:

mainbus0 (root): SUNW,SPARCstation-LX
cpu0 at mainbus0: TMS390S10 @ 50 MHz, on-chip FPU
cpu0: physical 4K instruction (32 b/l), 2K data (16 b/l): cache enabled

mainbus0 (root): SUNW,SPARCstation-20
cpu0 at mainbus0: TMS390Z50 v0 or TMS390Z55 @ 75 MHz, on-chip FPU
cpu0: physical 20K instruction (64 b/l), 16K data (32 b/l), 1024K external (32 b/l): cache enabled

I have a pseudo-driver in the kernel that, among other things, sends
data between processes, uiomove()ing it from userland in one process
into a kernel buffer, then copying it out of that buffer in another
process.

This all works perfectly on the LX. But on the SS20, the process
receiving the data gets a bufferful of 0x00.

It's not that the writer process is writing the wrong thing. Besides
the userland code being identical, I ktraced it, and it is indeed
writing the correct data. And the pseudo-driver source is identical on
both machines, and both machines are running freshly built kernels.

I can only conclude that there must be something in the nature of an
inter-process cache coherency issue involved. (There is no possibility
of inter-CPU coherency trouble; neither machine has more than one CPU.)
But I hate easter egging to try to "fix" such problems; unless I
actually understand what's wrong and why something fixes it, I don't
consider the `something' a real fix.

Thus my question: what relevant hardware difference could there be, and
how can I fix it right? Either I'm doing something wrong without
realizing it or the cache code in my kernel is buggy, it seems to me.

Details: The driver is a pseudo-disk driver. The "receiving" process
has performed a read on the disk; the code path ends up in a routine
that allocates a 516-byte buffer on the stack (an auto array of char).
The address of this buffer is saved in the softc. The "sending"
process is awoken and writes the data to the character special
interface; the write routine finds the pointer in the softc, uiomove()s
the data into the buffer, and kicks the receiving process. The
receiving process then bcopy()s the data out of the stack buffer and
carries on.
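
The handoff described above can be sketched in user space roughly as follows. This is not the driver source: the names `pd_softc`, `pd_read`, `pd_write` and `pd_demo` are hypothetical, POSIX threads stand in for the two processes, and memcpy() stands in for uiomove() and bcopy(). Threads share one address space, so this should not be expected to reproduce the SS20 symptom; it only pins down the order of operations being asked about.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <string.h>

#define XFER_SIZE 516   /* buffer size mentioned in the post */

/* Hypothetical stand-in for the pseudo-driver's softc. */
struct pd_softc {
    pthread_mutex_t lock;
    pthread_cond_t  cv;
    char           *buf;     /* points into the receiver's stack frame */
    int             want;    /* receiver has posted a buffer */
    int             filled;  /* sender has filled the buffer */
};

/* Models the write routine: find the pointer in the softc, copy the
 * data in, and kick the receiver. */
static void pd_write(struct pd_softc *sc, const char *src, size_t len)
{
    assert(len <= XFER_SIZE);
    pthread_mutex_lock(&sc->lock);
    while (!sc->want)
        pthread_cond_wait(&sc->cv, &sc->lock);
    memcpy(sc->buf, src, len);          /* uiomove() in the real driver */
    sc->want = 0;
    sc->filled = 1;
    pthread_cond_broadcast(&sc->cv);
    pthread_mutex_unlock(&sc->lock);
}

/* Models the read path: auto buffer, address saved in the softc, wait
 * for the sender, then copy the data out of the stack buffer. */
static void pd_read(struct pd_softc *sc, char *dst)
{
    char stackbuf[XFER_SIZE];

    memset(stackbuf, 0, sizeof stackbuf);
    pthread_mutex_lock(&sc->lock);
    sc->buf = stackbuf;
    sc->filled = 0;
    sc->want = 1;
    pthread_cond_broadcast(&sc->cv);    /* wake the sender */
    while (!sc->filled)
        pthread_cond_wait(&sc->cv, &sc->lock);
    sc->buf = NULL;                     /* pointer dies with this frame */
    pthread_mutex_unlock(&sc->lock);
    memcpy(dst, stackbuf, sizeof stackbuf);   /* bcopy() in the real driver */
}

struct writer_arg {
    struct pd_softc *sc;
    const char      *src;
    size_t           len;
};

static void *writer_thread(void *p)
{
    struct writer_arg *wa = p;

    pd_write(wa->sc, wa->src, wa->len);
    return NULL;
}

/* Runs one transfer; returns 0 if the receiver saw the sender's bytes. */
int pd_demo(void)
{
    struct pd_softc sc;
    struct writer_arg wa;
    char pattern[XFER_SIZE], got[XFER_SIZE];
    pthread_t tid;
    size_t i;

    for (i = 0; i < sizeof pattern; i++)
        pattern[i] = (char)(i & 0xff);

    pthread_mutex_init(&sc.lock, NULL);
    pthread_cond_init(&sc.cv, NULL);
    sc.buf = NULL;
    sc.want = 0;
    sc.filled = 0;

    wa.sc = &sc;
    wa.src = pattern;
    wa.len = sizeof pattern;
    if (pthread_create(&tid, NULL, writer_thread, &wa) != 0)
        return -1;
    pd_read(&sc, got);
    pthread_join(tid, NULL);

    pthread_cond_destroy(&sc.cv);
    pthread_mutex_destroy(&sc.lock);
    return memcmp(got, pattern, sizeof got) == 0 ? 0 : -1;
}
```

Calling `pd_demo()` from a small main() and checking for a zero return exercises one transfer. The point of interest is that the sender stores through a pointer into the receiver's stack frame, which is the step that misbehaves on the SS20.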

And on the LX (and on my i386 machine, a K6-2), this works fine. But
on the SS20, the receiving process gets a bufferful of zeroes (or, as I
once saw, roughly the first 0x80 bytes of the actual data followed by
0x00 for the rest of the buffer).

Are SPARC D-caches virtually tagged? Is the kernel stack stored in
such a way that the cache tag for one process's kernel stack is
different from another's view of the same (kernel) virtual address? Is
there some simple way for MI code to push all dirty D-cache lines
applying to a given range of virtual addresses to main memory?

Where is the code responsible for doing any necessary cache flushing on
context switch? I'm willing to step through to see if any relevant
commits have been done since the code I'm using, but I need to know
what file(s) to look at.

/~\ The ASCII                           der Mouse
\ / Ribbon Campaign
 X  Against HTML               mouse@rodents.montreal.qc.ca
/ \ Email!           7D C8 61 52 5D E7 2D 39  4E F1 31 3E E8 B3 27 4B