Subject: Re: Data corruption with dump (mmap related??)
To: Chris G. Demetriou <cgd@sibyte.com>
From: Wayne Knowles <w.knowles@niwa.cri.nz>
List: port-mips
Date: 08/28/2000 09:52:23
On 27 Aug 2000, Chris G. Demetriou wrote:

> 
> in _setup, in the DMA_PULLUP case you write data to the FIFO, and in
> the !DMA_PULLUP case you read it.
> 
> in _intr, in the DMA_PULLUP case you again write data to the FIFO,
> and in the !DMA_PULLUP case you whack the fifo.
> 
> for the DMA_PULLUP case, you're basically filling the FIFO with memory
> data which it will then DMA back into memory?
> 
> I don't think I understand what the !DMA_PULLUP case in _setup is
> doing, then.

Ok - perhaps a 1980s ASCII art illustration will help:

The vertical bars separate the 64-byte blocks that we can DMA.  The DMA
must start on a 64-byte boundary, and has to be a multiple of 64 bytes in
length.

   0               64             128             192             256
   |===============|===============|===============|===============|
   <----dma1-------><-----dma2-----><-----dma3-----><-----dma4----->
           <-------------Segment to DMA--------------------->
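
The block arithmetic in the picture can be sketched in C.  This is only
an illustration, not code from the driver: struct dma_span, dma_span_for()
and the DMA_BLOCK macro are hypothetical names, and the addresses are
plain integers rather than real bus addresses.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DMA_BLOCK 64u   /* DMA block size, from the diagram above */

/* Hypothetical helper type; nothing like it is claimed to be in the driver. */
struct dma_span {
	uint32_t first_block;   /* start of the 64-byte block holding the first byte */
	uint32_t head_pad;      /* bytes in the first block before the segment */
	uint32_t tail_pad;      /* bytes after the segment needed to fill the last block */
	uint32_t nblocks;       /* number of 64-byte blocks the DMA has to move */
};

/* Work out which whole 64-byte blocks cover the segment [addr, addr + len). */
static struct dma_span
dma_span_for(uint32_t addr, uint32_t len)
{
	struct dma_span s;
	uint32_t end;

	assert(len > 0);
	end = addr + len;

	s.first_block = addr & ~(DMA_BLOCK - 1);        /* round down to a block */
	s.head_pad = addr - s.first_block;
	s.tail_pad = (DMA_BLOCK - (end & (DMA_BLOCK - 1))) & (DMA_BLOCK - 1);
	s.nblocks = (end + s.tail_pad - s.first_block) / DMA_BLOCK;
	return s;
}
```

For a segment starting at byte 32 with a length of 200 bytes (roughly the
one drawn above), this gives head_pad = 32, tail_pad = 24 and nblocks = 4.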
            

At the start of the DMA (in the _setup stage) we need to bump the FIFO up
to the correct location.  Here is how it is done:

+  If it is a SCSI read (DMA_PULLUP) we prime the FIFO with the contents
   of memory that the DMA will finally end up writing to.

        Memory:    |====================================|
           
			 || (copy memory to prime fifo)
			 \/
        FIFO:      |===============_____________________|
                                   ^
                                dma_fifo ptr

       After _setup is called the ncr53c9x driver sets up the SCSI
       transfer, which allows data to go into the FIFO


        FIFO:      |===============XXXXXXXXXXXXXXXXXXXXXX| 
                          ^                  ^
                    Copy of memory       Data from SCSI 

   
        Once the FIFO is full it is flushed to main memory.  The memory
        that caused the alignment problem is not changed - just written
        with a copy of itself.   Given this all happens at splbio()
        level there isn't much that can change it.

+   For the SCSI Write case (!DMA_PULLUP) we do similar tricks.  We force
    the first block to be loaded by the dma controller and manually
    suck out the bytes until we have the byte that is the first byte of
    our transfer.  
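
The two _setup cases above can be modelled in software.  The sketch below
is a toy model under stated assumptions, not the driver code: the FIFO is
a plain array instead of a hardware register, and struct fifo, fifo_push(),
fifo_pop(), fifo_flush(), prime_for_read() and drain_for_write() are all
hypothetical names.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define DMA_BLOCK 64

/* Toy model of the 64-byte FIFO; the real one is a hardware register. */
struct fifo {
	uint8_t buf[DMA_BLOCK];
	unsigned head;          /* next byte to take out */
	unsigned tail;          /* next free slot */
};

static void
fifo_push(struct fifo *f, uint8_t b)
{
	assert(f->tail < DMA_BLOCK);
	f->buf[f->tail++] = b;
}

static uint8_t
fifo_pop(struct fifo *f)
{
	assert(f->head < f->tail);
	return f->buf[f->head++];
}

/* A full FIFO is written back to memory as one whole block. */
static void
fifo_flush(struct fifo *f, uint8_t *block)
{
	assert(f->head == 0 && f->tail == DMA_BLOCK);
	memcpy(block, f->buf, DMA_BLOCK);
	f->tail = 0;
}

/* SCSI read: prime the FIFO with what memory already holds before the segment. */
static void
prime_for_read(struct fifo *f, const uint8_t *block, unsigned head_pad)
{
	unsigned i;

	for (i = 0; i < head_pad; i++)
		fifo_push(f, block[i]);
}

/* SCSI write: load the whole first block, then throw away the leading bytes. */
static void
drain_for_write(struct fifo *f, const uint8_t *block, unsigned head_pad)
{
	unsigned i;

	memcpy(f->buf, block, DMA_BLOCK);       /* stands in for the forced DMA load */
	f->head = 0;
	f->tail = DMA_BLOCK;
	for (i = 0; i < head_pad; i++)
		(void)fifo_pop(f);              /* manually suck out the unwanted bytes */
}
```

In the read case, pushing the SCSI data after prime_for_read() and then
calling fifo_flush() leaves the first head_pad bytes of the block holding
the same values they had before.  In the write case, the first fifo_pop()
after drain_for_write() returns the first byte of the transfer.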

Once the alignment issue is sorted out for the first DMA block of 64 bytes
we can power on until the last block.
We then use similar tricks to fill up the FIFO to reach 64 bytes and force
a flush - this is done at the _intr stage when the transfer is complete.
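
The end-of-transfer step for a SCSI read can be sketched the same way.
This is a guess at the shape of it rather than the driver's code: it
assumes the padding bytes are copies of what memory already holds after
the segment (by analogy with the priming in _setup), and
pad_and_flush_last_block() is a hypothetical name.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define DMA_BLOCK 64

/*
 * block:     the last 64-byte block in memory
 * scsi_data: the bytes the SCSI transfer left in the FIFO
 * valid:     how many of those bytes there are (1..64)
 *
 * Returns the number of padding bytes that were added.
 */
static unsigned
pad_and_flush_last_block(uint8_t *block, const uint8_t *scsi_data, unsigned valid)
{
	uint8_t fifo[DMA_BLOCK];        /* toy stand-in for the hardware FIFO */
	unsigned pad;

	assert(valid > 0 && valid <= DMA_BLOCK);
	pad = DMA_BLOCK - valid;

	memcpy(fifo, scsi_data, valid);         /* data from SCSI */
	memcpy(fifo + valid, block + valid, pad);       /* assumed: pad with memory's own bytes */
	memcpy(block, fifo, DMA_BLOCK);         /* the forced flush of a full block */
	return pad;
}
```

With 40 valid bytes in the last block, 24 padding bytes are added and the
final 24 bytes of the block in memory come out unchanged.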

Pretty smart tactics are required for brain dead DMA :-)

> (BTW, the comment "XXX - disable DMA xfer before flushing FIFO ?" is
> kinda scary -- esp. since you aren't flushing it here, you're filling
> it as far as I can tell.  8-)

That is me thinking out loud (we must all do it :-)  The technical
reference manual says one must disable the DMA when stuffing the FIFO
register.  The only problem is the FIFO register is disabled if we disable
DMA!! (I think the spec was written before the first driver was.)

It is safe to do it at that stage since the NCR 53c94 controller has
reached terminal count - that is what I keep telling myself when I see
that comment, but I left it there as a reminder. 

> right, so, this got back to the question of which operation actually
> causes the read to fill the buffer with the bad contents.
> 
> If it was read from a raw device, the device is the userland buffer
> would be filled directly by device DMA.

Ahhhh- that's what I've been looking for!!!!!  If you are ever in
Wellington I owe you beer and pizza.

That was what I was trying to find out in all my unix books and sources
yesterday, but couldn't find that one simple sentence.

I can now see where you are coming from.   Yes, you were pointing me
back in the right direction.  There is a chance the last FIFO flush didn't
happen on the last block - but I think that is less likely.

Looks like the problem is somewhere around bus_dmamap_load where it builds
the DMA segments for DMA chaining.   IIRC, the same bus_space.c code is
shared between mipsco, pmax, hpcmips and arc ports.   Hopefully that is
why pmax is also infected.

Will check this out tonight.

Wayne
-- 
  _____	   	Wayne Knowles,  Systems Manager
 / o   \/   	National Institute of Water & Atmospheric Research Ltd
 \/  v /\   	P.O. Box 14-901 Kilbirnie, Wellington, NEW ZEALAND
  `---'     	Email:   w.knowles@niwa.cri.nz