Subject: Re: change bus_space_barrier() semantic
To: Garrett D'Amore <garrett_damore@tadpole.com>
From: Manuel Bouyer <bouyer@antioche.eu.org>
List: tech-kern
Date: 01/09/2006 22:46:47
On Mon, Jan 09, 2006 at 06:44:13AM -0800, Garrett D'Amore wrote:
> >With a write-thru cache, I think you can achieve read caching without
> >enabling write merging/reordering. Whenever this is desireable depends
> >on the device you're talking to.
> >For example, with ix(4) i think you could write to the buffer using PIO
> >and read it using the memory-mapped ram.
> > 
> >
> Seems like an unusual configuration to me.  But it does explain why we 
> have separate degrees of freedom for cacheable and prefetchable.
> 
> It is interesting to me that we assume read/write will be treated the 
> same in our mappings.  Should we allow for seperation of this?  Again, I 
> can't think of any concrete examples where caching (as an example) loads 
> vs. stores would be useful.  (I guess you could have buffer that has no 
> side effects for read, but which you can write to -- think a xmit buffer 
> that you wind up going and reexamining for some reason.  Seems a bit 
> contrived though.)

This specific example would mean configuring the cache write-though.
But I'm not sure we want this granularity. This interface is for MI drivers,
and these drivers will use a reasonable set of common functionnalities.


> 
> >>Okay.  Will you be fixing any implementation(s)? that suffer from this 
> >>defect?
> >>   
> >>
> >
> >I probably don't have the knowledges to fix this, but I'll look at it 
> >anyway
> > 
> >
> Okay, then I think we need to get more info, buy-in from those that do.  
> I'm willing to fix at least evbmips.  (I'd offer to do other MIPS ports, 
> but I only have evbmips hardware.)

I have sgimips systems.

> >>Accurately figuring out 
> >>barriers beyond that is likely to be somewhat painful for a large 
> >>variety of drivers.
> >>   
> >>
> >
> >Sure. But as most drivers are not using barrier at all right now, it's not
> >a problem.
> > 
> >
> Okay, but I still would like to see the convenience/compatibility macros 
> defined.

OK, no problems.

> >>I think the documentation will need to be updated.
> >>   
> >>
> >
> >Yes, this was the point which started this thread: the documentation and
> >implementation are not in sync :)
> > 
> >
> Apart from the finer grained values, a driver conforming to the 
> documentation is still "correct" (if underperforming) so its not as bad 
> you suggest.
> 
> My main point here is that if we're going to have implicit barriers, we 
> need to be very careful to document when explicit barriers must or may 
> not be used.

Sure.

> >
> >>It would be nice if 
> >>the name of the flag BUS_SPACE_MAP_PREFETCHABLE somehow reflected the 
> >>fact that barriers were explicit vs. implicit.
> >>   
> >>
> >
> >Hum, I can't think of a better one. Do you have any suggestions?
> >
> How about "BUS_SPACE_MAP_STRICT_ORDER", or conversely, 
> "BUS_SPACE_MAP_LAX_ORDER" (or something similar)?
> 
> Alternatively, "BUS_SPACE_MAP_BARRIER_EXPLICIT/IMPLICIT". I like this 
> less, because I think you still want barriers if BUS_SPACE_CACHEABLE is 
> used.

Yes. Hum, I'm not sure we really should change the name :)

> 
> >>Most device on MIPS plaforms are located in uncached memory space, so 
> >>the assumption is *usually* correct.
> >>
> >>Yet the barrier operation is non-null.  One could argue reasonably that 
> >>this is a bug in the mips implementation.  But until this bug is fixed, 
> >>your proposed changes will have a bad effect on performance, at least on 
> >>MIPS hardware.
> >>   
> >>
> >
> >So I guess the barrier should check if the bus_space handle used is
> >mapped in cached or non-cached memory.
> > 
> >
> Agreed, although there are challenges there -- owing to the fact that 
> most of this bus_space code is in MIPS "MI" code, but the details of 
> which addresses are cached or not is really MD.  (The fact that there 
> are over half a dozen different MIPS "ports" doesn't help.)

Well, bus_space should already know if the space was mapped
BUS_SPACE_CACHEABLE or not. This can be cached in the handle.

> >
> >>I wonder if other ports have similar issues.  I have not done a survey 
> >>of them.  Have you?
> >>   
> >>
> >
> >Only i386 and sparc64. On i386 no barrier are needed, on sparc64
> >none should be needed either, but I'm tracking down a bug in a driver
> >which could in fact be caused by write happening out of order.
> > 
> >
> I think it would be a good idea to look at more of them, then -- or at 
> least solicit input from the port masters.
> 
> Also, I'd think a barrier may be needed on some high end sparc64 
> hardware.  (I seem to recall that this was certainly true for Solaris 
> drivers with DMA mapped consistent -- you still had to do a 
> ddi_dma_sync().  I'm not sure whether PIO needed it or not.  The problem 
> involved flushing some buffers in the crossbar on SF15K class systems.  
> Not sure that NetBSD is ever run on that class of machine anyway, 
> though.  Not many developers/users have several hundred K $ to run a 
> NetBSD on it.)
> 
> As far as your particular bug, I think the first best fix is to add the 
> barrier to the driver, without waiting for resolution of the larger 
> issue here.  (Besides which, the change you propose is really too big to 
> be considered as a backport, but I think a small barrier fix to a given 
> driver would be reasonable to backport, if the need existed.)

No. Most current drivers assume implicit barriers at last when mapped with
no flags, and most bus_space implementations do. If barriers are needed
for this driver it's a bug in sparc64's bus_space implementation not
doing implicit barriers properly.

> >>   
> >>
> >
> >Yes, clearly. This kind of detail should not be known to the Mi device
> >drivers. If we strictly followed the bus_space man page, right now,
> >wi(4) and the ide drivers should use barrier in the MI part of the
> >code.
> > 
> >
> Ok.  So can we define "fixing" the MIPS port(s) as a requirement for 
> this change, please.  Otherwise performance on many platforms will be 
> adversely impacted.  (At this point I'm not arguing that the current 
> code is incorrect, only that it functions reasonably well.)

I didn't look closely at the mips port yet, but I guess my change doesn't
require an immediate change in the mips's bus_space. The fact that drivers
works properly on mips shows that the implicit barriers are already there,
or that the space are mapped in a way so that no barriers are needed (I guess
it's the second case here).

> >>But it doesn't, and the end result of your changes will 
> >>seriously impact performance of these devices on (probably) all MIPS 
> >>platforms.  I suspect that other ports may have similar issues, but I 
> >>have not checked.
> >>   
> >>
> >
> >I didn't either. but bus_space read/write is already doing the right thing
> >or most driver wouldn't work properly. I think the mips port is a special
> >case, and I suspect it can be easily fixed.
> > 
> >
> Btw, there is not just "one" MIPS port.  There are a number of them, 
> including (off memory) evbmips, sbmips, pmax, cobalt, sgimips, mipsco, 
> ews4800mips (or whatever it is called), playstation2.  There may be 
> others.  Most of them suffer from this deficiency.  It might be a good 
> idea to include port-mips in the discussion.

But I guess their current bus_space implementation maps the registers
so that no barrier are needed, otherwise drivers wouldn't work.

-- 
Manuel Bouyer <bouyer@antioche.eu.org>
     NetBSD: 26 ans d'experience feront toujours la difference
--