tech-kern archive
Re: Why NetBSD x86's bus_space_barrier does not use [sml]fence?
On Mon, Dec 02, 2019 at 06:30:21AM +0000, Taylor R Campbell wrote:
> > Date: Fri, 29 Nov 2019 14:49:33 +0000
> > From: Andrew Doran <ad%netbsd.org@localhost>
> >
> > If nothing else, the CALL instruction that calls bus_space_barrier()
> > produces a write to the stack when storing the return address. On x86,
> > stores are totally ordered, and loads are never reordered around stores.
> > No further barrier is needed. This is the idea, anyway; sometimes
> > reality does not match...
>
> I don't understand this. The Intel manual (volume 3 of the 3-volume
> manual, System Programming Guide, Sec. 8.2.2 `Memory Ordering in P6
> and More Recent Processor Families') says:
I have not been paying attention, refused to engage my brain, and gotten
confused. My fault. On the subject of ordering in WB regions, I put a
comment in sys/arch/x86/include/lock.h describing the situation the last
time this discussion came up; that time it was in regard to locks. I have
strong suspicions about what Intel writes, but it's probably better not to
muddy the discussion with them!
We're talking UC/WC mappings here, though, so:
> It's not entirely clear to me from the Intel manual, but in the AMD
> manual (e.g., Volume 2, chapter on Memory System) it is quite clear
> that write-combining `WC' regions may issue stores out of order -- and
> that's what you get from BUS_SPACE_MAP_PREFETCHABLE.
>
> So it seems to me we really do need
>
> 	switch (flags) {
> 	case BUS_SPACE_BARRIER_READ:
> 		lfence or locked instruction;
> 		break;
> 	case BUS_SPACE_BARRIER_WRITE:
> 		sfence or locked instruction;
> 		break;
> 	case BUS_SPACE_BARRIER_READ|BUS_SPACE_BARRIER_WRITE:
> 		mfence or locked instruction;
> 		break;
> 	}
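
Spelled out, that would be something like the sketch below (GCC-style
inline asm; the standalone function and its name are hypothetical, and the
real x86_bus_space_barrier() also takes tag/handle/offset/length
arguments):

	#include <sys/bus.h>	/* BUS_SPACE_BARRIER_* flags */

	/* Sketch only: fence choice per the proposal quoted above. */
	static inline void
	barrier_sketch(int flags)
	{

		switch (flags) {
		case BUS_SPACE_BARRIER_READ:
			__asm volatile("lfence" ::: "memory");
			break;
		case BUS_SPACE_BARRIER_WRITE:
			__asm volatile("sfence" ::: "memory");
			break;
		case BUS_SPACE_BARRIER_READ|BUS_SPACE_BARRIER_WRITE:
			__asm volatile("mfence" ::: "memory");
			break;
		}
	}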
I see a couple of problems with what's in x86_bus_space_barrier() now:
- It will apply to UC mappings too, and there it's an unneeded
  pessimisation, because the majority of bus_space accesses on x86 don't
  need any kind of barrier. Maybe the other systems are good with that,
  but I think we should try to do better.
- It will also apply to I/O space mappings. There are, if I recall
  correctly, some drivers that can work with either I/O or memory-mapped
  chips, and bus_space is designed to allow this (see the sketch below).
  In the case of I/O port access there is no memory access going on (at
  least from the kernel's POV; I don't know how the chipset implements it
  under the covers). There will also be drivers that are I/O port mapping
  only, where the author has followed the bus_space manual page diligently
  and put in barriers anyway.
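
For illustration, such a driver reads a register identically either way;
everything here apart from the bus_space(9) calls themselves is a
hypothetical sketch:

	#include <sys/types.h>
	#include <sys/bus.h>

	/* Hypothetical driver state; sc_bst may be the I/O or memory tag. */
	struct xx_softc {
		bus_space_tag_t		sc_bst;
		bus_space_handle_t	sc_bsh;
	};

	static uint32_t
	xx_read_reg(struct xx_softc *sc, bus_size_t reg)
	{
		uint32_t v;

		v = bus_space_read_4(sc->sc_bst, sc->sc_bsh, reg);
		/*
		 * Per bus_space(9) the driver barriers anyway, even when
		 * sc_bst turns out to be the I/O port tag and there is no
		 * memory access to order.
		 */
		bus_space_barrier(sc->sc_bst, sc->sc_bsh, reg, 4,
		    BUS_SPACE_BARRIER_READ);
		return v;
	}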
I think the way to avoid both of those would be to key off the
bus_space_tag_t. There are already two, one for I/O and one for memory.
We could do multiple memory tags, along the lines of the sketch below.
That's probably very easy; I think the harder bit would likely be passing
those down to PCI and having it choose. I don't know that code.
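
Roughly like this, as a sketch only (x86_bus_space_mem_wc is a made-up
name for a hypothetical third tag handed out for
BUS_SPACE_MAP_PREFETCHABLE mappings; the I/O and plain UC memory tags skip
the fences entirely):

	void
	x86_bus_space_barrier(bus_space_tag_t t, bus_space_handle_t h,
	    bus_size_t o, bus_size_t l, int flags)
	{

		if (t != x86_bus_space_mem_wc)
			return;	/* UC memory or I/O ports: nothing to do */

		if ((flags & BUS_SPACE_BARRIER_WRITE) != 0 &&
		    (flags & BUS_SPACE_BARRIER_READ) != 0)
			__asm volatile("mfence" ::: "memory");
		else if ((flags & BUS_SPACE_BARRIER_WRITE) != 0)
			__asm volatile("sfence" ::: "memory");
		else if ((flags & BUS_SPACE_BARRIER_READ) != 0)
			__asm volatile("lfence" ::: "memory");
	}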
Andrew