
Re: Am I using bus_dma right?



>> I've been treating it as though my inspection of a given sample in
>> the buffer counts as "transfer completed" for purposes of that
>> sample.
> Are you inspecting the buffer only after receipt of an interrupt or
> are you polling?

Polling.  (Polls are provoked by userland doing a read() that ends up
in my driver's read routine.)

> [...] POSTWRITE does tell the kernel it can free up any bounce
> buffers it may have allocated if it allocated bounce buffers, but I
> digress.

Someone else asked me (off-list) if bounce buffers were actually in
use.  I don't know; when next I'm back at that job (it's only three
days a week), one thing I intend to do is peek under the hood of the
bus_dma structures and find out.
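
My plan is basically to print the loaded map's segments next to the
segments bus_dmamem_alloc() handed back; if they differ, I'll take
that as a sign the implementation substituted bounce pages somewhere.
Roughly this (a sketch only - "map", "segs", and "rsegs" stand in for
whatever the driver really keeps in its softc):

    /*
     * segs/rsegs: filled in by bus_dmamem_alloc()
     * map: the bus_dmamap_t after bus_dmamap_load()
     */
    int i;

    for (i = 0; i < map->dm_nsegs; i++)
            printf("map seg %d: pa 0x%llx len %llu\n", i,
                (unsigned long long)map->dm_segs[i].ds_addr,
                (unsigned long long)map->dm_segs[i].ds_len);
    for (i = 0; i < rsegs; i++)
            printf("alloc seg %d: pa 0x%llx len %llu\n", i,
                (unsigned long long)segs[i].ds_addr,
                (unsigned long long)segs[i].ds_len);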

>> For my immediate needs, I don't care about anything other than
>> amd64.  But I'd prefer to understand the paradigm properly for the
>> benefit of potential future work.
> I believe if you use COHERENT on amd64 none of this matters since it
> turns off caching on those memory regions.  (But I don't have time to
> grep the sources to verify this.)

I do - or, rather, I will.  I don't recall whether I'm using COHERENT,
but it's easy enough to add if I'm not.
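
For the record, adding it should just be a matter of passing
BUS_DMA_COHERENT when the buffer gets mapped.  Something like this (a
sketch with made-up names - "sc", "DMABUFSZ", and the single-segment
assumption are placeholders, and error handling is trimmed):

    #define DMABUFSZ    (16 * 1024 * 1024)      /* the 16M sample buffer */

    error = bus_dmamem_alloc(sc->sc_dmat, DMABUFSZ, PAGE_SIZE, 0,
        &sc->sc_seg, 1, &rsegs, BUS_DMA_WAITOK);
    if (error == 0)
            error = bus_dmamem_map(sc->sc_dmat, &sc->sc_seg, rsegs,
                DMABUFSZ, &sc->sc_kva,
                BUS_DMA_WAITOK | BUS_DMA_COHERENT);     /* <- the flag */
    if (error == 0)
            error = bus_dmamap_create(sc->sc_dmat, DMABUFSZ, 1, DMABUFSZ,
                0, BUS_DMA_WAITOK, &sc->sc_dmamap);
    if (error == 0)
            error = bus_dmamap_load(sc->sc_dmat, sc->sc_dmamap,
                sc->sc_kva, DMABUFSZ, NULL, BUS_DMA_WAITOK);

If the claim about amd64 is right, that mapping comes back uncached and
the sync ops become largely moot for this buffer.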

>> And, indeed, I tried making the read routine do POSTREAD|POSTWRITE
>> before and PREREAD|PREWRITE after its read-test-write of the
>> samples, and it didn't help.
> Ah now we're getting to something interesting.

> What failure mode are you seeing?

That off-list person asked me that too.  I wrote up a more detailed
explanation and saved it in case someone else wanted it.  I'll
include the relevant text below.  The short summary is that I'm seeing
data get _severely_ delayed before reaching CPU visibility - severely
delayed as in multiple seconds.  And here's why I believe that's what's
going on:

----------------

Perhaps I should explain why I believe what I do about the behaviour.

The commercial product is a turnkey system involving some heavily
custom application-specific hardware.  It generates blobs of data which
historically it sent up to the host over the 7300 to a DOS program (the
version in use when I came into it was running under DOS and I was
brought in to help move it to something more modern).

Shortly into the project, we learned that the 7300 had been EOLed by
Adlink with no replacement device suggested.  We cast about and ended
up putting another small CPU on the generating end which sends data up
over Ethernet.  The only reason we still care about data over the 7300
is a relatively large installed base that doesn't have Ethernet-capable
data-generating hardware, but which we want to upgrade (the DOS
versions have various problems in addition to feature lack).

But my test hardware does have Ethernet.  And, in the presence of that
hardware, it always sends the data both ways, both as Ethernet packets
and as signals on differential pairs (which get converted to the
single-ended signals the 7300 needs very close to it - the differential
pairs are for better noise immunity over a relatively long cable run in
end-user installations).

For my test runs, I not only ran the application, which I told to read
from the 7300, but also a snoop program, which (a) uses BPF to capture
the Ethernet form of the data and (b) uses a snoop facility I added to
the 7300 driver to record a copy of everything that got returned
through a read() call.  I also added code to the userland application
to record everything it gets from read().  (The driver code I put up
for FTP does not have that.  I can make that version available too if
you want.)

What I'm seeing is an Ethernet packet arriving containing, let us say,
11 22 33 44 55 66 77 88 99 aa bb cc dd ee, but I'm also seeing the 7300
driver returning, say, 11 22 33 44 55 66, to userland, then userland
calling read() many times - enough to burn a full second of time -
getting "no data here" each time (see the next paragraph for what this
means).  Multiple seconds later, after userland has timed out and gone
into its "I'm not getting data" recovery mode, the driver finally finds
the 77 88 99 aa bb cc dd ee part and passes it back to userland.

When userland calls read(), the driver reads the next sample out of the
DMA buffer, looking to see whether it's been overwritten.  If it has,
the samples it finds are passed back to userland and their places in
the buffer written over with a value that cannot correspond to a sample
(23 of the 32 data pins are grounded, so those bits cannot be nonzero
in a sample).  The driver uses interrupts only to deal with the case of
data arriving over the 7300 but userland not reading it.  The driver
wants to track where the hardware is DMAing into, so it knows where to
look for new data.  I configure the hardware to interrupt every
half-meg of data (in a 16M buffer); if the writing is getting too close
to the reading, I push the read point forward, clearing the buffer to
the "impossible" value in the process.  But, in the tests I'm doing, I
doubt that's happening (I can add code to check, but in my test cases
the Ethernet data stream indicates I'm not getting enough data for that
to be plausible even if userland weren't reading it).

Thus, userland calling read() and getting "nothing here" means that the
driver is reading the next sample and getting the "this cannot occur"
value it initialized the whole buffer to (and overwrites returned
samples with).
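
In code terms, the core of that read path amounts to something like
the following (a heavily simplified sketch - SENTINEL, NSAMPLES,
DMABUFSZ, and the sc_* names are stand-ins for what the driver
actually uses, SENTINEL being the "cannot occur" value the buffer is
initialized to, and the sync bracketing shown is the plain
POSTREAD-before / PREWRITE-after variant):

    bus_dmamap_sync(sc->sc_dmat, sc->sc_dmamap, 0, DMABUFSZ,
        BUS_DMASYNC_POSTREAD);
    n = 0;
    while (n < maxsamples && sc->sc_buf[sc->sc_rdidx] != SENTINEL) {
            out[n++] = sc->sc_buf[sc->sc_rdidx];       /* return sample */
            sc->sc_buf[sc->sc_rdidx] = SENTINEL;       /* mark consumed */
            sc->sc_rdidx = (sc->sc_rdidx + 1) % NSAMPLES;
    }
    bus_dmamap_sync(sc->sc_dmat, sc->sc_dmamap, 0, DMABUFSZ,
        BUS_DMASYNC_PREWRITE);
    /* n == 0 here is the "nothing here" case described above */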

I know enough about the data-sending hardware (I have schematics) that
I consider it essentially impossible that the data is getting sent over
Ethernet but not making it to the 7300's input connector.  There is no
intelligence between the point where the bits are tapped off for the
send-over-Ethernet hardware and the 7300, just things like differential
receivers connected to single-ended drivers.  If the data weren't
arriving at all, I could, maybe, posit a breakdown in that datapath
somewhere, but that's not what I'm seeing.

Since the whole 11 22 ... dd ee blob arrives in a single Ethernet
packet, it must be getting sent at close to full speed (the timeout
before the Ethernet-send code pushes out a packet is fairly short, and
that packet is way less than the maximum size; indeed, it's barely over
the minimum Ethernet packet size).  In particular, it is not plausible
that delays anywhere close to a second occur there.

The observations are consistent with the theory that the PLX9080 is
somehow being really slow to DMA the samples to host memory.  But the
data rate is low enough (about 1MHz) and the PCI bus is non-busy enough
that I consider that theory to have rather low plausibility.  The most
plausible theory I have is that the DMA writes to memory are happening
from the PCI device's point of view, but are somehow not becoming
visible to the CPU until much later.

My impression is that that's what things like bus_dmamap_sync were
created to deal with, hence my suspicion that I was misusing something
in the bus_dma suite of routines.  Which it now appears I was, but,
upon going under the hood of the amd64 implementation, I don't think my
misuse is likely to be the problem.  I also tried using
POSTREAD|POSTWRITE before and PREREAD|PREWRITE after the test in the
core of my read routine (instead of just POSTREAD before and PREWRITE
after) and it didn't help.
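
That is, the bracketing in the variant I tried looks roughly like this
(same placeholder names as the sketch above):

    bus_dmamap_sync(sc->sc_dmat, sc->sc_dmamap, 0, DMABUFSZ,
        BUS_DMASYNC_POSTREAD | BUS_DMASYNC_POSTWRITE);
    /* ... read-test-write of the samples ... */
    bus_dmamap_sync(sc->sc_dmat, sc->sc_dmamap, 0, DMABUFSZ,
        BUS_DMASYNC_PREREAD | BUS_DMASYNC_PREWRITE);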

The DOS version is rock-solid in this respect, though.  It is designed
around having the 7300 DMA directly into the application's buffer;
that's what Adlink's DOS library (which the DOS version uses) does.  I
am in the process of trying to rewrite my 7300 driver to come closer to
the more traditional paradigm bus_dma is presumably designed for; I
_think_ I can compensate for the API differences in a glue layer in
userland.  But I don't yet know whether that will help with this delay
issue I'm seeing.

/~\ The ASCII				  Mouse
\ / Ribbon Campaign
 X  Against HTML		mouse%rodents-montreal.org@localhost
/ \ Email!	     7D C8 61 52 5D E7 2D 39  4E F1 31 3E E8 B3 27 4B

