tech-kern archive
Re: RFC: add MSI/MSI-X support to NetBSD
On Fri, Jun 06, 2014 at 12:40:54PM -0500, David Young wrote:
> On Fri, May 30, 2014 at 05:55:25PM +0900, Kengo NAKAHARA wrote:
> > Hello,
> >
> > I'm going to add MSI/MSI-X support to NetBSD. I list tasks about this.
> > Would you comment following task list?
>
> I think that MSI/MSI-X logically separates into a few pieces, what do
> you think about these pieces?
>
> 1 An MI API for establishing "mailboxes" (or "doorbells" or whatever
> we may call them). A mailbox is a special physical address (PA) or
> PA/data-pair in correspondence with a callback (function, argument).
>
> An MI API for mapping the mailbox into various address spaces,
> but especially the message-signalling devices. In this way, the
> mailbox API is a use or an extension of bus_dma(9).
>
> Somewhere I have a draft proposal for this MI API, I will try to
> dig it up.
Here is the proposal that I came up with many months (a few years?) ago
with input from Matt Thomas. I have tried to account for Matt's
requirements, but I'm not sure that I have done so.
Dave
--
David Young
dyoung%pobox.com@localhost Urbana, IL (217) 721-9981
BUS_MSI(9) NetBSD Kernel Developer's Manual BUS_MSI(9)
bus_msi(9) is a machine-independent interface for establishing in the
machine's physical address space a "doorbell" that when written with
a particular word, sends an interrupt vector to a set of CPUs. Using
bus_msi(9), the interrupt vector can be tied to interrupt handlers.
bus_msi(9) is the basis for a machine-independent implementation
of PCI Message-Signaled Interrupts (MSI) and MSI-X; however, the
bus_msi(9) implementation itself is highly machine-dependent. Any
NetBSD architecture that wants to support PCI MSI should provide a
bus_msi(9) implementation.
bus_msi(9) uses facilities provided by bus_dma(9).
typedef struct _bus_msi_t {
	bus_addr_t	mi_addr;
	uint32_t	mi_data;
	uint32_t	mi_count;
} bus_msi_t;
int
bus_msi_alloc(bus_dma_tag_t tag, bus_msi_reservation_t *msirp, size_t n,
uint32_t data_min, uint32_t data_max,
uint32_t data_alignment, uint32_t data_boundary, int flags);
Reserve `number' interrupt vectors on up to `ncpumax' CPUs
in the set `cpusetin' and reserve corresponding message
address/message data pairs. Record the message address/data-pair
reservations in up to `nintervals' consecutive bus_msi_interval_ts
beginning with `interval[0]'; overwrite `rintervals' with
the number of intervals used. Overwrite `cpusetout' with
the set of CPUs where interrupt vectors were established.
Each bus_msi_interval_t tells a message address, mi_addr,
and the mi_count different 32-bit message data words,
[mi_data, mi_data + mi_count - 1], to write to trigger
mi_count different interrupt vectors.
Each message data interval, [mi_data, mi_data + mi_count - 1],
will satisfy the constraints passed to bus_msi_alloc():
[data_min, data_max] must enclose each interval, each
interval must start at a multiple of data_alignment, and
no interval may cross a data_boundary boundary. A legal
value of data_alignment (or data_boundary) is either zero
or a power of 2. When zero, data_alignment (or data_boundary)
has no effect.
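The interval constraints above can be sketched as a small user-space predicate. This is only an illustration, not part of the proposed API: `struct msi_interval`, `legal_pow2()` and `interval_ok()` are hypothetical names, `uint64_t` stands in for bus_addr_t, and "crossing a boundary" is read the way bus_dma(9) reads it, i.e. the first and last data words must fall in the same boundary-sized window.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the bus_msi_t described above. */
struct msi_interval {
	uint64_t mi_addr;	/* stand-in for bus_addr_t */
	uint32_t mi_data;	/* first message data word */
	uint32_t mi_count;	/* number of consecutive data words */
};

/* Zero and powers of two are the legal alignment/boundary values. */
static bool
legal_pow2(uint32_t x)
{
	return (x & (x - 1)) == 0;
}

/* Check one interval against the constraints given to the allocator. */
static bool
interval_ok(const struct msi_interval *mi, uint32_t data_min,
    uint32_t data_max, uint32_t alignment, uint32_t boundary)
{
	uint64_t first, last;

	if (mi->mi_count == 0)
		return false;
	if (!legal_pow2(alignment) || !legal_pow2(boundary))
		return false;

	/* 64-bit arithmetic so that mi_data + mi_count cannot wrap. */
	first = mi->mi_data;
	last = first + mi->mi_count - 1;

	/* [data_min, data_max] must enclose the interval. */
	if (first < data_min || last > data_max)
		return false;
	/* The interval must start at a multiple of the alignment. */
	if (alignment != 0 && first % alignment != 0)
		return false;
	/* The interval must not cross a multiple of the boundary. */
	if (boundary != 0 && first / boundary != last / boundary)
		return false;
	return true;
}
```

Under this reading, a boundary of 1 only admits intervals with mi_count equal to 1, since any two consecutive data words lie in different windows.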
`tag' is the bus_dma_tag_t passed by the parent driver via
the bus-specific *_attach_args.
`flags' may be one of BUS_DMA_WAITOK or BUS_DMA_NOWAIT.
bus_msi_handle_t
bus_msi_establish(bus_dma_tag_t tag, bus_msi_reservation_t msir, int idx,
const kcpuset_t *cpusetin, int ncpumax, kcpuset_t *cpusetout,
int ipl, int (*func)(void *), void *arg);
Establish a callback (func, arg) to run at interrupt priority
level `ipl' whenever the `idx'th message in `intervals' is
delivered. Return an opaque handle for use with
bus_msi_disestablish().
You can establish more than one handler at each `idx'.
The correspondence between `idx's and message-address/data
pairs is like this:
idx 0 -> (intervals[0].mi_addr, intervals[0].mi_data)
idx 1 -> (intervals[0].mi_addr, intervals[0].mi_data + 1)
. . .
idx N - 1 -> (intervals[0].mi_addr, intervals[0].mi_data +
intervals[0].mi_count - 1)
idx N -> (intervals[1].mi_addr, intervals[1].mi_data)
idx N + 1 -> (intervals[1].mi_addr, intervals[1].mi_data + 1)
. . .
idx N + K - 1 -> (intervals[1].mi_addr, intervals[1].mi_data +
intervals[1].mi_count - 1)
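The correspondence above amounts to walking the interval array and subtracting each mi_count from `idx' until it falls inside an interval. A minimal user-space sketch follows; `struct msi_interval` and `msi_lookup()` are hypothetical names used only for illustration, and `uint64_t` stands in for bus_addr_t.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the interval type described above. */
struct msi_interval {
	uint64_t mi_addr;	/* stand-in for bus_addr_t */
	uint32_t mi_data;
	uint32_t mi_count;
};

/*
 * Translate a message index into its (address, data) pair.
 * Returns 0 on success, -1 if idx is past the last interval.
 */
static int
msi_lookup(const struct msi_interval *intervals, size_t nintervals,
    size_t idx, uint64_t *addr, uint32_t *data)
{
	size_t i;

	for (i = 0; i < nintervals; i++) {
		if (idx < intervals[i].mi_count) {
			*addr = intervals[i].mi_addr;
			*data = intervals[i].mi_data + (uint32_t)idx;
			return 0;
		}
		/* Skip this interval's messages and keep looking. */
		idx -= intervals[i].mi_count;
	}
	return -1;
}
```

With two intervals of 4 and 2 messages, idx 0 through 3 resolve to the first interval and idx 4 and 5 to the second; idx 6 is out of range.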
void
bus_msi_disestablish(bus_dma_tag_t tag, bus_msi_handle_t);
Disestablish the callback previously established with
bus_msi_establish(), identified by its bus_msi_handle_t.
void
bus_msi_free(bus_dma_tag_t tag, bus_msi_reservation_t msir, int idx, size_t n);
Release intervals allocated with bus_msi_alloc().
bus_msi_free(9) behavior is undefined if callbacks are still
established on any of the message intervals.
int
bus_msi_extract(bus_dma_tag_t tag, bus_msi_reservation_t msir,
int idx, size_t n, bus_msi_t *msip, int *rmsi);
Extract `n' MSI from `msir', starting with the `idx'th,
and write them to `msip'. Record how many were extracted
at `rmsi'.
int
bus_msi_to_segs(bus_dma_tag_t tag, bus_msi_t *msip, size_t n,
bus_dma_segment_t *segs, int nsegs, int *rsegs);
Create an array of bus_dma_segment_ts from the message
addresses in the `n' bus_msi_ts at `msip'. Record the
length of the bus_dma_segment_t array at `rsegs'.
int
bus_msi_map(bus_dma_tag_t tag, bus_msi_reservation_t, uint32_t **kvap,
size_t n);
Map `n' message addresses into kernel virtual address space,
recording virtual addresses at `kvap[0..n-1]'.
[Implementation note: use bus_dmamem_map(9).]
int
bus_msi_unmap(bus_dma_tag_t tag, uint32_t **kvap, size_t n);
Unmap `n' message addresses, `kvap[0..n-1]'.
[Implementation note: use bus_dmamem_unmap(9).]
int
bus_msi_trigger(bus_dma_tag_t tag, bus_msi_reservation_t, int idx);
Post the `idx'th message in `intervals'. Behavior is
undefined if a callback has not been established on the
`idx'th interval using bus_msi_establish(9).
If a callback was previously established, it may be called
before bus_msi_trigger(9) has returned or after.
[Implementation note: use bus_msi_extract(9), bus_msi_map(9),
*kvap = extracted_interval.mi_data, bus_msi_unmap(9).]
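To make the establish/trigger relationship concrete, here is a toy user-space model, not a kernel implementation. It assumes a fixed-size table and synchronous delivery; `struct msi_model`, `model_establish()`, `model_trigger()` and `count_hit()` are all hypothetical. It shows several handlers sharing one `idx', and it chooses to report an error where the text above leaves the behavior undefined.

```c
#include <stddef.h>

#define MODEL_MAX_IDX		8
#define MODEL_MAX_HANDLERS	4

struct model_handler {
	int (*func)(void *);
	void *arg;
};

/* Toy stand-in for a reservation: a handler list per message index. */
struct msi_model {
	struct model_handler h[MODEL_MAX_IDX][MODEL_MAX_HANDLERS];
	size_t nh[MODEL_MAX_IDX];
};

/* Attach (func, arg) to message idx; more than one handler is allowed. */
static int
model_establish(struct msi_model *m, size_t idx, int (*func)(void *),
    void *arg)
{
	if (idx >= MODEL_MAX_IDX || m->nh[idx] >= MODEL_MAX_HANDLERS)
		return -1;
	m->h[idx][m->nh[idx]].func = func;
	m->h[idx][m->nh[idx]].arg = arg;
	m->nh[idx]++;
	return 0;
}

/*
 * "Post" message idx by running its handlers synchronously.
 * Returns the number of handlers run, or -1 if none is established
 * (a choice made by this model only).
 */
static int
model_trigger(struct msi_model *m, size_t idx)
{
	size_t i;

	if (idx >= MODEL_MAX_IDX || m->nh[idx] == 0)
		return -1;
	for (i = 0; i < m->nh[idx]; i++)
		(void)m->h[idx][i].func(m->h[idx][i].arg);
	return (int)m->nh[idx];
}

/* Example handler: bump the counter that arg points to. */
static int
count_hit(void *arg)
{
	++*(int *)arg;
	return 1;
}
```

A real implementation would deliver the message through the interrupt controller, so the handler could run on another CPU and complete before or after the trigger call returns.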
EXAMPLES
/*
* Allocate N vectors for MSI on any 1 CPU, return the message
* address at msiaddrp and the base message data at msidatap.
*/
int
msi_allocate(int n, bus_addr_t *msiaddrp, uint32_t *msidatap)
{
	bus_msg_interval_t intervals;
	int rc, rintervals;

	rc = bus_msg_alloc(tag, n, kcpuset_running, 1, NULL,
	    &intervals, 1, &rintervals, 0, UINT16_MAX, 4, 0,
	    BUS_DMA_WAITOK);
	if (rc != 0)
		return rc;
	*msiaddrp = intervals.mi_addr;
	*msidatap = intervals.mi_data;
	return 0;
}
/*
* Allocate N vectors for MSI-X on different CPUs in round-robin
* fashion, return the message-address/data pairs at msiaddrp
* and msidatap.
*/
int
msix_allocate(int n, bus_addr_t *msiaddrp, uint32_t *msidatap)
{
	kcpuset_t *anykcp, *estkcp;
	bus_msg_interval_t *intervals;
	int i, rc, rintervals;

	intervals = calloc(n, sizeof(*intervals));
	if (intervals == NULL)
		return ENOMEM;
	if (!kcpuset_create(&estkcp, false)) {
		free(intervals);
		return ENOMEM;
	}
	if (!kcpuset_create(&anykcp, true)) {
		free(intervals);
		kcpuset_destroy(estkcp);
		return ENOMEM;
	}
	for (i = 0; i < n; i++) {
		/* If we've emptied our "any CPUs" set,
		 * refill.
		 */
		if (kcpuset_iszero(anykcp))
			kcpuset_copy(anykcp, kcpuset_running);
		rc = bus_msg_alloc(tag, 1, anykcp, 1, estkcp,
		    &intervals[i], 1, &rintervals, 0, UINT32_MAX, 0, 0,
		    BUS_DMA_WAITOK);
		if (rc != 0)
			goto err;
		/* The CPU where we established the interrupt
		 * is temporarily ineligible.
		 */
		kcpuset_subtract(anykcp, estkcp);
		/* Remember the message address/data pair. */
		msiaddrp[i] = intervals[i].mi_addr;
		msidatap[i] = intervals[i].mi_data;
	}
	free(intervals);
	kcpuset_destroy(estkcp);
	kcpuset_destroy(anykcp);
	return 0;
err:
	while (--i >= 0)
		bus_msg_free(tag, intervals[i], 1);
	free(intervals);
	kcpuset_destroy(estkcp);
	kcpuset_destroy(anykcp);
	return rc;
}
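The round-robin bookkeeping in msix_allocate() can be tried out in user space with a plain bitmask in place of kcpuset_t. This is a sketch under assumptions: `pick_cpu()` is a hypothetical helper, at most 64 CPUs are modelled, and the lowest-numbered eligible CPU is chosen, whereas the allocator in the draft is free to pick any CPU from the set.

```c
#include <stdint.h>

/*
 * Choose a CPU from *eligible, refilling the set from `running'
 * when it has been emptied, then mark the chosen CPU ineligible
 * until the next refill.  Returns the CPU number, or -1 if no
 * CPU is running.
 */
static int
pick_cpu(uint64_t *eligible, uint64_t running)
{
	int cpu = 0;

	/* If we've emptied the "any CPUs" set, refill. */
	if (*eligible == 0)
		*eligible = running;
	if (*eligible == 0)
		return -1;

	/* Assumption of this model: take the lowest-numbered CPU. */
	while (((*eligible >> cpu) & 1) == 0)
		cpu++;

	/* The chosen CPU is temporarily ineligible. */
	*eligible &= ~(UINT64_C(1) << cpu);
	return cpu;
}
```

With CPUs 0, 1 and 3 running, successive calls yield 0, 1, 3 and then start over at 0, which is the distribution the loop above aims for.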
/*
* Allocate N vectors for MSI-X on any CPUs.
* Return the message-address/data pairs at msiaddrp
* and msidatap.
*/
int
msix2_allocate(int n, bus_addr_t *msiaddrp, uint32_t *msidatap)
{
	bus_msg_interval_t *intervals;
	int i, rc, rintervals;

	intervals = calloc(n, sizeof(*intervals));
	if (intervals == NULL)
		return ENOMEM;
	rc = bus_msg_alloc(tag, n,
	    kcpuset_running, kcpuset_countset(kcpuset_running),
	    NULL,
	    intervals, n, &rintervals, 0, UINT32_MAX,
	    0,	/* alignment */
	    1,	/* 0 would suffice, but a boundary of 1 prevents
		 * consecutive mi_data from being reserved.
		 *
		 * Perhaps this is too clever.
		 */
	    BUS_DMA_WAITOK);
	if (rc != 0)
		goto err;
	for (i = 0; i < n; i++) {
		msiaddrp[i] = intervals[i].mi_addr;
		msidatap[i] = intervals[i].mi_data;
	}
	free(intervals);
	return 0;
err:
	/* No CPU sets were created in this function, so only the
	 * interval array needs to be released.
	 */
	free(intervals);
	return rc;
}