tech-kern archive


Re: DRM/KMS: vmwgfx driver is now available



On 7/23/23 06:06, Taylor R Campbell wrote:
Date: Sun, 23 Jul 2023 00:47:39 +0900
From: PHO <pho%cielonegro.org@localhost>

On 7/22/23 22:23, Taylor R Campbell wrote:
For that matter, do we even need a new allocator here?  Can we just do
a new bus_dmamem_alloc/load for each one?

Sure, but I fear that would degrade performance.

Any idea how much?  I don't have a good sense of how this is used in
the system.  Is this like a network driver that precreates a pool of
mbufs and dma maps up front, and then loads and unloads them during
packet processing?

It's used for a command queue. Every time the driver wants to enqueue a GPU command (creating a texture, running shaders, notifying the GPU of updated areas of a framebuffer, and so on) it needs to allocate some memory, write the command into it, and perform a DMA transfer. So I believe applications end up calling it at least 100 times per second.
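
To make the concern concrete, here is a rough sketch (mine, not the driver's actual code; the function name, sizes and flags are made up, and most error handling is elided) of what a fully per-command bus_dmamem_alloc/load cycle would look like:

    #include <sys/param.h>
    #include <sys/systm.h>
    #include <sys/bus.h>

    /* Hypothetical: allocate, map and load a fresh DMA buffer for a
     * single GPU command, then tear everything down again. */
    static int
    submit_one_command(bus_dma_tag_t dmat, const void *cmd, bus_size_t size)
    {
	bus_dma_segment_t seg;
	bus_dmamap_t map;
	void *kva;
	int nsegs, error;

	/* Get DMA-safe pages for this one command. */
	error = bus_dmamem_alloc(dmat, size, PAGE_SIZE, 0,
	    &seg, 1, &nsegs, BUS_DMA_WAITOK);
	if (error)
		return error;

	/* Map them into kernel virtual space so the CPU can write. */
	error = bus_dmamem_map(dmat, &seg, nsegs, size, &kva,
	    BUS_DMA_WAITOK | BUS_DMA_COHERENT);

	/* Create and load a map to get the device-visible address. */
	error = bus_dmamap_create(dmat, size, 1, size, 0,
	    BUS_DMA_WAITOK, &map);
	error = bus_dmamap_load(dmat, map, kva, size, NULL, BUS_DMA_WAITOK);

	/* Write the command and hand it to the device. */
	memcpy(kva, cmd, size);
	bus_dmamap_sync(dmat, map, 0, size, BUS_DMASYNC_PREWRITE);
	/* ... submit map->dm_segs[0].ds_addr to the device and wait ... */
	bus_dmamap_sync(dmat, map, 0, size, BUS_DMASYNC_POSTWRITE);

	/* And undo all of the above. */
	bus_dmamap_unload(dmat, map);
	bus_dmamap_destroy(dmat, map);
	bus_dmamem_unmap(dmat, kva, size);
	bus_dmamem_free(dmat, &seg, nsegs);
	return 0;
    }

Doing that whole cycle on every submission, 100+ times a second, is the overhead I'd like to avoid; that's why suballocating out of a buffer that is allocated and loaded once looked attractive to me.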

- In:

    static dma_addr_t __vmw_piter_dma_addr(struct vmw_piter *viter)
    {
    #if defined(__NetBSD__)
	/* XXX: Is this... really? Really what we need to do? */
	struct vm_page* const page = &viter->pages[viter->i]->p_vmp;
	paddr_t const paddr = VM_PAGE_TO_PHYS(page);
	bus_addr_t const baddr = PHYS_TO_BUS_MEM(viter->dmat, paddr);
	return baddr;

    Seems wrong -- surely there should be a bus dma map here to use?

Yes there is, but bus_dmamap_t isn't page-oriented, right? I don't know
how to iterate through pages with it.

You can ask for maxsegsz=PAGE_SIZE boundary=PAGE_SIZE in
bus_dmamap_create, and then each segment's .ds_len will be PAGE_SIZE
and each .ds_addr will be page-aligned.

Ah. That should work. Thanks for the tip.
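
For the archives, a minimal sketch of that approach (my own code with made-up names; it assumes kva is page-aligned and npages * PAGE_SIZE long):

    #include <sys/param.h>
    #include <sys/systm.h>
    #include <sys/bus.h>

    /* Hypothetical: with maxsegsz == boundary == PAGE_SIZE, every
     * segment of the loaded map covers exactly one page. */
    static int
    walk_pages(bus_dma_tag_t dmat, void *kva, int npages)
    {
	bus_dmamap_t map;
	int i, error;

	error = bus_dmamap_create(dmat, (bus_size_t)npages * PAGE_SIZE,
	    npages, PAGE_SIZE /* maxsegsz */, PAGE_SIZE /* boundary */,
	    BUS_DMA_WAITOK, &map);
	if (error)
		return error;

	error = bus_dmamap_load(dmat, map, kva,
	    (bus_size_t)npages * PAGE_SIZE, NULL, BUS_DMA_WAITOK);
	if (error) {
		bus_dmamap_destroy(dmat, map);
		return error;
	}

	for (i = 0; i < map->dm_nsegs; i++) {
		/* One page per segment, page-aligned. */
		KASSERT(map->dm_segs[i].ds_len == PAGE_SIZE);
		KASSERT((map->dm_segs[i].ds_addr & PAGE_MASK) == 0);
		/* ... dm_segs[i].ds_addr is what __vmw_piter_dma_addr()
		 * would return for page i ... */
	}

	bus_dmamap_unload(dmat, map);
	bus_dmamap_destroy(dmat, map);
	return 0;
    }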

linux_dma_fence.c:

- In:

    -	KASSERT(dma_fence_referenced_p(fence));
    +	dma_fence_assert_alive(fence);

    These are not the same -- why the change?

Yeah this one... I know they aren't the same and you won't like it (_I
don't_) but there's a reason why the change is needed:

https://github.com/NetBSD/src/commit/af88505b56f338c25163dd3f172394b9f2a86492
--
linux/dma-fence.h: Allow fences to be signaled after their refcount has
gone down to zero, as long as their .release callback hasn't called
dma_fence_free() yet. These semantics are not documented anywhere and
are probably just an implementation detail, but the vmwgfx driver
heavily relies on them.

Sorry riastradh@, this is unfortunate, but keeping this (admittedly
healthy) assertion would require a major code reorganization of the
driver. We really can't fight against upstream on this matter.
--

You're right that I don't like it... it is a completely bonkers
use-after-free!  Even i915, with its abuse of dma fence internals,
doesn't violate these assertions.  What is vmwgfx doing that seems to
rely on it and isn't just a broken or missing reference-count
acquire/release cycle or something?

This:
https://github.com/depressed-pho/netbsd-src/blob/91daa67f17222da355d3fddd6fa849c786d9c545/sys/external/bsd/drm2/dist/drm/vmwgfx/vmwgfx_fence.c#L45

and this:
https://github.com/depressed-pho/netbsd-src/blob/91daa67f17222da355d3fddd6fa849c786d9c545/sys/external/bsd/drm2/dist/drm/vmwgfx/vmwgfx_fence.c#L527

vmw_fence_manager::fence_list holds pointers to all the fence objects in use, but the list does not hold a reference to them. I tried to change how this works by simply incrementing the refcount in vmw_fence_obj_init() and decrementing it in __vmw_fences_update(), but it didn't work well because fences can also be released via dma_resv_add_excl_fence() and dma_resv_add_shared_fence(). I can't recall all the details, but sometimes refcounts went down to zero even while the fences were still on fence_list, and that's when I gave up fighting it. I admit I don't fully understand the code.
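
For reference, the change I tried was essentially the following (a from-memory sketch against vmwgfx_fence.c, not the actual diff; the surrounding code is abbreviated):

    /* In vmw_fence_obj_init(), when the fence goes onto the list:
     * take a reference on behalf of fman->fence_list. */
	spin_lock(&fman->lock);
	dma_fence_get(&fence->base);
	list_add_tail(&fence->head, &fman->fence_list);
	spin_unlock(&fman->lock);

    /* In __vmw_fences_update(), when a signaled fence is removed
     * from the list: drop that reference again. */
	list_del_init(&fence->head);
	dma_fence_signal_locked(&fence->base);
	dma_fence_put(&fence->base);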
