tech-kern archive


Implement mmap for PUD



Hello,

Since there is no mmap implementation for PUD devices, I began working
on one. I would like the implementation to avoid copying buffers
between user space and kernel memory, since mmap is usually used by
performance-sensitive applications. I started with the pud_dev.c file,
which contains the kernel side of the mmap call; the request is then
passed on to user space. My mmap function looks like this:


static paddr_t
pud_cdev_mmap(dev_t dev, off_t off, int flag)
{
        struct pud_creq_mmap pc_mmap;
        struct vmspace *vm;
        int error;
        int num;
        paddr_t pa;

        pc_mmap.pm_flag  = flag;
        pc_mmap.pm_pageoff = off;

        printf("Inside mmap, off: %jd flag: %d\n", (intmax_t) off, flag);

        error = pud_request(dev, &pc_mmap, sizeof(pc_mmap),
            PUD_REQ_CDEV, PUD_CDEV_MMAP);
        if (error)
                return (paddr_t) -1;

        mutex_enter(proc_lock);
        pc_mmap.pm_proc = proc_find(pc_mmap.pid);
        mutex_exit(proc_lock);
        /* Catch error? */

        error = proc_vmspace_getref(pc_mmap.pm_proc, &vm);
        if (error)
                panic("Unable to get vmspace");

        /* Try to read the value */
        if (copyin_proc(pc_mmap.pm_proc, (void *) pc_mmap.pm_addr, &num,
            sizeof(num)) == EFAULT)
                panic("Unable to read value from user-space");
        printf("Read: %d from addr: %" PRIxPADDR "\n", num, pc_mmap.pm_addr);

        if ((vaddr_t)pc_mmap.pm_addr & PAGE_MASK)
                panic("pud_cdev_mmap: memory not page aligned");

        if (pmap_extract(vm->vm_map.pmap,
            (vaddr_t) pc_mmap.pm_addr + (u_int) off, &pa) == FALSE)
                panic("pud_cdev_mmap: memory page not mapped");

        uvmspace_free(vm);
        printf("Inside mmap, returning: %" PRIxPADDR "\n", pa);

        return pa;
}


Basically, pud_request passes the request to the user-space server,
and the server returns the address of memory allocated in its own
process. The kernel then reads a value from that user-space memory,
which works: I can fetch the correct value. That read is only there
for debugging. Afterwards, the physical address of the memory region
is looked up with pmap_extract and returned.

An example of a very simple user-space server implementing mmap would be:

void * memory;

vaddr_t
test_mmap(dev_t dev, off_t off, int flag, void *auxdata)
{
        int *num;
        if (off > 0)
                return (vaddr_t)-1;
        if (memory == NULL)
        {
                memory = malloc(PAGE_SIZE);
                if (mlock(memory, PAGE_SIZE) < 0)
                        err(EXIT_FAILURE, "Unable to lock pages");
        }
        num = memory;
        memset(num, 0, PAGE_SIZE);
        *num = 10;
        return (vaddr_t) num;
}

This *works* in the sense that the kernel doesn't panic, but if I
perform an mmap on the device, the pud_cdev_mmap function is called
forever with the same arguments, and the function never returns (I
have to kill the program). Under a GENERIC kernel I don't see any
error messages, but under a XEN kernel I see the following messages
from the hypervisor:

(XEN) d0:v0: reserved bit in page table (ec=000D)
(XEN) Pagetable walk from 00007f7ff7ff6000:
(XEN)  L4[0x0fe] = 0000000838c8b027 000000000000f374
(XEN)  L3[0x1ff] = 0000000c0a3be027 0000000000015c41
(XEN)  L2[0x1bf] = 000000083a701027 000000000000d8fe
(XEN)  L1[0x1f6] = 80000198ab000125 ffffffffffffffff
(XEN) ----[ Xen-4.2-unstable  x86_64  debug=y  Not tainted ]----
(XEN) CPU:    0
(XEN) RIP:    e033:[<0000000000400b77>]
(XEN) RFLAGS: 0000000000010206   EM: 0   CONTEXT: pv guest
(XEN) rax: 00007f7ff7ff6000   rbx: 00007f7fffffffe0   rcx: 0000000000001000
(XEN) rdx: 0000000000000000   rsi: 0000000000001000   rdi: 000000000000001c
(XEN) rbp: 00007f7fffffdbb0   rsp: 00007f7fffffdb80   r8:  0000000000000003
(XEN) r9:  0000000000000000   r10: 0000000000000001   r11: 0000000000000246
(XEN) r12: 00007f7fffffdbd8   r13: 0000000000622588   r14: 00007f7fffffcdfd
(XEN) r15: 0000000000000001   cr0: 000000008005003b   cr4: 00000000000026f0
(XEN) cr3: 00000008332ff000   cr2: 00007f7ff7ff6000
(XEN) ds: 0017   es: 0017   fs: 0000   gs: 0000   ss: e02b   cs: e033
(XEN) Guest stack trace from rsp=00007f7fffffdb80:
(XEN)    00007f7fffffdbf0 00000001f7b26c60 00007f7ff7b270c8 00007f7ff7ff6000
(XEN)    00007f7ff7ff6000 00000003004007b9 00000000006012c0 0000000000400975
(XEN)    0000000000000001 00007f7fffffffe0 0000000000000000 00007f7ff7c0699d
(XEN)    00007f7ff7ffa000 0000000000000001 00007f7ffffffdc8 0000000000000000
(XEN)    00007f7ffffffdcf 00007f7ffffffde6 00007f7ffffffe08 00007f7ffffffe1e
(XEN)    00007f7ffffffe33 00007f7ffffffe3e 00007f7ffffffe54 00007f7ffffffe5d
(XEN)    00007f7ffffffef4 00007f7fffffff28 00007f7fffffff3b 00007f7fffffff46
(XEN)    00007f7fffffff61 00007f7fffffff6d 00007f7fffffff78 00007f7fffffff9a
(XEN)    00007f7fffffffa6 00007f7fffffffb5 00007f7fffffffc4 00007f7fffffffd2
(XEN)    0000000000000000 0000000000000003 0000000000400040 0000000000000004
(XEN)    0000000000000038 ffffa00000000005 0000000000000006 ffffa00000000006
(XEN)    0000000000001000 ffffa00000000007 00007f7ff7c00000 0000000000000008
(XEN)    0000000000000000 ffffa00000000009 0000000000400a00 ffffa000000007d0
(XEN)    0000000000000000 00000000000007d1 0000000000000000 ffffa000000007d2
(XEN)    0000000000000000 ffffa000000007d3 0000000000000000 ffffa00000000000
(XEN)    0000000000000000 0000000000000000 0000000000000000 0000000000000000
(XEN)    0000000000000000 0000000000000000 0000000000000000 0000000000000000
(XEN)    0000000000000000 0000000000000000 0000000000000000 0000000000000000
(XEN)    0000000000000000 0000000000000000 0000000000000000 0000000000000000
(XEN)    0000000000000000 0000000000000000 0000000000000000 0000000000000000
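
To make the page table walk easier to read, the faulting address and
the L1 entry can be decoded by hand. The following is only a sketch:
the helper names (pt_index, pte_frame, pte_print) are made up for
illustration and are not part of PUD, NetBSD or Xen; the bit positions
are the architectural x86-64 ones.

```c
#include <stdint.h>
#include <stdio.h>

/* Architectural x86-64 page table entry bits. */
#define PTE_P     (UINT64_C(1) << 0)   /* present */
#define PTE_RW    (UINT64_C(1) << 1)   /* writable */
#define PTE_US    (UINT64_C(1) << 2)   /* user accessible */
#define PTE_A     (UINT64_C(1) << 5)   /* accessed */
#define PTE_G     (UINT64_C(1) << 8)   /* global */
#define PTE_NX    (UINT64_C(1) << 63)  /* no-execute */
#define PTE_FRAME UINT64_C(0x000ffffffffff000) /* address bits 12..51 */

/* Hypothetical helper: table index at `level` (1 = L1 ... 4 = L4). */
static unsigned
pt_index(uint64_t va, int level)
{
        /* 12 bits of page offset, then 9 index bits per level. */
        return (unsigned)((va >> (12 + 9 * (level - 1))) & 0x1ff);
}

/* Hypothetical helper: address field of a page table entry. */
static uint64_t
pte_frame(uint64_t pte)
{
        return pte & PTE_FRAME;
}

/* Hypothetical helper: print the address field and the flags set. */
static void
pte_print(uint64_t pte)
{
        printf("frame=0x%llx%s%s%s%s%s%s\n",
            (unsigned long long)pte_frame(pte),
            (pte & PTE_P)  ? " P"  : "",
            (pte & PTE_RW) ? " RW" : "",
            (pte & PTE_US) ? " US" : "",
            (pte & PTE_A)  ? " A"  : "",
            (pte & PTE_G)  ? " G"  : "",
            (pte & PTE_NX) ? " NX" : "");
}
```

Calling pt_index(0x00007f7ff7ff6000, n) for n = 4, 3, 2, 1 gives 0x0fe,
0x1ff, 0x1bf and 0x1f6, which matches the indices in the walk. For the
L1 entry 80000198ab000125, pte_frame() gives 0x198ab000000, with P, US,
A, G and NX set and RW clear. That address has bit 40 set; whether this
is beyond what the machine supports is an assumption the dump does not
confirm, but it would be consistent with the "reserved bit" message.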

These hypervisor messages appear every time an mmap call returns. My
guess is that I have to mark the memory I'm returning from mmap as
shared somehow, but I don't know how (or whether it is even possible
to pass memory from user-space programs through the kernel). Any hint
on how to solve this problem would be really appreciated.
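
For anyone who wants to reproduce this, a client along the following
lines should be enough to trigger the behaviour. It is a minimal sketch,
not the exact test program: the function name read_first_int is made up,
and the device path is whatever node the PUD server registers. It maps
the first page read-only and reads the int that the server stores there.

```c
#include <sys/mman.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper: map the first page of `path` read-only and
 * store its first int in *out. Returns 0 on success, -1 on failure.
 */
static int
read_first_int(const char *path, int *out)
{
        long pagesz;
        void *p;
        int fd;

        fd = open(path, O_RDONLY);
        if (fd == -1)
                return -1;

        pagesz = sysconf(_SC_PAGESIZE);
        if (pagesz <= 0) {
                close(fd);
                return -1;
        }

        /* Offset 0, since the example server rejects off > 0. */
        p = mmap(NULL, (size_t)pagesz, PROT_READ, MAP_SHARED, fd, 0);
        close(fd);      /* the mapping stays valid after close */
        if (p == MAP_FAILED)
                return -1;

        /* With the PUD device, this access is where the fault shows up. */
        *out = *(volatile int *)p;

        munmap(p, (size_t)pagesz);
        return 0;
}
```

With the example server above, a successful call on the device node
would be expected to store 10 in *out. The same function also works on
a regular file, which is a convenient way to check the client itself.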

Thanks, Roger.

