Subject: kqemu: uvm(9) questions
To: None <tech-kern@netbsd.org>
From: Oliver Gould <ogould@olix0r.net>
List: tech-kern
Date: 04/24/2007 19:14:12
Hello-

More questions, this time about uvm(9).

KQEMU wants to allocate kernel memory, and then be able to wire and
unwire it.

FreeBSD's implementation looks something like:

  struct kqemu_page *CDECL kqemu_alloc_zeroed_page(unsigned long *ppage_index)
  {
      pmap_t pmap;
      vm_offset_t va;
      vm_paddr_t pa;
  
      va = kmem_alloc(kernel_map, PAGE_SIZE);
      if (va == 0) {
          kqemu_log("kqemu_alloc_zeroed_page: NULL\n");
          return NULL;
      }
      pmap = vm_map_pmap(kernel_map);
      pa = pmap_extract(pmap, va);
      /* kqemu_log("kqemu_alloc_zeroed_page: %08x\n", pa); */
      *ppage_index = pa >> PAGE_SHIFT;
      return (struct kqemu_page *)va;
  }

  struct kqemu_user_page *CDECL kqemu_lock_user_page(unsigned long *ppage_index,
                                                   unsigned long user_addr)
  {
      struct vmspace *vm = curproc->p_vmspace;
      vm_offset_t va = user_addr;
      vm_paddr_t pa = 0;
      int ret;
      pmap_t pmap;
  #if __FreeBSD_version >= 500000
      ret = vm_map_wire(&vm->vm_map, va, va+PAGE_SIZE, VM_MAP_WIRE_USER);
  #else
      ret = vm_map_user_pageable(&vm->vm_map, va, va+PAGE_SIZE, FALSE);
  #endif
      if (ret != KERN_SUCCESS) {
          kqemu_log("kqemu_lock_user_page(%08lx) failed, ret=%d\n", user_addr, ret);
          return NULL;
      }
      pmap = vm_map_pmap(&vm->vm_map);
      pa = pmap_extract(pmap, va);
      *ppage_index = pa >> PAGE_SHIFT;
      return (struct kqemu_user_page *)va;
  }

My (thus far dysfunctional) port looks like:

  struct kqemu_page *
  kqemu_alloc_zeroed_page(unsigned long *ppage_index)
  {
	extern struct vm_map *kernel_map;
	pmap_t  pmap;
	vaddr_t va;
	paddr_t pa;

	pa = 0;
	va = uvm_km_alloc(kernel_map, PAGE_SIZE, 0, UVM_KMF_WIRED|UVM_KMF_ZERO);
	if (va == 0) {
		kqemu_log(...);
		return NULL;
	}
	pmap = vm_map_pmap(kernel_map);
	if (pmap_extract(pmap, va, &pa) == FALSE) {
		kqemu_log(...);
		return NULL;
	}
	if (kqemu_debug > 0)
		kqemu_log("kqemu_alloc_zeroed_page: va=%08x pa=%08x\n",
			va, pa);

	*ppage_index = pa >> PAGE_SHIFT;
	return (struct kqemu_page *)va;
  }

  struct kqemu_user_page *
  kqemu_lock_user_page(unsigned long *ppage_index, unsigned long user_addr)
  {
        struct  vmspace *vm;
        vaddr_t va;
        paddr_t pa;
        pmap_t  pmap;

        vm = curproc->p_vmspace;
        va = (vaddr_t)user_addr;
        pa = 0;
        /* FIXME - fails */
        if (uvm_map_pageable(&vm->vm_map, va, va+PAGE_SIZE,
                                FALSE, 0) == FALSE) {
                kqemu_log("kqemu_lock_user_page(%08lx) failed\n", va);
                return NULL;
        }
        pmap = vm_map_pmap(&vm->vm_map);
        if (pmap_extract(pmap, va, &pa) == FALSE) {
                kqemu_log(...);
                return NULL;
        }
        *ppage_index = pa >> PAGE_SHIFT;
        return (struct kqemu_user_page *)va;
  }
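
Both functions hand back a page index computed as pa >> PAGE_SHIFT. The following is a minimal userland sketch of that arithmetic, assuming 4 KiB pages (a shift of 12, as on i386). The SKETCH_* macros and sketch_* functions are hypothetical names used only for illustration; they are not part of kqemu or NetBSD.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: 4 KiB pages, i.e. PAGE_SHIFT == 12 as on i386. */
#define SKETCH_PAGE_SHIFT 12
#define SKETCH_PAGE_MASK  ((1UL << SKETCH_PAGE_SHIFT) - 1)

/* Hypothetical helper: physical address -> page frame index. */
unsigned long
sketch_page_index(uint64_t pa)
{
	return (unsigned long)(pa >> SKETCH_PAGE_SHIFT);
}

/* Hypothetical helper: byte offset of an address within its page. */
unsigned long
sketch_page_offset(uint64_t pa)
{
	return (unsigned long)(pa & SKETCH_PAGE_MASK);
}

/* Self-check using a page-aligned physical address. */
void
sketch_page_selfcheck(void)
{
	assert(sketch_page_index(0x24c95000ULL) == 0x24c95UL);
	assert(sketch_page_offset(0x24c95000ULL) == 0);
}
```

Because the allocations are page-aligned, the offset is zero and the index alone identifies the physical page.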

The issue is that uvm_map_pageable(9) fails in kqemu_lock_user_page():

  Apr 24 18:58:12 isla /netbsd: kqemu: kqemu_alloc_zeroed_page:
	  va=d49e0000 pa=24c95000
  Apr 24 18:58:12 isla /netbsd: kqemu: kqemu_alloc_zeroed_page:
	  va=d49e1000 pa=260d6000
  Apr 24 18:58:12 isla /netbsd: kqemu: kqemu_lock_user_page(b305d000) failed
 
If I put an equivalent statement in kqemu_alloc_zeroed_page() directly
following uvm_km_alloc(), it succeeds.
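
One possible factor, offered as an assumption to verify against uvm(9) rather than as a diagnosis: if uvm_map_pageable() returns an int error code (0 on success, an errno value on failure) while pmap_extract() returns a boolean, then comparing its result with FALSE would treat success as failure and failure as success. The userland sketch below only illustrates that mismatch. sketch_wire_range() is a hypothetical stand-in, not the kernel function, and SKETCH_FALSE mirrors the kernel's FALSE.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

#define SKETCH_FALSE 0	/* mirrors the kernel's FALSE; hypothetical name */

/*
 * Hypothetical stand-in for a call that follows the errno convention:
 * returns 0 on success and an errno value on failure.
 */
int
sketch_wire_range(bool simulate_failure)
{
	return simulate_failure ? EFAULT : 0;
}

/*
 * Boolean-style test applied to an errno-style result.
 * Returns true when the caller would take the "failed" branch.
 */
bool
sketch_failed_boolean_style(int error)
{
	return error == SKETCH_FALSE;	/* true on success: inverted */
}

/* Errno-style test: any non-zero result is a failure. */
bool
sketch_failed_errno_style(int error)
{
	return error != 0;
}
```

Under that assumption, a call that succeeds would be logged as failed, and a call that returns an error would look like a success.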

My current thought is that the vm_map is the culprit here.  From a
recent thread, I gather that 'kernel_map' should not be used (my use of
it is inherited from FreeBSD's code).  Does it seem likely that this is
the issue?  What else should I consider?

Also, is there any documentation that describes, in a high-level way,
how uvm(9) works?  The Internals Guide is woefully incomplete, and I
would imagine that would be the proper place for such documentation.

And, on a procedural note, is it bothersome for me to post this much
code in email?  I'd hate to put anyone off.

Many thanks,
  - Oliver