Subject: copy on write md(4)
To: None <tech-kern@netbsd.org>
From: Darrin B.Jewell <dbj@netbsd.org>
List: tech-kern
Date: 06/10/2005 14:17:48
--=-=-=


I started looking at what it would take to do copy on write
support for an md(4) device.

So to start, I'm trying to use page loaning for the memcpy
in an mdstrategy() read.  I created a rather hackish patch, included
below, which creates managed page mappings for the memory disk
memory so that I could use a uvm page loan.
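
To make the intended semantics concrete, here is a minimal user-space
sketch of what a loan plus copy on write would mean for a single page.
It is only a toy model under stated assumptions: struct toy_page and the
toy_page_*() functions are hypothetical names invented for illustration,
and none of this is the UVM loan API or code from the patch.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define TOY_PAGE_SIZE 4096

/* Hypothetical stand-in for a physical page; this is NOT struct vm_page. */
struct toy_page {
	unsigned refs;				/* holders, owner included */
	unsigned char data[TOY_PAGE_SIZE];
};

/* Allocate a zero-filled page holding one reference (the owner's). */
struct toy_page *
toy_page_alloc(void)
{
	struct toy_page *pg = calloc(1, sizeof(*pg));

	if (pg != NULL)
		pg->refs = 1;
	return pg;
}

/* Loan the page to a reader: no data is copied, only a reference is added. */
struct toy_page *
toy_page_loan(struct toy_page *pg)
{
	pg->refs++;
	return pg;
}

/* Drop one reference and free the page when the last one goes away. */
void
toy_page_unloan(struct toy_page *pg)
{
	if (--pg->refs == 0)
		free(pg);
}

/*
 * Write len bytes at offset off through *slot.  If the page is shared,
 * break the sharing first by giving this holder a private copy.
 * Returns 0 on success, -1 on a bad range or allocation failure.
 */
int
toy_page_write(struct toy_page **slot, size_t off, const void *src, size_t len)
{
	struct toy_page *pg = *slot;

	if (off > TOY_PAGE_SIZE || len > TOY_PAGE_SIZE - off)
		return -1;
	if (pg->refs > 1) {
		struct toy_page *copy = malloc(sizeof(*copy));

		if (copy == NULL)
			return -1;
		memcpy(copy->data, pg->data, TOY_PAGE_SIZE);
		copy->refs = 1;
		pg->refs--;		/* this holder leaves the shared page */
		*slot = copy;
		pg = copy;
	}
	memcpy(pg->data + off, src, len);
	return 0;
}
```

In this model a read costs one reference instead of a page-sized copy,
and the copy is deferred until one side writes.  The real case is harder
because the buffer's existing pages and mappings have to go somewhere.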

Now that I can establish a loan on these pages, what do I do with the
bp->b_data pages that they would replace?  Can I just pmap_remove()
the old pages and uvm_pagefree() them?  This seems rather naive, since
I'm completely ignoring any information on where they originated.
Similarly, I assume using pmap_enter() to establish the newly loaned
pages will have problems later when the caller is done with them.

Can anyone help steer me a little on this?

Thanks,
Darrin


--=-=-=
Content-Type: text/x-patch
Content-Disposition: attachment; filename=md_root.diff
Content-Description: patch to add page mappings for md_root_image

Index: md_root.c
===================================================================
RCS file: /cvsroot/wasabisrc/src/sys/dev/md_root.c,v
retrieving revision 1.1.1.6
diff -u -r1.1.1.6 md_root.c
--- md_root.c	12 May 2003 22:41:49 -0000	1.1.1.6
+++ md_root.c	10 Jun 2005 18:03:40 -0000
@@ -47,6 +47,8 @@
 
 #include <dev/md.h>
 
+#include <uvm/uvm.h>
+
 extern int boothowto;
 
 #ifdef MEMORY_DISK_DYNAMIC
@@ -77,8 +79,9 @@
  * This array will be patched to contain a file-system image.
  * See the program mdsetimage(8) for details.
  */
-u_int32_t md_root_size = ROOTBYTES;
-char md_root_image[ROOTBYTES] = "|This is the root ramdisk!\n";
+u_int32_t md_root_size = round_page(ROOTBYTES);
+char md_root_image[ROOTBYTES] __aligned(PAGE_SIZE)
+     = "|This is the root ramdisk!\n";
 #endif /* MEMORY_DISK_IMAGE */
 #endif /* MEMORY_DISK_DYNAMIC */
 
@@ -108,6 +111,69 @@
 		md->md_addr = (caddr_t)md_root_image;
 		md->md_size = (size_t)md_root_size;
 		md->md_type = MD_KMEM_FIXED;
+
+    {
+      vaddr_t ova, va;
+      paddr_t opa, pa;
+      int err;
+      size_t osz, sz;
+      
+      ova = va = (vaddr_t)md_root_image;
+      sz = md_root_size;
+
+      err = pmap_extract(pmap_kernel(), va, &pa);
+      KASSERT(err);
+
+      /* XXX wouldn't it be nice to put this back where it was using UVM_FLAG_FIXED ?
+       * i ran into errors trying to get it to split the kernel map, so I let
+       * it put it wherever it wants.  maybe use a submap?  maybe use separate aobj?
+       */
+      err = uvm_map(kernel_map, &va, sz, uvm.kernel_object,
+                    va - vm_map_min(kernel_map), 0,
+                    UVM_MAPFLAG(UVM_PROT_ALL, UVM_PROT_ALL, UVM_INH_NONE, UVM_ADV_RANDOM,
+                                UVM_FLAG_QUANTUM /* | UVM_FLAG_FIXED */));
+      if (err) {
+        printf("md%d: unable to remap image\n", unit);
+      } else {
+        printf("md%d: remapped md_root_image at %llx\n", unit, (unsigned long long)va);
+        md->md_addr = (caddr_t)va;
+
+        pmap_kremove(ova, sz);
+        /* XXX assumes md_root_image is contiguous pa */
+        uvm_page_physload(atop(pa), atop(pa+sz), 0, 0, VM_FREELIST_DEFAULT);
+
+        ova = va;
+        opa = pa;
+        osz = sz;
+
+        /* XXX is this even necessary? */
+        simple_lock(&uvm.kernel_object->vmobjlock);
+        uvm_lock_pageq();
+        while (sz) {
+          uvm_pagerealloc(PHYS_TO_VM_PAGE(pa), uvm.kernel_object,
+                          va - vm_map_min(kernel_map));
+          va += PAGE_SIZE;
+          pa += PAGE_SIZE;
+          sz -= PAGE_SIZE;
+        }
+        uvm_unlock_pageq();
+        simple_unlock(&uvm.kernel_object->vmobjlock);
+
+        sz = osz;
+        va = ova;
+        pa = opa;
+
+        /* XXX Could this be done in the above loop, while holding the locks instead ?*/
+        while (sz) {
+          pmap_enter(pmap_kernel(), va, pa, VM_PROT_ALL, PMAP_WIRED);
+          va += PAGE_SIZE;
+          pa += PAGE_SIZE;
+          sz -= PAGE_SIZE;
+        }
+        pmap_update(pmap_kernel());
+      }
+    }
+
 		format_bytes(pbuf, sizeof(pbuf), md->md_size);
 		aprint_normal("md%d: internal %s image area\n", unit, pbuf);
 	}

--=-=-=--