Subject: Partial fix for the NetBSD VM problem
To: None <tech-kern@NetBSD.ORG>
From: Greg Hudson <ghudson@MIT.EDU>
List: tech-kern
Date: 03/02/1996 11:35:09
Included below is a patch to fix what I consider to be the important
parts of the NetBSD VM problem:

	* msync() doesn't sync all pages of a MAP_PRIVATE map.
	  (Thanks to Mike Hibler for identifying the fix.)

	* When you mmap() a file, you can wind up sharing pages with a
	  previous map which is out of date with the actual file.  It
	  should not be necessary for every program to msync() after
	  every mmap() call just to get a current view of the file.

After this patch, the following problems will remain:

	* If you modify a file through the filesystem, the changes
	  will not necessarily affect mmap()'d regions until an
	  msync().

	* If you modify an mmap()'d region with MAP_SHARE, the changes
	  will not necessarily affect the filesystem until an msync()
	  or munmap().

I don't consider either of the above two problems serious, since they
don't affect vi (as far as I can tell, vi should be using MAP_COPY
anyway) and they don't affect running executables.  Fixing the first
problem could be done by adding sufficient hair to the buffer cache to
invalidate VM pages when files change; fixing the second problem
requires a unified VM and buffer cache.  I believe POSIX.1b allows one
or both of the above behaviors, but I haven't checked.

Anyway, the patch below is simple and I've tested it.  If there aren't
any objections, I'll check it in in a few days.

*** vm_mmap.c	1996/03/01 06:11:19	1.4
--- vm_mmap.c	1996/03/02 16:12:29
***************
*** 879,884 ****
--- 879,887 ----
  			printf("vm_mmap(%d): FILE *addr %x size %x pager %p\n",
  			       curproc->p_pid, *addr, size, pager);
  #endif
+ 
+ 		/* Make sure new mapping is in sync with filesystem */
+ 		vm_map_clean(map, *addr, *addr+size, TRUE, TRUE);
  	}
  	/*
  	 * Correct protection (default is VM_PROT_ALL).
*** vm_map.c	1996/03/02 08:51:37	1.1
--- vm_map.c	1996/03/02 08:51:21
***************
*** 1381,1387 ****
  	register vm_map_entry_t current;
  	vm_map_entry_t entry;
  	vm_size_t size;
! 	vm_object_t object;
  	vm_offset_t offset;
  
  	vm_map_lock_read(map);
--- 1381,1387 ----
  	register vm_map_entry_t current;
  	vm_map_entry_t entry;
  	vm_size_t size;
! 	vm_object_t object, shadow;
  	vm_offset_t offset;
  
  	vm_map_lock_read(map);
***************
*** 1433,1452 ****
  			object = current->object.vm_object;
  			vm_object_lock(object);
  		}
! 		/*
! 		 * Flush pages if writing is allowed.
! 		 * XXX should we continue on an error?
! 		 */
! 		if ((current->protection & VM_PROT_WRITE) &&
! 		    !vm_object_page_clean(object, offset, offset+size,
! 					  syncio, FALSE)) {
  			vm_object_unlock(object);
  			vm_map_unlock_read(map);
  			return(KERN_FAILURE);
! 		}
! 		if (invalidate)
  			vm_object_page_remove(object, offset, offset+size);
! 		vm_object_unlock(object);
  		start += size;
  	}
  
--- 1433,1459 ----
  			object = current->object.vm_object;
  			vm_object_lock(object);
  		}
! 		while (object) {
! 		    /*
! 		     * Flush pages if writing is allowed.
! 		     * XXX should we continue on an error?
! 		     */
! 		    if ((current->protection & VM_PROT_WRITE) &&
! 			!vm_object_page_clean(object, offset, offset+size,
! 					      syncio, FALSE)) {
  			vm_object_unlock(object);
  			vm_map_unlock_read(map);
  			return(KERN_FAILURE);
! 		    }
! 		    if (invalidate)
  			vm_object_page_remove(object, offset, offset+size);
! 
! 		    shadow = object->shadow;
! 		    if (shadow)
! 			vm_object_lock(shadow);
! 		    vm_object_unlock(object);
! 		    object = shadow;
! 		}
  		start += size;
  	}