tech-kern: Re: Observations on our VM system problem

Subject: Re: Observations on our VM system problem
To: None <ghudson@mit.edu, tech-kern@NetBSD.ORG>
From: Brian Buhrow <buhrow@cats.ucsc.edu>
List: tech-kern
Date: 03/01/1996 13:17:56
	I thought you might be interested to know that I tried your test program on 
NetBSD 0.9A (February 1994) and it prints "hello\nthere"
which seems to indicate that this system works as it is supposed to.  The
only change I made to your test file was to use the MAP_FILE flag instead
of 0 in the flags argument of the mmap call.  Would it help in
understanding what is going on if I sent you chunks of the source code I'm
running?  I'd guess that the code is somewhat evolved from January 1994,
but diffs might yield some interesting things.
	Let me know if you want to pursue this further.  I'd be interested
in offering assistance and ideas where possible.
-Brian

On Mar 1,  6:12am, ghudson@MIT.EDU wrote:
} Subject: Observations on our VM system problem
} I wrote a test program to illustrate the canonical NetBSD VM system
} problem (vnode-backed VM pages can get out of sync with the filesystem
} and stay that way indefinitely).  I've included the test program at
} the end of this message.  It should print "Hello.\nThere.\n" if it
} works properly; it actually prints "Hello.\n\0\0\0\0\0\0\0".  Notice
} that an msync() after the second mmap() does not avert the problem (so
} the patch I recently sent to the guy on netbsd-help is useless).
} 
} I'm trying to figure out why msync() doesn't help.  It's easy to
} determine that it's supposed to work; it attempts to free all the
} physical pages associated with the vm objects corresponding to the
} memory region you specify.  Some curious things I noticed while
} testing:
} 
} 	* sys_msync() calls vm_map_clean() which calls
} 	  vm_object_page_remove() to remove all physical pages from
} 	  the vm object.  When I watched this happen under DDB, there
} 	  were apparently no physical pages in the vm object's memq
} 	  (there were zero iterations of the loop in
} 	  vm_object_page_remove()).
} 
} 	* I tried breaking on vm_fault() after the invocation of
} 	  sys_msync(), and my breakpoint was not triggered until AFTER
} 	  "Hello.\n\0\0\0\0\0\0\0" was printed.  This means no page
} 	  fault occurred when addr was referenced after the msync()
} 	  call.  Thus, there must have been a physical page set up to
} 	  take the address reference, even though it wasn't found in
} 	  the vm object's memq.
} 
} Can anyone shed some light on this?  It could mean that vm_map_clean()
} is using incorrect logic and winds up cleaning the wrong objects, or
} it could mean that the vm structures are getting corrupt and there's a
} page corresponding to this vm object which simply isn't being cleaned.
} 
} #include <stdio.h>
} #include <fcntl.h>
} #include <unistd.h>
} #include <sys/types.h>
} #include <sys/stat.h>
} #include <sys/mman.h>
} 
} int main()
} {
}     char buf[128] = "/tmp/test-vm.XXXX", c;
}     int fd, fd_hold;
}     caddr_t addr, addr_hold;
} 
}     /* Create a file. */
}     fd = mkstemp(buf);
}     write(fd, "Hello.\n", 7);
}     close(fd);
} 
}     /* mmap() the file and bring the page into the VM cache. */
}     fd_hold = open(buf, O_RDONLY, 0);
}     addr_hold = mmap(NULL, 7, PROT_READ, 0, fd_hold, 0);
}     c = *addr_hold;
} 
}     /* Modify the file through the buffer cache. */
}     fd = open(buf, O_RDWR | O_APPEND, 0);
}     write(fd, "There.\n", 7);
}     close(fd);
} 
}     /* Now mmap() the file again, and display its contents. */
}     fd = open(buf, O_RDONLY, 0);
}     addr = mmap(NULL, 14, PROT_READ, 0, fd, 0);
}     c = *addr;
}     msync(addr, 14);
}     write(1, addr, 14);
}     munmap(addr, 14);
}     close(fd);
} 
}     unlink(buf);
} }
} 
>-- End of excerpt from ghudson@MIT.EDU
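
For anyone who wants to rerun the quoted test on a current system, the following is a rough sketch of the same sequence written against present-day POSIX interfaces.  It is not the original program.  The assumptions are: the helper name vm_sync_probe() is hypothetical, the mkstemp() template uses six X's, the mappings use MAP_SHARED where the original passed 0 (and where the reply above used MAP_FILE), and msync() takes the three-argument form with MS_SYNC | MS_INVALIDATE instead of the two-argument call shown in the excerpt.

```c
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical helper, not part of the original post.  Replays the
 * quoted sequence and copies the 14 bytes seen through the second
 * mapping into out.  Returns 0 on success, -1 on any failure.
 */
int vm_sync_probe(char out[14])
{
    char path[] = "/tmp/test-vm.XXXXXX";   /* assumption: six X's */
    char *hold = MAP_FAILED, *addr = MAP_FAILED;
    int fd, fd_hold = -1, rc = -1;
    ssize_t n;
    volatile char c;

    /* Create a file. */
    fd = mkstemp(path);
    if (fd < 0)
        return -1;
    n = write(fd, "Hello.\n", 7);
    close(fd);
    if (n != 7)
        goto done;

    /* mmap() the file and touch the page so it is resident. */
    fd_hold = open(path, O_RDONLY);
    if (fd_hold < 0)
        goto done;
    hold = mmap(NULL, 7, PROT_READ, MAP_SHARED, fd_hold, 0);
    if (hold == MAP_FAILED)
        goto done;
    c = *hold;

    /* Modify the file through write(), not through the mapping. */
    fd = open(path, O_WRONLY | O_APPEND);
    if (fd < 0)
        goto done;
    n = write(fd, "There.\n", 7);
    close(fd);
    if (n != 7)
        goto done;

    /* Map the file again, touch it, msync() it, and read it back. */
    fd = open(path, O_RDONLY);
    if (fd < 0)
        goto done;
    addr = mmap(NULL, 14, PROT_READ, MAP_SHARED, fd, 0);
    close(fd);                  /* the mapping outlives the descriptor */
    if (addr == MAP_FAILED)
        goto done;
    c = *addr;
    (void)c;
    if (msync(addr, 14, MS_SYNC | MS_INVALIDATE) != 0)
        goto done;
    memcpy(out, addr, 14);
    rc = 0;

done:
    if (addr != MAP_FAILED)
        munmap(addr, 14);
    if (hold != MAP_FAILED)
        munmap(hold, 7);
    if (fd_hold >= 0)
        close(fd_hold);
    unlink(path);
    return rc;
}
```

Calling vm_sync_probe() from a small main() and writing the 14 bytes to descriptor 1 gives the same observable behaviour as the quoted program.  On a system where file pages and mapped pages share one cache, the buffer would be expected to hold "Hello.\nThere.\n"; seven trailing NUL bytes in place of "There.\n" would correspond to the stale-page symptom described in the excerpt.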