Subject: port-m68k/4120: vmapbuf()/vunmapbuf() bug in all m68k ports
To: None <>
From: None <>
List: netbsd-bugs
Date: 09/17/1997 21:25:43
>Number:         4120
>Category:       port-m68k
>Synopsis:       m68k port implementation of vunmapbuf() has a bug
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    gnats-admin (GNATS administrator)
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Wed Sep 17 20:35:01 1997
>Originator:     Michael L. Hitch
	Montana State University
>Release:        970913
System: NetBSD 1.2G NetBSD 1.2G (ZEUS) #970913-11: Wed Sep 17 12:16:03 MDT 1997 mhitch@:/opt/tmp/src/sys/arch/amiga/compile/ZEUS amiga

All the m68k ports are using nearly the same implementation of vmapbuf()
and vunmapbuf().  They all use bp->b_bcount to compute the size of the
buffer to map/unmap.  Normally this works just fine, but there are cases
where bp->b_bcount has been changed by the I/O operation [attempting to
read a large buffer at the end of a disk partition is where I found it].
This results in vunmapbuf() not deallocating all of the memory mapped by
vmapbuf().  The VM system is unaware of the pages left mapped, and will
return those pages back to the free memory list.
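
The size of the leak can be illustrated with a little page arithmetic.
This is only a sketch, not the port code: TOY_PAGE_SIZE (8K assumed
here) and both function names are made up for the illustration.  It
counts the pages a buffer spans when mapped with the original count,
and again when unmapped with the count the I/O operation left behind.

```c
#define TOY_PAGE_SIZE 8192UL	/* assumed page size, illustration only */

/* Number of pages spanned by a buffer of len bytes starting at addr. */
unsigned long
toy_pages_spanned(unsigned long addr, unsigned long len)
{
	unsigned long off = addr & (TOY_PAGE_SIZE - 1);	/* offset in first page */

	return (off + len + TOY_PAGE_SIZE - 1) / TOY_PAGE_SIZE;
}

/*
 * Pages left mapped when the map side used count_at_map but the unmap
 * side recomputed the size from a smaller count_at_unmap.
 */
unsigned long
toy_pages_leaked(unsigned long addr, unsigned long count_at_map,
    unsigned long count_at_unmap)
{
	return toy_pages_spanned(addr, count_at_map) -
	    toy_pages_spanned(addr, count_at_unmap);
}
```

With a page-aligned 64k request that the I/O operation shortens to 16k,
toy_pages_leaked(0, 65536, 16384) comes out to 6 pages under the 8K
page size assumed above.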

This is particularly nasty on the amiga and other ports that allocate pv
entries in the same fashion.  When vmapbuf() enters a mapping for the
user pages in the buffer, a new pv entry needs to be allocated.  If
there are no free pv entries available, kmem_alloc() is called to
allocate a new page.  When allocating the new page for the pv entries,
vm_fault appears to be used to set up the mapping for the new page.  If
the new page being faulted is one of the pages that vunmapbuf() failed
to deallocate, then there exists an entry in pv_table for that page, and
pmap has to allocate a new pv entry [haven't I been here before?].  At
this point the system gets into a deadlock:  vm_fault() has locked the
kernel_map with a shared lock, and kmem_alloc() wants an exclusive lock
and will never get it.
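
The shape of that deadlock can be modelled with a toy read/write lock.
This is a sketch under my own assumptions: struct toy_rwlock and the
toy_* functions are hypothetical stand-ins, not the kernel's lock code,
and they use try-style calls so the example returns instead of
sleeping.

```c
#include <stdbool.h>

/* Toy read/write lock; a stand-in for the lock on kernel_map. */
struct toy_rwlock {
	int	readers;	/* number of shared holders */
	bool	writer;		/* true while held exclusively */
};

/* Take the lock shared unless it is held exclusively. */
bool
toy_try_shared(struct toy_rwlock *l)
{
	if (l->writer)
		return false;
	l->readers++;
	return true;
}

/* Take the lock exclusively only if nobody holds it at all. */
bool
toy_try_exclusive(struct toy_rwlock *l)
{
	if (l->writer || l->readers > 0)
		return false;
	l->writer = true;
	return true;
}

/*
 * Models the nested path: the fault path holds the map shared, and the
 * allocation reached from inside it asks for the same map exclusively.
 * Returns true if the inner request could be granted.
 */
bool
toy_nested_alloc_can_proceed(void)
{
	struct toy_rwlock map = { 0, false };

	toy_try_shared(&map);		/* outer: fault handling */
	return toy_try_exclusive(&map);	/* inner: new pv entry page */
}
```

toy_nested_alloc_can_proceed() returns false: the shared hold is never
dropped by the outer path, so a blocking exclusive request in its place
would sleep forever.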

I was able to replicate this quite easily by running successive 'dd' on
disk partitions (dd if=/dev/rsd0a of=/dev/null bs=64k) on an amiga.  If
the disk partition isn't a multiple of the dd buffer size, the system will
eventually hang with the process going into "thrd_sleep" waiting for the
kernel_map lock.  I think this will probably occur on most, if not all,
of the other m68k ports.

The simplest and easiest fix is to use the buffer size passed as a parameter
to vmapbuf()/vunmapbuf() instead of bp->b_bcount.
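
A minimal model of that fix, assuming a toy buffer structure and a
simple page counter in place of the real pmap calls (struct toy_buf,
TOY_NBPG and every toy_* name are hypothetical, not the port sources):
both sides size the mapping from the len argument, so a change to
b_bcount during the I/O no longer matters.

```c
#define TOY_NBPG 8192UL		/* assumed page size, illustration only */

struct toy_buf {
	unsigned long b_addr;	/* start address of the user buffer */
	unsigned long b_bcount;	/* byte count; the I/O path may change it */
};

static unsigned long toy_mapped_pages;	/* pages mapped in this model */

/* Number of pages spanned by len bytes starting at addr. */
static unsigned long
toy_npages(unsigned long addr, unsigned long len)
{
	return ((addr & (TOY_NBPG - 1)) + len + TOY_NBPG - 1) / TOY_NBPG;
}

/* Map side: sized from len, never from bp->b_bcount. */
void
toy_vmapbuf(struct toy_buf *bp, unsigned long len)
{
	toy_mapped_pages += toy_npages(bp->b_addr, len);
}

/* Unmap side: the same len, so exactly the same pages are released. */
void
toy_vunmapbuf(struct toy_buf *bp, unsigned long len)
{
	toy_mapped_pages -= toy_npages(bp->b_addr, len);
}

/* One transfer of len bytes where the I/O cuts b_bcount down to done. */
unsigned long
toy_transfer(unsigned long len, unsigned long done)
{
	struct toy_buf b = { 0x10000UL, len };

	toy_mapped_pages = 0;
	toy_vmapbuf(&b, len);
	b.b_bcount = done;		/* short read at end of partition */
	toy_vunmapbuf(&b, len);
	return toy_mapped_pages;	/* pages still mapped afterwards */
}
```

toy_transfer(65536, 16384) returns 0 here; sizing the unmap from
b.b_bcount instead would leave pages behind in the same model.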

The buffer mapping could also be done in a similar manner to the other
non-m68k ports by copying the pte entries.  The other port code doesn't
work as-is on most 68k ports though.  The amiga pmap.c currently will
panic if DEBUG is defined, because the pages are not recorded in
pv_table, which pmap_remove() won't like when kmem_free_wakeup() removes
the mappings.  There may also be a problem if the pte entries mapping
the buffer cross a page boundary of the kernel page tables.  [I'm not
certain if pmap could deal with a fault when the invalid page is