Subject: Re: More about the recent panic
To: Jukka Marin <jmarin@pyy.jmp.fi>
From: Charles M. Hannum <mycroft@mit.edu>
List: current-users
Date: 09/04/1996 15:16:23

Jukka Marin <jmarin@pyy.jmp.fi> writes:

> 
> panic: locking against myself
> #0  0xf801dbe4 in mi_switch ()
> #1  0xf80d4578 in boot ()
> #2  0xf801fda8 in panic ()
> #3  0xf80aad70 in ufs_lock ()
> #4  0xf80b7380 in vnode_pager_uncache ()
> #5  0xf80b6edc in vm_allocate_with_pager ()
> #6  0xf80b57a0 in vm_pager_get_pages ()
> #7  0xf80b5828 in vm_pager_get ()
> #8  0xf80ada6c in vm_fault ()
> #9  0xf80da630 in mem_access_fault ()
> #10 0xf80052fc in kernel_text ()
> #11 0xf80064a4 in copyout ()
> #12 0xf80a38f4 in ffs_write ()
> #13 0xf803e1ac in vn_write ()
> #14 0xf80215d8 in sys_write ()
> #15 0xf80dab20 in syscall ()
> #16 0xf8005530 in kernel_text ()

Well, this is reasonably straightforward.  Basically, you've mmap(2)ed
the file, and you're trying to write its contents back out to itself.
vn_write() locks the file to prevent simultaneous writes.  When FFS
later tries to copy the data in, it takes a page fault and tries to
page the data in from that same file.  Unfortunately, the file is
already locked by the very same process, so ufs_lock() panics and
you lose.
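
For reference, the access pattern described above looks roughly like
this from user space.  This is only a sketch of the pattern, not the
program from the report: the scratch path, the size, and the function
name are made up for illustration, and whether it actually faults
inside the write depends on whether the mapped pages are already
resident.

```c
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

#define DEMO_LEN 8192	/* arbitrary size chosen for this sketch */

/*
 * Create a scratch file, map it, and write the mapping back to the
 * same file.  Returns 0 if the write completes, -1 on any error.
 */
int
selfwrite_demo(void)
{
	char path[] = "/tmp/selfwriteXXXXXX";	/* hypothetical scratch file */
	char buf[DEMO_LEN];
	ssize_t n;
	void *map;
	int fd;

	fd = mkstemp(path);
	if (fd < 0)
		return -1;
	unlink(path);		/* file goes away once fd is closed */

	/* Give the file some contents to map. */
	memset(buf, 'x', sizeof(buf));
	if (write(fd, buf, sizeof(buf)) != (ssize_t)sizeof(buf)) {
		close(fd);
		return -1;
	}

	map = mmap(NULL, DEMO_LEN, PROT_READ, MAP_SHARED, fd, 0);
	if (map == MAP_FAILED) {
		close(fd);
		return -1;
	}

	/* The source of this write is a mapping of the target file. */
	if (lseek(fd, 0, SEEK_SET) == (off_t)-1) {
		munmap(map, DEMO_LEN);
		close(fd);
		return -1;
	}
	n = write(fd, map, DEMO_LEN);

	munmap(map, DEMO_LEN);
	close(fd);
	return (n == DEMO_LEN) ? 0 : -1;
}
```

On a kernel without the problem, selfwrite_demo() should simply
return 0.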

No doubt someone will want to suggest that we need to separate
read-write and read-only locks (and this is indeed true), but it
wouldn't help in this case, because the first lock is read-write.

It would seem that there is also a related deadlock.  Imagine that
process A maps file A, and then tries to write the contents out to
file B.  Further imagine that process B simultaneously maps file B and
tries to write the contents out to file A.  Both processes will hang
waiting for the second lock.
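
To make the lock ordering concrete, here is a toy model of both
cases.  It is not kernel code: the toylock structure, the result
codes, and the function names are all invented for the illustration,
and "would sleep" is modelled as a return value so the example can
run to completion instead of hanging.

```c
#include <stddef.h>

#define NOBODY	(-1)

/* Toy exclusive, non-recursive lock: just remembers its owner. */
struct toylock {
	int owner;
};

enum lockres {
	LK_OK,		/* lock acquired */
	LK_SELF,	/* caller already holds it: "locking against myself" */
	LK_BUSY		/* someone else holds it: caller would sleep */
};

static enum lockres
toy_trylock(struct toylock *lk, int pid)
{
	if (lk->owner == pid)
		return LK_SELF;
	if (lk->owner != NOBODY)
		return LK_BUSY;
	lk->owner = pid;
	return LK_OK;
}

/* One process writes a mapping of a file back to that file. */
int
self_write_panics(void)
{
	struct toylock file = { NOBODY };

	if (toy_trylock(&file, 1) != LK_OK)	/* write path locks the file */
		return 0;
	return toy_trylock(&file, 1) == LK_SELF;	/* fault relocks it */
}

/* Process A writes map(A) to B while process B writes map(B) to A. */
int
cross_write_deadlocks(void)
{
	struct toylock file_a = { NOBODY }, file_b = { NOBODY };
	const int proc_a = 1, proc_b = 2;
	int a_blocked, b_blocked;

	/* Each write path locks its target file first. */
	if (toy_trylock(&file_b, proc_a) != LK_OK)
		return 0;
	if (toy_trylock(&file_a, proc_b) != LK_OK)
		return 0;

	/* Each then faults on its source mapping and needs the other lock. */
	a_blocked = toy_trylock(&file_a, proc_a) == LK_BUSY;
	b_blocked = toy_trylock(&file_b, proc_b) == LK_BUSY;

	return a_blocked && b_blocked;
}
```

In this model both functions return 1: the single-process case trips
the self-lock check, and the two-process case leaves each process
waiting on a lock the other holds.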

Probably the simplest thing to do would be to have vn_write() (or
maybe even sys_write()) split up the request and prefault the data
with vmapbuf(), a la physio().  This could also reduce thrashing in
some circumstances.
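
The shape of that idea can be sketched in user space, with the caveat
that this is only an analogy.  vmapbuf() is a kernel interface and is
not used here; instead each chunk is copied into a private bounce
buffer, so any fault on the source happens before write() is entered.
The function name and chunk size are assumptions made up for the
sketch.

```c
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

#define CHUNK 4096	/* assumed chunk size for this sketch */

/*
 * Write len bytes from src to fd in CHUNK-sized pieces, touching each
 * piece of the source before the write call is made.
 * Returns the number of bytes written, or -1 on error.
 */
ssize_t
chunked_write(int fd, const void *src, size_t len)
{
	char bounce[CHUNK];
	const char *p = src;
	size_t done = 0;

	while (done < len) {
		size_t want = len - done;
		size_t off = 0;

		if (want > CHUNK)
			want = CHUNK;

		/* Any page fault on the source is taken here. */
		memcpy(bounce, p + done, want);

		/* The write itself only ever reads the bounce buffer. */
		while (off < want) {
			ssize_t n = write(fd, bounce + off, want - off);

			if (n <= 0)
				return -1;
			off += (size_t)n;
		}
		done += want;
	}
	return (ssize_t)done;
}
```

The extra copy is the price of doing this outside the kernel; the
in-kernel version suggested above would avoid it by working on the
user pages directly.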

> I'll try to reproduce the problem sometime later (can't take this system
> down again now).

Um, don't bother.  B-)