Subject: Re: NetBSD Security Advisory 1999-008
To: Matthew Orgass <darkstar@pgh.net>
From: Bill Studenmund <wrstuden@nas.nasa.gov>
List: current-users
Date: 04/15/1999 15:51:14
Note: Simon, your address (Simon Burge <simonb@telstra.com.au>) is not
feeling well; all my mail to it was bouncing.

On Thu, 15 Apr 1999, Matthew Orgass wrote:

> On Thu, 15 Apr 1999, Bill Studenmund wrote:
> 
> > Depending on whether some_path starts with a "/" or not, you get different
> problems. If no slash, you (should) get the panic, and with slash, you get
> > a node left locked which will never get unlocked.
> 
>  Um...
> 
> [1.3.3]
> 
> cd /tmp
> ln -s /tmp/ test
> ln -s /tmp/ test
> [hangs]
> 
> cd /tmp
> ln -s / test
> ln -s / test
> ls
> [hangs]
> 
>   Sure looks to me like the directory matters...

We are in a mood to split hairs, aren't we? :-)

>   It seems to me that the first case gets a "locking against myself" panic
> in 1.4_ALPHA because the symlink is in the directory that is being linked
> (which presumably needs to be linked too), thus it encounters the
> unreleased lock while still linking, detects this, and panics.   In the
> second case, it never hits the unreleased lock while in ln, but when
> something else tries to access it, it hangs.

Yep!

>   In which case it would seem that the pmax behavior is what *should* be
> happening.  ls should see that it is locked and patiently wait until it
> gets unlocked, as happens with other blocking processes 

The "pmax is not doing right" comment revolves around the fact that the
first case hangs rather than panics. The lock manager is sitting there
(i.e. sleeping) waiting for a lock to be freed. However, the lock is held
by the sleeping process itself!

> (like the
> scsipi(?) bug that caused something to be blocked when I tried to open
> /dev/cd0a that was recently fixed).  An interesting note on that bug was
> that when I tried to read the disklabel, it would hang and couldn't be
> killed, but nothing else was affected.  However when I tried to mount it,
> the mount point also remained locked, which eventually caused the entire
> system to hang.

Right. The reason we want a panic in this case is to prevent silent death
like above. At least w/ a panic, the operator knows why a machine died.
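
To make the hang-versus-panic distinction concrete, here is a rough sketch
in Python. It is not the kernel lock manager; ToyLock and
LockingAgainstMyself are made-up names for illustration, and the exception
just stands in for a panic. The only point is the owner check: if the
caller already holds the lock, sleeping can never end, so it is better to
fail loudly.

```python
class LockingAgainstMyself(Exception):
    """Stands in for a kernel panic in this sketch (hypothetical name)."""


class ToyLock:
    """Hypothetical exclusive lock that remembers which process owns it."""

    def __init__(self):
        self.owner = None  # pid of the holder, or None if the lock is free

    def acquire(self, pid):
        if self.owner is None:
            self.owner = pid
            return "acquired"
        if self.owner == pid:
            # The caller already holds the lock. Sleeping here could never
            # end, because the only process able to release it is the
            # sleeper itself, so we "panic" instead.
            raise LockingAgainstMyself(f"pid {pid} already holds this lock")
        # Held by another process: a real lock manager would sleep here
        # until the owner releases the lock.
        return "would sleep"

    def release(self, pid):
        if self.owner != pid:
            raise RuntimeError("release by non-owner")
        self.owner = None
```

With this toy, a process that takes the lock and forgets to release it gets
an immediate error on its second acquire(), while any other process merely
gets "would sleep", which corresponds to the quiet hang seen from ls.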

>   Presumably, pmax does something to avert a system hang in the second
> case which i386 does not do.  It seems rather strange to me that the
> second case should hang the entire system while the first case just hangs
> the process; both fail to release a lock on part of the file system, yet
> in the first case only that process (and any other that accesses that
> particular file) is affected, while in the second case the entire system
> eventually locks up (I guess in the ln case, the rest of the system runs
> into the lock sooner?).

No, it's the first case where pmax should hang and doesn't.

Could this be another side effect of the missing TIA bug which gave alpha
the mount hang problem? (I thought I was told that only Alpha and pmax
would really be hit by it even though it was MI.)

Take care,

Bill