Subject: Re: link() apparently nonatomic
To: None <netbsd-bugs@NetBSD.ORG>
From: Christoph Badura <>
List: netbsd-bugs
Date: 04/19/1996 00:24:00
In hanse-ml.netbsd.bugs you write:
>	We are using a link() based locking routine for use over NFS on our
>web servers running NetBSD . This routine has been failing with multiple
>processes gaining the lock simultaniously.

This is normal and happens AFAIK on all UNIX versions.

>It is somewhat more
>difficult to reproduce running the perl script while logged into the
>same machine.

That shouldn't happen, if your program is correct.  But since at least
the C routines are full of race conditions and false assumptions...

Anyway, the mail user agent guys have discussed this to death years
ago and haven't been able to find a way to relibably do file based
locking over NFS.

>time_t nfslock(path, namelock, max_age, notify)

>  if (getuid()==0) {	/* we're root, so be careful! */

No, if geteuid()==0 you are root.  That's the important test.

>    if (fstat(tmpfd, &our_tmp) < 0) {	/* stat failed... shouldn't happen */
>      close(tmpfd);
>      unlink(tpath);

If fstat fails on an open fd, it's argueably highly unsafe to unlink
that file.  How could you be sure that you are unlinking the file that
belongs to tmpfd?

>    if (link(tpath, fullpath) == 0) {
>      unlink(tpath); /* got it! */
>      free(tpath);

>      close(tmpfd);
>      if (lstat(fullpath, &our_tmp) < 0) {	/* stat failed... shouldn't happen */

Quite the contrary, this could happen quite easily.

>      return(our_tmp.st_ctime);
>    }

This contains a race condition.  Since fullpath is a link to
tpath, you should just fstat() tmpfd to get the ctime.

>    oldlck = lstat(fullpath, &old_stat);

>    if (write(tmpfd, "\0", 1) != 1 || fstat(tmpfd, &our_tmp) < 0)
>      break; /* something bogus is going on */

Testing for conditions that basically can't happen while at the same
time depending that you can make long sequences of system calls *and*
keep your timeslice -- even under heavy load--, is indeed bogus.

The write is very likely only local and not yet committed on too the
server, so it won't fail.  The fstat could come from the local
attribute cache as well.

>    if (oldlck != -1 && old_stat.st_ctime + max_age < our_tmp.st_ctime) {
>      unlink(fullpath); /* break the stale lock */
>      if (notify) fprintf(stderr,"Breaking stale lock on %s.\n",fullpath);

Reporting after the fact that you are about to do it, isn't helpful

Christoph Badura

You don't need to quote my .signature.  Everyone has seen it by now.
Besides, it doesn't add anything to the current thread.