Subject: kern/6052: race condition in link() call
To: None <gnats-bugs@gnats.netbsd.org>
From: Liz S. Reynolds <ilaine@panix.com>
List: netbsd-bugs
Date: 08/26/1998 17:06:51
>Number:         6052
>Category:       kern
>Synopsis:       link() call can be raced resulting in false success return
>Confidential:   no
>Severity:       serious
>Priority:       high
>Responsible:    kern-bug-people (Kernel Bug People)
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Wed Aug 26 14:20:01 1998
>Last-Modified:
>Originator:     Liz S. Reynolds
>Organization:
Public Access Networks (Panix)
	
>Release:        NetBSD-1.3.2
>Environment:
	
System: NetBSD panix7.panix.com 1.3.2 NetBSD 1.3.2 (PANIX-USER) #0: Thu Aug 13 15:03:31 EDT 1998 marcotte@juggler.panix.com:/devel/netbsd/1.3.2/src/sys/arch/i386/compile/PANIX-USER i386


>Description:

	It is possible for two processes to both get a success return from
	link() when they link to the same target name simultaneously. This
	does not create a triply-linked file: the race winner actually has
	the link, and the loser does not. The two can be distinguished by
	comparing the inode number of the newly created file with the inode
	number of the original file. For the race winner they are the same;
	for the loser they are different.

	This only happens on an NFS-mounted partition, but it happens with
	several different kinds of NFS server. I have tested it against
	NetBSD, SunOS, and Network Appliances servers, with exactly the
	same results. I cannot cause the failure when running the script on
	a Sun machine that NFS-mounts the same partitions. This points to
	the NetBSD NFS client code as the location of the problem.


>How-To-Repeat:
/****************************************************************************
linktest.c  

  The purpose of this program is to demonstrate the existence of a
   race condition in the link() system call. It works (that is, fails)
   on NetBSD when the test directory is mounted via NFS.

   linktest is a single test of link() and should be run from a script
   that forks many instances of linktest and races them against each
   other. A simple example, sufficient to cause several instances of
   anomalous output on my system, is provided:

--snip--

#! /usr/local/bin/perl
$| = 1;

while ($procs < 30){
    $pid=fork();
    if ($pid == 0){
	$procs++;
    }
    elsif ($pid > 0) {
	while ($tries < 25){
	    system("./linktest");
	    $tries++;
	}
	exit (0);
    }
}

--snip--

and some sample output:
ok
ok
ok
ok
ok
ok
ok
non-equal inode #s 536594 536592
ok
ok
ok
ok
non-equal inode #s 536595 536594
ok
ok
ok
ok

--snip--

****************************************************************************/

#include <unistd.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <fcntl.h>
#include <time.h>
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <limits.h>

int main(int argc, char *argv[]) 
{
  int tmpfd;
  struct stat tmp_sbuf, win_sbuf;
  char linkfilename [] = "./thewinner";
  char tpath [_POSIX_PATH_MAX];


  /*
    create a temp file with a semi-random name
     verify that the create was successful
  */
  srandom(getpid()+time(0));
  sprintf(tpath,"test_tmp%i.%li", getpid(), random());
  tmpfd = open(tpath, O_CREAT|O_WRONLY|O_EXCL, S_IRUSR|S_IWUSR);
  if (tmpfd < 0) {
      /* open() failed: there is no descriptor to close, and any
	 existing file with this name is not ours to remove */
      exit (-2);
  }

  /* stat the file descriptor, make sure it really worked */
  if (fstat(tmpfd, &tmp_sbuf) < 0) {
      close(tmpfd);
      unlink(tpath);
      exit (-2);
  }

  /* attempt to create linkfilename via link() */

  if (link(tpath, linkfilename) == 0) {
      /* we have the link */
      unlink(tpath); 
      close(tmpfd);

    /* stat the new file */
      if (lstat(linkfilename, &win_sbuf) < 0) {
	  unlink(linkfilename);
	  exit (-2);
      }

      /* 
	 The inode number of the original temp file and that of the new
	 file created by link() should be exactly the same. If they are
	 not, another process has successfully raced us.
	 We cannot detect the race winner, only the loser.
      */

      if (tmp_sbuf.st_ino!=win_sbuf.st_ino) {
	  printf ("non-equal inode #s %lu %lu\n", 
		  (unsigned long)tmp_sbuf.st_ino,
		  (unsigned long)win_sbuf.st_ino);
	  exit (-3);
      }
      else 
	  printf ("ok\n");
      unlink (linkfilename);
      exit (0);
  }

  /* 
     link() returned non-zero. We don't care about this case,
     so clean up and exit silently.
  */
  unlink(tpath);
  close(tmpfd);
  exit (-1);
}

>Fix:
	I am depending on the atomicity of the link() call for the correct
	working of an advisory locking system. Users have reported file
	corruption caused by failure of this routine when two programs
	believe they both have a lock on the database at the same time. 

	I have a workaround suitable for my purposes: after link() returns
	success, compare the inode of the new file with the inode of the
	temp file. If they differ, that process is treated as not holding
	the lock despite the successful return from link().

	This is not a general-purpose fix as I have no way of knowing what
	other software on the system depends on the correct working of this
	call.
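
	The workaround described above could be sketched roughly as
	follows. This is an illustration only, not the code from the
	affected application: the function name acquire_link_lock(), the
	temp file naming pattern, and the return value convention are all
	assumptions made for the example. It also compares st_dev in
	addition to st_ino, which goes slightly beyond the inode-only
	check used in linktest.c.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of the original report): try to take an
 * advisory lock by hard-linking a private temp file to lock_path.
 * Returns 1 if the lock is held by us, 0 if another process holds it,
 * and -1 on an unexpected error.
 */
int acquire_link_lock(const char *lock_path)
{
    char tmp_path[256];
    struct stat tmp_sb, lock_sb;
    int fd, n, linked, saved_errno;

    assert(lock_path != NULL);

    /* Build a private temp name; this pattern is an assumption. */
    n = snprintf(tmp_path, sizeof tmp_path, "%s.tmp.%ld",
                 lock_path, (long)getpid());
    if (n < 0 || (size_t)n >= sizeof tmp_path)
        return -1;

    fd = open(tmp_path, O_CREAT | O_WRONLY | O_EXCL, S_IRUSR | S_IWUSR);
    if (fd < 0)
        return -1;

    /* Remember the identity of the temp file before linking. */
    if (fstat(fd, &tmp_sb) < 0) {
        close(fd);
        unlink(tmp_path);
        return -1;
    }

    linked = (link(tmp_path, lock_path) == 0);
    saved_errno = errno;    /* close()/unlink() may overwrite errno */
    close(fd);
    unlink(tmp_path);

    if (!linked)
        return (saved_errno == EEXIST) ? 0 : -1;

    /* Do not trust the return value alone: confirm the lock file is ours. */
    if (lstat(lock_path, &lock_sb) < 0)
        return -1;
    if (lock_sb.st_dev != tmp_sb.st_dev || lock_sb.st_ino != tmp_sb.st_ino)
        return 0;           /* false success: another process won the race */

    return 1;
}
```

	A caller would treat only a return value of 1 as holding the lock,
	and would release the lock with unlink(lock_path) when done.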
>Audit-Trail:
>Unformatted: