Subject: kern/2826: union file system botches file locks after copy-on-write operations
To: None <gnats-bugs@gnats.netbsd.org>
From: John Kohl <jtk@kolvir.arlington.ma.us>
List: netbsd-bugs
Date: 10/09/1996 20:28:28
>Number:         2826
>Category:       kern
>Synopsis:       union file system botches file locks after copy-on-write operations
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    kern-bug-people (Kernel Bug People)
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Wed Oct  9 17:50:01 1996
>Last-Modified:
>Originator:     John Kohl
>Organization:
NetBSD Kernel Hackers `R` Us
>Release:        1.2
>Environment:
	
System: NetBSD pattern 1.2 NetBSD 1.2 (PATTERN) #102: Thu Sep 12 18:46:38 EDT 1996 jtk@pattern:/u4/sandbox/src/sys/arch/i386/compile/PATTERN i386


>Description:
	The union file system botches file locking if a file which
exists only in the lower layer is open for read and read-locked (via
fcntl(2)) while another process opens it for write.  The open for write
triggers a copy-on-write operation, which creates an upper-layer shadow
of the file and modifies the union vnode accordingly.

The lower-layer file remains read-locked by the original locker, yet
when it closes the file descriptor or exits, the lock is not cleaned out
because the union file system code redirects locking operations to the
fresh upper-layer copy of the file.

The lower-layer file then remains read-locked until the system is
rebooted.
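
The sequence above can be modeled in user space.  The following is a
minimal sketch, not NetBSD kernel code: the types layer_file and
union_node and the model_* functions are hypothetical names invented for
illustration, and each file is assumed to carry at most one lock.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical toy model of one file in one layer. */
struct layer_file {
    int lock_owner;             /* 0 = unlocked, else pid of the locker */
};

/* Hypothetical toy model of a union vnode. */
struct union_node {
    struct layer_file *lower;   /* always present in this model */
    struct layer_file *upper;   /* NULL until copy-on-write happens */
};

/* Lock operations go to the upper file once it exists. */
static struct layer_file *
lock_target(struct union_node *un)
{
    assert(un != NULL && un->lower != NULL);
    return un->upper != NULL ? un->upper : un->lower;
}

/* Take a lock on whichever layer is current. */
static void
model_setlk(struct union_node *un, int pid)
{
    lock_target(un)->lock_owner = pid;
}

/* Copy-on-write: a fresh, unlocked upper file shadows the lower one. */
static void
model_copy_up(struct union_node *un, struct layer_file *fresh)
{
    assert(fresh != NULL);
    fresh->lock_owner = 0;
    un->upper = fresh;
}

/* Lock cleanup at close or exit, also redirected to the current layer. */
static void
model_unlock_on_close(struct union_node *un, int pid)
{
    struct layer_file *f = lock_target(un);

    if (f->lock_owner == pid)
        f->lock_owner = 0;
}
```

Calling model_setlk(), then model_copy_up(), then
model_unlock_on_close() for the same pid leaves the lower file's
lock_owner set, which mirrors the stale read lock described above.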

>How-To-Repeat:
Here is a sample program which you can compile twice, once with -DWLOCK
and once without.  Pass it the name of a file that exists only in the
lower layer of a union mount.  Run the non-write version in one window,
pause at its prompt, and then run the write version in another window.
Then exit them both.

Run the write version again--it will indicate that a now-nonexistent
process ID still has the file read-locked.

#include <sys/types.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int
main(int argc, char **argv)
{
    FILE *f;
    int fd;
    struct flock lock;
    char foo[8192];

    if (argc != 2) {
        printf("usage: %s tmpfile\n", argv[0]);
        exit(1);
    }
#ifdef WLOCK
    f = fopen(argv[1], "a+");
#else
    f = fopen(argv[1], "r");
#endif
    if (!f) {
        perror(argv[1]);
        exit(1);
    }
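    printf("F_UNLCK == %d\n", F_UNLCK);
    /* F_GETLK reads the request from lock, so fill it in first. */
    lock.l_type = F_WRLCK;
    lock.l_whence = SEEK_SET;
    lock.l_start = 0;
    lock.l_len = 0;
    lock.l_pid = 0;
    if (fcntl(fileno(f), F_GETLK, (void *)&lock) == 0) {
        printf("lock held by pid %d type %d\n", lock.l_pid,
               lock.l_type);
    }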
#ifdef WLOCK
    lock.l_type = F_WRLCK;
#else
    lock.l_type = F_RDLCK;
#endif
    lock.l_whence = SEEK_SET;
    lock.l_start = 0;
    lock.l_len = 0;
    lock.l_pid = 0;
    if (fcntl(fileno(f), F_SETLK, (void *)&lock) == 0)
        printf("pid %d obtained lock\n", getpid());
    else {
        perror("fcntl");
        lock.l_type = F_WRLCK;
        lock.l_whence = SEEK_SET;
        lock.l_start = 0;
        lock.l_len = 0;
        lock.l_pid = 0;
        if (fcntl(fileno(f), F_GETLK, (void *)&lock) == 0)
            printf("BAD: lock blocked by pid %d type %d\n", lock.l_pid,
                   lock.l_type);
        else
            perror("fcntl");
    }
    printf("press return:");
    fflush(stdout);
    /* fgets() bounds the read; gets() can overflow foo. */
    fgets(foo, sizeof(foo), stdin);
    lock.l_type = F_WRLCK;
    lock.l_whence = SEEK_SET;
    lock.l_start = 0;
    lock.l_len = 0;
    lock.l_pid = 0;
    if (fcntl(fileno(f), F_GETLK, (void *)&lock) == 0) {
        printf("lock held by pid %d type %d\n", lock.l_pid,
               lock.l_type);
    } else
        perror("fcntl");
    fd = open(argv[1], O_RDWR);
    if (fd != -1) {
        lock.l_type = F_WRLCK;
        lock.l_whence = SEEK_SET;
        lock.l_start = 0;
        lock.l_len = 0;
        lock.l_pid = 0;
        if (fcntl(fd, F_GETLK, (void *)&lock) == 0) {
            printf("lock held by pid %d type %d\n", lock.l_pid,
                   lock.l_type);
        } else
            printf("no lock on file\n");
    }
    exit(0);
}

>Fix:
	Maybe keep track of file locking operations in the union layer?
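
One way to picture that suggestion is the user-space sketch below.  It
is not NetBSD kernel code and not a tested fix: the tracking_node type,
the TRACK_MAX limit, and the track_* functions are hypothetical names
made up for illustration.  The only idea it shows is that locks recorded
on the union node itself are unaffected when the underlying layer
changes.

```c
#include <assert.h>
#include <stddef.h>

#define TRACK_MAX 8             /* arbitrary table size for the sketch */

/* Hypothetical record of one lock held through the union node. */
struct tracked_lock {
    int in_use;
    int pid;
};

/* Hypothetical union node that owns its lock records. */
struct tracking_node {
    int has_upper;              /* set once copy-on-write has happened */
    struct tracked_lock locks[TRACK_MAX];
};

/* Record a lock for pid; returns 0 on success, -1 if the table is full. */
static int
track_setlk(struct tracking_node *tn, int pid)
{
    size_t i;

    assert(tn != NULL);
    for (i = 0; i < TRACK_MAX; i++) {
        if (!tn->locks[i].in_use) {
            tn->locks[i].in_use = 1;
            tn->locks[i].pid = pid;
            return 0;
        }
    }
    return -1;
}

/* Copy-on-write changes the layer but leaves the lock table alone. */
static void
track_copy_up(struct tracking_node *tn)
{
    assert(tn != NULL);
    tn->has_upper = 1;
}

/* Drop every lock owned by pid; returns how many were released. */
static int
track_release(struct tracking_node *tn, int pid)
{
    size_t i;
    int released = 0;

    assert(tn != NULL);
    for (i = 0; i < TRACK_MAX; i++) {
        if (tn->locks[i].in_use && tn->locks[i].pid == pid) {
            tn->locks[i].in_use = 0;
            released++;
        }
    }
    return released;
}

/* Count the locks currently recorded on the node. */
static int
track_count(const struct tracking_node *tn)
{
    size_t i;
    int n = 0;

    assert(tn != NULL);
    for (i = 0; i < TRACK_MAX; i++)
        if (tn->locks[i].in_use)
            n++;
    return n;
}
```

In this model a lock recorded before track_copy_up() is still found and
released by track_release() afterwards, so nothing is left behind when
the locker closes the file or exits.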
>Audit-Trail:
>Unformatted: