Subject: kern/21473: read/write over NFS can fail under certain conditions
To: None <gnats-bugs@gnats.netbsd.org>
From: None <marcotte@panix.com>
List: netbsd-bugs
Date: 05/05/2003 20:39:03
>Number:         21473
>Category:       kern
>Synopsis:       read/write over NFS can fail under certain conditions
>Confidential:   no
>Severity:       serious
>Priority:       high
>Responsible:    kern-bug-people
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Tue May 06 00:40:00 UTC 2003
>Closed-Date:
>Last-Modified:
>Originator:     Brian Marcotte
>Release:        NetBSD 1.6.1
>Organization:
	Public Access Networks, Corp (panix.com)
>Environment:
	
	
System: NetBSD trinity.nyc.access.net 1.6.1 NetBSD 1.6.1 (PANIX-STD) #0: Mon Apr 28 23:09:19 EDT 2003     root@trinity.nyc.access.net:/devel/netbsd/1.6.1/src/sys/arch/i386/compile/PANIX-STD i386
Architecture: i386
Machine: i386
>Description:

We're encountering a problem with NetBSD 1.6.1 (and 1.6) where read(2)
and write(2) can fail with EFAULT when the file is on NFS. Generally
accessing files over NFS causes no problems, but under the right
conditions, it happens every time. The file server in our case is a
NetApp Filer though I don't think this is the server's fault at all.

This seems to happen when a program opens a file as one user, who has
access to the file, and then attempts to read or write as another
user. I think there may be more to it than that, because a small
program I wrote to simulate what I described didn't trigger the
problem.

I've seen this with read() from the Postfix "local" program and with
write() in Apache. I'm including debugging output from Postfix only.

The following is ktrace output from the Postfix "local"
program. I've included the part where it is trying to read the
~/.forward file from an NFS-mounted file system.

   644 local    CALL  seteuid(0)
   644 local    RET   seteuid 0
   644 local    CALL  setegid(0x58)
   644 local    RET   setegid 0
   644 local    CALL  setgroups(0x1,0xbfbfd844)
   644 local    RET   setgroups 0
   644 local    CALL  seteuid(0x4d)
   644 local    RET   seteuid 0
   644 local    CALL  geteuid
   644 local    RET   geteuid 77/0x4d
   644 local    CALL  getegid
   644 local    RET   getegid 88/0x58
   644 local    CALL  geteuid
   644 local    RET   geteuid 77/0x4d
   644 local    CALL  seteuid(0)
   644 local    RET   seteuid 0
   644 local    CALL  setegid(0xa)
   644 local    RET   setegid 0
   644 local    CALL  setgroups(0x1,0xbfbfd850)
   644 local    RET   setgroups 0
   644 local    CALL  seteuid(0x473e)
   644 local    RET   seteuid 0
   644 local    CALL  open(0x80d8808,0,0)
   644 local    NAMI  "/staff/br/.forward"
   644 local    RET   open 14/0xe
   644 local    CALL  geteuid
   644 local    RET   geteuid 18238/0x473e
   644 local    CALL  seteuid(0)
   644 local    RET   seteuid 0
   644 local    CALL  setegid(0x58)
   644 local    RET   setegid 0
   644 local    CALL  setgroups(0x1,0xbfbfd83c)
   644 local    RET   setgroups 0
   644 local    CALL  seteuid(0x4d)
   644 local    RET   seteuid 0
   644 local    CALL  fcntl(0xe,0x1,0)
   644 local    RET   fcntl 0
   644 local    CALL  fcntl(0xe,0x2,0x1)
   644 local    RET   fcntl 0
   644 local    CALL  read(0xe,0x8115008,0x1000)
   644 local    RET   read -1 errno 14 Bad address
   644 local    CALL  close(0xe)
   644 local    RET   close 0

As you can see, local becomes 0x473e/0xa (user "br", group
"staff") to open the file and reverts to 0x4d/0x58 (user "postfix",
group "certs") to read the contents. The read call returns EFAULT,
which shouldn't be the case.
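
ktrace prints call arguments in hex, so the IDs above have to be
converted before they can be compared with the decimal UIDs/GIDs in the
packet captures further down. A minimal sketch of that conversion
follows; the helper name trace_arg_to_id is made up for this
illustration and the input is assumed to be well-formed.

```c
#include <stdlib.h>

/*
 * Convert a ktrace-style argument such as "0x473e" to a number.
 * Base 0 makes strtoul() honor the "0x" prefix.
 * Illustrative helper only; no error checking is done.
 */
static unsigned long
trace_arg_to_id(const char *arg)
{
	return strtoul(arg, NULL, 0);
}
```

With this, "0x473e" converts to 18238, "0xa" to 10, "0x4d" to 77 and
"0x58" to 88, which agrees with the decimal values ktrace shows on the
geteuid/getegid return lines.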

I then captured the NFS traffic from the network to see what was going
on. This time the user is "marcotte" (UID 13564). First, I used a
working system (NetBSD 1.5-release as of Sep 17, 2002):

***** BEGIN ************************************************************
Frame 35 (146 bytes on wire, 146 bytes captured)
Ethernet II, Src: 00:a0:c9:32:3d:6a, Dst: 00:c0:95:f9:d1:4f
Internet Protocol, Src Addr: 166.84.1.75 (166.84.1.75), Dst Addr: 166.84.1.90 (166.84.1.90)
User Datagram Protocol, Src Port: 1022 (1022), Dst Port: nfs (2049)
Remote Procedure Call
    XID: 0x1e001070 (503320688)
    Message Type: Call (0)
    RPC Version: 2
    Program: NFS (100003)
    Program Version: 3
    Procedure: ACCESS (4)
    The reply to this request is in frame 36
    Credentials
        Flavor: AUTH_UNIX (1)
        Length: 24
        Stamp: 0x00000000
        Machine Name: <EMPTY>
        UID: 13564
        GID: 10
        Auxiliary GIDs
    Verifier
Network File System
    Program Version: 3
    V3 Procedure: ACCESS (4)
    object
        length: 32
        hash: 0xf4f4f892
        Name: .forward
        type: unknown
        data: C0011800DB747A0020000000001A2348
              E3FD7D0005230001C0011800DB747A00
    access: 0x01

Frame 36 (162 bytes on wire, 162 bytes captured)
Ethernet II, Src: 00:c0:95:f9:d1:4f, Dst: 00:a0:c9:32:3d:6a
Internet Protocol, Src Addr: 166.84.1.90 (166.84.1.90), Dst Addr: 166.84.1.75 (166.84.1.75)
User Datagram Protocol, Src Port: nfs (2049), Dst Port: 1022 (1022)
Remote Procedure Call
    XID: 0x1e001070 (503320688)
    Message Type: Reply (1)
    Program: NFS (100003)
    Program Version: 3
    Procedure: ACCESS (4)
    Reply State: accepted (0)
    This is a reply to a request in frame 35
    Time from request: 0.000099000 seconds
    Verifier
    Accept State: RPC executed successfully (0)
Network File System
    Program Version: 3
    V3 Procedure: ACCESS (4)
    Status: OK (0)
    obj_attributes
        attributes_follow: value follows (1)
        attributes
            Type: Regular File (1)
            mode: 0600
            nlink: 1
            uid: 13564
            gid: 10
            size: 139
            used: 4096
            rdev: 0,0
            fsid: 16786181
            fileid: 1712968
            atime: May  1, 2003 23:37:27.764011000
            mtime: Jul 26, 2001 15:49:49.942006000
            ctime: May  1, 2003 23:13:59.084005000
    access: 0x01

Frame 37 (142 bytes on wire, 142 bytes captured)
Ethernet II, Src: 00:a0:c9:32:3d:6a, Dst: 00:c0:95:f9:d1:4f
Internet Protocol, Src Addr: 166.84.1.75 (166.84.1.75), Dst Addr: 166.84.1.90 (166.84.1.90)
User Datagram Protocol, Src Port: 1022 (1022), Dst Port: nfs (2049)
Remote Procedure Call
    XID: 0x1e001071 (503320689)
    Message Type: Call (0)
    RPC Version: 2
    Program: NFS (100003)
    Program Version: 3
    Procedure: GETATTR (1)
    The reply to this request is in frame 38
    Credentials
        Flavor: AUTH_UNIX (1)
        Length: 24
        Stamp: 0x00000000
        Machine Name: <EMPTY>
        UID: 13564
        GID: 10
        Auxiliary GIDs
    Verifier
Network File System
    Program Version: 3
    V3 Procedure: GETATTR (1)
    object
        length: 32
        hash: 0xf4f4f892
        Name: .forward
        type: unknown
        data: C0011800DB747A0020000000001A2348
              E3FD7D0005230001C0011800DB747A00

Frame 38 (154 bytes on wire, 154 bytes captured)
Ethernet II, Src: 00:c0:95:f9:d1:4f, Dst: 00:a0:c9:32:3d:6a
Internet Protocol, Src Addr: 166.84.1.90 (166.84.1.90), Dst Addr: 166.84.1.75 (166.84.1.75)
User Datagram Protocol, Src Port: nfs (2049), Dst Port: 1022 (1022)
Remote Procedure Call
    XID: 0x1e001071 (503320689)
    Message Type: Reply (1)
    Program: NFS (100003)
    Program Version: 3
    Procedure: GETATTR (1)
    Reply State: accepted (0)
    This is a reply to a request in frame 37
    Time from request: 0.000096000 seconds
    Verifier
    Accept State: RPC executed successfully (0)
Network File System
    Program Version: 3
    V3 Procedure: GETATTR (1)
    Status: OK (0)
    obj_attributes
        Type: Regular File (1)
        mode: 0600
        nlink: 1
        uid: 13564
        gid: 10
        size: 139
        used: 4096
        rdev: 0,0
        fsid: 16786181
        fileid: 1712968
        atime: May  1, 2003 23:37:27.764011000
        mtime: Jul 26, 2001 15:49:49.942006000
        ctime: May  1, 2003 23:13:59.084005000

Frame 39 (154 bytes on wire, 154 bytes captured)
Ethernet II, Src: 00:a0:c9:32:3d:6a, Dst: 00:c0:95:f9:d1:4f
Internet Protocol, Src Addr: 166.84.1.75 (166.84.1.75), Dst Addr: 166.84.1.90 (166.84.1.90)
User Datagram Protocol, Src Port: 1022 (1022), Dst Port: nfs (2049)
Remote Procedure Call
    XID: 0x1e001072 (503320690)
    Message Type: Call (0)
    RPC Version: 2
    Program: NFS (100003)
    Program Version: 3
    Procedure: READ (6)
    The reply to this request is in frame 40
    Credentials
        Flavor: AUTH_UNIX (1)
        Length: 24
        Stamp: 0x00000000
        Machine Name: <EMPTY>
        UID: 13564
        GID: 10
        Auxiliary GIDs
    Verifier
Network File System
    Program Version: 3
    V3 Procedure: READ (6)
    file
        length: 32
        hash: 0xf4f4f892
        Name: .forward
        type: unknown
        data: C0011800DB747A0020000000001A2348
              E3FD7D0005230001C0011800DB747A00
    offset: 0
    count: 32768

Frame 40 (310 bytes on wire, 310 bytes captured)
Ethernet II, Src: 00:c0:95:f9:d1:4f, Dst: 00:a0:c9:32:3d:6a
Internet Protocol, Src Addr: 166.84.1.90 (166.84.1.90), Dst Addr: 166.84.1.75 (166.84.1.75)
User Datagram Protocol, Src Port: nfs (2049), Dst Port: 1022 (1022)
Remote Procedure Call
    XID: 0x1e001072 (503320690)
    Message Type: Reply (1)
    Program: NFS (100003)
    Program Version: 3
    Procedure: READ (6)
    Reply State: accepted (0)
    This is a reply to a request in frame 39
    Time from request: 0.012248000 seconds
    Verifier
    Accept State: RPC executed successfully (0)
Network File System
    Program Version: 3
    V3 Procedure: READ (6)
    Status: OK (0)
    file_attributes
        attributes_follow: value follows (1)
        attributes
            Type: Regular File (1)
            mode: 0600
            nlink: 1
            uid: 13564
            gid: 10
            size: 139
            used: 4096
            rdev: 0,0
            fsid: 16786181
            fileid: 1712968
            atime: May  1, 2003 23:38:37.350002000
            mtime: Jul 26, 2001 15:49:49.942006000
            ctime: May  1, 2003 23:13:59.084005000
    count: 139
    EOF: Yes
    Data: <DATA>
***** END ************************************************************

The above output shows that the credentials on every request, and the
file ownership in every reply, are 13564/10.
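
As a side note on the "Length: 24" field in the Credentials sections:
an AUTH_UNIX credential body, as laid out in the ONC RPC specification,
is a stamp, a machine name, a UID, a GID and an array of auxiliary
GIDs, all XDR-encoded in 4-byte units. The sketch below only does that
arithmetic; the function names are invented for this example and it is
not code from any NFS client.

```c
#include <stddef.h>

/* XDR pads variable-length data to a multiple of four bytes. */
static size_t
xdr_pad4(size_t n)
{
	return (n + 3) & ~(size_t)3;
}

/*
 * Encoded size of an AUTH_UNIX credential body:
 * stamp, machine name (length word + padded bytes), uid, gid,
 * then the auxiliary GID array (count word + one word per GID).
 */
static size_t
auth_unix_body_len(size_t name_len, size_t ngids)
{
	return 4 + (4 + xdr_pad4(name_len)) + 4 + 4 + (4 + 4 * ngids);
}
```

With an empty machine name, auth_unix_body_len(0, 1) is 24, so the
captured length would be consistent with a single auxiliary GID entry
in each request.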

Next is from a NetBSD 1.6.1 system:

***** BEGIN ************************************************************
Frame 75 (146 bytes on wire, 146 bytes captured)
Ethernet II, Src: 00:a0:c9:32:3d:6a, Dst: 00:c0:95:f9:d1:4f
Internet Protocol, Src Addr: 166.84.1.75 (166.84.1.75), Dst Addr: 166.84.1.90 (166.84.1.90)
User Datagram Protocol, Src Port: 1003 (1003), Dst Port: nfs (2049)
Remote Procedure Call
    XID: 0x1e993039 (513355833)
    Message Type: Call (0)
    RPC Version: 2
    Program: NFS (100003)
    Program Version: 3
    Procedure: ACCESS (4)
    The reply to this request is in frame 76
    Credentials
        Flavor: AUTH_UNIX (1)
        Length: 24
        Stamp: 0x00000000
        Machine Name: <EMPTY>
        UID: 13564
        GID: 10
        Auxiliary GIDs
    Verifier
Network File System
    Program Version: 3
    V3 Procedure: ACCESS (4)
    object
        length: 32
        hash: 0xf4f4f892
        Name: .forward
        type: unknown
        data: C0011800DB747A0020000000001A2348
              E3FD7D0005230001C0011800DB747A00
    access: 0x01

Frame 76 (162 bytes on wire, 162 bytes captured)
Ethernet II, Src: 00:c0:95:f9:d1:4f, Dst: 00:a0:c9:32:3d:6a
Internet Protocol, Src Addr: 166.84.1.90 (166.84.1.90), Dst Addr: 166.84.1.75 (166.84.1.75)
User Datagram Protocol, Src Port: nfs (2049), Dst Port: 1003 (1003)
Remote Procedure Call
    XID: 0x1e993039 (513355833)
    Message Type: Reply (1)
    Program: NFS (100003)
    Program Version: 3
    Procedure: ACCESS (4)
    Reply State: accepted (0)
    This is a reply to a request in frame 75
    Time from request: 0.003607000 seconds
    Verifier
    Accept State: RPC executed successfully (0)
Network File System
    Program Version: 3
    V3 Procedure: ACCESS (4)
    Status: OK (0)
    obj_attributes
        attributes_follow: value follows (1)
        attributes
            Type: Regular File (1)
            mode: 0600
            nlink: 1
            uid: 13564
            gid: 10
            size: 139
            used: 4096
            rdev: 0,0
            fsid: 16786181
            fileid: 1712968
            atime: May  1, 2003 23:43:32.599018000
            mtime: Jul 26, 2001 15:49:49.942006000
            ctime: May  1, 2003 23:13:59.084005000
    access: 0x01

Frame 77 (142 bytes on wire, 142 bytes captured)
Ethernet II, Src: 00:a0:c9:32:3d:6a, Dst: 00:c0:95:f9:d1:4f
Internet Protocol, Src Addr: 166.84.1.75 (166.84.1.75), Dst Addr: 166.84.1.90 (166.84.1.90)
User Datagram Protocol, Src Port: 1003 (1003), Dst Port: nfs (2049)
Remote Procedure Call
    XID: 0x1e99303a (513355834)
    Message Type: Call (0)
    RPC Version: 2
    Program: NFS (100003)
    Program Version: 3
    Procedure: GETATTR (1)
    The reply to this request is in frame 78
    Credentials
        Flavor: AUTH_UNIX (1)
        Length: 24
        Stamp: 0x00000000
        Machine Name: <EMPTY>
        UID: 13564
        GID: 10
        Auxiliary GIDs
    Verifier
Network File System
    Program Version: 3
    V3 Procedure: GETATTR (1)
    object
        length: 32
        hash: 0xf4f4f892
        Name: .forward
        type: unknown
        data: C0011800DB747A0020000000001A2348
              E3FD7D0005230001C0011800DB747A00

Frame 78 (154 bytes on wire, 154 bytes captured)
Ethernet II, Src: 00:c0:95:f9:d1:4f, Dst: 00:a0:c9:32:3d:6a
Internet Protocol, Src Addr: 166.84.1.90 (166.84.1.90), Dst Addr: 166.84.1.75 (166.84.1.75)
User Datagram Protocol, Src Port: nfs (2049), Dst Port: 1003 (1003)
Remote Procedure Call
    XID: 0x1e99303a (513355834)
    Message Type: Reply (1)
    Program: NFS (100003)
    Program Version: 3
    Procedure: GETATTR (1)
    Reply State: accepted (0)
    This is a reply to a request in frame 77
    Time from request: 0.000117000 seconds
    Verifier
    Accept State: RPC executed successfully (0)
Network File System
    Program Version: 3
    V3 Procedure: GETATTR (1)
    Status: OK (0)
    obj_attributes
        Type: Regular File (1)
        mode: 0600
        nlink: 1
        uid: 13564
        gid: 10
        size: 139
        used: 4096
        rdev: 0,0
        fsid: 16786181
        fileid: 1712968
        atime: May  1, 2003 23:43:32.599018000
        mtime: Jul 26, 2001 15:49:49.942006000
        ctime: May  1, 2003 23:13:59.084005000

Frame 79 (154 bytes on wire, 154 bytes captured)
Ethernet II, Src: 00:a0:c9:32:3d:6a, Dst: 00:c0:95:f9:d1:4f
Internet Protocol, Src Addr: 166.84.1.75 (166.84.1.75), Dst Addr: 166.84.1.90 (166.84.1.90)
User Datagram Protocol, Src Port: 1003 (1003), Dst Port: nfs (2049)
Remote Procedure Call
    XID: 0x1e99303b (513355835)
    Message Type: Call (0)
    RPC Version: 2
    Program: NFS (100003)
    Program Version: 3
    Procedure: READ (6)
    The reply to this request is in frame 80
    Credentials
        Flavor: AUTH_UNIX (1)
        Length: 24
        Stamp: 0x00000000
        Machine Name: <EMPTY>
        UID: 77
        GID: 88
        Auxiliary GIDs
    Verifier
Network File System
    Program Version: 3
    V3 Procedure: READ (6)
    file
        length: 32
        hash: 0xf4f4f892
        Name: .forward
        type: unknown
        data: C0011800DB747A0020000000001A2348
              E3FD7D0005230001C0011800DB747A00
    offset: 0
    count: 139

Frame 80 (158 bytes on wire, 158 bytes captured)
Ethernet II, Src: 00:c0:95:f9:d1:4f, Dst: 00:a0:c9:32:3d:6a
Internet Protocol, Src Addr: 166.84.1.90 (166.84.1.90), Dst Addr: 166.84.1.75 (166.84.1.75)
User Datagram Protocol, Src Port: nfs (2049), Dst Port: 1003 (1003)
Remote Procedure Call
    XID: 0x1e99303b (513355835)
    Message Type: Reply (1)
    Program: NFS (100003)
    Program Version: 3
    Procedure: READ (6)
    Reply State: accepted (0)
    This is a reply to a request in frame 79
    Time from request: 0.000122000 seconds
    Verifier
    Accept State: RPC executed successfully (0)
Network File System
    Program Version: 3
    V3 Procedure: READ (6)
    Status: ERR_ACCES (13)
    file_attributes
        attributes_follow: value follows (1)
        attributes
            Type: Regular File (1)
            mode: 0600
            nlink: 1
            uid: 13564
            gid: 10
            size: 139
            used: 4096
            rdev: 0,0
            fsid: 16786181
            fileid: 1712968
            atime: May  1, 2003 23:43:32.599018000
            mtime: Jul 26, 2001 15:49:49.942006000
            ctime: May  1, 2003 23:13:59.084005000
***** END ************************************************************

This is almost the same as on the working system, except that the READ
call (frame 79) carries UID/GID 77/88 instead of 13564/10. The file
server correctly replies with ERR_ACCES (frame 80).
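
Note that the server's status 13 (ERR_ACCES) has the same numeric value
as EACCES, whereas the read() in the ktrace output reported errno 14
(EFAULT). The sketch below shows the kind of status-to-errno mapping
one would naively expect; the status values are those defined for
NFSv3, but nfs3_status_to_errno is a hypothetical function written for
this illustration and is not taken from the NetBSD NFS client.

```c
#include <errno.h>

/* A few NFSv3 status codes, with the values defined by the protocol. */
enum nfs3_status {
	NFS3_OK       = 0,
	NFS3ERR_PERM  = 1,
	NFS3ERR_NOENT = 2,
	NFS3ERR_IO    = 5,
	NFS3ERR_ACCES = 13,
	NFS3ERR_STALE = 70
};

/*
 * Hypothetical mapping from an NFSv3 status to a local errno.
 * Codes not listed here fall back to EIO in this sketch.
 */
static int
nfs3_status_to_errno(int status)
{
	switch (status) {
	case NFS3_OK:       return 0;
	case NFS3ERR_PERM:  return EPERM;
	case NFS3ERR_NOENT: return ENOENT;
	case NFS3ERR_IO:    return EIO;
	case NFS3ERR_ACCES: return EACCES;
	case NFS3ERR_STALE: return ESTALE;
	default:            return EIO;
	}
}
```

Under such a mapping the reply in frame 80 would surface as EACCES, not
as the EFAULT that Postfix actually received.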

This makes Postfix unusable on NetBSD 1.6.x when home directories are
mounted over NFS (well, unless all ~/.forward files are readable by
others).

I can provide the full ktrace output and the raw packets from the
network on request.

>How-To-Repeat:

I tried to write a small test program to simulate the problem, but
was unable to. The best I can recommend is to run Postfix and mount
the users' home directories over NFS, making sure that the ~/.forward
files are readable only by the owner.
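
For anyone attempting a standalone reproducer, the call sequence from
the ktrace output can be sketched roughly as below. This is an
illustration only, not the test program mentioned above: the function
names switch_ids and open_then_read_as are made up, and a program of
this kind is not guaranteed to trigger the failure. When the process
already has the requested IDs, no switch is attempted, so the sketch
can also be exercised without root.

```c
#include <sys/types.h>
#include <errno.h>
#include <fcntl.h>
#include <grp.h>
#include <unistd.h>

/*
 * Switch effective IDs in the order seen in the trace: back to root,
 * then group, supplementary groups, and finally the user.
 * Skipped entirely if the process already runs with these IDs.
 */
static int
switch_ids(uid_t uid, gid_t gid)
{
	if (geteuid() == uid && getegid() == gid)
		return 0;
	if (seteuid(0) == -1)
		return -1;
	if (setegid(gid) == -1)
		return -1;
	if (setgroups(1, &gid) == -1)
		return -1;
	return seteuid(uid);
}

/*
 * Open `path` under one identity, then read it under another.
 * Returns the byte count from read(2), or -1 with errno set.
 */
static ssize_t
open_then_read_as(const char *path, uid_t open_uid, gid_t open_gid,
    uid_t read_uid, gid_t read_gid, void *buf, size_t len)
{
	ssize_t n;
	int fd, saved;

	if (switch_ids(open_uid, open_gid) == -1)
		return -1;
	fd = open(path, O_RDONLY);
	if (fd == -1)
		return -1;
	if (switch_ids(read_uid, read_gid) == -1) {
		saved = errno;
		close(fd);
		errno = saved;
		return -1;
	}
	/* The trace also sets close-on-exec before reading. */
	(void)fcntl(fd, F_SETFD, FD_CLOEXEC);
	n = read(fd, buf, len);
	saved = errno;
	close(fd);
	errno = saved;	/* preserve the errno from read(2) */
	return n;
}
```

Run as root against a mode 0600 file on the NFS mount, a call such as
open_then_read_as("/staff/br/.forward", 18238, 10, 77, 88, buf, 4096)
would mirror the IDs in the trace; whether it returns -1 with EFAULT
would have to be checked on an affected system.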

>Fix:

Unknown by me...


>Release-Note:
>Audit-Trail:
>Unformatted: