Subject: Re: [gwr@mc.com: Re: kern/1991: device number defined inconsistently on sparc/Ultrix/OSF1 Alpha]
To: None <ivanenko@ctpa03.mit.edu>
From: Gordon W. Ross <gwr@mc.com>
List: netbsd-bugs
Date: 02/06/1996 11:15:31
> Date: Mon, 5 Feb 1996 14:11:08 -0500
> From: Taras Ivanenko <ivanenko@ctpa03.mit.edu>

> Right, I can create /dev/console (0,0) without problem but that is all
> I can do. No matter where I create the device (from the client or from
> the server) the device numbers on a server are correct. On the client
> side the numbers are correct for some time but then change to
> meaningless. 

The server must be violating the NFS protocol spec.

> Experiment:
> 1) Booted the client with generic kernel from the distribution, 
>    single user mode.
> 2) Created some device, say zero (3,12) in /tmp
> 3) ls -al reports <...> 3, 12 /tmp/zero
> 4) On the server ls -al tmp/zero reports 0, 780 zero

So far, that looks normal to me.
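
The 0, 780 is what you would expect if the two systems simply split
the same longword differently. A rough sketch of the arithmetic,
assuming the client packs dev_t as 8-bit major / 8-bit minor and the
server splits it as 12-bit major / 20-bit minor. Both layouts are
assumptions for illustration (neither is confirmed in this thread),
and the function names are made up:

```python
# Sketch only: both bit layouts are assumptions, not taken from
# either kernel's headers.

def pack_8_8(major, minor):
    """Pack (major, minor) using 8 bits each (assumed client layout)."""
    return ((major & 0xff) << 8) | (minor & 0xff)

def unpack_12_20(dev):
    """Split a longword as 12-bit major / 20-bit minor (assumed server layout)."""
    return (dev >> 20) & 0xfff, dev & 0xfffff

if __name__ == "__main__":
    wire = pack_8_8(3, 12)        # the longword the client would send
    print(wire)                   # 780
    print(unpack_12_20(wire))     # (0, 780)
```

Under those assumptions the client's (3,12) travels as the value 780,
and the server's ls shows it as 0, 780, matching steps 3 and 4.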

> At this point everything should be OK, at least the device numbers are 
> right. 
> 
> 5) Go to lunch, after lunch:
> 
> 6) Oops! ls -al gives 0, 12 /tmp/zero on the client, 3, 12 tmp/zero on
>    the server. This is the problem I was writing.

For mknod, the NFS client sends an NFS_CREATE call to the server
with the type bits set to "chr" or "blk" (device type) and one
longword of bits representing the client's dev_t value.

The NFS server is obligated to treat that longword as opaque,
and return that same value on later getattr calls, even if
that value has no meaning as a device node to the server.
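
To make the distinction concrete, here is a small sketch contrasting a
server that hands the longword back untouched with one that converts
it. The lossy conversion (decode as 12/20, re-encode as 8/8) is purely
a guess at one way a server could go wrong. It happens to reproduce
the 0, 12 seen on the client, but nothing is known here about what
OSF/1 actually does, and all names are hypothetical:

```python
# Sketch only: the bit layouts and the lossy conversion are
# assumptions chosen for illustration, not OSF/1's actual code.

def unpack_8_8(dev):
    """Decode a longword as 8-bit major / 8-bit minor (assumed client view)."""
    return (dev >> 8) & 0xff, dev & 0xff

def opaque_getattr(stored):
    """Well-behaved server: return the stored longword unchanged."""
    return stored

def lossy_getattr(stored):
    """Hypothetical broken server: reinterpret as 12/20, re-pack as 8/8."""
    major = (stored >> 20) & 0xfff
    minor = stored & 0xfffff
    return ((major & 0xff) << 8) | (minor & 0xff)

if __name__ == "__main__":
    stored = 780  # the value 780 from step 4 of the experiment
    print(unpack_8_8(opaque_getattr(stored)))  # (3, 12)
    print(unpack_8_8(lossy_getattr(stored)))   # (0, 12)
```

With an opaque round trip the client keeps seeing 3, 12. Any
conversion on the server side risks dropping bits, as in the second
case.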

> My impression is that OSF1 converts the numbers into its own format
> when doing some operations. I have no idea why it is doing that but
> the effect on NetBSD was disastrous. I can not change much in OSF1
> kernel, it is probably DEC's proprietary code but I can change the way
> NetBSD sees the devices. For the reference, I have 
> 
> DEC OSF/1 V2.1 (Rev. 250); Wed Nov  9 20:01:58 EST 1994 
> DEC OSF/1 V2.0 Worksystem Software (Rev. 240)

Presumably, if you make your NFS client use the same device node
format as your (broken) server, then you can avoid the bug.
Of course you realize that's a total hack...

The folks at DEC should be pestered about this.

Note that this would mean an OSF/1 server cannot serve diskless
SunOS 4.1.3 clients, because they will have the same device node problem.
Perhaps you could submit a bug report against OSF/1 serving a
SunOS 4.1.3 diskless client and get some attention...

Gordon