Subject: [gwr@mc.com: Re: kern/1991: device number defined inconsistently on sparc/Ultrix/OSF1 Alpha]
To: None <netbsd-bugs@NetBSD.ORG>
From: Taras Ivanenko <ivanenko@ctpa03.mit.edu>
List: netbsd-bugs
Date: 02/05/1996 14:11:08
> From: "Gordon W. Ross" <gwr@mc.com>
> Date: Mon, 5 Feb 96 12:05:37 EST
> To: ivanenko@ctpa03.mit.edu
> Cc: netbsd-bugs@netbsd.org
> In-Reply-To: <199601291605.LAA20530@pain.lcs.mit.edu> (ivanenko@ctpa03.mit.edu)
> Subject: Re: kern/1991: device number defined inconsistently on sparc/Ultrix/OSF1 Alpha
> Reply-To: "Gordon Ross" <gwr@mc.com>
> 
> > Date: Mon, 29 Jan 1996 10:36:03 -0500
> > From: ivanenko@ctpa03.mit.edu
> 
> > >Number:         1991
> > >Category:       kern
> > >Synopsis:       device number defined inconsistently on sparc/Ultrix/OSF1 Alpha
> > >Confidential:   no
> > >Severity:       serious
> > >Priority:       medium
> > >Responsible:    kern-bug-people (Kernel Bug People)
> > >State:          open
> > >Class:          sw-bug
> > >Submitter-Id:   net
> > >Arrival-Date:   Mon Jan 29 11:05:04 1996
> > >Last-Modified:
> > >Originator:     Taras Ivanenko
> > >Organization:
> > Massachusetts Institute of Technology
> > >Release:        1.1
> > >Environment:
> >     SUN SLC (diskless), DEC Alpha OSF/1 as a boot/file server
> > System: NetBSD ctps01.mit.edu 1.1 NetBSD 1.1 (ctp-sun-slc) #1: Sun Jan 28 17:32:33 PST 1996 ivanenko@ctps01.mit.edu:/usr/src/sys/arch/sparc/compile/ctp-sun-slc sparc
> > 
> > 
> > >Description:
> > 	device number is combined 8-bit major/8-bit minor on NetBSD
> > and everywhere else except DEC Alpha, where it is 12-bit major/20-bit
> > minor. When I used DEC Alpha as a file server for root dir in diskless
> > setup, the devices are created with Alpha numbers (12/20) but are then
> 
> Bzzzzt!  That was the mistake.  The NFS diskless arrangement requires
> that you create your device nodes from the client.  The NFS server
> handles the device numbers provided by the client as opaque values,
> so the client will be happy.  To the server, they look like garbage.
> 
> Actually, you can create /dev/console (0,0) on either machine, and
> probably need to make just that on the server so you can boot the
> client in single-user mode to make the remaining device nodes.
> 

Right, I can create /dev/console (0,0) without a problem, but that is
all I can do. No matter where I create the device (from the client or
from the server), the device numbers on the server are correct. On the
client side the numbers are correct for some time but then change to
meaningless values.

Experiment:
1) Booted the client with generic kernel from the distribution, 
   single user mode.
2) Created some device, say zero (3,12) in /tmp
3) ls -al reports <...> 3, 12 /tmp/zero
4) On the server ls -al tmp/zero reports 0, 780 zero

At this point everything should be OK, at least the device numbers are 
right. 

5) Go to lunch, after lunch:

6) Oops! ls -al gives 0, 12 /tmp/zero on the client, 3, 12 tmp/zero on
   the server. This is the problem I was writing about.
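
To make the numbers in the experiment concrete, here is a rough sketch
in C of how the same 32-bit value decodes under the two layouts. It
assumes the 8-bit/8-bit split for NetBSD and the 12-bit/20-bit split
for OSF/1 described in the PR, with the major number in the high bits.
The function names are made up for illustration and are not the kernel
macros.

```c
#include <stdint.h>

/* Assumed NetBSD 1.1 layout: 8-bit major, 8-bit minor. */
uint32_t bsd_makedev(uint32_t maj, uint32_t min) { return (maj << 8) | min; }
uint32_t bsd_major(uint32_t dev) { return (dev >> 8) & 0xff; }
uint32_t bsd_minor(uint32_t dev) { return dev & 0xff; }

/* Assumed OSF/1 Alpha layout: 12-bit major, 20-bit minor. */
uint32_t osf_makedev(uint32_t maj, uint32_t min) { return (maj << 20) | min; }
uint32_t osf_major(uint32_t dev) { return (dev >> 20) & 0xfff; }
uint32_t osf_minor(uint32_t dev) { return dev & 0xfffff; }
```

With these, bsd_makedev(3, 12) is 780, which the 12/20 decoding shows
as 0, 780 (step 4). If the server hands back 3, 12 in its own encoding
(0x30000C), the 8/8 decoding gives 0, 12 (step 6).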



> > read with NetBSD macros (8/8). To make things worse, the devices stay
> > in cache for some time with 8/8 numbers and then switch to 12/20 (as
> > on the server) and make the system unusable.
> > 
> > >How-To-Repeat:
> > 	It shows up every time in my setup.
> > >Fix:
> > 	I put this into sys/types.h as I do not have other /dev except
> > on Alpha. I guess it is possible to make the NFS code work
> > consistently across platforms.
> 

I realize the fix I proposed would not work.

> Note that NFS does deal correctly with different device node formats.
> 
> Gordon
> 

My impression is that OSF1 converts the numbers into its own format
when doing some operations. I have no idea why it does that, but the
effect on NetBSD was disastrous. I cannot change much in the OSF1
kernel, since it is probably DEC's proprietary code, but I can change
the way NetBSD sees the devices. For reference, I have

DEC OSF/1 V2.1 (Rev. 250); Wed Nov  9 20:01:58 EST 1994 
DEC OSF/1 V2.0 Worksystem Software (Rev. 240)


I used the following patch to make things work. NetBSD devices have
numbers less than 0xFFFF and OSF1 devices have numbers more than
0xFFFFF. So when I call the macro osf2bsd_dev, a BSD dev number does
not change, but an OSF1 number gets converted to the proper BSD
number. What am I missing here? Should the file
./usr/src/sys/nfs/nfsm_subs.h
also be changed somehow? I got that idea from the Alpha port.
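
As a sanity check on the reasoning above, the conversion can be
sketched as a plain C function. This is only an illustration that
mirrors the macro in the patch below; osf2bsd_dev_sketch and
bsd_makedev32 are hypothetical names, and the 8/8 makedev layout is an
assumption taken from the PR text.

```c
#include <stdint.h>

/* Assumed NetBSD 8/8 encoding: major in bits 8..15, minor in bits 0..7. */
uint32_t bsd_makedev32(uint32_t maj, uint32_t min)
{
    return (maj << 8) | min;
}

/*
 * Treat the incoming value as an OSF/1 12/20 device number and
 * re-encode it in the BSD layout. 0xFFFFFFFF is passed through
 * untouched, as in the patch.
 */
uint32_t osf2bsd_dev_sketch(uint32_t dev)
{
    if (dev == 0xFFFFFFFFu)
        return dev;
    /* For a BSD number below 0x10000 the major field here is 0,
     * so the value comes back unchanged. */
    return bsd_makedev32((dev >> 20) & 0xfff, dev & 0xfffff);
}
```

Values below 0x10000 come back unchanged, 0x30000C (OSF1 3, 12)
becomes 780 (BSD 3, 12), and 0xFFFFFFFF is passed through. An OSF1
minor number above 255 would spill into the major bits, so this only
works while the minor numbers stay small.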


	Taras.

diff -c output:

*** ./usr/src/sys/nfs/nfs_serv.c	Fri Oct 13 22:54:35 1995
--- /ctpa03/sun/./usr/src/sys/nfs/nfs_serv.c	Sat Feb 03 11:58:03 1996
***************
*** 661,666 ****
--- 661,670 ----
  	nfsm_srvdone;
  }
  
+ #ifdef COMPAT_OSF1DEV
+ #define osf2bsd_dev(dev) 	((dev==0xFFFFFFFF)?dev:makedev((dev >> 20) & 0xfff, dev & 0xfffff))
+ #endif
+ 
  /*
   * nfs create service
   * now does a truncate to 0 length via. setattr if it already exists
***************
*** 714,719 ****
--- 718,726 ----
  			rdev = fxdr_unsigned(long, sp->sa_nfssize);
  		else
  			rdev = fxdr_unsigned(long, sp->sa_nqrdev);
+ #ifdef COMPAT_OSF1DEV
+ 		rdev = osf2bsd_dev(rdev);
+ #endif
  		if (va.va_type == VREG || va.va_type == VSOCK) {
  			vrele(nd.ni_startdir);
  			nqsrv_getl(nd.ni_dvp, NQL_WRITE);
***************
*** 735,742 ****
  				VOP_ABORTOP(nd.ni_dvp, &nd.ni_cnd);
  				vput(nd.ni_dvp);
  				goto out;
! 			} else
! 				va.va_rdev = (dev_t)rdev;
  			nqsrv_getl(nd.ni_dvp, NQL_WRITE);
  			if (error = VOP_MKNOD(nd.ni_dvp, &nd.ni_vp, &nd.ni_cnd, &va)) {
  				vrele(nd.ni_startdir);
--- 742,753 ----
  				VOP_ABORTOP(nd.ni_dvp, &nd.ni_cnd);
  				vput(nd.ni_dvp);
  				goto out;
! 			} else{
! #ifdef COMPAT_OSF1DEV
! 			  rdev = osf2bsd_dev(rdev);
! #endif
! 			  va.va_rdev = (dev_t)rdev;
! 			}
  			nqsrv_getl(nd.ni_dvp, NQL_WRITE);
  			if (error = VOP_MKNOD(nd.ni_dvp, &nd.ni_vp, &nd.ni_cnd, &va)) {
  				vrele(nd.ni_startdir);
*** ./usr/src/sys/nfs/nfs_subs.c	Fri Oct 13 22:54:38 1995
--- /ctpa03/sun/./usr/src/sys/nfs/nfs_subs.c	Sat Feb 03 11:58:01 1996
***************
*** 636,641 ****
--- 636,645 ----
  }
  
  #ifdef NFSCLIENT
+ #ifdef COMPAT_OSF1DEV
+ #define	osf2bsd_dev(dev)	((dev==0xFFFFFFFF)?dev:makedev((dev >> 20) & 0xfff, dev & 0xfffff))
+ #endif
+ 
  /*
   * Attribute cache routines.
   * nfs_loadattrcache() - loads or updates the cache contents from attributes
***************
*** 685,693 ****
--- 689,703 ----
  		vtyp = IFTOVT(vmode);
  	if (isnq) {
  		rdev = fxdr_unsigned(long, fp->fa_nqrdev);
+ #ifdef COMPAT_OSF1DEV
+ 		rdev = osf2bsd_dev(rdev);
+ #endif
  		fxdr_nqtime(&fp->fa_nqmtime, &mtime);
  	} else {
  		rdev = fxdr_unsigned(long, fp->fa_nfsrdev);
+ #ifdef COMPAT_OSF1DEV
+ 		rdev = osf2bsd_dev(rdev);
+ #endif
  		fxdr_nfstime(&fp->fa_nfsmtime, &mtime);
  	}
  	/*
*** ./usr/src/sys/nfs/nfs_vnops.c	Wed Oct 18 07:38:32 1995
--- /ctpa03/sun/./usr/src/sys/nfs/nfs_vnops.c	Sat Feb 03 11:58:00 1996
***************
*** 944,949 ****
--- 944,954 ----
  	return (error);
  }
  
+ #ifdef COMPAT_OSF1DEV
+ #define	osf2bsd_dev(dev)	((dev==0xFFFFFFFF)?dev:makedev((dev >> 20) & 0xfff, dev & 0xfffff))
+ #endif
+ 
+ 
  /*
   * nfs mknod call
   * This is a kludge. Use a create rpc but with the IFMT bits of the mode
***************
*** 974,983 ****
  	struct mbuf *mreq, *mrep, *md, *mb, *mb2;
  	u_long rdev;
  
! 	if (vap->va_type == VCHR || vap->va_type == VBLK)
  		rdev = txdr_unsigned(vap->va_rdev);
  #ifdef FIFO
! 	else if (vap->va_type == VFIFO)
  		rdev = 0xffffffff;
  #endif /* FIFO */
  	else {
--- 979,991 ----
  	struct mbuf *mreq, *mrep, *md, *mb, *mb2;
  	u_long rdev;
  
! 	if (vap->va_type == VCHR || vap->va_type == VBLK){
  		rdev = txdr_unsigned(vap->va_rdev);
+ #ifdef COMPAT_OSF1DEV
+ 		rdev = osf2bsd_dev(rdev);
+ #endif
  #ifdef FIFO
! 	}else if (vap->va_type == VFIFO)
  		rdev = 0xffffffff;
  #endif /* FIFO */
  	else {