Subject: How to run sshd on read-only mounted root file system?
To: None <tech-userlevel@netbsd.org>
From: Ian Zagorskih <ianzag@megasignal.com>
List: tech-userlevel
Date: 02/24/2006 23:07:09
uname -a 
NetBSD MAKS 3.99.15 NetBSD 3.99.15 (GENERIC) #10: Fri Feb 24 21:23:50 NOVT 
2006  ianzag@IANZAG:/home/ianzag/NetBSD/kernel/GENERIC i386

I have an embedded system with the root file system mounted read-only. The 
reason is to get rid of fsck, because we need this system to boot as fast as 
possible and, on the other hand, it can be switched off at any time without a 
proper shutdown. The fstab looks like this:

/dev/wd0a /   ffs  rw,noatime,nodevmtime 0  1
swap  /tmp  mfs  rw,-s=2m,noatime  0  0
swap  /var  mfs  rw,-s=4m,noatime  0  0
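
A quick way to double-check which fstab entries actually carry the "ro" 
option is to split the fourth field on commas. This is only a minimal sketch: 
the function name list_ro_mounts is made up for illustration and is not part 
of NetBSD. Note that the entries as quoted above all list rw, so it would 
print nothing for them.

```shell
#!/bin/sh
# list_ro_mounts: hypothetical helper, not a NetBSD tool.
# Reads fstab(5)-style lines on stdin and prints the mount point of
# every entry whose option field contains the exact word "ro".
list_ro_mounts()
{
        awk '
        $1 ~ /^#/ || NF < 4 { next }       # skip comments and short lines
        {
                n = split($4, opt, ",")
                for (i = 1; i <= n; i++)
                        if (opt[i] == "ro") { print $2; break }
        }'
}

# Usage (on the target): list_ro_mounts < /etc/fstab
```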

I.e. only the root file system is mounted read-only, and all dynamic 
stuff is created by hand at boot time in /var and /tmp.

In order to manage this system remotely I'm using ssh, so sshd is running. 
The problem is that when a user logs in, sshd calls openpty(3) to allocate 
a pty and, because /dev is r/o, it fails -> I cannot log in. Of course, there's 
the "-T" option for the ssh client, which tells the remote sshd not to 
allocate a pty, but the resulting shell is rather limited, so I cannot even 
run vi, less etc. -> this is hardly an "administrative shell".
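
As far as I understand it, allocating a BSD-style pty involves changing the 
owner and mode of the slave tty node under /dev, and a read-only file system 
refuses any such attribute change. A rough way to see whether a given node's 
attributes can be changed at all is to try updating its timestamps. This is 
a sketch only: attrs_changeable is a made-up name, and the check is a stand-in 
for what openpty(3) really does, not a copy of it.

```shell
#!/bin/sh
# attrs_changeable: hypothetical helper, name invented for this sketch.
# Succeeds only if the given path exists and its timestamps can be
# updated. Side effect: it really does touch the node's timestamps.
attrs_changeable()
{
        [ -e "$1" ] || return 1            # touch -c is silent on missing files
        touch -c "$1" 2>/dev/null          # fails on a read-only file system
}

# Usage (on the target):
#   attrs_changeable /dev/ttyp0 && echo "ok" || echo "cannot modify node"
```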

At #netbsd I was advised to put a dynamically generated /dev somewhere on one 
of my MFS temporary file systems and then mount it over the original /dev 
with null fs. So here we are: at boot time the following script does this:

---cut---
#!/bin/sh
#
# PROVIDE: finishlocal
# REQUIRE: mountlocal

$_rc_subr_loaded . /etc/rc.subr

name="makevar"
start_cmd="makevar_start"
stop_cmd=":"

makevar_start()
{
        echo "Setting up dynamic /var"
        mkdir -m 0755 -p /var/dev /var/run /var/log /var/db /var/chroot/sshd
        mkdir -m 1777 -p /var/tmp
        touch /var/log/authlog /var/log/messages /var/log/xferlog /var/log/aculog
        echo "Setting up dynamic /dev"
        cd /var/dev && /dev/MAKEDEV maks
        ln -s /var/run/log /var/dev/log
        echo "Mount dynamic /dev"
        mount_null /var/dev /dev
}

load_rc_config $name
run_rc_command "$1"
---cut---

Of course, option "file-system NULL" is enabled in the kernel config file, 
and this is the first script I run, before any services like sshd are 
started. The system boots just fine, device nodes are created dynamically in 
/var/dev and finally mounted over /dev:

# mount
/dev/wd0a on / type ffs (noatime, local)
mfs:25 on /var type mfs (synchronous, noatime, local)
mfs:30 on /tmp type mfs (synchronous, noatime, local)
/var/dev on /dev type null (local)
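
The same check can be scripted, e.g. so that sshd is only started once /dev 
really is the null mount. A minimal sketch, assuming mount(8) output in the 
"X on DIR type FSTYPE (...)" form shown above and assuming the last listed 
entry for a directory is the one on top; mount_type_of is a made-up name:

```shell
#!/bin/sh
# mount_type_of: hypothetical helper. Reads mount(8)-style lines
# ("/var/dev on /dev type null (local)") on stdin and prints the file
# system type of the last listed mount on the given mount point.
# Prints nothing if the directory is not a mount point.
mount_type_of()
{
        awk -v mp="$1" '
        $2 == "on" && $3 == mp && $4 == "type" { t = $5 }
        END { if (t != "") print t }'
}

# Usage (on the target):
#   [ "$(mount | mount_type_of /dev)" = "null" ] || echo "/dev is not nullfs"
```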

Now when I try to log in to the system with pty allocation enabled 
(the default), the ssh client just hangs:

$ ssh -t toor@192.168.100.45 
toor@192.168.100.45's password: 
...and then nothing, i.e. no response to any keyboard input and so on. Nothing 
appears in the logs, on the console etc.; it just hangs. At the same time, 
from another shell it looks like this:

$ ssh -T toor@192.168.100.45 
toor@192.168.100.45's password: 
ps aux 
ps: warning: /var/run/dev.db: No such file or directory
USER PID %CPU %MEM VSZ  RSS TTY STAT STARTED    TIME COMMAND
root 479  7.4  4.0 316 2440 ?   Ss   11:07PM 0:00.27 sshd: toor@notty 
root 136  4.0  1.2 148  728 ?   Ss   11:07PM 0:00.04 -sh 
root   0  0.0  6.5   0 3980 ?   DKs  11:03PM 0:00.01 [swapper]
root   1  0.0  1.1  48  632 ?   Is   11:03PM 0:00.05 init 
root   2  0.0  6.5   0 3980 ?   DK   11:03PM 0:00.00 [atabus0]
root   3  0.0  6.5   0 3980 ?   DK   11:03PM 0:00.00 [atabus1]
root   4  0.0  6.5   0 3980 ?   DK   11:03PM 0:00.01 [pagedaemon]
root   5  0.0  6.5   0 3980 ?   DK   11:03PM 0:00.02 [ioflush]
root   6  0.0  6.5   0 3980 ?   DK   11:03PM 0:00.01 [aiodoned]
root  14  0.0  6.5   0 3980 ?   DK   11:03PM 0:00.00 [physiod]
root  25  0.0  1.2 216  740 ?   Ss   11:03PM 0:00.09 mount_mfs -s 4m -o noatime
root  30  0.0  1.0 216  580 ?   Is   11:03PM 0:00.04 mount_mfs -s 2m -o noatime
root 378  0.0  1.2 172  736 ?   Ss   11:03PM 0:00.22 /usr/sbin/syslogd -s 
root 404  0.0  1.3 612  756 ?   Is   11:03PM 0:00.01 /sbin/dhclient -q rtk0 
root 434  0.0  2.8 284 1668 ?   Ss   11:03PM 0:01.55 /usr/sbin/sshd 
root 463  0.0  4.0 324 2432 ?   Is   11:06PM 0:00.27 sshd: toor@ttyp0 
root 475  0.0  1.1  80  636 ?   R    11:07PM 0:00.00 ps aux 
root 444  0.0  1.2 148  720 ?   Is   11:06PM 0:00.06 -sh 

I.e. the first session actually does reach the remote host, logs in and even 
gets a pty (see "sshd: toor@ttyp0", PID 463, and its "-sh", PID 444, above), 
but it hangs at some point after that.
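
To pick such pty-backed sessions out of a longer listing, the ps output can 
be filtered for "sshd: user@ttyXX" entries. A small sketch, assuming the 
"ps aux" layout shown above (PID in the second column); the function name 
sshd_pty_sessions is invented for illustration:

```shell
#!/bin/sh
# sshd_pty_sessions: hypothetical helper. Reads "ps aux"-style lines on
# stdin and prints "PID user@tty" for every per-session sshd process
# that has a pty attached. "sshd: user@notty" entries are skipped.
sshd_pty_sessions()
{
        awk '{
                for (i = 3; i < NF; i++)
                        if ($i == "sshd:" && $(i + 1) ~ /@tty/)
                                print $2, $(i + 1)
        }'
}

# Usage (on the target): ps aux | sshd_pty_sessions
```

For the listing above this yields PID 463 with toor@ttyp0, i.e. the hung 
session, while the "-T" session (toor@notty) is left out.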

The problem can be "fixed" quite easily: when I do not mount /var/dev 
over /dev with NULL FS but instead run with the root mounted read/write and 
the same set of devices, everything works just fine, as usual.

Well, the question is: am I right in my feeling that special files are not 
accessible over NULL FS?
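
One way to narrow this down is to open the same node through both paths and 
compare. A minimal sketch; chardev_opens is a made-up name, and it only 
tests that the node is a character device and that open-for-read succeeds, 
nothing more. It is safest to start with harmless nodes such as null or zero, 
since opening some devices (ttys, tapes) may block or have side effects.

```shell
#!/bin/sh
# chardev_opens: hypothetical helper. Succeeds if the given path is a
# character special file and can be opened for reading.
chardev_opens()
{
        [ -c "$1" ] || return 1
        # Open in a subshell: a failed redirection on ":" may abort
        # a non-interactive shell, and we only want the exit status.
        ( : < "$1" ) 2>/dev/null
}

# Usage (on the target), same node via the lower layer and via nullfs:
#   chardev_opens /var/dev/null && echo "lower layer ok"
#   chardev_opens /dev/null     && echo "through null mount ok"
```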

The second question is: when I do the same thing but with UNION FS, a simple 
access to /dev such as "ls -l /dev" causes a kernel panic complaining about 
"locking to itself". I haven't found out why yet.

// wbr