Subject: Re: Hacking rsh to add exit codes.
To: None <tech-userlevel@netbsd.org>
From: Todd Whitesel <toddpw@best.com>
List: tech-userlevel
Date: 01/13/2000 17:50:09
Previously I wrote:
> I want. Namely, return the exit code of the remote session so if my remote
> compile fails, my build stops instead of charging ahead blindly. (Currently
> I am using ssh with acceptable success, but nagging problems remain.)

Further investigation confirms something I've suspected for a few days now;
it's an m68k specific problem that affects both ssh and rsh. Once in a blue
moon, the client enters the kernel via a trap for connect(), and never comes
out. It happens frequently on mac68k and rarely on sun3, but has not yet been
observed on sparc even though many more trials have been run on the sparc.

I figure the best way to help fix this one is to work around it for now and
get more snapshots built so anyone who wants to help test can run what I'm
running. This was originally observed with 1.4P, but also occurs with the
1.4.2_ALPHA kernels I just built.

I'm dusting off some old socket example code of mine to eliminate as many
unknowns as possible and get something working that might be able to recover
from the hanging phenomenon. In hindsight I would have saved time if I'd
done that to start with, but them's the breaks.
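
For anyone who wants a stopgap in the meantime, here is a rough sketch of
one way to recover from a hung client: run it under a watchdog that kills
it and retries when it does not finish in time. This is not the socket
code mentioned above. The function name run_with_watchdog and its
parameters are made up for illustration, and the timeout covers the whole
run, not just connect(), so it has to be longer than the slowest
legitimate job. A process that is truly wedged inside the kernel may not
die when killed, in which case this will not help either.

```python
import subprocess
import sys


def run_with_watchdog(argv, timeout_s, attempts=3):
    """Run argv; if it has not finished within timeout_s, kill it and retry.

    Returns the command's exit status, or raises TimeoutError if every
    attempt had to be killed. All names here are hypothetical.
    """
    for attempt in range(1, attempts + 1):
        try:
            # On timeout, subprocess.run kills the child and reaps it
            # before raising TimeoutExpired.
            done = subprocess.run(argv, timeout=timeout_s)
            return done.returncode
        except subprocess.TimeoutExpired:
            print(f"attempt {attempt} timed out after {timeout_s}s",
                  file=sys.stderr)
    raise TimeoutError(f"{argv[0]} hung on all {attempts} attempts")
```

Retrying is only safe when the remote command can be run twice without
harm, which is usually true of a compile but not of everything.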

Unfortunately I can't use either of these suggestions:

> This trick relies on the fact that date will send output to stdout (thus
> providing input to wc), and error messages to stderr (not read by wc). 
> Brian

Yeah, but I need all of stdout too; I can't throw it away just to get the
exit code.
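
To make the requirement concrete, here is a minimal sketch of one way to
keep every byte of stdout and still learn the exit code: have the remote
shell append a status trailer after the command's output and strip it off
on the client side. The marker string, the function names, and the
transport argument are all assumptions for illustration. The sketch runs
through a local sh so it can be tried without a remote host, and it
assumes the remote shell is Bourne-compatible and that the marker never
appears in real output.

```python
import subprocess

MARKER = "__REMOTE_EXIT__"  # hypothetical sentinel, not part of any protocol


def wrap(command):
    """Shell text that runs command in a subshell, then prints its status."""
    return "( " + command + " ); printf '\\n%s %d\\n' " + MARKER + ' "$?"'


def split_output(raw):
    """Separate the command's own stdout from the trailing status line."""
    head, sep, tail = raw.rpartition("\n" + MARKER + " ")
    if not sep:
        raise ValueError("status trailer missing; the session died early")
    return head, int(tail)


def run(command, transport=("sh", "-c")):
    # transport is an assumption: something like ("rsh", "somehost") would
    # take its place for real use.
    proc = subprocess.run([*transport, wrap(command)],
                          stdout=subprocess.PIPE, text=True)
    return split_output(proc.stdout)
```

Because the trailer always starts with its own newline, the client can
remove exactly what was added, and output that does not end in a newline
comes back unchanged.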

> Isn't this what the "rexec" (remote execute) service was for?
>         -- Jason R. Thorpe <thorpej@nas.nasa.gov>

I wish. Unfortunately rexec only produces two sockets: one for stdin and
stdout, and an optional one for signal numbers (one byte per signal request)
and stderr. UTSL on rshd reveals that it explicitly ignores the exit code
of the remote shell, calling exit(0) instead of wait() or anything like it.
The remote job gets orphaned for init(8) to reap.
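
For comparison, here is a small sketch of the behavior I want instead:
the parent waits for the child and hands back its status rather than
exiting with 0. This is not rshd's code, just an illustration of the
wait() step. The function name run_and_report is made up, and mapping
death by signal to 128 plus the signal number is the usual shell
convention, chosen here as an assumption.

```python
import os


def run_and_report(argv):
    """Fork, exec argv in the child, and return the child's exit status."""
    pid = os.fork()
    if pid == 0:
        try:
            os.execvp(argv[0], argv)
        finally:
            os._exit(127)  # only reached if the exec itself failed
    # Waiting here is what keeps the status from being lost.
    _, raw_status = os.waitpid(pid, 0)
    if os.WIFEXITED(raw_status):
        return os.WEXITSTATUS(raw_status)
    return 128 + os.WTERMSIG(raw_status)  # assumed shell-style convention
```

A wrapper would then finish by exiting with run_and_report's return
value, so a failed remote compile stops the build.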

Thanks for the quick responses, everyone. It looks like I'm on my own for
this, but that's fine -- I was hoping to get around to writing it eventually
anyway.

Todd Whitesel
toddpw @ best.com