Subject: Re: Introducing myself to the club
To: None <port-vax@NetBSD.ORG>
From: Michael Sokolov <mxs46@po.CWRU.Edu>
List: port-vax
Date: 01/11/1998 17:57:06
   David Brownlee <abs@anim.dreamworks.com> wrote:
> It might help to work out the services you are intending to
> provide in more detail, to better match machines to tasks.
> eg: it might make sense to have a VS3100/38M each for email
> and www (assuming you are interested in having a web server :),
> to keep load off the central servers.
   What do you mean by "email"? It can mean three different things:
receiving incoming SMTP mail and delivering it to mailboxes, serving POP
clients, and relaying SMTP for machines that can't send mail directly to
the recipient, only to a relay. I think that the first task should be
handled by the same machine that actually stores the mailboxes, since that
machine will have to spend CPU cycles on any incoming message anyway,
whether it comes via SMTP or via NFS. All UNIX "clusters" that I have seen,
even those with more than just one central server, direct incoming SMTP
mail to the machine that actually stores the mailboxes. The second task
(POP serving), on the other hand, is usually delegated to a separate
machine which NFS-mounts /var/mail from the disk/mail server. I plan to
dedicate one of the "satellites" to that. Since I don't find POP
particularly important (I think that one should log into a UNIX box to read
E-mail rather than POP it out), I'll probably use a slow VS2000 for that
(don't want to waste a fast VS3100 M38 on Eudora crybabies). As for the
third task (relaying SMTP, also mostly for Eudora crybabies), I don't plan
to do it on my system at all. Other systems on our campus have pretty fast
machines dedicated just to that, and anyone can relay SMTP through them,
even if the From: address in their relayed messages is on my system and not
on the one that the relay is part of.
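   A sketch of what the POP-serving "satellite" side of this might look
like (the hostname and mount options here are made up for illustration,
in 4.4BSD-style fstab syntax):

```
# /etc/fstab fragment on the hypothetical POP satellite:
# NFS-mount the central disk/mail server's spool directory
harhan:/var/mail  /var/mail  nfs  rw  0 0
```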
   The situation with anonymous FTP, Gopher, and WWW servers is pretty much
the same as with incoming SMTP mail: the load that the corresponding
daemons put on the machine is more disk-bound than CPU-bound, so they
should be running on the machine that stores the files involved. Usually
the public-access
services can be easily put on a separate machine together with the public-
access files themselves, since they are usually separate from the home
directories and other data on the system. But with Harhan the situation is
different. The primary purpose of the Harhan Project is to provide services to
CWRU faculty, staff, and students, not to host a general public-access site
for some organization or department. Therefore, most public-access files
will be in the home directories of individual users who make them
accessible to the public by setting the permissions appropriately. In this
case, the daemons for anonymous FTP, Gopher, and WWW should run on the
central disk/mail server. Of course, I could set up special areas for
personal public-access files separate from the home directories, but then I
would have to divide the potentially scarce disk space between the two,
while with my current approach I will give each user a choice of how to use
his/her total disk quota.
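   The "setting the permissions appropriately" part can be sketched as a
small script; a minimal Python illustration (the helper name and subtree
layout are my own, not part of the actual setup):

```python
import os
import stat

# Hypothetical sketch: make a subtree under a user's home directory
# world-readable, so the ftpd/httpd on the central disk server can
# serve it to anonymous clients.
def publish(path):
    for root, dirs, files in os.walk(path):
        # directories need "other" read plus search (execute) permission
        os.chmod(root, os.stat(root).st_mode | stat.S_IROTH | stat.S_IXOTH)
        for name in files:
            p = os.path.join(root, name)
            # plain files only need "other" read permission
            os.chmod(p, os.stat(p).st_mode | stat.S_IROTH)
```

The point of or-ing in the bits rather than setting an absolute mode is
that the owner's and group's existing permissions are left untouched.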
   In any case these are just my plans for the initial setup. If it proves
inefficient when people start using Harhan, I will definitely change it.
> Assuming you're not using the main server for anything but file
> serving, IO and memory (for a cache) is going to be more of an
> issue than CPU. That extra 32MB used in the buffer cache could
> give a real win...
   That's right. When it comes to begging my sponsors for purchase
requisitions, memory modules will probably be a higher priority than the
KA655.
   BTW, you mention a buffer cache. What exactly do you mean by that, and
what controls its size?
> Can you get the UNIX source licence that allows you to use the AT&T
> code in 4.4BSD?)?
   One SCO guy has told me that SCO (the current legal owner of UNIX(R))
doesn't object anymore to people sharing the UNIX(R) code in BSD. He wrote
to me:
> I think SCO has even said "go ahead and distribute those files".
   He wasn't sure about it himself, so I can't be certain. If this is true,
great. If not, well, I'll have to obtain a copy of 4.4BSD illegally. But I
DO NEED that code, whether legal or not.
> If you can't get the licence then you probably have to accept gnu
> tools to fill in the gaps - and NetBSD is probably the best
> source tree to start from, and then shift back whichever userland
> utilities and libraries to give the interface required.
   The "userland utilities" that I want constitute most of the "encumbered
code".
> Starting with the NetBSD tree gives you support on all your
> machines (plus the promise of more to come) [...]
   Actually I like the kernel structure and interfaces of Berkeley UNIX(R)
better than those of NetBSD (especially the classical versus "new"
config(8)). As far as support for BabyVAXen goes, I have recently
discovered another possible way to proceed. I had always thought that
there was no source whatsoever for Ultrix, but I recently read an
Ultrix manual that talks about kernel config files and building custom
kernels. That probably means that there are kernel sources for Ultrix after
all. I have never seen Ultrix myself, but I should be able to get it if
necessary. In general it seems to me that Berkeley UNIX(R) and Ultrix
developed in parallel with a lot of code exchange between the two. The
device names and config structure seem to be the same in both. This
applies to both vax and pmax ports. So Ultrix may very well turn out to be
an excellent resource. BTW, does anyone know where Ultrix/vax support
starts? I don't think that it goes all the way back to the VAX-11/780.
   Brian D Chase <bdc@world.std.com> wrote:
> The VAXstations are diskless [...]
   No, I won't use any diskless machines. It's much better to use small
local disks for the static stuff like the OS and the applications than to
NFS-mount them and put an extra load both on the disk server and the
Ethernet trunk.
> Given enough clients you could even stick multiple ethernet cards in your
> server to split up the client net traffic onto different ethernet
> segments (I'm assuming that no one's going to be donating any switched
> hubs to you any time soon :-)
   One should also take into account that Harhan is not going to be an
isolated system, but will be connected to the world via CWRUnet, our
campus fiberoptic network. It's quite an advanced network, and it definitely
has switched hubs. As for splitting the traffic into segments, that's a
very delicate task, and it must be thought out very well. In principle it
would be nice to separate the NFS traffic between the disk server and the
"satellites" from FTP and other traffic between the "satellites" and the
outside world. Imagine a typical situation: someone is sitting on a
"satellite" and FTPing a file from some site to his home directory. The
data has to go from the outside gateway to the "satellite" and then to the
disk server. If the "satellite" has only one network interface, as all
BabyVAXen unfortunately do, the bandwidth will be halved no matter what. In
the ideal world one would have two network interfaces on each machine and
use one for internal NFS traffic and the other for communication with the
outside world. The internal interfaces would be interconnected but
not connected to anything else, and the external interfaces would connect
to the outside world, divided into as many segments as necessary. Extending
this idea a little further quickly leads to DEC's cluster architecture with
CI, DSSI, SDI, dual-ported drives...
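   The halved-bandwidth point above is simple arithmetic; a toy Python
illustration (the 10 Mbit/s figure is my assumption for classic shared
Ethernet, and protocol overhead is ignored):

```python
# Back-of-the-envelope check of the "bandwidth will be halved" remark:
# with one interface on a shared segment, every FTP'd byte crosses the
# same wire twice -- gateway -> "satellite", then "satellite" -> disk
# server over NFS.
def effective_mbps(link_mbps, wire_crossings):
    return link_mbps / wire_crossings

print(effective_mbps(10.0, 2))   # one NIC, data crosses twice -> 5.0
print(effective_mbps(10.0, 1))   # separate NFS segment -> full 10.0
```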
> Then I'd use the Q-bus MicroVAX systems and any local disk
> storage they had to act as special function servers: e-mail, WWW, FTP,
> DNS, maybe even serving less often accessed or less speed critical NFS
> filesystems -- perhaps use some space on one to hold all your man pages?
   As I have said earlier, having a lot of specialized servers probably
won't be very efficient since they all need to access the same home
directories anyway. And the less-often-accessed, less-speed-critical
things like man pages that you mention are static, so they are much
better off on small local disks.
   
   Sincerely,
   Michael Sokolov
   Phone: 440-449-0299
   ARPA Internet SMTP mail: mxs46@po.cwru.edu