netbsd-help: Re: AFS or CodaFS

Subject: Re: AFS or CodaFS
To: PetraHof <petrahof@chicagonet.net>
From: Phil Nelson <phil@cs.wwu.edu>
List: netbsd-help
Date: 01/18/2005 12:51:35

On Tuesday 18 January 2005 09:00, PetraHof wrote:
> I've been closely following this thread and now wonder why choose one
> file system over the other, as apparently Coda is an extension of AFS.
> Would someone explain, thanks.  Petra

I can answer this one.  Yes, Coda started with AFS v2 and then went in a
different direction.  Some machines at CMU run both AFS and Coda.  The
following summarizes the primary differences between AFS and Coda as I
understand them:

   1)  First, AFS and Coda break the file storage into "volumes".  Each
volume contains a filesystem tree.  Volumes may be mounted in any
directory in the entire tree.  Typically, each user gets a volume, but a
single user may have many volumes in their file tree.  In AFS, volumes
are stored on a read/write server.  There may be several read-only
replicas, but writing to the volume requires writing to the read/write
server.  In Coda, this restriction was dropped, so a volume may be stored
on several servers as a read/write replica.  So the loss of one server
does not stop work with volumes that have replicas on other servers.

   2)  Both AFS and Coda use a client cache to store the complete file on
the local disk.  Neither reads nor writes individual blocks.  They both
implement session semantics.  The difference comes when the server(s)
is (are) unavailable.  AFAIK, if there is a read replica that can be
contacted, AFS lets you read the file.  If the read/write replica is not
available (network or server problems) then you cannot write the file.
You lose all access to AFS in that case.

   Coda assumes that if the file is in the local cache and you can't
talk to any server, the user should still be allowed to use the locally
cached file.  The client goes into "disconnected operation" where the
user can still read and write files, but only ones that are in the
cache.  (You can create a new file while disconnected.)  The client
keeps track of all modifications and new files during the disconnected
period, and when it can again connect to the server(s), it sends all the
modifications back to the server(s).
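
The disconnected behaviour described above can be sketched roughly as
follows.  This is only a toy model, not Coda's code: the Client class,
its method names, and the use of a plain dict as the "server" are all
made up for illustration.

```python
class Client:
    """Toy client with a whole-file cache and a log of offline changes.

    `server` is a plain dict (path -> contents) standing in for a real
    file server; every name here is hypothetical.
    """

    def __init__(self, server):
        self.server = server
        self.connected = True
        self.cache = {}   # whole files held on the local disk
        self.log = []     # (path, data) changes made while disconnected

    def read(self, path):
        if self.connected:
            # Fetch the complete file into the cache before using it.
            self.cache[path] = self.server[path]
        # While disconnected only cached files work (KeyError otherwise).
        return self.cache[path]

    def write(self, path, data):
        self.cache[path] = data
        if self.connected:
            self.server[path] = data
        else:
            self.log.append((path, data))  # remember it for later

    def disconnect(self):
        self.connected = False

    def reconnect(self):
        # Send back every logged modification, oldest first.
        for path, data in self.log:
            self.server[path] = data
        self.log.clear()
        self.connected = True
```

A real client also has to notice when the server copy changed in the
meantime; this sketch simply overwrites it.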

Both 1 and 2 above allow for conflicts to occur: first, when two servers
disagree on the content of a file, and second, when the servers and a
client disagree on the content of a file.  This is the primary reason a
user needs to monitor the Coda file system.  Also, since access to a
file is delayed until the file is in the local cache, it is nice to be
able to see that a delay in editing your file is due to fetching the
file from the server.
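
One common way to tell "one copy is simply newer" apart from "the copies
really disagree" is a per-file version vector.  The sketch below shows
the general technique only; it is not taken from the Coda sources, and
the function name and return strings are invented for the example.

```python
def compare(a, b):
    """Compare two version vectors (dict: server name -> update count).

    Returns "same", "a is newer", "b is newer" or "conflict".
    """
    names = set(a) | set(b)
    a_ahead = any(a.get(n, 0) > b.get(n, 0) for n in names)
    b_ahead = any(b.get(n, 0) > a.get(n, 0) for n in names)
    if a_ahead and b_ahead:
        # Each copy has seen an update the other has not.
        return "conflict"
    if a_ahead:
        return "a is newer"
    if b_ahead:
        return "b is newer"
    return "same"
```

For example, compare({"s1": 2, "s2": 1}, {"s1": 1, "s2": 2}) reports a
conflict, because each copy holds an update the other one missed.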

Because of this disconnected operation, databases should not be used on
Coda.  The fact that many programs use "mini databases" in one's home
directory also means that Coda is somewhat unfriendly to home
directories.  AFS works quite well in this situation (although I don't
know how good the current AFS support on NetBSD is).  Coda tends to work
very nicely in other environments where the network connection can be a
problem, or on a laptop.  People have been known to take their laptop
away from a network for up to 2 weeks and do a lot of work in Coda.
When they finally returned, they reconnected and their 2 weeks of work
was sent back to the servers with no problems.

There is also an instance of disconnected operation triggered by a
server problem.  Several years ago at CMU, a person was using Coda and
their use exposed a bug in the server.  The bug caused a server crash.
The primary developer noticed the crash, debugged the server, brought it
back with a fixed version ... and the user never knew the server had
crashed.  His client just went into disconnected operation and the user
kept working the entire time the server was being fixed.  The only
reason the user knew there was a server problem was that the developer
went to the user's office to make sure everything was OK.

There are some interesting results due to the multiple read/write
replicas and the algorithms for resolving conflicts.  If one server
loses a disk, one can recover that server by replacing the disk,
creating the complete set of empty volumes on the server, and then
letting the other servers repopulate the contents of the volumes by
conflict resolution.  No need to load from backups.
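
The recovery trick can be pictured with a toy resolver.  This is a
sketch under strong assumptions: each replica of a volume is a dict of
path -> (version, data), the highest version simply wins, and true
conflicts (same version, different data) are ignored.  The function name
is hypothetical.

```python
def resolve(replicas):
    """Bring every replica of one volume up to the newest copy of each file.

    Each replica is a dict: path -> (version, data).  Higher version wins.
    """
    newest = {}
    for rep in replicas:
        for path, (version, data) in rep.items():
            if path not in newest or version > newest[path][0]:
                newest[path] = (version, data)
    # An empty replica (a freshly replaced disk) ends up fully populated.
    for rep in replicas:
        rep.update(newest)
```

Calling resolve([good, empty]) with an empty dict for the rebuilt server
leaves both replicas holding the same files.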

There are other differences due to diverging code, but disconnected
operation is IMHO the primary reason to use Coda.  The other reason to
use Coda is the multiple r/w replicas.

--Phil

--
Phil Nelson                       NetBSD: http://www.netbsd.org
e-mail: phil@cs.wwu.edu           Coda: http://www.coda.cs.cmu.edu
http://www.cs.wwu.edu/nelson