Subject: Re: Recoverable Network File System?
To: Greg Troxel <gdt@ir.bbn.com>
From: Sean J. Schluntz <schluntz@workofstone.com>
List: netbsd-users
Date: 12/12/2003 09:21:49
Thanks for the feedback.  I guess that brings me down to one question:
is it possible to turn off the write cache?  I have one instance where
two mail servers may try to write to the same network-mounted mail
spool at the same time :/  Having one of those writes held in a cache
could really run me into trouble.
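
To make the worry concrete, here is a toy sketch of two clients doing
write-behind caching against one server.  It is only an illustration
under my own assumptions, not coda's real protocol or API; Server,
Client, reintegrate and the spool path are all hypothetical names.

```python
# Toy model only: illustrates write-behind caching and a later
# reintegration check. Not Coda's real protocol; all names are hypothetical.
from dataclasses import dataclass, field


@dataclass
class Server:
    data: dict = field(default_factory=dict)     # path -> contents
    version: dict = field(default_factory=dict)  # path -> integer version

    def read(self, path):
        return self.data.get(path, ""), self.version.get(path, 0)

    def reintegrate(self, path, base_version, contents):
        """Accept a logged write only if it was based on the current version."""
        if self.version.get(path, 0) != base_version:
            return False  # conflict: another client updated the file first
        self.data[path] = contents
        self.version[path] = base_version + 1
        return True


@dataclass
class Client:
    server: Server
    log: list = field(default_factory=list)  # pending (path, base, contents)

    def append(self, path, text):
        # The write lands in the local log; the server does not see it yet.
        contents, base = self.server.read(path)
        self.log.append((path, base, contents + text))

    def flush(self):
        """Replay the local log; return the paths that hit a conflict."""
        conflicts = [p for p, base, c in self.log
                     if not self.server.reintegrate(p, base, c)]
        self.log.clear()
        return conflicts


def demo():
    server = Server()
    mx1, mx2 = Client(server), Client(server)
    mx1.append("spool/sean", "message A\n")  # cached on mx1 only
    mx2.append("spool/sean", "message B\n")  # cached on mx2, same base version
    return mx1.flush(), mx2.flush(), server.read("spool/sean")[0]
```

In this model the second server's delivery is not applied when it
flushes; it comes back as a conflict that someone has to repair by
hand, which is exactly what I can't have on a mail spool.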

-Sean

On Dec 12, 2003, at 4:41 AM, Greg Troxel wrote:

> I have been running coda for around 5 years, and mostly winning,
> although I wouldn't call it stable for production.  But I am a
> particularly abusive user of coda, since I do all of the following at
> once:
>
> 1) operation over a 28.8 line, so that venus (the client cache) is
>    usually in write-disconnected mode (write-behind caching,
>    essentially, so that modifications are logged locally and are
>    trickle reintegrated after a hold time of 30s or so)
>
> 2) All coda traffic uses transport-mode IPsec ESP.  This hasn't caused
>    that much trouble recently; coda's port usage plan has gotten
>    simpler over the years, and my usage has shaken out a few bugs
>    where stuff was sent on the wrong port occasionally.
>
> 3) Use of the 'hoard' feature, which walks the cache to ensure that
>    all files in a defined set of directories are in-cache and up to
>    date, so that when you lose connectivity you can still use the
>    files.
>
> 4) Use of cfs, with ciphertext in coda.  This makes repair hard, since
>    you have to repair conflicts in ciphertext which is hard to follow.
>
> 5) Writing lots of data while in write-disconnected mode.
>
> Despite all this, I have almost never lost any data.
>
>
> My problems have fallen into three classes:
>
> 1) Kernel bugs where vnode refcounting is wrong and leads to panics.
>    I think these are all fixed now, or at least the ones I run into.
>
> 2) Repair/reintegration bugs where venus thinks there is a conflict
>    and there isn't.   But if you are running with a high-speed,
>    reliable network, and do 'cfs strong', you should avoid
>    write-disconnected mode almost all the time.
>
> 3) Limits in coda, e.g. directory size.  A mirror of the entire
>    internet-drafts directory gets to have a directory size greater
>    than 256k or something, and coda chokes ungracefully on this.
>
> I have essentially zero trouble these days with the machines that are
> on the same Ethernet as the server, and also no trouble with a machine
> at MIT that's 9 hops away but only 3ms.
>
> The coda security approach has two parts: acls, which are afs-like and
> quite sane, and transport security/authentication, which as
> implemented is completely bogus (due to previous export control
> rules).  So I run over IPsec.
>
> So yes, this is another 'mostly works' review, but if you don't push
> it with disconnected operation or huge directories, I think it will
> work.  And these days disconnected operation usually works; I just had
> a venus repair wedge this week, but I don't remember when the last one
> happened.  If all needed bits are on the server already, one just does
> 'venus -init' to start the client over with an empty cache.
>
> -- 
>         Greg Troxel <gdt@ir.bbn.com>
>
---
Sean J. Schluntz                                                510-785-8949
Sr. Security Consultant                                        Work of Stone