Subject: Re: Why not softdep per default?
To: Karsten Kruse <tecneeq@tecneeq.de>
From: Greg A. Woods <woods@weird.com>
List: current-users
Date: 04/10/2005 21:51:30
[ On Tuesday, April 5, 2005 at 16:14:19 (-0700), Bill Studenmund wrote: ]
> Subject: Re: Why not softdep per default?
>
> Through out this whole thread, you have asserted that life is safer with 
> softdeps on than with it off. I do not believe that is true. I believe 
> that life is _faster_ with softdeps on, not safer.

Indeed.

I can now confirm 101% that life with softdep is _dangerous_ in certain
circumstances, especially when there is _any_ risk of a crash (e.g. a
panic or an operator fat-fingering something).  No amount of backup power or
cache integrity or other magic will save the day from softdep in those
cases.

For example with Cyrus IMAP each e-mail message is stored in a separate
file, and many of its "database" files are updated by copying and
renaming too.  If a busy mail server crashes the result can be hundreds,
if not thousands, of corrupt mailboxes, and many messages (perhaps tens
of thousands) that are very nearly (and effectively) lost.
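
For readers unfamiliar with the "copying and renaming" update pattern,
here is a minimal sketch of it in Python.  This is not Cyrus's code; the
function name and the ".tmp-" prefix are made up for illustration.  The
point is that the final step is a directory update (the rename), which
is exactly the kind of metadata write that softdep defers.  Whether the
directory fsync at the end is enough to make the rename durable under
softdep is an assumption that has not been verified here.

```python
import os
import tempfile


def replace_file_atomically(path, data):
    """Write data to a temporary file, then rename it over path.

    Hypothetical helper illustrating the write-then-rename pattern.
    """
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temporary file in the same directory so the rename
    # stays within one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".tmp-")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()
            os.fsync(tmp.fileno())  # ask the OS to push file contents to disk
        # The directory entry changes here; after a crash, an unlinked
        # but intact file is what tends to land in lost+found.
        os.replace(tmp_path, path)
    except BaseException:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
    # Ask the OS to flush the directory itself (POSIX systems only).
    dir_fd = os.open(directory, os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)
```

A mail store doing this thousands of times a minute has a great many
directory updates in flight at any moment, which is why a crash can
touch so many mailboxes at once.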

Note that the only file content that's lost is that of directories --
all the message files will, at worst, end up in the lost+found
directory.  However that's more than enough of a mess to ruin your
day/week/month.  (Partially delivered messages are still in the MTA's
queue, and MTA queue files that end up in lost+found can usually,
depending on your MTA, be easily identified, safely put into a
temporary queue, and re-delivered without much worry.)

These little message files cannot be put back in place without
effectively re-delivering them.  At best this screws up your users as
they receive what appears to be new mail which is a copy of old mail
they may already have read and deleted (especially if they had
downloaded their mail to work offline just before the crash).  At worst
you give up and delete it all because it's 80% spam anyway.  :-)

Softdep is not the problem though -- just its inappropriate use.  My
build servers, for example, can really benefit from softdep on some of
their filesystems (e.g. what I call /build, and maybe also /usr/pkg),
and if any of their files end up in lost+found after a crash then I can
safely just delete them and re-run make -- nothing is "lost" that cannot
be rebuilt.
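
As a rough illustration of that split, an /etc/fstab on such a build
server might enable softdep only on the rebuildable filesystems.  This
is a sketch, not a copy of any real configuration: the device names are
hypothetical, and only /build and /usr/pkg come from the text above.

```
# /etc/fstab (sketch; device names are made up)
/dev/wd0a  /         ffs  rw          1 1
/dev/wd0e  /usr/pkg  ffs  rw,softdep  1 2
/dev/wd0f  /build    ffs  rw,softdep  1 2
```

Filesystems holding data that cannot be regenerated would simply be
mounted with plain "rw".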

However my mail and web servers, and even my home-directory servers,
will no longer use softdep (and I cannot foresee a day when they ever
will again(*)).  It is no fun to find thousands of lost files in the
lost+found directory, especially when there's no direct way to tell
where they really belong.


(*) NCR once upon a time built some unix systems with full main memory
backup, and AT&T was working on similar features -- you could pull the
plug on those things and they'd return to their exact same running state
when power was restored.  You could even swap out a processor board on
the AT&T variants.  However that still wouldn't make softdep truly safe
since they didn't have any way to recover state after a kernel crash.

-- 
						Greg A. Woods

H:+1 416 218-0098  W:+1 416 489-5852 x122  VE3TCP  RoboHack <woods@robohack.ca>
Planix, Inc. <woods@planix.com>          Secrets of the Weird <woods@weird.com>