Subject: Re: Defragmenting a NetBSD drive
To: Guenther Grau, Robert Elz <kre@munnari.OZ.AU>
From: Brian Buhrow <buhrow@cats.ucsc.edu>
List: netbsd-ports
Date: 09/16/1999 07:39:01
	If a program is scanning a directory and collecting the entries in the
directory, and another process is modifying the entries by shuffling them
around and re-ordering them, either by moving them toward the beginning of
the directory or deleting them entirely, the process scanning the directory
could either miss entries which actually exist, get double entries as it
scans over a name twice, or otherwise read off the end of the directory
because its size changed in the middle of its scan.  To avoid this
problem, you'd need to implement some sort of locking or transaction
protocol to guarantee atomicity for a directory scan.  That problem,
combined with the relatively modest improvement in disk storage efficiency,
makes this look like a poor return on a substantial amount of work.  Few
directories take up many extra blocks unnecessarily and disk blocks are
getting cheaper every day.
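
	To make the failure modes concrete, here is a toy sketch in Python.
It is only a model, not the FFS on-disk format or any kernel code: the
directory is a plain list of slots, the scanner remembers a slot offset
between reads, and the names scan(), compact() and move_first_to_end()
are made up for illustration.

```python
def scan(slots, mutate=None, mutate_after=0):
    """Read the directory slot by slot, keeping only a saved offset.

    If 'mutate' is given, it is called once, right after the scanner
    has consumed 'mutate_after' slots, to play the part of another
    process changing the directory in the middle of the scan.
    """
    seen = []
    offset = 0
    while offset < len(slots):
        name = slots[offset]
        offset += 1
        if name is not None:        # None models a freed (empty) slot
            seen.append(name)
        if mutate is not None and offset == mutate_after:
            mutate(slots)
    return seen


def compact(slots):
    """Slide live entries toward the beginning, dropping empty slots."""
    slots[:] = [n for n in slots if n is not None]


def move_first_to_end(slots):
    """Re-order: move the first entry into a new slot at the end."""
    slots.append(slots[0])
    slots[0] = None


# Compaction behind the scanner's back: "b" slides to a slot the
# scanner has already passed, so it is never reported.
missed = scan(["a", None, "b", "c", "d"], compact, mutate_after=2)
print(missed)

# Re-ordering behind the scanner's back: "a" was already reported,
# then reappears in a slot the scanner has not reached yet.
doubled = scan(["a", "b", "c"], move_first_to_end, mutate_after=1)
print(doubled)
```

In this model the first scan never reports "b" even though it exists the
whole time, and the second scan reports "a" twice.  Running the same scans
without the mutate argument returns each name exactly once.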
-Brian
On Sep 16, 11:58am, Guenther Grau wrote:
} Subject: Re: Defragmenting a NetBSD drive
} Hi Robert,
} 
} Robert Elz wrote:
} >   | Hmm, why does the compaction happen at creation time, not when a
} >   | directory is deleted?
} > 
} > Obviously it does happen when a directory is deleted - the whole thing
} > is freed then ..  but I assume you mean when an entry is removed from
} > a directory.
} 
} Uh, right :-) I was actually thinking that the entry removed from the
} directory was a directory itself :-) But, you parsed my babbling
} correctly.
} 
} [...]
} > So, directory trimming is done by creat(), rather than by unlink().
} 
} Excellent description!
} 
} >   | Why are directory entries not compressed when a directory gets
} >   | deleted? The fs could move the entries around, couldn't it?
} > 
} > It could, but it isn't quite as easy as it seems - the kernel part
} > isn't too difficult, but the effects on applications which are scanning
} > the directory at the time need careful thought - and that tends to
} > end up suggesting that it is best to just leave the directories alone.
} 
} Hmm, I don't quite understand that. If an application is scanning a
} directory while an entry (yes, this time I got it right :-) is
} deleted from/added to it, the directory structure is changed behind
} the application. Why does it make a difference to the application if the
} creat() code does the compaction instead of the unlink() code?
} 
} > If some human decides to shuffle things around, that's OK, they can
} > then understand why some files might not have been properly found when
} > something else was scanning the directory - but having the kernel just
} > arbitrarily doing that because it feels like it might be useful probably
} > isn't such a wonderful idea.
} 
} Well not  but at unlink() time. :-) Right now it happens
} "arbitrarily" at creat() time ;-)
} 
} >   | I guess this is not done for performance reasons, right?
} > 
} > No.   It's the side effects.
} 
} I am not convinced yet :-)
} 
} Thank you very much for your excellent and detailed explanation!!!
} 
}   Guenther
>-- End of excerpt from Guenther Grau