Subject: Re: FFS journal
To: None <tech-kern@netbsd.org>
From: None <joerg@britannica.bec.de>
List: tech-kern
Date: 07/03/2006 19:23:41
On Sun, Jul 02, 2006 at 07:59:50PM +0400, Kirill Kuvaldin wrote:
> * The operations considered to be transactions are listed below:
>  - create a file;
>  - delete a file;
>  - rename a file (including deletion of the existing name);
>  - change the size of a file (growing or shrinking);
>  - ... (anything else?)

s/file/inode/

I think it would be useful to have a separation similar to softdep, by
distinguishing operations on the inode from operations on directories.
This would lead to the following set of transactions:
- allocate/alter/free inode
- create link and remove link
- move link
- allocate disk space

If full journaling is wanted:
- write data
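
As a rough illustration, the list above could map onto a set of record
types like the following. This is only a sketch: every identifier here
(jtxn_type, JTXN_*, jtxn_is_dir_op, jtxn_needs_full_journal) is made up
for this mail and does not exist in the FFS code.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical transaction types, one per operation listed above. */
enum jtxn_type {
	JTXN_INODE_ALLOC,
	JTXN_INODE_ALTER,
	JTXN_INODE_FREE,
	JTXN_LINK_CREATE,
	JTXN_LINK_REMOVE,
	JTXN_LINK_MOVE,
	JTXN_SPACE_ALLOC,
	JTXN_DATA_WRITE,	/* only used with full data journaling */
	JTXN_NTYPES		/* number of types, not a real transaction */
};

/* True if the transaction changes a directory rather than an inode. */
static bool
jtxn_is_dir_op(enum jtxn_type t)
{
	assert(t < JTXN_NTYPES);
	return t == JTXN_LINK_CREATE || t == JTXN_LINK_REMOVE ||
	    t == JTXN_LINK_MOVE;
}

/* True if the transaction only appears when file data is journaled too. */
static bool
jtxn_needs_full_journal(enum jtxn_type t)
{
	assert(t < JTXN_NTYPES);
	return t == JTXN_DATA_WRITE;
}
```

Keeping the link operations separate from the inode operations is what
mirrors the inode/directory split mentioned above.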

The separation has the advantage of simplifying the internal reference
counting by dropping the special case of orphaned files. You should pay
special attention to the semantics of move/rename (atomic!) and to the
allocation of disk space in sparse files. The latter is special since the
old content has to be zeroed out first; otherwise security issues can
arise. Full data journaling reduces this problem a bit.
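
To make the sparse file case concrete, here is a toy in-memory model of
the ordering I mean: wipe the newly allocated block first, and only then
make it reachable from the file. The structures and names (toy_disk,
toy_file, toy_fill_hole) are invented for illustration. A real
implementation would also have to make sure the zeroes are on stable
storage before the block pointer update is committed.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define BLKSIZE	16
#define NBLOCKS	4
#define HOLE	(-1)

/* Toy "disk": a few small blocks that may hold stale data. */
struct toy_disk {
	uint8_t blk[NBLOCKS][BLKSIZE];
};

/* Toy file: logical block -> physical block, HOLE if unallocated. */
struct toy_file {
	int map[NBLOCKS];
};

/* Back the hole at logical block lbn with physical block pbn. */
static void
toy_fill_hole(struct toy_disk *d, struct toy_file *f, int lbn, int pbn)
{
	assert(lbn >= 0 && lbn < NBLOCKS);
	assert(pbn >= 0 && pbn < NBLOCKS);
	assert(f->map[lbn] == HOLE);

	/* Step 1: wipe whatever the previous owner left in the block. */
	memset(d->blk[pbn], 0, BLKSIZE);

	/* Step 2: only now make the block visible through the file. */
	f->map[lbn] = pbn;
}
```

If the two steps were swapped and a crash happened in between, a reader
of the file could see the previous owner's data.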

> * The biggest bottleneck of journaling mechanism is that all transactions
>  write to a single journal. A single journal effectively forces the
>  filesystem into a single-threaded model for updates. This is a serious
>  disadvantage when the concurrent semantics is required. Due to the time
>  constraints to a summer of code projects the support for multiple
>  journals will not be implemented under the scope of this project.

Actually, I wouldn't concern myself with journal concurrency for quite
a while. Write operations are typically IO bound anyway and
serialisation in the journal should be relatively cheap, even with a
coarse-grained per-filesystem lock.
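
A minimal userland sketch of what I mean by a per-filesystem lock: one
mutex per journal, held only while a record is copied into the log
buffer. The names (toy_journal, toy_journal_append) and the fixed-size
buffer are assumptions for the example; a pthread mutex stands in for
whatever kernel lock would really be used.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>
#include <string.h>

#define JLOG_SIZE 256

/* One journal per filesystem, protected by a single lock. */
struct toy_journal {
	pthread_mutex_t lock;
	size_t head;		/* next free byte in buf */
	uint64_t next_seq;	/* sequence number for the next record */
	uint8_t buf[JLOG_SIZE];
};

static void
toy_journal_init(struct toy_journal *j)
{
	pthread_mutex_init(&j->lock, NULL);
	j->head = 0;
	j->next_seq = 1;
}

/*
 * Append one record. Returns its sequence number, or 0 if the record
 * does not fit in the remaining log space.
 */
static uint64_t
toy_journal_append(struct toy_journal *j, const void *rec, size_t len)
{
	uint64_t seq = 0;

	assert(rec != NULL);
	pthread_mutex_lock(&j->lock);
	if (len <= JLOG_SIZE - j->head) {
		memcpy(j->buf + j->head, rec, len);
		j->head += len;
		seq = j->next_seq++;
	}
	pthread_mutex_unlock(&j->lock);
	return seq;
}
```

The critical section is just a memcpy and two counter updates, which is
why I expect the lock itself to be cheap next to the disk IO.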

Joerg

> * The set of unit tests (expected to deliver by 13th August):
>   - to ensure that code works correctly;
>   - to reveal the possible bugs;
>   - to investigate code behaviour under some sort of boundary cases;
>   - to analyze the filesystem performance.

Does this include support in fsck?

Joerg