tech-kern archive

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index][Old Index]

[GSoC 2012 idea] Ext3 HTree directory indexing



Hello,

As I understand, implementing Ext3 is quite a complex task.
Currently I'm interested in implementing the HTree directory indexing.

In current ext2fs implementation directories are treated as simple
linked lists [1].
When the number of items in a directory grows
time required to manipulate this directory increases quadratically.

Ext3 introduces directory indexing where a tree-like structure called an Htree
is used to quickly access directory entries using file name hash values
(currently computed using half-md4 algorithm).

With this entry lookup transforms from linear scan to binary search
and the required time grows close to linearly instead of O(n^2) [2,
see Performance section].

This is backwards-compatible with the code not supporting dir_index feature
so if an ext3 partition with indexed directories is mounted in NetBSD
using mount_ext2fs
lookup is performed using linear scan.

A red-black tree in linux implementation is used to hold on-disk htree
in memory.

As I understand routines in ext2fs_lookup.c are main candidates for
being modified.

I wonder what do you think about this project
and whether it's worth writing a more detailed proposal.

References:
[1] http://ext2.sourceforge.net/2005-ols/paper-html/node3.html
[2] https://lkml.org/lkml/2001/2/20/114


Home | Main Index | Thread Index | Old Index