tech-kern archive

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index][Old Index]

Re: [GSoC 2012 idea] Ext3 HTree directory indexing



In article 
<CABpMX-ZG=ZQMBT7aE2DJCiERhu=LL7XX-QYMhpd56N=vFB990Q%mail.gmail.com@localhost>,
Vyacheslav Matyushin  <v.matyushin%gmail.com@localhost> wrote:
>Hello,
>
>As I understand, implementing Ext3 is quite a complex task.
>Currently I'm interested in implementing the HTree directory indexing.
>
>In current ext2fs implementation directories are treated as simple
>linked lists [1].
>When the number of items in a directory grows
>time required to manipulate this directory increases quadratically.
>
>Ext3 introduces directory indexing where a tree-like structure called an Htree
>is used to quickly access directory entries using file name hash values
>(currently computed using half-md4 algorithm).
>
>With this entry lookup transforms from linear scan to binary search
>and the required time grows close to linearly instead of O(n^2) [2,
>see Performance section].
>
>This is backwards-compatible with the code not supporting dir_index feature
>so if an ext3 partition with indexed directories is mounted in NetBSD
>using mount_ext2fs
>lookup is performed using linear scan.
>
>A red-black tree in linux implementation is used to hold on-disk htree
>in memory.
>
>As I understand routines in ext2fs_lookup.c are main candidates for
>being modified.
>
>I wonder what do you think about this project
>and whether it's worth writing a more detailed proposal.
>
>References:
>[1] http://ext2.sourceforge.net/2005-ols/paper-html/node3.html
>[2] https://lkml.org/lkml/2001/2/20/114


That sounds like a nice SoC project...

christos




Home | Main Index | Thread Index | Old Index