tech-kern archive


Re: Max. number of subdirectories dump



On 2013-08-19 15:56, Manuel Wiesinger wrote:
> On 08/19/13 09:31, David Laight wrote:
>> For defrag I'd have thought you'd work from the inode table and treat
>> directories no differently from files.
>
> That's what I'm doing.

> I have an additional optimisation step, which tries to place files from
> the same directory contiguously on disk. This can improve performance
> when a whole directory is read. That's why I iterate over the
> directories. See my mailing list posting and status update for more
> details.

Hmm. Interesting, and potentially helpful. But I wonder whether ordering based on file access times, instead of directory locality, would work even better. I confess I have not studied the topic in enough detail. Is there any research on which strategies work best for placing related files near each other in a file system? Is a common directory the best indicator of where to place files, or is last access time a better one? And what about the directories themselves?
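
For concreteness, here is a minimal userland sketch of the grouping step Manuel describes: walk one directory and collect the inodes of its regular files as candidates for contiguous relocation. All names here are illustrative; this is not the GSoC code, just the shape of the idea.

        #include <sys/stat.h>
        #include <dirent.h>
        #include <fcntl.h>
        #include <stdio.h>
        #include <stdlib.h>

        int
        main(int argc, char *argv[])
        {
                DIR *dp;
                struct dirent *de;
                struct stat st;

                if (argc != 2 || (dp = opendir(argv[1])) == NULL) {
                        fprintf(stderr, "usage: %s directory\n", argv[0]);
                        return EXIT_FAILURE;
                }
                /*
                 * Every regular file in this directory is a candidate for
                 * contiguous relocation; a real tool would hand the inode
                 * numbers to the defragmenter instead of printing them.
                 */
                while ((de = readdir(dp)) != NULL) {
                        if (fstatat(dirfd(dp), de->d_name, &st,
                            AT_SYMLINK_NOFOLLOW) == -1)
                                continue;
                        if (S_ISREG(st.st_mode))
                                printf("%llu\t%s\n",
                                    (unsigned long long)st.st_ino, de->d_name);
                }
                closedir(dp);
                return EXIT_SUCCESS;
        }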

>> It might be worth rewriting directories in order to remove gaps and
>> possibly put subdirectories first (but you really want the most
>> frequently used entries first).
> That is exactly my idea. Good point about putting the most recently
> used entries first, but it's hard to find out which files those are.
> I'm thinking of storing the directory entries first, then the files,
> but I'm not sure this is the best approach.

> Anyhow, this step is not an official goal, so I'm not focusing on it
> now. When time allows I will implement it, maybe even after GSoC.
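
As a rough sketch of the rewrite David suggests, one could compact the deleted slots out of a directory and sort subdirectory entries to the front. The fixed-size struct below is a simplified stand-in for FFS's variable-length struct direct, chosen only to keep the illustration short:

        #include <dirent.h>             /* DT_DIR */
        #include <stdint.h>
        #include <stdlib.h>

        struct dent {                   /* illustrative stand-in, not on-disk */
                uint32_t ino;           /* 0 marks a deleted entry, i.e. a gap */
                uint8_t  type;          /* DT_DIR, DT_REG, ... */
                char     name[256];
        };

        /* subdirectory entries sort ahead of everything else */
        static int
        dirs_first(const void *a, const void *b)
        {
                const struct dent *x = a, *y = b;

                return (y->type == DT_DIR) - (x->type == DT_DIR);
        }

        /* drop the gaps left by unlink(2), then put subdirectories first */
        static size_t
        rewrite_dir(struct dent *e, size_t n)
        {
                size_t i, live = 0;

                for (i = 0; i < n; i++)
                        if (e[i].ino != 0)
                                e[live++] = e[i];
                qsort(e, live, sizeof(*e), dirs_first);
                return live;
        }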

Not entirely perfect perhaps, but you can look at the atime of all entries in the directory and sort them on that...
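
Something like the following, as a hedged userland sketch (one stat(2) per entry; all names are made up for illustration):

        #include <sys/stat.h>
        #include <dirent.h>
        #include <fcntl.h>
        #include <stdlib.h>
        #include <string.h>
        #include <time.h>

        struct ent {
                char    name[256];
                time_t  atime;
        };

        /* most recently accessed entries first */
        static int
        by_atime(const void *a, const void *b)
        {
                const struct ent *x = a, *y = b;

                return (y->atime > x->atime) - (y->atime < x->atime);
        }

        /* stat every entry of dir, fill e[], return the count, sorted */
        static size_t
        scan_by_atime(const char *dir, struct ent *e, size_t max)
        {
                DIR *dp;
                struct dirent *de;
                struct stat st;
                size_t n = 0;

                if ((dp = opendir(dir)) == NULL)
                        return 0;
                while (n < max && (de = readdir(dp)) != NULL) {
                        if (fstatat(dirfd(dp), de->d_name, &st,
                            AT_SYMLINK_NOFOLLOW) == -1)
                                continue;
                        strlcpy(e[n].name, de->d_name, sizeof(e[n].name));
                        e[n].atime = st.st_atime;
                        n++;
                }
                closedir(dp);
                qsort(e, n, sizeof(*e), by_atime);
                return n;
        }

The by_atime() comparator could equally replace dirs_first() in the earlier sketch, so gap removal and most-recently-used ordering happen in a single pass.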

        Johnny


