tech-kern archive



On Sun, 13 Nov 2011 23:08:30 +0000
David Holland <> wrote:

> I was recently talking to some people who'd been working with some
> (physicists, I think) doing data-intensive simulation of some kind,
> and that reminded me: for various reasons, many people who are doing
> serious data collection or simulation tend to encode vast amounts of
> metadata in the names of their data files. Arguably this is a bad way
> of doing things, but there are reasons for it and not so many clear
> alternatives... anyway, 256 character filenames often aren't enough in
> that context.

It's only my opinion, but they really should be using multiple files, or
a database for the metadata with a "link" to the actual data file where
necessary.
But I also tend to think the same of software relying on extended
attributes, resource forks and the like (with the possible exception of
a specialized facility for extended permissions :)

> (This sort of usage also often involves things like 50,000 files in
> one directory, so the columnizing behavior of ls is far from the top
> of the list of relevant issues.)

This reminds me, does anyone know the current state of UFS_DIRHASH?  I
remember reading about some issues with it and ended up disabling it in
my kernels, yet huge directories can occur in a number of scenarios
(probably a more pressing issue than extending file names).

>  > The 255 limit was just because that's how many bytes a one byte length
>  > field permitted, not because anyone thought names that long made sense.
>  > But if you're going to increase it, why  stop at 511?  That number
>  > means nothing - the next logical limit would be 65535 wouldn't it?
> Well... yes but there are other considerations. As you noted, going
> past one physical sector is problematic; going past one filesystem
> block very problematic. Plus, as long as MMU pages remain 4K,
> allocating contiguous kernel virtual space for path buffers (since if
> NAME_MAX were raised to 64K, PATH_MAX would have to be at least that
> large) could start to be a problem.

I agree, especially with all the software that allocates path/file name
buffers on the stack (and even on the heap, 64KB buffers could be a
general waste of memory, on top of the memory management performance
issues).
