tech-userlevel archive


Re: Encoding non-alphanumeric characters in manpage filenames



    Date:        Tue, 9 Nov 2021 10:00:56 +0200
    From:        Lassi Kortela <lassi%lassi.io@localhost>
    Message-ID:  <2e0142a3-dd7d-51d4-5cfd-9bdcd669bce5%lassi.io@localhost>


  | AFAICT it mandates that the Platonic Form of most (but not all) ASCII 
  | characters be present. Doesn't mandate ASCII codepoints for them.

No, not ASCII codepoints, but it does mandate Unicode ones.   Conveniently,
for the characters specified, the two are the same.

[Aside: it can be irritating that to actually get the meaning of almost
anything in POSIX you need to have read and remembered almost all of the
almost 4000 pages, and certainly all of the first 200, plus the first
dozen or so of whichever of XSH or XCU is of immediate interest.]

  | AFAICT the POSIX term "filename" means one pathname component.

Correct.

  | IMHO the best solution is to try two variants of the filename:

I wouldn't bother.   Simply define an encoding scheme where none of
the existing man page names can possibly be interpreted as encoded,
and then use an encoding method which only encodes when it needs to.

And note: there is no need here to pick a printable char to be the
"encoded value follows" character. We could use something which is
relatively easy to type but is a control char, like ^A or ^B, to
indicate that an encoded char value follows.

I doubt that there are any existing man pages with control chars in
their names, so this would be safe. So, perhaps, would something like
'#', except that it would need to be quoted in shells if it happened to
appear as the first octet of an encoded filename. That is one reason why
% and = are so popular for this kind of thing: they are magic almost
nowhere, and are also rarely used in names.

And in addition to man, provide a small tool which takes an arbitrary
name and makes its encoded form, so if you need to edit a man page
with an arbitrary name, you just do

	vi $( man_filename my-exotic-name )

and you're done - no need to ever type anything encoded.   Should this
become popular enough, an option to ls could decode the encoded forms
upon request.
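
For illustration only, here is a minimal sketch of what such a tool might
look like as a POSIX sh function. Nothing in this thread fixes the actual
scheme, so everything specific here is an assumption: the escape character
('%' rather than a control char), the two-hex-digit form, and the set of
bytes left unencoded ([A-Za-z0-9._+-]). It only shows the "encode only when
needed" idea; names made entirely of safe bytes come out unchanged.

```sh
#!/bin/sh
# Hypothetical man_filename: print the encoded form of a page name.
# Assumed scheme (not specified in this thread): bytes matching
# [A-Za-z0-9._+-] pass through unchanged; every other byte, including
# '%' itself, becomes '%' followed by two uppercase hex digits.
man_filename() (
    LC_ALL=C            # classify bytes, not locale-dependent characters
    out=
    # od emits one two-digit hex token per input byte.
    for hex in $(printf '%s' "$1" | od -A n -v -t x1 | tr 'a-f' 'A-F'); do
        # Turn the hex token back into the byte via an octal escape.
        ch=$(printf "\\$(printf '%03o' "0x$hex")")
        case $ch in
            [A-Za-z0-9._+-]) out=$out$ch ;;
            *)               out=$out%$hex ;;
        esac
    done
    printf '%s\n' "$out"
)
```

With this sketch, `man_filename ls` prints `ls`, and `man_filename 'a/b'`
prints `a%2Fb`. Because '%' is itself always encoded, any existing page
name that already contains a literal '%' would need renaming under this
particular choice; a control char as the escape would avoid that.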

kre


