NetBSD-Bugs archive


Re: misc/46784: "pretty" dashes in manpages make copying from manpages inadvisable



The following reply was made to PR bin/46784; it has been noted by GNATS.

From: Valery Ushakov <uwe%stderr.spb.ru@localhost>
To: gnats-bugs%NetBSD.org@localhost
Cc: 
Subject: Re: misc/46784: "pretty" dashes in manpages make copying from
 manpages inadvisable
Date: Thu, 2 Jan 2020 18:36:30 +0300

 On Wed, Aug 08, 2012 at 07:20:00 +0000, spz%tucana.1st.de@localhost wrote:
 
 > .Fl needs to produce something that actually works.
 
 ASCII quotes are another victim of this (see e.g. the sh(1) page).
 
 It would be nice to fix this even if man now uses mandoc(1) by
 default.
 
 From external/gpl2/groff/dist/PROBLEMS
 
   * The UTF-8 output of grotty has strange characters for the minus, the
     hyphen, and the right quote.  Why?
 
   The used Unicode characters (U+2212 for the minus sign and U+2010 for
   the hyphen) are the correct ones, but many programs can't search them
   properly.  The same is true for the right quote (U+201D).  To map those
   characters back to the ASCII characters, insert the following code
   snippet into the `troffrc' configuration file:
 
   .if '\*[.T]'utf8' \{\
   .  char \- \N'45'
   .  char  - \N'45'
   .  char  ' \N'39'
   .\}
 
 Remapping the ascii quote like that messes with quoting macros,
 though.
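
 Where touching troffrc is not an option, the same mapping can be
 applied to the rendered text instead.  A minimal sketch, assuming the
 output is filtered after formatting; asciify() is a made-up name, not
 anything groff or mandoc ships.  The quoted text names U+201D for the
 right quote; the sketch assumes the single right quote U+2019 is what
 actually shows up, so adjust the table if your output differs.

```python
# Hypothetical post-processing filter; not part of groff or mandoc.
# Maps the typographic characters back to their ASCII counterparts.
_TO_ASCII = {
    0x2212: "-",  # MINUS SIGN (named in the quoted PROBLEMS text)
    0x2010: "-",  # HYPHEN (named in the quoted PROBLEMS text)
    0x2019: "'",  # RIGHT SINGLE QUOTATION MARK (assumption, see above)
}


def asciify(rendered: str) -> str:
    """Return rendered manpage text with ASCII dashes and quotes."""
    # str.translate does one pass over the text using the ordinal table.
    return rendered.translate(_TO_ASCII)
```

 This only cleans up what is displayed; it does not fix the page
 source, and it also flattens any minus that was meant typographically.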
 
 
 heirloom doctools roff is more considerate, see its .utf8conv command:
 
   ... "\-" remains the ASCII hyphen-minus character. This is because
   in manual pages, "\-" represents the ASCII option introduction
   character, and converting it to a UTF-8 minus character would make
   it impossible to copy-and-paste option descriptions.  Similar
   considerations apply to ` ' vs. \` \'.  The former are typographic
   single quotes, while the latter are commonly used for the ASCII
   syntax quotes in manual pages.
 
 I'm not sure about \' as the heirloom doctools nroff I have handy
 interprets \' as an acute accent, as groff does too.  But to get an
 ASCII quote (from both groff and heirloom) you can use \(aq, which is
 a bit annoying, but at least the possibility is there.
 
 As for \-, groff is really determined to prettify it, so there's no
 choice but to remap it.
 
 To sum it up:
 
 '   \(aq - ascii single quote
 `   \`   - backquote
 -   \-   - ascii hyphen-minus (with the above hack for groff)
 
 There's still \(mi for the Unicode minus, but in the manual page
 context we probably still want the ASCII minus for things like \-1.
 If need be (e.g. in some math formula) the fancy minus is still
 accessible.
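
 For someone converting literal command lines into page source, the
 table above can be applied mechanically.  A rough sketch only;
 roff_escape() is a hypothetical helper, and mapping the backslash to
 \e is my addition (it is not in the table) so that literal
 backslashes survive.

```python
# Hypothetical helper: escape literal text for use in roff page source.
_ROFF_ESCAPES = str.maketrans({
    "\\": "\\e",    # literal backslash (addition, not in the table)
    "'": "\\(aq",   # ASCII single quote
    "`": "\\`",     # backquote
    "-": "\\-",     # ASCII hyphen-minus
})


def roff_escape(literal: str) -> str:
    """Return literal text with quotes and dashes written as escapes."""
    # translate() is a single pass, so the backslashes introduced by
    # one replacement are never escaped a second time.
    return literal.translate(_ROFF_ESCAPES)
```

 For example, roff_escape("ls -l 'a b'") gives ls \-l \(aqa b\(aq.
 With groff, \- still needs the remapping hack above to come out as
 ASCII.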
 
 -uwe
 

