Re: Re: Heirloom Troff for NetBSD (was: Removing ARCNET stuffs)

To: tech-userlevel%netbsd.org@localhost
Subject: Re: Re: Heirloom Troff for NetBSD (was: Removing ARCNET stuffs)
From: Valery Ushakov <uwe%stderr.spb.ru@localhost>
Date: Tue, 9 Jun 2015 21:27:28 +0300

On Tue, Jun 09, 2015 at 18:04:52 +0000, David Holland wrote:

> On Tue, Jun 09, 2015 at 04:51:09PM +0200, carsten.kunze%arcor.de@localhost wrote:
>  > "James K. Lowden" <jklowden%schemamania.org@localhost> wrote:
>  > 
>  > > I didn't know Heirloom Troff produces PDFs directly, a feature I think
>  > > is very valuable.  In that case, is there any reason not to substitute
>  > > it for groff, apart from some engineering work and possibly some
>  > > touch-up to a few documents?  
>  > 
>  > It does produce PDF directly, if you pipe the output of dpost(1) to
>  > e.g. "ps2pdf - <PDF filename>" :-)
>  > 
>  > But it can put additional information for the PDF generation into
>  > the PS document with the pdfmark operator.  This way
>  > e.g. hyperlinks and also bookmarks can be included (the table of
>  > contents that PDF viewers show on the left side).  So the result
>  > has the features which had been addressed in this thread.  I
>  > usually don't generate a PS document (but use the pipe to ps2pdf)
>  > because of the size of PS files.
>  > 
>  > Drawback is that ghostscript has a GNU license.
> 
> Importing ghostscript into base to get ps2pdf is not an option. (And
> if it were, by far the path of least resistance would be to do that
> and keep the current groff.)

I spent a bit of time last year looking at writing a gropdf backend
using http://libharu.org/ which is ZLib-licensed.

IIRC, it was mostly trivial with one nasty exception: the C <name>
command, which emits a named character at the current point.  The
problem is that C does NOT change the current position.  And in PDF you can't
save/restore it, so to be able to move back to where C started you
must know the width of that character, so suddenly your postprocessor
needs to know about gory details of font metrics.  I checked gropdf
that comes with newer groff and it does indeed read the metrics, which
is rather gross.  For my quick-and-dirty proof of concept I hacked
libharu to accept negative scaling and did:

    $page->ShowText($text);

    $page->SetHorizontalScalling(-100); # backwards
    $page->SetTextRenderingMode(PDF::Haru::HPDF_INVISIBLE);
    $page->ShowText($text);
    $page->SetTextRenderingMode(PDF::Haru::HPDF_FILL);
    $page->SetHorizontalScalling(100);

which gives the right visual result, but messes up things like
text-extraction.

Obviously, groff does know the width and, from a quick look at the
sources, it should be trivial to emit it as part of C command, but
that would be a change in groff_out(5) format.
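
To illustrate why having the width in the intermediate output would
help, here is a minimal sketch of the bookkeeping a postprocessor
could do.  It is not my proof-of-concept script and it does not call
libharu.  The "C <name> <width>" form is a hypothetical extension
(stock groff emits only "C <name>"), and plan_moves and the "move" /
"show" operation strings are names made up for this example.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Plan PDF text moves for a tiny subset of groff_out(5): H, h and C.
# ASSUMPTION: "C <name> <width>" is a hypothetical extended form that
# carries the glyph's advance width; stock groff emits only "C <name>".
sub plan_moves {
    my @lines = @_;
    my $troff_x = 0;    # current point as troff sees it
    my $pdf_x   = 0;    # where the PDF text cursor actually is
    my @ops;

    for my $line (@lines) {
        if ($line =~ /^H\s*(\d+)$/) {          # absolute horizontal position
            $troff_x = $1;
        }
        elsif ($line =~ /^h\s*(\d+)$/) {       # relative move to the right
            $troff_x += $1;
        }
        elsif ($line =~ /^C\s*(\S+)\s+(\d+)$/) {
            my ($name, $width) = ($1, $2);
            # Re-sync the PDF cursor with troff's current point.
            if ($pdf_x != $troff_x) {
                push @ops, "move " . ($troff_x - $pdf_x);
                $pdf_x = $troff_x;
            }
            push @ops, "show $name";
            # Showing text advances the PDF cursor, but C leaves
            # troff's current point where it was.
            $pdf_x += $width;
        }
    }
    return @ops;
}

my @demo = plan_moves("H100", "C hy 30", "h50", "C bu 40");
print "$_\n" for @demo;
```

With the width supplied by the formatter, the cursor drift after each
C can be corrected by the next relative move, and the postprocessor
never has to open a font metrics file.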

So I'd say adding a PDF backend based on libharu should be sufficiently
easy, provided it's given a bit more info that the formatter already
does know.  The license seems suitable too.

My Perl script that handles only text was under 200 lines, not
counting maps for font names and character names.  I haven't looked at
the graphics part, but unless there are similar warts in the
intermediate format there shouldn't be many problems.

-uwe
