Subject: entities and mixed processing of the guide with SGML/XML tools and sed
To: None <netbsd-docs@NetBSD.org>
From: Klaus Heinz <k.heinz.apr.fuenf@onlinehome.de>
List: netbsd-docs
Date: 04/03/2005 13:28:09
Hi,

I have tracked down a problem with consecutive character entities being
collapsed into only one in the HTML files.

A sequence of

  &someentity;...something in between...&anotherentity;

resulted in only &someentity; in the HTML file, without "something in
between" or the other entities on that line.

Processing the XML file with such entities through "osx" (in
htdocs/share/mk/doc.docbook.xsl.mk) converted the entities to processing
instructions (PIs), due to the option -xsdata-as-pis,

  <?sdataEntity someentity [someentity ] ?>

and the sed expression in htdocs/share/mk/doc.docbook.xsl.mk used to
convert those PIs back to the usual character entities was a bit too
greedy (i.e., it swallowed everything up to the last "?>" on the line).
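
To illustrate the effect, here is a minimal sketch in Python rather than
sed. Both patterns are assumptions made up for this example, not the
expressions actually used in doc.docbook.xsl.mk; the entity names "nbsp"
and "copy" are placeholders as well.

```python
import re

# One line containing two PIs in the form produced by "osx -xsdata-as-pis".
line = ("<?sdataEntity nbsp [nbsp ] ?>...something in between..."
        "<?sdataEntity copy [copy ] ?>")

# Hypothetical greedy pattern: ".*" runs on to the LAST "?>" of the line.
GREEDY = re.compile(r"<\?sdataEntity (\w+) .*\?>")

# Hypothetical stricter pattern: match the bracketed part explicitly, so
# each match ends at the "?>" that closes its own PI.
STRICT = re.compile(r"<\?sdataEntity (\w+) \[[^\]]*\] \?>")


def pis_to_entities(pattern, text):
    """Replace every PI matched by pattern with the entity it names."""
    return pattern.sub(r"&\1;", text)


greedy_result = pis_to_entities(GREEDY, line)
strict_result = pis_to_entities(STRICT, line)

print(greedy_result)  # &nbsp;
print(strict_result)  # &nbsp;...something in between...&copy;
```

The greedy variant reproduces the symptom described above: one match
covers the whole line, so only the first entity survives.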

I now use a more specific "sed" expression, but I wondered whether it
would be possible to avoid "sed" altogether and use the XSLT processor
to convert the PIs to the entities we want.

I was able to select the PIs in my stylesheet, but the XSLT processor
always creates "&amp;someentity;" instead of "&someentity;" in the HTML
file.
Am I trying to do the impossible?
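
The escaping itself is easy to reproduce outside of XSLT. The following
is only an analogy using Python's standard XML serializer, not the XSLT
processor from our toolchain, and the element name "p" is an arbitrary
choice: once "&someentity;" is ordinary text in the result tree, the
serializer has to escape the ampersand to keep the output well-formed.

```python
import xml.etree.ElementTree as ET

# Put the entity reference into the tree as plain character data, which is
# roughly what happens when a stylesheet emits it as text.
para = ET.Element("p")
para.text = "&someentity;"

# On output, the serializer escapes "&" in text nodes.
serialized = ET.tostring(para, encoding="unicode")
print(serialized)  # <p>&amp;someentity;</p>
```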

ciao
     Klaus