Subject: Re: adding an xml validate target
To: None <jschauma@netmeister.org>
From: Hiroki Sato <hrs@NetBSD.org>
List: netbsd-docs
Date: 04/24/2006 01:42:14
----Security_Multipart0(Mon_Apr_24_01_42_14_2006_974)--
Content-Type: Multipart/Mixed;
 boundary="--Next_Part(Mon_Apr_24_01_42_14_2006_292)--"
Content-Transfer-Encoding: 7bit

----Next_Part(Mon_Apr_24_01_42_14_2006_292)--
Content-Type: Text/Plain; charset=us-ascii
Content-Transfer-Encoding: 7bit

Jan Schaumann <jschauma@netmeister.org> wrote
  in <20060423155200.GD7542@netmeister.org>:

js> Hi,
js>
js> I'd like to add a general target 'make valid' or 'make checkxml' or
js> something along those lines that validates the XML of a given file
js> without actually generating any output.
js>
js> The options I've found are using tidy or xmllint.
js>
js> With tidy, I can do
js>
js> tidy -qe -xml <infile.xml>

 Unfortunately, tidy does not support DTD-based validation.

js> However, this will not resolve entities that are included from other
js> files (such as man-refs.ent and developers.ent).
js>
js> With xmllint, I have not yet found the right invocation, but I presume
js> it would be something like
js>
js> xmllint --noout --valid --dtdvalid .../share/xml/website-netbsd.dtd <infile>

 We need to use an XML catalog to resolve external entities such as the
 DTD.  More specifically, xmllint needs the options "--noout --nonet
 --valid --xinclude --catalogs" plus the same catalog URLs that are used
 when the site is built.  See the attached patch for an example.
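
 As a rough, self-contained sketch of the idea (not the actual htdocs
 setup), the script below builds a throwaway DTD, catalog, and document
 in a temporary directory and validates the document with the options
 listed above.  The example.invalid URL, the file names, and the "doc"
 DTD are all made up for illustration; only the xmllint options and the
 XML_CATALOG_FILES variable come from this thread.

```shell
#!/bin/sh
# Sketch: validate an XML file whose DTD URL exists only in a catalog.
# Every path, URL, and element name here is hypothetical.
set -eu

work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT

# A tiny DTD standing in for the real one.
cat > "$work/doc.dtd" <<'EOF'
<!ELEMENT doc (para+)>
<!ELEMENT para (#PCDATA)>
EOF

# The catalog maps a URL that is never fetched to the local DTD file.
cat > "$work/catalog.xml" <<EOF
<?xml version="1.0"?>
<catalog xmlns="urn:oasis:names:tc:entity:xmlns:xml:catalog">
  <system systemId="http://example.invalid/lang/doc.dtd"
          uri="file://$work/doc.dtd"/>
</catalog>
EOF

# The document refers to the DTD only by that URL.
cat > "$work/in.xml" <<'EOF'
<?xml version="1.0"?>
<!DOCTYPE doc SYSTEM "http://example.invalid/lang/doc.dtd">
<doc><para>hello</para></doc>
EOF

# libxml2 reads the catalog list from this environment variable.
XML_CATALOG_FILES="file://$work/catalog.xml"
export XML_CATALOG_FILES

if command -v xmllint >/dev/null 2>&1; then
    # --nonet forbids network access, so the DTD must come from the catalog.
    if xmllint --noout --nonet --valid --xinclude --catalogs "$work/in.xml"; then
        result=valid
    else
        result=invalid
    fi
else
    result=skipped    # xmllint is not installed
fi
echo "result: $result"
```

 The point of the sketch is that the URL in the DOCTYPE never has to
 exist as long as a catalog entry maps it to a local file.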

 ...and sorry, I still do not have enough time to look at this closely
 right now.  From the beginning of May, when I will have some time, I am
 planning to implement several of the improvements discussed earlier on
 this list, including this one.

js> However, there are some problems:  no matter what I do, I always get a
js> warning "failed to load external entity ...", even if I pass "--nonet".
js>
js> While playing around with this, I notice that the
js> ./share/xml/website-netbsd.dtd and all XML files point the DTD to
js> http://www.NetBSD.org/XML/htdocs/lang/share/xml/website-netbsd.dtd
js>
js> This does simply not exist.  The path should be
js> http://www.NetBSD.org/share/xml/website-netbsd.dtd, no?

 No.  This URL intentionally does not correspond to any actual file,
 for i18n/l10n purposes; it is resolved through the XML catalogs instead.

 BTW, "make lint" with the attached patch currently generates a lot of
 errors, so please take care to understand correctly what each error
 means.
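
 Because the output can be long, a small filter may help to see which
 files produce messages at all.  The following is only a sketch: the
 function name summarize_lint is made up, and it assumes the
 "[xmllint] file", "--(begin)--", and "--(end)--" marker lines printed
 by the VALIDATE targets in the attached patch.  It counts the lines
 between the markers for each file and skips files with none.

```shell
#!/bin/sh
# summarize_lint (hypothetical helper): read "make lint" output on stdin
# and print "<count><TAB><file>" for every file that produced messages.
summarize_lint() {
    awk '
        # Marker line naming the file; the prefix "[xmllint] " is 10 chars.
        /^\[xmllint\] /      { file = substr($0, 11); next }
        /^--\(begin\)--$/    { inblock = 1; n = 0; next }
        /^--\(end\)--$/      { if (inblock && n > 0) printf "%d\t%s\n", n, file
                               inblock = 0; next }
        # Ignore the note make prints when a "-" command fails.
        /^\*\*\* Error code/ { next }
        inblock              { n++ }
    '
}
```

 A possible use would be "make -s lint 2>&1 | summarize_lint".  The -s
 option keeps make from echoing the xmllint command line between the
 markers, and 2>&1 is needed because xmllint reports problems on stderr.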

--
| Hiroki SATO

----Next_Part(Mon_Apr_24_01_42_14_2006_292)--
Content-Type: Text/Plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Content-Disposition: inline; filename="web.site.mk.diff"

Index: web.site.mk
===================================================================
RCS file: /cvsroot/htdocs/share/mk/web.site.mk,v
retrieving revision 1.45
diff -d -u -I\$FreeBSD:.*\$ -I\$NetBSD:.*\$ -I\$OpenBSD:.*\$ -I\$DragonFly:.*\$ -I\$Id:.*\$ -I\$hrs:.*\$ -r1.45 web.site.mk
--- web.site.mk	22 Oct 2005 13:10:33 -0000	1.45
+++ web.site.mk	23 Apr 2006 16:22:18 -0000
@@ -125,6 +125,11 @@
 XSLTPROCOPTS+= --nonet --catalogs
 .endif
 XSLTPROC=	env ${XSLTPROC_ENV} ${PREFIX}/bin/xsltproc
+
+XMLLINTOPTS+=	--xinclude --valid --noout
+.if defined(XML_CATALOG_FILES) && !empty(XML_CATALOG_FILES)
+XMLLINTOPTS+=	--nonet --catalogs
+.endif
 XMLLINT=	env ${XSLTPROC_ENV} ${PREFIX}/bin/xmllint

 XSLT.DEFAULT?=	docbook-website
@@ -197,9 +202,18 @@
 #. if !defined(NO_TIDY)
 #	-${TIDY} ${TIDYOPTS} ${.TARGET}
 #. endif
+
+VALIDATE_DOCS+=	VALIDATE.${_ID}
+VALIDATE.${_ID}:
+	@${ECHO} "[xmllint] ${XML.${_ID}}"
+	@${ECHO} "--(begin)--"
+	-${XMLLINT} ${XMLLINTOPTS} ${XML.${_ID}}
+	@${ECHO} "--(end)--"
 .  endfor
 .endfor

+lint: ${VALIDATE_DOCS}
+
 XML_CATALOG_FILES=	file://${WEB_PREFIX}/${DOCLANG}/share/xml/catalog.xml \
 			file://${WEB_PREFIX}/share/xml/catalog.xml \
 			file://${WEB_PREFIX}/share/xml/catalog-common.xml
@@ -325,6 +339,7 @@
 	@${HTML2TXT} ${HTML2TXTOPTS} ${.CURDIR}/${_entry} | ${ISPELL} ${ISPELLOPTS}
 .endfor

+
 #
 # Warn about anything in DOCS that has no translation
 #

----Next_Part(Mon_Apr_24_01_42_14_2006_292)----

----Security_Multipart0(Mon_Apr_24_01_42_14_2006_974)--
Content-Type: application/pgp-signature
Content-Transfer-Encoding: 7bit

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.2 (FreeBSD)

iD8DBQBES65mTyzT2CeTzy0RAh/zAKCqg0TZxrmCjBEobacttKXCwd6QTQCg2BX0
Y1Due/PyYgdC3UDBseB1FJo=
=o9W+
-----END PGP SIGNATURE-----

----Security_Multipart0(Mon_Apr_24_01_42_14_2006_974)----