Subject: pkg/4341: bsd.port.subdir.mk deficiencies.
To: None <gnats-bugs@gnats.netbsd.org>
From: None <cgd@NetBSD.ORG>
List: netbsd-bugs
Date: 10/25/1997 01:02:14
>Number:         4341
>Category:       pkg
>Synopsis:       bsd.port.subdir.mk deficiencies
>Confidential:   no
>Severity:       non-critical
>Priority:       low
>Responsible:    gnats-admin (GNATS administrator)
>State:          open
>Class:          change-request
>Submitter-Id:   net
>Arrival-Date:   Fri Oct 24 18:05:01 1997
>Last-Modified:
>Originator:     Chris G. Demetriou
>Organization:
Kernel Hackers 'r' Us
>Release:        NetBSD-current as of right now.
>Environment:
System: NetBSD brick.demetriou.com 1.3_ALPHA NetBSD 1.3_ALPHA (BRICK) #1: Wed Oct 22 23:26:01 PDT 1997 cgd@brick.demetriou.com:/usr/src/sys/arch/i386/compile/BRICK i386


>Description:
	bsd.port.subdir.mk uses control characters unnecessarily.
	This makes the file more annoying to read, edit, and cut&paste,
	and is bad form for .mk templates in general.

	Also, bsd.port.subdir.mk doesn't necessarily output valid
	HTML for the README.html (in particular, it doesn't turn
	HTML metacharacters into the proper HTML entities).
	
>How-To-Repeat:

	Control characters:

	Take a look at the cat ${README} | sed ... expressions in the
	README.html rule.  At first glance, it might seem that this use
	of control characters is, while perhaps not necessary, somewhat
	useful, because it helps avoid complexity in escaping characters
	which may show up in file names.

	Note that the use of control-B is completely unnecessary.
	There's just no reason to use a control character here;
	any character not otherwise used in the substitute expression
	(e.g. a comma) would work just as well.

	The use of control-A was seemingly intended to avoid characters
	used in path names.  However, there's no guarantee that a
	control-A won't appear in a pathname, and there _is_ one
	character which is guaranteed to not appear, in this case: /.
	The 'internal' sed command (the one currently using control-B's 8-)
	has the purpose of producing the component name of the
	current directory, which cannot contain any slashes!  Therefore,
	it's 'safe' to use / as the delimiter character in this case.
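
	As a rough illustration of the delimiter choice (a sketch only:
	the directory, the @NAME@ placeholder, and the template line are
	hypothetical and are not the actual expressions from
	bsd.port.subdir.mk), the two nested substitutions might look
	like this with printable delimiters:

```shell
# Hypothetical stand-in for the current directory (an assumption for
# illustration, not a path taken from bsd.port.subdir.mk).
dir=/tmp/example/editors

# Inner command: strip everything up to and including the last slash.
# The pattern itself contains a slash, so a comma serves as the delimiter.
name=$(printf '%s\n' "$dir" | sed -e 's,.*/,,')

# Outer command: substitute the component name into a template line.
# A single path component cannot contain a slash, so / is a safe delimiter.
line=$(printf '%s\n' 'Category: @NAME@' | sed -e "s/@NAME@/$name/")

printf '%s\n' "$line"
```

	Note that & and backslash remain special in sed replacement
	text whatever delimiter is chosen, so this only addresses the
	delimiter question.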

	Invalid HTML:

	Look at the same set of sed commands.  Think of what
	happens if the output contains HTML metacharacters (e.g. < and >),
	perhaps because the pkg DESCR contains those characters or for
	other reasons.  Note that they'll end up in the output HTML
	file as-is (rather than being converted into entities, e.g.
	&lt; and &gt;), and that therefore the resulting file will
	in all likelihood be invalid HTML (or, if it's valid HTML, it
	probably doesn't contain what was intended).

>Fix:
	Control characters:

	Replace the control characters used with different characters.
	I'd suggest replacing the control-As with slashes, and the
	control-Bs with commas.

	HTML:

	Process the HTML as appropriate, to convert metacharacters
	to the proper HTML entities.  This is probably best done by
	processing the data as it's being put into the file, rather
	than afterward, so that metacharacters which really are supposed
	to be there don't have to be handled specially.
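
	A minimal sketch of escaping the data on its way in, assuming
	a sed filter is acceptable in the rule.  The html_escape name,
	the sample description text, and the <p> wrapper are all
	hypothetical; only the inserted data is filtered, so markup
	that belongs to the template is left alone:

```shell
# Hypothetical helper: convert HTML metacharacters read from stdin into
# entities. The & substitution must come first, otherwise the ampersands
# introduced by &lt; and &gt; would themselves be escaped again.
html_escape() {
	sed -e 's/&/\&amp;/g' -e 's/</\&lt;/g' -e 's/>/\&gt;/g'
}

# Sample data standing in for text pulled from a package description.
descr='Reads <file> & writes to stdout'

# Escape only the data, then place it inside the template's own markup.
escaped=$(printf '%s\n' "$descr" | html_escape)
page="<p>$escaped</p>"

printf '%s\n' "$page"
```

	Here the template's <p> tags survive untouched, while the
	< > and & characters from the data are emitted as entities.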
>Audit-Trail:
>Unformatted: