Subject: Re: WWW query engine bug (was Query-PR)
To: None <Mike.Long@analog.com>
From: Mike Long <mike.long@analog.com>
List: current-users
Date: 02/22/1996 15:31:41
>Date: Wed, 21 Feb 96 11:16:20 EST
>From: Mike Long <Mike.Long@spd.analog.com>
>
>>From: Chris_G_Demetriou@NIAGARA.NECTAR.CS.CMU.EDU
>>Date: Tue, 20 Feb 96 12:59:12 EST
>
>>It's nice to allow people to include html, e.g. anchors to their home
>>pages' URLs, in PRs, and have them come out as real html on the Web.
>>In fact, i went to a great deal of work to fix things so that only
>>valid tags were allowed, etc...
>>
>>But the question is: how can you tell 'intentional' html from
>>something that just looks like HTML?  (and, what impact does that have
>>on the software used to spit out PRs?)
>
>You may need to restrict the tags you support; for instance, only
>support links (<a href="URL">myurl</a>).  It's a lot harder for that
>long a sequence to show up randomly than a simple <i>.

After reading der Mouse's messages and thinking about it some more, I
think the best option is to just transform URLs into links and not
try to interpret HTML at all.  Recognizing URLs is much easier than
parsing general HTML, and it still gives you the behavior you want:
links to people's home pages come out clickable on the Web.
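
To make the idea concrete, here is a rough sketch in Python (not what
the query engine actually uses).  The function name linkify and the
URL pattern are my own assumptions for illustration: everything in the
PR text is escaped, and only strings that look like URLs become
anchors.

```python
import html
import re

# Assumed pattern: an http, https, or ftp URL runs until whitespace,
# an angle bracket, or a double quote.  A real deployment would
# probably want a stricter pattern.
URL_RE = re.compile(r"(?:https?|ftp)://[^\s<>\"]+")

def linkify(text: str) -> str:
    """Escape all markup in text, then turn URL-like strings into links."""
    out = []
    pos = 0
    for m in URL_RE.finditer(text):
        # Trailing punctuation is usually part of the sentence, not the URL.
        url = m.group(0).rstrip(".,;:!?)")
        end = m.start() + len(url)
        # Text before the URL is escaped, so a stray <i> stays literal.
        out.append(html.escape(text[pos:m.start()]))
        esc = html.escape(url)
        out.append(f'<a href="{esc}">{esc}</a>')
        pos = end
    # Escape whatever follows the last URL.
    out.append(html.escape(text[pos:]))
    return "".join(out)
```

With this, a PR line such as "see http://example.com/x." gets a link
around the URL only, and anything that merely looks like a tag (a bare
<i>, say) is escaped and shows up as plain text.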
-- 
Mike Long <mike.long@analog.com>     <URL:http://www.shore.net/~mikel>
VLSI Design Engineer         finger mikel@shore.net for PGP public key
Analog Devices, CPD Division          CCBF225E7D3F7ECB2C8F7ABB15D9BE7B
Norwood, MA 02062 USA       (eq (opinion 'ADI) (opinion 'mike)) -> nil