Subject: CVS commit: pkgsrc/www/p5-HTML-Parser
To: None <>
From: Havard Eidnes <>
List: pkgsrc-changes
Date: 12/05/2004 18:38:58
Module Name:	pkgsrc
Committed By:	he
Date:		Sun Dec  5 18:38:58 UTC 2004

Modified Files:
	pkgsrc/www/p5-HTML-Parser: Makefile distinfo
Removed Files:
	pkgsrc/www/p5-HTML-Parser/patches: patch-aa

Log Message:
Update p5-HTML-Parser from version 3.35 to 3.42.
Change HOMEPAGE to author-independent link on

Change log:

2004-12-04   Gisle Aas <>

     Release 3.42

     Avoid sv_catpvn_utf8_upgrade() as that macro was not
     available in perl-5.8.0.
     Patch by Reed Russell <>.

     Add casts to suppress compilation warnings for char/U8 mismatches.

     HTML::HeadParser will always push new header values.
     This makes sure we never lose old header values.

2004-11-30   Gisle Aas <>

     Release 3.41

     Fix unresolved symbol error with perl-5.005.

2004-11-29   Gisle Aas <>

     Release 3.40

     Make utf8_mode only available on perl-5.8 or better.  It produced
     garbage with older versions of perl.

     Emit warning if entities are decoded and something in the first
     chunk looks like hibit UTF-8.  Previously this warning was only
     triggered for documents with BOM.

2004-11-23   Gisle Aas <>

     Release 3.39_92

     More documentation of the Unicode issues.  Moved around HTML::Parser
     documentation a bit.

     New boolean option; $p->utf8_mode to allow parsing of raw UTF-8.

     Documented that HTML::Entities::decode_entities() can take multiple
     arguments.

     Unterminated entities are now decoded in text (compatibility
     with MSIE misfeature).

     Document HTML::Entities::_decode_entities(); this variation of the
     decode_entities() function has been available for a long time, but
     has not been documented until now.

     HTML::Entities::_decode_entities() can now be told to try to
     expand unterminated entities.

     Simplified Makefile.PL

2004-11-23   Gisle Aas <>

     Release 3.39_91

     The HTML::HeadParser will skip Unicode BOM.  Previously it
     would consider the <head> section done when it saw the BOM.

     The parser will look for Unicode BOM and give appropriate
     warnings if the form found indicates trouble.

     If no matching end tag is found for <script>, <style>, <xmp>,
     <title>, <textarea> then generate one where the next tag starts.

     For <script> and <style> recognize quoted strings and don't
     consider the element ended if the corresponding end tag is found
     inside such a string.

2004-11-17   Gisle Aas <>

     Release 3.39_90

     The <title> element is now parsed in literal mode, which
     means that other tags are not recognized until </title> has
     been seen.

     Unicode support for perl-5.8 and better.

        Decoding Unicode entities always enabled; no longer a compile
        time option.

        Propagation of UTF8 state on strings.
        Patch contributed by John Gardiner Myers <>.

        Calculate offsets and lengths in chars for Unicode strings.

     Fixed link typo in the HTML::TokeParser documentation.

2004-11-11   Gisle Aas <>

     Release 3.38

     New boolean option; $p->closing_plaintext
     Contributed by Alex Kapranoff <>

2004-11-10   Gisle Aas <>

     Release 3.37

     Improved handling of HTML encoded surrogate pairs and illegally
     encoded Unicode; <>.
     Patch by John Gardiner Myers <>.

     Avoid generating bad UTF8 strings when decoding entities
     representing chars beyond #255 in 8-bit strings.  Such bad
     UTF8 sometimes made perl-5.8.5 and older segfault.

     Undocument v2 style subclassing in synopsis section.

     Internal cleanup:

        Make 'gcc -Wall' happier.

        Avoid modification of PVs during parsing of attrspec.
        Another patch by John Gardiner Myers.

2004-04-01   Gisle Aas <>

     Release 3.36

     Improved MSIE/Mozilla compatibility.  If the same attribute
     name repeats for a start tag, use the first value instead
     of the last.  Patch by Nick Duffek <>.

To generate a diff of this commit:
cvs rdiff -r1.26 -r1.27 pkgsrc/www/p5-HTML-Parser/Makefile
cvs rdiff -r1.8 -r1.9 pkgsrc/www/p5-HTML-Parser/distinfo
cvs rdiff -r1.2 -r0 pkgsrc/www/p5-HTML-Parser/patches/patch-aa

Please note that diffs are not public domain; they are subject to the
copyright notices on the relevant files.