CVS commit: pkgsrc/print/qpdf

To: pkgsrc-changes%NetBSD.org@localhost
Subject: CVS commit: pkgsrc/print/qpdf
From: "Ryo ONODERA" <ryoon%netbsd.org@localhost>
Date: Tue, 27 Feb 2018 12:37:20 +0000

Module Name:    pkgsrc
Committed By:   ryoon
Date:           Tue Feb 27 12:37:20 UTC 2018

Modified Files:
        pkgsrc/print/qpdf: Makefile PLIST buildlink3.mk distinfo

Log Message:
Update to 8.0.0

Changelog:
2018-02-25  Jay Berkenbilt  <ejb%ql.org@localhost>

        * 8.0.0: release

2018-02-17  Jay Berkenbilt  <ejb%ql.org@localhost>

        * Fix QPDFObjectHandle::getUTF8Val() to properly handle strings
        that are encoded with PDF Doc Encoding. Fixes #179.

        * Add qpdf_check_pdf to the "C" API. This method just attempts to
        read the entire file and produce no output, making it possible to
        assess whether the file has any errors that qpdf can detect.

        * Major enhancements to handling of type errors within the qpdf
        library. This fix is intended to eliminate those annoying cases
        where qpdf would exit with a message like "operation for
        dictionary object attempted on object of wrong type" without
        providing any context. Now qpdf keeps enough context to be able to
        issue a proper warning and to handle such conditions in a sensible
        way. This should greatly increase the number of bad files that
        qpdf can recover, and it should make it much easier to figure out
        what's broken when a file contains errors.
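
The idea of keeping context and warning instead of aborting can be
sketched with a toy model. This is not qpdf's implementation: the
ObjectHandle class, its description field, and the wording of the
warning are hypothetical names chosen only for illustration.

```python
import warnings
from typing import Any


class ObjectHandle:
    """Toy stand-in for a PDF object that remembers where it came from."""

    def __init__(self, value: Any, description: str) -> None:
        self.value = value
        self.description = description  # context used in warnings

    def get_key(self, key: str) -> "ObjectHandle":
        child_description = f"{self.description} -> key {key}"
        if not isinstance(self.value, dict):
            # Wrong type: warn with context and return a null object
            # instead of raising and stopping all processing.
            warnings.warn(
                f"{self.description}: operation for dictionary attempted on "
                f"object of type {type(self.value).__name__}; returning null"
            )
            return ObjectHandle(None, child_description)
        return ObjectHandle(self.value.get(key), child_description)


# A damaged page whose /Resources entry is an integer, not a dictionary.
page = ObjectHandle({"/Resources": 42}, "page object 3 0")
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    font = page.get_key("/Resources").get_key("/Font")
```

In this sketch the lookup finishes with a null object, and the single
recorded warning names the path that led to the bad object, so a caller
can keep going and still report where the damage is.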

        * Error message fix: replace "file position" with "offset" in
        error messages that report lexical or parsing errors. Sometimes
        it's an offset in an object stream or a content stream rather than
        a file position, so this makes the error message less confusing in
        those cases. It still requires some knowledge to find the exact
        position of the error, since when it's not a file offset, it's
        probably an offset into a stream after uncompressing it.

        * Error message fix: correct some cases in which the object that
        contained a lexical error was omitted from the error message.

        * Error message fix: improve file name in the error message when
        there is a parser error inside an object stream.

2018-02-11  Jay Berkenbilt  <ejb%ql.org@localhost>

        * Add QPDFObjectHandle::filterPageContents method to provide a
        different interface for applying token filters to page contents
        without modifying the ultimate output.

2018-02-04  Jay Berkenbilt  <ejb%ql.org@localhost>

        * Changes listed on today's date are numerous and reflect
        significant enhancements to qpdf's lexical layer. While many
        nuances are discussed and a handful of small bugs were fixed, it
        should be emphasized that none of these issues have any impact on
        any output or behavior of qpdf under "normal" operation. There are
        some changes that have an effect on content stream normalization
        as with qdf mode or on code that interacts with PDF files
        lexically using QPDFTokenizer. There are no incompatible changes
        for normal operation. There are a few changes that will affect the
        exact error messages issued on certain bad files, and there is a
        small non-compatible enhancement regarding the behavior of
        manually constructed QPDFTokenizer::Token objects. Users of the
        qpdf command line tool will see no changes other than the addition
        of a new command-line flag and possibly some improved error
        messages.

        * Significant lexer (tokenizer) enhancements. These are changes to
        the QPDFTokenizer class. These changes are of concern only to
        people who are operating with PDF files at the lexical layer using
        qpdf. They have little or no impact on most high-level interfaces
        or the command-line tool.

        New token types tt_space and tt_comment to recognize whitespace
        and comments. This makes it possible to tokenize a PDF file or
        stream and preserve everything about it.

        For backward compatibility, space and comment tokens are not
        returned by the tokenizer unless QPDFTokenizer::includeIgnorable()
        is called.

        Better handling of null bytes. These are now included in space
        tokens rather than being their own "tt_word" tokens. This should
        have no impact on any correct PDF file and has no impact on
        output, but it may change offsets in some error messages when
        trying to parse contents of bad files. Under default operation,
        qpdf does not attempt to parse content streams, so this change is
        mostly invisible.

        Bug fix to handling of bad tokens at ends of streams. Now, when
        allowEOF() has been called, these are treated as bad tokens
        (tt_bad or an exception, depending on invocation), and a
        separate tt_eof token is returned. Before, the bad token
        contents were returned as the value of a tt_eof token. tt_eof
        tokens are always empty now.

        Fix a bug that would, on rare occasions, report the offset in an
        error message in the wrong space because of spaces or comments
        adjacent to a bad token.

        Clarify in comments exactly where the input source is positioned
        surrounding calls to readToken and getToken.
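
The effect of the space and comment token types described above can be
illustrated with a small toy tokenizer. This is only a sketch, not
QPDFTokenizer: the tokenize function and its include_ignorable
parameter are hypothetical, and the model ignores PDF delimiters and
string literals. It does show the two points made above: ignorable
tokens are returned only on request, and null bytes are folded into
space tokens.

```python
import re
from typing import Iterator, NamedTuple


class Token(NamedTuple):
    type: str  # "tt_space", "tt_comment" or "tt_word" in this toy model
    raw: str


# PDF whitespace characters: NUL, tab, LF, FF, CR and space.
_TOKEN_RE = re.compile(
    r"(?P<tt_space>[\x00\t\n\x0c\r ]+)"
    r"|(?P<tt_comment>%[^\r\n]*)"
    r"|(?P<tt_word>[^\x00\t\n\x0c\r %]+)"
)


def tokenize(data: str, include_ignorable: bool = False) -> Iterator[Token]:
    """Yield tokens; space and comment tokens only when requested."""
    for match in _TOKEN_RE.finditer(data):
        kind = match.lastgroup
        if kind in ("tt_space", "tt_comment") and not include_ignorable:
            continue  # backward-compatible default: skip ignorable tokens
        yield Token(kind, match.group())


content = "q 1 0 0 1 72 720 cm % move origin\nBT\x00/F1 12 Tf"
words = [t.raw for t in tokenize(content)]
everything = list(tokenize(content, include_ignorable=True))
# With ignorable tokens included, the input can be rebuilt exactly.
rebuilt = "".join(t.raw for t in everything)
```

Because every character of the input belongs to exactly one token when
ignorable tokens are included, joining the raw token text reproduces
the original data.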

        * Add a new token type for inline images. This token type is only
        returned by QPDFTokenizer immediately following a call to
        expectInlineImage(). This change includes internal refactoring of
        a handful of places that all separately handled inline images. The
        logic of detecting inline images in content streams is now handled
        in one place in the code. Also we are more flexible about what
        characters may surround the EI operator that marks the end of an
        inline image.

        * New method QPDFObjectHandle::parsePageContents() to improve upon
        QPDFObjectHandle::parseContentStream(). The parseContentStream
        method used to operate on a single content stream, but was fixed
        to properly handle pages with contents split across multiple
        streams in an earlier release. The new method parsePageContents()
        can be called on the page object rather than the value of the
        page dictionary's /Contents key. This removes a few lines of
        boiler-plate code from any code that uses parseContentStream, and
        it also enables creation of more helpful error messages if
        problems are encountered as the error messages can include
        information about which page the streams come from.

        * Update content stream parsing example
        (examples/pdf-parse-content.cc) to use new
        QPDFObjectHandle::parsePageContents() method in favor of the older
        QPDFObjectHandle::parseContentStream() method.

        * Bug fix: change where the trailing newline is added to a stream
        in QDF mode when content normalization is enabled (the default for
        QDF mode). Before, the content normalizer ensured that the output
        ended with a trailing newline, but this had the undesired side
        effect of including the newline in the stream data for purposes of
        length computation. QPDFWriter already appends a newline without
        counting it in the length for better readability. Ordinarily this makes
        no difference, but in the rare case of a page's contents being
        split in the middle of a token, the old behavior could cause the
        extra newline to be interpreted as part of the token. This bug
        could only be triggered in qdf mode, which is a mode intended for
        manual inspection of PDF files' contents, so it is very unlikely
        to have caused any actual problems for people using qpdf for
        production use. Even if it did, it would be very unusual for a PDF
        file to actually be adversely affected by this issue.

        * Add support for coalescing a page's contents into a single
        stream if they are represented as an array of streams. This can be
        performed from the command line using the --coalesce-contents
        option. Coalescing content streams can simplify things for
        software that wants to operate on a page's content streams without
        having to handle weird edge cases like content streams split in
        the middle of tokens. Note that
        QPDFObjectHandle::parsePageContents and
        QPDFObjectHandle::parseContentStream already handled split content
        streams. This is mainly to set the stage for new methods of
        operating on page contents. The new method
        QPDFObjectHandle::pipeContentStreams will pipe all of a page's
        content streams through a single pipeline. The new method
        QPDFObjectHandle::coalesceContentStreams, when called on a page
        object, will do nothing if the page's contents are a single
        stream, but if they are an array of streams, it will replace the
        page's contents with a single stream whose contents are the
        concatenation of the original streams.
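
The coalescing rule described above can be modeled in a few lines. This
is a toy sketch that assumes a page is a plain dictionary whose
/Contents value is either a bytes object or a list of bytes objects;
the function name is a hypothetical Python analogue, not the qpdf API.

```python
from typing import Dict, List, Union

Contents = Union[bytes, List[bytes]]


def coalesce_content_streams(page: Dict[str, Contents]) -> None:
    """Replace an array of content streams with one concatenated stream."""
    contents = page.get("/Contents")
    if isinstance(contents, list):
        page["/Contents"] = b"".join(contents)
    # A single stream (or a missing key) is left untouched.


# The "Tf" operator is split across two streams in the middle of a token.
split_page: Dict[str, Contents] = {
    "/Contents": [b"BT /F1 12 T", b"f (Hi) Tj ET"]
}
# Tokenizing each stream on its own sees "T" and "f" as separate words.
per_stream_words = [w for s in split_page["/Contents"] for w in s.split()]

coalesce_content_streams(split_page)
coalesced_words = split_page["/Contents"].split()
```

After coalescing, software that looks at the page's single stream sees
the operator as one token and no longer has to handle the split.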

        * A few library routines throw exceptions if called on non-page
        objects. These constraints have been relaxed somewhat to make qpdf
        more tolerant of files whose page dictionaries are not properly
        marked as such. Mostly exceptions about page operations being
        called on non page objects will only be thrown in cases where the
        operation had no chance of succeeding anyway. This change has no
        impact on any default mode operations, but it could allow
        applications that use page-level APIs in QPDFObjectHandle to be
        more tolerant of certain types of damaged files.

        * Add QPDFObjectHandle::TokenFilter class and methods to use it to
        perform lexical filtering on content streams. You can call
        QPDFObjectHandle::addTokenFilter on a stream object, or you can call
        the higher level QPDFObjectHandle::addContentTokenFilter on a page
        object to cause the stream's contents to be passed through a token
        filter while being retrieved by QPDFWriter or any other consumer.
        For details on using TokenFilter, please see comments in
        QPDFObjectHandle.hh.
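
The general shape of a token filter can be sketched as follows. This is
a toy Python model, not the C++ interface documented in
QPDFObjectHandle.hh: the method names handle_token, handle_eof and
write_token, the RoundReals filter, and the apply_filter helper are
assumptions made for illustration, and tokens are plain strings.

```python
import re
from typing import Iterable, List


class TokenFilter:
    """Base class: receives tokens one at a time and writes output tokens."""

    def __init__(self) -> None:
        self.output: List[str] = []

    def handle_token(self, token: str) -> None:
        raise NotImplementedError

    def handle_eof(self) -> None:
        pass  # filters that buffer tokens would flush them here

    def write_token(self, token: str) -> None:
        self.output.append(token)


class RoundReals(TokenFilter):
    """Example filter: rewrite real numbers with two decimal places."""

    _REAL = re.compile(r"-?\d*\.\d+")

    def handle_token(self, token: str) -> None:
        if self._REAL.fullmatch(token):
            token = f"{float(token):.2f}"
        self.write_token(token)


def apply_filter(tokens: Iterable[str], token_filter: TokenFilter) -> List[str]:
    """Feed every token through the filter, then signal end of input."""
    for token in tokens:
        token_filter.handle_token(token)
    token_filter.handle_eof()
    return token_filter.output


filtered = apply_filter(["0.123456", "0.5", "0", "rg"], RoundReals())
```

Only the real-number tokens are rewritten; integers and operators pass
through unchanged, which is the "mostly preserve contents" style of
filter the entry above has in mind.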

        * Enhance the string, type QPDFTokenizer::Token constructor to
        initialize a raw value in addition to a value. Tokens have a
        value, which is a canonical representation, and a raw value. For
        all tokens except strings and names, the raw value and the value
        are the same. For strings, the value excludes the outer delimiters
        and has non-printing characters normalized. For names, the value
        resolves non-printing characters. In order to better facilitate
        token filters that mostly preserve contents and to enable
        developers to be mostly unconcerned about the nuances of token
        values and raw values, creating string and name tokens now
        properly handles this subtlety of values and raw values. When
        constructing string tokens, take care to avoid passing in the
        outer delimiters. This has always been the case, but it is now
        clarified in comments in QPDFObjectHandle.hh::TokenFilter. This
        has no impact on any existing code unless there's some code
        somewhere that was relying on Token::getRawValue() returning an
        empty string for a manually constructed token. The token class's
        operator== method still only looks at type and value, not raw
        value. For example, string tokens for <41> and (A) would still be
        equal because both are representations of the string "A".
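
The distinction between value and raw value, and the equality rule
mentioned above, can be sketched with a toy Token class. The class and
the two helper functions are hypothetical and much simpler than
QPDFTokenizer::Token; in particular, the literal-string helper does no
escaping.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Token:
    type: str
    value: str
    # Excluded from comparisons: equality looks at type and value only.
    raw_value: str = field(default="", compare=False)


def literal_string_token(text: str) -> Token:
    """Build a string token; the caller passes text without delimiters."""
    return Token("tt_string", text, "(" + text + ")")


def hex_string_token(hex_digits: str) -> Token:
    """Build a string token from the hex form, e.g. "41" for <41>."""
    value = bytes.fromhex(hex_digits).decode("latin-1")
    return Token("tt_string", value, "<" + hex_digits + ">")


a = literal_string_token("A")  # raw value (A)
b = hex_string_token("41")     # raw value <41>
same = a == b
```

Both tokens carry the value "A", so they compare equal even though
their raw values differ.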

        * Add QPDFObjectHandle::isDataModified method. This method just
        returns true if addTokenFilter has been called on the stream. It
        enables a caller to determine whether it is safe to optimize away
        piping of stream data in cases where the input and output are
        expected to be the same. QPDFWriter uses this internally to skip
        the optimization of not re-compressing already compressed streams
        if addTokenFilter has been called. Most developers will not have
        to worry about this as it is used internally in the library in the
        places that need it. If you are manually retrieving stream data
        with QPDFObjectHandle::getStreamData or
        QPDFObjectHandle::pipeStreamData, you don't need to worry about
        this at all.
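
How a writer might use such a check can be sketched with a toy model
built on Python's zlib module. The Stream class and write_stream
function are hypothetical and are not how QPDFWriter is implemented;
filters here are simple functions from bytes to bytes.

```python
import zlib
from typing import Callable, List


class Stream:
    def __init__(self, raw: bytes) -> None:
        self.raw = raw  # Flate-compressed data as stored in the input
        self.filters: List[Callable[[bytes], bytes]] = []

    def add_token_filter(self, f: Callable[[bytes], bytes]) -> None:
        self.filters.append(f)

    def is_data_modified(self) -> bool:
        # True once any filter has been attached to the stream.
        return bool(self.filters)


def write_stream(stream: Stream) -> bytes:
    if not stream.is_data_modified():
        # Input and output are expected to match: copy compressed bytes.
        return stream.raw
    # Otherwise decode, run the filters, and compress the result again.
    data = zlib.decompress(stream.raw)
    for f in stream.filters:
        data = f(data)
    return zlib.compress(data)


original = zlib.compress(b"0.5 g")
untouched = write_stream(Stream(original))

modified_stream = Stream(original)
modified_stream.add_token_filter(lambda d: d.replace(b"0.5", b"1"))
rewritten = zlib.decompress(write_stream(modified_stream))
```

The unfiltered stream is passed through byte for byte, while the
filtered one has to be decoded and re-encoded so that the filter's
changes reach the output.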

        * Provide heavily annotated examples/pdf-filter-tokens.cc example
        that illustrates use of some simple token filters.

        * When normalizing content streams, as in qdf mode, issue warning
        about bad tokens. Content streams are only normalized when this is
        explicitly requested, so this has no impact on normal operation.
        However, in qdf mode, if qpdf detects a bad token, it means that
        either there's a bug in qpdf's lexer, that the file is damaged, or
        that the page's contents are split in a weird way. In any of those
        cases, qpdf could potentially damage the stream's contents by
        replacing carriage returns with newlines or otherwise messing with
        spaces. The most likely case of this would be an inline image's
        compressed data being divided across two streams and having the
        compressed data in the second stream contain a carriage return as
        part of its binary data. If you are using qdf mode just to look at
        PDF files in text editors, this usually doesn't matter. In cases
        of contents split across multiple streams, coalescing streams
        would eliminate the problem, so the warning mentions this. Prior
        to this enhancement, the chances of qdf mode writing incorrect
        data were already very low. This change should make it nearly
        impossible for qdf mode to unknowingly write invalid data.

2018-02-04  Jay Berkenbilt  <ejb%ql.org@localhost>

        * Add QPDFWriter::setLinearizationPass1Filename method and
        --linearize-pass1 command line option to allow specification of a
        file into which QPDFWriter will write its intermediate
        linearization pass 1 file. This is useful only for debugging qpdf.
        qpdf creates linearized files by computing the output in two
        passes. Ordinarily the first pass is discarded and not written
        anywhere. This option allows it to be inspected.
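
Why a two-pass writer has an intermediate result at all can be shown
with a generic toy format. This sketch has nothing to do with the real
linearized PDF layout: the format (a table of offsets followed by the
objects), the function names, and the pass1_filename parameter are
assumptions made for illustration.

```python
from typing import List, Optional


def render(objects: List[bytes], offsets: List[int]) -> bytes:
    # Fixed-width offset fields keep the header the same size in both passes.
    header = b"".join(b"%010d\n" % offset for offset in offsets)
    return header + b"".join(objects)


def write_two_pass(
    objects: List[bytes], pass1_filename: Optional[str] = None
) -> bytes:
    # Pass 1: offsets are not known yet, so write zeros and measure.
    pass1 = render(objects, [0] * len(objects))
    if pass1_filename is not None:
        # Normally discarded; kept only when a file name is given.
        with open(pass1_filename, "wb") as f:
            f.write(pass1)

    # Derive the real offsets from the layout produced by pass 1.
    position = len(pass1) - sum(len(obj) for obj in objects)
    offsets = []
    for obj in objects:
        offsets.append(position)
        position += len(obj)

    # Pass 2: same layout, now with the correct offsets filled in.
    return render(objects, offsets)


output = write_two_pass([b"first\n", b"second\n"])
```

The first pass exists only to learn where everything lands, which is
why it is normally thrown away and is interesting mainly when debugging
the writer.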


To generate a diff of this commit:
cvs rdiff -u -r1.18 -r1.19 pkgsrc/print/qpdf/Makefile
cvs rdiff -u -r1.4 -r1.5 pkgsrc/print/qpdf/PLIST
cvs rdiff -u -r1.1 -r1.2 pkgsrc/print/qpdf/buildlink3.mk
cvs rdiff -u -r1.15 -r1.16 pkgsrc/print/qpdf/distinfo

Please note that diffs are not public domain; they are subject to the
copyright notices on the relevant files.

Modified files:

Index: pkgsrc/print/qpdf/Makefile
diff -u pkgsrc/print/qpdf/Makefile:1.18 pkgsrc/print/qpdf/Makefile:1.19
--- pkgsrc/print/qpdf/Makefile:1.18     Fri Feb 23 06:25:23 2018
+++ pkgsrc/print/qpdf/Makefile  Tue Feb 27 12:37:20 2018
@@ -1,6 +1,6 @@
-# $NetBSD: Makefile,v 1.18 2018/02/23 06:25:23 adam Exp $
+# $NetBSD: Makefile,v 1.19 2018/02/27 12:37:20 ryoon Exp $
 
-DISTNAME=      qpdf-7.1.1
+DISTNAME=      qpdf-8.0.0
 CATEGORIES=    print
 MASTER_SITES=  ${MASTER_SITE_SOURCEFORGE:=qpdf/}
 

Index: pkgsrc/print/qpdf/PLIST
diff -u pkgsrc/print/qpdf/PLIST:1.4 pkgsrc/print/qpdf/PLIST:1.5
--- pkgsrc/print/qpdf/PLIST:1.4 Fri Sep 29 21:11:40 2017
+++ pkgsrc/print/qpdf/PLIST     Tue Feb 27 12:37:20 2018
@@ -1,4 +1,4 @@
-@comment $NetBSD: PLIST,v 1.4 2017/09/29 21:11:40 wiz Exp $
+@comment $NetBSD: PLIST,v 1.5 2018/02/27 12:37:20 ryoon Exp $
 bin/fix-qdf
 bin/qpdf
 bin/zlib-flate
@@ -15,6 +15,7 @@ include/qpdf/Pl_Count.hh
 include/qpdf/Pl_DCT.hh
 include/qpdf/Pl_Discard.hh
 include/qpdf/Pl_Flate.hh
+include/qpdf/Pl_QPDFTokenizer.hh
 include/qpdf/Pl_RunLength.hh
 include/qpdf/Pl_StdioFile.hh
 include/qpdf/PointerHolder.hh

Index: pkgsrc/print/qpdf/buildlink3.mk
diff -u pkgsrc/print/qpdf/buildlink3.mk:1.1 pkgsrc/print/qpdf/buildlink3.mk:1.2
--- pkgsrc/print/qpdf/buildlink3.mk:1.1 Sat Jun  7 10:44:30 2014
+++ pkgsrc/print/qpdf/buildlink3.mk     Tue Feb 27 12:37:20 2018
@@ -1,11 +1,11 @@
-# $NetBSD: buildlink3.mk,v 1.1 2014/06/07 10:44:30 wiz Exp $
+# $NetBSD: buildlink3.mk,v 1.2 2018/02/27 12:37:20 ryoon Exp $
 
 BUILDLINK_TREE+=       qpdf
 
 .if !defined(QPDF_BUILDLINK3_MK)
 QPDF_BUILDLINK3_MK:=
 
-BUILDLINK_API_DEPENDS.qpdf+=   qpdf>=5.0.1nb2
+BUILDLINK_API_DEPENDS.qpdf+=   qpdf>=8.0.0
 BUILDLINK_PKGSRCDIR.qpdf?=     ../../print/qpdf
 
 .endif # QPDF_BUILDLINK3_MK

Index: pkgsrc/print/qpdf/distinfo
diff -u pkgsrc/print/qpdf/distinfo:1.15 pkgsrc/print/qpdf/distinfo:1.16
--- pkgsrc/print/qpdf/distinfo:1.15     Fri Feb 23 06:25:23 2018
+++ pkgsrc/print/qpdf/distinfo  Tue Feb 27 12:37:20 2018
@@ -1,8 +1,8 @@
-$NetBSD: distinfo,v 1.15 2018/02/23 06:25:23 adam Exp $
+$NetBSD: distinfo,v 1.16 2018/02/27 12:37:20 ryoon Exp $
 
-SHA1 (qpdf-7.1.1.tar.gz) = d2bbc564c0b6abe3c3c939d092870574ab7025c2
-RMD160 (qpdf-7.1.1.tar.gz) = a8bead427d4c819cae4935b9c635c97d220a656b
-SHA512 (qpdf-7.1.1.tar.gz) = a75f988c7dd7ac174bdc981cd3696ca8b539ac6c581e3afecf601dc67277014cb4fe3f0e5cb75a67412cafa4eb645b2fc2d8a0ec203834464baf0c7e80baa0b4
-Size (qpdf-7.1.1.tar.gz) = 7099282 bytes
+SHA1 (qpdf-8.0.0.tar.gz) = 5fc59652c6c2742a4b115530163f342822e07dcd
+RMD160 (qpdf-8.0.0.tar.gz) = 74a4dcbbfaa68210aebdbf986f4aac2f00b989a8
+SHA512 (qpdf-8.0.0.tar.gz) = 194a439cf703e4e9990f61c05b59e1be1972a21e0698647a02a475ca53358f6a8ae3d44d56425bcd9d8b94cdea3cbcd66217d7b4e7a0cdd0ae428464f45e58ae
+Size (qpdf-8.0.0.tar.gz) = 7947253 bytes
 SHA1 (patch-libqpdf.pc.in) = f592899487bb958a01931afbe4ddf3c749ea103e
 SHA1 (patch-make_libtool.mk) = 8622d6a446da284269102dde38bf14271363dfdc
