pkgsrc-WIP-cvs archive

CVS commit: wip/pandoc

Module name:    wip
Committed by:   szptvlfn
Date:           Thu Sep 19 10:48:04 UTC 2013

Modified Files:
        wip/pandoc: Makefile PLIST distinfo

Log Message:
pandoc (1.12)

  [new features]

  * Much more flexible metadata, including arbitrary fields and structured
    values.  Metadata can be specified flexibly in pandoc markdown using
    YAML metadata blocks, which may occur anywhere in the document:

        ---
        title: Here is my title.
        abstract: |
          This is the abstract.

          1. It can contain
          2. block content
             and *inline markup*

        tags: [cat, dog, animal]
        ...

    Metadata fields automatically populate template variables.

  * Added `opml` (OPML) as input and output format.  The `_note` attribute,
    used in OmniOutliner and supported by multimarkdown, is supported.
    We treat the contents as markdown blocks under a section header.

  * Added `haddock` (Haddock markup) as input format (David Lazar).

  * Added `revealjs` output format, for reveal.js HTML 5 slide shows.
    (Thanks to Jamie F. Olson for the initial patch.)
    Nested vertical stacks are used for hierarchical structure.
    Results for more than one level of nesting may be odd.

  * Custom writers can now be written in lua.

        pandoc -t data/sample.lua

    will load the script sample.lua and use it as a custom writer.
    (For a sample, do `pandoc --print-default-data-file sample.lua`.)
    Note that pandoc embeds a lua interpreter, so lua need not be
    installed separately.

  * New `--filter/-F` option to make it easier to run "filters"
    (Pandoc AST transformations that operate on JSON serializations).
    Filters are always passed the name of the output format, so their
    behavior can be tailored to it.  The repository
    <> contains
    a python module for writing pandoc filters in python, with
    a number of examples.
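
    As a rough illustration of what such a filter does, the sketch below
    walks a JSON-serialized AST using only the Python standard library.
    It assumes that each element is encoded as an object with a `t`
    (constructor) key and a `c` (contents) key, and that the output format
    reaches the filter as its first command-line argument.  The names
    `walk`, `shout`, and `run_filter` are hypothetical and are not taken
    from the python module mentioned above.

```python
import json


def walk(node, action, fmt):
    """Rebuild a JSON tree, letting `action` replace any {"t": ..., "c": ...} element."""
    if isinstance(node, list):
        return [walk(item, action, fmt) for item in node]
    if isinstance(node, dict):
        if "t" in node:
            replacement = action(node["t"], node.get("c"), fmt)
            if replacement is not None:
                return replacement
        return {key: walk(value, action, fmt) for key, value in node.items()}
    return node


def shout(tag, contents, fmt):
    """Upper-case plain strings, but only when the output format is HTML."""
    if tag == "Str" and fmt == "html":
        return {"t": "Str", "c": contents.upper()}
    return None  # None means "leave this element unchanged"


def run_filter(json_text, fmt):
    """Parse a serialized document, transform it, and serialize it again."""
    return json.dumps(walk(json.loads(json_text), shout, fmt))


# A real filter would call run_filter(sys.stdin.read(), sys.argv[1])
# and write the result to stdout (assumed calling convention).
doc = '[{"t": "Para", "c": [{"t": "Str", "c": "hello"}]}]'
print(run_filter(doc, "html"))
print(run_filter(doc, "latex"))
```

    Because the format name is passed to the action, the same script can
    behave differently for different output formats, which is the point
    made above.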

  * Added `--metadata/-M` option.
    This is like `--variable/-V`, but actually adds to metadata, not
    just variables.

  * Added `--print-default-data-file` option, which allows printing
    of any of pandoc's data files. (For example,
    `pandoc --print-default-data-file reference.odt` will print
    the default `reference.odt`.)

  * Added syntax for "pauses" in slide shows:

        This gives

        . . .

        me pause.

  * New markdown extensions:

    + `ignore_line_breaks`:  causes intra-paragraph line breaks to be ignored,
      rather than being treated as hard line breaks or spaces.  This is useful
      for some East Asian languages, where spaces aren't used between words,
      but text is separated into lines for readability.
    + `yaml_metadata_block`:  Parse YAML metadata blocks.  (Default.)
    + `ascii_identifiers`: This will force `auto_identifiers` to use ASCII
       only. (Default for `markdown_github`.) (#807)
    + `lists_without_preceding_blankline`:  Allow lists to start without
      preceding blank space.  (Default for `markdown_github`.) (#972)

  [behavior changes]

  * `--toc-level` no longer implies `--toc`.
    Reason: EPUB users who don't want a visible TOC may still want
    to set the TOC level for in-book navigation.

  * `--help` now prints input and output formats in alphabetical order, and
    says something about PDF output (#720).

  * `--self-contained` now returns less verbose output (telling you
    which URLs it is fetching, but not giving the full header).  In
    addition, there are better error messages when fetching a URL fails.

  * Citation support is no longer baked in to core pandoc. Users who
    need citations will need to install and use a separate filter
    (`--filter pandoc-citeproc`).  This filter will take `bibliography`,
    `csl`, and `citation-abbreviations` from the metadata, though these
    may still be specified on the command line as before.

  * A `Cite` element is now created in parsing markdown whether or not
    there is a matching reference.

  * The `pandoc-citeproc` script will put the bibliography at the
    end of the document, as before.  However, it will be put inside a `Div`
    element with class "references", allowing users some control
    over the styling of references.  A final header, if any, will
    be included in the `Div`.

  * The markdown writer will not print a bibliography if the
    `citations` extension is enabled.  (If the citations are formatted
    as markdown citations, it is redundant to have a bibliography,
    since one will be generated automatically.)

  * Previously we used to store the directory of the first input file,
    even if it was local, and used this as a base directory for finding
    images in ODT, EPUB, Docx, and PDF.  This has been confusing to many
    users.  So we now look for images relative to the current
    working directory, even if the first file argument is in another
    directory.   Note that this change may break some existing workflows.
    If you have been assuming that relative links will be interpreted
    relative to the directory of the first file argument, you'll need
    to make that the current directory before running pandoc. (#942)
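
    The difference between the two lookup rules can be sketched as
    follows.  This is only an illustration of the behavior described
    above, not pandoc's code; the function names and the example paths
    are hypothetical.

```python
import os.path


def resolve_old(image, first_input):
    """Earlier behavior as described above: relative to the first input file."""
    return os.path.normpath(os.path.join(os.path.dirname(first_input), image))


def resolve_new(image, cwd):
    """Behavior from 1.12 on: relative to the current working directory."""
    return os.path.normpath(os.path.join(cwd, image))


# Running `pandoc chapters/intro.md` from the project root:
print(resolve_old("img/fig.png", "chapters/intro.md"))
print(resolve_new("img/fig.png", "."))
```

    The two results differ whenever the first input file is not in the
    working directory, which is the case that needs the workaround
    described above.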

  * Better error reporting in some readers, due to changes in `readWith`:
    the line in which the error occurred is printed, with a caret pointing
    to the column.

  * All slide formats now support incremental slide view for definition lists.

  * Parse `\(..\)` and `\[..\]` as math in MediaWiki reader.
    Parse `:<math>...</math>` as display math.  These notations are used with
    the MathJax MediaWiki extension.

  * All writers: template variables are set automatically from metadata
    fields.  However, variables specified on the command line with
    `--variable` will completely shadow metadata fields.

  * If `--variable` is used to set many variables with the same name,
    a list is created.
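
    The interaction between metadata and `--variable` described in the
    two items above can be modeled roughly as below.  This is a sketch of
    the stated rules only; `template_context` and its arguments are
    hypothetical names, and metadata values are simplified to plain
    strings.

```python
def template_context(metadata, variables):
    """Combine metadata with (name, value) pairs given via --variable.

    Repeated variable names are collected into a list, and a variable
    completely shadows the metadata field of the same name.
    """
    collected = {}
    for name, value in variables:
        collected.setdefault(name, []).append(value)

    context = dict(metadata)
    for name, values in collected.items():
        # A single occurrence stays a scalar; several become a list.
        context[name] = values[0] if len(values) == 1 else values
    return context


meta = {"title": "From metadata", "lang": "en"}
cli = [("title", "From CLI"), ("css", "a.css"), ("css", "b.css")]
print(template_context(meta, cli))
```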

  * Man writer:  The `title`, `section`, `header`, and `footer` can now
    all be set individually in metadata.  The `description` variable has been
    removed.  Quotes have been added so that spaces are allowed in the
    title.  If you have a title that begins

        COMMAND(1) footer here | header here

    pandoc will still parse it into a title, section, header, and
    footer.  But you can also specify these elements explicitly (#885).
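
    To make the legacy title convention concrete, here is a small sketch
    that splits such a title line into its four parts.  It is not
    pandoc's parser; the function name and the regular expression are
    assumptions chosen to match the example above.

```python
import re


def parse_man_title(line):
    """Split 'COMMAND(1) footer | header' into its components."""
    match = re.match(r"(\S+)\((\w+)\)\s*(.*)$", line)
    if match is None:
        # No 'NAME(SECTION)' prefix: treat the whole line as the title.
        return {"title": line}
    title, section, rest = match.groups()
    footer, _, header = rest.partition("|")
    return {
        "title": title,
        "section": section,
        "footer": footer.strip(),
        "header": header.strip(),
    }


print(parse_man_title("COMMAND(1) footer here | header here"))
```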

  * Markdown reader

    + Added support for YAML metadata blocks, which can come anywhere
      in the document (not just at the beginning).  A document can contain
      multiple YAML metadata blocks.
    + HTML span and div tags are parsed as pandoc Span and Div elements.

  * Markdown writer

    + Allow simple tables to be printed as grid tables,
      if other table options are disabled.  This means you can do
      `pandoc -t markdown-pipe_tables-simple_tables-multiline_tables`
      and all tables will render as grid tables.
    + Support YAML title block (render fields in alphabetical order
      to make output predictable).

  [API changes]

  * `Meta` in `Text.Pandoc.Definition` has been changed to allow
    structured metadata.  (Note:  existing code that pattern-matches
    on `Meta` will have to be revised.)  Metadata can now contain
    indefinitely many fields, with content that can be a string,
    a Boolean, a list of `Inline` elements, a list of `Block`
    elements, or a map or list of these.

  * A new generic block container (`Div`) has been added to `Block`,
    and a generic inline container (`Span`) has been added to `Inline`.
    These can take attributes.  They will render in HTML, Textile,
    MediaWiki, Org, RST, and Markdown (with `markdown_in_html`
    extension) as HTML `<div>` and `<span>` elements; in other formats
    they will simply pass through their contents.  But they can be
    targeted by scripts.

  * `Format` is now a newtype, not an alias for String.
    Equality comparisons are case-insensitive.
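
    The effect of wrapping the format name in its own type with
    case-insensitive equality can be shown with a Python analogue.  This
    is only an analogy to the Haskell newtype, written for illustration;
    it is not part of any pandoc API.

```python
class Format:
    """Wrapper around a format name that compares case-insensitively."""

    def __init__(self, name):
        self.name = name

    def __eq__(self, other):
        if not isinstance(other, Format):
            return NotImplemented  # a bare string is not a Format
        return self.name.lower() == other.name.lower()

    def __hash__(self):
        # Keep hashing consistent with the case-insensitive equality.
        return hash(self.name.lower())

    def __repr__(self):
        return "Format(%r)" % self.name


print(Format("HTML") == Format("html"))
print(Format("html") == Format("latex"))
```

    Code that previously compared plain strings has to wrap or unwrap
    the value explicitly once the type is no longer a string alias.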

  * Added `Text.Pandoc.Walk`, which exports hand-written tree-walking
    functions that are much faster than the SYB functions from
    `Text.Pandoc.Generic`.  These functions are now used where possible
    in pandoc's code.  (`Tests.Walk` verifies that `walk` and `query`
    match the generic traversals `bottomUp` and `queryWith`.)

  * Added `Text.Pandoc.JSON`, which provides `ToJSON` and `FromJSON`
    instances for the basic pandoc types. They use GHC generics and
    should be faster than the old JSON serialization using

  * Added `Text.Pandoc.Process`, exporting `pipeProcess`.
    This is a souped-up version of `readProcessWithExitCode` that
    uses lazy bytestrings instead of strings and allows setting
    environment variables.  (Used in `Text.Pandoc.PDF`.)

  * New module `Text.Pandoc.Readers.OPML`.

  * New module `Text.Pandoc.Writers.OPML`.

  * New module `Text.Pandoc.Readers.Haddock` (David Lazar).
    This is based on Haddock's own lexer/parser.

  * New module `Text.Pandoc.Writers.Custom`.

  * In `Text.Pandoc.Shared`, `openURL` and `fetchItem` now return an
    Either, for better error handling.

  * Made `stringify` polymorphic in `Text.Pandoc.Shared`.

  * Removed `stripTags` from `Text.Pandoc.XML`.

  * `Text.Pandoc.Templates`:

    + Simplified `Template` type to a newtype.
    + Removed `Empty`.
    + Changed type of `renderTemplate`: it now takes a JSON context
      and a compiled template.
    + Export `compileTemplate`.
    + Export `renderTemplate'` that takes a string instead of a compiled
      template.
    + Export `varListToJSON`.

  * `Text.Pandoc.PDF` exports `makePDF` instead of `tex2pdf`.

  * `Text.Pandoc`:

    + Made `toJsonFilter` an alias for `toJSONFilter` from `Text.Pandoc.JSON`.
    + Removed `ToJsonFilter` typeclass.  `ToJSONFilter` from
      `Text.Pandoc.JSON` should be used instead.  (Compiling against
      pandoc-types instead of pandoc will also produce smaller executables.)
    + Removed the deprecated `jsonFilter` function.
    + Added `readJSON`, `writeJSON` to the API (#817).

  * `Text.Pandoc.Options`:

    + Added `Ext_lists_without_preceding_blankline`,
      `Ext_ascii_identifiers`, `Ext_ignore_line_breaks`,
      `Ext_yaml_metadata_block` to `Extension`.
    + Changed `writerSourceDirectory` to `writerSourceURL` and changed the
      type to a `Maybe`.  `writerSourceURL` is set to 'Just url' when the
      first command-line argument is an absolute URL.  (So, relative links
      will be resolved in relation to the first page.)  Otherwise, 'Nothing'.
    + All bibliography-related fields have been removed from
      `ReaderOptions` and `WriterOptions`: `writerBiblioFiles`,
      `readerReferences`, `readerCitationStyle`.

  * The `Text.Pandoc.Biblio` module has been removed.  Users of the
    pandoc library who want citation support will need to use
    `Text.CSL.Pandoc` from `pandoc-citeproc`.

  [bug fixes]

  * In markdown, don't autolink a bare URI that is followed by `</a>`

  * `Text.Pandoc.Shared`

    + `openURL` now follows redirects (#701), properly handles `data:`
      URIs, and prints diagnostic output to stderr rather than stdout.
    + `readDefaultDataFile`: normalize the paths.  This fixes bugs in
      `--self-contained` on pandoc compiled with `embed_data_files` (#833).
    + Fixed `readDefaultDataFile` so it works on Windows.
    + Better error messages for `readDefaultDataFile`.  Instead of
      listing the last path tried, which can confuse people who are
      using `--self-contained`, we now just list the data file name.
    + URL-escape pipe characters.  Even though these are legal, `Network.URI`
      doesn't regard them as legal in URLs.  So we escape them first (#535).
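
    The pipe-escaping step in the last item amounts to percent-encoding
    the character before the URL is parsed.  A minimal sketch, with a
    hypothetical function name and example URL:

```python
def escape_pipes(url):
    """Percent-encode literal pipe characters, leaving the rest untouched."""
    return url.replace("|", "%7C")


print(escape_pipes("http://example.com/chart?labels=a|b|c"))
```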

  * Mathjax in HTML slide shows:  include explicit "Typeset" call.
    This seems to be needed for some formats (e.g. slideous) and won't
    hurt in others (#966).

  * `Text.Pandoc.PDF`

    + On Windows, create temp dir in working directory, since the system
      temp directory path may contain tildes, which can cause
      problems in LaTeX (#777).
    + Put temporary output directory in `TEXINPUTS` (see #917).
    + `makePDF` tries to download images that are not found locally,
      if the first argument is a URL (#917).
    + If compiling with `pdflatex` yields an encoding error, offer
      the suggestion to use `--latex-engine=xelatex`.

  * Produce automatic header identifiers in parsing textile, RST,
    and LaTeX, unless `auto_identifiers` extension is disabled (#967).

  * `Text.Pandoc.SelfContained`:  Strip off fragment, query of relative URL
     before treating as a filename.  This fixes `--self-contained` when used
     with CSS files that include web fonts using the method described here:
      (#739).  Handle `src` in `embed`, `audio`, `source`, `input` tags.
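
    Stripping the query and fragment from a relative URL can be sketched
    with Python's `urllib.parse`.  The font paths below are hypothetical
    examples of the kind of reference found in CSS files; the function
    name is likewise an assumption.

```python
from urllib.parse import urlsplit


def local_filename(relative_url):
    """Drop any ?query and #fragment so the remainder can be opened as a file."""
    return urlsplit(relative_url).path


print(local_filename("fonts/myfont.eot?#iefix"))
print(local_filename("fonts/myfont.svg#svgFontName"))
```

    Without this step the whole string, including `?#iefix`, would be
    looked up as a file name and the resource would not be found.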

  * `Text.Pandoc.Parsing`: `uri` parser no longer treats punctuation before
    percent-encoding, or a `+` character, as final punctuation.

  * `Text.Pandoc.ImageSize`:  Handle EPS (#903).  This change will make
    EPS images properly sized on conversion to Word.

  * Slidy:  Use slidy.js rather than slidy.js.gz.
    Reason:  some browsers have trouble with the gzipped js file,
    at least on the local file system (#795).

  * Markdown reader

    + Properly handle blank line at beginning of input (#882).
    + Fixed bug in unmatched reference links.  The input
      `[*infile*] [*outfile*]` was getting improperly parsed:
      "infile" was emphasized, but "*outfile*" was literal (#883).
    + Allow internal `+` in citation identifiers (#856).
    + Allow `.` or `)` after `#` in ATX headers if no `fancy_lists`.
    + Do not generate blank title, author, or date metadata elements.
      Leave these out entirely if they aren't present.
    + Allow backtick code blocks not to be preceded by blank line (#975).

  * Textile reader:

    + Correctly handle entities.
    + Improved handling of `<pre>` blocks (#927). Remove internal HTML tags
      in code blocks, rather than printing them verbatim. Parse attributes
      on `<pre>` tag for code blocks.

  * HTML reader: Handle non-simple tables (#893).  Column widths are read from
    `col` tags if present, otherwise divided equally.

  * LaTeX reader

    + Support alltt environment (#892).
    + Support `\textasciitilde`, `\textasciicircum` (#810).
    + Treat `\textsl` as emphasized text (#850).
    + Skip positional options after `\begin{figure}`.
    + Support `\v{}` for hacek (#926).
    + Don't add spurious ", " to citation suffixes.
      This is added when needed in pandoc-citeproc.
    + Allow spaces in alignment spec in tables, e.g. `{ l r c }`.
    + Improved support for accented characters (thanks to Scott Morrison).
    + Parse label after section command and set id (#951).

  * RST reader:

    + Don't insert paragraphs where docutils doesn't.
      `rst2html` doesn't add `<p>` tags to list items (even when they are
      separated by blank lines) unless there are multiple paragraphs in the
      list.  This commit changes the RST reader to conform more closely to
      what docutils does (#880).
    + Improved metadata.  Treat initial field list as metadata when
      standalone specified.  Previously ALL fields "title", "author",
      "date" in field lists were treated as metadata, even if not at
      the beginning.  Use `subtitle` metadata field for subtitle.
    + Fixed 'authors' metadata parsing in reST.  Semicolons separate
      different authors.

  * MediaWiki reader

    + Allow space before table rows.
    + Fixed regression for `<ref>URL</ref>`.
      `<` is no longer allowed in URLs, according to the uri parser
      in `Text.Pandoc.Parsing`.  Added a test case.
    + Correctly handle indented preformatted text without preceding
      or following blank line.
    + Fixed `|` links inside table cells.  Improved attribute parsing.
    + Skip attributes on table rows.  Previously we just crashed if
      rows had attributes; now we ignore them.
    + Ignore attributes on headers.
    + Allow `Image:` for images (#971).
    + Parse an image with caption in a paragraph by itself as a figure.

  * LaTeX writer

    + Don't use ligatures in escaping inline code.
    + Fixed footnote numbers in LaTeX/PDF tables.  This fixes a bug
      wherein notes were numbered incorrectly in tables (#827).
    + Always create labels for sections.  Previously the labels were only
      created when there were links to the section in the document (#871).
    + Stop escaping `|` in LaTeX math.
      This caused problems with array environments (#891).
    + Change `\` to `/` in paths.  `/` works even on Windows in LaTeX.
      `\` will cause major problems if unescaped.
    + Write id for code block to label attribute in LaTeX when listings
      is used (thanks to Florian Eitel).
    + Scale LaTeX tables so they don't exceed columnwidth.
    + Avoid problem with footnotes in unnumbered headers (#940).

  * Beamer writer: when creating beamer slides, add `allowframebreaks` option
    to the slide if it is one of the header classes.  It is recommended
    that your bibliography slide have this attribute:

        # References {.allowframebreaks}

    This causes multiple slides to be created if necessary, depending
    on the length of the bibliography.

  * ConTeXt writer: Properly handle tables without captions.  The old output
    only worked in MkII. This should work in MkIV as well (#837).

  * MediaWiki writer: Use native mediawiki tables instead of HTML (#720).

  * HTML writer:

    + Fixed `--no-highlight` (Alexander Kondratskiy).
    + Don't convert to lowercase in email obfuscation (#839).
    + Ensure proper escaping in `<title>` and `<meta>` fields.

  * AsciiDoc writer:

    + Support `--atx-headers` (Max Rydahl Andersen).
    + Don't print empty identifier blocks `([[]])` on headers (Max
      Rydahl Andersen).

  * ODT writer:

    + Fixed wrong numbered-list indentation in open document format
      (Alexander Kondratskiy) (#369).
    + `reference.odt`: Added pandoc as "generator" in `meta.xml`.
    + Minor changes for ODF 1.2 conformance (#939). We leave the
      nonconforming `contextual-spacing` attribute, which is provided by
      LibreOffice itself and seems well supported.

  * Docx writer:

    + Fixed rendering of display math in lists.
      In 1.11 and 1.11.1, display math in lists rendered as a new list
      item.  Now it always appears centered, just as outside of lists,
      and in proper display math style, no matter how far indented the
      containing list item is (#784).
    + Use `w:br` with `w:type` `textWrapping` for linebreaks.
      Previously we used `w:cr` (#873).
    + Use Compact style for Plain block elements, to
      differentiate between tight and loose lists (#775).
    + Ignore most components of `reference.docx`.
      We take the `word/styles.xml`, `docProps/app.xml`,
      `word/theme/theme1.xml`, and `word/fontTable.xml` from
      `reference.docx`, ignoring everything else.  This should help
      with the corruption problems caused when different versions of
      Word resave the reference.docx and reorganize things.
    + Made `--no-highlight` work properly.

  * EPUB writer

    + Don't add `dc:creator` tags if present in EPUB metadata.
    + Add `id="toc-title"` to `h1` in `nav.xhtml` (#799).
    + Don't put blank title page in reading sequence.
      Set `linear="no"` if no title block.  Addresses #797.
    + Download webtex images and include as data URLs.
      This allows you to use `--webtex` in creating EPUBs.
      Math with `--webtex` is automatically made self-contained.
    + In `data/epub.css`, removed highlighting styles (which
      are no longer needed, since styles are added by the HTML
      writer according to `--highlighting-style`).  Simplified
      margin fields.
    + If resource not found, skip it, as in Docx writer (#916).

  * RTF writer:

    + Properly handle characters above the 0000-FFFF range.
      Uses surrogate pairs.  Thanks to Hiromi Ishii for the patch.
    + Fixed regression with RTF table of contents.
    + Only autolink absolute URIs.  This fixes a regression, #830.
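
    The surrogate-pair handling mentioned in the first RTF item can be
    illustrated with the standard UTF-16 arithmetic.  This sketch shows
    only how a code point above FFFF is split into two 16-bit units; how
    the RTF writer then escapes those units is not shown, and the
    function name is hypothetical.

```python
def surrogate_pair(code_point):
    """Split a code point above U+FFFF into UTF-16 high and low surrogates."""
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("code point is not above the 0000-FFFF range")
    offset = code_point - 0x10000
    high = 0xD800 + (offset >> 10)    # top 10 bits of the offset
    low = 0xDC00 + (offset & 0x3FF)   # bottom 10 bits of the offset
    return high, low


high, low = surrogate_pair(0x1F600)
print(hex(high), hex(low))
```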

  * Markdown writer:

    + Only autolink absolute URIs.  This fixes a regression, #830.
    + Don't wrap attributes in fenced code blocks.
    + Write full metadata in MMD style title blocks.
    + Put multiple authors on separate lines in pandoc titleblock.
      Also, don't wrap long author entries, as new lines get treated
      as new authors.

  * `Text.Pandoc.Templates`:

    + Fixed bug retrieving default template for markdown variants.
    + Templates can now contain "record lookups" in variables;
      for example, `author.institution` will retrieve the `institution`
      field of the `author` variable.
    + More consistent behavior of `$for$`.  When `foo` is not a list,
      `$for(foo)$...$endfor$` should behave like `$if(foo)$...$endif$`.
      So if `foo` resolves to "", no output should be produced.
      See pandoc-templates#39.
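
    The record lookups and the `$for$` rule described above can be
    modeled in a few lines.  This is a sketch over plain Python dicts
    and lists, not the template engine itself; `lookup` and `iterations`
    are hypothetical names, and treating Python-falsy values as "no
    output" is an assumption.

```python
def lookup(context, dotted_name):
    """Resolve a name such as 'author.institution' against nested dicts."""
    value = context
    for part in dotted_name.split("."):
        if not isinstance(value, dict) or part not in value:
            return None
        value = value[part]
    return value


def iterations(value):
    """Values a $for$ loop would visit: list items, or one non-empty value."""
    if isinstance(value, list):
        return value
    return [value] if value else []


context = {"author": {"name": "A. Writer", "institution": "Some University"}, "foo": ""}
print(lookup(context, "author.institution"))
print(iterations(lookup(context, "foo")))
```

    With `foo` set to the empty string the loop body is visited zero
    times, matching the `$if(foo)$` behavior described above.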

  * Citation processing improvements (now part of pandoc-citeproc):

    + Fixed `endWithPunct`.  The new version correctly sees a sentence
      ending in '.)' as ending with punctuation.  This fixes a bug which
      led such sentences to receive an extra period at the end: '.).'.
      Thanks to Steve Petersen for reporting.
    + Don't interfere with Notes that aren't citation notes.
      This fixes a bug in which notes not generated from citations were
      being altered (e.g. first letter capitalized) (#898).
    + Only capitalize footnote citations when they have a prefix.
    + Changes in suffix parsing.  A suffix beginning with a digit gets 'p'
      inserted before it before passing to citeproc-hs, so that bare numbers
      are treated as page numbers by default.  A suffix not beginning with
      punctuation has a space added at the beginning (rather than a comma and
      space, as was done before for not-author-in-text citations).
      The result is that `\citep[23]{item1}` in LaTeX will be interpreted
      properly, with '23' treated as a locator of type 'page'.
    + Many improvements to citation rendering, due to fixes in citeproc-hs
      (thanks to Andrea Rossato).
    + Warnings are issued for undefined citations, which are rendered
      as `???`.
    + Fixed hanging behavior when locale files cannot be found.

  [template changes]

  * DocBook:  Use DocBook 4.5 doctype.

  * Org: '#+TITLE:' is inserted before the title.
    Previously the writer did this.

  * LaTeX:  Changes to make mathfont work with xelatex.
    We need the mathspec library, not just fontspec, for this.
    We also need to set options for setmathfont (#734).

  * LaTeX: Use `tex-ansi` mapping for `monofont`.
    This ensures that straight quotes appear as straight, rather than
    being treated as curly.  See #889.

  * Made `\includegraphics` more flexible in LaTeX template.
    Now it can be used with options, if needed.  Thanks to Bernhard Weichel.

  * LaTeX/Beamer: Added `classoption` variable.
    This is intended for class options like `oneside`; it may
    be repeated with different options.  (Thanks to Oliver Matthews.)

  * Beamer: Added `fonttheme` variable.  (Thanks to Luis Osa.)

  * LaTeX: Added `biblio-style` variable (#920).

  * DZSlides: title attribute on title section.

  * HTML5: add meta tag to allow scaling by user (Erik Evenson)

  [under-the-hood improvements]

  * Markdown reader: Improved strong/emph parsing, using the strategy of
    <>.  The new parsing algorithm requires
    no backtracking, and no keeping track of nesting levels.  It will give
    different results in some edge cases, but these should not affect normal
    use.

  * Added `Text.Pandoc.Compat.Monoid`.
    This allows pandoc to compile with `base` < 4.5, where `Data.Monoid`
    doesn't export `<>`.  Thanks to Dirk Ullrich for the patch.

  * Added `Text.Pandoc.Compat.TagSoupEntity`.
    This allows pandoc to compile with `tagsoup` 0.13.x.
    Thanks to Dirk Ullrich for the patch.

  * Most of `Text.Pandoc.Readers.TeXMath` has been moved to the
    `texmath` module (0.6.4).  (This allows `pandoc-citeproc` to
    handle simple math in bibliography fields.)

  * Added `Text.Pandoc.Writers.Shared` for shared functions used
    only in writers.  `metaToJSON` is used in writers to create a
    JSON object for use in the templates from the pandoc metadata
    and variables.  `getField`, `setField`, and `defField` are
    for working with JSON template contexts.

  * Added `Text.Pandoc.Asciify` utility module.
    This exports functions to create ASCII-only versions of identifiers.

  * `Text.Pandoc.Parsing`

    + Generalized state type on `readWith` (API change).
    + Specialize readWith to `String` input. (API change).
    + In `ParserState`, replace `stateTitle`, `stateAuthors`, `stateDate`
      with `stateMeta` and `stateMeta'`.

  * `Text.Pandoc.UTF8`: use strict bytestrings in reading.  The use of lazy
     bytestrings seemed to cause problems using pandoc on 64-bit Windows
     7/8 (#874).

  * Factored out `registerHeader` from markdown reader, added to

  * Removed `blaze_html_0_5` flag, require `blaze-html` >= 0.5.
    Reason:  < 0.5 does not provide a monoid instance for Attribute,
    which is now needed by the HTML writer (#803).

  * Added `http-conduit` flag, which allows fetching https resources.
    It also brings in a large number of dependencies (`http-conduit`
    and its dependencies), which is why for now it is an optional flag

  * Added

  * Improved INSTALL instructions.

  * `make-windows-installer.bat`: Removed explicit paths for executables.

  * `aeson` is now used instead of `json` for JSON.

  * Set default stack size to 16M.  This is needed for some large
    conversions, esp. if pandoc is compiled with 64-bit ghc.

  * Various small documentation improvements.
    Thanks to achalddave and drothlis for patches.

  * Removed comment that chokes recent versions of CPP (#933).

  * Removed support for GHC version < 7.2, since pandoc-types now
    requires at least GHC 7.2 for GHC generics.

To generate a diff of this commit:
cvs -z3 rdiff -u -r1.2 -r1.3 wip/pandoc/distinfo
cvs -z3 rdiff -u -r1.4 -r1.5 wip/pandoc/PLIST
cvs -z3 rdiff -u -r1.7 -r1.8 wip/pandoc/Makefile

To view a diff of this commit:

Please note that diffs are not public domain; they are subject to the
copyright notices on the relevant files.
