tech-pkg archive


Re: R packages

Thanks for the various feedback on my original description of a tool
for creating/updating/managing R packages.  Let me outline a modified
version for tracking CRAN packages in pkgsrc for more feedback.

- The program works on a set of R packages discovered anywhere in
  pkgsrc (i.e., packages named R-* and with a Makefile including
  R/Makefile.extension) or listed explicitly on the command line or
  both.  The set may optionally be enlarged by recursion through
  dependencies of the initial set.  This set may also be modified by
  including/excluding packages based upon regular expression matches
  to the names.  Thus, the initial set of R packages in pkgsrc to work
  on can be composed quite flexibly.
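As a rough illustration of how such a working set might be composed, here is a minimal Python sketch. It is not the tool's actual code: the function name, its parameters, the dependency map, and the order of operations (recurse first, then filter) are all assumptions made for the example.

```python
import re

def select_packages(discovered, explicit=(), include=None, exclude=None,
                    depends=None, recurse=False):
    """Compose the working set of pkgsrc R package names (hypothetical helper)."""
    # Start from packages found in the tree plus any named on the command line.
    selected = set(discovered) | set(explicit)
    if recurse and depends:
        # Enlarge the set by following dependencies until nothing new appears.
        pending = list(selected)
        while pending:
            name = pending.pop()
            for dep in depends.get(name, ()):
                if dep not in selected:
                    selected.add(dep)
                    pending.append(dep)
    # Assumption: regular-expression filters apply after recursion.
    if include is not None:
        selected = {n for n in selected if re.search(include, n)}
    if exclude is not None:
        selected = {n for n in selected if not re.search(exclude, n)}
    return sorted(selected)
```

For instance, select_packages(found, exclude=r"^R-z") would drop every package whose name starts with R-z from whatever was discovered.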

- Given a set of pkgsrc R packages, a set of CRAN packages is found
  that contains those corresponding to the pkgsrc R packages (if they
  exist) and all of the dependencies of those corresponding packages.
  These packages are used to determine, for example, whether pkgsrc
  packages are out of date and to provide CRAN-based information when
  creating or updating pkgsrc packages.

- Each pkgsrc R package is classified into one of several categories
  based upon comparisons with the corresponding CRAN packages:

  + a new package, i.e., one without an existing pkgsrc package
  + an out-of-date package, i.e., pkgsrc version < CRAN version
  + an up-to-date package, i.e., pkgsrc version = CRAN version
  + a future package, i.e., pkgsrc version > CRAN version (should never happen)
  + a missing package, i.e., one with no corresponding CRAN package
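The classification above could be sketched roughly as follows in Python. This is only an illustration under stated assumptions: the function names are made up, None stands for "no such package", and versions are compared as tuples of integers split on '.' and '-', which is a simplification of how R itself compares package versions.

```python
import re

def parse_version(text):
    """Split an R-style version such as '1.2-3' into a tuple of integers."""
    return tuple(int(part) for part in re.split(r"[.-]", text))

def classify(pkgsrc_version, cran_version):
    """Return the category of a package; None means the package is absent."""
    if pkgsrc_version is None:
        return "new"          # on CRAN, but no pkgsrc package yet
    if cran_version is None:
        return "missing"      # in pkgsrc, but no corresponding CRAN package
    p = parse_version(pkgsrc_version)
    c = parse_version(cran_version)
    if p < c:
        return "out-of-date"
    if p == c:
        return "up-to-date"
    return "future"           # pkgsrc is ahead of CRAN; should never happen
```

Comparing integer tuples rather than raw strings matters here: as strings, "1.10-0" would sort before "1.2-3".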

- In addition to simply listing the packages found and their
  classification, three different modifications of the pkgsrc tree may
  occur, each under independent control:

  + New packages are optionally created in a specified category.
  + Out-of-date packages are optionally updated in place.
  + Up-to-date and future packages are optionally updated in place.

- Changes to existing packages are limited to the following using
  information derived from each CRAN package's DESCRIPTION file:

  + A new file is created containing CRAN's description
    information.  This may be useful in case the DESCR file should be
    reviewed and/or updated.
  + If Makefile.orig does not exist, it is created from Makefile.
  + Makefile is modified in the following ways:
    o R_PKGVER is updated to correspond to the CRAN package version.
    o If the version is changed, PKGREVISION lines are removed.
    o A '# LICENSE' line is added containing CRAN's license
      information.  If possible, this line also includes a suggestion
      for a possibly appropriate translation of CRAN's license
      information into the name of a pkgsrc license file.  This latter
      information is provided as a means of expediting the search for
      license files to compare with the package's actual license.
    o New '# DEPENDS' lines are added reflecting CRAN's idea of dependencies.
  + If distinfo.orig does not exist, it is created from distinfo.
  + distinfo is updated with 'make makesum'.
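To make the Makefile changes concrete, here is a small Python sketch of the first three of them (R_PKGVER, PKGREVISION, and the '# LICENSE' line). It is a guess at the shape of the logic rather than the program's real code: the function name, the exact text of the '# LICENSE' comment, and its placement at the end of the file are assumptions for illustration.

```python
import re

def update_makefile(text, cran_version, cran_license):
    """Apply the minimal Makefile edits to the given text (hypothetical helper)."""
    out = []
    version_changed = False
    for line in text.splitlines():
        m = re.match(r"^(R_PKGVER\s*=\s*)(\S+)\s*$", line)
        if m:
            # Track the CRAN version and remember whether it actually changed.
            if m.group(2) != cran_version:
                version_changed = True
            line = m.group(1) + cran_version
        out.append(line)
    if version_changed:
        # A new upstream version resets the pkgsrc revision.
        out = [l for l in out if not re.match(r"^PKGREVISION\s*=", l)]
    # Assumed comment format; added only once so that repeated runs are no-ops.
    note = "# LICENSE=\t" + cran_license
    if note not in out:
        out.append(note)
    return "\n".join(out) + "\n"
```

Because the version is compared before it is rewritten and the comment is added only when absent, applying the function to its own output leaves the text unchanged.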

A run of the program (currently named R2pkg, but other suggestions are
welcome; e.g., CRAN2pkg?) against pkgsrc-2011Q2 packages reveals that
only these changes are, in fact, made.  (I can send anyone interested
the output of cvs diff for review.)  The idea is to minimally modify
packages while tracking down all the dependencies, etc. that are
provided by CRAN.  This provides information useful for subsequent
manual review (e.g., possibly new licensing terms, possibly new
dependencies, possibly new descriptions, etc.) prior to committing yet
does not invasively modify existing packages.  Indeed, when run twice,
packages are unchanged the second time compared with the first.

I would appreciate any comments on this design.  I think this was
largely what the earlier design was actually accomplishing, but
perhaps my description ended up focusing on less salient aspects.
Nevertheless, the current rendition is an improvement, so I appreciate
the input.

Thanks a lot.

