Re: R packages
Thanks for the various feedback on my original description of a tool
for creating/updating/managing R packages. Let me outline a modified
version for tracking CRAN packages in pkgsrc for more feedback.
- The program works on a set of R packages discovered anywhere in
pkgsrc (i.e., packages named R-* and with a Makefile including
R/Makefile.extension) or listed explicitly on the command line or
both. The set may optionally be enlarged by recursion through
dependencies of the initial set. This set may also be modified by
including/excluding packages based upon regular expression matches
to the names. Thus, the initial set of R packages in pkgsrc to work
on can be composed quite flexibly.
- Given a set of pkgsrc R packages, a set of CRAN packages is found
that contains those corresponding to the pkgsrc R packages (if they
exist) and all of the dependencies of those corresponding packages.
These packages are used to determine, for example, whether pkgsrc
packages are out of date and to provide CRAN-based information when
creating or updating pkgsrc packages.
- Each pkgsrc R package is classified into one of several categories
based upon comparisons with the corresponding CRAN packages:
+ a new package, i.e., one without an existing pkgsrc package
+ an out-of-date package, i.e., pkgsrc version < CRAN version
+ an up-to-date package, i.e., pkgsrc version = CRAN version
+ a future package, i.e., pkgsrc version > CRAN version (should never happen)
+ a missing package, i.e., one with no corresponding CRAN package
- In addition to simply listing the packages found and their
classification, three different modifications of the pkgsrc tree may
occur, each under independent control:
+ New packages are optionally created in a specified category.
+ Out-of-date packages are optionally updated in place.
+ Up-to-date and future packages are optionally updated in place.
- Changes to existing packages are limited to the following using
information derived from each CRAN package's DESCRIPTION file:
+ A new file DESCR.new is created containing CRAN's description
information. This may be useful in case the DESCR file should be
reviewed and/or updated.
+ If Makefile.orig does not exist, it is created from Makefile.
+ Makefile is modified in the following ways:
o R_PKGVER is updated to correspond to the CRAN package version.
o If the version is changed, PKGREVISION lines are removed.
o A '# LICENSE' line is added containing CRAN's license
information. If possible, this line also includes a suggestion
for a possibly appropriate translation of CRAN's license
information into the name of a pkgsrc license file. This latter
information is provided as a means of expediting the search for
license files to compare with the package's actual license.
o New '# DEPENDS' lines are added reflecting CRAN's idea of dependencies.
+ If distinfo.orig does not exist, it is created from distinfo.
+ distinfo is updated with 'make makesum'.
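
As an illustration of the Makefile changes listed above, here is a minimal
Python sketch. It is not R2pkg's actual code: the function name
update_makefile, the exact form of the '# LICENSE' and '# DEPENDS' comment
lines, and the choice to append them at the end of the file are all
assumptions made for the example.

```python
import re

# Assumed line shapes; real pkgsrc Makefiles may need more careful parsing.
VER_RE = re.compile(r"^(R_PKGVER\s*=\s*)(\S+)")
REV_RE = re.compile(r"^PKGREVISION\s*=")


def update_makefile(text, cran_version, cran_license, cran_depends):
    """Return Makefile text minimally updated from CRAN metadata (hypothetical)."""
    lines = text.splitlines()

    # Find the version currently recorded in the Makefile, if any.
    old_version = None
    for line in lines:
        m = VER_RE.match(line)
        if m:
            old_version = m.group(2)
    version_changed = old_version != cran_version

    out = []
    for line in lines:
        m = VER_RE.match(line)
        if m:
            # Replace only the version, keeping the original spacing.
            line = m.group(1) + cran_version + line[m.end():]
        elif version_changed and REV_RE.match(line):
            # PKGREVISION is dropped only when the version changes.
            continue
        out.append(line)

    # Commented hints for manual review; the comment format is an assumption.
    notes = ["# LICENSE=\t" + cran_license]
    notes += ["# DEPENDS+=\t" + dep for dep in cran_depends]
    for note in notes:
        if note not in out:  # skip hints that are already present
            out.append(note)
    return "\n".join(out) + "\n"
```

In this sketch, applying the function to its own output with the same CRAN
data returns the text unchanged, since the version already matches and the
comment lines are already present.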
A run of the program (currently named R2pkg, but other suggestions are
welcome; e.g., CRAN2pkg?) against pkgsrc-2011Q2 packages reveals that
only these changes are, in fact, made. (I can send anyone interested
the output of cvs diff for review.) The idea is to minimally modify
packages while tracking down all the dependencies, etc. that are
provided by CRAN. This provides information useful for subsequent
manual review (e.g., possibly new licensing terms, possibly new
dependencies, possibly new descriptions, etc.) prior to committing yet
does not invasively modify existing packages. Indeed, when run twice,
packages are unchanged the second time compared with the first.
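
To make the classification in the list above concrete, here is a minimal
Python sketch. The names version_key and classify are hypothetical, and
the comparison assumes versions are purely numeric components separated by
'.' or '-'; a real tool would need to handle whatever version forms pkgsrc
and CRAN actually use.

```python
import re


def version_key(version):
    """Split a version such as '1.2-3' into a tuple of integers (assumed form)."""
    return tuple(int(part) for part in re.split(r"[.-]", version))


def classify(pkgsrc_version, cran_version):
    """Classify a package; None means no package exists on that side."""
    if pkgsrc_version is None:
        return "new"          # on CRAN, but no existing pkgsrc package
    if cran_version is None:
        return "missing"      # in pkgsrc, but no corresponding CRAN package
    pkgsrc_key = version_key(pkgsrc_version)
    cran_key = version_key(cran_version)
    if pkgsrc_key < cran_key:
        return "out-of-date"
    if pkgsrc_key > cran_key:
        return "future"       # should never happen
    return "up-to-date"
```

Comparing integer tuples rather than strings matters here: as strings,
"1.9" sorts after "1.10", which would misclassify an out-of-date package
as a future one.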
I would appreciate any comments on this design. I think this was
largely what the earlier design was actually accomplishing, but
perhaps my description ended up focusing on less salient aspects.
Nevertheless, the current rendition is an improvement, so I appreciate