tech-pkg archive


Re: R packages

On Sun, Oct 09, 2011 at 12:41:44PM -0600, Brook Milligan wrote:
> I end up using a lot of R packages and tend to find that they are not
> yet in pkgsrc.  Over the years I have either added them or created
> them privately, but that has become tedious enough that I have now
> written a tool to automatically create them (see the current version
> of the man page below).  I would like to commit this as pkgtools/R2pkg
> at some point in the future.  However, I would like to seek some
> consensus on how we wish to handle R packages into the future, because
> this tool will (I hope) make it easier to import and maintain R
> packages within pkgsrc so there may very well be _many_ new ones.


> First, all R packages are currently under the "math" category.  I
> propose creating a new category "R" and moving them all.  This will
> better separate R packages (which are in fact not all math related)
> and will prevent overwhelming the legitimate math packages with more R
> packages.

Roughly how many packages are we talking about?
(What is R if not a statistics tool, in other words, math?)

> Second, I would appreciate help on license handling for R packages.
> Most R packages have an indication of the license that applies.  I
> have built into the tool a map between those descriptions and the
> known licenses in pkgsrc.  In some cases, however, the mapping is
> unclear.  For example, how would "BSD" map given the several variants
> of the BSD license?

You have to find out which of the BSD licenses it is and use the
appropriate one. There are at least:
  modified-bsd (3 clauses)
  original-bsd (4 clauses)

> And, how would "GPL (>= 2)" map?  Please see the
> current set of known licenses below and offer any suggestions.

AFAIK, we haven't really marked them differently yet; they are
currently marked as gnu-gpl-v2.
"More proper" would probably be "gnu-gpl-v2 OR gnu-gpl-v3", but only
until GPLv4 comes along.

> Finally, as the command line arguments are currently set up, the
> normal case is to use both --fetch and --recurse.  Should I make these
> the default cases and have corresponding --no-* options to turn them
> off?

I think fetch should definitely be the default.
I have no particular opinion on --recurse. In your usage, have you
found that you use it all the time?
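
To make the "default on, --no-* to turn off" idea concrete, here is a
minimal sketch in Python. It is not how R2pkg parses its arguments (I
have not seen the code); the build_parser name is made up, and leaving
--recurse off by default is only a placeholder since that question is
still open.

```python
import argparse


def build_parser():
    """Hypothetical option parser; only the flag names come from the thread."""
    parser = argparse.ArgumentParser(prog="R2pkg")
    # BooleanOptionalAction (Python 3.9+) generates both --fetch and
    # --no-fetch from a single declaration.
    parser.add_argument("--fetch", action=argparse.BooleanOptionalAction,
                        default=True, help="download DESCRIPTION files")
    # Assumption: off by default here only because the default is undecided.
    parser.add_argument("--recurse", action=argparse.BooleanOptionalAction,
                        default=False, help="also process dependencies")
    parser.add_argument("packages", nargs="+", help="CRAN package names")
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(args.fetch, args.recurse, args.packages)
```

With this layout a plain "R2pkg foo" fetches, and "R2pkg --no-fetch foo"
does not.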

>      Every package listed on the command line should correspond to the name of
>      an R package found on the Comprehensive R Archive Network (CRAN; see
>  For each one, R2pkg downloads the corresponding DESCRIPTION file,
>      parses it, and recursively does the same for all dependencies.
>      Subsequently, R2pkg creates or updates the pkgsrc(7) packages for each
>      of the listed packages.  If a package already exists, the corresponding
>      files are renamed with .orig extensions prior to creating new ones.  As
>      a result, the two versions may be compared to make certain that no
>      manual edits are lost in the process.  Note that pre-existing files
>      with .orig extensions will be silently overwritten.

So if you run it twice, or if two packages on the command line depend on
the same package, the original file will be lost?
(I wouldn't modify it if an ".orig" file already exists; but then we
need a way to remove the .orig files semi-automatically again, perhaps
"R2pkg --remove-orig", to be run when you are finished committing.)
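
Roughly what I mean, as a Python sketch. This reads my suggestion as
"leave an existing .orig alone, so the first backup survives repeated
runs". The function names backup_once and remove_orig are invented for
illustration, and --remove-orig is only a proposed option, not something
R2pkg has today.

```python
import os


def backup_once(path):
    """Rename path to path.orig unless a backup already exists.

    Returns True if a backup was made, False otherwise.
    """
    orig = path + ".orig"
    if not os.path.exists(path):
        return False  # nothing to preserve
    if os.path.exists(orig):
        return False  # keep the first backup; it holds the manual edits
    os.rename(path, orig)
    return True


def remove_orig(top):
    """Delete every *.orig file below top and return the removed paths."""
    removed = []
    for dirpath, _dirnames, filenames in os.walk(top):
        for name in filenames:
            if name.endswith(".orig"):
                full = os.path.join(dirpath, name)
                os.remove(full)
                removed.append(full)
    return sorted(removed)
```

The cleanup step is destructive, so it should only be run once the
regenerated packages have been compared and committed.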

>                        BSD : no corresponding pkgsrc license
> GNU General Public License : no corresponding pkgsrc license
>                        GPL : no corresponding pkgsrc license

I'd rather mention the possibilities (see above for BSD, "gnu-gpl-v2
or gnu-gpl-v3" for the other two).
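
As a sketch of what "mention the possibilities" could look like, here is
a small Python example. The tables are illustrative only and are not
R2pkg's actual map; the names EXACT, CANDIDATES and describe_license are
made up, and treating "GPL (>= 2)" as "gnu-gpl-v2 OR gnu-gpl-v3" is just
the tentative suggestion from earlier in this mail.

```python
# Illustrative tables only; the real R2pkg map may look quite different.
# Keys are R DESCRIPTION "License:" strings, values are pkgsrc license names.
EXACT = {
    "GPL-2": "gnu-gpl-v2",
    "GPL-3": "gnu-gpl-v3",
    "GPL (>= 2)": "gnu-gpl-v2 OR gnu-gpl-v3",
}

# Descriptions that cannot be resolved without reading the package's
# actual license text; list the candidates instead of giving up.
CANDIDATES = {
    "BSD": ["modified-bsd", "original-bsd"],
    "GPL": ["gnu-gpl-v2", "gnu-gpl-v3"],
    "GNU General Public License": ["gnu-gpl-v2", "gnu-gpl-v3"],
}


def describe_license(r_license):
    """Return one report line for an R license description."""
    key = " ".join(r_license.split())  # collapse stray whitespace
    if key in EXACT:
        return f"{key} : {EXACT[key]}"
    if key in CANDIDATES:
        return f"{key} : ambiguous, one of: {', '.join(CANDIDATES[key])}"
    return f"{key} : no corresponding pkgsrc license"


if __name__ == "__main__":
    for name in ("BSD", "GPL (>= 2)", "Unlimited"):
        print(describe_license(name))
```

The packager still has to pick the right variant by hand, but at least
the report says where to look.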

>                  Unlimited : no corresponding pkgsrc license

What's that?

