tech-pkg archive


Re: science category



On 04/18/17 14:25, Brook Milligan wrote:
On Apr 18, 2017, at 7:54 AM, Jason Bacon <bacon4000%gmail.com@localhost> wrote:

I suspect the reason a science category doesn't already exist is simply lack of critical mass.  That critical mass is coming now, as pkgsrc is growing very quickly.

FYI, pkgsrc has a huge potential impact on scientific computing and research in general.  Most HPC clusters run CentOS, which deliberately uses older compilers, kernels and core libraries for the sake of stability and binary compatibility for commercial software.  This makes it problematic for the latest versions of many open source packages.  I see scientists struggling with this all the time.  Pkgsrc is far and away the best existing solution to this problem and we're starting to raise awareness in the scientific community.  The number of scientific packages is likely to grow at an accelerating rate as more scientists warm up to pkgsrc.

I would like to second Jason’s points above and his concern about pkgsrc categories.  The point of better categories is something that I raised (too) many years ago to more or less the same response.  In the meantime pkgsrc keeps growing and scientific computing packages have no good home.  The biology category is largely an outgrowth of a few of us trying to include bioinformatics software in pkgsrc, and for that biology makes sense as a category.  However, one goal that I and others are now actively working on is uptake of pkgsrc by a broader scientific computing community.  By last count, just Jason and I have over 200 packages that can be added to pkgsrc, many of which are scientific and not necessarily biological in nature.

Before we launch on such an endeavor, it would be useful to have appropriate places to put things so that future re-categorization is not necessary.  Perhaps “science” is too broad.  In that case, the discussion should focus on answering the question: given a set of likely scientific computing packages, what would be an appropriate set of categories?  The answer is not to lump everything into “biology”.

It is also useful to make ample and appropriate use of the CATEGORIES variable in packages.  My understanding is that the first term should be the physical category directory name, but that other keywords may be used in addition.  Improved use of those other keywords and better indexing tools would to some degree dissociate the location from the searching.  Nevertheless, that does not justify putting lots of packages in a directory that is not informative.
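As a rough illustration of what such an indexing tool might look like, here is a minimal Python sketch.  It assumes each package Makefile carries a single `CATEGORIES=` line whose first term is the directory name and whose remaining terms are extra keywords.  The package names, the Makefile fragments, and the use of "science" as a keyword are all made up for the example, and this is not how any existing pkgsrc tool works.

```python
import re
from collections import defaultdict

# Hypothetical Makefile fragments keyed by package path. The package names
# and the "science" keyword are invented for illustration only.
MAKEFILES = {
    "biology/examplealign": "DISTNAME=\texamplealign-1.0\nCATEGORIES=\tbiology science\n",
    "math/examplesolver": "DISTNAME=\texamplesolver-2.3\nCATEGORIES=\tmath science\n",
    "biology/exampletree": "DISTNAME=\texampletree-0.9\nCATEGORIES=\tbiology\n",
}

# Match a "CATEGORIES=" or "CATEGORIES+=" assignment on a single line.
CATEGORIES_RE = re.compile(r"^CATEGORIES[ \t]*\+?=[ \t]*(.*)$", re.MULTILINE)


def parse_categories(makefile_text):
    """Return the whitespace-separated terms of the first CATEGORIES assignment."""
    match = CATEGORIES_RE.search(makefile_text)
    return match.group(1).split() if match else []


def primary_category(makefile_text):
    """Return the first term, assumed here to be the directory name."""
    terms = parse_categories(makefile_text)
    return terms[0] if terms else None


def build_index(makefiles):
    """Map every category keyword to the sorted package paths that list it."""
    index = defaultdict(list)
    for path, text in makefiles.items():
        for keyword in parse_categories(text):
            index[keyword].append(path)
    return {keyword: sorted(paths) for keyword, paths in index.items()}


INDEX = build_index(MAKEFILES)
```

With an index like this, looking up `INDEX["science"]` finds packages regardless of which directory they live in, while `primary_category()` still recovers the physical location.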

Cheers,
Brook

I would add that the lack of a science category might give scientists the impression that pkgsrc offers less than other package managers for research computing. Most other major package managers have a science category (Debian packages, FreeBSD ports, MacPorts, etc.).

Gentoo Portage, on the other hand, has 11 narrower categories from sci-astro through sci-visualization. I worry about this approach, though, because the number of categories required to cover all the sciences would be enormous. It would also cause confusion for software that is not specific to one field, e.g. does molecular dynamics software belong under physics, chemistry, or engineering? I would only argue for narrower categories where there's a large number of packages that clearly belong (like biology). MacPorts, BTW, has only 13 packages remaining under biology, most of which are cross-listed under science, which has 1247.

--
Earth is a beta site.

