pkgsrc-Users archive


Re: Binary packages for scientific research on RHEL/CentOS

On 05/16/17 18:44, Kamil Rytarowski wrote:
On 17.05.2017 01:09, Jason Bacon wrote:
Binary packages aimed at research computing and HPC are now available
for RHEL/CentOS 6 and 7:

These are more or less static package sets with a different prefix for
each quarterly build, e.g. /sharedapps/pkg-2017Q1. This layout allows
researchers to use the same package versions for the duration of
long-term studies (this is critical in many research projects) while
newer packages can be deployed alongside them on the same system.
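As a rough illustration of the per-quarter prefix layout described above, here is a minimal Python sketch. Only the /sharedapps/pkg-2017Q1 path comes from the message; the function names, the "/sharedapps" default, and the idea of putting the prefix's bin and sbin directories first on PATH are assumptions made for the example, not part of the announced setup.

```python
import os


def quarterly_prefix(year, quarter, base="/sharedapps"):
    """Return the install prefix for one quarterly package set.

    The "<base>/pkg-<year>Q<quarter>" pattern follows the example in the
    message; the default base directory is an assumption.
    """
    if quarter not in (1, 2, 3, 4):
        raise ValueError("quarter must be 1, 2, 3, or 4")
    return f"{base}/pkg-{year}Q{quarter}"


def pinned_environment(prefix, env=None):
    """Return a copy of env with the prefix's bin and sbin dirs first on PATH.

    Hypothetical helper: a study pinned to one quarterly build would run
    its tools with this environment, while newer builds stay installed
    under their own prefixes.
    """
    env = dict(os.environ if env is None else env)
    dirs = [f"{prefix}/bin", f"{prefix}/sbin"]
    old_path = env.get("PATH", "")
    env["PATH"] = os.pathsep.join(dirs + ([old_path] if old_path else []))
    return env


if __name__ == "__main__":
    prefix = quarterly_prefix(2017, 1)
    print(prefix)
    print(pinned_environment(prefix, {"PATH": "/usr/bin"})["PATH"])
```

Because each build lives under its own prefix, switching a study to a newer package set would only mean choosing a different year and quarter; the older tree is left untouched.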

I also posted instructions for our pbulk setup, which uses a pristine OS
installation image in a chroot env.  Many thanks to everyone who offered
advice as I was experimenting with pbulk.

Thank you for your involvement; this looks great!

I'm hoping the availability of these packages will spur a new level of
interest in pkgsrc within the research computing community.  I know for
a fact that pkgsrc has enormous potential here, as I've watched many
researchers struggle with software installations for the past 17 years,
especially on enterprise Linux, which has a virtual monopoly in HPC.

As far as I have been able to find out, my local city's TOP500 HPC
cluster uses a mainline Linux kernel, and the userland distribution...
is built from scratch with tuned configure options. This sounds like a
massive waste of manpower; it would be better to add options to pkgsrc
packages, tune mk.conf, and just generate prebuilt binary packages
quarterly.

A blog post - with success stories - would help to convince them and
others to research the pkgsrc option.

Another market for similar setups is the corporate market with long-term
application support. People keep deploying setups on CentOS|RHEL servers
that are 2-3 major versions behind, and developing and installing
software on decade-old Linux distributions is usually done manually.

With this infrastructure in place, the work of creating many new
scientific packages will now become our primary focus.

Comments and contributions are always welcome!

It is worth preparing a post on the TNF blog about pkgsrc on HPC
computers / CentOS|RHEL setups. There are already a few users out there;
I'm aware of NASA and Joyent.



To put the waste of manpower in perspective: There are cases where literally thousands of people in research around the world are spending 10 or 20 hours each trying to install the same software. These are brilliant people who should be spending that time doing research in their field. Most of them would rather not be struggling with IT tasks anyway, but they're just not aware of a better way.

In other cases, people are using a mishmash of different automated installation methods, such as yum, pip, and virtualenv, or using containers for the sole purpose of isolating software with esoteric build systems or bundled dependencies. It's a real mess. Some of us are concerned about the containerization trend leading to the de-evolution of scientific software. If software is always isolated and doesn't have to play nice with others, the developers have no motivation to clean it up or make it more portable.

If we can get scientists on board with the idea of using pkgsrc to install most of their software, that trend can be reversed. So, I'm hoping to make pkgsrc the fastest and easiest method of deployment on the mainstream platforms in HPC. We're already there for many things, most importantly for the tools that are often hurdles on EL, such as newer compilers and interpreters. Thankfully, the pkgsrc gcc and clang packages are working well on CentOS now.

The main task from here on is populating pkgsrc with as many scientific packages as possible. If people can install most of what they need with a simple "pkg_add" or "pkgin install", then we can stop flushing man-years of extremely valuable time down the toilet and scientific discovery will be accelerated.



Earth is a beta site.
