tech-pkg archive

Re: mk/pbulk/pbulk.sh



  Hello,

Havard Eidnes <he%NetBSD.org@localhost> writes:

> I was recently made aware of the mk/pbulk/pbulk.sh script, and
> have used it to prepare a pbulk bulk build.
>
> While it's nice, it could be better:
>
>  * If you've run it before and have a populated /usr/pbulk with
>    packages where the pkgdb is in /usr/pbulk/var/db/pkg, the
>    script will fail, leaving the user to clean up manually and
>    he returns to square zero.
>  * If the pbulk.sh script failed (e.g. as above), it'll leave
>    behind /tmp/work-pbulk, but a re-run of pbulk.sh will, instead
>    of cleaning up the mess, refuse to run, leaving the user to do
>    a manual cleanup and he returns to square zero.
>  * If the main package installation in /usr/pkg contains any of
>    the packages the pbulk.sh script wants to install, the script
>    will fail, again leaving the user to clean up manually and he
>    returns once again to square zero.
>
> Add to this that it can take a considerable amount of time
> (especially on some of my slower hosts) for any of these failures
> to present themselves, and using it can be a pretty frustrating
> experience.
>
> Is there a good reason the script can't check a bit more about
> the preconditions which have to be met before it delves into a
> lengthy build which is doomed to fail?  Or that it can't
> automatically mop up after its own failures?

Yes, there's a good reason.

This script is deliberately written to be minimal and clear, so that it
can be easily read, understood, and perhaps reused in a more complex setup.
If you start adding more checks to it, you'll make it better for you
personally but complicate things for others. Most likely a user of this
script is going to introduce his own improvements, so keeping the
script simple is better than providing features. Sadly, sh is a horrible
programming language, so adding even a minor feature makes the code
disproportionately harder to understand. This is especially so for the
intended audience, which is mostly sysadmins and/or programmers who
generally lack sh skills. For instance, I have a variation of the script
that memoizes pbulk tools, but it is clumsier.

I'd like to keep the current goals for that script.

Those were the general considerations; now on to the particulars.

Cleanup isn't easy. If deployment of the pbulk tools fails, you actually
want to analyse the problem, hence you want to keep all work directories
after exit. You only need to clean up once you have worked it out.
The script contains instructions on how to clean up after each logical step.

Another consideration is that I expect this script to be used within
an isolated environment such as a chroot, a jail or a similar solution,
up to a virtual machine. Alright, I admit that some operating systems
make setting up such an environment more complex, yet I think that it
shouldn't be a problem given the current state of technology.
As a consequence, I find it more important to keep the script easy and
clear than to provide support for cleanup after various failures.

As for idempotence, I don't think that it is actually needed or wanted.
It isn't easy to check whether a rerun is going to set up the same
environment in the operational sense. I find it a lot better to fail early
than to try to go on and fail in some mysterious way. This is amplified by
the nature of a bulk build, which is quite a heavy process even for
relatively small builds. Besides, an idempotence hack is trivial to
implement if you really want it. If you want more safety, would checking
preconditions suffice for you? I think that's the most we can do without
changing the initial assumptions and considerations.
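
To make the question concrete, here is a minimal sketch of what such a
precondition check could look like. It is not part of mk/pbulk/pbulk.sh:
the function name check_preconditions and its two optional arguments are
hypothetical, and the default paths are simply the ones mentioned earlier
in this thread (/usr/pbulk and /tmp/work-pbulk). It only reports problems
and never removes anything. The third case (packages already present in
/usr/pkg) is left out, because it depends on the list of packages the
script installs.

```shell
#!/bin/sh
# Sketch only: pre-flight checks to run before a lengthy pbulk bootstrap.
# Hypothetical helper, not part of mk/pbulk/pbulk.sh.

check_preconditions() {
    prefix=${1:-/usr/pbulk}        # assumed pbulk prefix
    workdir=${2:-/tmp/work-pbulk}  # assumed leftover work directory
    problems=0

    # A populated pkgdb means an earlier run already installed packages here.
    if [ -d "$prefix/var/db/pkg" ] &&
       [ -n "$(ls -A "$prefix/var/db/pkg" 2>/dev/null)" ]; then
        echo "error: $prefix/var/db/pkg is already populated" >&2
        problems=$((problems + 1))
    fi

    # A leftover work directory means a previous run failed or was interrupted.
    if [ -e "$workdir" ]; then
        echo "error: $workdir exists; inspect it, then remove it" >&2
        problems=$((problems + 1))
    fi

    # Succeed only if no problem was found.
    [ "$problems" -eq 0 ]
}
```

It could be called once near the top of the script, for example as
"check_preconditions || exit 1", so that a doomed run stops within
seconds and the existing directories are left untouched for analysis.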

Thank you for your feedback!


-- 
HE CE3OH...


