Subject: Re: pkgsrc progress bar?
To: =?ISO-8859-1?Q?Timo_Sch=F6ler?= <eclipser23@web.de>
From: Dmitri Nikulin <setagllib@optusnet.com.au>
List: tech-pkg
Date: 01/22/2005 21:15:40
Timo Schöler wrote:

>
> you write that you /could/ develop a package management system which 
> is more sophisticated than pkgsrc.
>
Technically yes, there's no magic, just getting a lot of possibilities 
out there... it's something anyone with some free time can do. I hope.

But it's not worth it right now. The system DFly devs are working on 
will probably be better anyway, and portability is a goal too. I talked 
to corecode himself about it and read his paper, and while not all of 
the details are decided yet, it will take the best of the best features 
and hopefully have a very good final product. It's a BSD project: 
they're not exactly known for getting things wrong :)

Only if that flops and I end up having a LOT of time on my hands am I 
likely to try authoring a package manager, and even that might fail 
miserably.

The one thing that's questionable is language: I'm a C purist, and if
it's to fit comfortably into BSD base systems (i.e. not need runtime
environments, interpreters, etc.), it should be written in C. C++ is
questionable, since it's usually seen only in the GNU projects, but
it's also possible; if the design for the packaging system needs it I
can do that too. But I'd prefer a nice clean C implementation.

corecode said "Python would be fine" so they don't seem to care about
dependencies, or even chicken/egg problems (python needing to be
installed with the package manager that needs it to run; compare
fetching the FreeBSD Ports tree with cvsup, which is only available in
the ports tree! Man I love NetBSD). I don't see the advantages of a
scripting language for this task; they're usually meant for rapid
development and maintenance, but in the long run a well-written C/C++
program will win. This is especially true for portability (at least
for POSIXy platforms). There's not much in here that even requires
stepping outside the standard C library, except readdir() and its
brothers for traversing the tree.
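
To illustrate how little is needed beyond the standard library, here is
a minimal sketch of such a tree walk using only opendir(), readdir(),
closedir() and stat(). The function name count_files() and the fixed
4096-byte path buffer are my own assumptions for the example, not part
of any existing tool. Since stat() follows symlinks, a link pointing
back up the tree is only stopped by the path-length check.

```c
#include <dirent.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>

/* Recursively count regular files under root.
 * Returns -1 if root itself cannot be opened as a directory.
 * count_files() is a hypothetical helper for illustration only. */
long count_files(const char *root)
{
    DIR *dir = opendir(root);
    struct dirent *ent;
    long total = 0;

    if (dir == NULL)
        return -1;

    while ((ent = readdir(dir)) != NULL) {
        char path[4096];
        struct stat st;
        int len;

        /* Skip the self and parent entries so we don't loop forever. */
        if (strcmp(ent->d_name, ".") == 0 || strcmp(ent->d_name, "..") == 0)
            continue;

        len = snprintf(path, sizeof path, "%s/%s", root, ent->d_name);
        if (len < 0 || (size_t)len >= sizeof path)
            continue;               /* path would be truncated: skip it */

        if (stat(path, &st) != 0)
            continue;               /* vanished or unreadable: skip it */

        if (S_ISDIR(st.st_mode)) {
            long sub = count_files(path);
            if (sub > 0)
                total += sub;       /* unreadable subdirectories add nothing */
        } else if (S_ISREG(st.st_mode)) {
            total++;
        }
    }

    closedir(dir);
    return total;
}
```

Calling count_files("/usr/pkgsrc") would then return the number of
regular files in the tree, or -1 if the directory can't be opened.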

The idea should be that the new system never needs so many resources
that it can't run somewhere that NetBSD itself does.
pkgsrc fits this bill just by being a conservative design. A system with 
more advanced features needs more resources, but the pure C nature and 
very careful management should keep it within safe bounds. If we can 
tolerate gcc's bloat, we can tolerate anything.

One other often overlooked factor is redundancy. Redundancy in the 
'ports' files leads to an unnecessarily large tree. But avoiding 
redundancy can mean overly clever code which can get hairy, or requiring 
too many quirks if something doesn't fit the usual template.

Of course in terms of functionality, it shouldn't be any less featureful
than Portage. However, Portage's architecture division is narrow-minded
since it's based on Linux's portability ("the processor is everything;
what do you mean, busses?"), but basing the new system on NetBSD portability
("the 'universal' machine configuration is more important than the 
components") might lead to way too many ports being supported and being 
meaningless outside of NetBSD. In this case, it might make sense to 
stick with Portage's way, but be able to define processor 'wildcards' 
and such things that would allow easy/implicit expansion. This would 
lead to different USE (or whatever we call it) flags and different base 
patches, and different levels of testing (the last I consider silly:
properly written software has the same bugs on any architecture). Two 
levels should be enough (tested and raw), but for root's sake let's test 
the raw stuff before committing it, unlike Portage maintainers do.
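
One cheap way to get processor 'wildcards' is shell-style globbing via
POSIX fnmatch(). The sketch below is only an assumption about how it
could look: arch_matches() and the pattern strings ("i?86", "sparc*")
are made up for illustration, and a real design might want something
richer than globs.

```c
#include <fnmatch.h>
#include <stddef.h>

/* Return 1 if arch matches any shell-style pattern in the list, else 0.
 * arch_matches() is a hypothetical helper; the patterns a 'port' would
 * actually list are a design decision not made here. */
int arch_matches(const char *const *patterns, size_t npatterns,
                 const char *arch)
{
    size_t i;

    for (i = 0; i < npatterns; i++) {
        /* fnmatch() returns 0 when the string matches the pattern. */
        if (fnmatch(patterns[i], arch, 0) == 0)
            return 1;
    }
    return 0;
}
```

A port that lists { "i?86", "x86_64" } would then implicitly cover
i386, i486, i586 and i686 without naming each one.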

One other thing that I just noticed would be handy is an easy, 
non-hackish way to look in more than one tree for the intended software. 
Portage has 'override' trees which come close, but it's still only a 
second level, and for all we know certain vendors or distributions might 
want to maintain their own trees and have our 'central' tree be second 
in priority. Or vice versa. And this should scale up gracefully to any 
number of 'layers'.
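
The layered lookup could be as simple as walking an ordered list of
tree roots and taking the first one that has the package. A rough
sketch, assuming each layer is a directory on disk and a package is a
subdirectory of it; find_in_layers() and the layout are hypothetical:

```c
#include <stdio.h>
#include <sys/stat.h>

/* Search the layers in priority order (index 0 wins) for a package
 * directory such as "editors/vim". On success, write its full path to
 * out and return the index of the layer that supplied it. Return -1
 * (and an empty string in out) if no layer has it.
 * find_in_layers() is a hypothetical helper for illustration only. */
int find_in_layers(const char *const *layers, int nlayers,
                   const char *pkg, char *out, size_t outsz)
{
    int i;

    for (i = 0; i < nlayers; i++) {
        struct stat st;
        int len = snprintf(out, outsz, "%s/%s", layers[i], pkg);

        if (len < 0 || (size_t)len >= outsz)
            continue;               /* path would be truncated: skip layer */
        if (stat(out, &st) == 0 && S_ISDIR(st.st_mode))
            return i;               /* first (highest-priority) hit wins */
    }

    if (outsz > 0)
        out[0] = '\0';
    return -1;
}
```

Putting a vendor tree before or after the 'central' tree is then just a
matter of ordering the array, and it scales to any number of layers.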

Now one last [for now] important aspect is distribution. CVS, rsync, 
what? I'm inclined to go with CVS since it's in EVERY BSD base package, 
but on the other hand it has a few fundamental design and implementation 
flaws which worry me. rsync never appealed to begin with, but someone 
else might be able to explain why it's good, if it is. But then, there's 
no real reason not to make things even more efficient by developing an
internal distribution system and providing a server option: it wouldn't
need much power, essentially just comparing a checksum and sending a new 
file if anything is different. Or if any file is missing. And the option 
to force deletion of files that were deleted from the server. It isn't 
necessary to do the whole revision or branch thing like in CVS: that's 
provided already by the version and testing flags. It's more work to 
store every single deletion, but then, we can have implicit deletion if 
something is NOT in the other tree (and rely on a secondary local tree 
for user additions).
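
The per-file decision described above fits in a few lines. This is only
a sketch: FNV-1a stands in for whatever checksum the real system would
pick (it is NOT collision-resistant, so something stronger would be
needed against tampering), and the names fnv1a_file(), sync_decide()
and the sync_action enum are invented for the example.

```c
#include <stdint.h>
#include <stdio.h>

#define FNV_OFFSET 2166136261u   /* standard 32-bit FNV offset basis */
#define FNV_PRIME  16777619u     /* standard 32-bit FNV prime */

enum sync_action { SYNC_KEEP, SYNC_FETCH, SYNC_DELETE };

/* Fold n bytes into a running 32-bit FNV-1a hash. */
static uint32_t fnv1a_update(uint32_t h, const unsigned char *p, size_t n)
{
    size_t i;

    for (i = 0; i < n; i++) {
        h ^= p[i];
        h *= FNV_PRIME;
    }
    return h;
}

/* Checksum a whole file. Returns 0 and stores the sum on success,
 * -1 if the file cannot be opened or read. */
int fnv1a_file(const char *path, uint32_t *sum)
{
    unsigned char buf[4096];
    uint32_t h = FNV_OFFSET;
    size_t n;
    int failed;
    FILE *f = fopen(path, "rb");

    if (f == NULL)
        return -1;
    while ((n = fread(buf, 1, sizeof buf, f)) > 0)
        h = fnv1a_update(h, buf, n);
    failed = ferror(f);
    fclose(f);
    if (failed)
        return -1;
    *sum = h;
    return 0;
}

/* Decide what the client should do with one file:
 * fetch it if missing or different, delete it if the server no longer
 * has it (implicit deletion), otherwise leave it alone. */
enum sync_action sync_decide(int on_server, uint32_t server_sum,
                             int on_client, uint32_t client_sum)
{
    if (!on_server)
        return on_client ? SYNC_DELETE : SYNC_KEEP;
    if (!on_client || server_sum != client_sum)
        return SYNC_FETCH;
    return SYNC_KEEP;
}
```

The server side would only need to publish a path and checksum per
file; the client runs sync_decide() over the union of both listings.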

And to make things much easier for developers, an internal system for 
generating skeleton 'ports' and then submitting finished works for 
inclusion in a tree (this can slot right in to the client/server 
communication system) is a good idea too: it would also remove the need
for mailing lists and PRs, which I always found overly complicated and
error-prone, especially if you don't run a local mail server and have to
use either a badly designed mail client (Thunderbird) or a web
interface; this includes most of the 'next generation' of developers. I
maintained two ports for FreeBSD (one of a package I wrote myself,
another which was mysteriously missing) and updated yet another twice,
and found the PR system unwieldy. That you have to use shar archives to
submit new ports, then unified diffs thereafter, is way too much work for
one-liner changes. If the software handles this it's much easier. Just
imagine: you are marked as the maintainer of a port, and after that, 
nobody even has to moderate your work on it. Of course a team to audit 
changes to ensure nothing is an obvious backdoor is essential: but look 
how long AfterStep has survived while sending postcards to its developers.

Wow, lot there. I might write this system in spare time just for fun, 
even if it never gets off the ground.