Subject: Re: pkg_summary
To: Jeremy C. Reed <firstname.lastname@example.org>
From: Aleksey Cheusov <email@example.com>
Date: 06/13/2007 00:09:29
> I can write a cron job to check for these. (But I still don't understand
> the FTP layout for the packages, as some have an overlay and are
> available in two places but some are only available in one place.)
mk/bulk/upload uses the following:
ls -t | grep '\.t[gb]z$' | while read n; do pkg_info -X "$n"; done
Hint: xargs makes this 12.2% faster, shorter, and even simpler.
12% for nothing! ;-)
0 All>time ls -t | grep '\.t[gb]z$' | while read n; do pkg_info -X "$n"; done >/tmp/summary1
98.02s real 72.08s user 19.75s system
0 All>time ls -t | grep '\.t[gb]z$' | xargs pkg_info -X >/tmp/summary2
86.07s real 69.98s user 15.27s system
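A minimal sketch of the difference between the two pipelines. Here echo stands in for pkg_info -X and the three file names are made up, both purely for illustration. The loop starts one process per file, while xargs passes many names to a single invocation, which is plausibly where the saving comes from.

```sh
# Hypothetical package list; in the real case this is ls -t | grep '\.t[gb]z$'.
list() { printf '%s\n' a.tgz b.tbz c.tgz; }

# One command invocation per file name (echo stands in for pkg_info -X).
per_file() { list | while read -r n; do echo "run: $n"; done; }

# One command invocation for the whole batch of names.
batched() { list | xargs echo "run:"; }
```

per_file prints three "run:" lines, batched prints a single one. xargs splits its input on whitespace, which is fine as long as package file names contain none.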
> As for updating or rebuilding -- Maybe we can make a script that removes
> non-existent data from pkg_summary and adds new data. That should be way
> faster than creating entire pkg_summary each time.
Rebuilding the entire index is fast enough.
Building an index for 700 .tbz packages (bzip2 is slower than gzip) on
my 5-year-old machine (!!!) takes less than 100 seconds, see the timings above.
For the entire repository (fewer than 7000 packages) it will take
less than 1000 seconds, i.e. less than 17 minutes.
17 minutes per day on an 800 MHz machine! ;-)
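The estimate is a plain linear extrapolation from the measured run. As a sketch, assuming the time per package stays roughly constant across the repository (estimate_secs is a hypothetical helper name):

```sh
# estimate_secs SAMPLE_PKGS SAMPLE_SECS TOTAL_PKGS
# Scales the measured time linearly to the full package count.
estimate_secs() {
    echo $(( $2 * $3 / $1 ))
}

estimate_secs 700 100 7000    # prints 1000
```

1000 seconds is 16 minutes 40 seconds, hence "less than 17 minutes".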
Also note that most repositories are not updated most of the time, so a check like
(test "`ls -t | head -1`" -nt pkg_summary.gz) may help to skip needless rebuilds.
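A sketch of how that check could be wrapped for a cron job. needs_rebuild is a hypothetical helper, not part of mk/bulk/upload. It assumes the summary lives next to the packages as pkg_summary.gz and that the shell's test supports -nt (not required by POSIX, but common).

```sh
# needs_rebuild DIR
# Succeeds (exit 0) if DIR/pkg_summary.gz is missing or older than the
# newest *.tgz / *.tbz file in DIR; fails otherwise.
needs_rebuild() {
    dir=$1
    summary=$dir/pkg_summary.gz

    # No summary yet: it has to be built.
    [ -f "$summary" ] || return 0

    # Newest package file by modification time; the summary itself is filtered out.
    newest=$(cd "$dir" && ls -t | grep '\.t[gb]z$' | head -1)

    # No packages at all: nothing to rebuild from.
    [ -n "$newest" ] || return 1

    [ "$dir/$newest" -nt "$summary" ]
}
```

It could then guard the xargs pipeline shown earlier, e.g. "if needs_rebuild .; then ...; fi" run from the package directory.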
Best regards, Aleksey Cheusov.