Subject: Re: Bzip versus Gzip
To: None <netbsd-users@netbsd.org>
From: Peter I. Hansen <pih@xbase.dk>
List: netbsd-users
Date: 07/07/2005 12:02:47
Martijn van Buul wrote:
> [Hmm, I didn't recall mailing this to the list, but that's ok]
> 
> It occurred to me that Peter I. Hansen wrote in gmane.os.netbsd.general:
> 
>>Martijn van Buul wrote:
>>
>>>On Tue, Jul 05, 2005 at 10:24:37PM +0200, Peter I. Hansen wrote:
>>> 
>>>
>>>
>>>>But, if you want to save space and bandwidth, like Zafer, bzip 
>>>>is the way to go.
>>>
>>>
>>>Actually, it's not. If you *really* don't care about the amount of CPU
>>>cycles spent on compressing/decompressing, you could always try p7zip; 
>>>it's in archivers/p7zip in pkgsrc.
>>>
>>>Compressing the same testfile takes a staggering 5 minutes 41 seconds,
>>>but it compresses it down to 40MB - 24% of the original. *THAT* is what
>>>I call a considerable improvement over gzip. And the best part has yet
>>>to come: Extraction takes only 15 seconds. Still a lot more than gzip, but
>>>considerably less than bzip2!
>>>
>>
>>That's quite good. So, if I read your numbers correctly, the p7zip
>>compression saves us 24 MB at the expense of 8.8 seconds in
>>extra decompression time.
> 
> 
> Well, "at a factor of 3 more expansion time" would be more correct. And, I
> compressed only binaries, while there's also text- and configuration 
> files which might skew things a bit. And the extra burden during build
> is considerable, too, even more than bzip2.
> 

I tested with pkgsrc.tar. With p7zip versus gzip I save around
7 MB. The decompression time for gzip is 18 seconds on my 400
MHz Pentium II, and for p7zip 33 seconds. I would gladly
sacrifice this CPU time in favor of the smaller download.
But that's just me.

> 
>>I have no idea what the average CPU is like, but I think using
>>that kind of compression for distributing packages makes sense
>>in terms of bandwidth.
> 
> 
> Unfortunately, p7zip uses a horrible commandline interface, totally
> unlike bzip2/gzip. And I'm not convinced it's very portable.
> 

I totally agree with you on the interface, but I think that a
small shell script could make it more gzip-like.
As for portability, I simply don't know...
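
On the interface point, here is a rough sketch of what such a wrapper
could look like. It is written in Python rather than sh, and it is
untested against any particular p7zip version. It assumes the binary is
called 7za and is on the PATH; the function names (squeeze, unsqueeze)
are made up for illustration. It relies only on the 7-Zip "a" (add) and
"x" (extract) commands and the -o output directory switch.

```python
import os
import subprocess

SEVENZIP = "7za"  # assumption: name of the p7zip binary on PATH


def compress_cmd(path):
    """Build the command that packs `path` into `path`.7z."""
    return [SEVENZIP, "a", path + ".7z", path]


def decompress_cmd(archive):
    """Build the command that unpacks `archive` next to itself."""
    out_dir = os.path.dirname(archive) or "."
    return [SEVENZIP, "x", "-o" + out_dir, archive]


def squeeze(path):
    """gzip-style: create path.7z, then remove the original on success."""
    subprocess.run(compress_cmd(path), check=True)
    os.remove(path)


def unsqueeze(archive):
    """gunzip-style: unpack the archive, then remove it on success."""
    subprocess.run(decompress_cmd(archive), check=True)
    os.remove(archive)
```

With this, squeeze("pkgsrc.tar") would behave roughly like
"gzip pkgsrc.tar": one file in, one compressed file out, original gone.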

> 
>>If CPU cycles are everything to you, I'm with you all the way on
>>gzip.
> 
> 
> I'm not saying it's *everything*. But it's certainly an argument, as
> far as I'm concerned.
> 

I think CPU time and bandwidth are both arguments. Which matters
more depends on whether you are the server or the user, and also
on what kind of equipment you have.
If you serve files that are downloaded a lot, then maybe you
prefer making these files smaller.
If you sit on a PII 400 MHz with a 100 Mbit/s connection, then
gzip is nice.
If you are a home user with 256 kbit/s DSL, download time matters.
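
To put rough numbers on that trade-off, here is a back-of-the-envelope
sketch using the figures from my pkgsrc.tar test (about 7 MB saved,
33 versus 18 seconds to decompress). It assumes 1 MB = 10^6 bytes, that
the link is fully used, and that protocol overhead can be ignored, so
treat the results as ballpark only. The function names are just for
illustration.

```python
# Figures from the pkgsrc.tar test; MB is assumed to mean 10**6 bytes.
SAVED_BYTES = 7 * 10**6       # p7zip archive is about 7 MB smaller than gzip
EXTRA_CPU_SECONDS = 33 - 18   # extra decompression time on the PII 400 MHz


def net_seconds_saved(link_bits_per_second):
    """Download time saved by the smaller file minus the extra unpack time."""
    download_saved = SAVED_BYTES * 8 / link_bits_per_second
    return download_saved - EXTRA_CPU_SECONDS


def break_even_bits_per_second():
    """Link speed at which the smaller download and slower unpack cancel out."""
    return SAVED_BYTES * 8 / EXTRA_CPU_SECONDS


for name, speed in [("256 kbit/s DSL", 256_000),
                    ("100 Mbit/s LAN", 100_000_000)]:
    print(f"{name}: {net_seconds_saved(speed):+.1f} s")
print(f"break-even: {break_even_bits_per_second() / 1e6:.2f} Mbit/s")
```

Under these assumptions the DSL user comes out more than three minutes
ahead with p7zip, the 100 Mbit/s user loses about 14 seconds, and the
break-even point is somewhere around 3.7 Mbit/s.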

Maybe disk space is cheap enough to serve both gz and 7z files
so the user can choose? I have no idea :)