Re: Bloat

To: tech-kern%netbsd.org@localhost
Subject: Re: Bloat
From: Ignatios Souvatzis <is%netbsd.org@localhost>
Date: Fri, 30 Jan 2009 17:10:30 +0100

On Thu, Jan 29, 2009 at 09:03:53PM +0000, David Laight wrote:

> For non-superscalar cpus and cpus without significant (or any)
> instruction cache, inlining and loop unrolling are probably gains
> (if you can afford the code space).
> So on a vax or 68xxx inlining and unrolling are probably wins.

Be careful... at least the 68060 is very much RISCy in this regard.

Besides being mildly superscalar, the 68060 has a branch target
cache and can hide even a conditional branch completely if it mostly
goes in one direction.  (I verified this while implementing the
new delay loop.)  Given the tiny instruction cache, I'd prefer
calls, as long as the overall code size is smaller.

For really tiny inlines where the compiler can make use of overall
optimizations within the caller, this might be different.
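
To make the distinction concrete, here is a minimal C sketch. It is not
taken from the NetBSD tree: the function names are hypothetical, and the
noinline attribute is a GCC/Clang extension assumed only for illustration.
The tiny helper folds to a constant once inlined with constant arguments,
while the larger routine stays out of line so all callers share one copy
in the instruction cache.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Tiny helper (hypothetical): when inlined with constant arguments,
 * the compiler can fold the whole expression inside the caller.
 * 'align' is assumed to be a power of two.
 */
static inline uint32_t
round_up_pow2(uint32_t x, uint32_t align)
{
	return (x + align - 1) & ~(align - 1);
}

/*
 * Larger routine (hypothetical): kept out of line so every caller
 * shares a single copy of the loop in the instruction cache.
 * __attribute__((noinline)) is a GCC/Clang extension.
 */
__attribute__((noinline)) uint32_t
checksum(const uint8_t *buf, size_t len)
{
	uint32_t sum = 0;

	/* Rotate left by one bit, then mix in the next byte. */
	for (size_t i = 0; i < len; i++)
		sum = ((sum << 1) | (sum >> 31)) ^ buf[i];
	return sum;
}

/* Caller: the helper call can fold to a constant; checksum() is a real call. */
uint32_t
example(const uint8_t *buf, size_t len)
{
	return round_up_pow2(13, 8) + checksum(buf, len);
}
```

Whether the call or the inline wins on a given CPU still has to be
measured; the sketch only shows where each choice would apply.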

The confusion is even worse:

The tiny primary (and only) physically-addressed instruction cache
of the 68060 makes FPU instructions that trap into the kernel and
are emulated by the kernel trap emulation library (necessary for
FSIN & friends, as on the 68040, I think) *faster* than using the
userland version of the emulation library in a 68060-specific libm.
I have the numbers somewhere. Well, I had them: carefully collected
after one week of evening work to build the library, with much
cursing after reading the results.

Regards,
        -is

