Subject: Userland "cc -pg" (gprof) profiling: working well on amd64?
To: None <tech-kern@netbsd.org, port-amd64@netbsd.org>
From: Jonathan Stone <jonathan@dsg.stanford.edu>
List: port-amd64
Date: 03/21/2006 20:19:35
Is profiling of userland code expected to work on amd64?
On either -current or netbsd-3.0?

I ask because I've been trying to profile some of my own old code, in
hopes of better understanding a 2:1 difference in user-CPU time
between recent Linux distros and NetBSD (to NetBSD's disadvantage), on
one of my own private, long-running apps.

I'm consistently seeing that the total user time reported by gprof on
NetBSD is only (very coarsely) 30% of the runtime reported by time
(csh or /usr/bin/time). After some brute-force ad hoc VM tuning, the
wall-clock time and CPU time reported for my app both match
externally measured wall-clock time fairly closely, on an
otherwise-idle machine.

I'm wondering if the gprof data is uniformly missing samples, or the
scaling is buggy, or something else mysterious is going on.
(Stack smashing by the app is one possibility).

If it matters, the host I'm using is a dual-core socket-939 amd64,
with an SMP kernel either from -current as of late February, or
NetBSD-3.0 (or RC6, I forget which, but that's pretty close to 3.0).

Thanks in advance to anyone who shares relevant experiences
(positive or otherwise).

--Jonathan