Subject: Re: kgmon -b causes a reboot (no kernel profiling)
To: Bill Studenmund <>
From: Gary Thorpe <>
List: current-users
Date: 09/07/2006 22:31:38
--- Bill Studenmund <> wrote:

> On Thu, Aug 31, 2006 at 11:09:47PM -0400, Gary Thorpe wrote:
> > Hi,
> > 
> > I was recently attempting to profile a kernel under different
> > conditions. However, each time I try to enable profiling using
> > 'kgmon -b' the machine reliably reboots (no ddb or panic).
> > 
> > At first, I thought it was because 'kgmon' was from 3.0, but the
> > current version produces the same result as well. Is this just
> > something wrong with my source tree/actions or is this universally
> > repeatable?
>
> No idea about repro, but I'm going to guess that the problem is that a
> routine is getting profiled that shouldn't. There are a few routines
> which aren't profiled even when you're profiling. They are the
> routines involved in profiling itself.
>
> So my guess is that a routine called as part of profiling is getting
> profiled, which triggers a recursion, which makes the stack explode,
> which can cause the box to just reboot.
>
> A main problem is that anything to fix this, such as a stack guard
> page, will trigger uvm code which is itself profiled, which will
> continue the recursion.
>
> So the only suggestions I can come up with are: 1) make sure your
> source tree is clean, 2) look at the call graph for profiling routines
> and see if one of the routines in the graph is not marked as "no
> profiling", and 3) try a date-based checkout to see when the change
> happened & examine the change that killed things.
>
> Good luck!
> Take care,
> Bill

Thanks for responding.

I built a kernel for another machine (from the same source tree, but
with a different configuration), and it doesn't reboot when you enable
profiling. That made me suspect 1), because building that kernel
required a new (clean) tree. However, the kernel for this machine still
reboots after rebuilding from a clean tree. [Is there a short list of
the most probable situations in which one should clean the tree between
kernel builds (or should one always do so for current)?]

So this problem may be either specific to this kernel configuration or
this particular machine.

About getting a call graph for 2): is there a #define that marks a
routine as "no profiling" (or is it done some other way)?

For 3), since it does not reboot with 3.0, should that branch point be
my starting point (that's a bit far back :-( )? Should I also try
varying the configuration file to see if it only happens with certain
options (e.g. stripping it down to the bare minimum and then gradually
adding options back and retesting)?
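
As a rough sketch of how the date-based search in 3) could go, here is a
bisection between a known-good and a known-bad checkout date. Everything
in it is an assumption for illustration: reboots_at() is a hypothetical
stand-in for the manual step (check out the tree as of that date, build,
boot, run 'kgmon -b'), days are counted from the 3.0 branch point, and
the "day 137" breakage is made up.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical stand-in for the manual test: check out the tree as of
 * `day` (days after the known-good starting point), build the kernel,
 * boot it, and run 'kgmon -b'. Here it simply pretends that the
 * breakage was committed on day 137.
 */
bool reboots_at(int day)
{
    return day >= 137;
}

/*
 * Bisect between a day known to be good and a day known to be bad.
 * Returns the first day on which the kernel reboots. Assumes that once
 * the problem appears it stays present on every later day.
 */
int first_bad_day(int good, int bad)
{
    assert(good < bad);
    while (bad - good > 1) {
        int mid = good + (bad - good) / 2;
        if (reboots_at(mid))
            bad = mid;   /* breakage is at or before mid */
        else
            good = mid;  /* breakage is after mid */
    }
    return bad;
}
```

Called as first_bad_day(0, 400), this needs nine checkout/build/boot
cycles to narrow a 400-day window down to a single day, which makes
starting from the 3.0 branch point less painful than it sounds.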
