Port-i386 archive


Re: profiling broken on MP systems?



On Mon, Nov 22, 2010 at 09:05:23AM -0500, Thor Lancelot Simon wrote:
> We've been trying to do some profiling on a system which is a
> dual-processor x86 running NetBSD/i386 5.1.
> 
> It has not gone well.  If we enable profiling for more than 0.2 seconds,
> the system locks up.  Even more distressingly, sometimes attempts to
> turn off profiling simply don't: kgmon -h reports that it turned off
> profiling, but actually it didn't.  To quote one of my coworkers:
> 
> > .2 seconds was fine.  .3 was not.  Even though it claimed to have turned
> > profiling off, profiling was actually still on.
> >
> > I broke into the debugger and _gmonparam was 0: GMON_PROF_ON.  Manually
> > patching it to 3 in my debugger curiously did not work.  It kept being set
> > to  0.
> >
> > With DDB write command, I was able to set it to 3, and after a while, the
> > system came back and I was able to obtain the trace.
> 
> Another disturbing effect we've noticed is that the sysmon sme_worker goes
> crazy when profiling is turned on: it appears to loop sending cv_broadcast()
> a million times, and to be scheduled every few ticks instead of once every
> 30 seconds.  I cannot account for why this would be by (quick) code
> examination.  Unfortunately disabling it does not solve the system hangs
> when we turn on profiling on the MP kernel, however.
> 
> The only even remotely unusual thing about our kernel is that it's built
> with HZ=1024.  We have confirmed that hz = profhz = stathz = 1024 at runtime,
> however.
> 
> Does anyone have profiling working on a multiprocessor system right now?
> On i386?  With netbsd-5?

It worked during the 5.0 development cycle, but it has been
a very long time since I tried it.  With the current setup a spinlock
is taken on EVERY function entry, so it does have a severe impact on
the system.  Unless the box is being hammered at the time, though, what
you describe seems to indicate a more severe problem.
What it really wants is per-CPU buffers that get merged when the data
is read out.
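
To illustrate the per-CPU buffer idea, here is a minimal userland sketch.
It is not the kernel's gmon code: every name in it (sketch_record,
sketch_merge, SKETCH_NCPU, SKETCH_NSLOTS) is made up for illustration, and
the CPU index is passed in by hand.  In a real kernel the hot path would
also have to stay on the same CPU between looking up its buffer and
updating it, and the readout would have to tolerate concurrent updates.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_NCPU   4   /* hypothetical CPU count */
#define SKETCH_NSLOTS 8   /* hypothetical number of call-site buckets */

/* One private counter array per CPU (cache-line padding omitted). */
struct sketch_cpu_prof {
	uint32_t count[SKETCH_NSLOTS];
};

static struct sketch_cpu_prof sketch_prof[SKETCH_NCPU];

/*
 * Hot path, run on every profiled function entry.  It touches only the
 * calling CPU's buffer, so no lock shared between CPUs is needed.
 */
static void
sketch_record(unsigned cpu, unsigned slot)
{
	assert(cpu < SKETCH_NCPU && slot < SKETCH_NSLOTS);
	sketch_prof[cpu].count[slot]++;
}

/*
 * Cold path, run only when the data is read out.  It sums the per-CPU
 * counters into a single array supplied by the caller.
 */
static void
sketch_merge(uint32_t out[SKETCH_NSLOTS])
{
	memset(out, 0, SKETCH_NSLOTS * sizeof(out[0]));
	for (unsigned cpu = 0; cpu < SKETCH_NCPU; cpu++)
		for (unsigned slot = 0; slot < SKETCH_NSLOTS; slot++)
			out[slot] += sketch_prof[cpu].count[slot];
}
```

The cost moves from every function entry to the (rare) readout, which is
the trade the per-CPU layout is meant to buy.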

As a workaround, have you considered using tprof?  It is not as useful
as gprof, though.


