Re: Performance problems.

To: Matt Thomas <matt%3am-software.com@localhost>
Subject: Re: Performance problems.
From: Johnny Billquist <bqt%softjar.se@localhost>
Date: Fri, 07 May 2010 15:22:11 +0200

Hi. Thanks for the input...

Matt Thomas wrote:

On May 5, 2010, at 9:09 AM, Johnny Billquist wrote:

Tobias Nygren wrote:

On Wed, 05 May 2010 12:11:59 +0200
Johnny Billquist <bqt%softjar.se@localhost> wrote:

Anyone else seen this, knows what could be affecting it, or care?
Or are all others just using modern fast machines, where all this doesn't really
make a blip on the radar anyway, or else not using their machines enough to
make this visible?

My gut feeling is that even on comparatively fast sparc64 hardware more
cycles are spent these days doing kernel work. But it may simply be
that the accounting in top(1) is broken / calculated differently from
how it used to be done. Benchmark results of current vs. 2.0 on low
powered uniprocessor systems would be very interesting to see.

I'm trying to look at this now, but what I'm looking at right now doesn't seem to
have any bearing on sparcs, sorry.

That said, maybe someone can help me with a few details here. I'm looking at 
the softint code. Currently, there exists a __HAVE_FAST_SOFTINT option for the 
kernel. The only two architectures that seem to have defined this are x86 and
VAX.
So far, so good.

But looking further into kern/kern_softint.c, I see a number of calls to 
softint_init_isr, where the second parameter is said to be the interrupt 
priority. The comments and descriptions about soft interrupt priorities say:

*       The four priority levels map directly to scheduler priority
*       levels, and where the architecture implements 'fast' software
*       interrupts, they also map onto interrupt priorities.  The
*       interrupt priorities are intended to be hidden from machine
*       independent code, which should use thread-safe mechanisms to
*       synchronize with software interrupts (for example: mutexes).


IPL_SOFTCLOCK is what that is referring to and those are defined correctly.

So the sentence "they also map onto interrupt priorities" should not be taken to mean that they are the same as interrupt priorities?

Am I reading and understanding things wrong here, or is this fundamentally 
wrong in design? If the priority levels should match the interrupt levels in 
the case of fast soft interrupts, I can't understand why the definitions are in 
the MI files. It should be pretty machine dependent...


The former.

:-)

Since only the VAX and x86 are using fast soft interrupts, I can see that the
code might have been written so that it works on the x86. Since no one else uses
it, no one else gets hit by this but the VAX.


Actually, MIPS is using them now too.

Ah. I was searching through the arch/_/include directories for the definition, and didn't find it for MIPS. Oh well, not that important anyway, since I think I was barking up the wrong tree.

By the way, I did try changing PRI_SOFTCLOCK to IPL_SOFTCLOCK in kern_softint.c. The system worked just as well as before, but it didn't make my observed problems go away either, so that doesn't seem to have been the problem. :-(

If anyone can shed some more light on this, please do. In the meanwhile, I'll 
just do some blind testing and see if changing this will help...

I still have the problem, and am still interested in finding out what it is. This is a serious issue, and one that is currently hurting performance seriously on the 8650, and I would suspect it also hurts other machines, even if it perhaps is less visible. But there is definitely a problem in there somewhere.


My observation points so far:

1) The clock of the system goes haywire after a while. So much so that ntp is never able to sync to a time source again.

2) After running a couple of cvs updates in parallel, along with a build.sh, doing a shutdown -r takes close to 10 minutes to get the machine to reboot. I don't know what it is doing, but it's definitely slower than molasses.


These two are very visible, tangible and easily repeatable problems.

My third one is general performance. The system seems to be spending excessive time in system mode. Yes, doing something CPU intensive will cause the system to run mostly in user mode, but doing anything that requires, for instance, a lot of disk access immediately pushes system time up over 50%, often hitting around 80% system time, which seems like way too much.

This is a bit harder to define. After all, how much time is to be expected? And how long should various commands take? I plan to install NetBSD 2 to make relative comparisons between the two, which will help. But for now, it's mostly just a feeling.
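
One way to put numbers on that feeling is to run the same small disk-heavy workload on both releases and compare how the process's CPU time splits between user and system mode. The following is only a minimal sketch, assuming a POSIX getrusage(2); the helper names, the scratch file path and the block count are hypothetical, and it measures a single process rather than the system-wide figures that top(1) reports.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <sys/resource.h>
#include <sys/time.h>
#include <unistd.h>

/* Convert a struct timeval to seconds. */
static double tv_seconds(struct timeval tv)
{
    return (double)tv.tv_sec + (double)tv.tv_usec / 1e6;
}

/*
 * Hypothetical workload: write `blocks` 8 KB blocks to a scratch file,
 * force them to disk, then remove the file. Returns 0 on success.
 */
static int disk_workload(const char *path, int blocks)
{
    char buf[8192] = {0};
    int fd, i;

    assert(blocks >= 0);
    fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0600);
    if (fd < 0)
        return -1;
    for (i = 0; i < blocks; i++) {
        if (write(fd, buf, sizeof buf) != (ssize_t)sizeof buf) {
            close(fd);
            unlink(path);
            return -1;
        }
    }
    fsync(fd);
    close(fd);
    unlink(path);
    return 0;
}

/*
 * Run the workload and return the fraction (0.0 to 1.0) of this process's
 * CPU time that was spent in system mode, or -1.0 on error.
 */
static double system_time_fraction(const char *path, int blocks)
{
    struct rusage before, after;
    double user, sys, total;

    if (getrusage(RUSAGE_SELF, &before) != 0)
        return -1.0;
    if (disk_workload(path, blocks) != 0)
        return -1.0;
    if (getrusage(RUSAGE_SELF, &after) != 0)
        return -1.0;

    user = tv_seconds(after.ru_utime) - tv_seconds(before.ru_utime);
    sys = tv_seconds(after.ru_stime) - tv_seconds(before.ru_stime);
    total = user + sys;
    printf("user %.3fs, system %.3fs\n", user, sys);
    return total > 0.0 ? sys / total : 0.0;
}
```

Calling system_time_fraction() with the same path and block count on each release gives a pair of figures that can be compared directly; interrupt and softint time that is not charged to the process will not show up in them.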


As usual, any hints, ideas, or help would be appreciated.

        Johnny
