Subject: Re: X server as a Unix system process
To: Jonathan Stone <jonathan@DSG.Stanford.EDU>
From: Jukka Marin <jmarin@pyy.jmp.fi>
List: tech-kern
Date: 07/15/1997 12:08:17
> >If I quit and restart Applixware, things return to normal.  So this means
> >that the linked list is deallocated when the Applixware process ends?
> 
> Now *that* sounds more as if there is some per-client state inside the
> Xserver (or possibly the application) that is slowing things down.
> Can you run "systat vmstat" (or top) during the really slow redraws
> and see if the CPU is busy in kernel mode or user mode?

60% system, 40% user.  With systat -w 1, system percentage varies between
50% and 70%.

> On a P166 and with a moderately well-written X application, the system
> shouldn't be spending more than a very few percent of its time in the
> kernel (and you can always compare to the `fast' case just after startup.)

Well, it's mostly system time, it seems.  (Isn't that natural, after
seeing how long the munmap() calls take?)

> If the time is going in user mode, is it going in the X server or in
> your Applixware X client?  If you use top with an interval of 1sec,
> and the Xserver is busy in userspace, that should show very clearly
> during a 9-second redraw.

According to top, the X server uses about 30-40% of CPU time, system uses
60-70% and Applix uses 0.15%.

> If this is a kernel problem, then kernel profiling would help a
> *lot*. I'd suggest building a kernel with profiling, turning it on
> when performance gets really bad, do redraws &c for a while, and then
> extracting the kernel profile data and examining it with gprof.  Curt
> Sampson <cjs@portal.ca> has created a Web page with instructions on
> how to do that, based on a message I sent a month or two ago.

I guess I could try this.  (I'm pretty busy with my job, but the Applix
problem is too annoying to be ignored.. or, as it turns out, the NetBSD
problem.)

> IIRC, you've said earlier that the client sees a steadily increasing
> response time between select() calls.  That suggests either the
> Xserver is busy or the kernel is busy, but not which.  ktrace isn't the
> best tool to tell which, at least not without a fair bit of
> postprocessing, and preferably correlation with the ``slow'' client
> requests.

Well, I can clearly see that the munmap() calls are slowing down.  During
the first ktrace, they took 2 ms or less.  During the last ktrace, they
were taking more than 9 ms each.  This explains why select() on Applix
side is returning so slowly (all that time is really being used by the
X server (or the munmap() calls, to be exact)).
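
The munmap() timing could also be checked outside the X server with a
small standalone timer.  What follows is only a minimal sketch, not code
from the X server or from ktrace: the names time_map_cycle() and sample()
and the four-page mapping size are made up for illustration, and it
measures wall-clock time in a fresh process, so it may well not show the
same slowdown.

```python
import mmap
import time

PAGE = mmap.PAGESIZE


def time_map_cycle(length=4 * PAGE):
    """Map and unmap one anonymous region.

    Returns (map_seconds, unmap_seconds) as wall-clock durations.
    The default length is an arbitrary "smallish" size (assumption).
    """
    t0 = time.perf_counter()
    region = mmap.mmap(-1, length)  # -1 requests an anonymous mapping
    region[0] = 1                   # touch the first page so it is really used
    t1 = time.perf_counter()
    region.close()                  # releases (unmaps) the region
    t2 = time.perf_counter()
    return t1 - t0, t2 - t1


def sample(rounds=1000):
    """Run several map/unmap cycles; return (mean, worst) unmap time in seconds."""
    total = 0.0
    worst = 0.0
    for _ in range(rounds):
        _, unmap_time = time_map_cycle()
        total += unmap_time
        worst = max(worst, unmap_time)
    return total / rounds, worst


if __name__ == "__main__":
    mean, worst = sample()
    print("unmap mean %.3f ms, worst %.3f ms" % (mean * 1e3, worst * 1e3))
```

Running it once right after startup and again when redraws have become
slow would give two numbers to compare against the 2 ms and 9 ms seen in
the ktrace output.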

> I don't know enough about Xserver internals anymore to have a clue why
> the server is busy mmap()ing and munmap()ing smallish regions.  Knowing
> that might help especially if this *is* a problem with malloc() in the
> Xserver.

Can't help there.. but I'm willing to investigate this further, even if
it means rebooting this machine (yikes!).  I'd love to see the fix to
this in 1.3.......... :-))

  -jm


-- 

                       1503 kHz @ 22:30 EET DST Mon-Fri

                     ---> http://www.jmp.fi/~jmarin/ <---