Subject: Re: X server as a Unix system process
To: Jukka Marin <jmarin@pyy.jmp.fi>
From: Jonathan Stone <jonathan@DSG.Stanford.EDU>
List: tech-kern
Date: 07/15/1997 01:56:42
>If I quit and restart Applixware, things return to normal.  So this means
>that the linked list is deallocated when the Applixware process ends?

Now *that* sounds more as if there is some per-client state inside the
Xserver (or possibly the application) that is slowing things down.
Can you run "systat vmstat" (or top) during the really slow redraws
and see if the CPU is busy in kernel mode or user mode?

On a P166 and with a moderately well-written X application, the system
shouldn't be spending more than a very few percent of its time in the
kernel (and you can always compare to the `fast' case just after startup.)

If the time is going in user mode, is it going in the X server or in
your Applixware X client?  If you use top with an interval of 1sec,
and the Xserver is busy in userspace, that should show very clearly
during a 9-second redraw.
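
To make the kernel-vs-user comparison concrete, here is a rough sketch
of the arithmetic that tools like systat and top do for you: take two
samples of the CPU tick counters and see where the difference went.
The field order (user, nice, sys, intr, idle) is an assumption
modelled on the BSD cp_time counters, and the function names are made
up for illustration; how you obtain the two samples is left out.

```python
# Assumed counter order, modelled on the BSD cp_time layout.
FIELDS = ("user", "nice", "sys", "intr", "idle")


def cpu_split(before, after):
    """Return the percentage of elapsed ticks spent in each CPU state.

    `before` and `after` are two samples of the cumulative tick
    counters, each a sequence in FIELDS order.
    """
    deltas = [b - a for a, b in zip(before, after)]
    total = sum(deltas)
    if total <= 0:
        raise ValueError("no ticks elapsed between the two samples")
    return {name: 100.0 * d / total for name, d in zip(FIELDS, deltas)}


def busiest_mode(split):
    """Say whether the non-idle time went mostly to user or kernel mode."""
    user = split["user"] + split["nice"]
    kernel = split["sys"] + split["intr"]
    return "user" if user >= kernel else "kernel"


if __name__ == "__main__":
    # Two made-up samples taken on either side of a slow redraw.
    before = (100, 0, 50, 10, 840)
    after = (190, 0, 55, 10, 845)
    split = cpu_split(before, after)
    print(split, busiest_mode(split))
```

If the same calculation just after startup (the fast case) gives a
very different split, that difference is the interesting number.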


If this is a kernel problem, then kernel profiling would help a
*lot*. I'd suggest building a kernel with profiling, turning it on
when performance gets really bad, doing redraws &c for a while, and then
extracting the kernel profile data and examining it with gprof.  Curt
Sampson <cjs@portal.ca> has created a Web page with instructions on
how to do that, based on a message I sent a month or two ago.

 >It certainly feels like that, looking at the word processor redrawing the
 >window ssssslllllooowwwwlllyyyyy.  It takes 9 seconds to do one redraw
 >at the moment (on a 166 MHz Pentium with 64 MB RAM, no paging during
 >the redraw).
 >
 >> It looks to me like you're getting a different chunk of address space
 >> each time, which is also consistent with a slowly-growing linked list.
 >> (Where else would the state be kept?)

 >I can send the ktrace output to someone if it helps (I guess it doesn't).

Mmm, not yet, not till we know where the time is really going.

IIRC, you've said earlier that the client sees a steadily increasing
response time between select() calls.  That suggests either the
Xserver is busy or the kernel is busy, but not which.  ktrace isn't the
best tool to tell which, at least not without a fair bit of
postprocessing, and preferably correlation with the ``slow'' client
requests.
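
As a hint of what that postprocessing might look like, here is a
minimal sketch that measures the gaps between successive select()
calls and checks whether they keep growing.  It assumes the trace has
already been reduced to (timestamp in seconds, syscall name) pairs in
time order; that input format and the function names are hypothetical,
not anything ktrace or kdump emits directly.

```python
def select_gaps(events):
    """Return the time between each pair of successive select() calls.

    `events` is an iterable of (timestamp_seconds, syscall_name)
    pairs, already sorted by time.
    """
    times = [t for t, name in events if name == "select"]
    return [later - earlier for earlier, later in zip(times, times[1:])]


def is_steadily_increasing(gaps):
    """True if every gap is strictly longer than the one before it."""
    return len(gaps) >= 2 and all(b > a for a, b in zip(gaps, gaps[1:]))


if __name__ == "__main__":
    # Made-up events standing in for a reduced trace.
    events = [
        (0.0, "select"),
        (0.5, "mmap"),
        (1.0, "select"),
        (3.0, "select"),
        (3.5, "munmap"),
        (6.0, "select"),
    ]
    gaps = select_gaps(events)
    print(gaps, is_steadily_increasing(gaps))
```

This only confirms the growing response time; it still doesn't say
whether the Xserver or the kernel is the one eating it.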

I don't know enough about Xserver internals anymore to have a clue why
the server is busy mmap()ing and munmap()ing smallish regions.  Knowing
that might help, especially if this *is* a problem with malloc() in the
Xserver.