Subject: Re: tty/thread machine starvation/lockups with 4.99.40 (sparc64)
To: Erik Fair <firstname.lastname@example.org>
From: Rafal Boni <email@example.com>
Date: 12/12/2007 21:57:04

Erik Fair wrote:
> I/O to ttys has been the proximate cause of UNIX process unhappiness
> since time-immemorial. You can't kill (or swap) a process involved in
> DMA I/O (the subsequently completed I/O would then end up in some other
> process' memory, which would be ... bad), so the kernel doesn't permit
Well, you're right in many ways, though of course that's why we put data
in the kernel tty buffers, so all I/O goes there rather than directly to
> If you want to watch real badness, hit ^S on a UNIX console, and wait a
> while. Depending on how many processes want to spew on the console, and
> how often, the process table will eventually fill up with unkillable
> processes, or ... zombies. This is one reason why syslogd(8) exists.
Sure, I've seen the pile-up of processes all blocked on a single
resource (be it something stuck in disk-wait, which probably *is* pinned
for the reasons you state above, or the flow-control-induced console
backup). But this smells different...
> Just out of curiosity, if you put asterisk on the back side of a pty
> (e.g. with script(1), screen(1), ssh(1), etc), does the hang still
> happen? Or is asterisk directly opening /dev/console itself?
Yes, in fact it *is* on the back side of an SSH pty, and the other
processes that it blocks are *not* sharing that same tty/pty (e.g.,
login on the real console device). Interrupt delivery also doesn't look
to be the culprit since the interrupt to kick me into DDB arrives fine.
I have not tried just quitting DDB to see if that unwedges anything,
but I suspect not.
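
For anyone wanting to confirm which terminal device a process's descriptors
are attached to (a pty versus the real console), here is a minimal sketch
using the standard isatty/ttyname calls. The helper name describe_fd is
hypothetical and purely illustrative; it is not taken from asterisk or from
the kernel, and it only reports on the process that runs it.

```python
import os


def describe_fd(fd):
    """Return (is_tty, device_name) for a file descriptor.

    device_name is None when fd is not attached to a terminal,
    or when the terminal's device node cannot be resolved.
    """
    if not os.isatty(fd):
        return (False, None)
    try:
        return (True, os.ttyname(fd))
    except OSError:
        # It is a terminal, but its name could not be looked up.
        return (True, None)


if __name__ == "__main__":
    # Report on stdin, stdout and stderr of the current process.
    for name, fd in (("stdin", 0), ("stdout", 1), ("stderr", 2)):
        is_tty, dev = describe_fd(fd)
        print(f"{name}: tty={is_tty} device={dev}")
```

Run from an SSH session this would be expected to name a pty device, and from
a login on the real console the console's tty device; the exact device names
depend on the system.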
I guess I should have been more specific, but when I wrote 'tty lockups'
I really meant 'tty subsystem lockups', because that subsystem looks like
the culprit (possibly when there's threading involved as well).