tech-kern archive


Problem with pty(4) handling under NetBSD-5, was: Re: Problem with pckbc(4) under NetBSD-5

        Hello.  In looking into this problem further, I've determined that it
is not a problem with the pckbc(4) driver, but appears to be a problem with
the pty(4) driver instead.  What I'm seeing is that the master of the pty
is getting blocked in write(2) and the slave for the same pty is getting
blocked in read(2).  I believe there is a race condition in this code
somewhere, but I'm not really sure.  I can reproduce the problem at will,
but if I use my application gently, I can avoid the problem for a while.
        I believe this may be a side effect of the changes introduced to fix
kern/37915 back in January.  I have confirmed that it affects all pseudo
terminals and is not specific to the wscons keyboard.  In fact, it has
nothing to do with the keyboard now.  I'm not sure if it's related, but
it seems to happen more quickly when there are a lot of active ptys
sending and receiving input at once, specifically when one application is
manipulating a number of simultaneously active ptys.
        This problem definitely does not exist under NetBSD-4, but I can't say
for certain when it was introduced.  I'll start trying various revisions of
src/sys/kern/tty.c and/or src/sys/kern/tty_pty.c and see where I come out.
On the face of my initial investigation, though, this seems to be a pretty
serious bug.
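
A minimal user-space sketch of this kind of workload, in Python, for anyone
who wants to exercise several ptys at once.  It is not the application
described above and may well not trigger the hang: it is single-threaded and
only pushes data from each master to its slave.  It uses select(2) timeouts so
that a stuck pty is reported rather than blocking forever.  The function name
exercise_ptys and its parameters (count, rounds, payload, timeout) are made up
for this illustration.

```python
import os
import select
import tty


def exercise_ptys(count=4, rounds=50, payload=b"x" * 64, timeout=2.0):
    """Drive `count` ptys at once; return the indexes of any that stalled.

    Hypothetical helper: a pty counts as stalled if its master is not
    writable, or its slave yields no data, within `timeout` seconds.
    """
    pairs = []
    stalled = set()
    try:
        for _ in range(count):
            master, slave = os.openpty()
            pairs.append((master, slave))
            # Raw mode: no echo and no line buffering on the slave side.
            tty.setraw(slave)

        for _ in range(rounds):
            # Phase 1: write to every master so all ptys hold data at once.
            pending = {}
            for i, (master, _slave) in enumerate(pairs):
                if i in stalled:
                    continue
                _, writable, _ = select.select([], [master], [], timeout)
                if not writable:
                    stalled.add(i)  # write(2) on the master would block
                    continue
                pending[i] = os.write(master, payload)

            # Phase 2: drain each slave, again guarded by a timeout.
            for i, expected in pending.items():
                slave = pairs[i][1]
                received = 0
                while received < expected:
                    readable, _, _ = select.select([slave], [], [], timeout)
                    if not readable:
                        stalled.add(i)  # read(2) on the slave would block
                        break
                    chunk = os.read(slave, expected - received)
                    if not chunk:
                        stalled.add(i)  # unexpected end of file
                        break
                    received += len(chunk)
    finally:
        for master, slave in pairs:
            os.close(master)
            os.close(slave)
    return sorted(stalled)


if __name__ == "__main__":
    print("stalled ptys:", exercise_ptys())
```

On a healthy system the function should return an empty list.  An index in
the result means that pty's master was not writable, or its slave produced no
data, within the timeout.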

On Jun 4,  4:35pm, Brian Buhrow wrote:
} Subject: Re: Problem with pckbc(4) under NetBSD-5
}       Hello.  I've switched this thread to tech-kern, because it looks like
} the problem is squarely in the kernel.
}       Following up on my own post, I've figured out more precisely what is
} going on.  Or what appears to be going on.  At this point, I could use some
} assistance in getting a bit further.
}       What's happening is that I have a process running on /dev/console of
} an I386 system using the pc's keyboard.  This process is getting stuck in
} the write(2) system call.  In tracing things down, it looks like we're
} getting stuck in ttwrite() on line 1935.
} The comment above this line says that we're sleeping for carrier to wake
} up.  How can carrier ever be down on a wsdisplay(4) output device?
} This is with:
} /*    $NetBSD: tty.c,v 2009/02/06 02:05:18 snj Exp $        */
} In looking at the tty.c file, I don't see where a wakeup is ever called
} against this particular address.  And, indeed, until the process that gets
} stuck here is killed, the process never moves again.
} This problem doesn't exist under NetBSD-4, but then the locking code is
} completely different under NetBSD-4.
}       I can reproduce this problem at will, so if there are any ddb commands
} which I can run to provide useful data to track this problem down, I'm
} happy to do it.  I'd like to fix this issue, since I cannot upgrade my
} workstations until this issue is resolved.
}       At the very least, it looks like this sleep ought to time out
} eventually, or it ought to never happen at all in the case of a
} wsdisplay(4) output device.
} Thoughts?
} -Brian
} On Jun 4, 12:51am, Brian Buhrow wrote:
} } Subject: Problem with pckbc(4) under NetBSD-5
} }     Hello.  Under NetBSD-5 I'm seeing a problem where processes running on
} } the console tty, i.e. on the pc keyboard on X86 systems, tested with the
} } I386 platform, get into a state where ps shows them in "ttyraw" state.
} } When that happens, the only way to get the keyboard back is to kill the
} } login shell for that terminal off, and let init reset the terminal.  I'm
} } still trying to see exactly what the trace is for the processes that get
} } into this state.  I'm beginning to think that the problem exists under
} } NetBSD-4 as well, but that the polling interval is so much faster on that
} } version, that the keyboard recovers on its own before things get really
} } stuck.  Has anyone else seen this problem?  I can reproduce it at will,
} } almost, but I'm not sure what interaction causes the problem.  It seems to
} } happen when I'm using the keyboard heavily, and when there's a lot of
} } scrolling going on at the same time.
} }     Any thoughts, ideas or whatever would be greatly appreciated.  I'll
} } post more as I learn more.  Right now, though, NetBSD-5 doesn't look like
} } it will work for me on my workstations.
} } -thanks
} } -Brian
} >-- End of excerpt from Brian Buhrow
>-- End of excerpt from Brian Buhrow
