Subject: Unkillable process, stalled socket write()
To: None <firstname.lastname@example.org>
From: Jorgen Lundman <email@example.com>
Date: 05/27/2004 10:55:30
NetBSD mirror 1.6ZF NetBSD 1.6ZF (mirror) #6: Fri Apr 2 04:06:49 CEST 2004
Most likely this is something I have done in my software, but it is behaving
unusually. An FTPD process I have is now hung. kill -9 does nothing to it, and
naturally I cannot release it.
When I see this, it is usually because a disk or tape is going bad and the
kernel blocks forever. What is unusual this time is that the blocked fd is a
socket that the FTPD is sending to.
However, gdb tells me:
#0 0x481d36d7 in write () from /usr/lib/libc.so.12
#1 0x808bb2b in sockets_write (fd=275, [cut]
Inspecting my structures, I can confirm that fd 275 is a socket; we have already
read 4072 bytes from the file on disk and are now trying to send them.
0x481d36d7 in write () from /usr/lib/libc.so.12
Dump of assembler code for function write:
0x481d36d0 <write>: mov $0x4,%eax
0x481d36d5 <write+5>: int $0x80
0x481d36d7 <write+7>: jb 0x481d36b8 <getpid+8>
0x481d36d9 <write+9>: ret
int $0x80 is, at a guess, just the syscall trap, and 0x4 in %eax would be sys_write().
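
For illustration, the same call can be made without the libc stub by going
through syscall(2) with the write syscall number. This is only a sketch:
raw_write is a hypothetical name, not anything from lundftpd, and the numeric
value of SYS_write depends on the platform (4 on NetBSD/i386, which matches the
mov $0x4,%eax above, but different elsewhere).

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE   /* assumption: only glibc needs this to declare syscall() */
#endif
#include <assert.h>
#include <sys/syscall.h>
#include <unistd.h>

/*
 * Hypothetical helper: issue write via the generic syscall(2) entry
 * point instead of the libc write() stub. Returns what the kernel
 * returns: bytes written, or -1 with errno set.
 */
ssize_t raw_write(int fd, const void *buf, size_t len)
{
    assert(buf != NULL);
    return (ssize_t)syscall(SYS_write, fd, buf, len);
}
```

The jb after int $0x80 is presumably the error branch: on i386 BSD kernels a
failed syscall comes back with the carry flag set, and libc then stores the
error in errno and returns -1.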
fd 275 is also in nonblocking mode, so even if the kernel were out of mbufs or
memory, should write() not always return, even if only with a failure?
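
To make that expectation concrete, here is a minimal sketch of what a
nonblocking write is normally assumed to do. The names fd_is_nonblocking and
try_write are hypothetical and are not taken from lundftpd's sockets_write();
the point is only that every path is expected to return to the caller.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: 1 if O_NONBLOCK is set on fd, 0 if not, -1 on error. */
int fd_is_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);

    if (flags == -1)
        return -1;
    return (flags & O_NONBLOCK) ? 1 : 0;
}

/*
 * Hypothetical helper: one write attempt on a nonblocking fd.
 * Returns the number of bytes written (possibly fewer than len),
 * 0 if the call would have blocked, or -1 on a real error with
 * errno set. Callers should pass len > 0 so that 0 is unambiguous.
 */
ssize_t try_write(int fd, const void *buf, size_t len)
{
    ssize_t n;

    assert(buf != NULL);
    do {
        n = write(fd, buf, len);        /* retry if a signal interrupted us */
    } while (n == -1 && errno == EINTR);

    if (n == -1 && (errno == EAGAIN || errno == EWOULDBLOCK))
        return 0;                       /* buffer full: wait for writability */
    return n;
}
```

Under this model a full send buffer shows up as EAGAIN, and a buffer shortage
would be expected to show up as -1 with an errno such as ENOBUFS, rather than
as a process stuck inside the syscall.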
Memory: 278M Act, 39M Inact, 608K Wired, 10M Exec, 291M File, 476M Free
Swap: 10G Total, 10G Free
USER PID %CPU %MEM VSZ RSS TT STAT STARTED TIME COMMAND
root 15529 0.0 0.0 8216 4 ?? DXs 11:03AM 67:08.89 ./lundftpd
/0 /5 /10 /15 /20 /25 /30 /35 /40 /45 /50 /55 /60
data XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX 44023
Alas, netstat and vmstat don't run, since userland is 1.6.2 and the kernel is
-current (to support the NIC). Sigh.
Jorgen Lundman | <firstname.lastname@example.org>
Unix Administrator | +81 (0)3 -5456-2687 ext 1017 (work)
Shibuya-ku, Tokyo | +81 (0)90-5578-8500 (cell)
Japan | +81 (0)3 -3375-1767 (home)