Subject: troubleshoot hung syslogd?
To: None <netbsd-help@netbsd.org>
From: Jeremy C. Reed <reed@reedmedia.net>
List: port-xen
Date: 04/20/2007 10:19:31
Can anyone share some ideas on troubleshooting a hung syslogd?

It happened to me yesterday and again sometime this morning.

Yesterday, it couldn't log any messages. I couldn't kill it until I used 
-SIGKILL.

Today, I noticed it again. Nothing can log through it.

top says it is in "ttyout" state. ps says "Is" (so normal).

I can't get anything from ktrace with -p and the process ID. (Teach me!)

gdb of running process:

0xbdb32e57 in writev () from /usr/lib/libc.so.12
(gdb) bt
#0  0xbdb32e57 in writev () from /usr/lib/libc.so.12
#1  0x0804b6c6 in fprintlog ()
#2  0x0804b171 in logmsg ()
#3  0x0804acb5 in printline ()
#4  0x0804a8b8 in dispatch_read_funix ()
#5  0x0804a362 in main ()
#6  0x08049b06 in ___start ()

I killed it with -9, restarted it, and can log to it now.

One of my daemons (spamd) couldn't log to it, though, until I restarted 
it as well. Other daemons, like smtpd and family, started logging 
immediately.

dmesg doesn't indicate anything to me.

This is a Xen instance: NetBSD 3.1 (XEN3_DOMU) i386.

Also yesterday, I had a corrupted Berkeley DB hash file, which had never 
been corrupted before (even on other systems). I am guessing this is related.

Any suggestions on how to troubleshoot this more?

For all I know, this could be caused by bad memory. How can I check that 
on a remote Xen instance?

Or maybe a bad disk. How can I remotely test xbd0 (and my swap xbd1) on a 
Xen Virtual Block Device Interface?

By the way, my dmesg says:

	root on xbd0a dumps on xbd0b

But I don't have any xbd0b in my disklabel. My swap is on /dev/xbd1a. So 
if my system crashed, what would happen to a kernel dump?

  Jeremy C. Reed