Subject: Re: Strange problem with syslogd on NetBSD-release-1-5
To: None <netbsd-users@netbsd.org>
From: Jim Breton <jamesb-netbsd@alongtheway.com>
List: netbsd-users
Date: 08/04/2001 04:11:08
OK, if anyone is still interested in this. :P My syslogd has once again
stopped logging, and I have a ktrace. syslogd is taking up almost all
my CPU. I didn't think to check whether that was the case last time,
but the other symptoms appear to be exactly the same as the case I
mention below. I am now running the released NetBSD 1.5.1 on i386.
System is a Pentium 100 with 32 MB of RAM. Custom-built kernel (config
file can be supplied). The system has been up for about 19 hours, but
that doesn't seem to be related... this can happen minutes after
booting; conversely, prior to my last reboot the machine had been up for
19 days without this ever occurring at all.
Anyway, here is some data:
load averages: 1.11, 1.43, 1.78
From top:
112 root 64 0 120K 488K run 327:57 78.52% 78.52% syslogd
ktrace:
112 syslogd EMUL "netbsd"
112 syslogd RET poll 1
112 syslogd CALL poll(0x804f180,0x4,0xffffffff)
112 syslogd RET poll 1
112 syslogd CALL poll(0x804f180,0x4,0xffffffff)
112 syslogd RET poll 1
112 syslogd CALL poll(0x804f180,0x4,0xffffffff)
112 syslogd RET poll 1
<snip>
112 syslogd CALL poll(0x804f180,0x4,0xffffffff)
112 syslogd RET poll 1
112 syslogd CALL poll(0x804f180,0x4,0xffffffff)
112 syslogd RET poll 1
112 syslogd CALL poll(0x804f180,0x4,0xffffffff)
<continuous lines like those shown above>
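For reference when reading the trace: the second argument to poll() is the
descriptor count (4) and the third is the timeout, where 0xffffffff is -1 as
a signed int, i.e. wait indefinitely. In the excerpt, poll() returns 1 each
time and the next syscall shown is another poll(). Below is a minimal C
sketch of how that pattern can arise, assuming a descriptor that is reported
ready but never read from. It is an illustration only, with a hypothetical
helper name; it is not syslogd's code and not a confirmed diagnosis.

```c
#include <poll.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from syslogd): poll a descriptor `rounds` times
 * without ever reading from it, and count how often poll() reports it ready.
 * Returns -1 if the pipe could not be set up.
 */
int count_unserviced_wakeups(int rounds)
{
    int p[2];
    int ready = 0;
    struct pollfd pfd;

    if (pipe(p) == -1)
        return -1;

    /* Leave one byte pending on the read end; it is never consumed. */
    if (write(p[1], "x", 1) != 1) {
        close(p[0]);
        close(p[1]);
        return -1;
    }

    pfd.fd = p[0];
    pfd.events = POLLIN;

    for (int i = 0; i < rounds; i++) {
        pfd.revents = 0;
        /* Timeout -1 means "wait forever", like 0xffffffff in the trace. */
        if (poll(&pfd, 1, -1) == 1 && (pfd.revents & POLLIN))
            ready++;
        /* No read() here, so the descriptor is still ready next time. */
    }

    close(p[0]);
    close(p[1]);
    return ready;
}
```

Each call returns immediately even though the timeout is infinite, because
the pending condition is never cleared. A loop like that never blocks, which
would be consistent with the CPU usage shown by top.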
My original message is included below.
Anyone have any ideas?
On Tue, Jul 10, 2001 at 05:35:36AM +0000, Jim Breton wrote:
> I am running on NetBSD-release-1-5 which is up to date as of about a
> week ago. Platform is i386, computer is a Pentium 100 with 32 MB of
> RAM. It runs a Squid and Junkbuster proxy (from pkgsrc), as well as a
> dnscache process used locally (bound to 127.0.0.1). The only time it is
> under any real load is when it runs its own cron jobs (updatedb, etc.).
>
> This problem has happened to me 4 or 5 times now, and I can't see any
> consistency to its interval or what could be causing it.
>
> Here is what happens: the machine normally receives syslog messages over
> UDP from several other machines on the LAN. It is configured to write
> these to disk (as usual), as well as write them to a line printer on
> /dev/lpt0:
>
> auth,authpriv.info /dev/lpt0
>
> It also sends them to another machine on the LAN (I plan on turning this
> off at some point); however that machine is not always up:
>
> auth,authpriv.info @192.168.0.150
>
> Anyway, at some seemingly-random point, the NetBSD machine will stop
> logging these remote messages at all, as well as stop the logging of
> its local messages to the printer (as far as I can tell it still logs
> local ones to disk, but I'm not sure if this is the case every time).
>
> I have verified (using tcpdump) that the computer is indeed receiving
> these log messages.
>
> Restarting syslogd makes everything work fine again, until the next time
> it happens.
>
> Any ideas on what might cause this? I can't imagine it could be the
> fact that that other log server is down, for two reasons: 1) AFAIK, log
> messages are one-way traffic, so this box doesn't even know the other
> one is down; 2) this has happened at least twice, within minutes of each
> other, while that other machine _was_ running.
>
> Any insight appreciated. Thanks!