Subject: Re: Bug found: help to isolate it
To: Manuel Bouyer <bouyer@antioche.eu.org>
From: Lista de NetBSD Users <list10@sepc.edu.mx>
List: netbsd-users
Date: 05/20/2002 20:19:46
On Sun, 19 May 2002, Manuel Bouyer wrote:
> BTW, I suspect you have some very special setup. I have > 20
> NetBSD servers (i386, alpha, sparc), some of them with very large
> uptime (> 500 days) and the only time I've seen that was when someone
> hit the 'scroll lock' on the console. Unlocking it unwedged syslogd.
When I set up a new production server, I go through
the following steps:
0) Write a list of the services and packages this server will need.
For each package, check which kernel options it needs.
For each package, check whether it can have its own log file
(sendmail->maillog, named->local1, pgsql->local0,
imap-uw->maillog, dhcpd->local2, etc)
1) Configure the setup and turn off APM and similar features.
2) Install NetBSD, no X, no games.
3) Compile a new kernel with the devices found in dmesg
and the options needed.
4) If this server will move a lot of mail, then
mkdir /var/log/mail and touch /var/log/mail/maillog
and adjust /etc/newsyslog.conf with the number of
backup files and the size.
5) In our case the server runs pgsql and horde/imp
(which need apache, php4, etc.), so add to newsyslog.conf:
/var/log/horde/horde 640 7 750 * Z
/var/log/pgsql/pgsql 640 7 750 * Z
Also edit /etc/syslog.conf and add:
local0.* /var/log/pgsql/pgsql
local4.* /var/log/horde/horde
6) Edit the configuration files of pgsql and horde so that they
a) use syslogd and b) log to their localX facility (0 or 4).
This alone is not enough, because many local0/local4 messages
still ended up in /var/log/messages. So we added local0 and
local4 to the ".none" list in the following line of syslog.conf:
*.info;auth,authpriv,cron,ftp,kern,local0,local4,lpr,mail.none  /var/log/messages
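
To illustrate why the ".none" entries in step 6 are needed, here is a rough Python sketch of how a syslog.conf selector is matched against a message. It is a simplified model and not syslogd's actual code. The function name selector_matches is made up for this illustration, and the "before" selector is assumed to be the line above with local0 and local4 removed.

```python
# Simplified model of syslog.conf selector matching (not syslogd's real code).
# Levels are ordered from least to most severe.
LEVELS = ["debug", "info", "notice", "warning", "err", "crit", "alert", "emerg"]


def selector_matches(selector, facility, level):
    """Return True if a message with (facility, level) is selected.

    `selector` is the first field of a syslog.conf line, for example
    "*.info;mail.none". Parts are read left to right and a later part
    overrides an earlier one for the facilities it names.
    """
    selected = False
    for part in selector.split(";"):
        facilities, _, priority = part.rpartition(".")
        names = facilities.split(",")
        if "*" not in names and facility not in names:
            continue  # this part says nothing about our facility
        if priority == "none":
            selected = False  # explicit exclusion
        elif priority == "*":
            selected = True  # any level
        else:
            # "facility.level" selects that level and anything more severe
            selected = LEVELS.index(level) >= LEVELS.index(priority)
    return selected


# Assumed "before" line: the same selector without local0 and local4.
before = "*.info;auth,authpriv,cron,ftp,kern,lpr,mail.none"
after = "*.info;auth,authpriv,cron,ftp,kern,local0,local4,lpr,mail.none"

print(selector_matches(before, "local0", "info"))  # True: caught by *.info
print(selector_matches(after, "local0", "info"))   # False: excluded by .none
```

In this model the "*.info" part selects every facility, local0 and local4 included, so a dedicated "local0.*" line only adds a second destination. The messages stay out of /var/log/messages only once the facility is listed under ".none".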
--------------------------
This is my newsyslog.conf
# logfilename           [user:group]  mode ngen size time [ZBN-] [/pidfile] [sigtype]
#
/var/cron/log           root:wheel    600  5    50   *    Z
#/var/log/aculog        uucp:dialer   640  7    *    24   Z
/var/log/authlog                      600  9    75   *    Z
/var/log/kerberos.log                 640  7    *    24   Z
/var/log/lpd-errs                     640  7    10   *    Z
/var/log/mail/maillog                 600  9    750  *    Z
/var/log/messages                     644  9    750  *    Z
/var/log/wtmp                         644  7    75   *    ZBN
/var/log/xferlog                      640  7    250  *    Z
/var/log/horde/horde                  640  7    750  *    Z
/var/log/pgsql/pgsql                  640  7    750  *    Z
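
For readers unfamiliar with the columns, the following Python sketch splits one newsyslog.conf line into named fields, following the header comment at the top of the file. It is only an illustration and assumes well-formed lines. The names Entry and parse_line are made up here, and the units (size in kilobytes, time in hours) are taken from my reading of the newsyslog documentation, so please check them against your man page.

```python
from typing import NamedTuple, Optional


class Entry(NamedTuple):
    path: str
    owner: Optional[str]     # "user:group", or None when the field is omitted
    mode: str                # permission bits as written, e.g. "640"
    ngen: int                # number of old generations to keep
    size_kb: Optional[int]   # rotate at this size; None when the field is "*"
    hours: Optional[int]     # rotate at this interval; None when the field is "*"
    flags: str               # e.g. "Z" or "ZBN"


def _number_or_none(value):
    """Turn "*" into None and anything else into an int."""
    return None if value == "*" else int(value)


def parse_line(line):
    """Parse one well-formed newsyslog.conf line.

    Returns None for blank lines and comments.
    """
    fields = line.split()
    if not fields or fields[0].startswith("#"):
        return None
    path, rest = fields[0], fields[1:]
    # The optional owner field is recognised by its colon.
    owner = rest.pop(0) if ":" in rest[0] else None
    mode, ngen, size, when = rest[:4]
    flags = rest[4] if len(rest) > 4 else ""
    return Entry(path, owner, mode, int(ngen),
                 _number_or_none(size), _number_or_none(when), flags)


entry = parse_line("/var/log/pgsql/pgsql 640 7 750 * Z")
print(entry.ngen, entry.size_kb, entry.hours, entry.flags)
```

Read this way, the pgsql entry keeps 7 old generations, rotates by size (750) rather than by time, and carries the Z flag.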
----------------------------------
# $NetBSD: syslog.conf,v 1.6 1997/02/21 09:04:26 mikel Exp $
*.err;kern.*;auth.notice;authpriv.none;mail.crit        /dev/console
*.info;auth,authpriv,cron,ftp,kern,local0,local4,lpr,mail.none  /var/log/messages
kern.debug                                              /var/log/messages
# The authpriv log file should be restricted access; these
# messages shouldn't go to terminals or publically-readable
# files.
auth,authpriv.info                                      /var/log/authlog
cron.info                                               /var/cron/log
ftp.info                                                /var/log/xferlog
lpr.info                                                /var/log/lpd-errs
mail.info                                               /var/log/mail/maillog
#uucp.info                                              /var/spool/uucp/ERRORS
*.emerg                                                 *
*.notice;auth.debug                                     root
local0.*                                                /var/log/pgsql/pgsql
local4.*                                                /var/log/horde/horde
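
Since every new log file has to be added to both syslog.conf and newsyslog.conf, a small cross-check can help. The Python sketch below lists log files that syslog.conf writes to but that have no newsyslog.conf entry. It is a minimal sketch that assumes each entry sits on one line. The function names are hypothetical, and it only looks at the first and last field of each line.

```python
def syslog_files(syslog_conf):
    """Collect the regular log files named as actions in syslog.conf text."""
    files = set()
    for line in syslog_conf.splitlines():
        fields = line.split()
        if len(fields) < 2 or fields[0].startswith("#"):
            continue  # blank line, comment, or incomplete entry
        action = fields[-1]
        # Skip devices such as /dev/console, user names, and "*".
        if action.startswith("/") and not action.startswith("/dev/"):
            files.add(action)
    return files


def rotated_files(newsyslog_conf):
    """Collect the log files that have an active newsyslog.conf entry."""
    files = set()
    for line in newsyslog_conf.splitlines():
        stripped = line.strip()
        if stripped.startswith("/"):
            files.add(stripped.split()[0])
    return files


def missing_rotation(syslog_conf, newsyslog_conf):
    """Return the log files written by syslogd but never rotated."""
    return sorted(syslog_files(syslog_conf) - rotated_files(newsyslog_conf))
```

Calling missing_rotation() with the contents of the two files should return an empty list when every log file is covered.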
-----------------------------------------------------
At this point, I believe this is the right way to
use syslogd with packages. If I am wrong, please
tell me.
> Can you give more details about your setup (hardware, console type, software
> using syslogd, syslogd.conf, etc ...)
server# fstat | grep c05cee80
nobody httpd 234 8* unix dgram c064e700 <-> c05cee80
nobody httpd 231 5* unix dgram c0645c80 <-> c05cee80
root named 126 3* unix dgram c05cef40 <-> c05cee80
root syslogd 117 3* unix dgram c05cee80
sendmail is not in this list because it is not running as a daemon.
I do not see pgsql in the list. horde/imp are php4 programs, so
I suppose they are among the httpd processes in the list.
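
The same filtering can be done with a few lines of Python, which may be handy when comparing several servers. This is only a sketch: it assumes the fstat column layout shown above (user, command, pid, then the socket address and its peer at the end of the line), and the function name clients_of is made up.

```python
# The fstat lines from above, used as sample input.
SAMPLE = """\
nobody httpd 234 8* unix dgram c064e700 <-> c05cee80
nobody httpd 231 5* unix dgram c0645c80 <-> c05cee80
root named 126 3* unix dgram c05cef40 <-> c05cee80
root syslogd 117 3* unix dgram c05cee80
"""


def clients_of(fstat_output, socket_addr):
    """Return (command, pid) pairs for sockets whose peer is socket_addr."""
    clients = []
    for line in fstat_output.splitlines():
        fields = line.split()
        # A connected socket line ends in: <own address> <-> <peer address>
        if len(fields) >= 5 and fields[-2] == "<->" and fields[-1] == socket_addr:
            clients.append((fields[1], int(fields[2])))
    return clients


for command, pid in clients_of(SAMPLE, "c05cee80"):
    print(command, pid)
```

The syslogd line itself is skipped because it has no "<->" peer, so only the processes connected to syslogd's socket are reported.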
The hardware is as follows (from dmesg).
Standard hardware was omitted, and the number of disks
and LAN cards may vary. All the memory is ECC.
---- we have 8 like this ------------------------
cpu0: Intel Pentium II/Celeron (Deschutes) (686-class), 350.82 MHz
total memory = 127 MB
avail memory = 116 MB
using 1658 buffers containing 6632 KB of memory
BIOS32 rev. 0 found at 0xf0210
pchb0: Intel 82443BX Host Bridge/Controller (rev. 0x03)
pcib0: Intel 82371AB PCI-to-ISA Bridge (PIIX4) (rev. 0x02)
pciide0 at pci0 dev 7 function 1: Intel 82371AB IDE controller (PIIX4) (rev. 0x01)
uhci0 at pci0 dev 7 function 2: Intel 82371AB USB Host Controller (PIIX4) (rev. 0x01)
Intel 82371AB Power Management Controller (PIIX4) (miscellaneous bridge, revision 0x02) at pci0 dev 7 function 3 not configured
vga0 at pci0 dev 10 function 0: ATI Technologies Mach64 B (rev. 0x5c)
ahc0: aic7880 Wide Channel A, SCSI Id=7, 16/255 SCBs
ex0 at pci0 dev 14 function 0: 3Com 3c905B-TX 10/100 Ethernet (rev. 0x30)
ex1 at pci0 dev 16 function 0: 3Com 3c905B-TX 10/100 Ethernet (rev. 0x30)
sd0 at scsibus0 target 0 lun 0: <IBM, DDRS-39130D, DC1B> SCSI2 0/direct fixed
---- we have 3 like this --------------------------
NetBSD 1.5.2 (MINERVA) #2: Wed Nov 14 17:45:50 CST 2001
gallegos@minerva:/usr/src/sys/arch/i386/compile/MINERVA
cpu0: Intel Pentium Pro (686-class), 198.96 MHz
total memory = 127 MB
avail memory = 116 MB
using 1659 buffers containing 6636 KB of memory
BIOS32 rev. 0 found at 0xf0210
pchb0: Intel 82441FX PCI and Memory Controller (PMC) (rev. 0x02)
pcib0: Intel 82371SB PCI-to-ISA Bridge (PIIX3) (rev. 0x01)
pciide0 at pci0 dev 7 function 1: Intel 82371SB IDE Interface (PIIX3) (rev. 0x00)
vga0 at pci0 dev 12 function 0: S3 Trio32/64 (rev. 0x54)
ex0 at pci0 dev 14 function 0: 3Com 3c905B-TX 10/100 Ethernet (rev. 0x30)
ahc0: aic7880 Wide Channel A, SCSI Id=7, 16/255 SCBs
sd0 at scsibus0 target 0 lun 0: <IBM, DDRS-39130D, DC1B> SCSI2 0/direct fixed
sd1 at scsibus0 target 1 lun 0: <IBM, DDRS-39130D, DC1B> SCSI2 0/direct fixed
sd2 at scsibus0 target 2 lun 0: <IBM, DDRS-39130D, DC1B> SCSI2 0/direct fixed
sd3 at scsibus0 target 3 lun 0: <IBM, DDRS-39130D, DC1B> SCSI2 0/direct fixed
-----------------------------------------
Well, I think that is all we can do for now, other than
wait for the next occurrence.
The case of the second server where syslogd stopped is
more interesting, but that is for another mail
message (tonight or tomorrow).
Thanks a lot
Heron Gallegos