Subject: Re: Diagnosing reboots
To: Christopher W. Richardson <cwr@nexthop.com>
From: Skylar Thompson <skylar@cs.earlham.edu>
List: netbsd-help
Date: 09/15/2005 12:22:30

Christopher W. Richardson wrote:

> Hey folks,
>
> Sorry for such a potentially basic user question, but it happens
> so infrequently that I'm at a loss for basic admin skills.  How
> do I go about diagnosing the reason for a machine rebooting?
>
> I came in to my office this morning to find this on my
> workstation:
>
> cwr@achilles#uptime
>  7:55PM  up 15:54, 2 users, load averages: 0.20, 0.15, 0.10
>
> OK, this morning it was less than 15 hours uptime, but, you get
> the idea.  I have no idea what caused the workstation to reboot.
> The end of the reboot shows:
>
> Sep 12 04:03:31 achilles /netbsd: root file system type: ffs
> Sep 12 04:03:31 achilles savecore: no core dump
>
> and the beginning shows:
>
> Sep 11 21:00:10 achilles syslogd: restart
> Sep 12 04:03:31 achilles syslogd: restart
> Sep 12 04:03:31 achilles /netbsd: NetBSD 2.0.2_STABLE (ACHILLES) #10: Sun Sep  4 13:06:05 EDT 2005
> Sep 12 04:03:31 achilles /netbsd: cwr@achilles:/usr/localhome2/obj/sys/arch/i386/compile/ACHILLES
> Sep 12 04:03:31 achilles /netbsd: total memory = 254 MB
> Sep 12 04:03:31 achilles /netbsd: avail memory = 245 MB
>
> So it appears that it neither dumped core nor logged a reason for
> the reboot.  The authlog shows:
>
> Sep 11 20:27:47 achilles sshd[11619]: Accepted password for cwr from 192.168.10.29 port 3058 ssh2
> Sep 12 04:03:29 achilles sshd[433]: Server listening on :: port 22.
> Sep 12 04:03:29 achilles sshd[433]: Server listening on 0.0.0.0 port 22.
> Sep 12 12:01:45 achilles sshd[785]: Accepted password for cwr from 65.241.132.123 port 1174 ssh2
>
> Which appears to indicate that no one became root anywhere near
> the time of the reboot (and a more thorough search of the log
> confirms that no one had done so within days, other than me).
>
> What's the proper way to go about diagnosing this (oh, and please
> cc me, as I'm not on this list)?
>

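Is this machine on a UPS? If not, could it have been a power flicker?
With no core dump and nothing logged just before the restart, an abrupt
power loss would fit what you're seeing.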
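
If it helps to check the logs for that, here is a minimal sketch (not a
definitive tool) that walks a syslog file and flags each kernel boot banner
that was not preceded by any sign of an orderly shutdown. The function name
classify_boots and both patterns are my own assumptions: I'm assuming a clean
shutdown leaves a "syslogd: exiting on signal" line or a line tagged
shutdown:, reboot: or halt:, and that a boot is marked by the
"/netbsd: NetBSD <version>" banner as in the quoted log. Note that
"syslogd: restart" alone is not treated as a boot here, on the assumption
that it can also appear when syslogd is merely told to reopen its logs.

```python
import re

# Assumed marker for a boot: the kernel banner, as seen in the quoted log.
BOOT_RE = re.compile(r"/netbsd: NetBSD \d")

# Assumed markers for an orderly shutdown; adjust to what your logs contain.
CLEAN_RE = re.compile(
    r"syslogd: exiting on signal|\b(?:shutdown|reboot|halt)(?:\[\d+\])?: ",
    re.IGNORECASE,
)


def classify_boots(lines):
    """Return a list of (banner_line, was_clean) for each boot banner found.

    was_clean is True if a clean-shutdown marker appeared since the previous
    boot banner (or since the start of the input).
    """
    results = []
    clean_seen = False
    for line in lines:
        line = line.rstrip("\n")
        if CLEAN_RE.search(line):
            clean_seen = True
        elif BOOT_RE.search(line):
            results.append((line, clean_seen))
            clean_seen = False  # start looking for the next shutdown
    return results
```

Something like classify_boots(open("/var/log/messages")) would then list the
boots; an entry with was_clean False would point toward power loss, a hard
reset, or a panic that left no dump, rather than someone asking for a reboot.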

-- 
-- Skylar Thompson (skylar@cs.earlham.edu)
-- http://www.cs.earlham.edu/~skylar/

