tech-kern archive


Re: NetBSD-10.0/i386 spurious SIGSEGV



> After upgrading i386 XEN3PAE_DOMU to NetBSD 10.0, various daemons on
> multiple machines get SIGSEGV at places I could not figure any reason
> why it happens.  [...]

> Program terminated with signal SIGSEGV, Segmentation fault.
> #0  0xbb610579 in __gettimeofday50 () from /lib/libc.so.12
> (gdb) bt
> #0  0xbb610579 in __gettimeofday50 () from /lib/libc.so.12
> #1  0xbb60ca82 in __time50 (t=t@entry=0xbf7fde88)
>     at /usr/src/lib/libc/gen/time.c:52
> #2  0x0808afdd in update_check_stats (check_type=3, check_time=1717878817)
>     at utils.c:3015

The first thing I'd look at is the userland instruction(s) around the
crash point: disassemble starting at 0xbb610480 or thereabouts and work
forwards until you reach 0xbb610579.  In particular, I'd be interested
in whether it's a store instruction that failed or whether this
happened during a syscall trap.
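
For concreteness, here is a rough sketch of how that inspection might
look in gdb, with the core file loaded as in the backtrace above.  The
two addresses are the ones from this report; the end address
0xbb610580 is just my assumption of "a little past the crash point",
and the comments are guesses at what to look for, not a diagnosis.

```
# Show the instruction at the faulting pc (frame #0).
(gdb) x/i $pc

# Register contents at the time of the fault; useful for seeing which
# address a failing store would have written to.
(gdb) info registers

# Disassemble from a bit before the crash point up to just past it,
# with raw instruction bytes (/r).  Look at whether 0xbb610579 is a
# store to memory or sits right next to the syscall trap instruction.
(gdb) disassemble /r 0xbb610480,0xbb610580

# Confirm where libc.so.12 is mapped, so that the addresses can be
# compared across the different crashes.
(gdb) info sharedlibrary
```

If the disassembly does not line up on 0xbb610579, the start address
probably landed in the middle of an instruction; trying a slightly
different start address usually fixes that.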

Are all the failures in __gettimeofday50?  All in trap-to-the-kernel
calls?

You say "multiple machines"; are those multiple domUs on a single dom0,
or are they spread across multiple underlying hardware machines?  If
the latter, how similar are those underlying machines?  I'm wondering
if perhaps something is broken in a subtle way such that it manifests
on only certain hardware (I'm talking about something along the lines
of "this tickles erratum #2188 in stepping 478 of Intel CPUs from the
Forest Lawn family").

/~\ The ASCII				  Mouse
\ / Ribbon Campaign
 X  Against HTML		mouse%rodents-montreal.org@localhost
/ \ Email!	     7D C8 61 52 5D E7 2D 39  4E F1 31 3E E8 B3 27 4B

