Current-Users archive


Re: Strange system behavior



On 2010-09-21 05:02, Eduardo Horvath wrote:
On Mon, 20 Sep 2010, Paul Goyette wrote:

About half of the time, the build fails due to some host utility receiving a
"segmentation fault", and almost always it fails on the exact same command and
at the exact same place in the build!  But re-running the failed command
interactively succeeds without any problem.

A "segmentation fault" is a SIGSEGV.  That's generated by the kernel when
a process attempts to access memory outside its address space.  But it
can also be sent by a number of other conditions including another process
calling alarm().  You need to determine the source of the signal, something
that's quite difficult to do.  You can try putting breakpoints in
uvm_fault() where it sends a signal to the process or turning on UVM
debugging to determine if it really is a memory error.  You can try
putting a breakpoint in the signal delivery code and look at the
kernel backtrace, but getting it to trigger on just the signal you want is
extremely difficult.  You can try ktrace-ing the entire build to see
signal delivery.

No. alarm() can only deliver a SIGALRM, and only to your own process. However, kill() can deliver signals to other processes (maybe that was what you were thinking of?). It seems unlikely that anyone is sending a SIGSEGV, though.
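
To illustrate the kill() case, here is a minimal sketch (Python on a POSIX system; the function name and the idle child command are made up for the example). The child only sleeps and never touches invalid memory, yet the parent sees it terminated by SIGSEGV:

```python
import signal
import subprocess
import sys


def kill_child_with_sigsegv():
    """Start an idle child, send it SIGSEGV via kill(), return its status."""
    # The child only sleeps, so it cannot fault on a bad memory access itself.
    child = subprocess.Popen(
        [sys.executable, "-c", "import time; time.sleep(30)"]
    )
    # On POSIX systems send_signal() delivers the signal with kill(2).
    child.send_signal(signal.SIGSEGV)
    # On POSIX a return code of -N means "terminated by signal N".
    return child.wait()


if __name__ == "__main__":
    status = kill_child_with_sigsegv()
    print("child terminated by", signal.Signals(-status).name)
```

From the exit status alone, the parent cannot tell this apart from a real memory fault, which is why the source of the signal has to be established some other way.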

Apart from accessing memory outside your address space, and kill(), hardware problems can generate SIGSEGV.

As for diagnosing, how about looking in the core dump to check the point where the signal was delivered? Then you can see whether it is the same place in the code each time, what operation was being performed, which addresses were involved, and so on.

Diagnosing this will be painful.  It's not clearly a hardware problem.  It
could be a memory problem, a cache coherency problem, or a problem with the
thread context.

Memory problems, as well as cache coherency and so on, are hardware problems. :-)

        Johnny

--
Johnny Billquist                  || "I'm on a bus
                                  ||  on a psychedelic trip
email: bqt%softjar.se@localhost             ||  Reading murder books
pdp is alive!                     ||  tryin' to stay hip" - B. Idol

