Port-sparc64 archive


Re: Severe deadlock issues with 5.0/MP



On Feb 5, 2009, at 3:48 AM, Anders Lindgren wrote:
  I've tried a couple of things recently. Here's a wrap-up:

- Serial console break doesn't work even when the machine is responding, despite working in OFW, so that's not related to the hang; apparently the kernel is ignoring it. I'll have to see why; I was always under the impression that it was, on the contrary, darn close to impossible to *get it* to ignore it.

- Variants of build.sh [...] tools kernel=GENERIC.MP distribution:

  a) build.sh -j 8 and output to console: hangs within minutes
  b) build.sh      and output to console: hangs after a few hours
  c) build.sh -j 8 > mk.log 2>&1 without tail: same as a)
  d) build.sh      > mk.log 2>&1 without tail: see below

The first run of (d) stopped after a few hours with a zombie process named "(sparc64--netbsd)" (truncated name, but the logfile suggests the command was a sparc64--netbsd-install of some html documentation). I was actually able to ^C the build and restart it with build.sh -u, which seemed to crawl along, but subjectively very slowly. Roughly how long is a "distribution" supposed to take on a 400MHz USII? It went on for several hours despite starting part-way through. This morning I checked on the serial console (which was running "top") and its output had stopped again. I was again given one keystroke of input before all signs of life ceased. No ping response.

Is there any chance that this is raidframe && MP-related? If it could be relevant, I'll install a fresh 5.0_RC1 on a single disk and try again. But I suppose I should try a LOCKDEBUG+DIAGNOSTIC kernel first...
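
For reference, a kernel config along those lines might look roughly like the sketch below. The file name GENERIC.MP.DEBUG is hypothetical, and the sketch assumes GENERIC.MP does not already enable either option (if it does, drop the duplicate line). Both checks add overhead, so builds would likely run slower still.

```
# GENERIC.MP.DEBUG (hypothetical name): GENERIC.MP plus extra checking
include "arch/sparc64/conf/GENERIC.MP"

options 	DIAGNOSTIC	# additional kernel consistency checks
options 	LOCKDEBUG	# additional locking checks
```

It would then be built the same way as above, with kernel=GENERIC.MP.DEBUG in place of kernel=GENERIC.MP.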

I wonder if this is in any way related to the lockups that people on port-vax are talking about. Does this happen on uniprocessor sparc64 machines at all?

          -Dave

--
Dave McGuire
Port Charlotte, FL



