Subject: Re: hardware test
To: Christopher Schultz <christopher.d.schultz@comcast.net>
From: Alex Pelts <alexp@broadcom.com>
List: port-cobalt
Date: 05/26/2005 19:52:43
If it is a random segfault, it could be a memory problem or something 
else with the hardware. If the segfault is always at the same place, it 
is most likely a shared library problem (if there is such a thing on 
NetBSD 1.6.1). This could be caused by an incompatibility between the 
package and a shared library on your particular version of the OS. This 
is easy to check by recompiling Apache from scratch. It does not take 
too long, given that you have at least 128 MB.
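
To tell the "random" case from the "same place" case, you can run the 
same command several times and compare where it dies. Here is a rough 
sketch of that idea, assuming a Python 3.7+ interpreter is available 
somewhere; the function names (run_once, classify, triage) are made up 
for illustration and are not part of any NetBSD or pkgsrc tool.

```python
import subprocess
import sys


def run_once(cmd):
    """Run cmd once; return (exit status, last non-empty output line).

    On POSIX systems a negative status means the command itself was
    killed by a signal (for example -11 for SIGSEGV).
    """
    proc = subprocess.run(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        text=True,
        errors="replace",
    )
    lines = [ln for ln in proc.stdout.splitlines() if ln.strip()]
    return proc.returncode, (lines[-1] if lines else "")


def classify(results):
    """Classify a list of (exit status, last line) pairs.

    "consistent": every run failed with the same status and last line,
                  which points at software (e.g. a library mismatch).
    "random":     failures come and go or move around, which points
                  at hardware (e.g. memory).
    """
    failures = [r for r in results if r[0] != 0]
    if not failures:
        return "no failures"
    if len(failures) == len(results) and len(set(failures)) == 1:
        return "consistent"
    return "random"


def triage(cmd, runs=5):
    """Run cmd several times, log each result, and classify the outcome."""
    results = []
    for n in range(1, runs + 1):
        result = run_once(cmd)
        print("run %d: exit %d, last line: %s" % (n, result[0], result[1]),
              file=sys.stderr)
        results.append(result)
    return classify(results)
```

You would call something like triage(["sh", "rebuild.sh"], runs=5), 
where rebuild.sh is a hypothetical script of your own that cleans and 
rebuilds from scratch, so that every run starts from the same state. 
The last output line is only a crude stand-in for "the same place".
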
If you are in the US, memory is not that expensive. I don't want to post 
spam on this mailing list, but I have always been happy with this 
company (http://www.satech.com/). I got my Qube memory there as well, 
around $45 for 128 MB, and they have free shipping now. I know it is a 
bit expensive, but not exceptionally so.

There are several different benchmarks in pkgsrc, so you can pick one 
that exercises memory. Compiles have also been good at exposing memory 
problems: just run a large compile and see if it segfaults.
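
If no memory benchmark is handy, a crude userland pattern test gives a 
first impression. The sketch below is only an illustration, assuming 
Python 3.3 or later; the names (fill, find_bad_offsets, memory_pass) 
are made up. It only touches memory the OS hands to the process and 
goes through the caches, so a clean run proves little and it is no 
substitute for a bare-metal tester like the memtest86 mentioned below.

```python
# Classic byte patterns: all zeros, all ones, and alternating bits.
PATTERNS = (0x00, 0xFF, 0x55, 0xAA)
CHUNK = 1 << 20  # work in 1 MiB pieces to keep temporary copies small


def fill(buf, pattern, chunk=CHUNK):
    """Overwrite every byte of buf with the one-byte pattern."""
    for start in range(0, len(buf), chunk):
        n = min(chunk, len(buf) - start)
        buf[start:start + n] = bytes([pattern]) * n


def find_bad_offsets(buf, pattern, chunk=CHUNK):
    """Return the offsets in buf whose byte differs from pattern."""
    bad = []
    for start in range(0, len(buf), chunk):
        block = buf[start:start + chunk]
        # Fast path: count matching bytes, scan byte by byte only on a miss.
        if block.count(pattern) != len(block):
            bad.extend(start + i for i, b in enumerate(block) if b != pattern)
    return bad


def memory_pass(size_bytes, patterns=PATTERNS):
    """Write and verify each pattern over one buffer of size_bytes.

    Returns a list of (pattern, offset) pairs that read back wrong.
    """
    buf = bytearray(size_bytes)
    errors = []
    for pattern in patterns:
        fill(buf, pattern)
        errors.extend((pattern, off) for off in find_bad_offsets(buf, pattern))
    return errors
```

For example, memory_pass(64 * 1024 * 1024) would exercise a 64 MB 
buffer; pick a size well below your installed RAM so the box does not 
start swapping. Any non-empty result is a strong hint of bad hardware.
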

Thanks,
Alex

Christopher Schultz wrote:
> Philip and Jebus,
> 
> 
>>>so if the POST test was failing how would I know?
>>
>>You'd see it on the console at startup.  And I expect your Qube would
>>probably refuse to boot.
>>
>>
>>>and how complete of a memory test does the Qube do?
>>
>>I have no idea. But I would be extremely surprised if your
>>segmentation faults had anything to do with the hardware.
> 
> 
> Most POST hardware tests are no more in-depth than you could accomplish
> by yourself looking at the chips with a magnifying glass. The only time
> I have ever seen a POST fail the memory was when a chip was only half
> inserted.
> 
> I hardly have any experience with Cobalt hardware, but I know that Intel
> and particularly AMD hardware have problems like this. (I love AMD but
> their chips require so much of the crappy motherboards and RAM that
> companies are building these days...).
> 
> Anyhow, random segfaults CERTAINLY CAN be a hardware problem. If you
> have the ability to do so, try removing one of your RAM chips if you
> have two, or trying another chip if you've got one lying around (yeah,
> right... Qube memory is far too expensive to have a drawer full of it).
> You're likely not to find any problems, though.
> 
> With one exception, every time I've had random segfaults in software
> that I'm convinced is okay, it ends up being a bad motherboard. For the
> Qube, that's pretty much bad news for the whole box, unfortunately. I've
> used a tool on x86 hardware called memtest86 (www.memtest86.com) which
> is extremely useful in finding hardware problems. Perhaps someone has
> something like this for the MIPS architecture? Though I'm not sure how
> you would run it, since these kinds of things generally want to be run
> without an OS or other programs interfering.
> 
> You mentioned that an older version of Apache was working fine.... have
> you tried downgrading to that version?
> 
> Also, there's a very good possibility that you made a mistake upgrading
> your pkgsrc and you hosed an important library. I'm pretty ignorant when
> it comes to the whole NetBSD packaging system (I used the restore CD way
> back in the day), but I'm guessing that you can really break your system
> if you make a wrong move.
> 
> -chris
>