NetBSD-Bugs archive

port-alpha/40604: AlphaServer DS20E loses HDs and other Drives when adding 1 GB more RAM



>Number:         40604
>Category:       port-alpha
>Synopsis:       AlphaServer DS20E loses HDs and other Drives when adding 1 GB more RAM
>Confidential:   no
>Severity:       critical
>Priority:       medium
>Responsible:    port-alpha-maintainer
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Tue Feb 10 22:55:00 +0000 2009
>Originator:     Marco Poli
>Release:        5.0 RC1
>Organization:
Design Lab - dlab.poli.usp.br
>Environment:
>Description:
Ok, that's one of the weirdest bugs I have ever faced.

I recently got an AlphaServer DS20E with two 833 MHz CPUs, 5 hard drives and 3 
power supplies. Nice server.

The machine came with 1 GB RAM, 4x256 MB DIMM boards placed in Bank 0.

I installed NetBSD 4.0.1 without any issues and immediately set up the 5 hard 
drives in a RAID configuration, with root and swap on RAID1 and /usr on 
RAID5. All working very nicely.

One day I received 8 more of those 256 MB DIMMs and hurried to upgrade the 
server. I installed the boards in Banks 1 and 2.

To my surprise, on the next boot I was faced with this mysterious output:

-----
probe(esiop0:0:0:0): request sense for a request sense ?
probe(esiop0:0:0:0): request sense failed with error 22
probe(esiop0:0:0:0): generic HBA error
-----
Those 3 messages repeat for each of my other 4 hard drives.

Then everything ends with the mysterious:
-----
WARNING: can't figure out what device matches "SCSI 1 7 0 0 0 0 0"
-----
That should be my boot and root device, dka0.

The next line asks me to set the root device, but when I hit any key, the 
following line immediately appears 3 times:

----
root device:
stray isa irq 1
stray isa irq 1
stray isa irq 1
use one of: fxp0 fd0[a-h] cd0[a-h] ddb halt reboot
stray isa irq 1; stopped logging
----

As you can see, none of my hard drives are listed. The first time that 
happened I assumed a static discharge had physically damaged my SCSI bus, 
but after a boot into Linux, everything seemed just fine hardware-wise.

Ok, so, let's try to boot from the CD-ROM (dqa0 in my case): now the same thing 
happens, but with this warning instead:

----
WARNING: can't figure out what device matches "IDE 0 105 0 0 0 0 0"
----

Everything else is the same, and the CD does not show up in the list of 
available root devices either.


When I remove the extra memory and leave only Bank 0 full, that is only 1 GB, 
everything gets back to normal.

Linux 2.6 works just fine with 2 GB or 3 GB of total memory, no issues noticed 
in the 2 or 3 days of uptime with this configuration.

This bug *might* be related to #38941, with the difference that in my case it 
never really hangs; it just ends up stuck with no root disk. The install CD even 
gets the installation script running, but tells me there is nowhere to install 
to. I am able to use ddb at any point, and to quit and restart the installation 
script.

I can't say for sure, but I don't think it is related to #37915.

Machine is a:

---
COMPAQ AlphaServer DS20E 833 MHz, s/n ...
---

The SCSI device:
---
esiop0 at pci1 dev 7 function 0: Symbios Logic 53c895 (ultra2-wide scsi)
esiop0: using on-board RAM
esiop0: interrupting at dec 6600 irq 47
---

I am sorry for any typos; I was unable to copy and paste the actual screen, so 
this is a newly typed-in reproduction.

Please tell me if I can provide any other useful information.

Thanks!
>How-To-Repeat:


Try to boot a DS20E with more than 1 GB of memory, or with memory in banks 
other than Bank 0; I can't tell which of the two conditions triggers it.
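
When comparing boots with different memory configurations, it may help to check 
a captured console log for the messages quoted in the description. The following 
is only a minimal Python sketch and not part of the original report: the 
function name scan_boot_log and the signature labels are hypothetical, and the 
patterns are copied from the retyped console output in this report, so they may 
need adjusting against a real capture.

```python
import re

# Failure signatures taken from the console output quoted in this report.
# The dictionary keys are hypothetical labels chosen for this sketch.
SIGNATURES = {
    "request_sense_failed": re.compile(r"request sense failed with error \d+"),
    "generic_hba_error": re.compile(r"generic HBA error"),
    "root_not_matched": re.compile(r"can't figure out what device matches"),
}


def scan_boot_log(text):
    """Count how many lines of a boot log match each failure signature."""
    counts = {name: 0 for name in SIGNATURES}
    for line in text.splitlines():
        for name, pattern in SIGNATURES.items():
            if pattern.search(line):
                counts[name] += 1
    return counts
```

Calling scan_boot_log() on the text of a saved serial console capture returns a 
dictionary of counts per signature; a boot without the problem would be 
expected to show zero for all three.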
>Fix:


