Port-vax archive
RE: KA655 Kernel Compile and Corrupt Object Files
Thanks all for your thoughts. The CPU (and ECC memory) really do seem fine, even under load -- including when pages are swapped back in over the network. A DMA- and/or load-related problem in the DEQNA or its driver seems likely to me for a few reasons:
- DEQNAs are seemingly notorious for poor behavior...
- ...indeed, I recently noticed that mine is occasionally producing this kernel message: "qe0: xmit logic died, resetting..."
- ... and, related, my reseller* told me that a minimum revision of the DEQNA (which I *did* meet/exceed for both of my DEQNAs) was required for proper interop with the KA655, suggesting that some issues were seen.
I will be double-checking my original assertions and looking closely at the object files to see whether their contents are recognizable at all compared to the known-good simh build. I agree with Mouse's suspicion that a successful link of the hardware-compiled object files would likely produce an inoperable kernel, and I may test to confirm. I MIGHT also try testing with my hardware KA630, although math tells me a few weeks would be required for the compile.
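A minimal sketch of that object-file comparison, assuming both kernel compile directories have been copied to one host where Python 3 is available. The function names and the idea of passing the two directories on the command line are illustrative only; nothing here comes from the NetBSD tree.

```python
import hashlib
import sys
from pathlib import Path


def sha256_of(path):
    """Return the SHA-256 hex digest of a file's contents."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def first_difference(a, b):
    """Return the offset of the first differing byte, or None if identical."""
    data_a = Path(a).read_bytes()
    data_b = Path(b).read_bytes()
    for offset, (x, y) in enumerate(zip(data_a, data_b)):
        if x != y:
            return offset
    # One file is a strict prefix of the other: they diverge where the
    # shorter one ends.
    if len(data_a) != len(data_b):
        return min(len(data_a), len(data_b))
    return None


def compare_object_dirs(good_dir, suspect_dir):
    """Compare *.o files by name and return a {filename: status} report."""
    good = {p.name: p for p in Path(good_dir).glob("*.o")}
    suspect = {p.name: p for p in Path(suspect_dir).glob("*.o")}
    report = {}
    for name in sorted(good.keys() | suspect.keys()):
        if name not in good:
            report[name] = "only in suspect"
        elif name not in suspect:
            report[name] = "only in good"
        elif sha256_of(good[name]) == sha256_of(suspect[name]):
            report[name] = "identical"
        else:
            offset = first_difference(good[name], suspect[name])
            report[name] = f"differs at offset {offset}"
    return report


if __name__ == "__main__" and len(sys.argv) == 3:
    # Hypothetical usage: python3 cmpobj.py <simh-compile-dir> <hardware-compile-dir>
    for name, status in compare_object_dirs(sys.argv[1], sys.argv[2]).items():
        print(f"{name}: {status}")
```

The first differing offset is only a starting point: a difference confined to one small region points toward something like an embedded timestamp or path, while differences scattered through the file are more consistent with corruption.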
As for reproducible builds, while the date-times may differ, I wouldn't expect object files to change between builds (aside from maybe vers.o), given the same toolchain options and source inputs**. Am I mistaken? I will try to learn more about GCC and how it writes object files. I'll also be looking at if_qe.c. I've never worked on a compiler, nor a kernel-mode (and network) driver before, so this should be fun! Reading list suggestions appreciated.
I WOULD like to test with another NIC, but all I have is another DEQNA. I cannot acquire a DELQA(/T) at this time. Is anyone willing to loan a DELQA out? (I'm in the northwest of the US -- Seattle.) I believe the DEQNA and DELQA use the same (BA123) cabinet kit, or at least I thought I heard that once?
All of that will take time***, and I HAD hoped to prioritize some improvements to the npf parser as my debut NetBSD and current spare-time project... so I guess we'll see which task wins.
Thanks again!
* Mitch at Keyways -- he seems GREAT, frankly!
** Also, not using build.sh. Using config $kern_config && cd ../compile/$kern_config && make depend && make
*** Perhaps not surprising on less cycle-rich ISAs, such as VAX.
--
Kind regards,
Josh
-----Original Message-----
From: port-vax-owner%NetBSD.org@localhost <port-vax-owner%NetBSD.org@localhost> On Behalf Of Rhialto
Sent: Saturday, May 24, 2025 7:44 AM
To: Johnny Billquist <bqt%softjar.se@localhost>
Cc: Mouse <mouse%Rodents-Montreal.ORG@localhost>; port-vax%netbsd.org@localhost
Subject: Re: KA655 Kernel Compile and Corrupt Object Files
On Sat 24 May 2025 at 14:52:18 +0200, Johnny Billquist wrote:
> Yeah, it do seem it is non-deterministic. But even more weird is that
> most object files come out differently (even when valid) on simh
> compared to real hardware. But both "valid".
>
> Are the two systems really running the same toolchain? This just
> sounds more strange by the second...
>
> Is gcc prone to now produce non-deterministic output?
I don't think build.sh has reproducible builds enabled as default, does it? So at least lots of time stamps would differ between builds.
> Johnny
-Olaf.
--
___ Olaf 'Rhialto' Seibert <rhialto/at/falu.nl>
\X/ There is no AI. There is just someone else's work. --I. Rose