NetBSD-Bugs archive


port-vax/60293: ./configure "checking for working mmap..." from clisp-2.49 freezes user space on NetBSD/vax 10.1



>Number:         60293
>Category:       port-vax
>Synopsis:       ./configure "checking for working mmap..." from clisp-2.49 freezes user space on NetBSD/vax 10.1
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    port-vax-maintainer
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Mon May 25 23:50:00 +0000 2026
>Originator:     Alexander Schreiber
>Release:        10.1
>Organization:
not much
>Environment:
NetBSD isengart.angband.thangorodrim.de 10.1 NetBSD 10.1 (ISENGART) #1: Sun May 24 00:10:12 UTC 2026  root@isengart:/usr/src/sys/arch/vax/compile/ISENGART vax

>Description:
Running the ./configure script from clisp-2.49 natively (it's packaged as lang/clisp, so grab the .tar.gz from distfiles, unpack, and run ./configure) appears to freeze user space on NetBSD/vax 10.1. The kernel I'm running,
ISENGART, is GENERIC with DIAGNOSTIC enabled and the patches from https://releng.netbsd.org/cgi-bin/req-10.cgi?show=1080 pulled in (so that it actually boots with DIAGNOSTIC). However, the same freeze also happens with plain GENERIC.

Observed behaviour:
 - ./configure starts
 - runs a lot of the usual tests
 - then prints "checking for working mmap..." and hangs
 - turns out, the entire machine (well, user space) hangs:
   - top running in a different ssh session stops updating, no longer responds to inputs
   - new incoming ssh connections accepted by kernel, no response from actual sshd
   - kernel still responds to ping
   - login via system console no longer possible due to no response after entering the username

I've reduced the offending code to the following short snippet (took the conftest.c generated by
configure and did some hacksawing):

----------------------- cut here for new monitor ---------------------------
#include <stdlib.h>
#include <stdio.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/mman.h>
#include <signal.h>


/* A POSIX signal handler.  */
static void
exception_handler (int sig)
{
  exit (1);
}
static void
nocrash_init (void)
{
  signal (SIGSEGV, exception_handler);
  signal (SIGBUS, exception_handler);
}

int main () {

  fprintf(stderr, "starting up ..\n");
  int flags = MAP_ANON | MAP_PRIVATE;
  int fd = -1;
  nocrash_init();
#define bits_to_avoid 0
#define my_shift 24
#define my_low   1
#define my_high  64
#define my_size  8192 /* hope that 8192 is a multiple of the page size */
/* i*8 KB for i=1..64 gives a total of 16.25 MB, which is close to what we need */
#define base_address 0
 {long i;
#define i_ok(i)  ((i) & (bits_to_avoid >> my_shift) == 0)
  for (i=my_low; i<=my_high; i++)
    if (i_ok(i))
      { caddr_t addr = (caddr_t)(base_address + (i << my_shift));
/* Check for 8 MB, not 16 MB. This is more likely to work on Solaris 2. */
#if bits_to_avoid
        long size = i*my_size;
#else
        long size = ((i+1)/2)*my_size;
#endif
        fprintf(stderr, "mapping %ld size memory ...\n", size);
        if (mmap(addr,size,PROT_READ|PROT_WRITE,flags|MAP_FIXED,fd,0) == (void*)-1) exit(1);
    }
  fprintf(stderr, "sleeping for 30s\n");
  sleep(30);
#define x(i)  *(unsigned char *) (base_address + (i<<my_shift) + (i*i))
#define y(i)  (unsigned char)((3*i-4)*(7*i+3))
  fprintf(stderr, "stepping over memory, up\n");
  for (i=my_low; i<=my_high; i++) if (i_ok(i)) { x(i) = y(i); }
  fprintf(stderr, "stepping over memory, down\n");
  for (i=my_high; i>=my_low; i--) if (i_ok(i)) { if (x(i) != y(i)) exit(1); }
  fprintf(stderr, "done\n");
  exit(0);
}}

----------------------- cut here for new monitor ---------------------------

When the above snippet is compiled and run, the output on vax is:

----------------------- cut here for new monitor ---------------------------
starting up ..
mapping 8192 size memory ...
mapping 16384 size memory ...
mapping 24576 size memory ...
mapping 32768 size memory ...
mapping 40960 size memory ...
mapping 49152 size memory ...
mapping 57344 size memory ...
mapping 65536 size memory ...
mapping 73728 size memory ...
mapping 81920 size memory ...
mapping 90112 size memory ...
mapping 98304 size memory ...
mapping 106496 size memory ...
mapping 114688 size memory ...
mapping 122880 size memory ...
mapping 131072 size memory ...
mapping 139264 size memory ...
mapping 147456 size memory ...
mapping 155648 size memory ...
mapping 163840 size memory ...
mapping 172032 size memory ...
mapping 180224 size memory ...
mapping 188416 size memory ...
mapping 196608 size memory ...
mapping 204800 size memory ...
mapping 212992 size memory ...
mapping 221184 size memory ...
mapping 229376 size memory ...
mapping 237568 size memory ...
mapping 245760 size memory ...
mapping 253952 size memory ...
mapping 262144 size memory ...
sleeping for 30s
stepping over memory, up
----------------------- cut here for new monitor ---------------------------

At which point user space no longer responds. The same snippet runs fine (in <1s plus the 30s sleep) on
NetBSD 10.1 on the following architectures: i386, amd64, sparc64. The behaviour on vax is the same
whether the snippet is run as uid=0 or as a non-privileged user.

One remaining wrinkle: my "VAX" is a SIMH instance (SIMH v4.0 - 19-01), since I lack a
physical VAX machine. The SIMH VAX config is as follows:

----------------------- cut here for new monitor ---------------------------
set cpu simhalt
; real hardware limit for MV3900: 64m
; set cpu 512m
set cpu 64m
set cpu idle=netbsd
attach nvr nvram.bin
; metric bytes ...
set rq0 rauser=17179
attach rq0 isengart.disk
set rq1 cdrom
attach rq1 NetBSD-10.0-vax.iso
set xq mac=AA-00-04-00-23-42
attach xq0 eno1
----------------------- cut here for new monitor ---------------------------

The SIMH VAX simulator process usually sits at < 40% CPU when the simulated machine running
NetBSD is idle. When user space freezes up, it goes to 100% and stays there, so _something_
is still happening ...

dmesg from the machine booting up:

----------------------- cut here for new monitor ---------------------------
> boot netbsd
4435992+186336 [254016+238904]=0x4e1180
[   1.0000000] Copyright (c) 1996, 1997, 1998, 1999, 2000, 2001, 2002, 2003,
[   1.0000000]     2004, 2005, 2006, 2007, 2008, 2009, 2010, 2011, 2012, 2013,
[   1.0000000]     2014, 2015, 2016, 2017, 2018, 2019, 2020, 2021, 2022, 2023,
[   1.0000000]     2024
[   1.0000000]     The NetBSD Foundation, Inc.  All rights reserved.
[   1.0000000] Copyright (c) 1982, 1986, 1989, 1991, 1993
[   1.0000000]     The Regents of the University of California.  All rights reserved.

[   1.0000000] NetBSD 10.1 (ISENGART) #1: Sun May 24 00:10:12 UTC 2026
[   1.0000000]  root%isengart.angband.thangorodrim.de@localhost:/usr/src/sys/arch/vax/compile/ISENGART
[   1.0000000] MicroVAX 3800/3900
[   1.0000000] total memory = 65468 KB
[   1.0000000] avail memory = 57860 KB
[   1.0000000] mainbus0 (root)
[   1.0000000] cpu0 at mainbus0: KA655, CVAX microcode rev 6 Firmware rev 83
[   1.0000000] lance at mainbus0 not configured
[   1.0000000] uba0 at mainbus0: Q22
[   1.0000000] dz1 at uba0 csr 160100 vec 304 ipl 17
[   1.0000000] mtc0 at uba0 csr 174500 vec 774 ipl 17
[   1.0000000] mscpbus0 at mtc0: version 5 model 3
[   1.0000000] mscpbus0: DMA burst size set to 4
[   1.0000000] uda0 at uba0 csr 172150 vec 770 ipl 17
[   1.0000000] mscpbus1 at uda0: version 3 model 3
[   1.0000000] mscpbus1: DMA burst size set to 4
[   1.0000000] qt0 at uba0 csr 174440 vec 764 ipl 17
[   1.0000000] qt0: delqa-plus in Turbo mode, hardware address aa:00:04:00:23:42
[   1.0000000] rlc0 at uba0 csr 174400 vec 160 ipl 17
[   1.0000000] rl0 at rlc0 drive 0: RL01, drive not loaded
[   1.0000000] rl1 at rlc0 drive 1: RL01, drive not loaded
[   1.0000000] rl2 at rlc0 drive 2: RL01, drive not loaded
[   1.0000000] rl3 at rlc0 drive 3: RL01, drive not loaded
[   1.0000000] ts0 at uba0 csr 172520 vec 224 ipl 17: TS11
[   1.0000000] ts0: rev 0, extended features enabled, transport offline
[   1.0000000] WARNING: system needs entropy for security; see entropy(7)
[   1.6200030] mt0 at mscpbus0 drive 0: TK50
[   1.6200030] mt1 at mscpbus0 drive 1: TK50
[   1.6200030] mt2 at mscpbus0 drive 2: TK50
[   1.6200030] mt3 at mscpbus0 drive 3: TK50
[   1.6200030] ra0 at mscpbus1 drive 0: RA82
[   1.6200030] racd0 at mscpbus1 drive 1: RRD40
[   1.6200030] ra1 at mscpbus1 drive 2: RD54
[   1.6200030] rx0 at mscpbus1 drive 3: RX50
[   1.7400030] swwdog0: software watchdog initialized
[   1.7600030] ra0: size 33567766 sectors
[   1.7600030] racd0: size 1331200 sectors
[   1.7600030] ra1: attempt to bring on line failed:  unit offline (not mounted) (code 3, subcode 1)
[   1.7600030] rx0: attempt to bring on line failed:  unit offline (not mounted) (code 3, subcode 1)
[   1.7600030] WARNING: 2 errors while detecting hardware; check system log.
[   1.7600030] boot device: ra0
[   1.7600030] root on ra0a dumps on ra0b
[   1.7600030] root file system type: ffs
[   1.7600030] kern.module.path=/stand/vax/10.1/modules
----------------------- cut here for new monitor ---------------------------

I've built a new kernel with the following all turned on:

options         DIAGNOSTIC
options         DEBUG
options         PMAPDEBUG
options         TRAPDEBUG
options         LOCKDEBUG

However, the results are the same, and there are no complaints on the console (complaints being what I had hoped for).


>How-To-Repeat:
Boot NetBSD/vax 10.1 using GENERIC on either a real VAX (to be verified) or inside a SIMH simulated MicroVAX 3800/3900 (verified), grab clisp-2.49.tar.gz from pkgsrc distfiles, unpack it, run ./configure, and watch user space eventually freeze up at the "checking for working mmap..." stage.
>Fix:



