Subject: kern/13353: can not build libc when running a -current kernel
To: None <gnats-bugs@gnats.netbsd.org>
From: None <martin@duskware.de>
List: netbsd-bugs
Date: 07/01/2001 18:51:45
>Number:         13353
>Category:       kern
>Synopsis:       can not build libc when running a -current kernel
>Confidential:   yes
>Severity:       critical
>Priority:       high
>Responsible:    kern-bug-people
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Sun Jul 01 09:50:01 PDT 2001
>Closed-Date:
>Last-Modified:
>Originator:     Martin Husemann
>Release:        1.5W, cvs updated July 1
>Organization:
	
>Environment:
	
System: NetBSD setting-sun.duskware.de 1.5W NetBSD 1.5W (GENERIC) #0: Sun Jul 1 15:12:50 MEST 2001 martin@setting-sun.duskware.de:/usr/src/sys/arch/sparc64/compile/GENERIC sparc64
Architecture: sparc64
Machine: sparc64
>Description:

The problem has been there for more than a week now, maybe two. I first
suspected a hardware failure, but can rule that out now. I also suspected
recent changes in the sparc64 pmap, but Ken Wellsch sees exactly the same
symptoms on his alpha.

So, what happens is this: if I boot a -current kernel, go to
/usr/src/lib/libc, do "make cleandir" there, then "make", and after that
"make install" as root (to speed up the test, I set NOMAN, NOLINT and
NOPROFILE), I get:

install -r  -c  -o root  -g wheel -m 600 libc.a /usr/lib/libc.a
ranlib -t /usr/lib/libc.a
chmod 444 /usr/lib/libc.a
install -r  -c  -o root  -g wheel -m 600 libc_pic.a /usr/lib/libc_pic.a
ranlib -t /usr/lib/libc_pic.a
chmod 444 /usr/lib/libc_pic.a
install -r  -c  -o root  -g wheel -m 444 libc.so.12.76 /usr/lib/libc.so.12.76
ln -sf libc.so.12.76 /usr/lib/libc.so.12.tmp
mv -f /usr/lib/libc.so.12.tmp /usr/lib/libc.so.12
ln -sf libc.so.12.76 /usr/lib/libc.so.tmp
mv -f /usr/lib/libc.so.tmp /usr/lib/libc.so
Segmentation fault - core dumped
install -r  -c   -o root -g wheel -m 444 /usr/src/lib/libc/db/man/btree.3 /usr/share/man/man3/btree.3
unknown group wheel*** Error code 1

The core dump is from cmp.

I then restore /usr/lib/libc.so.12.76, libc_pic.a and libc.a from a backup,
reboot and start an older kernel (I happened to have one built from June 8
or 9 sources lying around), repeat the above procedure (with the same libc
sources and include files installed) and it works just fine.

This is repeatable, more or less: I am not sure the core dump happens every
time, so the exact corruption in the new libc may vary, but the library is
always corrupted.
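
To narrow down where a corrupted library differs from a good one, a library
built under the older kernel could be compared byte by byte with one built
under -current. The following is only a minimal sketch of such a comparison
(much like cmp does), not something that was run for this report. The function
name and the example paths in the usage note are hypothetical. Note that .a
archives typically embed member timestamps, so two builds of libc.a may
differ even without corruption; the shared object is probably the better
candidate.

```python
from typing import Optional

CHUNK = 64 * 1024  # read size in bytes; an arbitrary choice


def first_difference(path_a: str, path_b: str) -> Optional[int]:
    """Return the byte offset of the first difference between two files.

    Returns None when the files are identical. If one file is a prefix of
    the other, the offset returned is the length of the shorter file.
    """
    offset = 0
    with open(path_a, "rb") as fa, open(path_b, "rb") as fb:
        while True:
            a = fa.read(CHUNK)
            b = fb.read(CHUNK)
            if a != b:
                # Locate the first differing byte inside this chunk.
                for i, (x, y) in enumerate(zip(a, b)):
                    if x != y:
                        return offset + i
                # No differing byte in the common part: lengths differ.
                return offset + min(len(a), len(b))
            if not a:
                # Both files ended at the same point with no difference.
                return None
            offset += len(a)
```

For example, first_difference("/tmp/good/libc.so.12.76",
"/tmp/bad/libc.so.12.76") would give the offset at which the two builds first
diverge, assuming both copies were saved to those (hypothetical) locations.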

>How-To-Repeat:

Since it works for others, there must be something special about my system
configuration, but I have no clue what it is. Nothing special is running
on this machine (not even ntpd, as the old binary I happen to have does not
work with -current sparc64 kernels due to alignment issues).

Note that the libc source I have *is* working with both kernels. When I do the
"make" running the old kernel (so I don't get any corruption) and then do the
"make install" running a -current kernel, it works just fine.

Just for reference, here is a process list from the machine in the state where
it did not work (just to show there is nothing special, and perhaps to give a
clue about what may be causing this):

  PID TT STAT    TIME COMMAND
    0 ?? DKs  0:00.00 (swapper)
    1 ?? Is   0:00.03 init 
    2 ?? DK   0:00.00 (pagedaemon)
    3 ?? DK   1:31.46 (reaper)
    4 ?? DK   0:03.99 (ioflush)
    5 ?? DK   0:00.81 (aiodoned)
   85 ?? Ss   0:00.28 /usr/sbin/syslogd -s 
   95 ?? Is   0:00.05 /usr/sbin/rpcbind -l 
  115 ?? Is   0:00.04 /usr/sbin/mountd 
  126 ?? IL   0:00.06 nfsd: server 
  127 ?? IL   0:00.06 nfsd: server 
  128 ?? IL   0:00.07 nfsd: server 
  129 ?? IL   0:00.08 nfsd: server 
  173 ?? Ss   0:00.07 /usr/pkg/sbin/upsmon night-porter 
  176 ?? Is   0:02.35 /usr/sbin/sshd 
  182 ?? Is   0:00.04 /usr/sbin/inetd -l 
  185 ?? Is   0:00.02 /usr/sbin/cron 
  197 ?? I    0:01.48 sshd: martin@notty 
  198 ?? Is   0:00.10 tcsh -c /usr/X11R6/bin/rxvt -display night-porter.duskware.de:0 -ls -bg black -fg grey -
  199 ?? R    0:02.79 /usr/X11R6/bin/rxvt -display night-porter.duskware.de:0 -ls -bg black -fg grey -sl 1000 
14682 ?? I    0:01.52 sshd: martin@notty 
14683 ?? Is   0:00.10 tcsh -c /usr/X11R6/bin/rxvt -display night-porter.duskware.de:0 -ls -bg black -fg grey -
14684 ?? I    0:01.01 /usr/X11R6/bin/rxvt -display night-porter.duskware.de:0 -ls -bg black -fg grey -sl 1000 
  200 p0 Is   0:00.16 -tcsh 
14562 p0 S    0:00.19 -tcsh 
14734 p0 R+   0:00.00 ps ax 
14685 p1 Is   0:00.12 -tcsh 
14689 p1 I+   0:00.15 /bin/sh /usr/bin/send-pr 
14727 p1 I+   0:00.57 /usr/local/bin/me /tmp/p14689 
  187 ?? Is+  0:00.57 -tcsh 

>Fix:
n/a
>Release-Note:
>Audit-Trail:
>Unformatted: