Subject: Ksyms problem located.
To: None <port-hp700@NetBSD.org>
From: Jochen Kunz <jkunz@unixag-kl.fh-kl.de>
List: port-hp700
Date: 11/05/2003 21:08:46
Hi.

Seems I found the ksyms problem. Symptom: the kernel prints this at boot:
biomask 00000034 netmask 0000003c ttymask 0000007e
symb[rbit].bitno == cix!!!
symb[rbit].bitno == cix!!!
symb[rbit].bitno == cix!!!
symb[rbit].bitno == cix!!!
etc. etc. etc.

So I added some debug printfs to kern_ksyms.c:
biomask 00000034 netmask 0000003c ttymask 0000007e
kernel_symtab.sd_symsize=571408 sizeof(Elf_Sym)=16 $1/$2=35713
symb[rbit].bitno == cix!!! key=nuidhash_max val=32867
symb[rbit].bitno == cix!!! key=nfsm_srvwcc val=32872
symb[rbit].bitno == cix!!! key=nfsm_srvwcc val=32872

This means that there is a total of 35713 symbols ("nm netbsd| wc -l"
says 35712) in the kernel symbol table. The problem occurs when the
symbols with index 32867 and 32872 are added to the Patricia tree.

32867? 32872? I think I can see where this is heading!

These numbers are slightly bigger than 2^15 (32768). A look at kern_ksyms.c:
struct ptree {
	int16_t bitno;
	int16_t lr[2];
} *symb;
[...]
symb[nix].lr[bit] = -val;
confirms my suspicion: integer overflow trouble. It also explains why a
stripped-down kernel config doesn't suffer from this problem: a small
kernel has fewer than 2^15 - 1 kernel symbols.

I ran "s/int16_t/int/g" on kern_ksyms.c, recompiled, and the problem
was gone:
biomask 00000034 netmask 0000003c ttymask 0000007e
kernel_symtab.sd_symsize=571408 sizeof(Elf_Sym)=16 $1/$2=35713
scsibus0: waiting 2 seconds for devices to settle...
boot device: ie0
etc.

I can run nm(1) on /dev/ksyms of this kernel, and it works as expected.

Now the question is: Is kern_ksyms.c broken because it uses int16_t,
thus limiting the size of the kernel symbol table to 2^15 - 1 indices?
Or is there yet another toolchain issue because there are too many
symbols in the ELF image?

When I run nm(1) on the ELF image I see lots and lots of symbols like:
00641374 t .LC0

I did a 
$ nm netbsd| grep -v ' t .LC' | wc -l
    9810

This number looks more sane to me. (The 1.6 kernel of my PReP machine
has 9122 symbols. I have no other NetBSD kernel at hand to compare.)
What do these ".LC" symbols mean?
-- 


tschüß,
       Jochen

Homepage: http://www.unixag-kl.fh-kl.de/~jkunz/