Subject: kvm_mkdb testdb() insufficient
To: None <port-sparc@netbsd.org>
From: der Mouse <mouse@Rodents.Montreal.QC.CA>
List: port-sparc
Date: 11/12/1999 12:55:10
I have a machine (an IPX) I was trying to bring up-to-date with respect
to the source tree I'm using on all my other machines.  After
struggling with weird brokenness ("netstat -n" printing "netstat:
kvm_read: Bad address") and attacking it with chroot and ktrace, I
found that kvm_mkdb wasn't rebuilding kvm.db even though it needed to.
It appears that testdb() is incorrectly thinking the db is up to date.

The version string in the (broken) kvm.db is
"NetBSD 1.4J (SPARKLE) #0: Sat Sep 25 20:06:43 EDT 1999".
/kern/version, on the other hand, says
"NetBSD 1.4J (SPARKLE) #0: Fri Nov 12 00:56:10 EST 1999".

Yet ktracing kvm_mkdb reveals that it's opening /dev/kmem and finding,
in there, a string matching what it finds in kvm.db.  strings on the
kernel (yes, I rebooted!) does not turn up any instances of the old
string.  Neither does dmesg | egrep NetBSD.  The broken version string
seems to be at offset f0194240

   546 kvm_mkdb CALL  lseek(0x3,0,0,0xf0194240,0,0)
   546 kvm_mkdb RET   lseek 0
   546 kvm_mkdb CALL  read(0x3,0xefffe3d8,0x800)
   546 kvm_mkdb GIO   fd 3 read 2048 bytes

which seems to fall into a hole in /netbsd, and kvm_mkdb finds it
because it uses the address from the old kvm.db to look for the version
string in /dev/kmem.
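
To make the failure mode concrete, here is a toy model of that kind of check. It is a sketch only, not the actual kvm_mkdb source: the struct db_record type and the db_looks_current() function are names made up for illustration, and a plain memory buffer stands in for /dev/kmem.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical record of what a db remembers about the kernel it was built from. */
struct db_record {
	size_t version_addr;	/* where the version string lived in that kernel */
	const char *version;	/* the version string itself */
};

/*
 * Toy model of the flawed check: look in memory at the address stored
 * in the db and compare what is there with the string stored in the db.
 * Returns 1 if the db is judged up to date, 0 otherwise.
 */
int
db_looks_current(const char *mem, size_t memlen, const struct db_record *db)
{
	size_t len;

	assert(mem != NULL && db != NULL && db->version != NULL);
	len = strlen(db->version);
	/* Refuse to read past the end of the buffer. */
	if (db->version_addr > memlen || len > memlen - db->version_addr)
		return 0;
	return memcmp(mem + db->version_addr, db->version, len) == 0;
}
```

If stale bytes from an earlier kernel survive at the old address, this check succeeds no matter what the running kernel's version string says, because both sides of the comparison ultimately come from the old kernel.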

My best guess is that I previously booted a larger kernel, which had
its version string at f0194240, and that my current kernel is sized
such that the memory at that address falls in a hole (probably between
text and data) and hence has never been overwritten since.  (I suspect
that power-cycling the machine, or otherwise destroying memory
contents, would cure it.  But I can't do that now because I'm not at
the machine, and I can't write /dev/kmem 'cause it's up multi-user.)

Thus, some other things that kvm_mkdb arguably should check:

- Is the kernel newer than the db?

- If kernfs is mounted, maybe use its version string instead - or also?
   Or perhaps try mounting it somewhere temporary?

- Compare the "version" symbol in the kernel against what's in the db?

					der Mouse

			       mouse@rodents.montreal.qc.ca
		     7D C8 61 52 5D E7 2D 39  4E F1 31 3E E8 B3 27 4B