Subject: Re: kern/1774: Problems with identd/kvm_read
To: None <tech-kern@NetBSD.ORG>
From: Joao Carlos Mendes Luis <jonny@gaia.coppe.ufrj.br>
List: tech-kern
Date: 11/20/1995 20:37:49
> > >Number:         1774
> > >Category:       kern
> > >Synopsis:       identd locking in kvm_read/lseek
> 
> > 	Sometimes I notice high CPU utilization, only to discover several
> >       copies of identd running.  System load (loadavg) equals the
> >       number of copies, so I think each one is in an infinite CPU loop.
> > 	When I attach GDB to one of these, it is stuck in the getbuf
> >       function, running kvm_read in the shared lib area.
> >       To be specific, it's stuck inside the lseek() function.
> > 	Since neither identd's getbuf nor kvm_read (lib/libkvm/kvm.c)
> >       has changed since then in -current, the problem may persist there.
> > 	Seems to me to be a problem in /dev/kmem, in lseek, or even a race
> >       condition between identd and a dying socket.
> 
> hmm. interesting -- i have heard this from a few people on irc as well.
> 
> if you are interested in debugging this, could you compile identd
> with -g, and then next time it happens and you attach with gdb try to
> get a call trace. that will let me see where kvm_read is being called
> from (there are many ways to get to that function).
> 
> thanks; hopefully we can squish this bug and i can commit it to the
> openbsd tree.

GOTCHA ???

identd is looping heavily inside the getlist function.  I don't know
the internal structure of PCBs, but surely something is wrong.  The
PCB addresses of all my current connections start with 0xf8..., but
pcbp->inp_next holds a value of 0x400002a0!

Probably a race condition, as I stated before, and a lost pointer
somewhere.
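
To illustrate what I mean, here is a minimal sketch of a list walk that
gives up instead of spinning when it meets a pointer like that.  I have
not checked it against the real getlist source: struct fake_pcb,
fake_kvm_read(), walk_pcbs() and the 0xf8000000 base are all made-up
stand-ins, and "kernel memory" is just a small array.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a PCB; only the link field matters here. */
struct fake_pcb {
    uintptr_t next;   /* "kernel address" of the next PCB */
    int       lport;  /* dummy payload */
};

/* Hypothetical "kernel memory": entry i lives at FAKE_BASE + i * sizeof. */
#define FAKE_BASE  0xf8000000u
#define FAKE_COUNT 4
struct fake_pcb fake_kmem[FAKE_COUNT];

uintptr_t fake_addr(size_t i)
{
    return FAKE_BASE + i * sizeof(struct fake_pcb);
}

/* Link the entries into a healthy circular list: 0 -> 1 -> 2 -> 3 -> 0. */
void fake_ring_init(void)
{
    for (size_t i = 0; i < FAKE_COUNT; i++) {
        fake_kmem[i].next = fake_addr((i + 1) % FAKE_COUNT);
        fake_kmem[i].lport = (int)(1000 + i);
    }
}

/* Made-up reader: copies one PCB out of fake_kmem.
 * Returns the byte count on success, -1 for any address outside the table. */
long fake_kvm_read(uintptr_t addr, void *buf, size_t len)
{
    if (addr < FAKE_BASE || len != sizeof(struct fake_pcb))
        return -1;
    uintptr_t off = addr - FAKE_BASE;
    if (off % sizeof(struct fake_pcb) != 0 ||
        off / sizeof(struct fake_pcb) >= FAKE_COUNT)
        return -1;
    memcpy(buf, &fake_kmem[off / sizeof(struct fake_pcb)], len);
    return (long)len;
}

/* Walk the circular list starting at head.
 * Returns the number of PCBs visited, or -1 if a read fails or the walk
 * takes more than max_steps reads without coming back to head. */
int walk_pcbs(uintptr_t head, int max_steps)
{
    struct fake_pcb pcb;
    uintptr_t addr = head;
    int steps = 0;

    assert(max_steps > 0);
    do {
        if (steps >= max_steps)
            return -1;                      /* list never closes: bail out */
        if (fake_kvm_read(addr, &pcb, sizeof pcb) != (long)sizeof pcb)
            return -1;                      /* bad pointer: bail out */
        steps++;
        addr = pcb.next;
    } while (addr != head);
    return steps;
}
```

After fake_ring_init(), walk_pcbs(fake_addr(0), 100) visits all four
entries.  If one next field is overwritten with something like
0x400002a0, or made to point back into the middle of the list, the walk
returns -1 instead of looping forever.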

Oh, another problem.  I was stepping around the loop with gdb when
suddenly my network connection stopped.  I went to the console
only to see the KDB prompt saying: "stopped in syscall: push $0" or
something like that.  All I had done was "watch *pcbp" and lots of
steps.

I told kdb to continue, and the system came back.  The copy of
identd I was debugging simply got a SIGSEGV.

HTH,

					Jonny


--
Joao Carlos Mendes Luis			jonny@coe.ufrj.br
+55 21 290-4698 ( Job )			jonny@adc.coppe.ufrj.br
Network Manager				UFRJ/COPPE/CISI
Universidade Federal do Rio de Janeiro