Subject: Re: panic: dequeued wrong buf in -current
To: Andreas Wrede <andreas@planix.com>
From: Manuel Bouyer <bouyer@antioche.eu.org>
List: tech-kern
Date: 09/07/2004 00:12:42
[ follow up to tech-kern ]

On Mon, Sep 06, 2004 at 12:05:12PM -0400, Andreas Wrede wrote:
> After upgrading from Aug 6 -current sources to today's I get a "panic:  
> sdstart(): dequeued wrong buf" very early in the boot up sequence.   
> There is another "panic: biodone already  done" in the syncing disks...  
> step and while the kernel produces a core dump, savecore does not  
> recognize it.

Maybe reboot(0x104) would work?

> 
> Note that the root fs is on a RAID-1 set.
> 
> Below, you'll find the traceback and the boot messages:
> Traceback:
> 
> panic: sdstart(): dequeued wrong buf
> Begin traceback...
> sdstart(c1aa9f00,c1a73080,0,4,0) at netbsd:sdstart+0x2ea
> sdstrategy(c1a73080,0,80,0,0) at netbsd:sdstrategy+0x1db
> spec_strategy(cc927874,cc8771f8,100000,404,c05229a0) at  
> netbsd:spec_strategy+0x155
> VOP_STRATEGY(cc8771f8,c1a73080,cc92791c,293,72) at  
> netbsd:VOP_STRATEGY+0x28
> rf_DispatchKernelIO(c1a20000,c1ae5074,1,0,3ddcbf) at  
> netbsd:rf_DispatchKernelIO+0x28b

Juergen Hannken-Illjes has reported in private mail a similar problem,
on sparc64 without raidframe or ccd involved.
He started looking at this, and it appears that sdstart() is called twice,
with one of the calls being interrupted. I followed the call graph but could
not see where this could happen.

Both you and Juergen use the esiop driver, and this driver can call
scsipi_done() from esiop_scsipi_request(). This can likely cause sdstart() to
call itself. Other HBA drivers may do this as well.
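
To make the suspected recursion concrete, here is a toy userland model, not
the real sd(4) or esiop code. It assumes that sdstart() peeks at the head of
the queue, issues the command, and only dequeues afterwards. All toy_* names
are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for struct buf and the sd buffer queue. */
struct toy_buf {
	int id;
	struct toy_buf *next;
};

static struct toy_buf *toy_queue;	/* head of the pending queue */
static int toy_sync_done;		/* requests that still complete synchronously */
static int toy_mismatch;		/* set where the real code would panic */

static void toy_start(void);

/*
 * Model of an HBA request routine that may complete the xfer at once.
 * Completion restarts the queue, so toy_start() is entered again while
 * the outer toy_start() is still running.
 */
static void
toy_adapter_request(void)
{
	if (toy_sync_done > 0) {
		toy_sync_done--;
		toy_start();
	}
}

/* Peek at the head, issue the command, then dequeue and compare. */
static void
toy_start(void)
{
	struct toy_buf *bp, *dq;

	while ((bp = toy_queue) != NULL) {	/* peek only */
		toy_adapter_request();		/* may recurse */
		dq = toy_queue;			/* dequeue afterwards */
		if (dq != NULL)
			toy_queue = dq->next;
		if (dq != bp) {
			toy_mismatch = 1;	/* "dequeued wrong buf" */
			return;
		}
	}
}

/* Run the model over three bufs; returns 1 if the mismatch was hit. */
int
toy_run(int sync_completions)
{
	static struct toy_buf b[3];
	int i;

	assert(sync_completions >= 0);
	for (i = 0; i < 3; i++) {
		b[i].id = i;
		b[i].next = (i < 2) ? &b[i + 1] : NULL;
	}
	toy_queue = &b[0];
	toy_sync_done = sync_completions;
	toy_mismatch = 0;
	toy_start();
	return toy_mismatch;
}
```

With no synchronous completion, toy_run(0) drains the queue cleanly. With a
single synchronous completion, toy_run(1) lets the nested call consume the buf
the outer call had peeked at, so the outer call dequeues something else.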

A workaround would be to add a lock in sdstart() to avoid such recursion,
but this will have an impact on performance, as we lose opportunities to
keep the disk busy.
A better way would be to allow sdstart() to be reentrant.
Basically we need to dequeue the buf before calling the HBA's adapter request.
I see three ways to do this:
1) add a struct scsipi_xfer * argument to scsipi_command(): if this pointer is
   not null it would use this xfer, otherwise it would try to allocate one
   as it does now.
2) make scsipi_command() dequeue the buf itself. We can't do this for every
   command with a buf, so this needs a new flag, or something similar.
3) always dequeue the buf, and use a local FIFO queue when we're out of
   resources.
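
As a rough illustration of the "dequeue before the adapter request" idea, in
the spirit of 1), here is a toy userland sketch. It is not a patch against
sys/dev/scsipi: the xfer is obtained first, so a resource shortage leaves the
buf on the queue, and the buf is off the queue before any request is issued.
All toy_* names and the xfer pool are assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>

struct toy_buf {
	int id;
	struct toy_buf *next;
};

struct toy_xfer {
	struct toy_buf *bp;
};

static struct toy_buf *toy_queue;	/* head of the pending queue */
static int toy_xfers_free;		/* xfers left in the hypothetical pool */
static int toy_sync_done;		/* requests that complete synchronously */
int toy_issued[8];			/* ids in the order they were issued */
static int toy_nissued;

static void toy_start(void);

/* Hypothetical allocator: fails when the pool is empty. */
static int
toy_xfer_get(struct toy_xfer *xs)
{
	if (toy_xfers_free == 0)
		return -1;
	toy_xfers_free--;
	xs->bp = NULL;
	return 0;
}

/* Issue a command using a caller-supplied xfer; may re-enter toy_start(). */
static void
toy_command(struct toy_xfer *xs)
{
	assert(toy_nissued < 8);
	toy_issued[toy_nissued++] = xs->bp->id;
	if (toy_sync_done > 0) {
		toy_sync_done--;
		toy_xfers_free++;	/* completion releases the xfer */
		toy_start();		/* nested call is now harmless */
	}
}

static void
toy_start(void)
{
	struct toy_xfer xs;
	struct toy_buf *bp;

	while (toy_queue != NULL) {
		if (toy_xfer_get(&xs) != 0)
			return;		/* out of resources: buf stays queued */
		bp = toy_queue;		/* dequeue before the request */
		toy_queue = bp->next;
		xs.bp = bp;
		toy_command(&xs);
	}
}

/* Number of bufs still waiting on the queue. */
int
toy_pending(void)
{
	struct toy_buf *bp;
	int n = 0;

	for (bp = toy_queue; bp != NULL; bp = bp->next)
		n++;
	return n;
}

/* Queue three bufs and run; returns how many were issued. */
int
toy_run(int xfers, int sync_completions)
{
	static struct toy_buf b[3];
	int i;

	for (i = 0; i < 3; i++) {
		b[i].id = i;
		b[i].next = (i < 2) ? &b[i + 1] : NULL;
	}
	toy_queue = &b[0];
	toy_xfers_free = xfers;
	toy_sync_done = sync_completions;
	toy_nissued = 0;
	toy_start();
	return toy_nissued;
}
```

In this model a nested toy_start() simply picks up the next buf, so every buf
is issued exactly once and in order, and a shortage of xfers just stops the
loop with the remaining bufs still queued.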

I prefer 1) myself, as it can allow a more flexible error recovery procedure
on resource shortage in other cases too.  However, it's quite intrusive, as
all scsipi_command() calls need to be touched (which means almost all
files in sys/dev/scsipi). As we want to get this pulled up to 2.0, 2) may be
better.

Comments?

-- 
Manuel Bouyer <bouyer@antioche.eu.org>
     NetBSD: 26 years of experience will always make the difference
--