Subject: Re: uvm_fault kernel: page fault trap while un-tar-ing a large file
To: Edgar Fuß <ef@math.uni-bonn.de>
From: Juergen Hannken-Illjes <hannken@eis.cs.tu-bs.de>
List: tech-kern
Date: 06/22/2007 23:05:28
On Fri, Jun 22, 2007 at 10:39:11PM +0200, Edgar Fuß wrote:
> >Just built GENERIC.MP from my NetBSD4 tree and it looks like Line 729
> >is right and the error comes from TAILQ_REMOVE when it assigns
> >*(elm)->field.tqe_prev = (elm)->field.tqe_next.
> OK, it looks like you're right and I inspected the wrong netbsd.gdb.
> 
> I added a test just before the TAILQ_REMOVE that would panic
> if dq->dq_freelist.tqe_prev was NULL and give me the value of dq.
> After a few tries, I managed to hit the panic. I dumped and gdb'ing
> the dump revealed that the code was trying to remove an entry from the
> free list that wasn't on it in the first place.
> 
> Everything else with that entry looks reasonable:
> $20 = {dq_hash = {le_next = 0x0, le_prev = 0xffff800008fb9bf0},
>   dq_freelist = {tqe_next = 0x0, tqe_prev = 0x0}, dq_flags = 12,
>   dq_cnt = 0, dq_spare = 0, dq_type = 0, dq_id = 10060,
>   dq_ump = 0xffff800009584a00, dq_dqb = {
>     dqb_bhardlimit = 0, dqb_bsoftlimit = 0, dqb_curblocks = 11101240,
>     dqb_ihardlimit = 0, dqb_isoftlimit = 0, dqb_curinodes = 65536,
>     dqb_btime = 1183141531, dqb_itime = 1183141531}}
> 
> The id is correct, the mount pointer is correct.
> The reference count is zero, so it should be on the free list.
> I think the value of curinodes looks suspicious.
> 
> I can give more information from the dump if needed.

Just guessing -- is it possible to overflow dq_cnt?

Could you add this:

	dqref(struct dquot *dq)
	{

	dq->dq_cnt++;
+	if (dq->dq_cnt == 0)
+		panic("dq_cnt overflow");
	}
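
To make the guess concrete, here is a minimal user-space sketch of the
wraparound. It assumes dq_cnt is a 16-bit unsigned field; the width is
not stated in this thread, so treat that as an assumption. The names
struct dquot_sketch, dqref_sketch and refs_until_wrap are made up for
illustration and are not kernel code. If the assumption holds, the
count wraps to zero on the 65536th reference, which would at least be
consistent with dqb_curinodes = 65536 in the dump.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for struct dquot; only the counter matters here.
 * ASSUMPTION: the reference count is 16 bits wide.
 */
struct dquot_sketch {
	uint16_t dq_cnt;
};

/*
 * Mirrors the proposed check: returns 1 if the increment wrapped the
 * counter to zero (where the kernel version would panic), 0 otherwise.
 */
static int
dqref_sketch(struct dquot_sketch *dq)
{
	assert(dq != NULL);
	dq->dq_cnt++;
	return dq->dq_cnt == 0;
}

/* Take references on a fresh entry until the counter wraps. */
static unsigned long
refs_until_wrap(void)
{
	struct dquot_sketch dq = { 0 };
	unsigned long n = 0;

	do {
		n++;
	} while (!dqref_sketch(&dq));
	return n;
}
```

With a 16-bit counter, refs_until_wrap() returns 65536. After that
point a single release could make an entry that is still referenced
look unreferenced.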

-- 
Juergen Hannken-Illjes - hannken@eis.cs.tu-bs.de - TU Braunschweig (Germany)