Weird LOCKDEBUG error in revivesa

To: tech-kern%netbsd.org@localhost
Subject: Weird LOCKDEBUG error in revivesa
From: Bill Stouder-Studenmund <wrstuden%netbsd.org@localhost>
Date: Fri, 27 Jun 2008 14:18:24 -0700

I've got the wrstuden-revivesa branch to where kernels boot and almost 
work. A test program is running into an error that is "letting" me debug 
error handling.

I'm running into a LOCKDEBUG error that makes no sense:

        printf("About to destroy sa_mutex %p\n", &sa->sa_mutex);
        mutex_destroy(&sa->sa_mutex);
        printf("Did destroy sa_mutex %p\n", &sa->sa_mutex);
        pool_put(&sadata_pool, sa); 

is triggering a lockdebug error saying that "sa" contains a still-active 
lock. I'm attaching a screenshot of the problem as seen running in VMware 
Fusion. The odd part is that the lock causing the problem, 0xcbbe6f78, is 
the very lock that was passed to mutex_destroy(). I have printfs before and 
after the destroy showing that it sure looks like it was zapped.
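
For readers who want the expected behaviour spelled out, here is a minimal 
user-space sketch of the kind of check involved. It is a toy model under 
stated assumptions, not the kernel's lockdebug code: every toy_* name is 
hypothetical, and a small array stands in for the real lock registry. The 
point is only that once a lock has been removed from the registry, a range 
check over the memory that held it should come back clean.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_MAX_LOCKS 16

/* Hypothetical registry of live lock addresses (a stand-in, not ld_rb_tree). */
const void *toy_live[TOY_MAX_LOCKS];

/* Record a lock as live, as initialising it would. */
void toy_lock_alloc(const void *lock)
{
        for (size_t i = 0; i < TOY_MAX_LOCKS; i++) {
                if (toy_live[i] == NULL) {
                        toy_live[i] = lock;
                        return;
                }
        }
        assert(!"toy registry full");
}

/* Forget a lock, as destroying it would. Returns 1 if it was registered. */
int toy_lock_free(const void *lock)
{
        for (size_t i = 0; i < TOY_MAX_LOCKS; i++) {
                if (toy_live[i] == lock) {
                        toy_live[i] = NULL;
                        return 1;
                }
        }
        return 0;
}

/* Return the first live lock inside [base, base + size), or NULL if none. */
const void *toy_mem_check(const void *base, size_t size)
{
        uintptr_t lo = (uintptr_t)base;
        uintptr_t hi = lo + size;

        for (size_t i = 0; i < TOY_MAX_LOCKS; i++) {
                uintptr_t p = (uintptr_t)toy_live[i];

                if (toy_live[i] != NULL && p >= lo && p < hi)
                        return toy_live[i];
        }
        return NULL;
}
```

In this model, calling toy_lock_alloc() on a lock embedded in a structure, 
then toy_lock_free() on the same address, leaves toy_mem_check() over that 
structure returning NULL. The report above is the opposite outcome: the 
range check still finds the lock after the destroy.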

I further tweaked lockdebug_free() as follows:

        lockdebug_lock_cpus();   
        rb_tree_remove_node(&ld_rb_tree, __UNVOLATILE(&ld->ld_rb_node));
        lockdebug_unlock_cpus();
        if (lockdebug_moredebug)
                printf("freeing lock info for lock %p, node %p\n",
                    lock, __UNVOLATILE(&ld->ld_rb_node));
        ld->ld_lock = NULL;

and I had sa_release set lockdebug_moredebug to 1. You can see that the 
printf triggers in the screenshot when we were cleaning up the mutex in 
the savp structure a few lines above.

Here's the green-text output hand-transcribed for the ASCII readers among
us (including myself):

freeing lock info for lock 0xcbbe7f80, node 0xcbc05ba4
About to destroy sa_mutex 0xcbbe6f78
Did destroy sa_mutex 0xcbbe6f78
Mutex error: pool_did_put: allocation contains active lock

lock address : 0x00000000cbbe6f78 type     :             spin
initialized  : 0x00000000c045327b
shared holds :                  0 exclusive:                0
shared wanted:                  0 exclusive:                0
current cpu  :                  1 last held:                0
current lwp  : 0x00000000cbb57360 last held: 0000000000000000
last locked  : 000000000000000000 unlocked : 0000000000000000
owner field  : 0x00000000fffffff0 wait/spin:              0/0

Thoughts? From looking at the lockdebug output, it sure looks like we 
destroyed the lock. I don't see how we destroyed it w/o calling 
lockdebug_free(), and I don't see how it didn't complain about a problem.

My test machine is VMware Fusion with two CPUs configured, running on a 
Core Duo MacBook Pro.

Take care,

Bill

Attachment: revSAPanic.png
Description: PNG image

