tech-kern: Re: More stuff

Subject: Re: More stuff
To: Bill Paul <wpaul@FreeBSD.ORG>
From: Alan Ritter <rittera@cc.wwu.edu>
List: tech-kern
Date: 08/24/2005 17:32:28
>
> I checked out a copy of your code from the CVS repo and looked at it
> a bit. I also read up a little bit on the NetBSD kernel's locking
> primitives. There are a few things I'd like to suggest.
>
> First, I noticed that in several places you're using SCHED_LOCK() to
> block scheduling. I'm not positive, but I think you're using the API
> incorrectly. It looks like SCHED_LOCK() has the same semantics as
> splxxx(). That is, you should be using it like this:
>
>         int             s;
>
>         s = SCHED_LOCK();
>         /* do critical stuff */
>         SCHED_UNLOCK(s);
>
> But you appear to be using it like this:
>
>         int             s;
>
>         SCHED_LOCK(s);
>         /* do critical stuff */
>         SCHED_UNLOCK(s);
>
> It's possible the man page for SCHED_LOCK() that I read is out of
> date, but you should check this.

Thanks, I didn't actuially put the SCHED_LOCK's in there myself, I sort of inherited
this project from someone else to begin with (http://imil.net/wp/?page_id=21),
(http://imil.net/wp/?page_id=21).  He got all the files in compat/ndis to compile,
but
there were still a bunch of linker errors when trying to load the LKM.  Anyway, this
is good to know, I'll look into it.

> I noticed you're using SCHED_LOCK() in place of mtx_lock_spin().
> This isn't quite right: it will fail to provide protection on an
> SMP machine. The NDISulator uses a spinlock to guard the job queues
> used by the taskqueue and swi threads. It's ok to use a spinlock
> for this because the critical code is small and does not do
> anything that might cause the thread to sleep. It doesn't have
> to be a spinlock though.
>
> At first glance, I thought the right thing to use here would be
> a simplelock, but I'm not positive about that. A lockmgr lock
> would also work just fine. But SCHED_LOCK() is not the right
> solution.

Yes, I think your right, SCHED_LOCK() doesn't make much sense in place of
mtx_lock_spin().  I think mtx_lock_spin() is roughly equivelant to splXXX() and a
simplelock, as mtx_lock_spin() is supposed to block interrupts as well as spin as I
understand it
(http://www.freebsd.org/cgi/man.cgi?query=mtx_lock&apropos=0&sektion=9&manpath=FreeBSD+5.4-RELEASE+and+Ports&format=html)
I'm pretty sure that in NetBSD kernel threads always run to completion unless they
voluntarily give up the processor, which seems very differnt from what's going on in
FreeBSD.  Also NetBSD interrupts have no user context, and thus cannot sleep.

Since an interrupt is all that can preempt a kernel thread, on a single processor
system I think the only thing I need to do to protect a resource is splnet()/splx().
 I will need to add locks for multiprocessor support eventuially, however.

I originally wanted to just re-define the FreeBSD mtx_lock(), mtx_lock_spin(),
NDIS_LOCK(sc) and all their unlocking counterparts, as this would keep things real
simple.  If you look in the nbcompat.[ch] files you can see most of what I'm trying
to do here.  I've tried multiple approaches, most of which are commented out for
trying different combinations (sorry about my sloppyness).  Now, however I'm
starting to think that this isn't really going to work, as it appears that doing
this causes problems where the IPL gets stuck at IPL_NET from inconsistent use of
splnet()/splx() (at least I think this is what's going on).

Anyway, I'm also thinking that NetBSD and FreeBSD synchronization mechanisms are too
differnt, and I'll have to go through and re-do much of it for NetBSD anyway, I've
already converted most of the stuff in kern_ndis.c, and if_ndis.c.  I'm just
replacing mtx_lock_spin() and NDIS_LOCK() with splnet().  I know this won't work for
multiprocessors, but I'm just focusing on single-processor for right now.

I'm hoping that there isn't anything in NDIS that depends on preemption of kernel
threads, as then NetBSD having this could be a big problem.

> Your implementation of KfRaiseIrql() and KfLowerIrql() in subr_hal.c
> also seems to reference SCHED_LOCK(), though it doesn't actually block
> pre-emption with it. Actually, the current code doesn't do anything
> at all. And neither do KeAcquireSpinLockAtDpcLevel() and
> KeReleaseSpinLockFromDpcLevel(). This means the Windows spinlock
> API in the NDISulator does nothing at all, which will almost
> certainly cause you serious problems. The e100bex driver is a
> de-serialized miniport and depends on spinlocks to work correctly
> (it calls NdisAcquireSpinLock() and NdisReleaseSpinLock(), but
> those routines just call the underlying functions in subr_ntoskrnl.c
> and subr_hal.c).

Again I think this should only be an issue on multiprocessors.  Since NetBSD kernel
threads always run to completion (unless interrupted).  But yes, I should put in a
spinlock here, it wouldn't be that hard.

By the way, I tried raising the IPL with splnet() here, and that just caused problems.

> The fact that NetBSD doesn't have the same atomic ops as FreeBSD
> is a little disappointing. For now, I would just copy them from
> atomic.h in FreeBSD (src/sys/i386/include/atomic.h). Since you're
> mainly interested in x86 arch support for now, the x86 version
> should be enough. That should let you get the spinlock test-and-set
> routines working.

Yep, these are in nbcompat.h.

> KfRaiseIrql() and KfLowerIrql() are a bit tougher. I don't think
> using splsched() is the right solution. I would try the approach
> I suggested earlier, of creating a 'dispatch level' sleep lock for
> each CPU. I experimented with this approach on my laptop and it
> seemed to work as expected. At the very least, it's better than
> the no-op code you have now. :)

Maybe I'm misunderstanding your idea, but the point of this is so that at
DISPATCH_LEVEL, we are only interrupted by interrupts, and not preempted by the
scheduler?

If this is the case, then I don't think anything needs to be done on NetBSD at
DISPATCH_LEVEL, as I believe NetBSD kernel threads won't be preempted by the
scheduler.  I know I should probably have it raise the IRQL with splnet() when it's
at
DEVICE_LEVEL, but I think I was having some problems with this (I'll get back to it).

It could be the case that I'm totally off base here, if so someone please let me know!

> Also, I noticed the following in kern_ndis.c:
>
> #ifdef __FreeBSD__
>                 error = kthread_create(ndis_runq, &ndis_tproc,
>                     &ndis_tproc.np_p, RFHIGHPID,
>                     NDIS_KSTACK_PAGES, "ndis taskqueue");
> #else
>                 error = kthread_create1(ndis_runq, &ndis_tproc,
>                     &ndis_tproc.np_p, "ndis taskqueue");
> #endif
>
> I should explain the use of NDIS_KSTACK_PAGES here, as it's important.
> In Windows, each kernel thread as a stack size of 3 pages (12K). In
> BSD (well, FreeBSD at least), the per-thread stack size is 2 pages
> (8K). Most Windows drivers do a reasonable job of using stack space,
> but there are some exceptions. One in particular is the Intel 2200BG
> 802.11g Centrino driver. I discovered that when this driver associates
> with a wireless net, it does an explicit alloca() of a little over
> 5000 bytes. This occurs in the interrupt handler, which is running
> in the swi thread. This usually works ok in Windows, but in my testing
> on FreeBSD, there wasn't enough space available and the result is
> that the driver overflowed the stack. In FreeBSD, this triggers
> a double fault panic (one fault when the code steps on the unmapped
> stack guard page, and another in the trap handler because it too
> tries to touch the unmapped stack guard page). To combat this, I
> created the taskqueue and swi threads with larger than normal stacks,
> hence the NDIS_KSTACK_PAGES to override the default. (Along the
> way, I also found out that support for creating threads with
> non-default stack sizes was broken, and nobody had noticed until
> I tripped over the problem.)
>
> I don't know if NetBSD lets you create threads with larger than
> normal stack allocations. If it does, you need to use what ever
> API allows this. Beware of the problems blowing the stack can cause:
> it was largely dumb luck that I figured out what the problem was as
> quickly as I did.

Thanks, this is good to know.

> I'm not really sure what's causing your 'bogus program counter'
> crashes. I haven't seen the panic messages or debug output so I
> can't really speculate. Bear in mind though that the taskqueue
> and swi threads do explicit jumps to various routines when they
> invoke the jobs on their job queues:
>
>                         /* Do the work. */
>
>                         if (r->nr_func != NULL)
>                                 (*r->nr_func)(r->nr_arg);
>
> If the job queue structures are hosed, you'll end up branching
> into hyperspace. Another possibility is there's still a calling
> convention bug somewhere that's causing a stack botch an a return
> to an invalid address.

Yes, I was thinking these structures might be getting messed up somewhere.  I'll do
some tracing.  The only way I've found to reproduce this problem is with a browser,
so I need X running, and the whole OS crashes as soon as the problem occours, so I
can't see the DbgPrint output from the Windows driver.  I'm still trying to figure
out how to get a peek at it.

> When debugging this, you should try to find out what thread the
> crash occured in. (Was it in the taskqueue or swi thread?) You should
> also do your damndest to get a stack trace. If you can't try to
> examine the contents of the stack manually, looking for values that
> could be return addresses. Compare these values to the kernel
> namelist trying to find out what function it was that jumped off into
>
> the void.

Ok, I'll start digging through the disassembly.  The stack appears to be hosed when
it crashes (The gdb backtrace command shows only one frame).  Your right, I should
check out what's on the stack.

> Don't try to figure out what web pages cause a crash and which ones
> don't. Different apps won't use different features of the driver. They
> may however affect the timing of code execution. My money says the main
> problem here is the broken spinlock implementation.

This makes sense, it seems probable that the problem is a synchronization issue
having to do with the task queue, causing a bad jump.

Also just in case you are thinking about compiling and running it let me know, I
could give you a copy of my kernel configuration files (actuially now that I think
about it, I'll commmit them to the repository).  In addition I don't think it will
work on NetBSD 2.0.2, I've been using 3.99.7 (NetBSD-current, but I haven't update
in a while, as I'm afraid it might not compile).  It still might take a while
however, I found it to be kind of a pain to update to NetBSD-current.  I've been
meaning to get the LKM version up and working, as this would be much easier to test
out (right now I think it will crash while it's loading).

Anyway, thanks again for all your help :-)