current-users: Re: panic: TLB IPI rendezvous failed (mask 4)

Subject: Re: panic: TLB IPI rendezvous failed (mask 4)
To: None <dokas@cs.umn.edu>
From: Erik E. Fair <fair@netbsd.org>
List: current-users
Date: 06/02/2004 10:52:43

TLB = Translation Lookaside Buffer

This is an MMU mapping cache that saves you from having to walk page 
tables in main RAM when you're translating a virtual address to a 
physical address (which is done all the time).
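
To make that concrete, here is a toy model in Python (not kernel code, 
and not how any real MMU is built). The page size, the page table 
contents and the names translate/walks are all made up for 
illustration. It only shows the idea: the slow page-table walk happens 
on a miss, and later accesses to the same page are served from the 
cache.

```python
PAGE_SIZE = 4096

# Hypothetical page table: virtual page number -> physical frame number.
page_table = {0: 7, 1: 3, 2: 9}
tlb = {}      # small cache of recent translations
walks = 0     # how many times we had to consult the page table


def translate(vaddr):
    """Translate a virtual address, using the TLB when possible."""
    global walks
    vpn, offset = divmod(vaddr, PAGE_SIZE)
    if vpn not in tlb:          # TLB miss: do the slow page-table walk
        walks += 1
        tlb[vpn] = page_table[vpn]
    return tlb[vpn] * PAGE_SIZE + offset


first = translate(4100)     # miss: walks the page table
second = translate(4200)    # same page: served from the TLB
```

After both calls, walks is 1, not 2; the second translation never 
touched the page table.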

IPI = Inter-Processor Interrupt (I believe)

This is what some MP systems use to communicate between processors.

You're apparently experiencing a cache coherency management problem. 
One processor had something to say to another (or all the others) 
about TLB coherency, and someone didn't acknowledge the message.
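
A rough sketch of what such a rendezvous looks like, again as a Python 
toy rather than the actual kernel code. I am assuming here that the 
mask in the panic message is a bitmask of the CPUs that had not yet 
acknowledged, with bit n standing for CPU n; the panic text itself 
doesn't say so. The function names and the spin limit are invented.

```python
def tlb_ipi_rendezvous(targets, responsive, max_spins=1000):
    """Model sending a TLB-invalidate IPI and waiting for acknowledgements.

    targets    -- bitmask of CPUs that must acknowledge (bit n = CPU n)
    responsive -- set of CPU numbers that will actually answer (model input)
    Returns the mask of CPUs that never acknowledged (0 means success).
    """
    pending = targets
    for _ in range(max_spins):           # bounded wait instead of spinning forever
        for cpu in responsive:
            pending &= ~(1 << cpu)       # this CPU flushed its TLB and acked
        if pending == 0:
            break
    return pending


def cpus_in_mask(mask):
    """List the CPU numbers whose bits are set in mask."""
    return [n for n in range(mask.bit_length()) if mask & (1 << n)]


# CPU 0 interrupts CPUs 1, 2 and 3; CPU 2 is wedged and never answers.
leftover = tlb_ipi_rendezvous(targets=0b1110, responsive={1, 3})
```

Under that assumption, a leftover mask of 4 (binary 100) would point 
at CPU 2 as the one that never answered, which is where a real kernel 
would give up and panic.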

Modern CPUs all have caches - a fast access copy of what's in main 
RAM. This is because CPUs can almost always run faster than RAM can 
deliver data+instructions, but it's too expensive to build main RAM 
out of the same stuff you make caches from. So you add a cache of 
some size, and hope that the running program + the data it is 
operating on will fit inside the cache so the processor can operate 
at maximum speed.

Cache coherency becomes a concern when there is more than one actor on 
main RAM. This is an issue even in uniprocessor systems because of DMA 
peripherals; when the OS initiates DMA in or out of main RAM to a 
peripheral (e.g. a disk, a network interface), it must make sure that 
any cached data is flushed out to main RAM (in the output case), or 
invalidated (in the input case), so that you don't end up with an 
inconsistent data state between the caches and main RAM. When you do 
this right, you're maintaining "cache coherency."
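
Both DMA directions can be shown with a toy write-back cache in 
Python. This is a model under simplifying assumptions (one address, 
one cache, a "device" that reads and writes RAM directly); the names 
cpu_write, flush and invalidate are mine, not any real kernel 
interface.

```python
ram = {0x100: "old"}    # main memory, as a DMA device sees it
cache = {}              # CPU write-back cache: addr -> (value, dirty)


def cpu_write(addr, value):
    cache[addr] = (value, True)      # stays in the cache; RAM not updated yet


def cpu_read(addr):
    if addr not in cache:            # miss: fetch the line from RAM
        cache[addr] = (ram[addr], False)
    return cache[addr][0]


def flush(addr):
    """Write a dirty line back to RAM (needed before DMA output)."""
    if addr in cache and cache[addr][1]:
        ram[addr] = cache[addr][0]
        cache[addr] = (cache[addr][0], False)


def invalidate(addr):
    """Discard a cached line (needed for DMA input)."""
    cache.pop(addr, None)


# Output case: the device would read stale RAM unless we flush first.
cpu_write(0x100, "new")
stale_out = ram[0x100]       # what the device would see without a flush
flush(0x100)
good_out = ram[0x100]        # what it sees after the flush

# Input case: the device writes RAM behind the cache's back.
ram[0x100] = "from-device"
stale_in = cpu_read(0x100)   # CPU still sees its old cached copy
invalidate(0x100)
good_in = cpu_read(0x100)    # now it re-reads RAM
```

Without the flush the device sees "old", and without the invalidate 
the CPU keeps seeing "new" after the device has already overwritten 
RAM.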

Now, add multiple processors, and a TLB per CPU. The OS must now 
manage the L1/L2/L3 caches on all the processors so that when they're 
operating on main RAM, everyone maintains the same "view" of it, plus 
make sure that the TLBs in each CPU have the same view of the mapping 
between virtual and physical addresses. Oh, plus the DMA peripherals 
previously mentioned.
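
The TLB half of that can be modelled the same way: one shared page 
table, one private TLB per CPU. This is a Python toy with invented 
names (lookup, remap, the shootdown flag), assuming four CPUs; on real 
hardware the "for tlb in tlbs" loop is what the IPIs are for.

```python
page_table = {5: 40}                    # shared: virtual page -> frame
tlbs = [dict() for _ in range(4)]       # one private TLB per CPU


def lookup(cpu, vpn):
    """Translate vpn on the given CPU, filling its TLB on a miss."""
    tlb = tlbs[cpu]
    if vpn not in tlb:
        tlb[vpn] = page_table[vpn]
    return tlb[vpn]


def remap(vpn, frame, shootdown=True):
    """Change a mapping; optionally invalidate it in every CPU's TLB."""
    page_table[vpn] = frame
    if shootdown:
        for tlb in tlbs:
            tlb.pop(vpn, None)


for cpu in range(4):                    # every CPU caches the old mapping
    lookup(cpu, 5)

remap(5, 41, shootdown=False)
stale = [lookup(cpu, 5) for cpu in range(4)]   # all CPUs still use frame 40

remap(5, 42)
fresh = [lookup(cpu, 5) for cpu in range(4)]   # all CPUs now see frame 42
```

If the invalidation is skipped, every CPU keeps translating through 
the old frame even though the page table has moved on.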

Add to this the tension between wanting to keep things in caches as 
long as possible (well, as long as they're being used anyway) both to 
keep the processor running at full speed and to keep traffic off the 
(usually sole) bus to main RAM, versus the need to keep coherency, 
and things get very tricky indeed.

This is why getting an OS to work right on an MP system is such fun.

	Erik <fair@netbsd.org>