Subject: Re: port-i386/36206: Segmentation faults with SMP on i386 multiprocessor kernel
To: None <gnats-bugs@NetBSD.org>
From: Andrew Doran <ad@netbsd.org>
List: netbsd-bugs
Date: 06/14/2007 14:44:03
On Thu, Jun 14, 2007 at 01:35:04PM +0000, Greg Oster wrote:
> The following reply was made to PR port-i386/36206; it has been noted by GNATS.
> 
> From: Greg Oster <oster@cs.usask.ca>
> To: gnats-bugs@NetBSD.org
> Cc: 
> Subject: Re: port-i386/36206: Segmentation faults with SMP on i386 multiprocessor kernel 
> Date: Thu, 14 Jun 2007 07:33:06 -0600
> 
>  shannonr@NetBSD.org writes:
>  > >Number:         36206
>  > >Category:       port-i386
>  > >Synopsis:       Apparently random segmentation faults occur frequently with 
>  > SMP kernel on dual Intel Core 2 system.
>  > >Confidential:   no
>  > >Severity:       critical
>  > >Priority:       high
>  > >Responsible:    port-i386-maintainer
>  > >State:          open
>  > >Class:          sw-bug
>  > >Submitter-Id:   net
>  > >Arrival-Date:   Tue Apr 24 16:15:00 +0000 2007
>  > >Originator:     John R. Shannon
>  > >Release:        NetBSD 4.99.18 (also occurs with 4.0 BETA)
>  > >Organization:
>  > 	johnrshannon.com
>  > >Environment:
>  > System: NetBSD michael.internal.johnrshannon.com 4.99.18 NetBSD 4.99.18 (KERN
>  > EL.MICHAEL) #3: Tue Apr 24 08:20:29 MDT 2007 build@michael.internal.johnrshan
>  > non.com:/usr/obj/import/CURRENT/src/sys/arch/i386/compile/KERNEL.MICHAEL i386
>  > 	dual Intel Core 2 (Merom)
>  > 	dmesg output appended to PR
>  > Architecture: i386
>  > Machine: i386
>  > >Description:
>  > 	Segmentation faults, in different processes, occur every minute or so.
>  > 	The same kernel, without options MULTIPROCESSOR, works fine. Also, a 
>  > 	64-bit kernel does not display this behavior.
>  
>  Intel released a microcode patch that apparently fixes some TLB 
>  issues in certain Core 2 Duo CPUs.  (Look for 
>  "intel microcode update intel core 2 duo" on google).  Maybe a BIOS 
>  update (that contains the microcode update) would be sufficient to 
>  solve this problem?  (My Core 2 Duo box (with the update) runs 
>  flawlessly in MULTIPROCESSOR mode, albeit in 32-bit mode...)

Good news.

Adding a tlbflushg() call in pmap_destroy() works around the problem.
Matthias Drochner tested this and confirmed it. The problem seems to be
limited to Intel Core 2 CPUs. Sorry for not noting this in the PR earlier.
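
For illustration only, here is a rough user-space model of the shape of the
workaround. It is not the NetBSD pmap code: struct pmap_model,
pmap_model_create(), pmap_model_destroy() and the global_flushes counter are
hypothetical names, tlbflushg() is stubbed out so the sketch compiles outside
the kernel, and the exact point inside pmap_destroy() where the flush goes is
an assumption (here: when the last reference is dropped, before anything is
freed).

```c
/* Simplified user-space model of the workaround; NOT the NetBSD source. */
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a pmap: only the reference count is modelled. */
struct pmap_model {
	int refs;
};

/* Counts calls to the stubbed tlbflushg() so the effect can be observed. */
static int global_flushes;

/*
 * Stub. In the i386 kernel, tlbflushg() flushes the TLB including
 * global entries; here it only bumps a counter.
 */
void
tlbflushg(void)
{
	global_flushes++;
}

/* Allocate a model pmap holding a single reference. */
struct pmap_model *
pmap_model_create(void)
{
	struct pmap_model *pm = malloc(sizeof(*pm));

	if (pm != NULL)
		pm->refs = 1;
	return pm;
}

/*
 * Drop one reference. When the last one goes away, flush first and
 * only then free the structure. Returns 1 if the pmap was destroyed,
 * 0 if other references remain.
 */
int
pmap_model_destroy(struct pmap_model *pm)
{
	assert(pm->refs > 0);
	if (--pm->refs > 0)
		return 0;

	tlbflushg();	/* the workaround; placement is an assumption */
	free(pm);
	return 1;
}
```

In this model the flush happens once per destroyed pmap, not once per dropped
reference.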

Andrew