Re: evbarm hang

To: tech-kern%netbsd.org@localhost
Subject: Re: evbarm hang
From: Manuel Bouyer <bouyer%antioche.eu.org@localhost>
Date: Fri, 19 Apr 2019 12:24:44 +0200

On Fri, Apr 19, 2019 at 11:10:07AM +0200, Manuel Bouyer wrote:
> [...]
> So cpu 1 is indeed running the LWP holding the spin lock, and it looks
> like it's itself waiting for a mutex.
> Now I have to find why "mach cpu 1" hangs, and how to avoid it ...

I found the reason for the "mach cpu 1" hang (will send a separate mail to
port-arm@ about it).

So now I have stack traces for both CPUs:
[ 1553.4798668] cpu0: Begin traceback...                                   
[ 1553.4798668] 0x9cf6b674: netbsd:db_panic+0x14                              
[ 1553.4798668] 0x9cf6b68c: netbsd:vpanic+0x194                      
[ 1553.4798668] 0x9cf6b6a4: netbsd:snprintf                            
[ 1553.4798668] 0x9cf6b6e4: netbsd:lockdebug_more                    
[ 1553.4798668] 0x9cf6b71c: netbsd:lockdebug_abort+0xc0              
[ 1553.4798668] 0x9cf6b73c: netbsd:mutex_abort+0x34                  
[ 1553.4798668] 0x9cf6b7ac: netbsd:mutex_enter+0x580                 
[ 1553.4798668] 0x9cf6b804: netbsd:pool_get+0x70                     
[ 1553.4798668] 0x9cf6b854: netbsd:pool_cache_get_slow+0x1f4         
[ 1553.4798668] 0x9cf6b8a4: netbsd:pool_cache_get_paddr+0x288        
[ 1553.4798668] 0x9cf6b8c4: netbsd:m_clget+0x34                 
[ 1553.4798668] 0x9cf6b924: netbsd:dwc_gmac_intr+0x194                 
[ 1553.4798668] 0x9cf6b93c: netbsd:gic_fdt_intr+0x2c            
[ 1553.4798668] 0x9cf6b964: netbsd:pic_dispatch+0x110           
[ 1553.4798668] 0x9cf6b9c4: netbsd:armgic_irq_handler+0xf4           
[ 1553.4798668] 0x9cf6ba44: netbsd:irq_entry+0x68               
[ 1553.4798668] 0x9cf6bacc: netbsd:rw_enter+0x44c                    
[ 1553.4798668] 0x9cf6bc1c: netbsd:uvm_fault_internal+0x124          
[ 1553.4798668] 0x9cf6bca4: netbsd:data_abort_handler+0x1b0
[ 1553.4798668] cpu0: End traceback...

db{0}> mach cpu 1                                                    
kdb_trap: switching to cpu1                                            
Stopped in pid 27357.1 (gcc) at netbsd:nullop:  mov     r0, #0          ; #0x0
db{1}> tr                                                            
0x9e365ba4: netbsd:_kernel_lock+0xc                                  
0x9e365be4: netbsd:pmap_extract_coherency+0x200                      
0x9e365c4c: netbsd:uvm_km_kmem_alloc+0x110                           
0x9e365c64: netbsd:pool_page_alloc+0x3c                              
0x9e365cbc: netbsd:pool_grow+0x80                                    
0x9e365cd4: netbsd:pool_catchup+0x30                            
0x9e365d2c: netbsd:pool_get+0x620                                      
0x9e365d7c: netbsd:pool_cache_get_slow+0x1f4                    
0x9e365dcc: netbsd:pool_cache_get_paddr+0x288                   
0x9e365dec: netbsd:m_clget+0x34                                      
0x9e365e84: netbsd:sosend+0x38c                                 
0x9e365eac: netbsd:soo_write+0x3c                                    
0x9e365f04: netbsd:dofilewrite+0x7c                                  
0x9e365f34: netbsd:sys_write+0x5c                                    
0x9e365fac: netbsd:syscall+0x12c                                     

So here's our deadlock: cpu 0 holds the kernel lock and wants the pool spin
mutex; cpu 1 holds the spin mutex and wants the kernel lock.
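
To make the cycle explicit, here is a minimal sketch in plain userland C. It
is not NetBSD code: the names (lock_id, cpu_state, lock_owner, is_deadlocked)
are hypothetical and exist only for illustration, and the model assumes each
CPU holds at most one lock and waits for at most one. It follows the chain
"the lock I want is held by CPU X, which in turn wants ..." and reports a
deadlock when the chain comes back to the starting CPU.

```c
/* Illustrative model only; none of these names exist in the NetBSD kernel. */
#include <assert.h>
#include <stddef.h>

enum lock_id { NO_LOCK, KERNEL_LOCK, POOL_MUTEX };

#define NO_CPU (-1)

struct cpu_state {
	enum lock_id holds;	/* lock this CPU currently owns */
	enum lock_id wants;	/* lock this CPU is spinning on */
};

/* Return the index of the CPU that owns `lock`, or NO_CPU if none does. */
int
lock_owner(const struct cpu_state *cpus, size_t ncpu, enum lock_id lock)
{
	if (lock == NO_LOCK)
		return NO_CPU;
	for (size_t i = 0; i < ncpu; i++) {
		if (cpus[i].holds == lock)
			return (int)i;
	}
	return NO_CPU;
}

/*
 * Follow the wait-for chain starting at CPU `start`.
 * Return 1 if the chain loops back to `start` (deadlock), 0 otherwise.
 * The walk is bounded by ncpu steps so it always terminates.
 */
int
is_deadlocked(const struct cpu_state *cpus, size_t ncpu, size_t start)
{
	int cur = (int)start;

	for (size_t steps = 0; steps < ncpu; steps++) {
		cur = lock_owner(cpus, ncpu, cpus[cur].wants);
		if (cur == NO_CPU)
			return 0;	/* wanted lock is free: progress is possible */
		if ((size_t)cur == start)
			return 1;	/* chain came back to us: cycle */
	}
	return 0;
}
```

Filling in the state from the two traces (cpu 0: holds KERNEL_LOCK, wants
POOL_MUTEX; cpu 1: holds POOL_MUTEX, wants KERNEL_LOCK) makes is_deadlocked()
return 1 for either CPU. If cpu 1 wanted nothing instead, the chain from
cpu 0 would still end at a held lock, but no cycle would form and cpu 1
could eventually release the mutex.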

-- 
Manuel Bouyer <bouyer%antioche.eu.org@localhost>
     NetBSD: 26 years of experience will always make the difference
--

Follow-Ups:
- re: evbarm hang
  - From: matthew green

References:
- evbarm hang
  - From: Manuel Bouyer
- Re: evbarm hang
  - From: Manuel Bouyer
- Re: evbarm hang
  - From: Manuel Bouyer
