Subject: Re: port-sparc64/29473: nfs + bus_dmamap_load_mbuf often results in a hang
To: Andrey Petrov <petrov@netbsd.org>
From: john heasley <heas@shrubbery.net>
List: netbsd-bugs
Date: 02/23/2005 11:56:30
Oops, I was in such a rush today.  I meant to mention that I'll try poking at
it some more next week.  If you can think of anything specific that I should
collect, let me know.

It is rock solid with the gem as long as I do not use NFS.  If I move it
to an hme, it works fine.

I guess that in the specific case that I reported in the PR it didn't hang
hard enough to refuse a break, but that has happened before.  Here is
one such trace/case:

hang #1
ok .trap-registers
%TL:1 %TT:3 %TPC:11a0e94 %TnPC:11a0e98
%TSTATE:4482000600  %CWP:0
   %PSTATE:6 AG:0 IE:1 PRIV:1 AM:0 PEF:0 RED:0 MM:0 TLE:0 CLE:0 MG:0 IG:0
   %ASI:82  %CCR:44  XCC:nZvc   ICC:nZvc
  
%TL:2 %TT:d8 %TPC:10090a8 %TnPC:10090ac
%TSTATE:82000501  %CWP:1
   %PSTATE:5 AG:1 IE:0 PRIV:1 AM:0 PEF:0 RED:0 MM:0 TLE:0 CLE:0 MG:0 IG:0
   %ASI:82  %CCR:0  XCC:nzvc   ICC:nzvc
  
%TL:3 %TT:68 %TPC:1005804 %TnPC:1005808
%TSTATE:11001507  %CWP:7
   %PSTATE:15 AG:1 IE:0 PRIV:1 AM:0 PEF:1 RED:0 MM:0 TLE:0 CLE:0 MG:0 IG:0
   %ASI:11  %CCR:0  XCC:nzvc   ICC:nzvc
  
%TL:4 %TT:1ff %TPC:fffffffffffffffc %TnPC:fffffffffffffffc
%TSTATE:4080407  %CWP:7
   %PSTATE:804 AG:0 IE:0 PRIV:1 AM:0 PEF:0 RED:0 MM:0 TLE:0 CLE:0 MG:0 IG:1
   %ASI:4  %CCR:0  XCC:nzvc   ICC:nzvc
  
%TL:5 %TT:1ff %TPC:fffffffffffffffc %TnPC:fffffffffffffffc
%TSTATE:5e840b2b01  %CWP:1
   %PSTATE:b2b AG:1 IE:1 PRIV:0 AM:1 PEF:0 RED:1 MM:0 TLE:1 CLE:1 MG:0 IG:1
   %ASI:84  %CCR:5e  XCC:nZvC   ICC:NZVc
  
ok .registers
        Normal          Alternate       MMU               Vector
0:                 0                0                0                0
1:               6d0                1               37          1805408
2:                 1                0          185ee10              7eb
3:                 0          e8274a0          ddc2000 ffffffffffffffff
4:                 0                2 800000007e6c9636                1
5:          6b09a000         6b09a000             40d8          3f07d80
6:           11a0f18          11a0f18         7eda7708             1000
7:  fffffffff00071cc fffffffff00071cc 800000007e6c9636                1
%PC  11a0e94 %nPC 11a0e98
%TBA 1000000 %CCR 44 XCC:nZvc   ICC:nZvc
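
For reference, the %TSTATE words in the .trap-registers output can be split into the %CCR, %ASI, %PSTATE and %CWP values printed beside them. The following is a minimal sketch, assuming the SPARC V9 TSTATE layout (CCR in bits 39:32, ASI in bits 31:24, PSTATE in bits 19:8, CWP in bits 4:0); the struct and function names are hypothetical and not taken from the NetBSD sources.

```c
#include <stdint.h>

/*
 * Fields packed into a SPARC V9 %TSTATE value.
 * The names here are illustrative only, not from any kernel header.
 */
struct tstate_fields {
	unsigned ccr;		/* bits 39:32, condition codes (icc/xcc) */
	unsigned asi;		/* bits 31:24, address space identifier */
	unsigned pstate;	/* bits 19:8, processor state */
	unsigned cwp;		/* bits 4:0, current window pointer */
};

/* Split a raw %TSTATE word into its component registers. */
static struct tstate_fields
tstate_decode(uint64_t tstate)
{
	struct tstate_fields f;

	f.ccr = (unsigned)((tstate >> 32) & 0xff);
	f.asi = (unsigned)((tstate >> 24) & 0xff);
	f.pstate = (unsigned)((tstate >> 8) & 0xfff);
	f.cwp = (unsigned)(tstate & 0x1f);
	return f;
}
```

With the %TL:1 value above, tstate_decode(0x4482000600) gives ccr 0x44, asi 0x82, pstate 0x6 and cwp 0, which agrees with the decoded lines the PROM printed.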
  
hang #2 (the one in the PR, I think)

login: kdb breakpoint at 11ab560
Stopped in pid 379.1 (nfsd) at  netbsd:cpu_Debugger+0x4:        nop
db> bt
intr_list_handler(3f07cc0, 6, e0017ed0, 8, 119fa54, 0) at netbsd:intr_list_handler+0x10
sparc_interrupt(7, e802000, f6ff128, 299db17da57a38c7, 4a79ae022228b7bc, f6ff450) at netbsd:sparc_interrupt+0x1d4
_bus_dmamap_load_mbuf(3f15c00, 444b800, 3ec1500, 401, ffffffffffffffef, f6ff5f0) at netbsd:_bus_dmamap_load_mbuf+0xa4
gem_start(4444060, 16fc, 16f8, d6, f6ff5f0, 44447c0) at netbsd:gem_start+0x84
ether_output(0, 3ec1d00, 3ee8488, 800, 3ef3428, 40) at netbsd:ether_output+0x358
ip_output(3ef2720, 4444060, 3ee8480, 3ee8488, 0, 3ec1de0) at netbsd:ip_output+0x5c8
udp_output(3ee8480, 3ee8420, c6, 10, 6, 3a) at netbsd:udp_output+0x254
udp_usrreq(3ee6d80, 9, 3ef4130, 3ef2c20, 0, ddddad0) at netbsd:udp_usrreq+0x1f0
sosend(0, 0, 0, 3ef4130, 0, 0) at netbsd:sosend+0x3c4
nfs_send(3ee6d80, 3ef2c20, 3ef4130, 0, ddddad0, 6000) at netbsd:nfs_send+0x9c
nfssvc_nfsd(0, ddddad0, ddd5800, f6ffbd0, 2, 1839ae0) at netbsd:nfssvc_nfsd+0x64c
sys_nfssvc(0, f6ffdd0, f6ffdc0, 0, f6ffdd0, 0) at netbsd:sys_nfssvc+0x310
syscall(f6ffed0, 9b, 405369e0, f6ffdd0, 405369e0, 405369e4) at netbsd:syscall+0xd4
?(4, 202d78, 18, ffffffffffffcc50, 0, 0) at 0x1008cb8

(gdb) list *(gem_start+0x84)
0x1063fb0 is in gem_start (../../../../dev/ic/gem.c:1039).
1034                     * Load the DMA map.  If this fails, the packet either
1035                     * didn't fit in the alloted number of segments, or we were
1036                     * short on resources.  In this case, we'll copy and try
1037                     * again.
1038                     */
1039                    if (bus_dmamap_load_mbuf(sc->sc_dmatag, dmamap, m0,
1040                          BUS_DMA_WRITE|BUS_DMA_NOWAIT) != 0) {
1041                            if (m0->m_pkthdr.len > MCLBYTES) {
1042                                    printf("%s: unable to allocate jumbo Tx "
1043                                        "cluster\n", sc->sc_dev.dv_xname);

(gdb) list *(_bus_dmamap_load_mbuf+0xa4)
0x11a0f18 is in _bus_dmamap_load_mbuf (../../../../arch/sparc64/sparc64/machdep.c:1127).
1122                            long incr;
1123
1124                            incr = PAGE_SIZE - (vaddr & PGOFSET);
1125                            incr = min(buflen, incr);
1126
1127                            if (pmap_extract(pmap_kernel(), vaddr, &pa) == FALSE) {
1128    #ifdef DIAGNOSTIC
1129                                    printf("_bus_dmamap_load_mbuf: pmap_extract failed %lx\n",
1130                                           vaddr);
1131    #endif
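
The listing above shows only the per-page step: incr is the distance from vaddr to the end of its page, clamped to the bytes remaining. As a rough illustration of how such a step walks a buffer, here is a minimal user-space sketch. The surrounding loop, the function name and the SKETCH_ constants are assumptions (8 KB pages, as on sparc64) and are not copied from machdep.c; the pmap_extract() lookup is only marked by a comment.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumption: 8 KB pages, as on sparc64.  Hypothetical names. */
#define SKETCH_PAGE_SIZE	8192UL
#define SKETCH_PGOFSET		(SKETCH_PAGE_SIZE - 1)

/*
 * Count how many page-bounded chunks a buffer of buflen bytes
 * starting at vaddr is split into.  Each chunk would need its own
 * virtual-to-physical lookup in the real code.
 */
static size_t
count_page_chunks(uintptr_t vaddr, size_t buflen)
{
	size_t chunks = 0;

	while (buflen > 0) {
		/* Bytes left in the page that contains vaddr. */
		size_t incr = SKETCH_PAGE_SIZE - (vaddr & SKETCH_PGOFSET);

		/* Never step past the end of the buffer. */
		if (incr > buflen)
			incr = buflen;

		/* The kernel would call pmap_extract() for vaddr here. */

		vaddr += incr;
		buflen -= incr;
		chunks++;
	}
	return chunks;
}
```

A 32-byte buffer that starts 16 bytes before a page boundary is therefore handled as two chunks, one lookup per page touched.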