Port-macppc archive

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index][Old Index]

Re: dual CPU system hangs



On 07/19/13 19:27, Christos Zoulas wrote:
In article <51E9498A.9070305%groessler.org@localhost>,
Christian Groessler  <chris%groessler.org@localhost> wrote:
On 07/17/13 17:08, Christos Zoulas wrote:
In article <51E66514.5080600%groessler.org@localhost>,
Christian Groessler  <chris%groessler.org@localhost> wrote:
-=-=-=-=-=-

Hi,

I have a dual CPU PowerMac, and when building the tree with -j 3 the
system hangs after some time (1 or 2 hours).

It looks like it cannot create new processes. When I try to run a
program on an already open shell, it simply hangs,
^C doesn't get me back to the prompt.

Is there a way to enter DDB with an USB keyboard? In the FAQ I only
found a way with an ADB keyboard (Command-Option-Power)?
Or how can I try to debug this?

I'm running -current from three days ago.
Leave a loop running vmstat and netstat -m every second on a window
and see what happens when it hangs.

Ok, I started two shells, one with "vmstat -w 1" and one with "while
true; do netstat -m; sleep 1; done"

The build ran through successfully twice.

Then I tried again, without vmstat and netstat, and now it hangs again.

I could still start a few processes, and I have
Looks like you are running out of memory. Can you print stack traces
from ddb of the two processes that are stuck in xchicv (cross-call).


Ok. It took me some time to buy a PCI serial card and hack the kernel
to use it as serial console...

But now I'm back on topic.


I've got a similar hang (-current from today), but now the processes are stuck in
biowait and biolock.


Here's a 'ps' output while ps was still working:

-------
0 600 1 0 85 0 4220 1516 nanoslp Ss ? 0:00.03 /usr/sbin/cron 12 606 571 0 85 0 11520 3892 kqueue I ? 0:00.04 qmgr -l -t unix -u 0 612 515 0 85 0 5280 1968 select I ? 0:00.04 rshd -L 12 27796 571 0 85 0 11520 3892 kqueue I ? 0:00.03 pickup -l -t fifo -u 1000 38 627 0 85 0 4832 1868 pause S+ ttyp0 0:00.05 screen (screen-4.0.3) 1000 627 524 51 85 0 3664 2788 wait Is ttyp0 0:00.04 /usr/pkg/bin/bash -login 1000 337 377 0 85 0 3680 2884 wait Is ttyp1 0:00.03 /usr/pkg/bin/bash 1000 506 337 5738 84 0 4448 1812 wait I+ ttyp1 0:00.09 sh ./build.sh -D /local/netbsd-src/dest -O /local/netbsd-src/obj -j 4 -R /local/
1000  1459 29549 32151  78  0  4444  1524 wait    I+   ttyp1 0:00.01 sh
1000 1760 2300 0 85 0 4020 2156 select S+ ttyp1 0:07.28 /usr/local/tools/bin/nbmake _THISDIR_ distribution
1000  2300  2984  6649  84  0  4444  1512 wait    I+   ttyp1 0:00.00 sh
1000  2514 22619 26106  79  0  4444  1464 wait    I+   ttyp1 0:00.01 sh
1000  2811 22313 29452  78  0  4444   796 wait    I+   ttyp1 0:00.00 sh
1000 2984 506 0 85 0 4020 2140 select S+ ttyp1 0:11.97 /usr/local/tools/bin/nbmake -j 4 release 1000 3049 3541 0 85 0 4020 2160 select S+ ttyp1 0:07.62 /usr/local/tools/bin/nbmake _THISDIR_ NOPOSTINSTALL build
1000  3111  3049 29047  78  0  4444  1516 wait    I+   ttyp1 0:00.01 sh
1000  3541  1760  7105  84  0  4444  1512 wait    I+   ttyp1 0:00.00 sh
1000 3736 20169 32950 77 0 4020 1476 wait I+ ttyp1 0:00.01 /usr/local/tools/bin/powerpc--netbsd-gcc -O2 -ffreestanding -fno-strict-aliasing
1000  5465 13084 32861  77  0  4444  1516 wait    I+   ttyp1 0:00.01 sh
1000 7788 27873 32278 78 0 4020 1480 wait I+ ttyp1 0:00.01 /usr/local/tools/bin/powerpc--netbsd-gcc -O2 -ffreestanding -fno-strict-aliasing 1000 7875 22608 32950 77 0 4020 1448 biowait D+ ttyp1 0:00.01 /usr/local/tools/bin/powerpc--netbsd-gcc -O2 -ffreestanding -fno-strict-aliasing 1000 9625 3736 32936 77 0 32528 26860 biolock D+ ttyp1 0:04.21 /local/netbsd-src/tools/bin/../libexec/gcc/powerpc--netbsd/4.5.4/cc1 -quiet -nos 1000 10253 1459 32861 77 0 6068 3284 wait I+ ttyp1 0:00.11 /usr/local/tools/bin/nbmake _THISDIR_ dependall-../sys/rump/dev/lib dependall-.. 1000 11506 2811 30367 78 0 4412 1536 wait I+ ttyp1 0:00.01 /bin/sh - /usr/local/tools/bin/nblorder cd9660_bmap.po cd9660_node.po cd9660_uti
1000 12528 13084 32982  77  0  4444  1516 wait    I+   ttyp1 0:00.01 sh
1000 12989 13084 32861  77  0  4444  1516 wait    I+   ttyp1 0:00.01 sh
1000 13084 27452 0 85 0 4020 2156 select S+ ttyp1 0:00.07 /usr/local/tools/bin/nbmake _THISDIR_ dependall 1000 13327 2514 0 85 0 5044 3100 select S+ ttyp1 0:00.14 /usr/local/tools/bin/nbmake realall 1000 13893 29646 0 85 0 5044 2872 select S+ ttyp1 0:00.12 /usr/local/tools/bin/nbmake realall 1000 14250 7788 32936 77 0 21980 16760 biolock D+ ttyp1 0:00.95 /local/netbsd-src/tools/bin/../libexec/gcc/powerpc--netbsd/4.5.4/cc1 -quiet -nos
1000 15357 29634 29293  78  0  4444  1464 wait    I+   ttyp1 0:00.00 sh
1000 17094 13084 32861  77  0  4444  1516 wait    I+   ttyp1 0:00.01 sh
1000 17296 12528 0 85 0 4020 2160 select S+ ttyp1 0:00.11 /usr/local/tools/bin/nbmake _THISDIR_ dependall 1000 18673 29099 0 85 0 5044 3308 select S+ ttyp1 0:00.15 /usr/local/tools/bin/nbmake realall
1000 20169 18673 32950  77  0  4444  1472 wait    I+   ttyp1 0:00.01 sh
1000 20197 11506 30367 78 0 4196 1268 biowait D+ ttyp1 0:00.00 /usr/bin/sort -k2 /tmp/_reference_.18054a -o /tmp/_reference_.18054a
1000 22313 13893 29452  78  0  4444  1500 pipe_rd I+   ttyp1 0:00.02 sh
1000 22608 13327 32950  77  0  4444  1472 wait    I+   ttyp1 0:00.01 sh
1000 22619 12989 0 85 0 4020 2076 select S+ ttyp1 0:00.08 /usr/local/tools/bin/nbmake _THISDIR_ dependall
1000 23158 29622 30859  78  0  4444  1512 wait    I+   ttyp1 0:00.00 sh
1000 27452 10253 32861 77 0 4408 1476 wait I+ ttyp1 0:00.01 /bin/sh -c _makedirtarget() { dir="$1"; shift; target="$1"; shift; case "${di 1000 27797 17094 0 85 0 4020 2040 select S+ ttyp1 0:00.06 /usr/local/tools/bin/nbmake _THISDIR_ dependall
1000 27873 29962 32278  78  0  4444  1472 wait    I+   ttyp1 0:00.01 sh
1000 28054 2811 30367 78 0 4228 1248 pipe_rd I+ ttyp1 0:00.00 /usr/local/tools/bin/nbtsort -q
1000 29099 17296 30367  78  0  4444  1464 wait    I+   ttyp1 0:00.01 sh
1000 29549 23158 0 85 0 6068 3324 select S+ ttyp1 0:06.10 /usr/local/tools/bin/nbmake _THISDIR_ build_install 1000 29622 3111 0 85 0 4020 2152 select S+ ttyp1 0:05.23 /usr/local/tools/bin/nbmake _THISDIR_ do-lib 1000 29634 5465 0 85 0 4020 2048 select S+ ttyp1 0:00.06 /usr/local/tools/bin/nbmake _THISDIR_ dependall
1000 29646 27797 30206  78  0  4444  1464 wait    I+   ttyp1 0:00.00 sh
1000 29962 15357 0 85 0 5044 2764 select S+ ttyp1 0:00.12 /usr/local/tools/bin/nbmake realall 1000 458 377 0 85 0 3680 2884 wait Ss ttyp2 0:00.19 /usr/pkg/bin/bash
1000 27390   458     0  43  0  4408  1328 -       O+   ttyp2 0:00.01 ps -axl
1000 2903 377 806 85 0 3672 2836 ttyraw Is+ ttyp3 0:00.02 /usr/pkg/bin/bash 0 562 1 3210 85 0 4320 1384 ttyraw Is+ ttyE0 0:00.01 /usr/libexec/getty std.9600 ttyE0 0 619 1 3210 85 0 4320 1384 ttyraw Is+ ttyE1 0:00.01 /usr/libexec/getty std.9600 ttyE1 0 604 1 3460 85 0 4320 1384 ttyraw Is+ ttyE2 0:00.01 /usr/libexec/getty std.9600 ttyE2 0 558 1 3210 85 0 4320 1384 ttyraw Is+ ttyE3 0:00.01 /usr/libexec/getty std.9600 ttyE3
-------



ddb's 'ps/n' output (note that I've started a 'ls -l /usr/' command between the above output and this one):


-------
db{0}> ps/n
PID      PPID     PGRP        UID S   FLAGS LWPS          COMMAND WAIT
24779   24779      571          0 2   0x100    1           master tstile
9732     9732     9732       1000 2       0    1             bash tstile
9507     9507     9507          0 2   0x100    1             cron tstile
8500     8500      600          0 2       0    1             cron wait
17947   17947    17947       1000 2  0x4000    1               ls biowait
20197   20197      506       1000 2  0x4000    1             sort biowait
7875     7875      506       1000 2  0x4000    1 powerpc--netbsd- biowait
28054   28054      506       1000 2  0x4000    1          nbtsort pipe_rd
11506   11506      506       1000 2  0x4000    1               sh wait
2811     2811      506       1000 2       0    1               sh wait
22608   22608      506       1000 2  0x4000    1               sh wait
14250   14250      506       1000 2  0x4000    1              cc1 biolock
7788     7788      506       1000 2  0x4000    1 powerpc--netbsd- wait
27873   27873      506       1000 2  0x4000    1               sh wait
22313   22313      506       1000 2  0x4000    1               sh pipe_rd
9625     9625      506       1000 2  0x4000    1              cc1 biolock
3736     3736      506       1000 2  0x4000    1 powerpc--netbsd- wait
20169   20169      506       1000 2  0x4000    1               sh wait
18673   18673      506       1000 2  0x4000    1           nbmake select
29099   29099      506       1000 2  0x4000    1               sh wait
17296   17296      506       1000 2  0x4000    1           nbmake select
12528   12528      506       1000 2  0x4000    1               sh wait
13327   13327      506       1000 2  0x4000    1           nbmake select
2514     2514      506       1000 2  0x4000    1               sh wait
13893   13893      506       1000 2  0x4000    1           nbmake select
29962   29962      506       1000 2  0x4000    1           nbmake select
29646   29646      506       1000 2  0x4000    1               sh wait
15357   15357      506       1000 2  0x4000    1               sh wait
22619   22619      506       1000 2  0x4000    1           nbmake select
12989   12989      506       1000 2  0x4000    1               sh wait
29634   29634      506       1000 2  0x4000    1           nbmake select
5465     5465      506       1000 2  0x4000    1               sh wait
27797   27797      506       1000 2  0x4000    1           nbmake select
17094   17094      506       1000 2  0x4000    1               sh wait
13084   13084      506       1000 2  0x4000    1           nbmake select
27452   27452      506       1000 2  0x4000    1               sh wait
10253   10253      506       1000 2  0x4000    1           nbmake wait
1459     1459      506       1000 2  0x4000    1               sh wait
29549   29549      506       1000 2  0x4000    1           nbmake select
23158   23158      506       1000 2  0x4000    1               sh wait
29622   29622      506       1000 2  0x4000    1           nbmake select
2903     2903     2903       1000 2  0x4000    1             bash wait
3111     3111      506       1000 2  0x4000    1               sh wait
3049     3049      506       1000 2  0x4000    1           nbmake select
3541     3541      506       1000 2  0x4000    1               sh wait
1760     1760      506       1000 2  0x4000    1           nbmake select
2300     2300      506       1000 2  0x4000    1               sh wait
2984     2984      506       1000 2  0x4000    1           nbmake select
506       506      506       1000 2  0x4000    1               sh wait
458       458      458       1000 2  0x4000    1             bash wait
337       337      337       1000 2  0x4000    1             bash wait
377       377      377       1000 2   0x101    1     screen-4.0.3 select
38         38       38       1000 2  0x4100    1     screen-4.0.3 pause
627       627      627       1000 2  0x4000    1             bash wait
524       524      524       1000 2  0x4000    1            xterm select
612       612      515          0 2  0x4100    1             rshd select
596       596        1          0 2  0x4000    1            getty ttyraw
558       558      558          0 2  0x4000    1            getty ttyraw
604       604      604          0 2  0x4000    1            getty ttyraw
619       619      619          0 2  0x4000    1            getty ttyraw
562       562      562          0 2  0x4000    1            getty ttyraw
600       600      600          0 2       0    1             cron tstile
606       606      571         12 2  0x4100    1             qmgr kqueue
515       515      515          0 2       0    1            inetd kqueue
571       571      571          0 2  0x4100    1           master kqueue
351       351      351          0 2       0    1             sshd select
342       342      342          0 2       0    1            timed select
335       335      335          0 2       0    1           powerd kqueue
311       311      311          0 2       0    1              lpd select
176       176      176          0 2       0    1          rpcbind select
165       165      165          0 2   0x100    1          syslogd kqueue
1           1        1          0 2  0x4001    1             init wait
0          -1        0          0 2 0x20002   48 system       *
-------


db{0}> trace/t 0t7875
trace: pid 7875 lid 1 at 0x1033bb00
0x1033bb60: at cpu_switchto+0x30
0x1033bb70: at mi_switch+0x17c
0x1033bbb0: at sleepq_block+0xb4
0x1033bbd0: at cv_wait+0xa8
0x1033bbf0: at biowait+0x38
0x1033bc00: at bwrite+0x168
0x1033bc20: at ffs_update+0x34c
0x1033bc60: at ufs_makeinode+0x200
0x1033bc90: at ufs_create+0x68
0x1033bcb0: at VOP_CREATE+0x50
0x1033bcf0: at vn_open+0x27c
0x1033bdc0: at do_open+0x100
0x1033be40: at do_sys_openat+0x8c
0x1033be80: at sys_open+0x28
0x1033beb0: at syscall+0x1fc
0x1033bf20: user SC trap #5 by 0xfde37358: srr1=0xd032
            r1=0xffffbe70 cr=0x28002048 xer=0x20000000 ctr=0xfde37350
db{0}>


db{0}> trace/t 0t9625
trace: pid 9625 lid 1 at 0x102e7b40
0x102e7ba0: at cpu_switchto+0x30
0x102e7bb0: at mi_switch+0x17c
0x102e7bf0: at sleepq_block+0xb4
0x102e7c10: at cv_timedwait+0xb0
0x102e7c30: at bbusy+0x148
0x102e7c60: at getblk+0x88
0x102e7c90: at bio_doread.clone.1+0x30
0x102e7cb0: at bread+0x2c
0x102e7cd0: at ffs_update+0x1e4
0x102e7d10: at ufs_inactive+0x2c0
0x102e7d60: at VOP_INACTIVE+0x48
0x102e7d90: at vrelel+0x278
0x102e7de0: at vn_close+0x70
0x102e7e00: at closef+0x78
0x102e7e40: at fd_close+0x114
0x102e7ea0: at sys_close+0x2c
0x102e7eb0: at syscall+0x1fc
0x102e7f20: user SC trap #6 by 0xfddd7468: srr1=0xd032
            r1=0xffffc7a0 cr=0x84000044 xer=0x20000000 ctr=0xfddd7460
db{0}>



db{0}> trace/t 0t14250
trace: pid 14250 lid 1 at 0x10337b40
0x10337ba0: at cpu_switchto+0x30
0x10337bb0: at mi_switch+0x17c
0x10337bf0: at sleepq_block+0xb4
0x10337c10: at cv_timedwait+0xb0
0x10337c30: at bbusy+0x148
0x10337c60: at getblk+0x88
0x10337c90: at bio_doread.clone.1+0x30
0x10337cb0: at bread+0x2c
0x10337cd0: at ffs_update+0x1e4
0x10337d10: at ufs_inactive+0x2c0
0x10337d60: at VOP_INACTIVE+0x48
0x10337d90: at vrelel+0x278
0x10337de0: at vn_close+0x70
0x10337e00: at closef+0x78
0x10337e40: at fd_close+0x114
0x10337ea0: at sys_close+0x2c
0x10337eb0: at syscall+0x1fc
0x10337f20: user SC trap #6 by 0xfddd7468: srr1=0xd032
            r1=0xffffc870 cr=0x84000044 xer=0x20000000 ctr=0xfddd7460
db{0}>




db{0}> trace/t 0t20197
trace: pid 20197 lid 1 at 0x10263a70
0x10263ad0: at cpu_switchto+0x30
0x10263ae0: at mi_switch+0x17c
0x10263b20: at sleepq_block+0xb4
0x10263b40: at cv_wait+0xa8
0x10263b60: at biowait+0x38
0x10263b70: at bwrite+0x168
0x10263b90: at ffs_update+0x34c
0x10263bd0: at ffs_truncate+0xf78
0x10263d20: at ufs_inactive+0x30c
0x10263d70: at VOP_INACTIVE+0x48
0x10263da0: at vrelel+0x278
0x10263df0: at ufs_remove+0xc8
0x10263e10: at VOP_REMOVE+0x4c
0x10263e40: at do_sys_unlinkat.clone.1+0xe8
0x10263eb0: at syscall+0x1fc
0x10263f20: user SC trap #10 by 0xfdefcbc4: srr1=0xd032
            r1=0xffffd190 cr=0x28022022 xer=0x20000000 ctr=0xfdefcbbc
db{0}>




db{0}> trace/t 0t17947
trace: pid 17947 lid 1 at 0x103a3940
0x103a39a0: at cpu_switchto+0x30
0x103a39b0: at mi_switch+0x17c
0x103a39f0: at sleepq_block+0xb4
0x103a3a10: at cv_wait+0xa8
0x103a3a30: at biowait+0x38
0x103a3a40: at breadn+0xb4
0x103a3a70: at ufs_blkatoff+0x1f4
0x103a3ae0: at ufs_lookup+0x2f4
0x103a3b90: at VOP_LOOKUP+0x4c
0x103a3bc0: at lookup_once+0x18c
0x103a3c10: at namei_tryemulroot+0x438
0x103a3cc0: at namei+0x48
0x103a3cf0: at vn_open+0xac
0x103a3dc0: at do_open+0x100
0x103a3e40: at do_sys_openat+0x8c
0x103a3e80: at sys_open+0x28
0x103a3eb0: at syscall+0x1fc
0x103a3f20: user SC trap #5 by 0xfde07358: srr1=0xd032
            r1=0xffffd650 cr=0x28004048 xer=0x20000000 ctr=0xfde07350
db{0}>


The computer is now sitting at the ddb prompt. If you want me to check anything else, please tell me.

regards,
chris



Home | Main Index | Thread Index | Old Index