NetBSD-Bugs archive


Re: kern/57621: Updated 10.0_BETA macppc MP kernel prone to hangs



The following reply was made to PR kern/57621; it has been noted by GNATS.

From: Havard Eidnes <he%uninett.no@localhost>
To: rokuyama.rk%gmail.com@localhost
Cc: gnats-bugs%netbsd.org@localhost, kern-bug-people%netbsd.org@localhost,
 gnats-admin%netbsd.org@localhost, netbsd-bugs%netbsd.org@localhost
Subject: Re: kern/57621: Updated 10.0_BETA macppc MP kernel prone to hangs
Date: Thu, 21 Sep 2023 08:26:52 +0200 (CEST)

 >> Recently, I frequently observe similar stalls on Mac mini G4 (UP).
 >>
 >> I can work around the problem by this patch:
 > 
 > Excellent!
 > I will apply this to my local tree and check the fix from this
 > coming evening my time.
 
 Unfortunately, the patch was not a complete success: the build
 got stuck overnight, at
 
       compile  libcrypto/x25519_ref10.o
       build  libcrypto/librumpkern_crypto.so.0.0
       build  libcrypto/librumpkern_crypto.a
 
 I now have an 'ld' process stuck in 'tstile' (probably the one
 building the .so.0.0), and the 'vdrain' kernel thread is stuck in
 'tstile' as well.  Output from 'crash':
 
 crash> ps
 PID    LID S CPU     FLAGS       STRUCT LWP *               NAME WAIT
 12110 12110 3   1       180           1a41ea00             pickup kqueue
 4483  4483 3   0         0           102ee400                 ld tstile
 26251 26251 3   1       180           15d106c0           collect2 wait
 23329 23329 3   1       180           13cff100   powerpc--netbsd- wait
 24864 24864 3   0       180           181c3400                 sh wait
 21433 21433 3   0       180           1a41e100             nbmake poll
 14503 14503 3   0       180           19661a00                 sh wait
 4702  4702 3   0       180           181c3a00             nbmake poll
 5602  5602 3   0       180           19661700                 sh wait
 25075 25075 3   1       180           19661100             nbmake poll
 9198  9198 3   0       180           22089340             nbmake wait
 8760  8760 3   1       180           13cffd00                 sh wait
 22443 22443 3   0       180           21f5e3c0             nbmake poll
 27361 27361 3   0       180           1ae133c0                 sh wait
 6730  6730 3   0       180           1e36a640             nbmake poll
 1460  1460 3   0       180           1073f340             systat ttyraw
 897  > 897 7   0       100           1a41ed00              crash
 10856 10856 3   1       180           19996640               tcsh pause
 9905  9905 3   1       180           181c3d00               tcsh pause
 10297>10297 7   1       100           19661d00               sshd
 9285  9285 3   1       180           19996040               sshd poll
 6102  6102 3   1       180           1a95f0c0                top select
 1826  1826 3   1       180           1e36ac40                 sh wait
 2345  2345 3   0       180           1df8dc80             nbmake poll
 2611  2611 3   1       180           1df8d080                 sh wait
 2335  2335 3   0       180           1df8d980             nbmake poll
 2399  2399 3   1       180           1e36a940                 sh wait
 2683  2683 3   1       180           1df8d680             nbmake poll
 1074  1074 3   0       180           12175d00                 sh wait
 1072  1072 3   1       180           181c3100               tail kqueue
 361    361 3   1       180           10d63cc0               tcsh pause
 1108  1108 3   1       180           12175700               tcsh pause
 1092  1092 3   1       180           1073f940               tcsh pause
 1089  1089 3   0       180           10d87680               tcsh pause
 1151  1151 3   0       180           15d10cc0               tcsh pause
 850    850 3   0       180           12175400              xterm select
 959    959 3   1       180           1073f040               tcsh pause
 962    962 3   0       180           15d109c0               sshd poll
 952    952 3   0       180           12175100              xterm select
 942    942 3   0       180           102eea00              xterm select
 949    949 3   0       180           10d639c0               sshd poll
 1251  1251 3   0       180           102ee700               tcsh pause
 943    943 3   1       180           10d87c80               sshd poll
 704    704 3   1       180           1073fc40               sshd poll
 928    928 3   0       180           10d87980               tcsh pause
 1122  1122 3   0       180           102eed00               sshd poll
 983    983 3   1       180           15d103c0              getty ttyraw
 981    981 3   1       180           15d100c0              getty nanoslp
 813    813 3   0       180           13cd7c80              getty nanoslp
 986    986 3   1       180           13cd7980              getty nanoslp
 985    985 3   0       180           13cd7680              getty ttyraw
 998    998 3   1       180           12b77640               cron nanoslp
 974    974 3   1       180           12b77c40               sshd poll
 971    971 3   0       180           10d630c0              inetd kqueue
 957    957 3   1       180           10d87380               qmgr kqueue
 828    828 3   0       180           12175a00             master kqueue
 723    723 3   1       180           10d636c0               sshd poll
 604    604 3   0       180           12b77940             powerd kqueue
 567    567 3   0       180           12b77340               ntpd pause
 318    318 3   0       180           10d87080            syslogd kqueue
 1        1 3   0       180           102af0c0               init wait
 0      198 3   1       200           1022f100            physiod physiod
 0      166 3   1       200           102ee100          pooldrain pooldrain
 0      165 3   1       200           102afcc0            ioflush syncer
 0      164 3   0       200           102af9c0           pgdaemon pgdaemon
 0      126 3   0       200           7f87cc80          swwreboot swwreboot
 0      124 3   1       200           102af6c0          atapibus0 sccomp
 0      122 3   0       200           102196c0               usb0 usbevt
 0      121 3   0       200           102af3c0               usb1 usbevt
 0      120 3   0       200           1022f400             npfgc0 npfgcw
 0      119 3   1       200           1028cc80            rt_free rt_free
 0      118 3   1       200           1028c980              unpgc unpgc
 0      117 3   0       200           1028c680    key_timehandler key_timehandler
 0      116 3   1       200           1028c380    icmp6_wqinput/1 icmp6_wqinput
 0      115 3   0       200           1028c080    icmp6_wqinput/0 icmp6_wqinput
 0      114 3   1       200           1027ec40          nd6_timer nd6_timer
 0      113 3   1       200           1027e940    carp6_wqinput/1 carp6_wqinput
 0      112 3   0       200           1027e640    carp6_wqinput/0 carp6_wqinput
 0      111 3   1       200           1027e340     carp_wqinput/1 carp_wqinput
 0      110 3   0       200           1027e040     carp_wqinput/0 carp_wqinput
 0      109 3   1       200           1022fd00     icmp_wqinput/1 icmp_wqinput
 0      108 3   0       200           102199c0     icmp_wqinput/0 icmp_wqinput
 0      107 3   0       200           1022fa00           rt_timer rt_timer
 0      106 3   0       200           10219cc0        vmem_rehash vmem_rehash
 0       97 3   0       200           102193c0            lmtemp0 lmtemp0
 0       96 3   1       200           102190c0            dbcool0 dbcool0
 0       30 3   0       200           7f87c980          entbutler entropy
 0       29 3   1       380           7f87c680           fw0probe ieee1394
 0       28 3   1       240           7f87c380            atabus2 atath
 0       27 3   0       200           7f87c080         usbtask-dr usbtsk
 0       26 3   0       200           7f901c40         usbtask-hc usbtsk
 0       25 3   1       240           7f901940            atabus1 atath
 0       24 3   1       200           7f901640            atabus0 atath
 0       23 3   0       200           7f901340                pmu wait
 0       22 3   1       200           7f901040            xcall/1 xcall
 0       21 1   1       200           7f90ed00          softser/1
 0       20 1   1     40200           7f90ea00          softclk/1
 0       19 1   1       200           7f90e700          softbio/1
 0       18 1   1       200           7f90e400          softnet/1
 0       17 1   1       201           7f90e100             idle/1
 0       16 3   0       200           7f91acc0             sysmon smtaskq
 0       15 3   0       200           7f91a9c0         pmfsuspend pmfsuspend
 0       14 3   0       200           7f91a6c0           pmfevent pmfevent
 0       13 3   0       200           7f91a3c0         sopendfree sopendfr
 0       12 3   0       200           7f91a0c0             ifwdog ifwdog
 0       11 3   0       200           7fb26c80            iflnkst iflnkst
 0       10 3   0       200           7fb26980           nfssilly nfssilly
 0        9 3   1       240           7fb26680             vdrain tstile
 0        8 3   1       200           7fb26380          modunload mod_unld
 0        7 3   0       200           7fb26080            xcall/0 xcall
 0        6 1   0       200           7fb30c40          softser/0
 0        5 1   0     40200           7fb30940          softclk/0
 0        4 1   0       200           7fb30640          softbio/0
 0        3 1   0     40200           7fb30340          softnet/0
 0        2 1   0       201           7fb30040             idle/0
 0        0 3   0       200             c32dc0            swapper uvm
 crash> bt/a 102ee400
 trace: pid 4483 lid 4483 at 0x1b623930
 0x1b623990: at cpu_switchto+0x28
 0x1b6239a0: at mi_switch+0x140
 0x1b6239e0: at sleepq_block+0xe0
 0x1b623a00: at turnstile_block+0x284
 0x1b623a60: at rw_enter+0x124
 0x1b623aa0: at cache_lookup+0xe8
 0x1b623ad0: at ufs_lookup+0xd4
 0x1b623b80: at VOP_LOOKUP+0x4c
 0x1b623ba0: at lookup_once+0x1fc
 0x1b623bf0: at namei_tryemulroot.constprop.0+0x4a8
 0x1b623cc0: at namei+0x58
 0x1b623cf0: at vn_open+0xfc
 0x1b623e10: at do_open+0xf0
 0x1b623e60: at do_sys_openat+0x9c
 0x1b623ea0: at sys_open+0x2c
 0x1b623ec0: at syscall+0x294
 0x1b623f20: user SC trap #5 by 0xfdc70e18: srr1=0xd032
             r1=0xffffda20 cr=0x84000444 xer=0x20000000 ctr=0xfdc70e10
 crash> 
 crash> bt/a 7fb26680
 trace: pid 0 lid 9 at 0x1001fc60
 0x1001fcc0: at cpu_switchto+0x28
 0x1001fcd0: at mi_switch+0x140
 0x1001fd10: at sleepq_block+0xe0
 0x1001fd30: at turnstile_block+0x284
 0x1001fd90: at rw_enter+0x124
 0x1001fdd0: at cache_purge1+0x198
 0x1001fe00: at vcache_reclaim+0xc8
 0x1001fe80: at vrecycle.part.0+0xc0
 0x1001feb0: at vdrain_thread+0x384
 0x1001ff20: at cpu_lwp_bootstrap+0xc
 saved LR(0x55aa55a6) is invalid.
 crash> 
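
The two 'tstile' traces above can be lined up mechanically to see where they diverge. What follows is only a minimal sketch in Python, assuming the 'bt/a' output has been saved as text; the helper names frames() and common_prefix() are hypothetical, and the sample traces are abbreviated copies of the first frames shown above.

```python
import re

# Matches frame lines such as "0x1b623a60: at rw_enter+0x124".
# Lines like "trace: pid ..." or "user SC trap ..." do not match.
FRAME_RE = re.compile(r"^\s*0x[0-9a-f]+: at (\S+)\+0x[0-9a-f]+\s*$")


def frames(trace_text):
    """Return the function names from a 'bt/a' trace, innermost first."""
    names = []
    for line in trace_text.splitlines():
        m = FRAME_RE.match(line)
        if m:
            names.append(m.group(1))
    return names


def common_prefix(a, b):
    """Return the leading (innermost) frames that two traces share."""
    shared = []
    for x, y in zip(a, b):
        if x != y:
            break
        shared.append(x)
    return shared


# Abbreviated copies of the traces for 'ld' (102ee400) and 'vdrain' (7fb26680).
LD_TRACE = """\
trace: pid 4483 lid 4483 at 0x1b623930
0x1b623990: at cpu_switchto+0x28
0x1b6239a0: at mi_switch+0x140
0x1b6239e0: at sleepq_block+0xe0
0x1b623a00: at turnstile_block+0x284
0x1b623a60: at rw_enter+0x124
0x1b623aa0: at cache_lookup+0xe8
0x1b623ad0: at ufs_lookup+0xd4
"""

VDRAIN_TRACE = """\
trace: pid 0 lid 9 at 0x1001fc60
0x1001fcc0: at cpu_switchto+0x28
0x1001fcd0: at mi_switch+0x140
0x1001fd10: at sleepq_block+0xe0
0x1001fd30: at turnstile_block+0x284
0x1001fd90: at rw_enter+0x124
0x1001fdd0: at cache_purge1+0x198
0x1001fe00: at vcache_reclaim+0xc8
"""

ld = frames(LD_TRACE)
vd = frames(VDRAIN_TRACE)
shared = common_prefix(ld, vd)

print("shared frames:     ", " <- ".join(shared))
print("ld diverges at:    ", ld[len(shared)])
print("vdrain diverges at:", vd[len(shared)])
```

On these two traces the shared frames run from cpu_switchto through rw_enter; 'ld' arrives there from cache_lookup and 'vdrain' from cache_purge1. The sketch does not show whether the two are waiting on the same lock.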
 
 Backtraces for some of the threads which are not waiting (the
 softint and idle threads):
 
 crash> bt/a 7f90ed00
 trace: pid 0 lid 21 at 0x1004fe40
 0x1004fea0: at cpu_switchto+0x28
 0x1004feb0: at mi_switch+0x140
 0x1004fef0: at softint_thread+0x190
 0x1004ff20: at cpu_lwp_bootstrap+0xc
 saved LR(0xaa55aa51) is invalid.
 crash> bt/a 7f90ea00
 trace: pid 0 lid 20 at 0x1004be40
 0x1004bea0: at cpu_switchto+0x28
 0x1004beb0: at mi_switch+0x140
 0x1004bef0: at softint_thread+0x190
 0x1004bf20: at cpu_lwp_bootstrap+0xc
 saved LR(0xaa55aa51) is invalid.
 crash> bt/a 7f90e700
 trace: pid 0 lid 19 at 0x10047e40
 0x10047ea0: at cpu_switchto+0x28
 0x10047eb0: at mi_switch+0x140
 0x10047ef0: at softint_thread+0x190
 0x10047f20: at cpu_lwp_bootstrap+0xc
 saved LR(0xaa55aa51) is invalid.
 crash> bt/a 7f90e400
 trace: pid 0 lid 18 at 0x10043e40
 0x10043ea0: at cpu_switchto+0x28
 0x10043eb0: at mi_switch+0x140
 0x10043ef0: at softint_thread+0x190
 0x10043f20: at cpu_lwp_bootstrap+0xc
 saved LR(0xaa55aa51) is invalid.
 crash> bt/a 7f90e100
 trace: pid 0 lid 17 at 0x1003fdc0
 0x1003fe20: at cpu_switchto+0x28
 0x1003fe30: at mi_switch+0x140
 0x1003fe70: at idle_loop+0x10c
 0x1003feb0: at cpu_spinup_trampoline+0x3c
 saved LR(0x902e) is invalid.
 crash> bt/a 7fb30c40
 trace: pid 0 lid 6 at 0x10013e40
 0x10013ea0: at cpu_switchto+0x28
 0x10013eb0: at mi_switch+0x140
 0x10013ef0: at softint_thread+0x190
 0x10013f20: at cpu_lwp_bootstrap+0xc
 saved LR(0x55aa55a6) is invalid.
 crash> bt/a 7fb30940
 trace: pid 0 lid 5 at 0x1000fe40
 0x1000fea0: at cpu_switchto+0x28
 0x1000feb0: at mi_switch+0x140
 0x1000fef0: at softint_thread+0x190
 0x1000ff20: at cpu_lwp_bootstrap+0xc
 saved LR(0x55aa55a6) is invalid.
 crash> bt/a 7fb30640
 trace: pid 0 lid 4 at 0x1000be40
 0x1000bea0: at cpu_switchto+0x28
 0x1000beb0: at mi_switch+0x140
 0x1000bef0: at softint_thread+0x190
 0x1000bf20: at cpu_lwp_bootstrap+0xc
 saved LR(0x55aa55a6) is invalid.
 crash> bt/a 7fb30340
 trace: pid 0 lid 3 at 0x10007e40
 0x10007ea0: at cpu_switchto+0x28
 0x10007eb0: at mi_switch+0x140
 0x10007ef0: at softint_thread+0x190
 0x10007f20: at cpu_lwp_bootstrap+0xc
 saved LR(0x55aa55a6) is invalid.
 crash> bt/a 7fb30040
 trace: pid 0 lid 2 at 0x10003e30
 0x10003e90: at cpu_switchto+0x28
 0x10003ea0: at mi_switch+0x140
 0x10003ee0: at idle_loop+0x10c
 0x10003f20: at cpu_lwp_bootstrap+0xc
 saved LR(0x55aa55a6) is invalid.
 crash>
 
 but as near as I can tell these do not provide a 'smoking gun'.
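
The selection of LWPs looked at above (those in 'tstile', and those with nothing in the WAIT column) can also be pulled out of saved 'ps' output with a few lines of Python. This is only a sketch under assumptions: the column layout is taken to be exactly as in the listing above, process names are assumed to contain no spaces, and parse_ps() is a hypothetical helper, not part of crash(8).

```python
def parse_ps(text):
    """Parse rows of 'crash> ps' output into dicts; other lines are skipped."""
    rows = []
    for line in text.splitlines():
        # A '>' can appear directly before the LID (seen on the state-7
        # rows in the listing); treat it as a plain separator.
        fields = line.replace(">", " ").split()
        if len(fields) not in (7, 8):
            continue  # header, prompt, or blank line
        if not all(f.isdigit() for f in fields[:4]):
            continue  # PID, LID, S and CPU must be decimal numbers
        pid, lid, state, cpu, flags, lwp, name = fields[:7]
        rows.append({
            "pid": int(pid),
            "lid": int(lid),
            "state": int(state),
            "cpu": int(cpu),
            "flags": flags,
            "lwp": lwp,    # STRUCT LWP * address, usable with bt/a
            "name": name,
            "wait": fields[7] if len(fields) == 8 else "",
        })
    return rows


# A few rows copied from the listing above.
PS_SAMPLE = """\
crash> ps
PID    LID S CPU     FLAGS       STRUCT LWP *               NAME WAIT
4483  4483 3   0         0           102ee400                 ld tstile
897  > 897 7   0       100           1a41ed00              crash
10297>10297 7   1       100           19661d00               sshd
0       21 1   1       200           7f90ed00          softser/1
0        9 3   1       240           7fb26680             vdrain tstile
"""

rows = parse_ps(PS_SAMPLE)
tstile = [(r["name"], r["lwp"]) for r in rows if r["wait"] == "tstile"]
no_wait = [(r["name"], r["lwp"]) for r in rows if not r["wait"]]

for name, lwp in tstile:
    print("tstile: %-10s bt/a %s" % (name, lwp))
for name, lwp in no_wait:
    print("nowait: %-10s bt/a %s" % (name, lwp))
```

The printed 'bt/a' addresses are the same STRUCT LWP * values that were fed to 'crash' by hand above.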
 
 'top' says:
 
 load averages:  0.05,  0.01,  0.00;               up 0+09:04:53        08:23:16
 62 processes: 61 sleeping, 1 on CPU
 CPU states:  0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  100% idle
 Memory: 1124M Act, 560M Inact, 14M Wired, 23M Exec, 1600M File, 24M Free
 Swap: 2000M Total, 2000M Free / Pools: 241M Used
 
 and is still running.
 
 Regards,
 
 Havard
 

