NetBSD-Bugs archive


Re: kern/55272: userland watchdog process may be outstalled



The following reply was made to PR kern/55272; it has been noted by GNATS.

From: Martin Husemann <martin%duskware.de@localhost>
To: gnats-bugs%NetBSD.org@localhost
Cc: 
Subject: Re: kern/55272: userland watchdog process may be outstalled
Date: Sat, 20 Jun 2020 13:39:09 +0200

 I have now seen this on other machines as well, even SMP ones.
 
 I have a macppc machine (dual 800 MHz G4, 1.5 GB RAM) that also uses a
 software watchdog.
 
 With -current I can reproducibly kill it by doing a full ATF test run. It
 dies here:
 
 fs/msdosfs/t_snapshot (687/846): 2 test cases
     snapshot: [1.077320s] Passed.
     snapshotstress: [4.312312s] Passed.
 [5.391946s]
 
 fs/nfs/t_mountd (688/846): 1 test cases
     mountdhup: [1.079431s] Expected failure: PR kern/5844: op failed with EACCES
 [1.085206s]
 
 fs/nfs/t_rquotad (689/846): 6 test cases
     get_nfs_be_1_both: 
 
 [ 9641.0182210] swwdog: 60 second timer expired
 [ 9641.0182210] panic: watchdog timer expired
 [ 9641.0182210] cpu0: Begin traceback...
 [ 9641.0182210] 0x1000fdf0: at vpanic+0x12c
 [ 9641.0482453] 0x1000fe20: at panic+0x50
 [ 9641.0582418] 0x1000fe60: at swwdog_panic+0x90
 [ 9641.0682466] 0x1000fe70: at callout_softclock+0x418
 [ 9641.0782523] 0x1000feb0: at softint_dispatch+0x140
 [ 9641.0882560] 0x1000ff20: at softint_fast_dispatch+0xdc
 [ 9641.0882560] saved LR(0xfb3ffb79) is invalid.cpu0: End traceback...
 [ 9641.0882560] halting CPU 1
 [ 9641.2083150] dumpsys: TBD
 [ 9641.2083150] rebooting
 
 
 However, the test itself succeeds when run in isolation:
 
 # cd /usr/tests/fs/nfs && atf-run t_rquotad|atf-report
 Tests root: /usr/tests/fs/nfs
 
 t_rquotad (1/1): 6 test cases
     get_nfs_be_1_both: [3.183507s] Passed.
     get_nfs_be_1_group: [2.943784s] Passed.
     get_nfs_be_1_user: [2.711919s] Passed.
     get_nfs_le_1_both: [3.143645s] Passed.
     get_nfs_le_1_group: [2.876600s] Passed.
     get_nfs_le_1_user: [2.667976s] Passed.
 [17.542653s]
 
 Summary for 1 test programs:
     6 passed test cases.
     0 failed test cases.
     0 expected failed test cases.
     0 skipped test cases.
 
 
 The lockup (and starvation of the userland watchdog tickle) only happens
 with leftover/locked-up rump_server processes from previous tests.
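
 A minimal sketch for spotting such leftovers before re-running a test,
 assuming ps(1) is invoked as `ps -ax -o pid= -o comm=` so that each line
 holds a PID followed by the command name. The helper name
 leftover_rump_servers is hypothetical and not part of ATF or rump.

```sh
# Hypothetical helper (not part of ATF or rump): read "PID COMMAND" lines
# on stdin and print the PID of every process whose command name is
# exactly "rump_server".
leftover_rump_servers() {
    awk '$2 == "rump_server" { print $1 }'
}

# Usage (assumes this ps invocation prints one "PID COMMAND" pair per line):
#   ps -ax -o pid= -o comm= | leftover_rump_servers
```

 Empty output means no process named rump_server was listed; any PIDs
 printed are candidates to inspect before starting another run.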
 
 This is a showstopper for the netbsd-10 branch.
 
 Martin
 

