NetBSD-Bugs archive

Re: kern/53884: fs/vfs/t_rmdirrace:lfs_race test case fails randomly on real hardware



The following reply was made to PR kern/53884; it has been noted by GNATS.

From: Andreas Gustafsson <gson%gson.org@localhost>
To: David Holland <dholland-bugs%netbsd.org@localhost>
Cc: gnats-bugs%NetBSD.org@localhost
Subject: Re: kern/53884: fs/vfs/t_rmdirrace:lfs_race test case fails randomly
 on real hardware
Date: Fri, 18 Jan 2019 12:58:51 +0200

 David,
 
 You wrote:
 >   >   panic: kernel diagnostic assertion "fs->lfs_cleaner_thread == curlwp" failed
 >  
 >  The only way this should be possible is to have two cleaners running
 >  at once, which... I wouldn't put past the rump test framework, I
 >  guess, but it doesn't seem that likely and also shouldn't vary under
 >  the number of cpus on the system. Or so one would think.
 >  
 >  Is it possible to check for extra lfs_cleanerd processes the next time
 >  this happens?
 
 There are no lfs_cleanerd processes running after atf-run has exited,
 if that's what you mean.  The "anita test" command runs "ps -glaxw"
 after the tests, prefixing each line with "ps-post-test", and there
 were no unexpected processes shown in the console log from the last
 run where the lfs_race test case failed:
 
   ps-post-test UID   PID PPID  CPU PRI NI	  VSZ	 RSS WCHAN   STAT TTY	   TIME COMMAND
   ps-post-test   0     0	  0    0   0  0	    0 126868 -	     OKl  ?	0:33.26 [system]
   ps-post-test   0     1	  0   76  85  0 12104	1316 wait    Ss	  ?	0:00.09 init
   ps-post-test   0   167	  1    0  85  0 24188	2136 kqueue  Ss	  ?	0:00.11 /usr/sbin/syslogd -s
   ps-post-test   0   306	  1    0  85  0 12124	1136 kqueue  Is	  ?	0:00.00 /usr/sbin/powerd
   ps-post-test   0   586	  1    0  85  0 51068	2492 kqueue  Is	  ?	0:00.02 /usr/libexec/postfix/master -w
   ps-post-test  12   588	586    0  85  0 51584	3780 kqueue  I	  ?	0:00.02 pickup -l -t unix -u
   ps-post-test  12   595	586    0  85  0 49828	3808 kqueue  I	  ?	0:00.01 qmgr -l -t unix -u
   ps-post-test   0   596	  1    0  85  0 14232	1200 kqueue  Is	  ?	0:00.00 /usr/sbin/inetd -l
   ps-post-test   0   607	  1    0  85  0 12132	1376 nanoslp Is	  ?	0:00.01 /usr/sbin/cron
   ps-post-test   0    77	536 2728  85  0	 7876	1180 pipe_rd S+	  tty00 0:00.00 sed s/^/ps-post-test /
   ps-post-test   0   536	633 2728  85  0 12380	1732 wait    S	  tty00 0:00.03 /bin/sh
   ps-post-test   0   633	  1   60  85  0 66104	4164 wait    Is	  tty00 0:00.02 login
   ps-post-test   0 23032	536 2728  43  0 12188	1244 -	     O+	  tty00 0:00.00 ps -glaxw
   ps-post-test   0   465	  1   60  85  0 12128	1296 ttyraw  Is+  ttyE1 0:00.00 /usr/libexec/getty Pc ttyE1
   ps-post-test   0   634	  1   60  85  0 12260	1296 ttyraw  Is+  ttyE2 0:00.00 /usr/libexec/getty Pc ttyE2
   ps-post-test   0   527	  1   60  85  0 12128	1292 ttyraw  Is+  ttyE3 0:00.00 /usr/libexec/getty Pc ttyE3
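 
 The check described above can be repeated on a saved console log. The
 following is only a sketch: the function name check_cleaners and the
 file name console.log are hypothetical, and it assumes the log keeps
 the "ps-post-test" prefix shown in the listing.

```shell
# Hypothetical helper: read a console log on stdin and print any
# "ps-post-test" lines that mention lfs_cleanerd. If there are none,
# print a short note instead.
check_cleaners() {
    grep '^ *ps-post-test ' | grep 'lfs_cleanerd' ||
        echo "no lfs_cleanerd processes"
}
```

 It could be run as "check_cleaners < console.log". Note that it only
 sees what was running when ps was invoked, that is, after the tests
 had finished.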
 
 If you would like to modify the kernel or test to provide more
 debugging information and don't want to commit the changes, you can
 send me a patch.
 
 This may be unrelated, but I see that on the qemu-based TNF testbed
 (where the test passes), lfs_cleanerd sometimes logs the error message
 "couldn't open device rrumpfs for reading":
 
   babylon5.netbsd.org$ cd /bracket/amd64/results/
   babylon5.netbsd.org$ zgrep 'lfs_race:' 2019/*/test.log.gz
   2019/2019.01.01.10.09.26/test.log.gz:    lfs_race: [25.983353s] Passed.
   2019/2019.01.01.14.01.46/test.log.gz:    lfs_race: [25.433668s] Passed.
   2019/2019.01.02.03.04.26/test.log.gz:    lfs_race: [25.344210s] Passed.
   2019/2019.01.02.09.04.09/test.log.gz:    lfs_race: [25.431315s] Passed.
   2019/2019.01.02.22.58.44/test.log.gz:    lfs_race: Jan  3 22:27:14  lfs_cleanerd[27084]: couldn't open device rrumpfs for reading
   2019/2019.01.03.15.33.06/test.log.gz:    lfs_race: [25.992201s] Passed.
   2019/2019.01.04.10.25.39/test.log.gz:    lfs_race: [24.285902s] Passed.
   2019/2019.01.05.05.40.00/test.log.gz:    lfs_race: [25.598234s] Passed.
   2019/2019.01.05.10.51.06/test.log.gz:    lfs_race: Jan  6 05:42:53  lfs_cleanerd[27176]: couldn't open device rrumpfs for reading
   2019/2019.01.06.11.20.53/test.log.gz:    lfs_race: [26.268811s] Passed.
   2019/2019.01.06.18.56.52/test.log.gz:    lfs_race: Jan  9 14:56:42  lfs_cleanerd[26388]: couldn't open device rrumpfs for reading
   2019/2019.01.07.05.01.10/test.log.gz:    lfs_race: [25.582419s] Passed.
   2019/2019.01.08.06.29.35/test.log.gz:    lfs_race: [23.349863s] Passed.
   2019/2019.01.09.04.02.26/test.log.gz:    lfs_race: [23.915846s] Passed.
   2019/2019.01.10.19.00.17/test.log.gz:    lfs_race: [24.829028s] Passed.
   2019/2019.01.11.02.44.49/test.log.gz:    lfs_race: [25.661166s] Passed.
   2019/2019.01.11.08.30.19/test.log.gz:    lfs_race: [23.367517s] Passed.
   2019/2019.01.11.15.43.51/test.log.gz:    lfs_race: [25.536107s] Passed.
   2019/2019.01.11.23.10.41/test.log.gz:    lfs_race: [25.789131s] Passed.
   2019/2019.01.12.17.25.09/test.log.gz:    lfs_race: [25.472970s] Passed.
   2019/2019.01.13.06.59.15/test.log.gz:    lfs_race: [26.537625s] Passed.
   2019/2019.01.13.10.01.07/test.log.gz:    lfs_race: [25.320557s] Passed.
   2019/2019.01.13.10.43.22/test.log.gz:    lfs_race: Jan 15 05:31:26  lfs_cleanerd[5727]: couldn't open device rrumpfs for reading
   2019/2019.01.13.16.48.51/test.log.gz:    lfs_race: Jan 16 04:26:48  lfs_cleanerd[20726]: couldn't open device rrumpfs for reading
   2019/2019.01.14.03.30.25/test.log.gz:    lfs_race: [24.650734s] Passed.
   2019/2019.01.14.21.29.56/test.log.gz:    lfs_race: [23.333059s] Passed.
   2019/2019.01.15.14.23.56/test.log.gz:    lfs_race: [25.056744s] Passed.
   2019/2019.01.16.08.32.24/test.log.gz:    lfs_race: [26.300848s] Passed.
   2019/2019.01.16.13.54.17/test.log.gz:    lfs_race: [25.688629s] Passed.
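 
 To see how often the message appears relative to ordinary passes, the
 zgrep output can be tallied. This is a minimal sketch, assuming the
 two line formats shown in the listing above; the function name
 tally_lfs_race is hypothetical.

```shell
# Hypothetical helper: count "Passed." lines and lfs_cleanerd
# "couldn't open device" lines in zgrep output read from stdin.
tally_lfs_race() {
    awk '
        # "." stands in for the apostrophe to keep the shell quoting simple
        /lfs_race: .*lfs_cleanerd\[[0-9]+\]: couldn.t open device/ { errors++ }
        /lfs_race: \[[0-9.]+s\] Passed\./ { passed++ }
        END { printf "passed=%d cleanerd_errors=%d\n", passed, errors }
    '
}
```

 Usage would be along the lines of:

   zgrep 'lfs_race:' 2019/*/test.log.gz | tally_lfs_race

 The listing above contains 24 "Passed." lines and 5 lfs_cleanerd
 error lines.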
 
 -- 
 Andreas Gustafsson, gson%gson.org@localhost
 

