NetBSD-Bugs archive
Re: kern/54009: "l->l_pcu_cpu[id] == NULL" panic on aarch64
The following reply was made to PR kern/54009; it has been noted by GNATS.
From: Ryo Shimizu <ryo%nerv.org@localhost>
To: Alexander Nasonov <alnsn%yandex.ru@localhost>
Cc: Ryo Shimizu <ryo%nerv.org@localhost>, gnats-bugs%NetBSD.org@localhost,
kern-bug-people%netbsd.org@localhost, gnats-admin%netbsd.org@localhost,
netbsd-bugs%netbsd.org@localhost
Subject: Re: kern/54009: "l->l_pcu_cpu[id] == NULL" panic on aarch64
Date: Thu, 05 Sep 2019 18:08:40 +0900
>> I guess the cause is the lack of memory barrier.
>> Will the following patches fix it?
>
>The bug annoyed me so much that I turned that server off.
>But I recently turned it back on to test 9.0_BETA.
>
>Two tor relays running on the server are still in a ramp up phase
>and it will take about a month to get them running at full speed.
>Once they run at full speed, the chance of hitting the panic will
>be much higher.
With only this verification patch applied, I confirmed that the assertion failure is a false positive.
cvs -q diff -aup .
Index: subr_pcu.c
===================================================================
RCS file: /src/cvs/cvsroot-netbsd/src/sys/kern/subr_pcu.c,v
retrieving revision 1.21
diff -a -u -p -r1.21 subr_pcu.c
--- subr_pcu.c 16 Oct 2017 15:03:57 -0000 1.21
+++ subr_pcu.c 29 Aug 2019 05:53:35 -0000
@@ -336,6 +336,13 @@ pcu_load(const pcu_ops_t *pcu)
 		s = splpcu();
 		curci = curcpu();
 	}
+#if 1
+	if (l->l_pcu_cpu[id] != NULL) {
+		printf("false positive?: l->l_pcu_cpu[id] == NULL? id=%u, l=%p, l->l_pcu_cpu[id]=%p\n", id, l, l->l_pcu_cpu[id]);
+		__asm __volatile ("dsb sy");
+		printf("check again: l->l_pcu_cpu[id] == NULL? id=%u, l=%p, l->l_pcu_cpu[id]=%p\n", id, l, l->l_pcu_cpu[id]);
+	}
+#endif
 	KASSERT(l->l_pcu_cpu[id] == NULL);
 	/* Save the PCU state on the current CPU, if there is any. */
[ 46.812281] false positive?: l->l_pcu_cpu[id] == NULL? id=0, l=0xffffffc004ba2300, l->l_pcu_cpu[id]=0xffffffc000a29580
[ 46.812281] check again: l->l_pcu_cpu[id] == NULL? id=0, l=0xffffffc004ba2300, l->l_pcu_cpu[id]=0x0
It's almost certainly a memory barrier problem.
I'll commit the fix. If you can still reproduce it, please let me know.
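
For readers following the thread, here is a minimal userland sketch of the
kind of ordering involved. It is not the committed fix and not the code in
subr_pcu.c. It assumes C11 <stdatomic.h>, and every demo_* name is
hypothetical. The writer publishes the owner pointer with a release store and
the reader uses an acquire load. The last helper mirrors the verification
patch above: read, full fence, read again.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-ins, NOT NetBSD's struct cpu_info / struct lwp. */
struct demo_cpu { int id; };
struct demo_lwp { _Atomic(struct demo_cpu *) owner; };

/*
 * Writer side: publish a new owner (or NULL) with release ordering, so
 * stores made earlier by this thread are visible to any thread that
 * observes the new value through an acquire load.
 */
static void
demo_owner_set(struct demo_lwp *l, struct demo_cpu *ci)
{
	atomic_store_explicit(&l->owner, ci, memory_order_release);
}

/* Reader side: the acquire load pairs with the release store above. */
static struct demo_cpu *
demo_owner_get(struct demo_lwp *l)
{
	return atomic_load_explicit(&l->owner, memory_order_acquire);
}

/* Claim the slot; the assert plays the role of the KASSERT in the diff. */
static void
demo_claim(struct demo_lwp *l, struct demo_cpu *ci)
{
	assert(demo_owner_get(l) == NULL);
	demo_owner_set(l, ci);
}

/*
 * Diagnostic in the spirit of the verification patch: relaxed read,
 * full fence, relaxed read. Returns 1 if the two reads differ.
 */
static int
demo_owner_changed_after_fence(struct demo_lwp *l)
{
	struct demo_cpu *before, *after;

	before = atomic_load_explicit(&l->owner, memory_order_relaxed);
	atomic_thread_fence(memory_order_seq_cst);
	after = atomic_load_explicit(&l->owner, memory_order_relaxed);
	return before != after;
}
```

In a single thread the two reads in demo_owner_changed_after_fence() always
agree. A difference, as in the console output above, can only appear when
another CPU is updating the field concurrently.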
Thanks,
--
ryo shimizu