NetBSD-Bugs archive
port-amd64/48142: i8254 timer stops working during boot - system lockup during boot
>Number: 48142
>Category: port-amd64
>Synopsis: i8254 timer stops working during boot - system lockup during boot
>Confidential: no
>Severity: serious
>Priority: medium
>Responsible: port-amd64-maintainer
>State: open
>Class: sw-bug
>Submitter-Id: net
>Arrival-Date: Wed Aug 21 13:00:00 +0000 2013
>Originator: Dr. Wolfgang Stukenbrock
>Release: NetBSD 6.1
>Organization:
Dr. Nagler & Company GmbH
>Environment:
System: NetBSD test-s0 5.1.2 NetBSD 5.1.2 (NSW-WS) #3: Fri Dec 21 15:15:43 CET
2012 wgstuken@test-s0:/usr/src/sys/arch/amd64/compile/NSW-WS amd64
Architecture: x86_64
Machine: amd64
>Description:
While initializing the hardware on x86 systems, sys/arch/x86/x86/cpu.c
calls i8254_delay() directly, bypassing the DELAY() macro, which may
call this function or something else, e.g. lapic_delay() from
sys/arch/x86/x86/lapic.c. I'm not sure whether this is by design or by
mistake ...
If the lapic is set up, the i8254 timer counter frequency is set to '0',
which means a full cycle as far as I understand.
OK, the time source gets slower than before, but it is still running.
All calls to i8254_delay() will wait longer than before, but that does
not really hurt here, and nobody has noticed this as a problem before.
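To put a number on "full cycle", here is a minimal sketch in C. It assumes the usual PC input clock of 1193182 Hz for the i8254 and the chip's rule that a 16-bit reload value of 0 means 65536 counts in binary mode. The helper name i8254_period_us() is made up for this illustration and is not a kernel function; the reload value 11931 assumes hz=100.

```c
#include <stdint.h>

/* Input clock of the i8254 on PC-compatible hardware, in Hz. */
#define I8254_INPUT_HZ 1193182ULL

/*
 * Hypothetical helper: length of one full countdown in microseconds
 * for a given 16-bit reload value.  In binary counting mode the chip
 * treats a reload value of 0 as 65536, the longest possible cycle.
 */
static uint64_t
i8254_period_us(uint16_t reload)
{
	uint64_t counts = (reload == 0) ? 65536ULL : reload;

	return counts * 1000000ULL / I8254_INPUT_HZ;
}
```

With these assumptions a reload value of 11931 gives a period of about 10 ms, while a reload value of 0 gives about 55 ms, so the counter keeps running but wraps far less often.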
On our Supermicro X8DAH (with only one CPU) we have the problem that,
with some kernel configurations, the system hangs during startup while
starting the "other" CPUs.
I debugged it and found that the i8254 timer has stopped counting, for
unknown reasons.
This happens after the isa subsystem is initialized.
If i8254_delay() is used after this, e.g. at the end of isaattach() for
debugging purposes, it never returns.
At the start of this routine the timer is still working fine.
No indication of the problem is reported; the user just sits there
wondering ...
The problem is triggered by the finsio driver on port 0x4e, but I'm not
sure whether it is the fault of this driver or whether the timer
registers are also visible on ports other than 0x40 and 0x43 on this
board.
This has not been tested on other motherboards yet; perhaps others are
affected too.
>How-To-Repeat:
Configure "finsio0 at isa? port 0x4e" in a kernel configuration on a
Supermicro X8DAH board.
The kernel will freeze during startup.
>Fix:
Not 100% sure, because my knowledge of the constraints during startup
is too small.
Perhaps replacing the i8254_delay() in arch/x86/x86/cpu.c with DELAY()
would be a good idea.
It solves the problem for me, but I'm not sure whether there are other
side effects.
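The difference between the two calls can be sketched as follows. This is only an illustration under the assumption that DELAY() dispatches through a function pointer that gets switched once another time source is available. The fake_* functions, the counters, and switch_to_lapic() are stand-ins made up here, not the kernel's real routines.

```c
/* Hypothetical stand-ins for the two delay back ends; they only count calls. */
static int i8254_calls;
static int lapic_calls;

static void
fake_i8254_delay(unsigned int usec)
{
	(void)usec;
	i8254_calls++;
}

static void
fake_lapic_delay(unsigned int usec)
{
	(void)usec;
	lapic_calls++;
}

/* Assumed shape of the dispatch: a pointer that starts at the i8254 back end. */
static void (*delay_func)(unsigned int) = fake_i8254_delay;

#define DELAY(usec)	(*delay_func)(usec)

/* Hypothetical hook: called once the other timer has been set up. */
static void
switch_to_lapic(void)
{
	delay_func = fake_lapic_delay;
}
```

Code that calls DELAY() follows the switch automatically, while code that calls the i8254 routine by name keeps depending on the i8254 even after the switch.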
Another way to introduce a workaround is to catch the case where the
timer stops working in i8254_delay() in sys/arch/x86/isa/clock.c.
If we assume that each loop iteration takes longer than one timer tick,
we can decrement the remaining counter by one each time we read the
same tick value again, which avoids an endless loop here.
This approach introduces some slowdown while bringing up the "other"
CPUs, but it also works fine, without accessing "other resources" as
the first fix would do ...
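The idea of the second approach can be tried outside the kernel with a simplified model of the wait loop. The names gettick, rtclock_tval, initial_tick, cur_tick and remaining follow the patch; the simulated counter, the "stalled" flag, the step of 3 ticks per read and the function delay_ticks() are assumptions made only for this sketch.

```c
/* Simulated hardware state (hypothetical, for illustration only). */
static int rtclock_tval = 11932;	/* reload value of the down counter */
static int fake_counter = 11932;	/* current simulated counter value */
static int stalled;			/* nonzero: the timer has stopped */

/* Simulated counter read: counts down by 3 per read and wraps. */
static int
gettick(void)
{
	if (!stalled) {
		fake_counter -= 3;
		if (fake_counter <= 0)
			fake_counter += rtclock_tval;
	}
	return fake_counter;
}

/* Wait for 'remaining' ticks; returns the number of loop iterations. */
static long
delay_ticks(int remaining)
{
	long iterations = 0;
	int initial_tick = gettick();
	int cur_tick;

	while (remaining > 0) {
		cur_tick = gettick();
		if (cur_tick > initial_tick)	/* counter wrapped around */
			remaining -= rtclock_tval - (cur_tick - initial_tick);
		else if (cur_tick == initial_tick)
			remaining -= 1;		/* no progress: charge one tick */
		else
			remaining -= initial_tick - cur_tick;
		initial_tick = cur_tick;
		iterations++;
	}
	return iterations;
}
```

With a running counter the loop ends after the requested number of ticks has passed. With "stalled" set, every read returns the same value and the loop still ends, after one iteration per requested tick. Without the middle branch the stalled case would subtract 0 forever.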
Remark: if compiled as XEN, then xen_delay() would be used by the
DELAY() macro. Not sure whether this is OK, or whether
sys/arch/x86/x86/cpu.c goes into a XEN kernel at all.
Remark: i8254_delay() is not used directly again later, so the second
approach will only slow down the boot process. (At least I've found no
other references to it in the sources.)
Here is a patch that uses the second approach for
sys/arch/x86/isa/clock.c.
Feel free to use it, or to replace i8254_delay() with DELAY() in
sys/arch/x86/x86/cpu.c.
Perhaps it would make sense to add some additional code to report the
problem to the user the first time it happens, but on very, very fast
systems in the future this message may be misleading ...
--- clock.c 2013/08/21 12:43:37 1.1
+++ clock.c 2013/08/21 12:48:26
@@ -482,6 +482,9 @@
 		cur_tick = gettick();
 		if (cur_tick > initial_tick)
 			delta = rtclock_tval - (cur_tick - initial_tick);
+// avoid looping forever if timer stops counting for any reason
+		else if (cur_tick == initial_tick)
+			delta = 1;
 		else
 			delta = initial_tick - cur_tick;
 		if (delta < 0 || delta >= rtclock_tval / 2) {
@@ -500,6 +503,9 @@
 		cur_tick = gettick();
 		if (cur_tick > initial_tick)
 			remaining -= rtclock_tval - (cur_tick - initial_tick);
+// avoid looping forever if timer stops counting for any reason
+		else if (cur_tick == initial_tick)
+			remaining -= 1;
 		else
 			remaining -= initial_tick - cur_tick;
 #endif
>Unformatted: