Port-alpha archive


Re: Stream of kernel messages: pid <n> is killed: exceeded RLIMIT_CPU



christos%zoulas.com@localhost (Christos Zoulas) writes:
> On Nov 13,  6:01pm, jarle%uninett.no@localhost (Jarle Greipsland) wrote:
> -- Subject: Re: Stream of kernel messages: pid <n> is killed: exceeded RLIMIT
> 
> | christos%zoulas.com@localhost (Christos Zoulas) writes:
> | > Ok. The root of the problem is:
> | > 
> | > | WARNING: negative runtime; monotonic clock has gone backwards
> | > 
> | > I have seen that on the windows virtual pc but nowhere else. What clock
> | > does your alpha use, and can that go backwards?
> | Good question.  Now that you mention it, I seem to recall that I
> | have had problems with the clock on this system before.  It's a
> | dual CPU CS20.  And I had to use sysctl.conf early in the boot
> | process to set
> | 
> |   kern.timecounter.hardware=clockinterrupt
> | 
> | The choices available are:
> | kern.timecounter.choice = PCC(q=1000, f=1249650000 Hz) clockinterrupt(q=0, f=1024 Hz) dummy(q=-1000000, f=1000000 Hz)
> | 
> | If I choose the PCC, I get the following effect:
> | # while true; do date; sleep 1; done
> | Sun Nov 13 17:57:41 CET 2011
> | Sun Nov 13 17:57:41 CET 2011
> | Sun Nov 13 17:57:42 CET 2011
> | Sun Nov 13 17:57:43 CET 2011
> | Sun Nov 13 17:57:43 CET 2011
> | Sun Nov 13 17:57:44 CET 2011
> | Sun Nov 13 17:57:45 CET 2011
> | Sun Nov 13 17:57:46 CET 2011
> | Sun Nov 13 17:57:46 CET 2011
> | Sun Nov 13 17:57:47 CET 2011
> | Sun Nov 13 17:57:48 CET 2011
> | Sun Nov 13 17:57:48 CET 2011
> | Sun Nov 13 17:57:49 CET 2011
> | Sun Nov 13 17:57:50 CET 2011
> | Sun Nov 13 17:57:50 CET 2011
> | Sun Nov 13 17:57:51 CET 2011
> | Sun Nov 13 17:57:52 CET 2011
> | Sun Nov 13 17:57:52 CET 2011
> | Sun Nov 13 17:57:53 CET 2011
> | Sun Nov 13 17:57:54 CET 2011
> | Sun Nov 13 17:57:54 CET 2011
> | Sun Nov 13 17:57:55 CET 2011
> | Sun Nov 13 17:57:56 CET 2011
> | Sun Nov 13 17:57:56 CET 2011
> | Sun Nov 13 17:57:57 CET 2011
> | Sun Nov 13 17:57:58 CET 2011
> | Sun Nov 13 17:57:58 CET 2011
> | Sun Nov 13 17:57:59 CET 2011
> | 
> | So, something is very fishy when it comes to time-keeping on the
> | alpha.
> 
> Yes, this is horrible. Perhaps we are missing clockinterrupts? How can
> the clock go backwards though?
Note that the system sets its clock with ntpdate during the boot
process.  Might that be it?

> I also don't understand why we clear the clock in each loop. I
> would think that the code should look like:
> 
> clear;
> for (i = 0; i < 4; i++)
>       wait for set, clear;
>       start = getcounter;
>       wait for set, clear;
>       end = getcounter;
>       diff[i] = end - start;
> freq = ((diff[2] + diff[3]) * 16) / 2;
> 
> Not what we have now:
> 
> for (i = 0; i < 4; i++)
>       clear;
>       wait for set, clear;
>       start = getcounter;
>       wait for set, clear;
>       end = getcounter;
>       diff[i] = end - start;
> freq = ((diff[2] + diff[3]) / 2) * 16;
> 
> Does this make a difference in the computed frequency if you change it?
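
For a sense of scale, the two freq expressions quoted above differ only in where the integer truncation happens. A minimal Python sketch of that arithmetic, assuming non-negative integer counter deltas; the function names are made up for illustration, and the sample deltas are hypothetical values chosen near the PCC frequency divided by 16, not measurements from this machine:

```python
def freq_multiply_first(diff):
    """Proposed ordering: scale the sum by 16, then halve."""
    return ((diff[2] + diff[3]) * 16) // 2


def freq_divide_first(diff):
    """Current ordering: halve the sum (truncating), then scale by 16."""
    return ((diff[2] + diff[3]) // 2) * 16


if __name__ == "__main__":
    # Hypothetical deltas; only entries 2 and 3 enter the formula.
    diff = [0, 0, 78103125, 78103150]
    a = freq_multiply_first(diff)
    b = freq_divide_first(diff)
    print(a, b, a - b)
```

With integer arithmetic the two orderings differ by 8 when diff[2] + diff[3] is odd and not at all when it is even, so the change in arithmetic by itself would not be expected to move the computed frequency noticeably. Whether moving the clear out of the loop changes the measured deltas is a separate question.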

I applied your patch, and still get "exceeded RLIMIT_CPU"
messages (kernel messages attached).  However, they seem to
(mostly) disappear once the clock source is switched from PCC
(the default during boot, I think) to "clockinterrupt" (as set
in /etc/sysctl.conf), and ntpdate has been run.  Strange.
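
The runtm values in the attached messages are what small negative numbers look like when printed as unsigned 64-bit integers, and rlim_cur is 2^63 - 1. A rough Python sketch of that arithmetic; the helper names are invented for illustration, and the comparison in exceeds_limit is an assumption about how such a check could misfire, not a copy of the kernel code:

```python
MASK64 = 2 ** 64 - 1
# Value printed as rlim_cur and rlim_max in the attached log.
RLIM_FROM_LOG = 2 ** 63 - 1


def as_unsigned64(value):
    """Reinterpret a (possibly negative) integer as an unsigned 64-bit value."""
    return value & MASK64


def exceeds_limit(runtime, limit):
    """Assumed check: compare the runtime as unsigned against the limit."""
    return as_unsigned64(runtime) > limit


if __name__ == "__main__":
    for runtime in (-1, -2, -4):
        print(runtime, as_unsigned64(runtime), exceeds_limit(runtime, RLIM_FROM_LOG))
```

Under that assumption, any negative runtime lands above 2^63 - 1 once treated as unsigned, so even an "unlimited" limit would appear to be exceeded. That would fit the "negative runtime" warning appearing alongside these messages.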

What is really the difference between the PCC and
"clockinterrupt" as clock sources?  Just different modes from the
same hardware, or different beasts altogether?
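
Whatever the hardware behind them, the two frequencies reported in kern.timecounter.choice imply very different resolutions. A small Python sketch of that arithmetic, using only the frequencies quoted above; it says nothing about how either counter is implemented:

```python
def period_seconds(freq_hz):
    """Length of one counter tick, in seconds, for a given frequency."""
    return 1.0 / freq_hz


if __name__ == "__main__":
    # Frequencies as listed in kern.timecounter.choice.
    for name, freq_hz in (("clockinterrupt", 1024), ("PCC", 1249650000)):
        print(name, period_seconds(freq_hz))
```

The 1024 Hz figure gives 0.9765625 ms per tick, which matches the "Timecounters tick every 0.976 msec" line in the attached boot messages. The PCC figure gives a tick a little under one nanosecond.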

                                        -jarle

Copyright (c) 1996, 1997, 1998, 1999, 2000, 2001, 2002, 2003, 2004, 2005,
   2006, 2007, 2008, 2009, 2010, 2011
   The NetBSD Foundation, Inc.  All rights reserved.
Copyright (c) 1982, 1986, 1989, 1991, 1993
   The Regents of the University of California.  All rights reserved.

NetBSD 5.99.56 (CS20 based on GENERIC-$Revision: 1.330 $) #3: Sun Nov 13 18:52:03 CET 2011
        jarle%sweetheart.urc.uninett.no@localhost:/usr/obj/sys/arch/alpha/compile/CS20.MP
API CS20D 833 MHz, s/n 
8192 byte page size, 2 processors.
total memory = 1024 MB
(2776 KB reserved for PROM, 1021 MB used by NetBSD)
avail memory = 1002 MB
timecounter: Timecounters tick every 0.976 msec
mainbus0 (root)
cpu0 at mainbus0: ID 0 (primary), 21264B-3
cpu0: Architecture extensions: 0x1307<PAT,MVI,CIX,FIX,BWX>
cpu1 at mainbus0: ID 1, 21264B-3
cpu1: Architecture extensions: 0x1307<PAT,MVI,CIX,FIX,BWX>
tsc0 at mainbus0: 21272 Core Logic Chipset, Cchip rev 0
tsc0: 4 Dchips, 1 memory bus of 32 bytes
tsc0: arrays present: 1024MB, 0MB, 0MB, 0MB, Dchip 0 rev 1
tsp0 at tsc0
pci0 at tsp0 bus 0
pci0: i/o space, memory space enabled, rd/line, rd/mult, wr/inv ok
esiop0 at pci0 dev 3 function 0: Symbios Logic 53c1010-66 (ultra3-wide scsi)
esiop0: using on-board RAM
esiop0: interrupting at dec 6600 irq 16
scsibus0 at esiop0: 16 targets, 8 luns per target
fxp0 at pci0 dev 4 function 0: i82559 Ethernet, rev 8
fxp0: interrupting at dec 6600 irq 20
fxp0: Ethernet address 00:02:56:00:06:f9
inphy0 at fxp0 phy 1: i82555 10/100 media interface, rev. 4
inphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
sio0 at pci0 dev 7 function 0: Acer Labs M1533 PCI-ISA Bridge (rev. 0xc3)
aceride0 at pci0 dev 16 function 0: Acer Labs M5229 UDMA IDE Controller (rev. 0xc2)
aceride0: bus-master DMA support present
aceride0: using PIO transfers above 137GB as workaround for 48bit DMA access bug, expect reduced performance
aceride0: primary channel configured to compatibility mode
aceride0: primary channel interrupting at isa irq 14
atabus0 at aceride0 channel 0
aceride0: secondary channel configured to compatibility mode
aceride0: secondary channel interrupting at isa irq 15
atabus1 at aceride0 channel 1
Acer Labs M7101 Power Management Controller (miscellaneous prehistoric) at pci0 dev 17 function 0 not configured
isa0 at sio0
lpt0 at isa0 port 0x3bc-0x3bf irq 7
com0 at isa0 port 0x3f8-0x3ff irq 4: ns16550a, working fifo
com0: console
com1 at isa0 port 0x2f8-0x2ff irq 3: ns16550a, working fifo
attimer0 at isa0 port 0x40-0x43
pcppi0 at isa0 port 0x61
spkr0 at pcppi0
isabeep0 at pcppi0
mcclock0 at isa0 port 0x70-0x71: mc146818 compatible time-of-day clock
attimer0: attached to pcppi0
tsp1 at tsc0
pci1 at tsp1 bus 0
pci1: i/o space, memory space enabled, rd/line, rd/mult, wr/inv ok
fxp1 at pci1 dev 3 function 0: i82559 Ethernet, rev 8
fxp1: interrupting at dec 6600 irq 32
fxp1: Ethernet address 00:02:56:00:06:fa
inphy1 at fxp1 phy 1: i82555 10/100 media interface, rev. 4
inphy1: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
timecounter: Timecounter "clockinterrupt" frequency 1024 Hz quality 0
timecounter: Timecounter "PCC" frequency 1249650400 Hz quality 1000
scsibus0: waiting 2 seconds for devices to settle...
stray isa irq 15
IPsec: Initialized Security Association Processing.
updatertime: now (1,3150429391398570567) < &l->l_stime->sec 
(1,11210236177680282759)
updatertime: now (1,4231397250631051767) < &l->l_stime->sec 
(1,12291208716315795879)
updatertime: now (1,5336397726713191047) < &l->l_stime->sec 
(1,13396211288534309079)
updatertime: now (1,6429373465540434327) < &l->l_stime->sec 
(1,14489200652247982839)
updatertime: now (1,7534406918866653447) < &l->l_stime->sec 
(1,15594216583645498839)
updatertime: now (1,8639402169369381687) < &l->l_stime->sec 
(1,16699222388637715479)
updatertime: now (1,9744421732101742647) < &l->l_stime->sec 
(1,17804239220488180839)
updatertime: now (1,10837403892191821527) < &l->l_stime->sec 
(2,450477734952897143)
updatertime: now (1,11918396890299266007) < &l->l_stime->sec 
(2,1531472460158621543)
sd0 at scsibus0 target 0 lun 0: <IBM-PSG, ST318437LC    !#, 59LK> disk fixed
sd0: 17357 MB, 29851 cyl, 2 head, 595 sec, 512 bytes/sect x 35548320 sectors
sd0: sync (12.50ns offset 31), 16-bit (160.000MB/s) transfers, tagged queueing
updatertime: now (4,2263692188329079047) < &l->l_stime->sec 
(4,2943829616993935447)
updatertime: now (4,3332670117979690327) < &l->l_stime->sec 
(4,11392616297330268199)
updatertime: now (4,4437670712154019687) < &l->l_stime->sec 
(4,12497633335842066199)
updatertime: now (4,5530677007335446167) < &l->l_stime->sec 
(4,13590636029211695239)
updatertime: now (4,6635699419041892807) < &l->l_stime->sec 
(4,14695648535935698919)
updatertime: now (4,7728690288430990087) < &l->l_stime->sec 
(4,15788650476467616199)
updatertime: now (4,8833708876902782887) < &l->l_stime->sec 
(4,16893665743596562999)
updatertime: now (4,9950737323013521607) < &l->l_stime->sec 
(4,18010687443690943399)
updatertime: now (4,11031747474011575207) < &l->l_stime->sec 
(5,644946332117374263)
updatertime: now (4,12124733117821261447) < &l->l_stime->sec 
(5,1737949822609286343)
updatertime: now (4,13217731486064428807) < &l->l_stime->sec 
(5,2830951748379679863)
updatertime: now (4,14322749041229558407) < &l->l_stime->sec 
(5,3935963428458352983)
updatertime: now (5,8415497163014375751) < &l->l_stime->sec 
(5,14883410008259630343)
stray isa irq 15
atapibus0 at atabus1: 2 targets
stray isa irq 15
cd0 at atapibus0 drive 0: <SAMSUNG CD-ROM SN-124, , q008> cdrom removable
cd0: 32-bit data port
cd0: drive supports PIO mode 4, DMA mode 2, Ultra-DMA mode 2 (Ultra/33)
cd0(aceride0:1:0): using PIO mode 4, Ultra-DMA mode 2 (Ultra/33) (using DMA)
root on sd0a dumps on sd0b
root file system type: ffs
updatertime: now (6,10736431281324386567) < &l->l_stime->sec 
(7,349654864982483623)
pid 0 is killed: exceeded RLIMIT_CPU, runtm=18446744073709551615 
rlim_cur=9223372036854775807 rlim_max=9223372036854775807
WARNING: negative runtime; monotonic clock has gone backwards
updatertime: now (7,9885792818574738871) < &l->l_stime->sec 
(7,17945751309036132583)
updatertime: now (7,11423180320257680551) < &l->l_stime->sec 
(8,1036412096561464407)
pid 0 is killed: exceeded RLIMIT_CPU, runtm=18446744073709551614 
rlim_cur=9223372036854775807 rlim_max=9223372036854775807
updatertime: now (7,15158746919232562231) < &l->l_stime->sec 
(8,4771827626102186247)
stray isa irq 4
updatertime: now (7,17356677314372193751) < &l->l_stime->sec 
(8,6969840582444207447)
updatertime: now (8,1876583123360420055) < &l->l_stime->sec 
(8,9936553319710155447)
updatertime: now (8,5912399927497508535) < &l->l_stime->sec 
(8,13972313823395623287)
updatertime: now (8,7053387655207865655) < &l->l_stime->sec 
(8,15113354751637611447)
pid 0 is killed: exceeded RLIMIT_CPU, runtm=18446744073709551612 
rlim_cur=9223372036854775807 rlim_max=9223372036854775807
pid 0 is killed: exceeded RLIMIT_CPU, runtm=18446744073709551614 
rlim_cur=9223372036854775807 rlim_max=9223372036854775807
Nov 13 19:31:57 sweetheart ntpdate[174]: step time server 158.38.39.1 offset 
102.130283 sec
pid 0 is killed: exceeded RLIMIT_CPU, runtm=18446744073709551614 
rlim_cur=9223372036854775807 rlim_max=9223372036854775807
Nov 13 19:31:59 sweetheart savecore: no core dump
pid 0 is killed: exceeded RLIMIT_CPU, runtm=18446744073709551614 
rlim_cur=9223372036854775807 rlim_max=9223372036854775807
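
The updatertime lines above print pairs that look like (seconds, 64-bit binary fraction), in the style of struct bintime. That reading is an assumption, as are the helper names below. Under it, the gap between "now" and l_stime can be turned into seconds with a short Python sketch:

```python
FRAC_SCALE = 2 ** 64  # assumed: the second field is a fraction of a second in 1/2^64 units


def lag_seconds(now, stime):
    """How far 'now' is behind 'stime', in seconds, for (sec, frac) pairs."""
    dsec = stime[0] - now[0]
    dfrac = stime[1] - now[1]
    return dsec + dfrac / FRAC_SCALE


# Three pairs copied from the log above: (now, l_stime).
SAMPLES = [
    ((1, 3150429391398570567), (1, 11210236177680282759)),
    ((1, 4231397250631051767), (1, 12291208716315795879)),
    ((1, 10837403892191821527), (2, 450477734952897143)),
]

if __name__ == "__main__":
    for now, stime in SAMPLES:
        print(now, stime, lag_seconds(now, stime))
```

For the three pairs listed here the gap comes out a little under 0.44 s each time, including the one where the seconds field rolls over. Other pairs in the log, such as the first one after sd0 attaches, give different values, so this is only a way to read the numbers and not a diagnosis.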

