Subject: Re: unified buffers and responsibility
To: Manuel Bouyer <bouyer@antioche.eu.org>
From: Milos Urbanek <urbanek@openbsd.cz>
List: tech-kern
Date: 06/13/2002 13:05:29
On Wed, Jun 12, 2002 at 10:55:56PM +0200, Manuel Bouyer wrote:
> > >
> > > Does anyone have an idea of what is causing this?
> > > I suspect it could be disksort, which causes a single request to one end of
> > > the disk to be delayed in order to group dozens of requests at the other end.
> > 
> > if so, apps that certainly don't use the disk at all (like a text editor)
> > should work fine. But they don't: delays are small but noticeable (up to 1
> > second).
> 
> it looks like file activity is still pushing some data out of RAM,
> even though it should not.


I have similar problems on NetBSD 1.5ZC: backing up the /home
directory to a tgz file on the same disk leads to a hard lockup,
as does 'cp big_file somewhere'. I can't
tell you whether there is anything at the ddb prompt, because the whole
machine is completely unresponsive, and whenever it has happened to me it
was running X.

The response time of other apps is slow as well during the
copy/tar/whatever 'larger' disk I/O, including the window manager, X itself,
xterms and other apps, not only those like netscape.
I suspect some pages of those apps are going to swap during the intensive
buffer cache operations, but I do not have quantitative measurements of
how many times the page daemon has woken up.
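
One way to get such numbers would be to poll `vmstat -s` while the copy is running and log how far the "times daemon wokeup" counter advances per interval. This is only a minimal sketch: the function names are made up for illustration, and it assumes the counter label is printed exactly as in the vmstat output of this system.

```python
import re
import subprocess
import time

# Counter label as printed by `vmstat -s` on this system (assumed stable).
WAKEUP_LABEL = "times daemon wokeup"


def read_counter(vmstat_text, label=WAKEUP_LABEL):
    """Return the integer printed in front of `label`, or None if absent."""
    for line in vmstat_text.splitlines():
        m = re.match(r"\s*(\d+)\s+(.*\S)\s*$", line)
        if m and m.group(2) == label:
            return int(m.group(1))
    return None


def run_vmstat():
    """Run `vmstat -s` and return its output as text."""
    return subprocess.run(["vmstat", "-s"], capture_output=True,
                          text=True, check=True).stdout


def sample_wakeups(samples, interval=1.0, source=run_vmstat, sleep=time.sleep):
    """Take `samples` readings and return the wakeup delta per interval."""
    readings = []
    for i in range(samples):
        if i:
            sleep(interval)
        value = read_counter(source())
        if value is None:
            raise ValueError("counter not found: " + WAKEUP_LABEL)
        readings.append(value)
    # Differences between consecutive readings.
    return [b - a for a, b in zip(readings, readings[1:])]
```

Calling `sample_wakeups(samples=60, interval=1.0)` in one terminal while the large copy runs in another would give one delta per second. The `source` and `sleep` parameters exist only so the loop can be tried with canned text instead of the real command.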

Mhm. Now it happened again, while unpacking the memory image that was
generated after the crash a week ago. So I can append the useful parts of
the vmstat output.

Other symptoms:
top shows about
Memory: 68M Act, 34M Inact, 3744K Wired, 5084K Exec, 52M File, 268K Free
Swap: 513M Total, 23M Used, 491M Free

during the command
gzip -d netbsd.0.core in /var/crash

Immediately after the command completes, the statistics are as follows:
Memory: 62M Act, 23M Inact, 3904K Wired, 5084K Exec, 35M File, 18M Free
Swap: 513M Total, 23M Used, 491M Free
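
To make the change between the two top readings easier to see, the Memory lines can be converted to KB and subtracted. A small sketch, assuming top only ever prints K, M or G suffixes; the values are rounded by top, so the differences are approximate.

```python
import re

# Multipliers to convert top's suffixes to KB.
UNIT_KB = {"K": 1, "M": 1024, "G": 1024 * 1024}


def parse_memory_line(line):
    """Turn 'Memory: 68M Act, 34M Inact, ...' into {'Act': 69632, ...} (KB)."""
    fields = {}
    for amount, unit, name in re.findall(r"(\d+)([KMG])\s+(\w+)", line):
        fields[name] = int(amount) * UNIT_KB[unit]
    return fields


during = parse_memory_line(
    "Memory: 68M Act, 34M Inact, 3744K Wired, 5084K Exec, 52M File, 268K Free")
after = parse_memory_line(
    "Memory: 62M Act, 23M Inact, 3904K Wired, 5084K Exec, 35M File, 18M Free")

# Change in KB for each category (after minus during).
change_kb = {name: after[name] - during[name] for name in during}
for name, delta in change_kb.items():
    print(f"{name:>6}: {delta:+d} KB")
```

With these two readings, File drops by about 17M and Free grows by about 18M once gzip exits, while Exec does not change.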

When I compare the vmstat output taken after the first 'gzip' command with
the output taken after another two 'gzip' commands were run, I get the
following:

vmstat -ms after the first command gzip netbsd.0.core.gz:

In use 2889K, total allocated 3784K; utilization 76.3%

     4096 bytes per page
        8 page colors
    31335 pages managed
     3554 pages free   
    15590 pages active
     6909 pages inactive
        0 pages paging
      979 pages wired   
        0 zero pages    
        1 reserve pagedaemon pages  
        5 reserve kernel pages      
    12478 anonymous pages    
     9605 cached file pages  
     1296 cached executable pages
       64 minimum free pages 
       85 target free pages  
     8671 target inactive pages   
    10445 maximum wired pages
        1 swap devices  
   131417 swap pages    
     5427 swap pages in use  
     1205 swap allocations   
   160711 anons
   143148 free anons
  2958228 total faults taken
105231261 traps
 32436282 device interrupts
149273095 cpu context switches
102736567 software interrupts
631199647 system calls
     1138 pagein requests
      463 pageout requests
      195 swap ins
      211 swap outs 
        0 pages swapped in
     6849 pages swapped out
     5179 forks total  
      932 forks blocked parent
      945 forks shared address space with parent
        0 pagealloc zero wanted and avail
  2519223 pagealloc zero wanted and not avail
        0 aborts of idle page zeroing
  2689138 pagealloc desired color avail
     4333 pagealloc desired color not avail
        7 faults with no memory
        0 faults with no anons
        0 faults had to wait on pages
        0 faults found released page
     6653 faults relock (6652 ok)
   268366 anon page faults
     1137 anon retry faults  
    63038 amap copy faults
    83360 neighbour anon page faults
   611514 neighbour object page faults
   200225 locked pager get faults
     5515 unlocked pager get faults
   223744 anon faults
    44054 anon copy on write faults
   176321 object faults
    23904 promote copy faults
  2468048 promote zero fill faults
      313 times daemon wokeup
      225 revolutions of the clock hand
      225 times daemon attempted swapout
        9 pages freed by daemon
   211785 pages scanned by daemon
     6750 anonymous pages scanned by daemon
    55320 object pages scanned by daemon
    89969 pages reactivated
        0 pages found busy by daemon
      463 total pending pageouts
   238759 pages deactivated
 20870664 total name lookups
          cache hits (88% pos + 5% neg) system 1% per-process
          deletions 0%, falsehits 0%, toolong 0%

vmstat -ms after another two commands gzip [-d] netbsd.0.core.gz:

In use 2901K, total allocated 3700K; utilization 78.4%

     4096 bytes per page
        8 page colors
    31335 pages managed
     4339 pages free
    15938 pages active
     5803 pages inactive
        0 pages paging
      979 pages wired
        0 zero pages
        1 reserve pagedaemon pages
        5 reserve kernel pages
    12341 anonymous pages
     9009 cached file pages
     1271 cached executable pages
       64 minimum free pages
       85 target free pages
     8686 target inactive pages
    10445 maximum wired pages
        1 swap devices
   131417 swap pages
     5804 swap pages in use
     1253 swap allocations
   160711 anons
   142875 free anons
  2986284 total faults taken
105333020 traps
 32526681 device interrupts
149362353 cpu context switches
102815088 software interrupts
631704542 system calls
     1138 pagein requests
      497 pageout requests
      264 swap ins
      280 swap outs
        0 pages swapped in
     7307 pages swapped out
     5190 forks total
      933 forks blocked parent
      946 forks shared address space with parent
        0 pagealloc zero wanted and avail
  2520483 pagealloc zero wanted and not avail
        0 aborts of idle page zeroing
  2774459 pagealloc desired color avail
     5958 pagealloc desired color not avail
        7 faults with no memory
        0 faults with no anons
        0 faults had to wait on pages
        0 faults found released page
     7072 faults relock (7071 ok)
   272969 anon page faults
     1137 anon retry faults
    63191 amap copy faults
    84322 neighbour anon page faults
   614747 neighbour object page faults
   201536 locked pager get faults
     5934 unlocked pager get faults
   228230 anon faults
    44171 anon copy on write faults
   177582 object faults
    23954 promote copy faults
  2469138 promote zero fill faults
      563 times daemon wokeup
      475 revolutions of the clock hand
      475 times daemon attempted swapout
        9 pages freed by daemon
   440512 pages scanned by daemon
     7160 anonymous pages scanned by daemon
   125233 object pages scanned by daemon
   139412 pages reactivated
        0 pages found busy by daemon
      497 total pending pageouts
   474257 pages deactivated
 20875826 total name lookups
          cache hits (88% pos + 5% neg) system 1% per-process
          deletions 0%, falsehits 0%, toolong 0%


So there are another 250 wakeups of the page daemon and about
another 70 swap-outs, etc.
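
The comparison can be done mechanically rather than by eye. A minimal sketch that parses the "count label" lines of the vmstat output and prints only the counters that changed; the function names are illustrative, and the two excerpts are copied from the snapshots above.

```python
import re


def parse_vmstat(text):
    """Map each 'count label' line of vmstat output to {label: count}."""
    counters = {}
    for line in text.splitlines():
        m = re.match(r"\s*(\d+)\s+(.*\S)\s*$", line)
        if m:
            counters[m.group(2)] = int(m.group(1))
    return counters


def diff_counters(before, after):
    """Return {label: after - before} for the counters that changed."""
    return {label: after[label] - before[label]
            for label in before
            if label in after and after[label] != before[label]}


# Excerpt from the snapshot taken after the first gzip.
first = parse_vmstat("""
      211 swap outs
     6849 pages swapped out
      313 times daemon wokeup
        9 pages freed by daemon
   211785 pages scanned by daemon
    89969 pages reactivated
   238759 pages deactivated
""")
# Excerpt from the snapshot taken after the next two gzip runs.
second = parse_vmstat("""
      280 swap outs
     7307 pages swapped out
      563 times daemon wokeup
        9 pages freed by daemon
   440512 pages scanned by daemon
   139412 pages reactivated
   474257 pages deactivated
""")

changes = diff_counters(first, second)
for label, delta in changes.items():
    print(f"{delta:>8}  {label}")
```

On these excerpts the daemon scanned roughly another 229000 pages and deactivated about 235000, while "pages freed by daemon" stayed at 9 and therefore does not show up in the diff at all.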

I think the sysctl values for the buffer cache should be tuned a bit, or is
there another solution? Btw, the lockup could be related to frequent swap
ins/outs. It happens to me only when my machine has been swapping for long
enough.

Milos

PS:

I have something like

vm.nkmempages = 8163
vm.anonmin = 10
vm.execmin = 5
vm.filemin = 10
vm.maxslp = 20
vm.uspace = 8192
vm.anonmax = 80
vm.execmax = 30
vm.filemax = 50

and the kernel is derived directly from GENERIC, with a few drivers
commented out.
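
For reference, the vm.*min and vm.*max percentages above can be turned into page counts and set next to the counts from the first vmstat snapshot. This is only rough arithmetic: it assumes the percentages are taken relative to "pages managed", and the kernel may well use a different base.

```python
PAGE_SIZE = 4096        # "bytes per page" from vmstat
PAGES_MANAGED = 31335   # "pages managed" from vmstat (assumed base)

# (min, max) percentages from the vm.* sysctl values above.
limits = {
    "anon": (10, 80),
    "exec": (5, 30),
    "file": (10, 50),
}

# Page counts from the first vmstat snapshot.
observed = {
    "anon": 12478,   # anonymous pages
    "exec": 1296,    # cached executable pages
    "file": 9605,    # cached file pages
}


def percent_to_pages(percent, total=PAGES_MANAGED):
    """Convert a percentage of managed memory into whole pages."""
    return total * percent // 100


def pages_to_mb(pages):
    """Convert a page count into megabytes."""
    return pages * PAGE_SIZE / (1024 * 1024)


for kind, (low, high) in limits.items():
    share = 100 * observed[kind] / PAGES_MANAGED
    print(f"{kind}: min {percent_to_pages(low)} pages, "
          f"max {percent_to_pages(high)} pages "
          f"({pages_to_mb(percent_to_pages(high)):.1f} MB), "
          f"observed {observed[kind]} pages ({share:.1f}%)")
```

Under that assumption, filemax = 50 corresponds to about 61 MB, the observed file pages sit at roughly 31%, and the executable pages (1296) are below the 5% mark of 1566 pages.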

NetBSD 1.5ZC (OAKLAND) #12: Fri Apr 19 09:24:26 UTC 2002
    root@oakland:/usr/src/sys/arch/i386/compile/OAKLAND
cpu0: AMD Duron (686-class), 800.10 MHz
cpu0: I-cache 64 KB 64b/line 2-way, D-cache 64 KB 64b/line 2-way
cpu0: L2 cache 64 KB 64b/line 16-way
cpu0: features 183f9ff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,SEP,MTRR>
cpu0: features 183f9ff<PGE,MCA,CMOV,FGPAT,PSE36,MMX>
cpu0: features 183f9ff<FXSR>
total memory = 127 MB
avail memory = 114 MB
using 1658 buffers containing 6632 KB of memory
BIOS32 rev. 0 found at 0xfb310
mainbus0 (root)
pci0 at mainbus0 bus 0: configuration mode 1
pci0: i/o space, memory space enabled, rd/line, rd/mult, wr/inv ok
pchb0 at pci0 dev 0 function 0
pchb0: VIA Technologies VT8363 KT133 System Controller (rev. 0x03)
agp0 at pchb0: aperture at 0xd0000000, size 0x10000000
ppb0 at pci0 dev 1 function 0: VIA Technologies VT8363 KT133 PCI to AGP Bridge (rev. 0x00)
pci1 at ppb0 bus 1
pci1: i/o space, memory space enabled
vga1 at pci1 dev 0 function 0: ATI Technologies Rage XL (AGP) (rev. 0x65)
wsdisplay0 at vga1 kbdmux 1: console (80x25, vt100 emulation)
wsmux1: connecting to wsdisplay0
wsdisplay0: screen 1-7 added (80x25, vt100 emulation)
pcib0 at pci0 dev 7 function 0
pcib0: VIA Technologies VT82C686A (Apollo KX133) PCI-ISA Bridge (rev. 0x40)
pciide0 at pci0 dev 7 function 1: VIA Technologies VT82C686A (Apollo KX133) ATA100 controller
pciide0: bus-master DMA support present
pciide0: primary channel configured to compatibility mode
wd0 at pciide0 channel 0 drive 0: <IBM-DJNA-371350>
wd0: drive supports 16-sector PIO transfers, LBA addressing
wd0: 12949 MB, 16383 cyl, 16 head, 63 sec, 512 bytes/sect x 26520480 sectors
wd0: 32-bit data port
wd0: drive supports PIO mode 4, DMA mode 2, Ultra-DMA mode 4 (Ultra/66)
pciide0: primary channel interrupting at irq 14
wd0(pciide0:0:0): using PIO mode 4, Ultra-DMA mode 4 (Ultra/66) (using DMA data transfers)
pciide0: secondary channel configured to compatibility mode
atapibus0 at pciide0 channel 1: 2 targets
cd0 at atapibus0 drive 0: <CRD-8482B, , 1.05> type 5 cdrom removable
cd0: 32-bit data port
cd0: drive supports PIO mode 4, DMA mode 2, Ultra-DMA mode 2 (Ultra/33)
pciide0: secondary channel interrupting at irq 15
cd0(pciide0:1:0): using PIO mode 4, Ultra-DMA mode 2 (Ultra/33) (using DMA data transfers)
uhci0 at pci0 dev 7 function 2: VIA Technologies VT83C572 USB Controller (rev. 0x16)
uhci0: interrupting at irq 9
usb0 at uhci0: USB revision 1.0 
uhub0 at usb0
uhub0: VIA Technologie UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub0: 2 ports with 2 removable, self powered
uhci1 at pci0 dev 7 function 3: VIA Technologies VT83C572 USB Controller (rev. 0x16)
uhci1: interrupting at irq 9 
usb1 at uhci1: USB revision 1.0
uhub1 at usb1
uhub1: VIA Technologie UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub1: 2 ports with 2 removable, self powered
pchb1 at pci0 dev 7 function 4
pchb1: VIA Technologies VT82C686A SMBus Controller (rev. 0x40)
auvia0 at pci0 dev 7 function 5: VIA VT82C686A AC'97 Audio (rev 0x50)
auvia0: interrupting at irq 5
auvia0: ICE17 codec; headphone, 18 bit DAC, 18 bit ADC, Unknown 3D
audio0 at auvia0: full duplex, mmap, independent
ex0 at pci0 dev 18 function 0: 3Com 3c905C-TX 10/100 Ethernet with mngmt (rev. 0x74)
ex0: interrupting at irq 11
ex0: MAC address 00:01:02:db:4f:a5
ukphy0 at ex0 phy 24: Generic IEEE 802.3u media interface
ukphy0: Broadcom 3c905C internal PHY (OUI 0x000818, model 0x0017), rev. 6
ukphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
isa0 at pcib0
com0 at isa0 port 0x3f8-0x3ff irq 4: ns16550a, working fifo
com1 at isa0 port 0x2f8-0x2ff irq 3: ns16550a, working fifo
pckbc0 at isa0 port 0x60-0x64
pckbd0 at pckbc0 (kbd slot) 
pckbc0: using irq 1 for kbd slot
wskbd0 at pckbd0: console keyboard, using wsdisplay0 
pms0 at pckbc0 (aux slot)
pckbc0: using irq 12 for aux slot
wsmouse0 at pms0 mux 0
lpt0 at isa0 port 0x378-0x37b irq 7
pcppi0 at isa0 port 0x61
midi0 at pcppi0: PC speaker
sysbeep0 at pcppi0
isapnp0 at isa0 port 0x279: ISA Plug 'n Play device support
npx0 at isa0 port 0xf0-0xff: using exception 16
fdc0 at isa0 port 0x3f0-0x3f7 irq 6 drq 2
fd0 at fdc0 drive 0: 1.44MB, 80 cyl, 2 head, 18 sec
isapnp0: no ISA Plug 'n Play devices found
biomask e745 netmask ef45 ttymask ffc7
Kernelized RAIDframe activated
IPsec: Initialized Security Association Processing.
boot device: wd0
root on wd0a dumps on wd0b
root file system type: ffs
IP Filter: v3.4.25 initialized.  Default = pass all, Logging = enabled

