Subject: kern/11983: under certain conditions, mkdir(2) takes too long.
To: None <gnats-bugs@gnats.netbsd.org>
From: Herb Peyerl <hpeyerl@beer.org>
List: netbsd-bugs
Date: 01/17/2001 13:12:18
>Number: 11983
>Category: kern
>Synopsis: under certain conditions, mkdir(2) takes too long.
>Confidential: yes
>Severity: critical
>Priority: high
>Responsible: kern-bug-people
>State: open
>Class: sw-bug
>Submitter-Id: net
>Arrival-Date: Wed Jan 17 13:12:00 PST 2001
>Closed-Date:
>Last-Modified:
>Originator: Herb Peyerl
>Release: NetBSD-1.5
>Organization:
>Environment:
System: NetBSD nlager 1.5 NetBSD 1.5 (LAGER) #4: Tue Jan 16 05:42:44 MST 2001 hpeyerl@nlager:/usr/src/sys/arch/i386/compile/LAGER i386
>Description:
On my Abit KA7 (Athlon 800, 128MB RAM) with two 45GB IBM disks mirrored
with RAID 1 into one big filesystem, a single mkdir(2) takes 17 seconds
to complete and performs more than 3500 I/Os. This was discussed on
current-users during the week of 01/14/2001.
Here are some particulars:
wd0 and wd1 disklabels:
# /dev/rwd0d:
type: ESDI
disk: IBM-DTLA-307045
label: fictitious
flags:
bytes/sector: 512
sectors/track: 63
tracks/cylinder: 16
sectors/cylinder: 1008
cylinders: 16383
total sectors: 90069840
rpm: 3600
interleave: 1
trackskew: 0
cylinderskew: 0
headswitch: 0 # microseconds
track-to-track seek: 0 # microseconds
drivedata: 0
8 partitions:
# size offset fstype [fsize bsize cpg]
a: 40320 0 4.2BSD 1024 8192 16 # (Cyl. 0 - 39)
d: 90069840 0 unused 0 0 # (Cyl. 0 - 89354)
e: 89529552 40320 RAID # (Cyl. 40 - 88858)
h: 499968 89569872 swap # (Cyl. 88859 - 89354)
raid1 disklabel:
# /dev/rraid1d:
type: RAID
disk: raid
label: default label
flags:
bytes/sector: 512
sectors/track: 32
tracks/cylinder: 1
sectors/cylinder: 32
cylinders: 2797796
total sectors: 89529472
rpm: 3600
interleave: 1
trackskew: 0
cylinderskew: 0
headswitch: 0 # microseconds
track-to-track seek: 0 # microseconds
drivedata: 0
4 partitions:
# size offset fstype [fsize bsize cpg]
a: 89529472 0 4.2BSD 1024 8192 256 # (Cyl. 0 - 2797795)
b: 49968 89029504 swap # (Cyl. 2782172 - 2783733*)
d: 89529472 0 unused 0 0 # (Cyl. 0 - 2797795)
/etc/raid1.conf:
START array
1 1 0
START disks
/dev/wd0e
#/dev/wd1e
START layout
32 1 1 1
START queue
fifo 100
dmesg:
NetBSD 1.5 (LAGER) #4: Tue Jan 16 05:42:44 MST 2001
hpeyerl@nlager:/usr/src/sys/arch/i386/compile/LAGER
cpu0: AMD K7 (Athlon) (686-class)
total memory = 127 MB
avail memory = 112 MB
using 1658 buffers containing 6632 KB of memory
BIOS32 rev. 0 found at 0xfb470
mainbus0 (root)
pci0 at mainbus0 bus 0: configuration mode 1
pci0: i/o space, memory space enabled
pchb0 at pci0 dev 0 function 0
pchb0: VIA Technologies VT8371 (Apollo KX133) Host Bridge (rev. 0x02)
ppb0 at pci0 dev 1 function 0: VIA Technologies VT8371 (Apollo KX133) PCI-PCI Bridge (rev. 0x00)
pci1 at ppb0 bus 1
pci1: i/o space, memory space enabled
pcib0 at pci0 dev 7 function 0
pcib0: VIA Technologies VT82C686A (Apollo KX133) PCI-ISA Bridge (rev. 0x22)
pciide0 at pci0 dev 7 function 1: VIA Tech VT82C586A IDE Controller (rev. 0x10)
pciide0: bus-master DMA support present
pciide0: primary channel configured to compatibility mode
wd0 at pciide0 channel 0 drive 0: <IBM-DTLA-307045>
wd0: drive supports 16-sector pio transfers, lba addressing
wd0: 43979 MB, 16383 cyl, 16 head, 63 sec, 512 bytes/sect x 90069840 sectors
wd0: 32-bit data port
wd0: drive supports PIO mode 4, DMA mode 2, Ultra-DMA mode 5
wd1 at pciide0 channel 0 drive 1: <QUANTUM FIREBALLP LM10.2>
wd1: drive supports 16-sector pio transfers, lba addressing
wd1: 9797 MB, 16383 cyl, 16 head, 63 sec, 512 bytes/sect x 20066251 sectors
wd1: 32-bit data port
wd1: drive supports PIO mode 4, DMA mode 2, Ultra-DMA mode 4
pciide0: primary channel interrupting at irq 14
wd0(pciide0:0:0): using PIO mode 4, Ultra-DMA mode 2 (using DMA data transfers)
wd1(pciide0:0:1): using PIO mode 4, Ultra-DMA mode 2 (using DMA data transfers)
pciide0: secondary channel configured to compatibility mode
wd2 at pciide0 channel 1 drive 0: <IBM-DTLA-307045>
wd2: drive supports 16-sector pio transfers, lba addressing
wd2: 43979 MB, 16383 cyl, 16 head, 63 sec, 512 bytes/sect x 90069840 sectors
wd2: 32-bit data port
wd2: drive supports PIO mode 4, DMA mode 2, Ultra-DMA mode 5
pciide0: secondary channel interrupting at irq 15
wd2(pciide0:1:0): using PIO mode 4, Ultra-DMA mode 2 (using DMA data transfers)
uhci0 at pci0 dev 7 function 2: VIA Technologies VT83C572 USB Controller (rev. 0x10)
uhci0: interrupting at irq 11
usb0 at uhci0: USB revision 1.0
uhub0 at usb0
uhub0: VIA Technologie UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub0: 2 ports with 2 removable, self powered
uhci1 at pci0 dev 7 function 3: VIA Technologies VT83C572 USB Controller (rev. 0x10)
uhci1: interrupting at irq 11
usb1 at uhci1: USB revision 1.0
uhub1 at usb1
uhub1: VIA Technologie UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub1: 2 ports with 2 removable, self powered
pchb1 at pci0 dev 7 function 4
pchb1: VIA Technologies VT82C686A SMBus Controller (rev. 0x30)
fxp0 at pci0 dev 9 function 0: Intel i82557 Ethernet, rev 8
fxp0: interrupting at irq 10
fxp0: Ethernet address 00:d0:b7:26:ab:f8, 10/100 Mb/s
inphy0 at fxp0 phy 1: i82555 10/100 media interface, rev. 4
inphy0: 10baseT, 10baT-FDX, 100baseTX, 100baseTX-FDX, auto
ahc1 at pci0 dev 11 function 0
ahc1: interrupting at irq 11
ahc1: aic7880 Wide Channel A, SCSI Id=7, 16/255 SCBs
scsibus0 at ahc1 channel 0: 16 targets, 8 luns per target
isa0 at pcib0
pckbc0 at isa0 port 0x60-0x64
pckbd0 at pckbc0 (kbd slot)
pckbc0: using irq 1 for kbd slot
wskbd0 at pckbd0: console keyboard
lpt0 at isa0 port 0x378-0x37b irq 7
pcdisplay0 at isa0 port 0x3b0-0x3bf iomem 0xb0000-0xb7fff
wsdisplay0 at pcdisplay0: console (80x25, vt100 emulation), using wskbd0
pcppi0 at isa0 port 0x61
midi0 at pcppi0: PC speaker
sysbeep0 at pcppi0
isapnp0 at isa0 port 0x279: ISA Plug 'n Play device support
npx0 at isa0 port 0xf0-0xff: using exception 16
fdc0 at isa0 port 0x3f0-0x3f7 irq 6 drq 2
isapnp0: no ISA Plug 'n Play devices found
biomask fb7d netmask ff7d ttymask ffff
scsibus0: waiting 2 seconds for devices to settle...
ahc1: target 3 using 8bit transfers
ahc1: target 3 using asynchronous transfers
cd0 at scsibus0 target 3 lun 0: <YAMAHA, CDR100, 1.11> SCSI2 4/worm removable
ahc1: target 5 using 8bit transfers
ahc1: target 5 synchronous at 5.0MHz, offset = 0xf
st0 at scsibus0 target 5 lun 0: <SGI, DLT2000, 8519> SCSI2 1/sequential removable
st0: density code 25, variable blocks, write-enabled
Kernelized RAIDframe activated
RAID autoconfigure
Configuring raid1:
RAIDFRAME: protectedSectors is 64
RAIDFRAME: Configure (RAID Level 1): total number of sectors is 89529472 (43715 MB)
RAIDFRAME(RAID Level 1): Using 6 floating recon bufs with no head sep limit
boot device: raid1
root on raid1a dumps on raid1b
root file system type: ffs
>How-To-Repeat:
Unfortunately, while I can reproduce it easily, no one else has reported
being able to.
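The report does not say how the 17 seconds were measured. As a minimal
sketch of one way to time a single mkdir(2) call directly from C, the
following could be used. The helper name time_mkdir and the 0755 mode
are assumptions for illustration and do not come from the report.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/stat.h>
#include <sys/time.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Time one mkdir(2) call on `path` (hypothetical helper).
 * On success, stores the elapsed wall-clock seconds in *elapsed and
 * returns 0.  Returns -1 if mkdir(2) fails; errno is left as set by
 * mkdir(2) and *elapsed is not modified.
 */
static int
time_mkdir(const char *path, double *elapsed)
{
	struct timeval start, end;
	int rv;

	assert(path != NULL && elapsed != NULL);

	gettimeofday(&start, NULL);
	rv = mkdir(path, 0755);		/* mode is an arbitrary choice */
	gettimeofday(&end, NULL);

	if (rv == -1)
		return -1;

	*elapsed = (double)(end.tv_sec - start.tv_sec) +
	    (double)(end.tv_usec - start.tv_usec) / 1e6;
	return 0;
}
```

Calling time_mkdir() with a not-yet-existing path on the affected
filesystem should show whether the delay sits in the system call itself
rather than in the shell or in mkdir(1).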
>Fix:
Bill Sommerfeld suggested the following change to see whether it
improved the situation, and it did:
From: Bill Sommerfeld <sommerfeld@orchard.arlington.ma.us>
Message-Id: <20010116054247.E2A862A4B@orchard.arlington.ma.us>
So, there's one thing which mkdir() does in ffs which is different
from other sorts of file creation: it tries to put the directory in
a different cylinder group from the one its parent lives in.
I think what's going on here is that ffs_dirpref() may be screwing up
and always picking an initial cylinder group with few directories and
lots of free inodes, but no free blocks, so it winds up hunting all
over the disk for free blocks before it finds one for the directory.
I'm willing to bet that the extra level of indirection required for
mirroring is causing the "hunt" for free blocks to no longer fit into
the buffer cache.
So, the core of ffs_dirpref() in sys/ufs/ffs/ffs_alloc.c is:
	for (cg = 0; cg < fs->fs_ncg; cg++)
		if (fs->fs_cs(fs, cg).cs_ndir < minndir &&
		    fs->fs_cs(fs, cg).cs_nifree >= avgifree) {
			mincg = cg;
			minndir = fs->fs_cs(fs, cg).cs_ndir;
		}
maybe it should be something more like:
	for (cg = 0; cg < fs->fs_ncg; cg++)
		if (fs->fs_cs(fs, cg).cs_ndir < minndir &&
		    fs->fs_cs(fs, cg).cs_nbfree > 0 &&
		    fs->fs_cs(fs, cg).cs_nifree >= avgifree) {
			mincg = cg;
			minndir = fs->fs_cs(fs, cg).cs_ndir;
		}
.. but I must admit I'm not an expert on ffs guts..
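To make the difference between the two loops concrete, here is a
standalone sketch that runs the same selection over a plain array
instead of a real struct fs. The names cg_summary, pick_dir_cg and
require_free_blocks are hypothetical. Starting minndir at INT_MAX,
starting mincg at 0, and having the caller supply avgifree are
assumptions of this sketch, since the quoted excerpt does not show how
the kernel initialises them.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Simplified stand-in for the per-cylinder-group summary counters. */
struct cg_summary {
	int cs_ndir;	/* directories in this cylinder group */
	int cs_nbfree;	/* free blocks in this cylinder group */
	int cs_nifree;	/* free inodes in this cylinder group */
};

/*
 * Return the index of the cylinder group with the fewest directories
 * among those with at least `avgifree` free inodes.  If
 * `require_free_blocks` is non-zero, groups with no free blocks are
 * skipped as well (the suggested change).  Returns 0 if no group
 * qualifies.
 */
static int
pick_dir_cg(const struct cg_summary *cs, int ncg, int avgifree,
    int require_free_blocks)
{
	int cg, mincg = 0, minndir = INT_MAX;

	assert(cs != NULL && ncg > 0);

	for (cg = 0; cg < ncg; cg++)
		if (cs[cg].cs_ndir < minndir &&
		    (!require_free_blocks || cs[cg].cs_nbfree > 0) &&
		    cs[cg].cs_nifree >= avgifree) {
			mincg = cg;
			minndir = cs[cg].cs_ndir;
		}
	return mincg;
}
```

With three made-up groups {ndir 5, nbfree 100, nifree 50}, {ndir 1,
nbfree 0, nifree 80} and {ndir 3, nbfree 40, nifree 60} and an avgifree
of 50, the original condition selects group 1, which has no free blocks,
while the modified condition selects group 2.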
>Release-Note:
>Audit-Trail:
>Unformatted: