NetBSD-Bugs archive

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index][Old Index]

port-sparc64/51647: deadlock in NFS client with UDP (IPv4) on sparc64



>Number:         51647
>Category:       port-sparc64
>Synopsis:       deadlock in NFS client with UDP (IPv4) on sparc64
>Confidential:   no
>Severity:       serious
>Priority:       high
>Responsible:    port-sparc64-maintainer
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Wed Nov 23 15:15:00 +0000 2016
>Originator:     Rin Okuyama
>Release:        7.99.42
>Organization:
Faculty of Science and Technology, Keio University
>Environment:
NetBSD sb100 7.99.42 NetBSD 7.99.42 (BOOTP.UP) #0: Wed Nov 23 23:29:28 JST 2016  rin@xxx:xxx sparc64
>Description:
Deadlock occurs in NFS client with UDP (IPv4) on Sun Blade 100 under
heavy load. For example, boot the system with NFS root, and execute the
following commands:

  # cd /usr/pkgsrc/lang/perl5 && make clean && make
  ===> Cleaning for perl-5.24.0
  ...
  ===> Extracting for perl-5.24.0

Then tar freezes with the message below appearing in dmesg:

  nfs server cubietruck:/exports/sb100: not responding

After that any operation to rootfs is blocked.

This problem is completely reproducible. It occurs both with on-board
(gem) and PCI (re) ethernet adapters. If I specify LOCKDEBUG kernel
option, or mount rootfs with TCP (IPv4), it seems to work fine (at
least, pkgsrc/lang/perl5 builds successfully).

The NFS server is hosted by NetBSD/evbearmv7hf-eb on Cubietruck. Other
clients, NetBSD/alpha, evbppc, and landisk work with no problem.

A similar problem was reported for Sun Blade 150:

  http://mail-index.netbsd.org/port-sparc64/2016/08/24/msg002569.html

I attached the kernel configuration file and dmesg below.

====
include         "arch/sparc64/conf/GENERIC.UP"
makeoptions     DEBUG="-g"
options         DEBUG
#options        LOCKDEBUG
options         NFS_BOOT_BOOTP
no options      NFS_BOOT_BOOTPARAM
no options      NFS_BOOT_DHCP
config          nfsnetbsd root on re0 type nfs dumps on wd0b
#config         nfsnetbsd root on gem0 type nfs dumps on wd0b
====
Copyright (c) 1996, 1997, 1998, 1999, 2000, 2001, 2002, 2003, 2004, 2005,
    2006, 2007, 2008, 2009, 2010, 2011, 2012, 2013, 2014, 2015, 2016
    The NetBSD Foundation, Inc.  All rights reserved.
Copyright (c) 1982, 1986, 1989, 1991, 1993
    The Regents of the University of California.  All rights reserved.

NetBSD 7.99.42 (BOOTP.UP) #0: Wed Nov 23 23:29:28 JST 2016
        rin@xxx:xxx
total memory = 1536 MB
avail memory = 1491 MB
mainbus0 (root): SUNW,Sun-Blade-100 (Sun Blade 100): hostid xxxxxxxx
cpu0 at mainbus0: SUNW,UltraSPARC-IIe @ 502 MHz, CPU id 0
cpu0: manuf 17, impl 13, mask 14
cpu0: system tick frequency 5.555 MHz
cpu0: 16K instruction (32 b/l), 16K data (32 b/l), 256K external (64 b/l)
psycho0 at mainbus0
psycho0: pci108e,a001: impl 0, version 0: ign 7c0 bus range 0 to 1; PCI bus 0
extent `psycho mem' (0x0 - 0xffffffff), flags = 0x0
     0x0 - 0x427fff
     0x440000 - 0x45ffff
     0x800000 - 0x13fffff
     0x2000000 - 0x30fffff
     0x4000000 - 0x4ffffff
     0x60000000 - 0x7fffffff
     0xc0000000 - 0xdfffffff
     0xf0000000 - 0xffffffff
extent `psycho io' (0x0 - 0xffffff), flags = 0x0
     0x0 - 0xa2f
     0xb00 - 0xbff
     0x1000 - 0x1fff
     0x10000 - 0xffffff
memory range: 000001ff 00000000
pci0 at psycho0
ebus0 at pci0 dev 12 function 0
ebus0: Sun Microsystems PCIO Ebus2 (US III), revision 0x01
flashprom at ebus0 addr 0-fffff not configured
clock0 at ebus0 addr 0-1fff: mk48t59
ebus_attach: idprom: incomplete
ebus1 at pci0 dev 7 function 0
ebus1: Acer Labs M1533 PCI-ISA Bridge, revision 0x00
dma at ebus1 addr 0-ffff ipl 2a not configured
power at ebus1 addr 800-82f ipl 20 not configured
com0 at ebus1 addr 3f8-3ff ipl 2b: ns16550a, working fifo
com0: console
com1 at ebus1 addr 2e8-2ef ipl 2b: ns16550a, working fifo
gem0 at pci0 dev 12 function 1: Sun Microsystems ERI Ethernet (rev. 0x01)
gem0: interrupting at ivec 3006
ukphy0 at gem0 phy 1: OUI 0x0008bb, model 0x0002, rev. 1
ukphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
gem0: Ethernet address xx:xx:xx:xx:xx:xx, 2KB RX fifo, 2KB TX fifo
fwohci0 at pci0 dev 12 function 2: Sun Microsystems FireWire Controller (rev. 0x01)
fwohci0: interrupting at ivec 1c
fwohci0: OHCI version 1.0 (ROM=0)
fwohci0: No. of Isochronous channels is 4.
fwohci0: EUI64 00:03:ba:ff:fe:0e:63:45
fwohci0: Phy 1394a available S400, 2 ports.
fwohci0: Link S400, max_rec 2048 bytes.
ieee1394if0 at fwohci0: IEEE1394 bus
fwip0 at ieee1394if0: IP over IEEE1394
fwohci0: Initiate bus reset
ohci0 at pci0 dev 12 function 3: Sun Microsystems USB Controller (rev. 0x01)
csr: 02800046
ohci0: interrupting at ivec 24
ohci0: OHCI version 1.0, legacy support
usb0 at ohci0: USB revision 1.0
alipm0 at pci0 dev 3 function 0: 223KHz clock
iic0 at alipm0: I2C bus
admtemp0 at iic0 addr 0x18: MAX1617A temperature sensor: id. 0x41, rev. 0x3c
card-reader at iic0 addr 0x20 not configured
spdmem0 at iic0 addr 0x50
spdmem0: SDRAM, data ECC, 512MB, 133MHz (PC-133)
spdmem1 at iic0 addr 0x51
spdmem1: SDRAM, data ECC, 512MB, 133MHz (PC-133)
spdmem2 at iic0 addr 0x52
spdmem2: SDRAM, data ECC, 512MB, 133MHz (PC-133)
autri0 at pci0 dev 8 function 0: Acer Labs M5451 AC-Link Controller Audio Device (rev. 0x01)
autri0: interrupting at ivec 23
autri0: ac97: Analog Devices AD1881A codec; headphone, Analog Devices Phat Stereo
audio0 at autri0: full duplex, playback, capture, mmap, independent
midi0 at autri0: 4DWAVE MIDI UART
aceride0 at pci0 dev 13 function 0: Acer Labs M5229 UDMA IDE Controller (rev. 0xc3)
aceride0: using ivec 180c for native-PCI interrupt
atabus0 at aceride0 channel 0
atabus1 at aceride0 channel 1
ppb0 at pci0 dev 5 function 0: Digital Equipment DC21152 PCI-PCI Bridge (rev. 0x03)
pci1 at ppb0 bus 1
re0 at pci1 dev 1 function 0: RealTek 8169/8110 Gigabit Ethernet (rev. 0x10)
re0: interrupting at ivec a
re0: Ethernet address xx:xx:xx:xx:xx:xx
rgephy0 at re0 phy 7: RTL8169S/8110S/8211 1000BASE-T media interface, rev. 3
rgephy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 1000baseT-FDX, auto
satalink0 at pci1 dev 2 function 0: Silicon Image SATALink 3512 (rev. 0x01)
satalink0: using ivec 15 for native-PCI interrupt
atabus2 at satalink0 channel 0
atabus3 at satalink0 channel 1
machfb0 at pci0 dev 19 function 0membase: 1ff00000000 fbpa: 0
: ATI Technologies Rage XL (rev. 0x27)
machfb0: using MMIO aperture
machfb0: 16 MB aperture at 0x04000000, 4 KB registers at 0x00426000
machfb0: 128 KB ROM at 0x00440000
machfb0: ref_freq=29.498MHz
machfb0: 8192 KB SDRAM 114.765 MHz, maximum RAMDAC clock 230 MHz
machfb0: initializing the DSP
machfb0: initial resolution 1152x864 at 8 bpp
machfb0: initializing the DSP
wsdisplay1 at machfb0 kbdmux 1
direct rendering for machfb0 unsupported
pcons at mainbus0 not configured
fwohci0: BUS reset
fwohci0: node_id=0xc800ffc0, gen=1, CYCLEMASTER mode
ieee1394if0: 1 nodes, maxhop <= 0 cable IRM irm(0) (me)
ieee1394if0: bus manager 0
No counter-timer -- using STICK at 5.555MHz as system clock.
satalink0: port 1: device present, speed: 1.5Gb/s
wd0 at atabus3 drive 0
uhub0 at usb0: Sun Microsystems OHCI root hub, class 9/0, rev 1.00/1.00, addr 1
wd0: <INTEL SSDSC2KW120H6>
wd0: 111 GB, 232581 cyl, 16 head, 63 sec, 512 bytes/sect x 234441648 sectors
WARNING: 1 error while detecting hardware; check system log.
root on re0 dumps on wd0b
nfs_boot: trying BOOTP
re0: link state DOWN (was UNKNOWN)
re0: link state UP (was DOWN)
nfs_boot: BOOTP next-server: 192.168.10.128
nfs_boot: my_name=sb100
nfs_boot: my_addr=192.168.10.132
nfs_boot: my_mask=255.255.255.0
nfs_boot: gateway=192.168.10.1
root on cubietruck:/exports/sb100
root time: 0x5835a827
kern.module.path=/stand/sparc64/7.99.42/modules
init: copying out path `/sbin/init' 11
>How-To-Repeat:
Described above.
>Fix:
N/A. No additional information is provided by DIAGNOSTIC, DEBUG and
LOCKDEBUG options.



Home | Main Index | Thread Index | Old Index