NetBSD-Bugs archive
Re: kern/41974: panic in cpu_in_cksum / likely NFS issue
The following reply was made to PR kern/41974; it has been noted by GNATS.
From: "Greg A. Woods" <woods%planix.ca@localhost>
To: NetBSD GNATS <gnats-bugs%NetBSD.org@localhost>,
NetBSD GNATS Administrator <gnats-admin%NetBSD.org@localhost>
Cc:
Subject: Re: kern/41974: panic in cpu_in_cksum / likely NFS issue
Date: Mon, 23 Feb 2015 18:53:18 -0800
Some more info, possibly useful....
I recently, and finally, switched one of my servers from i386 to amd64
and suddenly I get these same cpu_in_cksum uvm_fault panics almost any
time I try to write (i.e. copy a large file) to an NFS mount point. Not
with every write, but it doesn't seem to take very many tries to
reproduce.
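A minimal sketch of a write loop of the kind described above, for anyone trying to reproduce this. It is not the exact procedure used here: the function name nfs_write_test, the test file name, and the mount point in the usage line are all hypothetical, and zero-filled data from /dev/zero is assumed to be good enough to trigger the fault.

```shell
#!/bin/sh
# Hypothetical reproduction helper (not from the original report):
# repeatedly write a large file into a directory, assumed to be an
# NFS mount point, then remove it again.
# Usage: nfs_write_test DIR SIZE_MB TRIES
nfs_write_test() {
    dir=$1
    size_mb=$2
    tries=$3
    i=1
    while [ "$i" -le "$tries" ]; do
        # bs is given in bytes (1 MiB) so that both BSD and GNU dd accept it.
        dd if=/dev/zero of="$dir/cksum-test.$$" bs=1048576 \
            count="$size_mb" 2>/dev/null || return 1
        sync
        rm -f "$dir/cksum-test.$$"
        echo "write $i of $tries completed"
        i=$((i + 1))
    done
}
```

For example, "nfs_write_test /mnt/nfs 1024 20" would attempt twenty 1 GB writes, where /mnt/nfs stands in for whatever NFS mount is under test.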
I never saw this problem with the i386 kernel.
Both the before (i386) and after (amd64) systems were built from the
same source tree, which is on the very tip of the netbsd-5 branch.
These are running bare-metal on a Dell PE2950 (2x8-core, 32GB RAM).
It doesn't make any difference whether hardware assisted check-summing
capabilities are enabled in the ethernet interface or not. Initial
panics were observed with caps_enabled=0, but panics have continued with
the following config:
$ /sbin/ifconfig bnx1
bnx1: flags=8b43<UP,BROADCAST,RUNNING,PROMISC,ALLMULTI,SIMPLEX,MULTICAST> mtu 1500
        capabilities=3f00<IP4CSUM_Rx,IP4CSUM_Tx,TCP4CSUM_Rx,TCP4CSUM_Tx,UDP4CSUM_Rx,UDP4CSUM_Tx>
        caps_enabled=3f00<IP4CSUM_Rx,IP4CSUM_Tx,TCP4CSUM_Rx,TCP4CSUM_Tx,UDP4CSUM_Rx,UDP4CSUM_Tx>
        address: 00:1d:09:35:3c:09
        media: Ethernet autoselect (1000baseT full-duplex)
        status: active
        inet 10.0.1.129 netmask 0xffffff00 broadcast 10.0.1.255
There's no trouble reading from remote NFS servers -- only writing to
them as an NFS client, and perhaps only with larger files/writes. I've
done several full builds, and a bunch of pkgsrc builds, with sources on
the same NFS server which fails when written to, and I've never had any
problem with the read-only access to src and pkgsrc. Manual tests with
'dd' reading large files with large reads work A-OK as well
(i.e. reading with the amd64 kernel as a client, or reading from the
other machine with the amd64 kernel as a server).
I.e.: note that the amd64 kernel happily serves NFS without
encountering this error.
Assuming the new PE2950 that arrived today is in working order then soon
I should be able to test if this happens in a Xen domU, and with
NetBSD-current.
One other possibly interesting point: The server in this case has been
an older PE2650 running NetBSD 4.0_STABLE, and it has a weird "tick" in
its RAID controller and/or driver (see PR# kern/35769), which means it
doesn't always respond to NFS requests in the most timely
manner. I.e. perhaps this bug is more easily tickled when the NFS
server is slow, and/or the network connection is poor, or similar.
Perhaps I will try using an NFS mount of my iMac; and soon I should also
be able to cross-mount the PE2950s for testing as well (especially if
the bug is reproducible in a Xen kernel).
-- 
Greg A. Woods
Planix, Inc.
<woods%planix.com@localhost> +1 250 762-7675 http://www.planix.com/