Subject: kern/31839: Hifn driver hangs and resets with 7955 in Soekris net4501
To: None <kern-bug-people@netbsd.org, gnats-admin@netbsd.org,>
From: None <tls@netbsd.org>
List: netbsd-bugs
Date: 10/16/2005 22:22:00
>Number:         31839
>Category:       kern
>Synopsis:       Hifn driver hangs and resets with 7955 in Soekris net4501
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    kern-bug-people
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Sun Oct 16 22:22:00 +0000 2005
>Originator:     tls@netbsd.org
>Release:        NetBSD 3.99.9 as of 2005-10-16
>Organization:
The NetBSD Foundation, Inc.
>Environment:
System: NetBSD hotpoint.hvg.tjls.com 3.99.9 NetBSD 3.99.9 (HOTPOINT) #11: Sun Oct 16 17:52:17 EDT 2005  tls@enola-gay:/shares/home/tls/current/src/sys/arch/i386/compile/obj.i386/HOTPOINT i386
Architecture: i386
Machine: i386
>Description:

Though it now appears to work correctly most of the time on most
hardware, the 'hifn' driver still has serious problems in some
configurations.  In particular, the driver works correctly with a
Soekris VPN1401 card (Hifn 7955 on a PCI card) in my desktop system,
but fails spectacularly with a Soekris VPN1411 card (Hifn 7955 on a
MiniPCI card) in my Soekris net4501 router.  Any significant use of the
card's encryption functionality causes the card to hang and fail to
reset, leaving any OpenSSL-using process dead in the water (since our
OpenSSL uses /dev/crypto by default).

This problem was first observed in combination with FAST_IPSEC, but the
failure mode in that configuration is more complicated, so it is simpler
to debug using a kernel with the hifn driver and "pseudo-device crypto"
but not FAST_IPSEC.

>How-To-Repeat:

On a Soekris net4501 with Soekris VPN1411 card, boot a kernel with the
hifn driver and "pseudo-device crypto".  Be sure /dev/crypto exists, and
do this:

dd if=/dev/zero bs=1m count=100 | openssl des-ede3-cbc -out /dev/null

or this:

dd if=/dev/zero bs=1m count=100 | openssl aes-128-cbc -out /dev/null

The openssl process will quickly hang.  You will probably also see
messages like these in the dmesg output:

	hifn0: overrun ffffffff
	hifn0: abort, resetting.
	hifn0: proc unit did not reset

These messages, however, do not always appear -- in particular, the
"overrun" message is frequently absent.
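
Since the dmesg messages are unreliable, the hang itself is the better
thing to test for.  A rough sketch of a watchdog wrapper follows.  The
function name run_with_timeout and the limit used in the usage comment
are my own choices for illustration, not anything taken from the driver
or from OpenSSL:

```shell
#!/bin/sh
# Hypothetical helper: run a command in the background and report
# whether it finished within LIMIT seconds.
# usage: run_with_timeout LIMIT COMMAND [ARGS...]
run_with_timeout() {
    limit=$1
    shift
    "$@" &
    pid=$!
    elapsed=0
    # Poll once per second while the process still exists.
    while kill -0 "$pid" 2>/dev/null; do
        if [ "$elapsed" -ge "$limit" ]; then
            # A process stuck inside the kernel may ignore even this
            # signal, so do not wait for it afterwards.
            kill -9 "$pid" 2>/dev/null
            echo "HUNG after ${limit}s: $*"
            return 1
        fi
        sleep 1
        elapsed=$((elapsed + 1))
    done
    wait "$pid" 2>/dev/null
    echo "completed: $*"
    return 0
}

# Example (assumed 60 second limit), writing the test data to a file
# first so that openssl is the process being watched:
#   dd if=/dev/zero of=/tmp/zeros bs=1m count=100
#   run_with_timeout 60 openssl aes-128-cbc -in /tmp/zeros -out /dev/null
#   dmesg | grep hifn0
```

On the desktop system with the VPN1401 this should print "completed";
on the net4501 with the VPN1411 I would expect "HUNG".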

>Fix:
No idea.  This problem is also described on the OpenBSD mailing lists at
http://archives.neohapsis.com/archives/openbsd/2004-08/2054.html but if
OpenBSD has any fix, it would seem to be the one in revision 1.149 of
their driver, and we have that change too.