NetBSD-Bugs archive


kern/41140: 5-RC3 msk driver possibly broken



>Number:         41140
>Category:       kern
>Synopsis:       ssh/bacula disconnect with errors when msk iface is used
>Confidential:   no
>Severity:       critical
>Priority:       medium
>Responsible:    kern-bug-people
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Sat Apr 04 16:00:01 +0000 2009
>Originator:     Frank Kardel
>Release:        NetBSD 5.0_RC3-090330
>Organization:
>Environment:
NetBSD gaia.acrys.com 5.0_RC3 NetBSD 5.0_RC3 (GAIA) #2: Mon Mar 30 10:40:31 CEST 2009  kardel%gaia.acrys.com@localhost:/usr/obj/sys/arch/i386/compile/GAIA i386
Architecture: i386
Machine: i386
>Description:
        High-volume ssh sessions (e.g. when using rsync) break with:
                - MAC corruption
                - bad packet length (with a ridiculous length value in the millions)

        Bacula backups fail with:
                03-Apr 02:05 Orcus-sd JobId 15806: Fatal error: bsock.c:415 Packet size too big from "client:x.y.z.u:36643. Terminating connection.

        Symptoms seem similar to PR #31178.
        The failure also seems more likely the fewer mbufs are available.
        It happens at a 100 Mb/s link rate.
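
        As an illustration of why damaged data tends to show up as an
        absurd length, here is a minimal Python sketch. It assumes a
        generic stream framed with a 4-byte big-endian length prefix (SSH's
        binary packet protocol uses such a field; the frame layout and the
        helper name read_length are hypothetical and not taken from ssh or
        Bacula). It does not model what the msk driver actually does to the
        data; it only shows that a single flipped bit in the length field is
        enough to turn a normal length into one in the millions.

```python
import struct


def read_length(frame: bytes) -> int:
    """Return the 4-byte big-endian length prefix of a frame."""
    return struct.unpack("!I", frame[:4])[0]


# A well-formed frame: length prefix followed by 1460 payload bytes.
payload = b"x" * 1460
frame = struct.pack("!I", len(payload)) + payload

# Simulate corruption: flip one bit in the second byte of the length field.
corrupted = bytearray(frame)
corrupted[1] ^= 0x40

print(read_length(frame))             # 1460
print(read_length(bytes(corrupted)))  # 4195764
```

        A receiver that sanity-checks the length would reject the second
        frame and drop the connection, which matches the kind of error
        reported above.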

        Bacula seems fine when using the elinkxl (ex*) driver: with ex* the
        full backup went through, whereas with msk* no full backup finished.

        dmesg snippets:
        mainbus0 (root)
        cpu0 at mainbus0 apid 0: Intel 686-class, 2831MHz, id 0x10677
        cpu0: Enhanced SpeedStep (1244 mV) 800 MHz
        cpu0: Enhanced SpeedStep frequencies available (MHz): 7200 6400 5600 4800 4000 3100 2300 1500 700
        cpu1 at mainbus0 apid 3: Intel 686-class, 2831MHz, id 0x10677
        cpu2 at mainbus0 apid 1: Intel 686-class, 2831MHz, id 0x10677
        cpu3 at mainbus0 apid 2: Intel 686-class, 2831MHz, id 0x10677
        ioapic0 at mainbus0 apid 4: pa 0xfec00000, version 20, 24 pins
        acpi0 at mainbus0: Intel ACPICA 20080321
        acpi0: X/RSDT: OemId <IntelR,AWRDACPI,42302e31>, AslId <AWRD,00000000>
        acpi0: SCI interrupting at int 9
        acpi0: fixed-feature power button present
        ...
        mskc0 at pci3 dev 0 function 0mskc0: interrupt moderation is 0 us
        , Yukon-2 EC rev. A3 (0x2): ioapic0 pin 19
        msk0 at mskc0 port A: Ethernet address 00:xx:xx:xx:xx:xx
        makphy0 at msk0 phy 0: Marvell 88E1111 Gigabit PHY, rev. 2
        makphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 1000baseT-FDX, auto
        mskc1 at pci5 dev 0 function 0mskc1: interrupt moderation is 0 us
        , Yukon-2 EC rev. A3 (0x2): ioapic0 pin 17
        msk1 at mskc1 port A: Ethernet address 00:xx:xx:xx:xx:xx
        makphy1 at msk1 phy 0: Marvell 88E1111 Gigabit PHY, rev. 2
        makphy1: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 1000baseT-FDX, auto
        ...


>How-To-Repeat:
        Run rsync over ssh, or Bacula, for a high-volume data transfer over
        an msk* interface on a 4-CPU (Q9550) machine running 5.0_RC3.
        Connections break when protocol sanity checks fail (MAC, length
        issues).
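
        To take ssh and Bacula out of the picture, a plain TCP transfer
        with its own integrity check can be used. The following Python
        sketch is one possible way to do that, not a tool used for this
        report: all names (HEADER, recv_exact, send_blocks, receive_blocks,
        loopback_selftest) and the frame layout (length plus SHA-256
        digest, then payload) are made up for illustration. The sender
        streams random blocks; the receiver stops at the first implausible
        length, truncation, or digest mismatch.

```python
import hashlib
import os
import socket
import struct
import threading

# Hypothetical frame header: payload length, then SHA-256 digest of payload.
HEADER = struct.Struct("!I32s")


def recv_exact(conn, count):
    """Read exactly `count` bytes, or return None if the peer closed first."""
    chunks = []
    while count > 0:
        chunk = conn.recv(min(count, 65536))
        if not chunk:
            return None
        chunks.append(chunk)
        count -= len(chunk)
    return b"".join(chunks)


def send_blocks(host, port, blocks, block_size):
    """Send `blocks` random payloads, each preceded by length and digest."""
    with socket.create_connection((host, port)) as conn:
        for _ in range(blocks):
            payload = os.urandom(block_size)
            digest = hashlib.sha256(payload).digest()
            conn.sendall(HEADER.pack(len(payload), digest))
            conn.sendall(payload)


def receive_blocks(listener, max_size=1 << 20):
    """Accept one connection; return (good_blocks, first_error_or_None)."""
    conn, _ = listener.accept()
    good = 0
    with conn:
        while True:
            header = recv_exact(conn, HEADER.size)
            if header is None:
                return good, None  # peer closed the connection
            length, digest = HEADER.unpack(header)
            if length > max_size:
                return good, f"implausible length {length} after block {good}"
            payload = recv_exact(conn, length)
            if payload is None:
                return good, f"stream truncated in block {good}"
            if hashlib.sha256(payload).digest() != digest:
                return good, f"digest mismatch in block {good}"
            good += 1


def loopback_selftest(blocks=64, block_size=16384):
    """Run sender and receiver over 127.0.0.1 to check the harness itself."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))
    listener.listen(1)
    port = listener.getsockname()[1]
    sender = threading.Thread(
        target=send_blocks, args=("127.0.0.1", port, blocks, block_size)
    )
    sender.start()
    try:
        return receive_blocks(listener)
    finally:
        sender.join()
        listener.close()
```

        For an actual test, receive_blocks would run on one host with a
        listening socket bound to a reachable address and send_blocks on
        the other, so that the traffic crosses the msk* interface; the
        loopback self-test only confirms that the harness reports no errors
        on a clean path.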
>Fix:
        Workaround: ignore the two built-in msk interfaces and fall back to
        another NIC, e.g. ex*.


