PR 55714

To: NetBSD User Maillist <netbsd-users%NetBSD.org@localhost>
Subject: PR 55714
From: BERTRAND Joël <joel.bertrand%systella.fr@localhost>
Date: Tue, 20 Oct 2020 09:59:37 +0200

	Hello,

	I have filed a PR concerning the re(4) driver and jumbo frames on an
RTL8169S (PCI, not PCIe) adapter.

	At first, the kernel continually loaded and unloaded the usbverbose
module (!) until it panicked. I have since upgraded my source tree to
build the latest kernel (NetBSD-9), and the kernel now panics with a
different error:


[   481.957605] re0: watchdog timeout
[   494.632228] S2C1: *** Connection Error, status=24, logout=-1, state=3
[   503.975605] re0: watchdog timeout
[   519.481139] S2C1: *** Connection Error, status=18, logout=2, state=6
[   525.663327] S2C1: ccb_timeout: num=1 total=1 disp=5
[   525.663327] S2C1: *** Connection Error, status=24, logout=2, state=6
[   525.683334] S2C1: ccb_timeout: num=1 total=1 disp=5
...
[   531.995569] re0: watchdog timeout
[   532.495745] S2C1: Write failed sock 0xffffa6bf802ca940 (ret: 32, req: 65584, resid: 16234)
[   532.495745] S2C1: *** Connection Error, status=18, logout=-1, state=6
[   534.506456] S2C1: Resend ccb 0xffffd88020ad1d00 (37) - updating CmdSN old 92, new 95
[   534.506456] S2C1: Resend ccb 0xffffd88020ad1b88 (37) - updating CmdSN old 93, new 96
[   534.506456] S2C1: Resend ccb 0xffffd88020ad1a10 (37) - updating CmdSN old 94, new 97
[   534.506456] S2C1: Connection ReCreated successfully - status 0
[   542.009111] re0: watchdog timeout
[   553.023009] re0: watchdog timeout
[   554.533544] S2C1: *** Connection Error, status=24, logout=-1, state=3
[   561.055852] S2C1: *** Connection Error, status=18, logout=2, state=6
...
[  8909.342671] uvm_fault(0xffffffff8151f760, 0xffffba0020ad0000, 1) -> e
[  8909.342671] fatal page fault in supervisor mode
[  8909.342671] trap type 6 code 0 rip 0xffffffff8026aee8 cs 0x8 rflags 0x10286 cr2 0xffffba0020ad0070 ilevel 0 rsp 0xffffba013cc18b90
[  8909.342671] curlwp 0xffff9974adb6e900 pid 0.175 lowest kstack 0xffffba013cc162c0
[  8909.342671] panic: trap
[  8909.342671] cpu5: Begin traceback...
[  8909.342671] vpanic() at netbsd:vpanic+0x160
[  8909.342671] snprintf() at netbsd:snprintf
[  8909.342671] startlwp() at netbsd:startlwp
[  8909.342671] alltraps() at netbsd:alltraps+0xbb
[  8909.342671] dk_start() at netbsd:dk_start+0x102
[  8909.342671] spec_strategy() at netbsd:spec_strategy+0xa7
[  8909.342671] VOP_STRATEGY() at netbsd:VOP_STRATEGY+0x4c
[  8909.352675] dkstart() at netbsd:dkstart+0x184
[  8909.352675] spec_strategy() at netbsd:spec_strategy+0xa7
[  8909.352675] VOP_STRATEGY() at netbsd:VOP_STRATEGY+0x4c
[  8909.352675] wapbl_buffered_write_async() at netbsd:wapbl_buffered_write_async+0x7d
[  8909.352675] wapbl_buffered_write() at netbsd:wapbl_buffered_write+0xdf
[  8909.352675] wapbl_circ_write() at netbsd:wapbl_circ_write+0x103
[  8909.352675] wapbl_flush() at netbsd:wapbl_flush+0x26f
[  8909.352675] ffs_sync() at netbsd:ffs_sync+0x20a
[  8909.362679] VFS_SYNC() at netbsd:VFS_SYNC+0x35
[  8909.362679] sched_sync() at netbsd:sched_sync+0x98
[  8909.362679] cpu5: End traceback...

	With a standard MTU (1500 bytes), the system doesn't panic, but iSCSI
is so slow that it is unusable (more than three days to archive 2 TB).
Indeed, a simple ping to my NAS gives the following statistics:
legendre:[~] > ping 192.168.12.2
PING 192.168.12.2 (192.168.12.2): 56 data bytes
64 bytes from 192.168.12.2: icmp_seq=0 ttl=64 time=0.174106 ms
64 bytes from 192.168.12.2: icmp_seq=1 ttl=64 time=0.214660 ms
64 bytes from 192.168.12.2: icmp_seq=2 ttl=64 time=0.156877 ms
64 bytes from 192.168.12.2: icmp_seq=3 ttl=64 time=0.151308 ms
64 bytes from 192.168.12.2: icmp_seq=4 ttl=64 time=0.148682 ms
64 bytes from 192.168.12.2: icmp_seq=5 ttl=64 time=0.143053 ms
64 bytes from 192.168.12.2: icmp_seq=6 ttl=64 time=0.180400 ms
64 bytes from 192.168.12.2: icmp_seq=7 ttl=64 time=0.170369 ms
64 bytes from 192.168.12.2: icmp_seq=8 ttl=64 time=0.155256 ms
64 bytes from 192.168.12.2: icmp_seq=9 ttl=64 time=0.165401 ms
^C
----192.168.12.2 PING Statistics----
10 packets transmitted, 10 packets received, 0.0% packet loss
round-trip min/avg/max/stddev = 0.143053/0.166011/0.214660/0.020820 ms
legendre:[~] > rpl -is
+++RPL/2 (R) version 4.1.31 (Monday 04/02/2019, 12:08:26 CET)
+++Copyright (C) 1989 to 2018, 2019 BERTRAND Joël

+++This software is free software with no warranty of operation.
+++For more details, use the 'warranty' command.

RPL/2> 0.166011 inv 1448 *

1: 8722.31358163013
RPL/2>

	That is ~9 MB/s between the initiator and the iSCSI target. The only
way to achieve a higher rate is to increase the MTU of this re(4)
interface.
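
	For readers without RPL/2, the same estimate can be reproduced as a
rough sketch in Python. It assumes, as the calculation above appears to,
that exactly one 1448-byte payload is delivered per round trip; 1448 is
presumably the TCP payload that fits in a 1500-byte MTU once IP, TCP and
timestamp-option headers are subtracted. The function name is
hypothetical and only illustrates the arithmetic.

```python
def lockstep_throughput(payload_bytes: float, rtt_ms: float) -> float:
    """Bytes per second if one payload is delivered per round trip.

    Assumption: strictly one segment in flight at a time, which is the
    simplified model behind the RPL/2 calculation, not a measurement.
    """
    return payload_bytes / (rtt_ms / 1000.0)


rtt_ms = 0.166011   # average round-trip time from the ping statistics
payload = 1448      # payload size used in the RPL/2 session

bytes_per_second = lockstep_throughput(payload, rtt_ms)
print(f"{bytes_per_second / 1e6:.2f} MB/s")   # prints "8.72 MB/s"
```

	Under this model the rate scales linearly with the payload size for
a given round-trip time, which is why a larger MTU looks like the only
lever here.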

	I suspect there is memory corruption somewhere in
sys/dev/ic/rtl8169.c when the MTU is greater than 1500 bytes. I therefore
started by analyzing this source file, but I don't see any obvious memory
corruption. I'm willing to take the time to help, but I do not know how
to continue.

	Any help would be welcome,

	JB

Follow-Ups:
- Re: PR 55714
  - From: Manuel Bouyer
