NetBSD-Users archive


Re: evbarm: very slow IOs on NanoPi NEO3 (Rockchip RK3328)



Quick update:
After some investigation with members of the ARM port, I replaced the
stock dtb file with the latest one from Debian
(linux-u-boot-current-nanopineo3_23.02.2_arm64). The kernel now detects
the entire memory and the CPU is 10°C cooler, barely warmer than what I
see with Linux. I consider those two issues solved.

Regarding the USB drive, it turned out to be an issue with the drive
firmware, which for some reason was defaulting to USB2 with NetBSD but
not with Linux and macOS. Flashing the latest firmware brought back
the full USB3 speed.

The last outstanding issue is the slow network speed, which may be
caused by a broken awge driver. It was suggested that I rerun the
benchmark with "iperf -w 200k", but this didn't change the throughput.
I will be discussing this issue with the ARM team.
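
As a rough sanity check on the window-size suggestion, here is a minimal
sketch of the usual bound (throughput <= window / RTT) applied to the
iperf3 summaries quoted below. The 1 ms round-trip time and the reading
of "200k" as 200,000 bytes are assumptions for illustration, not
measurements from this setup.

```python
# Back-of-the-envelope check: could the TCP window explain ~17 Mbit/s?
# ASSUMPTIONS (not from the thread): "200k" is taken as 200,000 bytes and
# the LAN round-trip time as 1 ms. Adjust both to match a real setup.

def window_limited_mbps(window_bytes: float, rtt_s: float) -> float:
    """Upper bound on TCP throughput in Mbit/s with one window in flight per RTT."""
    return window_bytes * 8 / rtt_s / 1e6


def rtt_needed_ms(window_bytes: float, observed_mbps: float) -> float:
    """RTT in ms at which the window alone would cap throughput at observed_mbps."""
    return window_bytes * 8 / (observed_mbps * 1e6) * 1e3


WINDOW_BYTES = 200_000   # assumed meaning of "-w 200k"
ASSUMED_RTT_S = 0.001    # assumed LAN round-trip time
DEBIAN_MBPS = 578.0      # sender summary from the Debian run
NETBSD_MBPS = 16.9       # sender summary from the NetBSD 10-RC1 run

print(f"window-limited ceiling: {window_limited_mbps(WINDOW_BYTES, ASSUMED_RTT_S):.0f} Mbit/s")
print(f"RTT needed for the window to be the limit: {rtt_needed_ms(WINDOW_BYTES, NETBSD_MBPS):.1f} ms")
print(f"Debian / NetBSD throughput ratio: {DEBIAN_MBPS / NETBSD_MBPS:.1f}x")
```

Under those assumptions the window ceiling sits far above gigabit line
rate, which is consistent with the larger window making no difference.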

Thanks.

Damien B.

On Wed, Nov 29, 2023 at 3:38 PM Damien Boureille
<damien.boureille%gmail.com@localhost> wrote:
>
> Hi. I have installed NetBSD 9.3 and 10-RC1 on this ARM board, which has Gigabit Ethernet and USB3, to use it as a NAS. I am encountering several issues:
> - NetBSD only detects and uses 512MB of memory instead of the full 1024MB
> - I/O on both Ethernet and block devices is abnormally slow
> - Its idling temperature as per envstat stands at around 53°C, about 15°C more than Debian
>
> I am not sure how to diagnose the loss of half the memory, but I can live with only 512MB on the NAS.
>
> Concerning the temperature, I assume it may need some proprietary blob for ACPI? But I could add a fan and live with it.
>
> The I/O performance is a major issue. I ran the same benchmarks on Debian 11 Buster and on NetBSD 10-RC1 / 9.3, all on the same machine, using iperf3 and iozone.
>
> ----------------------- Network -----------------------
>
> - Debian 11 Buster:
>
> iperf3 -c 192.168.0.21
> Connecting to host 192.168.0.21, port 5201
> [  5] local 192.168.0.17 port 57374 connected to 192.168.0.21 port 5201
> [ ID] Interval           Transfer     Bitrate
> [  5]   0.00-1.00   sec  67.9 MBytes   569 Mbits/sec
> [  5]   1.00-2.00   sec  71.4 MBytes   599 Mbits/sec
> [  5]   2.00-3.00   sec  69.3 MBytes   581 Mbits/sec
> [  5]   3.00-4.00   sec  70.1 MBytes   589 Mbits/sec
> [  5]   4.00-5.00   sec  69.3 MBytes   580 Mbits/sec
> [  5]   5.00-6.00   sec  70.6 MBytes   593 Mbits/sec
> [  5]   6.00-7.00   sec  65.9 MBytes   552 Mbits/sec
> [  5]   7.00-8.00   sec  65.0 MBytes   546 Mbits/sec
> [  5]   8.00-9.00   sec  67.8 MBytes   569 Mbits/sec
> [  5]   9.00-10.00  sec  71.5 MBytes   599 Mbits/sec
> - - - - - - - - - - - - - - - - - - - - - - - - -
> [ ID] Interval           Transfer     Bitrate
> [  5]   0.00-10.00  sec   689 MBytes   578 Mbits/sec                  sender
> [  5]   0.00-10.02  sec   686 MBytes   575 Mbits/sec                  receiver
>
> - NetBSD 10-RC1:
>
> iperf3 -c 192.168.0.1
> Connecting to host 192.168.0.1, port 5201
> [  5] local 192.168.0.17 port 57424 connected to 192.168.0.1 port 5201
> [ ID] Interval           Transfer     Bitrate
> [  5]   0.00-1.00   sec  2.01 MBytes  16.8 Mbits/sec
> [  5]   1.00-2.00   sec  2.03 MBytes  17.1 Mbits/sec
> [  5]   2.00-3.00   sec  2.02 MBytes  17.0 Mbits/sec
> [  5]   3.00-4.00   sec  1.92 MBytes  16.1 Mbits/sec
> [  5]   4.00-5.00   sec  1.93 MBytes  16.1 Mbits/sec
> [  5]   5.00-6.00   sec  2.08 MBytes  17.5 Mbits/sec
> [  5]   6.00-7.00   sec  1.98 MBytes  16.6 Mbits/sec
> [  5]   7.00-8.00   sec  1.95 MBytes  16.4 Mbits/sec
> [  5]   8.00-9.00   sec  2.25 MBytes  18.9 Mbits/sec
> [  5]   9.00-10.00  sec  2.01 MBytes  16.8 Mbits/sec
> - - - - - - - - - - - - - - - - - - - - - - - - -
> [ ID] Interval           Transfer     Bitrate
> [  5]   0.00-10.00  sec  20.2 MBytes  16.9 Mbits/sec                  sender
> [  5]   0.00-10.04  sec  20.1 MBytes  16.8 Mbits/sec                  receiver
>
> - NetBSD 9.3:
>
> iperf3 -c 192.168.0.20
> Connecting to host 192.168.0.20, port 5201
> [  5] local 192.168.0.17 port 57592 connected to 192.168.0.20 port 5201
> [ ID] Interval           Transfer     Bitrate
> [  5]   0.00-1.00   sec  2.09 MBytes  17.5 Mbits/sec
> [  5]   1.00-2.00   sec  2.09 MBytes  17.5 Mbits/sec
> [  5]   2.00-3.00   sec  1.97 MBytes  16.6 Mbits/sec
> [  5]   3.00-4.00   sec  1.93 MBytes  16.2 Mbits/sec
> [  5]   4.00-5.00   sec  2.12 MBytes  17.7 Mbits/sec
> [  5]   5.00-6.00   sec  2.06 MBytes  17.3 Mbits/sec
> [  5]   6.00-7.00   sec  2.04 MBytes  17.1 Mbits/sec
> [  5]   7.00-8.00   sec  1.99 MBytes  16.7 Mbits/sec
> [  5]   8.00-9.00   sec  2.00 MBytes  16.8 Mbits/sec
> [  5]   9.00-10.00  sec  1.96 MBytes  16.4 Mbits/sec
> - - - - - - - - - - - - - - - - - - - - - - - - -
> [ ID] Interval           Transfer     Bitrate
> [  5]   0.00-10.00  sec  20.2 MBytes  17.0 Mbits/sec                  sender
> [  5]   0.00-10.00  sec  20.1 MBytes  16.9 Mbits/sec                  receiver
>
> ifconfig on NetBSD 10-RC1 and 9.3 does show gigabit:
> # ifconfig
> awge0: flags=0x8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> mtu 1500
>         ec_capabilities=0x1<VLAN_MTU>
>         ec_enabled=0
>         address: ca:3e:c7:26:ab:3b
>         media: Ethernet autoselect (1000baseT full-duplex)
>         status: active
>         inet6 fe80::c83e:c7ff:fe26:ab3b%awge0/64 flags 0 scopeid 0x1
>         inet 192.168.0.1/24 broadcast 192.168.0.255 flags 0
> lo0: flags=0x8049<UP,LOOPBACK,RUNNING,MULTICAST> mtu 33624
>         status: active
>         inet6 ::1/128 flags 0x20<NODAD>
>         inet6 fe80::1%lo0/64 flags 0 scopeid 0x2
>         inet 127.0.0.1/8 flags 0
>
> ----------------------- FS (USB3)  -----------------------
>
> - NetBSD 10-RC1 (ffs, async, noatime, cgd)
> NB: dmesg shows support for AES acceleration, but the IO doesn't seem CPU-bound anyway as per vmstat
>
>         Command line used: iozone -e -I -a -s 100M -r 1024k -r 16384k -i 0 -i 1
>         Output is in kBytes/sec
>         Time Resolution = 0.000001 seconds.
>         Processor cache size set to 1024 kBytes.
>         Processor cache line size set to 32 bytes.
>         File stride size set to 17 * record size.
>                                                               random    random     bkwd    record    stride
>               kB  reclen    write  rewrite    read    reread    read     write     read   rewrite      read
>           102400    1024       32    26994   379479   382686
>           102400   16384     2432    26968   380234   384018
>
> - NetBSD 9.3
>
>         Command line used: iozone -e -I -a -s 100M -r 1024k -r 16384k -i 0 -i 1
>         Output is in kBytes/sec
>         Time Resolution = 0.000001 seconds.
>         Processor cache size set to 1024 kBytes.
>         Processor cache line size set to 32 bytes.
>         File stride size set to 17 * record size.
>                                                               random    random     bkwd    record    stride
>               kB  reclen    write  rewrite    read    reread    read     write     read   rewrite      read
>           102400    1024     1237     1340   323971   317574
>           102400   16384     1538     7810   323321   321183
>
> - Debian 11 Buster (ext4, noatime, ecryptfs)
>
>         Command line used: iozone -e -I -a -s 100M -r 1024k -r 16384k -i 0 -i 1
>         Output is in kBytes/sec
>         Time Resolution = 0.000001 seconds.
>         Processor cache size set to 1024 kBytes.
>         Processor cache line size set to 32 bytes.
>         File stride size set to 17 * record size.
>                                                                     random    random      bkwd    record    stride
>               kB  reclen     write   rewrite      read    reread      read     write      read   rewrite      read    fwrite  frewrite     fread   freread
>           102400    1024    171903    172546    362795    363137
>           102400   16384    182412    172815    367108    367482
>
> The write speed on NetBSD seems way too slow. To rule out a potential problem with the benchmark itself on NetBSD, or the FS/CGD layers, I also tried dd on the block device:
>
> # dd if=/dev/rsd0 of=/dev/zero bs=1024k count=50
> 50+0 records in
> 50+0 records out
> 52428800 bytes transferred in 2.786 secs (18818664 bytes/sec)
>
> # dd if=/dev/sd0 of=/dev/zero bs=1024k count=50
> 50+0 records in
> 50+0 records out
> 52428800 bytes transferred in 9.637 secs (5440365 bytes/sec)
>
> Again this is way too slow.
>
> I tried disabling estd and setting the CPU to its maximum frequency, which made no difference to the results, as expected since the CPU doesn't seem to be the bottleneck here. Adjusting the RX/TX buffers and MTU didn't help much either. I suspect the problem is on the kernel side.
>
> Any help to give this machine its expected performance on NetBSD is welcome.
>
> Attached: dmesg
>
> D.B.
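
For reference, a small sketch that recomputes the two dd rates quoted
above in MB/s and sets them against the nominal USB 2.0 signalling rate.
The 480 Mbit/s figure is the general USB 2.0 high-speed rate, not a
number from this thread, and real-world throughput is lower than that.

```python
# Recompute the dd transfer rates from the byte counts and timings quoted
# in the thread, then compare them with the nominal USB 2.0 rate.
# ASSUMPTION: 480 Mbit/s is the USB 2.0 high-speed signalling rate; it is
# used here only as a rough ceiling, not as an achievable throughput.

def mb_per_s(total_bytes: int, seconds: float) -> float:
    """Transfer rate in decimal megabytes per second."""
    return total_bytes / seconds / 1e6


USB2_NOMINAL_MB_S = 480e6 / 8 / 1e6  # 60 MB/s before protocol overhead

# (bytes transferred, elapsed seconds) as reported by dd
runs = {
    "/dev/rsd0 (raw device)": (52428800, 2.786),
    "/dev/sd0 (block device)": (52428800, 9.637),
}

for name, (nbytes, secs) in runs.items():
    rate = mb_per_s(nbytes, secs)
    share = rate / USB2_NOMINAL_MB_S
    print(f"{name}: {rate:.1f} MB/s ({share:.0%} of nominal USB 2.0)")
```

Both rates fall below the USB 2.0 ceiling, which fits the later finding
that the drive was negotiating USB2 rather than USB3 under NetBSD.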

