Subject: port-arm/36596: iperf incompatible for netbsd Beta 4.0 (NB2) and Beta 4.0 (NB3)
To: None <port-arm-maintainer@netbsd.org, gnats-admin@netbsd.org,>
From: None <frankf@marvell.com>
List: netbsd-bugs
Date: 07/02/2007 19:00:01
>Number:         36596
>Category:       port-arm
>Synopsis:       iperf incompatible for netbsd Beta 4.0 (NB2) and Beta 4.0 (NB3)
>Confidential:   no
>Severity:       serious
>Priority:       high
>Responsible:    port-arm-maintainer
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Mon Jul 02 19:00:01 +0000 2007
>Originator:     Frank Fang
>Release:        NetBSD 4.0 Beta 2 - ARM platform
>Organization:
Marvell
>Environment:
NetBSD orion 4.0_BETA2 NetBSD 4.0_BETA2 (ORION5X81BE) # 31: Mon Jul 2 11:25:57 PDT 2007 ffang@localhhost.localdomain:/home/ffang
>Description:
iperf behaves differently when built on NetBSD 4.0 Beta (NB2) and NetBSD 4.0 Beta (NB3):

Beta 4.0 (NB2) ===> gcc 4.1.2 20060628 prerelease (NetBSD nb2 20060711), works well
Beta 4.0 (NB3) ===> gcc 4.1.2 20061021 prerelease (NetBSD nb3 20061125), performance degradation

iperf version 2.0.2 shows a performance degradation when it is built with
Beta 4.0 (NB3) on the ARM platform. We have observed a huge drop in throughput, from 680 Mbps to 25 Mbps.
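
To put the two throughput figures in terms of a single write, the sketch below converts a rate into the time one buffer would take to go out. It assumes the 128 KB buffer size used in the reproduction commands and 1 Mbit = 1,000,000 bits; the function name write_duration_sec is hypothetical and not part of iperf. Under those assumptions, 680 Mbps corresponds to roughly 1.5 ms per 128 KB write and 25 Mbps to roughly 42 ms.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: seconds needed to move `bytes` bytes at a rate of
 * `mbps` megabits per second. Assumes 1 Mbit = 1,000,000 bits. */
double write_duration_sec(size_t bytes, double mbps)
{
    assert(mbps > 0.0);                 /* a rate of zero has no finite answer */
    return ((double)bytes * 8.0) / (mbps * 1e6);
}
```

Calling write_duration_sec(128 * 1024, 680.0) and write_duration_sec(128 * 1024, 25.0) gives the two per-write times for comparison.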


Observation:
Application profiling shows that, after initialization, most of the time is spent in a loop that performs the "socket write" system call to the server. This is a blocking system call that returns when the socket write operation has completed (in function Client::Run of Client.cpp):

currLen = write( mSettings->mSock, mBuf, mSettings->mBufLen ); 

 

In both versions the call always returns the same transfer length, 128 KB, and no other value.

However, in NB3 the call blocks much longer than in NB2 for the same amount of data, which is why we get the low performance number. This is also consistent with the kernel profiling, where we saw a lot of d-cache invalidation.

I have compared the main functions of the initialization and data path code between the two builds. I have also checked the assembly code of several critical system calls: "setsockopt", "gettimeofday", "_write" and "writev". So far they appear to be the same. However, some sections of the _init and init_fallthu code (generated by the compiler or the linker) are different, which may cause some difference.
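
One way to confirm the per-call blocking time directly is to wrap the write() call in a timer. The following is a minimal sketch, not code from iperf: timed_write is a hypothetical helper, and it uses gettimeofday(), which is wall-clock time and can be affected by clock adjustments.

```c
#include <assert.h>
#include <errno.h>
#include <sys/time.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper (not part of iperf): performs one write() and stores
 * the wall-clock time it took, in seconds, in *elapsed_sec.
 * Returns whatever write() returned and preserves its errno. */
ssize_t timed_write(int fd, const void *buf, size_t len, double *elapsed_sec)
{
    struct timeval start, end;
    ssize_t n;
    int saved_errno;

    assert(elapsed_sec != NULL);

    gettimeofday(&start, NULL);
    n = write(fd, buf, len);
    saved_errno = errno;                /* keep write()'s error code intact */
    gettimeofday(&end, NULL);

    *elapsed_sec = (double)(end.tv_sec - start.tv_sec)
                 + (double)(end.tv_usec - start.tv_usec) / 1e6;
    errno = saved_errno;
    return n;
}
```

In the loop in Client::Run, the call could be replaced with something like currLen = timed_write( mSettings->mSock, mBuf, mSettings->mBufLen, &elapsed ); and the elapsed values logged on both NB2 and NB3 builds for comparison.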



 
>How-To-Repeat:

(1) Get iperf 2.0.2:
http://dast.nlanr.net/Projects/Iperf/

(2) Build iperf 2.0.2 on the ARM platform under NB2 and NB3.

(3) Set up an IP address for the ARM platform
and use the following iperf commands:
./iperf -s -w 128k -l 128k -i 1 -t 100
./iperf -c 192.168.x.x -w 128k -l 128k -i 1 -t 100
You can then reproduce the performance difference on the two systems.


Note: There is no problem using iperf 1.7.0 on NB2 and NB3.
==================================================================

===================
gcc -dumpspecs for NB3
*asm:
%{mbig-endian:-EB} %{mlittle-endian:-EL} %{mcpu=*:-mcpu=%*} %{march=*:-march=%*} %{mapcs-*:-mapcs-%*} %(subtarget_asm_float_spec) %{mthumb-interwork:-mthumb-interwork} %{msoft-float:-mfloat-abi=soft} %{mhard-float:-mfloat-abi=hard} %{mfloat-abi=*} %{mfpu=*} %(subtarget_extra_asm_spec)

*asm_debug:
%{gstabs*:--gstabs}%{!gstabs*:%{g*:--gdwarf2}}

*asm_final:


*asm_options:
%a %Y %{c:%W{o*}%{!o*:-o %w%b%O}}%{!c:-o %d%w%u%O}

*invoke_as:
%{!S:-o %|.s |
 as %(asm_options) %m.s %A }

*cpp:
%(subtarget_cpp_spec)					%{msoft-float:%{mhard-float:							%e-msoft-float and -mhard_float may not be used together}}	%{mbig-endian:%{mlittle-endian:							%e-mbig-endian and -mlittle-endian may not be used together}}

*cpp_options:
%(cpp_unique_options) %1 %{m*} %{std*&ansi&trigraphs} %{W*&pedantic*} %{w} %{f*} %{g*:%{!g0:%{!fno-working-directory:-fworking-directory}}} %{O*} %{undef} %{save-temps:-fpch-preprocess}

*cpp_debug_options:
%{d*}

*cpp_unique_options:
%{C|CC:%{!E:%eGCC does not support -C or -CC without -E}} %{!Q:-quiet} %{nostdinc*} %{C} %{CC} %{v} %{I*&F*} %{P} %I %{MD:-MD %{!o:%b.d}%{o*:%.d%*}} %{MMD:-MMD %{!o:%b.d}%{o*:%.d%*}} %{M} %{MM} %{MF*} %{MG} %{MP} %{MQ*} %{MT*} %{!E:%{!M:%{!MM:%{MD|MMD:%{o*:-MQ %*}}}}} %{remap} %{g3:-dD} %{H} %C %{D*&U*&A*} %{i*} %Z %i %{fmudflap:-D_MUDFLAP -include mf-runtime.h} %{fmudflapth:-D_MUDFLAP -D_MUDFLAPTH -include mf-runtime.h} %{E|M|MM:%W{o*}}

*trad_capable_cpp:
cc1 -E %{traditional|ftraditional|traditional-cpp:-traditional-cpp}

*cc1:
%{cxx-isystem}

*cc1_options:
%{pg:%{fomit-frame-pointer:%e-pg and -fomit-frame-pointer are incompatible}} %1 %{!Q:-quiet} -dumpbase %B %{d*} %{m*} %{a*} %{c|S:%{o*:-auxbase-strip %*}%{!o*:-auxbase %b}}%{!c:%{!S:-auxbase %b}} %{g*} %{O*} %{W*&pedantic*} %{w} %{std*&ansi&trigraphs} %{v:-version} %{pg:-p} %{p} %{f*} %{undef} %{Qn:-fno-ident} %{--help:--help} %{--target-help:--target-help} %{!fsyntax-only:%{S:%W{o*}%{!o*:-o %b.s}}} %{fsyntax-only:-o %j} %{-param*} %{fmudflap|fmudflapth:-fno-builtin -fno-merge-constants} %{coverage:-fprofile-arcs -ftest-coverage}

*cc1plus:
%{cxx-isystem}

*link_gcc_c_sequence:
%G %L %G

*link_ssp:
%{fstack-protector:}

*endfile:
%{!shared:crtend%O%s} %{shared:crtendS%O%s}    %:if-exists(crtn%O%s)

*link:
-X %{mbig-endian:-EB} %{mlittle-endian:-EL}    %(netbsd_link_spec)

*lib:
%{pthread:			     %{!p:			       %{!pg:-lpthread}}	     %{p:-lpthread_p}		     %{pg:-lpthread_p}}		   %{posix:			     %{!p:			       %{!pg:-lposix}}		     %{p:-lposix_p}		     %{pg:-lposix_p}}		   %{!shared:			     %{!symbolic:		       %{!p:				 %{!pg:-lc}}		       %{p:-lc_p}		       %{pg:-lc_p}}}

*mfwrap:
 %{static: %{fmudflap|fmudflapth:  --wrap=malloc --wrap=free --wrap=calloc --wrap=realloc --wrap=mmap --wrap=munmap --wrap=alloca} %{fmudflapth: --wrap=pthread_create}} %{fmudflap|fmudflapth: --wrap=main}

*mflib:
%{fmudflap|fmudflapth: -export-dynamic}

*libgcc:
%{static: -lgcc -lgcc_eh}%{static-libgcc: %{!shared:-lgcc -lgcc_eh}%{shared:-lgcc_pic -lgcc_eh_pic}}%{!static:%{!static-libgcc:%{!shared:%{!shared-libgcc:-lgcc -lgcc_eh}%{shared-libgcc:-lgcc_s -lgcc}}%{shared:%{shared-libgcc:-lgcc_s} -lgcc_pic}}}

*startfile:
%{!shared:			     %{pg:gcrt0%O%s}		     %{!pg:			       %{p:gcrt0%O%s}		       %{!p:crt0%O%s}}}		   %:if-exists(crti%O%s)	   %{static:%:if-exists-else(crtbeginT%O%s crtbegin%O%s)}    %{!static:      %{!shared:crtbegin%O%s} %{shared:crtbeginS%O%s}}

*switches_need_spaces:


*cross_compile:
0

*version:
4.1.2

*multilib:
. ;

*multilib_defaults:


*multilib_extra:


*multilib_matches:


*multilib_exclusions:


*multilib_options:


*linker:
collect2

*link_libgcc:
%D

*md_exec_prefix:


*md_startfile_prefix:


*md_startfile_prefix_1:


*startfile_prefix_spec:


*sysroot_spec:
--sysroot=%R

*sysroot_suffix_spec:


*sysroot_hdrs_suffix_spec:


*subtarget_cpp_spec:
%{posix:-D_POSIX_SOURCE}    %{pthread:-D_REENTRANT -D_PTHREADS}

*subtarget_extra_asm_spec:
-matpcs %{fpic|fpie:-k} %{fPIC|fPIE:-k}

*subtarget_asm_float_spec:
%{mhard-float:{!mfpu=*:-mfpu=vfp}}      %{mfloat-abi=hard:{!mfpu=*:-mfpu=vfp}}

*netbsd_link_spec:
%{assert*} %{R*} %{rpath*}    %{shared:-shared}    %{symbolic:-Bsymbolic}    %{!shared:      -dc -dp      %{!nostdlib:        %{!r*: 	 %{!e*:-e %(netbsd_entry_point)}}}      %{!static:        %{rdynamic:-export-dynamic}        %{!dynamic-linker:-dynamic-linker /usr/libexec/ld.elf_so}}      %{static:-static}}

*netbsd_entry_point:
__start

*link_command:
%{!fsyntax-only:%{!c:%{!M:%{!MM:%{!E:%{!S:    %(linker) %l %{pie:-pie} %X %{o*} %{A} %{d} %{e*} %{m} %{N} %{n} %{r}    %{s} %{t} %{u*} %{x} %{z} %{Z} %{!A:%{!nostdlib:%{!nostartfiles:%S}}}    %{static:} %{L*} %(mfwrap) %(link_libgcc) %o %(mflib)    %{fprofile-arcs|fprofile-generate|coverage:-lgcov}    %{!nostdlib:%{!nodefaultlibs:%(link_ssp) %(link_gcc_c_sequence)}}    %{!A:%{!nostdlib:%{!nostartfiles:%E}}} %{T*} }}}}}}

========================
gcc -dumpspecs for NB2
*asm:
%{mbig-endian:-EB} %{mlittle-endian:-EL} %{mcpu=*:-mcpu=%*} %{march=*:-march=%*} %{mapcs-*:-mapcs-%*} %(subtarget_asm_float_spec) %{mthumb-interwork:-mthumb-interwork} %{msoft-float:-mfloat-abi=soft} %{mhard-float:-mfloat-abi=hard} %{mfloat-abi=*} %{mfpu=*} %(subtarget_extra_asm_spec)

*asm_debug:
%{gstabs*:--gstabs}%{!gstabs*:%{g*:--gdwarf2}}

*asm_final:


*asm_options:
%a %Y %{c:%W{o*}%{!o*:-o %w%b%O}}%{!c:-o %d%w%u%O}

*invoke_as:
%{!S:-o %|.s |
 as %(asm_options) %m.s %A }

*cpp:
%(subtarget_cpp_spec)					%{msoft-float:%{mhard-float:							%e-msoft-float and -mhard_float may not be used together}}	%{mbig-endian:%{mlittle-endian:							%e-mbig-endian and -mlittle-endian may not be used together}}

*cpp_options:
%(cpp_unique_options) %1 %{m*} %{std*&ansi&trigraphs} %{W*&pedantic*} %{w} %{f*} %{g*:%{!g0:%{!fno-working-directory:-fworking-directory}}} %{O*} %{undef} %{save-temps:-fpch-preprocess}

*cpp_debug_options:
%{d*}

*cpp_unique_options:
%{C|CC:%{!E:%eGCC does not support -C or -CC without -E}} %{!Q:-quiet} %{nostdinc*} %{C} %{CC} %{v} %{I*&F*} %{P} %I %{MD:-MD %{!o:%b.d}%{o*:%.d%*}} %{MMD:-MMD %{!o:%b.d}%{o*:%.d%*}} %{M} %{MM} %{MF*} %{MG} %{MP} %{MQ*} %{MT*} %{!E:%{!M:%{!MM:%{MD|MMD:%{o*:-MQ %*}}}}} %{remap} %{g3:-dD} %{H} %C %{D*&U*&A*} %{i*} %Z %i %{fmudflap:-D_MUDFLAP -include mf-runtime.h} %{fmudflapth:-D_MUDFLAP -D_MUDFLAPTH -include mf-runtime.h} %{E|M|MM:%W{o*}}

*trad_capable_cpp:
cc1 -E %{traditional|ftraditional|traditional-cpp:-traditional-cpp}

*cc1:
%{cxx-isystem}

*cc1_options:
%{pg:%{fomit-frame-pointer:%e-pg and -fomit-frame-pointer are incompatible}} %1 %{!Q:-quiet} -dumpbase %B %{d*} %{m*} %{a*} %{c|S:%{o*:-auxbase-strip %*}%{!o*:-auxbase %b}}%{!c:%{!S:-auxbase %b}} %{g*} %{O*} %{W*&pedantic*} %{w} %{std*&ansi&trigraphs} %{v:-version} %{pg:-p} %{p} %{f*} %{undef} %{Qn:-fno-ident} %{--help:--help} %{--target-help:--target-help} %{!fsyntax-only:%{S:%W{o*}%{!o*:-o %b.s}}} %{fsyntax-only:-o %j} %{-param*} %{fmudflap|fmudflapth:-fno-builtin -fno-merge-constants} %{coverage:-fprofile-arcs -ftest-coverage}

*cc1plus:
%{cxx-isystem}

*link_gcc_c_sequence:
%G %L %G

*link_ssp:
%{fstack-protector|fstack-protector-all:-lssp_nonshared -lssp}

*endfile:
%{!shared:crtend%O%s} %{shared:crtendS%O%s}    %:if-exists(crtn%O%s)

*link:
-X %{mbig-endian:-EB} %{mlittle-endian:-EL}    %(netbsd_link_spec)

*lib:
%{pthread:			     %{!p:			       %{!pg:-lpthread}}	     %{p:-lpthread_p}		     %{pg:-lpthread_p}}		   %{posix:			     %{!p:			       %{!pg:-lposix}}		     %{p:-lposix_p}		     %{pg:-lposix_p}}		   %{!shared:			     %{!symbolic:		       %{!p:				 %{!pg:-lc}}		       %{p:-lc_p}		       %{pg:-lc_p}}}

*mfwrap:
 %{static: %{fmudflap|fmudflapth:  --wrap=malloc --wrap=free --wrap=calloc --wrap=realloc --wrap=mmap --wrap=munmap --wrap=alloca} %{fmudflapth: --wrap=pthread_create}} %{fmudflap|fmudflapth: --wrap=main}

*mflib:
%{fmudflap|fmudflapth: -export-dynamic}

*libgcc:
%{static: -lgcc -lgcc_eh}%{static-libgcc: %{!shared:-lgcc -lgcc_eh}%{shared:-lgcc_pic -lgcc_eh_pic}}%{!static:%{!static-libgcc:%{!shared:%{!shared-libgcc:-lgcc -lgcc_eh}%{shared-libgcc:-lgcc_s -lgcc}}%{shared:%{shared-libgcc:-lgcc_s} -lgcc_pic}}}

*startfile:
%{!shared:			     %{pg:gcrt0%O%s}		     %{!pg:			       %{p:gcrt0%O%s}		       %{!p:crt0%O%s}}}		   %:if-exists(crti%O%s)	   %{static:%:if-exists-else(crtbeginT%O%s crtbegin%O%s)}    %{!static:      %{!shared:crtbegin%O%s} %{shared:crtbeginS%O%s}}

*switches_need_spaces:


*cross_compile:
0

*version:
4.1.2

*multilib:
. ;

*multilib_defaults:


*multilib_extra:


*multilib_matches:


*multilib_exclusions:


*multilib_options:


*linker:
collect2

*link_libgcc:
%D

*md_exec_prefix:


*md_startfile_prefix:


*md_startfile_prefix_1:


*startfile_prefix_spec:


*sysroot_spec:
--sysroot=%R

*sysroot_suffix_spec:


*sysroot_hdrs_suffix_spec:


*subtarget_cpp_spec:
%{posix:-D_POSIX_SOURCE}    %{pthread:-D_REENTRANT -D_PTHREADS}

*subtarget_extra_asm_spec:
-matpcs %{fpic|fpie:-k} %{fPIC|fPIE:-k}

*subtarget_asm_float_spec:
%{mhard-float:{!mfpu=*:-mfpu=vfp}}      %{mfloat-abi=hard:{!mfpu=*:-mfpu=vfp}}

*netbsd_link_spec:
%{assert*} %{R*} %{rpath*}    %{shared:-shared}    %{symbolic:-Bsymbolic}    %{!shared:      -dc -dp      %{!nostdlib:        %{!r*: 	 %{!e*:-e %(netbsd_entry_point)}}}      %{!static:        %{rdynamic:-export-dynamic}        %{!dynamic-linker:-dynamic-linker /usr/libexec/ld.elf_so}}      %{static:-static}}

*netbsd_entry_point:
__start

*link_command:
%{!fsyntax-only:%{!c:%{!M:%{!MM:%{!E:%{!S:    %(linker) %l %{pie:-pie} %X %{o*} %{A} %{d} %{e*} %{m} %{N} %{n} %{r}    %{s} %{t} %{u*} %{x} %{z} %{Z} %{!A:%{!nostdlib:%{!nostartfiles:%S}}}    %{static:} %{L*} %(mfwrap) %(link_libgcc) %o %(mflib)    %{fprofile-arcs|fprofile-generate|coverage:-lgcov}    %{!nostdlib:%{!nodefaultlibs:%(link_ssp) %(link_gcc_c_sequence)}}    %{!A:%{!nostdlib:%{!nostartfiles:%E}}} %{T*} }}}}}}


 
>Fix: