Subject: Re: write() in nonblocking mode can be slow.
To: None <netbsd-users@netbsd.org>
From: Jorgen Lundman <lundman@lundman.net>
List: tech-net
Date: 03/14/2005 15:03:05
Hmm maybe it is something much more embarrassing. :)

I'm hunting three things, which could be related, but this one seems to be due 
to SSL_write().

At first I noticed this only happened with buffer sizes of 4088. If I lowered it 
(so the write would fit within "one write") to 4000, then the write would 
complete in "one call" and quickly. Since all calls to SSL_write would insist on 
writing the full buffer, this was suspicious.

Even though you are in nonblocking mode, you need to set 
"SSL_MODE_ENABLE_PARTIAL_WRITE" in SSL_set_mode(). I knew about this, but found 
my code had the call commented out. (cvs checkin was at 4am, no idea what I was 
thinking at the time). Not entirely sure why this is a desirable default, but 
that is an OpenSSL issue and not NetBSD.

This would most likely explain why SSL sets some fd's to blocking, and why write 
can be excessively slow, and why write() is called twice in the ktrace to 
complete the request. At least that would be my guess, after setting the option 
properly again, the worst I see for the same user is:
(ignoring calls to select)

  27351 lundftpd 0.098098 CALL  write(0x10,0x81f3000,0xff8)
  27351 lundftpd 0.112984 CALL  write(0x10,0x81f3000,0xff8)
  27351 lundftpd 0.116147 CALL  write(0x10,0x81f3000,0xff8)
  27351 lundftpd 0.123736 CALL  gettimeofday(0xbfbff888,0)
  27351 lundftpd 0.129811 CALL  read(0xb,0x823b005,0x20)
  27351 lundftpd 0.142905 CALL  read(0xb,0x823b005,0x1010)
  27351 lundftpd 0.145079 CALL  read(0xb,0x823b000,5)
  27351 lundftpd 0.175329 CALL  gettimeofday(0xbfbff888,0)
  27351 lundftpd 0.295211 CALL  read(0x10,0x8241000,5)
  27351 lundftpd 0.314786 RET   write 4088/0xff8

Well less than 1 second, so that is at least good. Anyone done any profiling in 
NetBSD of the socket code with blocking vs nonblocking?  I assume all the 
land-speed records are all blocking mode.

The last item is to work out why I run out of SOMAXKVA, but that is a different 
thread and "probably" isn't related to any of this.

Sorry for the red herring.

Lund


Thor Lancelot Simon wrote:
> On Fri, Mar 11, 2005 at 08:13:03AM +0100, Martin Husemann wrote:
> 
>>On Fri, Mar 11, 2005 at 11:57:50AM +0900, Jorgen Lundman wrote:
>>
>>>I have a socket (TCP) connected where their end is very slow. (~3k/s). 
>>
>>Have you checked with tcpdump that ACKs are coming in fast enough? You may
>>have a full tcp window outstanding, so new data can only be sent once
>>more data is acknowledge by the peer.
> 
> 
> Yes, but he is seeing nonblocking writes block in the kernel for multiple
> seconds before returning.  That should never happen.
> 

-- 
Jorgen Lundman       | <lundman@lundman.net>
Unix Administrator   | +81 (0)3 -5456-2687 ext 1017 (work)
Shibuya-ku, Tokyo    | +81 (0)90-5578-8500          (cell)
Japan                | +81 (0)3 -3375-1767          (home)