Subject: Re: rdist and net.inet.tcp.ack_on_push=1
To: David Laight <david@l8s.co.uk>
From: Robert Elz <kre@munnari.OZ.AU>
List: tech-net
Date: 06/16/2003 18:18:55
    Date:        Mon, 16 Jun 2003 09:38:04 +0100
    From:        David Laight <david@l8s.co.uk>
    Message-ID:  <20030616093804.J3322@snowdrop.l8s.co.uk>

  | IIRC about the only thing the push flag means is 'send this now and
  | don't wait for any more data because there won't be any'.

These days PUSH means essentially nothing.   Its intent was to tell
the receiving TCP that it should deliver the data to the application
(or at least make known to the application that data existed that
could be fetched) rather than simply buffering it for some time later.

The theory was probably that swapping back in processes to deal with
data which wasn't yet complete was a waste of time - that is, if all
the application would be able to do after having received the data
was say "I need more", then you're better off not bothering to deliver
it in the first place.

These days, TCP stacks don't bother with any of that - all data that
arrives is delivered (made available for delivery) as soon as it
has arrived (and all preceding data has arrived of course).  PUSH
being set or not is irrelevant.
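
To make that concrete, here's a minimal sketch of a receiver in C
(handle() is just a hypothetical stand-in for whatever the
application does with the data):

    #include <sys/types.h>
    #include <unistd.h>

    extern void handle(const char *buf, ssize_t len); /* hypothetical */

    void
    drain(int sock)
    {
            char buf[4096];
            ssize_t n;

            /* read(2) returns as soon as any in-order data is
             * available, whether or not the sender set PUSH */
            while ((n = read(sock, buf, sizeof buf)) > 0)
                    /* n bears no relation to the sender's write()
                     * sizes, segment boundaries, or PUSH bits */
                    handle(buf, n);
    }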

The theories behind this are that in practice no-one sets
PUSH correctly anyway (as der Mouse said, there's no API to allow it
to happen - and application writers wouldn't bother anyway), and that
generally "swap process back in" is no longer much of an issue,
rather, it is better to get data out of kernel buffers, and into
userland buffers, as quickly as possible, regardless of how many
context switches or system calls that means (kernel buffers not usually
being pageable).

All of this means that it is common for PUSH to get set on every packet.
That's safe even on the off chance that the receiver actually implements PUSH
the way it was originally intended, and in practice, costs nothing.
PUSH should have no effect at all on how the TCP stack acks packets,
but, of course, that doesn't mean that this is true in all implementations.
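
If you want to see what a particular pair of stacks is doing, the
PUSH bit is easy enough to watch on the wire, e.g.:

    tcpdump -n 'tcp[13] & 8 != 0'    # only segments with PSH set

and the NetBSD sysctl named in the Subject: line is one example of
an implementation where PUSH does affect acking - it can be toggled
with:

    sysctl -w net.inet.tcp.ack_on_push=0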

BSD-based implementations used to (and probably still do) set PUSH on the
final packet generated from a single write(2) (or send(2)) request - this
in some sense being the "set PUSH" API (but never really documented).
There's some justification for this, in a way, but in practice these
days it just doesn't matter.
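
For illustration only (this is what one would expect from a
BSD-derived stack, not a promise about any particular
implementation), a sender doing:

    #include <string.h>
    #include <unistd.h>

    void
    send_two(int sock)
    {
            /* on BSD-derived stacks, the last segment generated
             * for each write(2) would be expected to carry PUSH;
             * error handling elided for brevity */
            (void)write(sock, "first chunk", strlen("first chunk"));
            (void)write(sock, "second chunk", strlen("second chunk"));
    }

should produce (at least) two PUSH segments on the wire, one per
write - which is about as close to a "set PUSH" API as applications
ever get.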

kre