Subject: Re: Up-stream bandwidth shaping without resorting to linux/iptables?
To: Amadeus Stevenson <amadeus.stevenson@gmail.com>
From: Greg A. Woods <woods@weird.com>
List: netbsd-users
Date: 02/05/2005 17:41:44
[ On Saturday, February 5, 2005 at 16:55:50 (+0000), Amadeus Stevenson wrote: ]
> Subject: Up-stream bandwidth shaping without resorting to linux/iptables?
>
> I have been trying to set up an asynchronous dsl gateway to LAN for
> about a month now.
> 
> The problem I am having is that the up/down ratio (1:10) is severely
> limiting the performance of the link when various programs (p2p
> mainly) swamp the upstream channel with connections and therefore
> grind "essential" services (http etc) to a halt.
> 
> I've tried ALTQ with CBQ but was told it isn't possible to control upstream.

Well, with any bandwidth management tool in any IP network it is only
possible to control what one sends, not what one receives.

(You could control what you receive, but there's no point since it has
to cross the narrower wire to get to your control point anyway.)

It's also very difficult to do flow- (i.e. connection-) based bandwidth
management in TCP/IP networks, especially in software.  Hardware that
can identify and tag flows in real time, such as that used in modern
high-end ethernet switches, helps, but TCP/IP is nothing like ATM.
Trying to get ATM-like control over TCP/IP traffic is a bit of an
apples-and-oranges problem because you do not, and cannot, have full
end-to-end traffic management in an IP network the way you do in an
all-ATM network.

My own connection is an aDSL line with a 0.8-out/4.0-in megabit/s ratio
and I've been experimenting with using ALTQ to help improve QoS on VoIP
traffic over my link.

My gateway is a Sparc-5 running NetBSD-1.6.2_STABLE (with local fixes to
properly integrate ALTQ into the sparc kernel -- it's only really
available by default in i386 in 1.6.x).  I also use IPF to firewall my
network somewhat, but it's not really related to this QoS job.

I've been testing VoIP calls with a Digium IAXy (S100I) CPE device
calling through an Asterisk PBX to my colleague at the other end.  It
uses the IAX protocol (with uLAW audio encoding), and conveniently both
Asterisk and the IAXy set the IPTOS_LOWDELAY bit in their packets making
it very easy for ALTQ to "classify" them.
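
Setting that bit from an application is a one-liner in most socket
APIs.  A minimal Python sketch (my own illustration -- not how Asterisk
or the IAXy actually do it):

```python
import socket

IPTOS_LOWDELAY = 0x10   # from <netinet/ip.h>; the value ALTQ's tos filter matches

s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
s.setsockopt(socket.IPPROTO_IP, socket.IP_TOS, IPTOS_LOWDELAY)

# confirm the kernel accepted the TOS value
print(s.getsockopt(socket.IPPROTO_IP, socket.IP_TOS))   # → 16 (0x10)
```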

I've found that ALTQ on the sparc exhibits better control for the type
of service I'm trying to provide if I use the HFSC scheduler.  However
other experiments at my colleague's end of the VoIP call suggest that on
an i386 (where ALTQ can use the processor cycle counter for more
accurate hi-resolution timing), CBQ does work quite well.  At some point
I may try creating a duplicate router config on a spare i386 machine and
try swapping between the sparc and the i386 to see if it really makes
any measurable difference or whether there's some other unknown factor
causing the apparent differences.


With uses like VoIP the jitter in the RTT is most critical, but it also
seems to be the hardest thing to control.  One semi-objective simple way
to easily measure RTT jitter is to use a "follow", or "fast" ping --
i.e. a ping utility that has the ability to send a new echo-request
immediately after it receives the reply to the previous one (Cisco IOS
ping can do this, as can the version once maintained by Eric Wassenaar
which I now make available on my FTP server as "eping") -- and watch the
RTT numbers (or even graph them over time).  For ideal voice quality one
wants to see the RTT consistently remain within a few tens of
milliseconds at all times as jumps of 100ms or more will cause most
simple CODECs (e.g. uLAW or GSM) to generate dropouts.  With ALTQ I do
see occasional downward "curves" in the RTT when there's also lots of
bulk outbound traffic, and they do cause minor voice dropouts, but
overall it's a hell of a lot better than without ALTQ.
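
One could reduce ordinary ping(8) output to a single jitter number with
something like this Python sketch.  The sample lines are made up, and
the jitter definition here (mean absolute difference between successive
RTTs) is just one illustrative choice, not anything eping computes:

```python
import re

def rtt_samples(ping_output):
    """Pull the time=... values (in milliseconds) out of ping-style output."""
    return [float(m.group(1))
            for m in re.finditer(r"time=([\d.]+)\s*ms", ping_output)]

def jitter(rtts):
    """Mean absolute difference between successive RTTs, in milliseconds."""
    if len(rtts) < 2:
        return 0.0
    return sum(abs(b - a) for a, b in zip(rtts, rtts[1:])) / (len(rtts) - 1)

# made-up sample output with one bad spike
sample = """\
64 bytes from 10.0.0.1: icmp_seq=0 ttl=64 time=22.1 ms
64 bytes from 10.0.0.1: icmp_seq=1 ttl=64 time=24.3 ms
64 bytes from 10.0.0.1: icmp_seq=2 ttl=64 time=121.7 ms
64 bytes from 10.0.0.1: icmp_seq=3 ttl=64 time=23.0 ms
"""
print(round(jitter(rtt_samples(sample)), 1))   # → 66.1 -- the one spike dominates
```

A single 100ms+ excursion blows the average right past the few-tens-of-
milliseconds target, which is exactly why spikes matter more than the
steady-state RTT.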

The literature about QoS for IP also mentions that the jitter in delay
of getting the next high-priority packet into its proper time slot can
also sometimes be detrimental for some applications, depending on just
how fast the bits are put on the wire and what the application is.
I.e. a big FTP packet that starts just before the QoS packet is supposed
to start will push the QoS packet somewhat off its timeslot by as much
as 1500x8 bit-times.  This might even help partly explain the big curves
in RTT I see with eping when a big FTP is outbound from my network.
However since my outbound pipe is actually 10mbit/s Ethernet to
800kbit/s DSL, things are potentially a bit more complex, as the packets
must be buffered in the DSL modem.  I.e. they leave my gateway's
interface at full 10baseT speed but they can only be put on the wire at
800kbit/s speed by the DSL modem and ALTQ cannot really see this last
step happening.
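
To put numbers on that 1500x8 figure (back-of-the-envelope arithmetic
only, using the link speeds mentioned above):

```python
def serialization_ms(packet_bytes, link_bits_per_sec):
    """Time to clock one packet onto the wire, in milliseconds."""
    return packet_bytes * 8 * 1000.0 / link_bits_per_sec

# a full-size 1500-byte packet on the 800kbit/s DSL uplink:
print(serialization_ms(1500, 800_000))      # → 15.0 (ms of head-of-line delay)

# the same packet leaving the gateway at 10baseT speed:
print(serialization_ms(1500, 10_000_000))   # → 1.2
```

So one bulk packet already in flight can delay a voice packet by 15ms
at the DSL rate -- a good chunk of the jitter budget all by itself, and
invisible to ALTQ because it happens inside the modem.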

One of the suggested work-arounds for this aspect of the issue is MSS
clamping to reduce the size of bulk traffic packets (ATM uses very small
packets for this very same reason).  I thought I might try IPF's
"mssclamp" feature, but it only works when the connections are NATed
and I can't seem to find any way to set up a transparent NAT that
doesn't actually translate anything.  However if you are already using
NAT then adding MSS clamping, esp. to bulk outbound traffic, might help
smooth things out.
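
For anyone already NATing, the ipnat.conf rule would look something like
the following.  This is an untested sketch with a placeholder network
and MSS value -- check ipnat.conf(5) on your release for the exact
syntax:

```conf
# hypothetical ipnat.conf fragment -- placeholder addresses and MSS value
map hme0 192.168.1.0/24 -> 0/32 mssclamp 1200
```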

I don't think real-time multimedia will really work well with IP until
we all have fibre (or the equivalent) from desktop to desktop (and from
media providers to each media consumer).  QoS for IP is still SF.  :-)

On the other hand the simple effects you wish to achieve shouldn't be
all that difficult.  You'll see below that I've also prioritized
outbound DNS requests (and outbound replies from my authoritative DNS
server), as well as SSH traffic.  I perceive interactive SSH response to
be orders of magnitude better with ALTQ, so to make a long story come to
a short end, yes, it really could work for what you want!  I can still
type, talk, and spew big packets out from my FTP server without major
problems for any of them.  (only very minor voice dropouts, though more
when the SSH session sends lots of output back, and only slight lag and
very little perceivable jitter in typing response)


Here's my current /etc/altq.conf with the active HFSC config as well as
the commented out CBQ config I was testing, and some comments about
each.



#
#       /etc/altq.conf -- configuration for altqd(8)
#

# hme0:  dsl.ca 4MB/800KB link with VoIP priority
#
# With an HFSC scheduler the delay introduced to the voice class when
# traffic in the bulk class attempts to max out seems reasonably
# constant and with no loss, as shown with various ICMP tests, though
# there are strange curves where it drops dramatically for short times.
#
# NOTE:  altqd(8) using hfsc installs a TBR on the interface but doesn't
# delete it and reinstall it on reload, nor does it delete it on termination
# so it may have to be manually deleted with tbrconfig(8) if the bandwidth
# specification changes or the type of scheduler is changed.
#
interface hme0 bandwidth 800K hfsc

# note that with HFSC "root" is a keyword for the automatically defined
# root class
#
class hfsc hme0 def_class root pshare 10 default

class hfsc hme0 voice_class root pshare 30 grate 100K
        filter hme0 voice_class 0 0     0 0     0  tos 0x10 tosmask 0xFF        # TOS LOWDELAY

class hfsc hme0 user_class root pshare 10 grate 50K
        filter hme0 user_class  0 0     0 22    6                               # SSH
        filter hme0 user_class  0 22    0 0     6                               # SSH

class hfsc hme0 fast_class root pshare 10 grate 50K
        filter hme0 fast_class  0 0     0 53    17                              # dns UDP queries
        filter hme0 fast_class  0 0     0 53    6                               # dns TCP queries
        filter hme0 fast_class  0 0     0 0     1                               # outgoing ICMP



# using CBQ on the sparc seems to cause more bizarre fluctuations in
# delay than HFSC, and that is definitely not good for VoIP quality
#
#interface hme0 bandwidth 800k cbq
#
#class cbq hme0 root_class NULL pbandwidth 100
#
#  class cbq hme0 ctl_class root_class priority 6 pbandwidth 4 control
#  class cbq hme0 def_class root_class priority 1 borrow pbandwidth 96 default
#
#    class cbq hme0 voice_class def_class priority 6 borrow pbandwidth 26 
#      filter hme0 voice_class   0 0    0 0    0   tos 0x10 tosmask 0xFF   # TOS LOWDELAY
#
#    class cbq hme0 fast_class def_class priority 5 borrow pbandwidth 20
#      filter hme0 fast_class    0 0    0 53   17             # dns UDP queries
#      filter hme0 fast_class    0 0    0 53   6              # dns TCP queries
#      filter hme0 fast_class    0 0    0 0    1              # outgoing ICMP
#
###  class cbq hme0 other_class def_class priority 2 borrow pbandwidth 50

-- 
						Greg A. Woods

H:+1 416 218-0098  W:+1 416 489-5852 x122  VE3TCP  RoboHack <woods@robohack.ca>
Planix, Inc. <woods@planix.com>          Secrets of the Weird <woods@weird.com>