Subject: Re: ALTQ question
To: None <>
From: Miles Nordin <carton@Ivy.NET>
List: tech-net
Date: 03/10/2005 19:38:30
Content-Type: text/plain; charset=US-ASCII

>>>>> "rp" == Rimantas Petrauskas <> writes:

    rp> 1. 100 clients which must have 1000Kbps international traffic
    rp> and 2Mbps lithuanian traffic, but they must share that portion
    rp> of bandwidth into equal parts

I'm using ALTQ, and it can do this.  Your numbers don't add up to full
use of the link, though.  I don't understand why.

You should use ALTQ in the pflkm kernel module from pkgsrc.  You'll
have to

 1. cd /usr/pkgsrc/security/pflkm && make extract

 2. download 2.0-release kernel sources and make sure they are in /usr/src

 3. apply two patches to your kernel sources

    a. /usr/pkgsrc/security/pflkm/work/pflkm-20041204/patches/if_events.diff

    b. /usr/pkgsrc/security/pflkm/work/pflkm-20041204/patches/altq.diff

 4. set ' ifevents altq' in /etc/mk.conf and 'make
    install' the pflkm package

 5. build and install the new kernel, and add the /usr/pkg/lkm/pf.o to
    your /etc/lkm.conf

 6. configure ALTQ in /etc/pf/pf.conf

It's important to use pflkm rather than stock NetBSD, and to apply the
pflkm patches, because the version of ALTQ that comes with PF is much
newer than the one in NetBSD.  It adds an important new service
curve to the HFSC scheduler, upper-limit, that you need for your
application, and it has fewer bugs.

The scheduler you should use is HFSC.  For what you are doing you'll
use the link-sharing and upper-limit service curves.

It is an artifact of the pfctl/pf.conf syntax that you must always set
a link-sharing service curve with HFSC if you wish to specify bandwidth
in percentages: the % is always computed from the link-sharing curve.
So set the link-sharing curve with the 'bandwidth' keyword and the
upper-limit with the 'hfsc(upperlimit XXX)' keyword, and never use
the 'hfsc(linkshare XXX)' keyword, because you set it with
'bandwidth' instead.
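
A minimal pf.conf sketch of that convention might look like this.  The
queue names and the fxp0 interface are placeholders, the 'other'
default queue is an assumption (pfctl expects HFSC to have a default
queue), and the fragment is untested.

    # 'bandwidth' sets the link-sharing curve; a percentage on a
    # child queue is taken from its parent's 'bandwidth'.
    altq on fxp0 hfsc bandwidth 100Mb queue { capped, other }

    # link-share of 10Mb, and never more than 10Mb even when the
    # interface is otherwise idle
    queue capped bandwidth 10% hfsc(upperlimit 10Mb)

    # catch-all for packets that no rule assigns to a queue
    queue other  bandwidth 5%  hfsc(default)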

You will make a tree of queues:

                             root 100mbit/s
                      ____bandwidth 100%_______
                     /                         \
               .lt                          rest of world
            upperlimit 10mbit/s             upperlimit 4mbit/s
            bandwidth 10mbit/s              bandwidth 4mbit/s
           /             \                  /              \
    group 1            group 2          group1            group2
 bandwidth 2mbit/s   bandwidth 8mbit/s  bandwidth 1mbit/s bandwidth 3mbit/s
   |          | |     |            ||     |          |  |   |            | |
   |          | |     |            ||     |          |  |   |            | |
node1         | |   node1          ||   node1        |  | node1          | \
bandwidth 1%  | |   bandwidth 0.3% ||   bandwidth 1% |  | bandwidth 0.3% |  |
              | |                  ||                |  |                |  |
        node2 / |            node2 /|       node2   /   |     node2      |  |
        bandwidth 1%         bandwidth 0.3% bandwidth 1%|     bandwidth 0.3%|
                |                   |                   |                   |
                \                   |                   |                   |
                 ...               ...                  ...               ...
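
In pf.conf, that tree could be sketched roughly as follows.  The
interface name, the queue names, the client address, the <lt_nets>
table and its contents, and the 'other' default queue are all
assumptions for illustration.  Only two nodes per group are shown, and
the fragment is untested.

    ext_if  = "fxp0"                    # placeholder interface
    client1 = "10.0.1.1"                # placeholder client address
    table <lt_nets> { 192.0.2.0/24 }    # placeholder for .lt prefixes

    altq on $ext_if hfsc bandwidth 100Mb queue { lt, world, other }
    queue other bandwidth 1Mb hfsc(default)

    # the two "virtual interfaces", each held to its own capacity
    queue lt    bandwidth 10Mb hfsc(upperlimit 10Mb) { lt_g1, lt_g2 }
    queue world bandwidth 4Mb  hfsc(upperlimit 4Mb)  { w_g1, w_g2 }

    # groups share their parent by link-sharing only
    queue lt_g1 bandwidth 2Mb { lt_g1_n1, lt_g1_n2 }
    queue lt_g2 bandwidth 8Mb { lt_g2_n1, lt_g2_n2 }
    queue w_g1  bandwidth 1Mb { w_g1_n1, w_g1_n2 }
    queue w_g2  bandwidth 3Mb { w_g2_n1, w_g2_n2 }

    # one leaf per client; the % is relative to the parent group
    queue lt_g1_n1 bandwidth 1%
    queue lt_g1_n2 bandwidth 1%
    queue lt_g2_n1 bandwidth 1%
    queue lt_g2_n2 bandwidth 1%
    queue w_g1_n1  bandwidth 1%
    queue w_g1_n2  bandwidth 1%
    queue w_g2_n1  bandwidth 1%
    queue w_g2_n2  bandwidth 1%

    # classify one client's traffic; other clients follow the pattern
    pass out on $ext_if from $client1 to <lt_nets> \
        keep state queue lt_g1_n1
    pass out on $ext_if from $client1 to ! <lt_nets> \
        keep state queue w_g1_n1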

There are a few things to notice here.

 1. the use of upperlimit.  You are really queueing two virtual
    interfaces, a lithuanian interface and a rest-of-world interface.
    On some router somewhere in your AS, these are two different
    interfaces.  In the theory papers, you'd be expected to run two
    HFSC's, one on the lithuanian interface, and one on the
    rest-of-world interface.  You run HFSC _on the interface_ that is
    a scarce resource.

    But in practice, you can't do that.  You want to shape the traffic
    before it gets to these interfaces, as it heads out a single
    100mbit/s ethernet or something, and at some later point a router
    not running HFSC will split it .lt/*.  That's why we have two
    children of the root queue with the 'upperlimit' set.  With
    link-sharing, HFSC will let queues use extra bandwidth until the
    interface is full.  But here, even if the ethernet on which ALTQ
    is running has excess capacity, HFSC will drop packets if the
    upperlimit is exceeded.

    upperlimit is only in PF/ALTQ, not in NetBSD ALTQ.  You have to
    use pflkm because this isn't even in -current yet.

 2. The way HFSC shares bandwidth along this tree.  If there are 100
    clients active in group 1, and just 1 client active in group 2,
    then for international traffic the group 1 clients will get
    10kbit/s each (group 1's 1mbit/s split 100 ways), and the single
    group 2 client will get 3Mbit/s.

    However, if there is one group 1 client active, and no group 2
    clients active at all, the group 1 client will get the whole
    rest-of-world 4Mbit/s.  Probably you want this.  If you don't,
    for some marketing reason, use the 'upperlimit' keyword on the
    client's queue.

 3. ALTQ is meant for _outgoing interfaces only_.  In general, shaping
    inbound traffic doesn't work as well.  You shape traffic on the
    way out.  If you want to shape incoming traffic, you are supposed
    to run ALTQ on a box at your peer, before the traffic enters the
    slow link.

    That said, if you constrain your use of ALTQ to a two-interface
    box, you can use the 'upperlimit' keyword to shape incoming
    traffic as it _exits_ the two-interface router: use it on your
    ``inside'' interface.  If you cap incoming traffic at, say, 70%
    of your downlink capacity, TCP congestion control will help slow
    down the senders and keep the RTT low for clients that aren't
    using their full share of bandwidth.  This works a little bit for
    me on my home DSL connection, but I am trying to get an upstream
    router at my ISP because it doesn't work great.  For upstream
    scheduling, ALTQ does work great.
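
The two uses of 'upperlimit' from points 2 and 3 could be sketched in
pf.conf like this.  The queue names, the 1Mb per-client cap, the fxp1
inside interface, and the 1500Kb downlink figure are assumptions, not
values from your setup, and the fragment is untested.

    # point 2: a client's leaf queue under rest-of-world group 1,
    # capped so a lone client cannot borrow its siblings' idle share
    queue w_g1_n1 bandwidth 1% hfsc(upperlimit 1Mb)

    # point 3: on the inside interface, hold traffic that arrived
    # over a 1500Kb downlink to about 70% of that as it exits
    altq on fxp1 hfsc bandwidth 100Mb queue { inbound }
    queue inbound bandwidth 1050Kb hfsc(default upperlimit 1050Kb)

Making 'inbound' the default queue means everything leaving fxp1 is
capped, including traffic the router itself originates toward the
inside network.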

I've been using ALTQ for a while in a pretty complicated setup and
would be happy to help you out.  You really need to use the pflkm
though.  The old ALTQ without 'upperlimit' is not flexible enough to
do what you want, and is a pain in the ass to use too.
