Subject: Re: altq 3.0 and pppd ? ( log vs queue )
To: None <hwidarta@yahoo.com>
From: Kenjiro Cho <kjc@csl.sony.co.jp>
List: tech-net
Date: 01/23/2002 11:17:35
----Next_Part(Wed_Jan_23_11:17:35_2002_142)--
Content-Type: Text/Plain; charset=us-ascii
Content-Transfer-Encoding: 7bit


It turns out that the problem is caused by a simple merge mistake
when I back-ported if_ppp.c from KAME.

In KAME (and NetBSD-current),
   (1) "#include "opt_inet.h"" is moved upwards.
   (2) "#define VJC" is moved in "#ifdef INET".
I missed (1) and merged only (2).

Its result is pretty nasty:
The size of "struct ppp_softc" doesn't agree in if_ppp.c and ppp_tty.c
since part of the fields are in "#ifdef VJC".
pppasyncstart() sees sc_outm to be zero but pppintr() always evaluates
the same field to be non-zero, which leads to an infinite loop!

This only happens on altq-3.0 for NetBSD-1.5.2.  (KAME and
NetBSD-current are ok.)
The first patch attached fixes this.

To use ALTQ on kernel ppp, you also need the second patch for
ppp_tty.c in addition to the previous patch I posted.  This prevents
pppintr() from looping under rate-limiting.

-Kenjiro

> I finally set up pppd between 2 NetBSD-1.5.2/i386 using a cross serial
> cable.  I was able to reproduce a problem but the sympton isn't the
> same.
> An ALTQ patched kernel hungs up when pppd starts, apparently just
> after the first packet arrives.
> It is strange that a KAME kernel which has almost identical ALTQ code
> works fine.
> 
> 	OK	stock GENERIC
> 	NG	GENERIC with altq-3.0 patch applied
> 	OK	GENERIC.KAME from a KAME snap
> 
> For the NG case, I can break into DDB by Alt-Ctrl-ESC and the trace
> command shows it was in Xspllower() from pppintr().  Apparently, the
> kernel loops in Xspllower().
> I don't think ALTQ changes any interrupt handling but I'll take a
> closer look tomorrow.
> Does anyone have an idea about what's going on?
> 
> -Kenjiro

----Next_Part(Wed_Jan_23_11:17:35_2002_142)--
Content-Type: Text/Plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Content-Disposition: inline; filename="if_ppp.c.diff2"

--- if_ppp.c.old	Tue Jan 22 17:53:46 2002
+++ if_ppp.c	Tue Jan 22 20:15:37 2002
@@ -78,14 +78,14 @@
 #include "ppp.h"
 #if NPPP > 0
 
+#include "opt_inet.h"
+#include "opt_gateway.h"
+#include "opt_ppp.h"
+
 #ifdef INET
 #define VJC
 #endif
 #define PPP_COMPRESS
-
-#include "opt_inet.h"
-#include "opt_gateway.h"
-#include "opt_ppp.h"
 
 #include <sys/param.h>
 #include <sys/proc.h>

----Next_Part(Wed_Jan_23_11:17:35_2002_142)--
Content-Type: Text/Plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Content-Disposition: inline; filename="ppp_tty.c.diff"

--- ppp_tty.c-	Wed Jan 23 11:09:46 2002
+++ ppp_tty.c	Wed Jan 23 11:08:40 2002
@@ -894,6 +894,15 @@
     if ((CCOUNT(&tp->t_outq) >= PPP_LOWAT)
 	&& ((sc == NULL) || (sc->sc_flags & SC_TIMEOUT)))
 	return 0;
+#ifdef ALTQ
+    /*
+     * if ALTQ is enabled, don't invoke NETISR_PPP.
+     * pppintr() could loop without doing anything useful
+     * under rate-limiting.
+     */
+    if (ALTQ_IS_ENABLED(&sc->sc_if.if_snd))
+	return 0;
+#endif
     if (!((tp->t_state & TS_CARR_ON) == 0 && (tp->t_cflag & CLOCAL) == 0)
 	&& sc != NULL && tp == (struct tty *) sc->sc_devp) {
 	ppp_restart(sc);

----Next_Part(Wed_Jan_23_11:17:35_2002_142)----