current-users: PPP crashes

Subject: PPP crashes
To: None <current-users@netbsd.org>
From: Michael L. VanLoon -- HeadCandy.com <michaelv@HeadCandy.com>
List: current-users
Date: 10/04/1994 21:54:29
My office-mate, explorer@vorpal.com, and I have a 386DX25 NetBSD box
in our office at Iowa State that we use as a dial-in PPP network
server for our home systems.  We have just the two of us on it, but we
both are connected pretty much 24-hours/day.  It does nothing but sit,
monitor turned off, and route PPP/ethernet traffic for us.

He has been using it for many months with a dedicated SLIP line (all
by himself); I only started using it in July.  When he had it to
himself with just one SLIP line, running binaries from (I think)
somewhere around April, he had an uptime of something like 73 days on
the machine.  Now we can't seem to get over a week without it locking
up on us.  (Now he's running PPP instead of SLIP, and I've been running
PPP since July.)

If we build a kernel without DDB, in the hope that it will just
reboot when it panics, we get no luck, because the machine locks up
after scrolling a bunch of vm_fault/page fault trap messages on the
display.  The machine just died again tonight after being up only two
days (the first time he's run continuous PPP instead of SLIP).  We had
DDB in this kernel.

This is the message that was on the display when I came into the
office after the barf:

vm_fault(f8242000, f883a000, 1, 0) -> 1
kernel: page fault trap, code=0
Stopped at  _pppstart+0x35b:  movb  0(%eax, %edx, 1), %dl
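
For anyone reading the fault line above, here is a minimal sketch of how
its fields could be pulled apart.  It assumes the 4.4BSD-style argument
order vm_fault(map, vaddr, fault_type, change_wiring), hex formatting
for every field, and Mach-style protection bits (read=1, write=2,
execute=4); none of that is confirmed in this message, and the helper
name decode_vm_fault is made up for illustration.

```python
import re

# Assumed argument order, following the 4.4BSD-style
# vm_fault(map, vaddr, fault_type, change_wiring) convention.
FAULT_RE = re.compile(
    r"vm_fault\(([0-9a-f]+), ([0-9a-f]+), ([0-9a-f]+), ([0-9a-f]+)\) -> ([0-9a-f]+)"
)

# Assumed Mach-style protection bits: read=1, write=2, execute=4.
PROT_BITS = {1: "read", 2: "write", 4: "execute"}

PAGE_SIZE = 4096  # i386 page size


def decode_vm_fault(line):
    """Split a 'vm_fault(...) -> rv' console line into named fields.

    Hypothetical helper: returns None if the line does not match.
    """
    m = FAULT_RE.search(line)
    if m is None:
        return None
    # Every field is assumed to be printed in hex without a 0x prefix.
    vm_map, vaddr, ftype, wiring, rv = (int(g, 16) for g in m.groups())
    access = [name for bit, name in PROT_BITS.items() if ftype & bit]
    return {
        "map": vm_map,
        "page": vaddr,
        "page_aligned": vaddr % PAGE_SIZE == 0,
        "access": access,
        "wiring": wiring,
        "result": rv,
    }


info = decode_vm_fault("vm_fault(f8242000, f883a000, 1, 0) -> 1")
print(hex(info["page"]), info["access"], info["result"])
```

Under those assumptions the line describes a failed read of a kernel
address in page f883a000, which would fit the movb at _pppstart+0x35b
loading a byte from %eax+%edx.  A nonzero result of 1 would, I believe,
be the Mach "invalid address" code, but treat that as a guess.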

When I did a trace command, I got this:

_pppstart(f86b5d00)  at  _pppstart+0x35b
_comintr(f8672f80)  at  _comintr+0x24d
_Xintr4(0, 0, 0, 100, f7bfff58)  at  _Xintr4+0x55
bpendtsleep(f81a3dc8, 118, f81142c0, 0, f7bfce34)  at  bpendtsleep
_select(f867ba00, f7bfff94, f7bfff8c)  at  _select+0x26c
_syscall()  at  _syscall+0x10c
--- syscall (number 93) ---

This has been going on, for the most part, since I started doing
full-time PPP at the end of July or beginning of August.  No
combination of kernel rebuilds, changed config options, or juggled
hardware has made any difference.  There definitely seems to be a bug
somewhere, either in the kernel's VM handling of PPP stuff or in the
PPP code itself.  I dunno...  I don't know where to start looking.

I think I remember it complaining about multiple frees before hanging
one time, also (without DDB).  Has anyone else seen anything like
this?  My home machine hasn't died of any PPP problems the entire time
this has been going on (it's a 486DLC).  The machine in the office is
a Zenith 386DX25 (no FPU), with a very small external cache, IDE hard
drive, a 16550 on explorer's modem, and a Hayes ESP running my modem.

This doesn't seem directly traffic-related, since I can suck
literally TONS of stuff down over the link without a problem.  It
seems more like it goes for a certain amount of time, then dies.  Like
something is filling up or overflowing... or something along those
lines.  These are just severely uneducated guesses.

Any help on this matter would be greatly appreciated.  If there is anything
more detailed I can look for to help shed some light on this the next
time it crashes and I'm in the debugger, please let me know.
Thanks...

				--Michael

-----------------------------------------------------------------------------
   Michael L. VanLoon     michaelv@HeadCandy.com     michaelv@iastate.edu
  Free your mind and your machine -- NetBSD free un*x for PC/Mac/Amiga/etc.
     Working NetBSD ports: 386+PC, Mac, Amiga, HP300, Sun3, Sun4c, PC532
               In progress: DEC pmax (MIPS R2k/3k), VAX, Sun4m
-----------------------------------------------------------------------------