Subject: SUP lossage, take N...
To: None <current-users@NetBSD.ORG>
From: Jason Thorpe <thorpej@NetBSD.ORG>
List: current-users
Date: 03/30/1996 11:23:56
OK ... I *really* think I have it this time...

The SUP server got into this sort of state:

{root}netbsd# netstat -f inet | grep sup
tcp       39      0  netbsd.supfiles        qabalah.cac.psu..1199  CLOSE_WAIT
tcp       39      0  netbsd.supfiles        scrap.cs.colorad.1038  CLOSE_WAIT
tcp       93      0  netbsd.supfiles        infpc01.informat.1075  CLOSE_WAIT
tcp       93      0  netbsd.supfiles        tanelorn.tky.hut.1026  CLOSE_WAIT
tcp       39      0  netbsd.supfiles        scrap.cs.colorad.1037  CLOSE_WAIT
tcp       39      0  netbsd.supfiles        tanelorn.tky.hut.1025  CLOSE_WAIT
tcp       39      0  netbsd.supfiles        scrap.cs.colorad.1036  CLOSE_WAIT
tcp       39      0  netbsd.supfiles        scrap.cs.colorad.1035  CLOSE_WAIT

I killed and restarted the server, and I was able to sup allsrc in its 
entirety...

And it looks like there is some success now:

{root}netbsd# netstat -f inet | grep sup
tcp        0      0  netbsd.supfiles        talc.rendition.c.1386  SYN_RCVD
tcp        0      0  netbsd.supfiles        gate.ene.unb.br.1331   ESTABLISHED
tcp        0  16200  netbsd.supfiles        pingu.math.hokud.1549  ESTABLISHED
tcp        0      0  netbsd.supfiles        pgoyette.bdt.com.1047  ESTABLISHED
tcp        0      0  netbsd.supfiles        talc.rendition.c.1385  TIME_WAIT
tcp        0      0  netbsd.supfiles        cynic.portal.ca.1157   TIME_WAIT
tcp        0      0  netbsd.supfiles        pgoyette.bdt.com.1046  TIME_WAIT
tcp        0      0  netbsd.supfiles        pgoyette.bdt.com.1045  TIME_WAIT
tcp        0      0  netbsd.supfiles        pgoyette.bdt.com.1044  TIME_WAIT
tcp        0  15528  netbsd.supfiles        crystal.PEAK.ORG.1492  ESTABLISHED
tcp        0   8259  netbsd.supfiles        ftp.cs.umn.edu.10382   ESTABLISHED
tcp        0      0  netbsd.supfiles        dogbert.cs.chalm.4743  ESTABLISHED
tcp        0  15687  netbsd.supfiles        tia1.eskimo.com.1808   ESTABLISHED
tcp        0  15207  netbsd.supfiles        white.dogwood.co.1218  ESTABLISHED
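
For anyone who wants to keep an eye on this kind of thing, here is a rough
sketch (Python, not part of SUP) that tallies connections per TCP state
from netstat-style output.  The function name tally_states is made up for
illustration, and the code assumes the six-column layout shown in the
listings above (proto, Recv-Q, Send-Q, local, foreign, state).  A pile of
CLOSE_WAIT entries means the remote end has closed the connection but the
local process has not yet closed its socket.

```python
from collections import Counter


def tally_states(netstat_output: str) -> Counter:
    """Count TCP connections per state in netstat-style text.

    Assumes six whitespace-separated columns per connection line:
    proto, Recv-Q, Send-Q, local address, foreign address, state.
    Lines that do not match that shape are ignored.
    """
    counts = Counter()
    for line in netstat_output.splitlines():
        fields = line.split()
        if len(fields) == 6 and fields[0].startswith("tcp"):
            counts[fields[5]] += 1
    return counts


# A few lines copied from the two listings above, used as sample input.
sample = """\
tcp       39      0  netbsd.supfiles        qabalah.cac.psu..1199  CLOSE_WAIT
tcp       93      0  netbsd.supfiles        infpc01.informat.1075  CLOSE_WAIT
tcp        0  16200  netbsd.supfiles        pingu.math.hokud.1549  ESTABLISHED
"""

for state, n in sorted(tally_states(sample).items()):
    print(state, n)
```

Feeding it the saved output of "netstat -f inet | grep sup" should give a
quick count of how many connections are stuck in CLOSE_WAIT versus
actually transferring.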

In case anyone's wondering why I mucked with it in the first place: I 
needed to add some things to the current collection, and it became clear 
that a revision-controlled, automated way of updating the server configs 
was needed.  Six typos later, the server got confused (though I can't 
think of any reason why a failing script would have caused the server to 
wedge in CLOSE_WAIT).  The CLOSE_WAIT lossage could have been caused by 
intermediate circuit problems ...

Again, I apologize for any inconvenience this may have caused you.  If 
you observe any further problems, please let me know... (I know I'm going 
to regret saying that... :-)

Jason R. Thorpe
NetBSD Core Group
<thorpej@NetBSD.ORG>