Subject: select / poll per proc or lwp?
To: None <tech-kern@netbsd.org>
From: Matthias Drochner <M.Drochner@fz-juelich.de>
List: tech-kern
Date: 04/11/2003 17:47:54

--==_Exmh_12339607478740
Content-Type: text/plain; charset=us-ascii


Hi -

While trying to find a pattern in the states of hanging multithreaded
applications, it appeared that poll/select were involved, because
many threads were hanging right there.
As it looks now, that is probably not the cause, but it seems that the
implementation can be improved anyway:

If an lwp calls select now, the whole process is recorded in the
device's (or whatever resource's) selinfo instead of just the lwp.
If a device becomes ready and calls selwakeup, all lwps of the recorded
process which are in select or poll at that time are woken up,
independently of what they are waiting for.
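
As a rough illustration of the difference, here is a toy user-space model
(not the kernel code). The names toy_lwp, toy_selinfo and the two count
functions are made up for this sketch; only the idea of matching on the
PID versus on PID plus LID is taken from the description above.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model only: these types are hypothetical, not the kernel's. */
struct toy_lwp {
	int pid;        /* owning process */
	int lid;        /* lwp id within that process */
	int in_select;  /* nonzero if sleeping in select/poll */
};

struct toy_selinfo {
	int sel_pid;    /* recorded process */
	int sel_lid;    /* recorded lwp (used by the per-lwp variant only) */
};

/* Current behaviour: every selecting lwp of the recorded process wakes up. */
static int
count_woken_per_proc(const struct toy_selinfo *sip,
    const struct toy_lwp *lwps, size_t n)
{
	size_t i;
	int woken = 0;

	assert(sip != NULL && lwps != NULL);
	for (i = 0; i < n; i++)
		if (lwps[i].in_select && lwps[i].pid == sip->sel_pid)
			woken++;
	return woken;
}

/* Proposed behaviour: only the recorded lwp wakes up. */
static int
count_woken_per_lwp(const struct toy_selinfo *sip,
    const struct toy_lwp *lwps, size_t n)
{
	size_t i;
	int woken = 0;

	assert(sip != NULL && lwps != NULL);
	for (i = 0; i < n; i++)
		if (lwps[i].in_select && lwps[i].pid == sip->sel_pid &&
		    lwps[i].lid == sip->sel_lid)
			woken++;
	return woken;
}
```

With three lwps of one process sleeping in select and lwp 2 recorded,
the first function counts three wakeups and the second counts one.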

The current scheme might save a bit of code because comparing a PID is
cheaper than comparing both PID and LID, and it might be a bit more
efficient if two threads of one application are polling the same file
descriptor, because a collision is avoided in that case.
In general, however, it looks like a waste.
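
The collision trade-off can be sketched the same way. Again this is a
toy model with hypothetical names (toy_selslot, TOY_SI_COLL and the two
record functions), and it assumes the previously recorded waiter is
still asleep, which the real code checks by looking at l_wchan.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_SI_COLL 0x01	/* hypothetical stand-in for SI_COLL */

/* Toy model only: one recorded waiter per resource. */
struct toy_selslot {
	int sel_pid;	/* 0 means nobody is recorded */
	int sel_lid;
	int sel_flags;
};

/* Per-process recording: a second lwp of the same process is absorbed. */
static void
toy_record_per_proc(struct toy_selslot *sip, int pid)
{
	assert(sip != NULL);
	if (sip->sel_pid == pid)
		return;				/* same process, no collision */
	if (sip->sel_pid != 0)
		sip->sel_flags |= TOY_SI_COLL;	/* another process is waiting */
	else
		sip->sel_pid = pid;
}

/* Per-lwp recording: a second lwp of the same process now collides. */
static void
toy_record_per_lwp(struct toy_selslot *sip, int pid, int lid)
{
	assert(sip != NULL);
	if (sip->sel_pid == pid && sip->sel_lid == lid)
		return;				/* same lwp, nothing to do */
	if (sip->sel_pid != 0) {
		sip->sel_flags |= TOY_SI_COLL;	/* someone else is waiting */
	} else {
		sip->sel_pid = pid;
		sip->sel_lid = lid;
	}
}
```

Recording lwps 1 and 2 of the same process leaves the flags clear in the
per-process variant and sets the collision flag in the per-lwp variant.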

Comments / opinions?

I'm appending a patch which implements such a change. For a real
implementation, the lwp pointer would have to be passed through
the fileops (the patch just uses curlwp as a stopgap).
What I can tell is that it is not worse than before -- multithreaded
applications are still hanging :-)

best regards
Matthias



--==_Exmh_12339607478740
Content-Type: text/plain ; name="selectpatch"; charset=us-ascii
Content-Description: selectpatch
Content-Disposition: attachment; filename="selectpatch"

*** sys_generic.c.~1.72.~	Thu Mar 27 11:33:20 2003
--- sys_generic.c	Wed Apr  9 11:21:21 2003
***************
*** 956,969 ****
  	struct lwp	*l;
  	struct proc	*p;
  	pid_t		mypid;
  	int		collision;
  
  	mypid = selector->p_pid;
! 	if (sip->sel_pid == mypid)
! 		return;
  	collision = 0;
  	if (sip->sel_pid && (p = pfind(sip->sel_pid))) {
  		LIST_FOREACH(l, &p->p_lwps, l_sibling) {
  			if (l->l_wchan == (caddr_t)&selwait) {
  				collision = 1;
  				sip->sel_flags |= SI_COLL;
--- 956,979 ----
  	struct lwp	*l;
  	struct proc	*p;
  	pid_t		mypid;
+ 	lwpid_t		mylid;
  	int		collision;
  
+ 	if (curproc != selector)
+ 		printf("selrecord: curproc != selector\n");
  	mypid = selector->p_pid;
! 	mylid = (curlwp ? curlwp->l_lid : -1); /* XXX */
! 	if (sip->sel_pid == mypid) {
! 		if (sip->sel_lid == mylid)
! 			return;
! 		printf("selrecord: pid %d collision: %d/%d\n",
! 		       mypid, mylid, sip->sel_lid);
! 	}
  	collision = 0;
  	if (sip->sel_pid && (p = pfind(sip->sel_pid))) {
  		LIST_FOREACH(l, &p->p_lwps, l_sibling) {
+ 			if (l->l_lid != sip->sel_lid)
+ 				continue;
  			if (l->l_wchan == (caddr_t)&selwait) {
  				collision = 1;
  				sip->sel_flags |= SI_COLL;
***************
*** 971,978 ****
  		}
  	}
  
! 	if (collision == 0)
  		sip->sel_pid = mypid;
  }
  
  /*
--- 981,990 ----
  		}
  	}
  
! 	if (collision == 0) {
  		sip->sel_pid = mypid;
+ 		sip->sel_lid = mylid;
+ 	}
  }
  
  /*
***************
*** 999,1004 ****
--- 1011,1018 ----
  	sip->sel_pid = 0;
  	if (p != NULL) {
  		LIST_FOREACH(l, &p->p_lwps, l_sibling) {
+ 			if (l->l_lid != sip->sel_lid)
+ 				continue;
  			SCHED_LOCK(s);
  			if (l->l_wchan == (caddr_t)&selwait) {
  				if (l->l_stat == LSSLEEP)

--==_Exmh_12339607478740--