Subject: Re: LWP id into ktrace output - chapter 2.
To: Nathan J. Williams <nathanw@wasabisystems.com>
From: Darren Reed <darrenr@reed.wattle.id.au>
List: tech-kern
Date: 03/22/2003 19:24:14
In some email I received from Nathan J. Williams, sie wrote:
> 
> This looks like pretty cool work. When I did the original
> LWPification, I avoided a lot of this because after some analysis I
> couldn't find a reason to propagate LWPs that low. It does appear that
> ktrace is such a reason, though.

Thanks for reviewing the patches, I'll take on board most of the comments
and post a new set of diffs when done.  There are a few things I do want
to discuss, however, where I think we need to find a better way or I
think something should stay or....anyway...

> My biggest concern is about the changes in struct uio and struct
> buf. The proc pointers that they currently hold are there because the
> I/O in question is happening to that process's memory space; the
> struct proc is used for its vmspace and a couple of other things, like
> P_WEXIT, that are truly process-level and not lwp-level. Since LWPs
> are relatively transient compared to processes, it seems that there is
> a risk of LWPs being cached in these structures that do not exist when
> the structure is used a bit later. I'm not 100% up to speed on the
> myriad uses of the two structures, so I can't say with confidence that
> it's safe or unsafe, but I think it needs closer examination.

Ok, the validity of the lwp pointer in comparison to that of proc was
not something I gave much thought to and I can see what your concern
is here.

I think there are at least two different possible solutions here:

1. put proc and lwp into uio/buf and when lwp is needed, check that
   it is still valid for the process.  This is handle when an lwp
   goes to sleep for I/O, the LWP dies but the I/O is still oustanding.

Hmmm, having said that, something seems wrong with that picture...

2. Shouldn't terminating the LWP also terminate the outstanding I/O
   request associated with it or does going down this path start to
   create "heavier" LWP's ?

> arch/x86_64/x86_64/syscall.c
> @@ -317,10 +317,9 @@
>  	KERNEL_PROC_UNLOCK(l);
>  
>  	userret(l);
> -#ifdef KTRACE
>  	if (KTRPOINT(p, KTR_SYSRET)) {
>  		KERNEL_PROC_LOCK(l);
> -		ktrsysret(p, SYS_fork, 0, 0);
> +		ktrsysret(l, SYS_fork, 0, 0);
>  		KERNEL_PROC_UNLOCK(l);
>  	}
>  #endif
> 
> Seems wrong.

Is there something wrong with this besides it being for SYS_fork ?
I realise this "seems wrong" but not much else can be done here :(
The process gets fork()'d, but the fork happens on a thread.

> Why make the first two changes? It seems less invasive to have the
> "struct proc *p = l->l_proc;" line and leave the stakgap_init() line
> alone; generally, that's why I added those "struct proc *p"
> definitions.

It seemed like 'struct proc *p' was serving no other purpose than
as something to put a temporary value in for passing to
stackgap_init(), so I garbage collected it.

> All of compat/netbsd32/netbsd32_exec_elf32.c,
> compat/svr4/svr4_exec_elf64.c, compat/osf1/osf1_exec_ecoff.c, and
> compat/irix/irix_exec_elf32.c use LIST_FIRST(&p->p_lwps) when they
> call emul_find_interp(). What state is the process in at that point?
> Which process calls those routines?

Ok, this is a point at which I arbitrarily drew a line on changing
'struct proc *' to 'struct lwp *' in function parameters.
 
I should go back and fix this so they get a 'struct lwp *' instead.

> dev/dksubr.c: In dk_lookup(), the conversion looks unfinished. p isn't
> initialized (needs to be for p->p_ucred), and there are some vn_*()
> calls still using p where l seems appropriate.

Ah.  GENERIC doesn't include "cgd" or at least _my_ GENERIC doesn't, so
I haven't compiled that code.  Looking at the GENERIC I was using, there
are a number of other things that weren't compiled in that I'll add.

> rf_copyback.c: LIST_FIRST().
> rf_disks.c: LIST_FIRST().
> rf_reconstruct.c: LIST_FIRST().
> 
> I'm not terribly familiar with the raidframe code, but since it seems
> that they're just pulling the LWP off of *_thread variables, perhaps
> the RF_Thread_t type should be made an LWP pointer itself.

Ok.  That makes sense.

> Similar issue in the USB code: it currently uses usb_proc_ptr for
> cross-BSD portability; would it be feasable to make usb_proc_ptr into
> struct lwp * instead of creating local differences?

Maybe.  I'll go the raidframe case because the type name is accurate,
but for usb_proc_ptr to be a lwp* ?  Hmmm.  If you or someone else is
ok with that just changing the meaning of usb_proc_ptr to be a lwp
pointer, I'll do it - it's just not the sort of naming change I'd feel
comfortable doing without consultation.

> +void bpx(struct proc *);
> +void bpx(p)
> +struct proc *p;
> +{
> +	static int pid;
> +	pid = p->p_pid;
> +}
> +
> 
> What's this about?

Ooops :-)  Debugging cruft for being able to set a breakpoint in the
kernel from ddb O:-)

Thanks,
Darren