Subject: Re: GCC3.3.1 switch coming soon.
To: Nathan J. Williams <nathanw@wasabisystems.com>
From: Andrew Brown <atatat@atatdot.net>
List: tech-kern
Date: 09/24/2003 11:04:36
>> from UVM's standpoint, there's no real difference between the initial stack
>> for a process and any other amap-backed mapping.  other pthreads will have
>> their own stacks, and we don't want to treat all of those specially,
>> so we shouldn't treat the initial stack specially either.
>
>There is one difference that would be useful. Currently, since the
>kernel tracks the lowest used address of the initial stack, the
>core-dumping routine only dumps the used portion, not the entire
>mapped range. It would be nice (and save a lot of space in the cores
>of most threaded programs) if we could also only dump the used portion
>of other stacks. This probably calls for cleverness on the part of the
>core-dumping code, though, rather than in the VM layer.

indeed.  'twould be nice if we could get rid of a lot of those large
chunks of zeroes.  while i realize the amap chunking is nice, it
forces us to end up with a lot of map entries.  if i make a one megabyte 

it would also keep stack segments separate and distinct from other
segments in the core dumps.

>(There are also yellow-zone stack tricks that could be useful, but I
>think those could be managed in userlevel with existing mprotect() and
>faulting-address information).
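
a rough sketch of what such a userlevel yellow zone might look like,
using only mmap() and mprotect().  the struct and function names here
are made up for illustration, and a one-page zone is an assumption.
the idea is that a SIGSEGV handler would pass the faulting address
(si_addr) to in_yellow_zone() and, on a hit, open the zone so the
thread can keep running long enough to report the overflow:

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <unistd.h>

/* hypothetical bookkeeping for one stack with a yellow zone at its low end */
struct guarded_stack {
	char *base;	/* lowest mapped address; start of the yellow zone */
	size_t zone;	/* yellow-zone length in bytes (one page here) */
	size_t size;	/* total mapping length, zone included */
};

/* map "usable" bytes (rounded up to pages) plus one inaccessible page below */
static int
guarded_stack_alloc(struct guarded_stack *gs, size_t usable)
{
	size_t page = (size_t)sysconf(_SC_PAGESIZE);
	void *p;

	assert((page & (page - 1)) == 0);	/* rounding needs a power of two */
	usable = (usable + page - 1) & ~(page - 1);
	p = mmap(NULL, usable + page, PROT_READ|PROT_WRITE,
	    MAP_PRIVATE|MAP_ANON, -1, 0);
	if (p == MAP_FAILED)
		return -1;
	if (mprotect(p, page, PROT_NONE) == -1) {
		munmap(p, usable + page);
		return -1;
	}
	gs->base = p;
	gs->zone = page;
	gs->size = usable + page;
	return 0;
}

/* true if addr (e.g. si_addr from a SIGSEGV handler) is inside the zone */
static int
in_yellow_zone(const struct guarded_stack *gs, const void *addr)
{
	uintptr_t a = (uintptr_t)addr, b = (uintptr_t)gs->base;

	return a >= b && a < b + gs->zone;
}

/* make the zone usable so the overflowing thread can clean up */
static int
yellow_zone_open(const struct guarded_stack *gs)
{
	return mprotect(gs->base, gs->zone, PROT_READ|PROT_WRITE);
}
```

the signal handler itself (and running it on an alternate stack) is
left out here.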

it's also perhaps worth mentioning that freebsd has a MAP_STACK flag
to mmap() that, if i understand it correctly, has the effects that:

* "addr" is the "stack top", as is the address mmap() returns, meaning
that if you write to addr, you'll fault (unless there's something
already there), but that you can walk backwards "size" bytes from it.

* the region is expected to grow backwards from top to bottom (ie, you
decrement your "stack pointer" and push values in when using it).

to put this another way:

	void *a, *b;
	/* the address is only a hint here, since MAP_FIXED is not given */
	a = mmap((void *)0x40000000, 4096, PROT_READ|PROT_WRITE, MAP_ANON, -1, 0);
	b = mmap((void *)0x40000000, 4096, PROT_READ|PROT_WRITE, MAP_STACK, -1, 0);

will both succeed and return 0x40000000.  the first call causes the
page at 0x40000000 to be allocated and the second call causes the page
*prior* to 0x40000000 to be allocated.  note that i'm basing this on
skimming the code and the man page, not on actual experience.
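
for comparison, here is a minimal sketch of getting the same "top and
walk backwards" view out of a plain anonymous mapping, without
MAP_STACK.  it does not reproduce the freebsd flag's behaviour (the
whole region is mapped up front, at an address the kernel picks), and
the struct and function names are hypothetical:

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

/* hypothetical descriptor for a region that is used from the top down */
struct stack_region {
	void *base;	/* lowest mapped address */
	void *top;	/* one past the highest mapped byte; initial "sp" */
};

/* map size bytes anywhere and record both ends; 0 on success, -1 on error */
static int
stack_region_alloc(struct stack_region *sr, size_t size)
{
	void *p = mmap(NULL, size, PROT_READ|PROT_WRITE,
	    MAP_PRIVATE|MAP_ANON, -1, 0);

	if (p == MAP_FAILED)
		return -1;
	sr->base = p;
	sr->top = (char *)p + size;
	return 0;
}

/* decrement the "stack pointer", then store, as a downward stack does */
static void
stack_push(const struct stack_region *sr, char **sp, char value)
{
	assert(*sp > (char *)sr->base);	/* refuse to run off the bottom */
	*sp -= 1;
	**sp = value;
}
```

note that writing at sr->top itself is outside the mapping, which
matches the "if you write to addr, you'll fault" description above.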

it might be worth considering bringing this in, at least for emulation
purposes...

-- 
|-----< "CODE WARRIOR" >-----|
codewarrior@daemon.org             * "ah!  i see you have the internet
twofsonet@graffiti.com (Andrew Brown)                that goes *ping*!"
werdna@squooshy.com       * "information is power -- share the wealth."