Subject: Re: what is the status of the ktrace-lwp branch?
To: Chuck Silvers <chuq@chuq.com>
From: Christos Zoulas <christos@zoulas.com>
List: tech-kern
Date: 10/30/2005 13:26:57
On Oct 30, 9:42am, chuq@chuq.com (Chuck Silvers) wrote:
-- Subject: Re: what is the status of the ktrace-lwp branch?
| this issue isn't endianness, it's 32-vs-64. the ktrace file format uses
| this structure:
|
| struct ktr_header {
| int ktr_len; /* length of ktr_buf */
| short ktr_type; /* trace record type */
| pid_t ktr_pid; /* process id */
| char ktr_comm[MAXCOMLEN+1]; /* command name */
| struct timeval ktr_time; /* timestamp */
| const void *ktr_buf;
| };
|
| and then can be a variable-length blob after that. the timeval, buf pointer
| (what the heck is a kernel pointer doing in a file format anyway?), and the
| contents of the variable blob will all be different when written by 32-bit
| vs. 64-bit kernels. there's no good way to tell which format the file is,
| so they're basically not portable between platforms. (oh, the implicit
| padding in this structure is bound to cause compatibility problems as well.)
Yes, but I am not sure if it is worth-while providing cross platform
portability for ktrace records. It is not very easy to achieve and will
bloat the 32 bit ktrace files.
| if we're going to make the records incompatible at all (which nick was
| proposing to do by changing the timeval to a timespec), then we should fix
| these portability issues at the same time, to minimize the number of flag days.
|
| if we make the assertion that old kdump binaries don't need to be able to
| read new ktrace files (which seems reasonable, we've done the same with
| both user and kernel coredumps before, and probably other things),
| then we can redefine the file format any way we want and fix everything
| at once, and build in support for better handling of future format changes
| too if we want.
They can now if you "ktrace -v 0". Even v1 ktrace records can be read
using an older kdump (the timestamps are not parsed correctly).
I've left two diffs on ftp.netbsd.org:~christos/ktrace-lwp-{user,kernel}.diff.
| > 2. I understand where chuq is coming from and I think I agree with him. I
| > think that the uio_ is a process wide struct and not an lwp specific one.
| > For example, what happens when the lwp exits before the uio is completed?
| > On the other hand, keeping it as uio_proc, would require us to call
| > proc_representative_lwp(uio->uio_proc) in a bunch of places; not very
| > attractive either.
|
| the uio_procp field was intended to identify the address space that the
| pointers in uio_iov refer to. as yamt points out, in our current system
| the address space corresponds to struct vmspace since multiple processes can
| share the same address space.
|
| if we want to handle async operations, we will need to make sure that our
| reference to the address space (via whatever structure) remains valid until
| all async operations complete. in some systems this is done by making
| exiting processes wait for aio to complete, but this could be achieved
| in other ways as well.
|
| currently all the operations that use of uio_procp to refer to the kernel LWP
| that issued the request are synchronous, to avoid the previous issue in
| the absence of any other way to keep the proc pointer valid. it doesn't
| really make sense that any of these would ever become async, especially
| because in our scheduler-activations world, the kernel LWP doesn't really
| map to anything in particular that the user process is doing. previously
| we were looking to see what all non-address-space information was being
| accessed via uio_procp so that we could consider passing this information
| in a more explicit fashion, and the people doing that kinda got bogged down
| in that process. I'll dig through my old mail and see what I can find.
So what should we do? Should we commit the diff with uio_lwp and fix it
later properly? Or should we come up with the vmspace diff now?
christos