tech-kern archive


How to identify specific wait-state for a "DE" process?



Continuing on the saga of using filemon(4) and specifying STDOUT_FILENO for the activity log ... (I got the answer to my earlier question so quickly, I figure maybe I'll get lucky again!)

With the recently-committed change to spec_vnops.c rev 1.60, the filemon code now successfully writes activity entries to stdout. Everything is fine until the monitoring process (the one which has opened the filemon device) tries to exit.

Whether the exit is caused by a "return" from main() or by a ^C/SIGINT, the process hangs as soon as it calls sys_exit(). A 'ps' shows the state to be "DE" (uninterruptible device wait, exiting). The process cannot be killed, and it never seems to finish whatever it is waiting for.

I'm pretty sure that the device in question is the console terminal driver, /dev/console, since the problem does not happen if filemon is sending the entries to a "real" file. But I can't figure out why it is waiting, so I don't know what I should do to satisfy the wait state and let the exit continue.

I do have a suspicion, however!   :)

When the monitoring process is told to use a particular file descriptor for the activity log, filemon(4) uses fd_getfile(fd) to access the internal 'struct file'. At some point later, when it is time to write an activity entry, it calls

	(*filemon->fm_fp->f_ops->fo_write) (filemon->fm_fp,
	    &(filemon->fm_fp->f_offset),
	    &auio, curlwp->l_cred, FOF_UPDATE_OFFSET);

I'm guessing that there's some sort of refcount somewhere that isn't getting decremented, so the process cannot completely close its stdout descriptor. And because it hangs here, the filemon(4) exit code never gets a chance to clean up and call fd_putfile() (and decrement the refcount).

If the refcount issue is the correct diagnosis, what would be the best way to avoid it? Should the filemon(4) code install an at_exit() handler to take care of the call to fd_putfile() ? Is there something better?

-----

SIDEBAR #1: The man page for fd_getfile(9) still shows two arguments for fd_getfile(), although the current code takes only the descriptor number (see SIDEBAR #3):

	struct file *
	fd_getfile(struct filedesc *fdp, int fd);

-----

SIDEBAR #2: There does not seem to be any man page for fd_putfile(), and fd_putfile() is not mentioned on the filedesc(9) page.

-----

SIDEBAR #3: The man page entry for fd_getfile(9) does not mention the fact that a refcount is incremented! The code in kern/kern_descrip.c is, however, pretty clear in its comments:

	/*
	 * Look up the file structure corresponding to a file descriptor
	 * and return the file, holding a reference on the descriptor.
	 */
	file_t *
	fd_getfile(unsigned fd)
	{
	...

-----

+------------------+--------------------------+------------------------+
| Paul Goyette     | PGP Key fingerprint:     | E-mail addresses:      |
| (Retired)        | FA29 0E3B 35AF E8AE 6651 | paul at whooppee.com   |
| Kernel Developer | 0786 F758 55DE 53BA 7731 | pgoyette at netbsd.org |
+------------------+--------------------------+------------------------+

