Subject: Re: CVS commit: src/sys/kern
To: Mindaugas R. <rmind@NetBSD.org>
From: Elad Efrat <elad@bsd.org.il>
List: source-changes
Date: 04/04/2007 15:12:31
Mindaugas R. wrote:
> Elad Efrat <elad@bsd.org.il> wrote:
> 
> [... skip ...]
>> the only way fileassoc_table_lookup() can fail in
>> fileassoc_table_delete() is if there's a locking violation. since
>> fileassoc lacks locking altogether, your introduced workaround isn't a
>> real fix, but rather a way to silence a coverity "bug".
> Hence, you did not prove me, that this change is somehow wrong. As you have
> shown, fail is simply not possible because of such circumstances (which is
> because of more abstract problem!) - but it DOES NOT MEAN, that the chain of
> NULLs is wrong. It may simply be unnecessary.

the problem is that fileassoc lacks locking. let's examine what your
commit actually says.

on entry to fileassoc_file_delete(), we lookup the file in question. if
it doesn't exist (or a table doesn't exist for it), we abort. if it
does, we continue.

later on, we lookup the table itself. according to your change, if it
doesn't exist, we return ENOENT. this is *after* a file_free() call that
(lacking return value :) succeeded in removing a file from the table. if
a table doesn't exist at this point, it means a concurrently running
operation removed the *entire* table, and there's no need to lower the
number of entries in it.

so if you *really* understood the code and wanted to do this ugly
workaround, it'd be:

	tbl = fileassoc_table_lookup(vp->v_mount);
	if (tbl == NULL)
		return (0); /* already gone */
	[...]

correct?

> Actually, leaving such parts may cause problems in future, when, for example,
> that will be able to really fail in this case. Then someone should inspect and
> handle such situations in code (and doesn't look that it would be you, even
> this is yours code). I would probably tend to disagree that looking in
> abstract, such behaviour is correct at all.

this would address (again, not "fix") *part* of the issue, making it
more confusing in the future, when locking is added. since I don't think
you're aware of why fileassoc lacks locking -- basically I waited for
ad@ to commit and stablize the newlock2 stuff -- I still believe you
modified a subsystem without understanding it at all.

therefore, I kindly ask you again to revert.

>> and, still, you did not tell me who okay'd this commit. <...>
> So as you are so experienced and enlightened, you may probably know, that it
> is not possible to every time get an OK from others. Of course, we could
> create tech-approves@, where you would be an officer with 24hours/7days per
> week support. Would you agree with this? :)

unfortunately this already exists as current-users@. luckily for netbsd,
there are some amazing people on it who put up with the lack of quality
assurance in this project and are quick enough to report and sometimes
fix bugs as they are committed. this is not how it should be though.

every new developer is sponsored *exactly* for the purpose of okay'ing
his commits. if this model doesn't work, then a better one should be
implemented. until then, I'd like to know how this commit, and the wskbd
one, are critical enough that nobody needed to okay them. I'd also like
to know why you didn't bother *compile test* before committing.

-e.