lib/50143: various bugs/weirdness with parsedate(3) (libutil) (+ some suggested fixes)

To: lib-bug-people%netbsd.org@localhost,gnats-admin%netbsd.org@localhost,netbsd-bugs%netbsd.org@localhost
Subject: lib/50143: various bugs/weirdness with parsedate(3) (libutil) (+ some suggested fixes)
From: kre%munnari.OZ.AU@localhost
Date: Tue, 11 Aug 2015 19:20:00 +0000 (UTC)
>Number:         50143
>Category:       lib
>Synopsis:       various bugs/weirdness with parsedate(3) (libutil) (+ some suggested fixes)
>Confidential:   no
>Severity:       non-critical
>Priority:       medium
>Responsible:    lib-bug-people
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Tue Aug 11 19:20:00 +0000 2015
>Originator:     Robert Elz
>Release:        NetBSD 7 & -current (2015-08-11) (NetBSD 6 has different bugs)
>Organization:
	Prince of Songkla University
>Environment:
This is where I'm running send-pr and is not material...
System: NetBSD perseus.noi.kre.to 6.99.17 NetBSD 6.99.17 (GENERIC) #1: Fri Feb 22 22:09:50 ICT 2013 kre%jade.coe.psu.ac.th@localhost:/usr/obj/current/i386/sys/arch/i386/compile/GENERIC i386
Architecture: i386
Machine: i386
>Description:
	Sorry, this is going to be a long PR - I thought it better to
	send one PR containing all of the problems with parsedate(3) I've
	noticed today, rather than 4 or 5 (or whatever) for this one
	relatively obscure libutil function.

	If you don't care about the grizzly internals of time related
	functions, you should probably just delete any e-mail about
	this PR (including this message).   That probably leaves about
	3 or 4 of us who care...

	This issue is not totally irrelevant only because parsedate() is
	used to implement date(1)'s -d option (which is where I started).

	I had (have) a need to use date -d in a script, and wanted to
	know what format in which I should generate the date for it to
	parse.  date(1) just refers to parsedate(3) - but unfortunately
	that man page is not very useful - it just handwaves over what
	formats are accepted, and what they mean (aside from now being
	somewhat out of date - more on that later).

	So, on to the source.   While looking at the yacc input to work
	out a good format for me to use (the one I picked will be
	obvious in the examples below) I first noticed ...

	#define HOUR(x)         ((time_t)(x) * 60)  

	which looks OK, but I wondered who would want hours converted to
	minutes, rather than seconds which is the normal basic time unit,
	so

		grep 'HOUR(' parsedate.y

	and found stuff like ...

	 { "est",    tZONE,     HOUR( 5) },  /* Eastern Standard */
	 { "edt",    tDAYZONE,  HOUR( 5) },  /* Eastern Daylight */

	which was all fine, well sort of ... it could use HOUR(4) for
	"edt" which would make the offset explicit, rather than just
	setting tDAYZONE rather than tZONE and then applying a one hour
	correction (by magic) later.

	But ignore that, I also saw ...

	 { "cast",   tZONE,     -HOUR(9.5) },/* Central Australian Standard */
	 { "cadt",   tDAYZONE,  -HOUR(9.5) },/* Central Australian Daylight */

	for which that macro (HOUR) will obviously fail miserably (that is,
	it would convert 9.5 to a time_t, rounding it up to 10, (or perhaps
	truncating it to 9, doesn't matter) and then multiply by 60 to get
	minutes.   Bogus.

	At the time (grep being what it is) I didn't know that all of the
	HOUR(N.5) lines in parsedate.y were #if 0'd out, making the issue
	with HOUR() really moot ... but as it was trivial to fix ...

	#define HOUR(x)         ((time_t)((x) * 60))

	I just did that (multiply first, as a float if x is N.5 otherwise int.)
	As long as only simple fractions of an hour are picked, and as only .5
	is currently used, it is fine (.25, .1, .333, ... would also be OK).

		[Aside, there is no issue at all with overflow here,
		whether x is int or float - even 16 bit int arithmetic
		won't overflow till x is bigger than 500 - and the
		range of x is (at the outside, approx) [0..25]  - it is
		also all compile time constants, no user input to worry about.]

		[Aside 2: this issue, such as it is, has been there
		since parsedate() was added to NetBSD in 2006, including
		the 3 days it was there as getdate() - I did not attempt
		to go trawling back through usenet sources to see just
		where this really originated.]

	Then I went to test it ... and found that cast/cadt didn't work
	at all (that's when I found the #if 0 stuff).    Simple to
	remove the relevant #if 0, and test again.  "cast" worked fine,
	"cadt" produced nonsense.   That made no sense to me (I won't bother
	showing the results, as to reproduce them requires more work than
	is reasonable) so I tested again with one of the other standard
	and summer time pairs...

	date -u -d '2001-01-01 00:00:00 est'
	Mon Jan  1 05:00:00 UTC 2001

	That's fine (just as "cast" had been fine -- its 9:30 offset
	worked properly with the corrected HOUR()).

	But ...

	date -u -d '2001-01-01 00:00:00 edt'
	Thu Jan  1 04:59:59 UTC 1970

	That's nonsense.    Obviously.    Try it!

	The underlying cause of that problem is this ...

            /* We rely on mktime_z(NULL, ...) working in UTC */
            result = mktime_z(NULL, &tm);
            result += Timezone * 60;

	in Convert() in parsedate.y.   There is one obvious problem with
	that, which explains the gibberish output - the result of mktime_z()
	isn't tested for errors.  When result is -1 we get one second (-1)
	before the time indicated by the timezone, hence 4:59:59 for edt
	where est is 5:00:00... (you'd get something different if you don't
	use -u and have date output in your local time zone.)  The Jan 1 1970
	comes as the rest of the input (which would have been included in
	a non -1 result) has been lost.   You get 4:59:59 Jan 1, 1970,
	(with -u, or in a UTC zone) regardless of the input, whenever "edt"
	is included (and similarly constant output in any other zone).

	Adding
		if (result == -1)	/* possibly add errno test */
			return result;

	allows Convert() to signal an error to its caller, rather than
	just returning rubbish.   (Wait, don't do that yet!)

	The actual error is caused by the immediately preceding (executed)
	code ...

    switch (DSTmode) {
    case DSTon:  tm.tm_isdst = 1; break;
    case DSToff: tm.tm_isdst = 0; break;
    default:     tm.tm_isdst = -1; break;
    }            

	That all looks fine, and in one case, i.e. this one...

    if (Timezone == USE_LOCAL_TIME) {
            result = mktime(&tm);
    } else {

	it is fine, but we're in the "else" case, and it isn't.

	Note the comment (quoted above) and here again ...

            /* We rely on mktime_z(NULL, ...) working in UTC */

	That's OK, mktime_z(NULL, ...) does work in UTC, but in UTC, it
	makes absolutely no sense at all to have tm.tm_isdst = 1;
	That's impossible, UTC has no summer time, it is always standard
	time, which is why "est" worked, and "edt" dind't, the former
	is DSToff, the latter DSTon.   mktime_z() (and mktime()) checks
	that the resulting tm_isdst is compatible with the one passed in,
	and returns an error if the time converted doesn't have the
	summer time property that tm_isdst requests (if it requests one
	at all.)

	Now at first glance it seems as if just returning an error when
	mktime_z() fails is reasonable, and for most failure modes it
	would be.  Here it even looks OK, after all, Jan 1 is not
	summer time (daylight saving) in the US Eastern time zone, so
	asking to convert "1 Jan, 2001 EDT" is erroneous, and giving
	an error response would be reasonable.

	Unfortunately ...

	date -u -d '2001-07-01 00:00:00 edt'
	Thu Jan  1 04:59:59 UTC 1970

	that's also nonsense, but July 1, 2001, was summer time in
	Eastern US (or most of it) so that conversion should not fail.

	Of course, mktime_z(NULL,...) (as it works in UTC) is always
	going to fail for any summer time conversion, regardless of
	what would be appropriate in the timezone selected.

	What should happen here is anyone's guess, the parsedate(3) man
	page being particularly useless, it gives a list of timezone
	abbreviations, and says they work (or implies they do, or that
	they mean something) - including both est & edt - but doesn't
	give any clue at all what they actually mean when present.

	For my case, I chose to fix it by always calling mktime_z()
	with tm_isdst == 0, so the only reason for it to fail is if
	the data is out of range.  If that happens, I just return
	failure (best that can be done) but if mktime_z() succeeds,
	and DSTmode == DSTon I just correct for summer time, by
	brute force subtracting an hour (subtract, as we're manipulating
	UTC, not the local time - an hour advance for local time is
	equivalent to UTC being an hour earlier.)   That's not good,
	as it allows 2001-01-01 EDT without error (but perhaps it should,
	who knows?), and because it assumes that summer time is always
	one hour ahead of standard time - the localtime() and mktime()
	libc functions go to extraordinary measures to avoid assuming that.

	I'm not including a patch, as the changes are mostlty trivial once
	you know what it is you really want to accomplish, the harder
	part is deciding what that is.

	While I was debugging this, I came across another problem.
	The code attempts to use errno tests to distinguish between
	the case where -1 just happens to mean Dec 31 1969 23:59:59 UTC
	and the case where it indicates an error.

	Unfortunately, it does that very poorly (well, it really cannot
	be other than poor, as using errno this way is a horrible botch,
	but it does it even worse than it needs to.)

	The start of parsedate() (omitting the local var declarations) is ...

	time_t
	parsedate(const char *p, const time_t *now, const int *zone)
	{   
	    saved_errno = errno; 
	    errno = 0; 


	and then at the end ..

           if (errno == 0)
	       errno = saved_errno;
	   return Start;
	}   

	which looks like it should be doing the right thing, if there's no
	error, leave errno at whatever it was on entry, otherwise return
	the error code, so if Start (which regardless of its name, is
	actually the answer) == -1 the caller can tell the difference
	between Dec 31 1969, and "error occurred".

	Unfortunately, that assumes that errno != 0 --> an error occurred,
	which it doesn't actually mean.   While watching this while debugging
	the issue above, I always found errno != 0, in fact errno == ENOENT
	which is an error code that isn't in parsedate.y at all - so it
	couldn't have come from there.  In my case, the ENOENT came from
	the lack of /etc/malloc.conf (or whatever it is intended to be
	called) on my system (it rarely exists on anyone's system I suspect).
	When malloc() checks for it, errno gets set, and stays set.

	To do the errno trick as well as it can be, the code needs

		errno = 0;

	sprinked through it, all over the place - every time something is
	about to be called that might fail, clear errno first, so that
	any earlier sys call failures, which didn't amount to errors
	parsedate() cares about can be ignored.   Then if a function does
	fail, and errno has become set, assume that that errno relates to
	the failure (in these functions it probably does, as it is always
	explicitly set) and return it.  If the function didn't fail, then
	errno might be anything, and must be cleared again before this
	trick is safe to employ around another function call.

	I haven't (yet anyway) even attempted to fix that one.

	While here, as indicated above, I was looking at parsedate(3),
	and aside from how useless it is as documentation of the
	function, there are a couple of other issues...

	The BUGS section says ...

     2 The parsedate() function cannot compute days before the unix epoch
       (19700101).

       which hasn't been true since Dec 2011, if I read the cvs log
       correctly.

	date -u -d '1918-11-11 11:00:00 +0100'
	Mon Nov 11 10:00:00 UTC 1918

      and also ...

     3 The parsedate() function assumes years less than 0 mean - year, years
       less than 70 mean 2000 + year, years less than 100 mean 1900 + year.

	the second clause of which is only partly true now - it was
	corrected in the examples earlier in the man page, but not here.

	The man page is also lacking anything that could be counted as
	useful information (aside from the calling sequence for parsedate()).

>How-To-Repeat:
	date -u -d '2001-07-01 00:00:00 edt'

	and by reading the man page in conjunction with the sources.

>Fix:
	As I have no idea what is really meant to be implied by

		'2001-07-01 00:00:00 edt'

	it is hard to know what the correct fix really is.  If it is
	intended to convert the time as if TZ=America/New_York
	was in place (actually assume US Eastern time) then I suspect
	quite a lot of work is required, as the primitives all work
	in either UTC (GMT) or the caller's local zone, it would not
	be trivial to handle properly (that is, to correctly discover
	whether or not it was summer time, in Eastern US, in July 2001,
	when my local zone is somewhere else entirely.)

	Note that even attempting that implies buying in to the hypothesis
	that it is possible to extract some meaning from time zone
	abbreviations - which there really isn't,   A whole bunch of
	plausible ones are #if 0'd out of the current sources - not
	just the ones that have fractional hours, which don't work
	without HOUR() corrected as above - but also ones which are
	just ambiguous so the code just arbitrarily assumes one (the
	man page says the abbreviations exist, but doesn't bother to
	say what they refer to - it just assumes "you must know").

	What's more (for example) the abbrevtations chosen by parsedate.y
	aren't walways what other code uses ... the "cast" example I
	used above is one such case.  That's supposed to be
		Central Australian Standard Time
	but nothing else calls it that, it is either just
		Central Standard Time			(CST)
	or
		Australian Central Standard Time	(ACST)

	older versions of the tz database had the former, more
	recent ones the latter, none have CAST.   (The same is
	true for the other Austr abbreviations that aren't #if 0'd
	out - I didn't check to see which of the others in the
	list actually agree with other users - the US ones
	obvously do.)

	So, one reasonable choice might be to just ditch the whole
	"parse the timezone abbreviation" nonsense completely, and
	require numeric -0500 -0400 +0930 +1030 zone indicators instead
	when a time is to be converted that is neither local time, nor
	UTC.  They work in all (not even so-) recent versions of parsedate().

	Of course, other than one obscure example in parsedate(3) these
	(the numeric offsets) aren't mentioned at all (unlike the
	alphabetic abbreviations) so suddenly requiring them might be
	a bit much of a leap...   (Also: it isn't clear if the one example
	intended the -0500 it used to be US EST (probable, but not stated),
	or somewhere in central/western Asia).

	Fixing HOUR() and (perhaps) removing the #if 0's around the
	half hour zones is an optional extra to all of this - bizarre
	that that one is where I started looking!

	The man page needs a total rewrite.

	errno handling needs to be totally renovated - or even better,
	just trashed.  To do the latter, we probably need to replace
	parsedate() with a new function with a new calling sequence.

	I'd suggest something like (excuse any C syntax errors here,
	this is just an example)

		const char *
		parsetime(struct timespec *result,
			  const char *datestr,
			  const time_t *time,
			  const int *tzoff)

	which would allow times like 12:25:47.1027 which are currently
	parsed to actually be returned, rather than simply ignoring the
	fractional seconds (I don't think they're even rounded), and
	also allow the function to return a string indicating what failed
	when something goes wrong, rather than just "it failed" - a NULL
	return would indicate success, non-NULL would be a pointer to
	a const string which indicates what caused the problem.
	Or return an enum value (or int) and provide a function to convert
	that to a string (ala errno and strerror() but with more appropriate
	values to the use case.)

	The old parsedate() function could be a wrapper around that
	for compatability with any 3rd party apps (NetBSD code,
	including date, would all be fixed to use the new interface
	of course) which happen to know this old interface and use it.

	Whatever is done needs a pullup to NetBSD 7.

	For NetBSD 5 & 6, best advice is probably just not to use
	parsedate() or date -u at all - they are horribly broken in
	all but the very simplest cases.

	The worst of that is fixed in NetBSD 7 & current, but hasn't
	been pulled up.  Perhaps it could be - I doubt any of the
	changes (other than 64 bit time_t's which doesn't directly
	affect this code at all) would break anything.

	NetBSD 4 (where this all first appeared) is broken too of course,
	but that's beyond EOL and won't ever get improved.

	Once again, apologies for the length of this.   Good luck fixing it.
Follow-Ups:
- re: lib/50143: various bugs/weirdness with parsedate(3) (libutil) (+ some suggested fixes)
  - From: matthew green
Prev by Date: Re: kern/47665 (IPF only works by chance on some archs)
Next by Date: NetBSD Nightly Trouble Ticket Report
Previous by Thread: Re: kern/47665 (IPF only works by chance on some archs)
Next by Thread: re: lib/50143: various bugs/weirdness with parsedate(3) (libutil) (+ some suggested fixes)
Indexes:
Home | Main Index | Thread Index | Old Index