Source-Changes-HG archive


[src/trunk]: src/lib/libutil For touch -d (which uses parsedate()) POSIX spec...



details:   https://anonhg.NetBSD.org/src/rev/fee77eebb282
branches:  trunk
changeset: 945050:fee77eebb282
user:      kre <kre%NetBSD.org@localhost>
date:      Mon Oct 19 15:08:17 2020 +0000

description:
For touch -d (which uses parsedate()) POSIX specifies that the
ISO-8601 format yyyy-mm-ddTHH:MM:SS[radix_and+frac][Z]
be accepted.

We didn't handle that, as in parsedate(), 'T' represents the
military timezone designator, not a padding separator between
date & time as POSIX specified it.

The way parsedate() is written, fixing this in the grammar/lexer
would be hard without deleting support for T as a zone indicator
(it is *my* timezone!).

So, instead of doing that, parse an ISO-8601 string which occurs
right at the start of the input (not even any preceding white space)
by hand, before invoking the grammar, and so not involving the lexer.
This is sufficient to make touch -d conform.
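
As a rough illustration of the hand parse described above, here is a
standalone sketch of the recognition step only.  The function name
iso8601_prefix_len() is hypothetical and is not part of libutil; the
template string and the acceptance rules are modelled on the diff in
this commit, but this is not the committed code.

```c
#include <ctype.h>
#include <stddef.h>

/*
 * Hypothetical helper (not in libutil): return the length of an
 * ISO-8601 timestamp at the very start of s, or 0 if there is none.
 * No leading white space is skipped, matching the commit description.
 */
size_t
iso8601_prefix_len(const char *s)
{
	static const char format[] = "-dd-ddTdd:dd:dd";
	const unsigned char *p = (const unsigned char *)s;
	const char *f;

	/* Year: one or more digits. */
	while (isdigit(*p))
		p++;
	if (p == (const unsigned char *)s)
		return 0;

	/* Rest of the fixed layout: 'd' is a digit, 'T' is T, t or space. */
	for (f = format; *f != '\0'; f++, p++) {
		if (*f == 'd') {
			if (!isdigit(*p))
				return 0;
		} else if (*f == 'T') {
			if (*p != 'T' && *p != 't' && *p != ' ')
				return 0;
		} else if (*p != (unsigned char)*f)
			return 0;
	}

	/* Optional fraction: a radix character, then at least one digit. */
	if (*p == '.' || *p == ',') {
		if (!isdigit(p[1]))
			return 0;
		while (isdigit(*++p))
			continue;
	}

	/* Optional Zulu suffix; a stray extra digit is an error. */
	if (*p == 'Z' || *p == 'z')
		p++;
	else if (isdigit(*p))
		return 0;

	/* The timestamp must end the string or be followed by white space. */
	if (*p != '\0' && !isspace(*p))
		return 0;

	return (size_t)(p - (const unsigned char *)s);
}
```

A caller would hand whatever follows the returned length to the
grammar, which is how trailing modifiers can still be accepted.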

After doing that, we still need to allow earlier valid inputs,
where an ISO-8601 format (using space as the separator, but without
the 'Z' (Zulu, or UTC) suffix) is followed by an arbitrary timezone
designation and other modifiers (eg: "+5 minutes").  So we
call the grammar on whatever is left of the input after the 8601
string has been consumed.  This all "just works" with one exception:
a format like "yyyy-mm-dd hh:mm:ss +0700" would have the grammar parse
just "+0700", which by itself would be meaningless, and so wasn't
handled.  Add a grammar rule & processing to handle it.
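
To make the arithmetic of the new rule concrete, the sketch below
applies the same expression as the tSNUMBER action in the diff to a
signed hhmm value.  The function name zone_minutes_west() is
hypothetical, and the "minutes west of UTC" reading of the result is
an assumption inferred from the negation in the rule.

```c
/*
 * Hypothetical helper (not in libutil): convert a signed hhmm number,
 * such as the 700 lexed from "+0700", using the same expression as
 * the new grammar action.  East-of-UTC offsets come out negative.
 */
int
zone_minutes_west(int hhmm)
{
	return -(hhmm % 100 + (hhmm / 100) * 60);
}
```

For "+0700" the value is -(0 + 7 * 60) = -420; for "-0530" both the
remainder and the quotient are negative in C, giving -(-30 + -300) = 330.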

Also note that while POSIX specifies "at least 4" digits in the YYYY
field, we implement "at least one" so years from 0-999 continue to be
parsed as they always have (nb: these were, and continue to be, treated
as absolute year numbers, year 10 is year 10, not 2010).  Years > 2 billion
(give or take) cannot be represented in the tm_year field of a struct tm,
so there's a limit on the max number of digits as well.
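
The limit on year digits can be sketched as follows.  This is only an
illustration: year_fits_tm_year() is a hypothetical name, and the
explicit range test against tm_year (an int counting years since 1900)
is an assumption about where the limit comes from; the committed code
itself only checks errno after strtol().

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/*
 * Hypothetical check (not in libutil): return 1 if the decimal year
 * at the start of s could be stored in tm_year, else 0.
 */
int
year_fits_tm_year(const char *s)
{
	char *end;
	long year;

	errno = 0;
	year = strtol(s, &end, 10);
	if (errno != 0 || end == s)	/* out of range, or no digits */
		return 0;
	if (year < 0)			/* only unsigned years expected */
		return 0;
	/* tm_year holds year - 1900 and is an int. */
	if (year - 1900 > INT_MAX)
		return 0;
	return 1;
}
```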

diffstat:

 lib/libutil/parsedate.y |  87 ++++++++++++++++++++++++++++++++++++++++++++++++-
 1 files changed, 86 insertions(+), 1 deletions(-)

diffs (111 lines):

diff -r 16e5a50cc8f7 -r fee77eebb282 lib/libutil/parsedate.y
--- a/lib/libutil/parsedate.y   Mon Oct 19 15:07:47 2020 +0000
+++ b/lib/libutil/parsedate.y   Mon Oct 19 15:08:17 2020 +0000
@@ -14,7 +14,7 @@
 
 #include <sys/cdefs.h>
 #ifdef __RCSID
-__RCSID("$NetBSD: parsedate.y,v 1.33 2020/10/19 15:05:53 kre Exp $");
+__RCSID("$NetBSD: parsedate.y,v 1.34 2020/10/19 15:08:17 kre Exp $");
 #endif
 
 #include <stdio.h>
@@ -300,6 +300,12 @@
          tZONE         { param->yyTimezone = $1; param->yyDSTmode = DSToff; }
        | tDAYZONE      { param->yyTimezone = $1; param->yyDSTmode = DSTon; }
        | tZONE tDST    { param->yyTimezone = $1; param->yyDSTmode = DSTon; }
+       | tSNUMBER      {
+                         if (param->yyHaveDate == 0 && param->yyHaveTime == 0)
+                               YYREJECT;
+                         param->yyTimezone = - ($1 % 100 + ($1 / 100) * 60);
+                         param->yyDSTmode = DSTmaybe;
+                       }
 ;
 
 day:
@@ -1066,6 +1072,85 @@
     param.yyHaveTime = 0;
     param.yyHaveZone = 0;
 
+    /*
+     * This one is too hard to parse using a grammar (the lexer would
+     * confuse the 'T' with the Mil format timezone designator)
+     * so handle it as a special case.
+     */
+    do {
+       const unsigned char *pp = (const unsigned char *)p;
+       char *ep;       /* starts as "expected, becomes "end ptr" */
+       static char format[] = "-dd-ddTdd:dd:dd";
+
+       while (isdigit(*pp))
+               pp++;
+
+       if (pp == (const unsigned char *)p)
+               break;
+
+       for (ep = format; *ep; ep++, pp++) {
+               switch (*ep) {
+               case 'd':
+                       if (isdigit(*pp))
+                               continue;
+                       break;
+               case 'T':
+                       if (*pp == 'T' || *pp == 't' || *pp == ' ')
+                               continue;
+                       break;
+               default:
+                       if (*pp == *ep)
+                               continue;
+                       break;
+               }
+               break;
+       }
+       if (*ep != '\0')
+               break;
+       if (*pp == '.' || *pp == ',') {
+               if (!isdigit(pp[1]))
+                       break;
+               while (isdigit(*++pp))
+                       continue;
+       }
+       if (*pp == 'Z' || *pp == 'z')
+               pp++;
+       else if (isdigit(*pp))
+               break;
+
+       if (*pp != '\0' && !isspace(*pp))
+               break;
+
+       /*
+        * This is good enough to commit to there being an ISO format
+        * timestamp leading the input string.   We permit standard
+        * parsedate() modifiers to follow but not precede this string.
+        */
+       param.yyHaveTime = 1;
+       param.yyHaveDate = 1;
+       param.yyHaveFullYear = 1;
+
+       if (pp[-1] == 'Z' || pp[-1] == 'z') {
+               param.yyTimezone = 0;
+               param.yyHaveZone = 1;
+       }
+
+       errno = 0;
+       param.yyYear = (time_t)strtol(p, &ep, 10);
+       if (errno != 0)                 /* out of range (can be big number) */
+               break;                  /* the ones below are all 2 digits */
+       param.yyMonth = (time_t)strtol(ep + 1, &ep, 10);
+       param.yyDay = (time_t)strtol(ep + 1, &ep, 10);
+       param.yyHour = (time_t)strtol(ep + 1, &ep, 10);
+       param.yyMinutes = (time_t)strtol(ep + 1, &ep, 10);
+       param.yySeconds = (time_t)strtol(ep + 1, &ep, 10);
+       /* ignore any fractional seconds, no way to return them in a time_t */
+
+       param.yyMeridian = MER24;
+
+       p = (const char *)pp;
+    } while (0);
+
     if (yyparse(&param, &p) || param.yyHaveTime > 1 || param.yyHaveZone > 1 ||
        param.yyHaveDate > 1 || param.yyHaveDay > 1) {
        errno = EINVAL;


