tech-userlevel archive


Re: strftime(3) oddities with %s, %z



    Date:        Sun, 6 Nov 2022 11:40:38 +0000
    From:        Taylor R Campbell <campbell+netbsd-tech-userlevel%mumble.net@localhost>
    Message-ID:  <20221106114053.995E26085C%jupiter.mumble.net@localhost>

  | I don't think that's needed for what uwe@ and dholland@ were looking
  | for, which -- if I understand correctly -- is that:
  |
  | 	struct tm gtm, ltm;
  | 	char gbuf[128], lbuf[128];
  |
  | 	gmtime_r(&t, &gtm);
  | 	localtime_r(&t, &ltm);
  | 	strftime_XXX(gbuf, sizeof(gbuf), "... %s ...", &gtm);
  | 	strftime_XXX(lbuf, sizeof(lbuf), "... %s ...", &ltm);
  |
  | should format the same number at %s in the output

That simply isn't possible, at least not while retaining correct
operation for other applications which follow the standard and expect
strftime to work (or will follow it, once the standard gets published ...
such applications already exist, as %s in strftime is quite common, which
is why POSIX is adding it).

  | but otherwise format the broken-down time (%Y/m/d/H/M/S/...)
  | according to UTC for gtm and the local time for ltm,

Neither of those happens either, or not as you have written it.

Those conversions are, in effect, just sprintf(buf, "%d", tm->tm_field);
what the timezone might have been that created that field is
irrelevant.   The %a/%b (etc) conversions are, in effect, a "%s" of a
localised name selected by tm_wday, tm_mon, etc.
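To illustrate the point about the numeric conversions, here is a minimal sketch (the function name format_fields is made up for this example). It fills in a struct tm by hand, with no call to gmtime() or localtime() at all, and the numeric conversions simply print the field values:

```c
#include <string.h>
#include <time.h>

/*
 * Hypothetical helper: format a struct tm whose fields were typed in
 * by hand.  No timezone was involved in producing these values, and
 * the numeric conversions do not need one.
 */
size_t format_fields(char *buf, size_t len)
{
	struct tm tm;

	memset(&tm, 0, sizeof tm);
	tm.tm_year = 2022 - 1900;	/* years since 1900 */
	tm.tm_mon = 10;			/* months since January, so November */
	tm.tm_mday = 6;
	tm.tm_hour = 11;
	tm.tm_min = 40;
	tm.tm_sec = 38;

	return strftime(buf, len, "%Y-%m-%d %H:%M:%S", &tm);
}
```

Called with a 64 byte buffer, this should leave "2022-11-06 11:40:38" in the buffer whatever TZ happens to be set to.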

  | and %z should format .tm_gmtoff,

We cannot do that either, as an application is allowed (and this
is following both existing C and POSIX standards from way before
POSIX considered adding tm_gmtoff) to write

	tm.tm_isdst = 0; strftime(buf, sizeof buf, "%z", &tm);

and get the local zone standard time offset.   The struct can
have any random garbage in the other fields - strftime() has no
way to know where the struct tm data came from.

As long as we're claiming to follow the standard, and our strftime.3
man page does claim that, we're constrained to make that work.

  | After all, these struct tm objects are supposed to represent the same
  | moment in time (t), just phrased in different civil time zones.  So %s
  | should be the same for both, and %z should identify the UTC offset
  | that enables all the other broken-down time components to be
  | interpreted correctly to give the same moment in time.

It might have been nice had it been specified that way, but it wasn't,
and isn't.   Like you said in your original contribution to this thread,
if we want this, we need a new API; the current one simply does not
provide that service.

  | I believe there is already enough information stored in struct tm to
  | do this for everything except perhaps %Z (but %Z doesn't seem to be
  | very well-specified anyway).  It's not clear to me if you can get this
  | semantics out of strftime_z but I suspect not.

The info might be there, but we are not permitted to use it, as we
have no way to know whether or not the user bothered initialising the
needed fields - as the specs say which fields are needed for each
conversion, applications only need to put data in the ones that will
be used by the conversions they are going to apply.   So if you just
want to know what the 3rd day of the week is called, all you need
to write is:
	tm.tm_wday = 3; strftime(buf, sizeof buf, "%A", &tm);

and that's required to work (to produce a locale-dependent day name).
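A slightly fuller sketch of the same idea, wrapped in a function so it can be tried out (the name day_name is invented here). The memset() is only hygiene; the point above is that "%A" needs nothing except tm_wday:

```c
#include <string.h>
#include <time.h>

/*
 * Hypothetical helper: put the name of weekday number wday
 * (0 = Sunday) into buf, using whatever locale is in effect.
 * Only tm_wday is given a meaningful value.
 */
size_t day_name(int wday, char *buf, size_t len)
{
	struct tm tm;

	memset(&tm, 0, sizeof tm);
	tm.tm_wday = wday;

	return strftime(buf, len, "%A", &tm);
}
```

In a program that has not called setlocale() (so it is running in the "C" locale), day_name(3, buf, sizeof buf) should produce "Wednesday".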

%s, %z and %Z are no different.   %s uses the 7 fields that mktime()
uses (year/month/day hours:mins:secs and isdst); the other two use
just tm_isdst.  strftime() is not permitted to access any others.
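The consequence for %s can be seen without calling strftime() at all. The sketch below assumes, as described above, that %s behaves as if mktime() were applied to a copy of the struct; the function name epoch_skew and the zone strings used with it are made up for illustration. mktime() always reads the fields as local time, so the struct produced by gmtime_r() yields a different number unless the local zone happens to be UTC:

```c
#define _POSIX_C_SOURCE 200809L
#include <stdlib.h>
#include <time.h>

/*
 * Hypothetical helper: force TZ to tz, break t down with both
 * gmtime_r() and localtime_r(), and return how far apart (in
 * seconds) mktime() places the two results.  Note that this
 * changes the process's TZ setting.
 */
long epoch_skew(const char *tz, time_t t)
{
	struct tm gtm, ltm;

	setenv("TZ", tz, 1);
	tzset();

	gmtime_r(&t, &gtm);
	localtime_r(&t, &ltm);

	/* mktime() treats both structs as local time in zone tz. */
	return (long)difftime(mktime(&gtm), mktime(&ltm));
}
```

With the POSIX-style zone string "XXX5" (five hours west of UTC, no summer time) the result for t = 1667734838 should be 18000, and with "UTC0" it should be 0.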


It is possible, I think, to use strftime_z() to get the desired answers,
provided the application keeps track of the appropriate timezone_t in
order to tell it which is supposed to apply.

If we add two timezone_t's (gz and lz) to your little sample code,
set them as

	gz = tzalloc("UTC");		/* or "GMT"; the results are almost the same */
	lz = tzalloc(NULL);

and replace your two strftime_XXX calls with

	strftime_z(gz, gbuf, sizeof(gbuf), "... %s ...", &gtm);
	strftime_z(lz, lbuf, sizeof(lbuf), "... %s ...", &ltm);

then I believe the desired result happens.   The code could also
use localtime_rz() instead of gmtime_r() and localtime_r(), using
gz and lz as the (extra) first parameter, if it preferred (it should
make no difference).

Nothing however will cause

	tm.tm_zone = "gibberish";
	/* any other init you like, which must include tm_isdst */
	strftime_anything(..., "%Z", &tm, ...);

to result in "gibberish" being placed in the buffer, unless you can
find a time zone (or create one of your own) in which that is the
zone abbreviation (/ name).   Applications that want that should just
put the string they desire directly into the format string, and not
use %Z at all.

	strftime_anything(..., "gibberish", &tm, ....);

will indeed copy "gibberish" into the buffer, provided it is big
enough (the format can contain any other conversions desired as well,
of course).

kre



