tech-userlevel archive


Re: strftime(3) oddities with %s, %z



    Date:        Wed, 2 Nov 2022 15:59:00 +0300
    From:        Valery Ushakov <uwe%stderr.spb.ru@localhost>
    Message-ID:  <Y2JplI+Ut4CFYtZi%pony.stderr.spb.ru@localhost>

  | I think this is the point of confusion.  It's not useful to pretend
  | that struct tm and strftime are the proverbial spherical horses in
  | vacuum and argue for "a portable implementation" of strftime,

That's not quite what I am doing.    What I am saying is that we have
an implementation of strftime() which is the way it is, and has been
for decades.   When %s was added to it (whether that was original, or
a later addition, I have no idea) it was done in a particular way,
most likely, I assume, because its designers could not think of a
better way that would work - perhaps their struct tm was one that did
not have tm_gmtoff.

Further, posix is not going to standardise something that cannot be
implemented everywhere.   The mechanism doesn't always have to be the
same, but it must be possible.   That is, for %s to get added to the
standard, the definition must be something that can be implemented.
I want it in the standard, I use it in scripts, and I'd like those to
be as portable as possible.

It isn't inconceivable that the wording for %s that dh@ gave in one of
his messages could have been accepted - I doubt that it would, but that's
not my decision to make - but someone needs to do the work to propose it.
That won't be me.   While I might not have designed %s as it is, had I
been the one to add it, for my purposes (and, I think, realistically for
everyone else's in any real application) it works just fine.

What really matters here is that this is an existing interface; it has
been this way for a VERY long time now, and is very unlikely to suddenly
be changed to behave differently.

The "specious analogies" that dh@ complained about from my messages were
(mostly, I doubt I remember them all) an attempt to show other places
where what we have is imperfect (which this may be) yet it is what it is,
we have had it for a long time, and no-one is going to change it, because
it either would (or could) break unknowable amounts of code.

Complaining about any of them (including this one) achieves nothing.

  | What initially surprised me, and what I think dh@ is also pointing
  | out, is that %s behaves as if it is indeed "a portable implementation"
  | that doesn't have access to the full tm, including private tz-related
  | parts.

Yes, it is.   And maybe when created, it didn't have access to the
rest.   But like lots of other things, it was designed with a specific
interface (and sometimes documented that way as well) and that is what
we have.

  | I think we both tried to convey that point of view in various
  | ways, apparently without much success.

I understand the point - my point is that it simply doesn't matter.
What is done is done.   Good or bad, this is the way that it is.

  | > How do you tell what produced a struct tm?   Magic?
  |
  | In other words, class tm doesn't have a public constructor that
  | provides a way to specify TZ info.

Those would be rather different words, a struct is not a class, and
there are no constructors, but yes.

  | There are other factory methods
  | that allow one to obtain an instance of tm that has the TZ info (in
  | its private parts). ...

Sure, but how is strftime() supposed to know whether one of those was
used or not?   A new field could be added to struct tm to indicate what
data was present, and then everything that sets or modifies a tm could be
modified to always update that field - go ahead and start that project if
you feel inclined (remember everything in pkgsrc, and elsewhere, that might
also need updating).

  | But I don't have time or energy to elaborate this argument further
  | because you seem to disagree with the basic premise of it - that for a
  | class method (strftime) to be aware and to use the private data of the
  | class (struct tm as fully defined by the implementation, not just the
  | public "at least the members tm_foo, tm_bar" as specified in the
  | standard) is actually a reasonable thing to do.

No I don't disagree with that at all.   What I am saying, and have been
saying, is that the definition of how %s works, and what it does, in
strftime() has been fixed for decades now.   It is simply too late to
make changes.

This fantasy conversation is really just a waste of time (your original
question, and the first few followups were fine - beyond that it has all
been pointless).

While I am here, and to avoid cluttering the list with too many messages
on this, just a few other points from other messages in this thread.


mouse%Rodents-Montreal.ORG@localhost said:
  | If %s is a code-writing-time constant, yes.
  | If %s comes from, say, the command line, not so much. 

Agreed.  That's what I meant when I said, in the message to which that
was a reply:

	The usage in date (and other programs offering similar interfaces
	to strftime()) is fine just as it is defined now.

There are several (stat(1) is another, even sh(1) allows it).

mouse%Rodents-Montreal.ORG@localhost said:
  | I had no idea, before reading this thread, that date +%s would get the
  | current time, convert it to a struct tm, and then convert that back to a
  | time_t to print.  With all the room for mismatches and other errors that
  | entails, I'm slightly surprised it works even most of the time.

It actually works all of the time in date - and I suspect (though I haven't
audited the code to be certain, except for sh naturally) that it will work
all the time in all of the others as well.
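
A minimal sketch of that round trip, assuming a strftime() that supports
the %s extension (it is not part of ISO C).   roundtrip_ok() is a
hypothetical helper written for illustration, not code taken from date(1):

```c
#include <stdlib.h>
#include <time.h>

/*
 * Round-trip a time_t the way the thread describes for date +%s:
 * time_t -> struct tm (via localtime) -> strftime("%s") -> integer.
 * Returns 1 if the value survives, 0 if it does not, -1 on error.
 */
int
roundtrip_ok(time_t t)
{
	struct tm *p, tm;
	char buf[32];

	p = localtime(&t);
	if (p == NULL)
		return -1;
	tm = *p;	/* private copy; localtime() uses static storage */

	/* %s is an extension: seconds since the Epoch, as a decimal string */
	if (strftime(buf, sizeof(buf), "%s", &tm) == 0)
		return -1;

	return strtoll(buf, NULL, 10) == (long long)t;
}
```

Calling roundtrip_ok(time(NULL)) mirrors what the quoted text describes:
the struct tm comes straight from localtime(), so the conversion back has
everything it needs.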

In a different message mouse%Rodents-Montreal.ORG@localhost said:
  | I suspect you're talking past, rather than to, one another, failing to state
  | assumptions because you each consider them so obvious they don't need
  | stating, or some such. 

Perhaps.   Though I believe I understand what dh@ (and uwe@) are getting at.
I think the issue is perhaps...

mouse%Rodents-Montreal.ORG@localhost said:
  |  I see each of you asserting that certain behaviour is right or wrong 

If it seemed like I was trying to do that, then I have indeed failed.
What is right or wrong (as opposed perhaps to objecting to claims
that the %s interface is _wrong_) was never what I was trying to say.

My point has always been that, right or wrong, it simply is.
The interface has existed, unchanged, with a specific definition of
how it works, for decades.   Too late now to contemplate changes.


dholland-tech%netbsd.org@localhost said:
  | (a) It follows from the observations so far that if I set ->tm_gmtoff 

I have said before that I will look into what happens with %z and %Z, and
if there is an actual bug (as opposed to simply a difference of opinion on
what should happen) I will fix it.   I haven't had time to look at that
yet, there are more pressing matters that have real world effects that need
attention (completely unrelated to anything in this discussion, or any of
the time related interfaces) which are occupying my time at the minute.
I will get back to this sometime later - you're just wasting everyone's
time reiterating this point over and over again until after that has happened.


dholland-tech%netbsd.org@localhost said:
  | (b) Our implementation contains the timezone info in struct tm.
  | This does not make our implementation noncompliant;

That's correct.   But:

dholland-tech%netbsd.org@localhost said:
  | It follows (I can find no text to contradict this) that consing up your
  | own struct tm to call strftime without bzeroing it first risks producing
  | undefined behavior and is therefore incorrect.

That actually isn't incorrect, I believe.   But even if it were, it would
make no difference to anything: adding a memset() to the couple of examples
provided would not change the results at all.

But posix actually specifies for each conversion provided by strftime()
(or all the standard ones anyway, not the gnu extensions, which we also
seem to support) exactly which field or fields of the struct tm are
accessed by that conversion.  Attempting a conversion without those
fields defined would indeed risk undefined behaviour.   What (if anything)
other fields are set to is irrelevant.
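
As an illustration only, here is a sketch of a conversion that depends on
a single member: %Y is specified in terms of tm_year, so that is the only
member given a meaningful value.   format_year() is a hypothetical helper,
and the memset() is a cautious assumption of this sketch (in case an
implementation looks at non-standard members), not something the
per-conversion field lists require:

```c
#include <string.h>
#include <time.h>

/*
 * Format a year using only tm_year.   Returns the number of bytes
 * written (excluding the terminating NUL), or 0 if buf is too small.
 */
size_t
format_year(char *buf, size_t len, int year)
{
	struct tm tm;

	/* Precaution only: gives every other member a defined value. */
	memset(&tm, 0, sizeof(tm));
	tm.tm_year = year - 1900;	/* tm_year counts years since 1900 */

	return strftime(buf, len, "%Y", &tm);
}
```

With a 16 byte buffer, format_year(buf, sizeof(buf), 2022) stores the
string "2022" and returns 4.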

Our doc doesn't include that level of detail.   Perhaps it should.

dholland-tech%netbsd.org@localhost said:
  | (c) [...] (Whether our
  | implementation can be corrected without a compat mess is less clear.) 

But this is exactly the point, and applies to all implementations.

The interface has been defined and existed a very long time.   Arbitrarily
changing it, however broken you might consider it to be (unless it cannot
be used at all, which is clearly not the case here) would not be rational.


dholland-tech%netbsd.org@localhost said:
  | (d) It is, in general, quite easy to write standards language that accepts
  | both full correct and less than ideal implementations for cases where the
  | incorrect implementations cannot be fixed.

Yes, while not always quite as easy as you make it sound, that is certainly
possible.   But that's only done when required, as it makes use of whatever
is being described more difficult for applications.

The example you gave is a case where it is needed, as the way implementations
with the tm_gmtoff and tm_zone fields work to produce %z and %Z output
necessarily differs from implementations without those fields, and that
can have consequences.

The posix wording you quoted (which I hadn't looked at before, or not in
the lifetime of this thread anyway) however makes it clear that the results
in your point (a) mentioned above are not a standards violation, at the
very least.   That doesn't mean that they're not a bug in our implementation
and that we couldn't do better though.

dholland-tech%netbsd.org@localhost said:
  | (And given this text, I completely fail to understand why they are
  | proposing to change things.)

They are not.   Certainly not in this area, or not now anyway.   Perhaps
someday, after (if it ever happens) the tm_gmtoff and tm_zone fields
exist everywhere (or everywhere that matters - ie: implementations that
would like to be posix compliant, or close to it) that may change, and it
may be possible to remove that caveat from the standard.   But no-one is
proposing that now.

If you're referring to the way that %s is being specified as it is being
added, then the issue isn't at all the same.   The interface simply does
not examine the zone fields, never has.  It has always been "the result
of mktime()".   You might consider that inadequate - clearly you do - but
good or bad, that is the interface.   The discussions on this, whatever
happened, were way before when I started participating, so I have no idea
who might have voiced opinions about any of it, aside from what has been
added to the bug report, which for this issue is very little.   I really
doubt there was any controversy at all; the only real question would have
been how widespread support for %s is across the different strftime()
implementations.   Once satisfied that it existed in most
(and/or would be added to others) it would, I am guessing, simply have
been specified to match the implementations.
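
The "result of mktime()" definition can be checked with a short sketch.
It assumes a strftime() that supports %s and implements it as described
here; an implementation that consulted the zone fields instead could
disagree for a hand-built struct tm.   percent_s_matches_mktime() is a
hypothetical name used only for this illustration:

```c
#include <stdlib.h>
#include <time.h>

/*
 * Compare strftime("%s") on *tm with mktime() applied to a copy.
 * Returns 1 if they agree, 0 if they differ, -1 on error.
 */
int
percent_s_matches_mktime(const struct tm *tm)
{
	struct tm copy = *tm;	/* mktime() may normalise its argument */
	char buf[32];
	time_t expect;

	expect = mktime(&copy);
	if (expect == (time_t)-1)
		return -1;	/* treated as an error in this sketch */

	if (strftime(buf, sizeof(buf), "%s", tm) == 0)
		return -1;

	return strtoll(buf, NULL, 10) == (long long)expect;
}
```

Passing it a struct tm obtained from localtime() is the simple case; the
zone fields play no part in the comparison itself.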


dholland-tech%netbsd.org@localhost said:
  | (e) Standardization that precludes a correct implementation, or that
  | mandates an incorrect implementation, remains wrong and inappropriate. 

I would agree with that.   But once again, what you believe to be required
by a correct implementation isn't necessarily what everyone else believes.

There is a very specific definition of what is to happen, which matches
precisely the implementations - that is almost the definition of a
correct specification.    Once again, if you want something different, you
should propose something different.   Either a new interface (which would
be easy to produce, but take a long time to get widespread enough
implementations that anyone could reliably use it), or a different
specification of %s (which I don't consider is very likely to be
accepted, but I am not the arbiter of such things.)


dholland-tech%netbsd.org@localhost said:
  | I have a feeling that nothing productive is likely to happen going forward.

Other than an end to this discussion, which would be productive in its
own way (it is occupying far too much time that could better be spent
on more important issues) I don't think there is anything productive
that can be achieved.   uwe@'s original question was answered long ago.

kre


