I very much agree that pointer arithmetic MUST NOT be "undefined", even
if it includes "NULL" and/or "0".

The warning that begat this thread is insane!

Note I say this as someone who is very empathetic to implementers who
might try to make C work in any strange hardware systems where "null"
pointers are not actually all zeros in the hardware.  I hope to be one.

At Mon, 24 Feb 2020 14:41:26 +0100, Kamil Rytarowski <n54%gmx.com@localhost> wrote:
Subject: Re: NULL pointer arithmetic issues
>
> Please join the C committee as a voting member or at least submit papers
> with language changes. Complaining here won't change anything.
>
> (Out of people in the discussion, I am involved in wg14 discussions and
> submit papers.)

If you are active on the wg14 committee, perhaps you can be convinced to
argue on "our" behalf?  [0.5 :-)]

I wrote the following rant some time ago and posted it somewhere
(probably on G+ because I don't find it now with a quick search).  I'll
throw it in here for some more fuel....

NO MORE "undefined behaviour"!!!  Pick something sane and stick to it!

The problem with modern "Standard" C is that instead of refining the
definition of the abstract machine to match the most common and/or
logical behaviour of existing implementations, the standards committee
chose to throw the baby out with the bath water and make whole swaths of
conditions into so-called "undefined behaviour" conditions.

Excellent examples are the data-flow optimizations that are now commonly
abused to elide security/safety-sensitive code:

    int
    foo(struct bar *p)
    {
        char *lp = p->s;

        if (p == NULL || lp == NULL) {
            return -1;
        }
        lp[0] = '\0';

        return 0;
    }

Any programmer worth their salt will assume the compiler can calculate
the offset of 's' at compile time and thus anyone ignorant of C's new
"undefined behaviour" rules will guess that at worst some location on
the stack will be assigned a value pulled from low memory (if that
doesn't cause a SIGSEGV), but more likely the de-reference of 'p' won't
happen right away because we all know that any optimizer worth its salt
SHOULD defer it until the first use of 'lp', perhaps not even allocating
any stack space for 'lp' at all!

Worse yet this example stems from actual Linux kernel code like this:

    static int
    podhd_try_init(struct usb_interface *interface,
                   struct usb_line6_podhd *podhd)
    {
        struct usb_line6 *line6 = &podhd->line6;

        if ((interface == NULL) || (podhd == NULL))
            return ENODEV;
        ....
    }

Here some language-lawyer-wannabes [[LLWs]] might try in vain to argue
over the interpretation of "dereferencing", yet again any programmer
worth their salt knows that the address of a field in a struct is simply
the sum of the struct's base address and the offset of the field, the
latter of which the compiler obviously knows at compile time, and adding
a value to a NULL pointer should never be considered invalid or
undefined!  [[ You have to start from somewhere, after all....  Why not
zero? ]]

(I suspect the LLWs are being misled by the congruence between "a->b"
and "(*a).b".)

Worst of all consider this example:

    void *
    foo(struct bar *p)
    {
        size_t o = offsetof(struct bar, s);

        if (p == NULL)
            return NULL;
        ....
    }

And then consider an extremely common example of "offsetof()" which
might very well appear in a legacy application's own code because it
pre-dated <stddef.h>, though indeed this very definition has been used
in <stddef.h> by several standard compiler implementations, and indeed
it was specifically allowed in general by ISO C90 (and only more
recently denied by C11, sort of):

    #define offsetof(type, member) ((size_t)(unsigned long)(&((type *)0)->member))

or possibly (for those who know that pointers are not always "just"
integers):

    #define offsetof(type, member) ((size_t)(unsigned long)((&((type *)0)->member) - (type *)0))

Here we have very effectively and entirely hidden the fact that the '->'
operator is used with 's'.

Any sane person with some understanding of programming languages should
agree that it is wrong to assume that calculating the address of an
lvalue "evaluates" that lvalue.  In C the '->' and '[]' operators are
arithmetic operators, not (immediately and on their own) memory access
operators.

Sadly C's new undefined behaviour rules as interpreted by some compiler
maintainers now allow the compiler to STUPIDLY assume that since the
programmer has knowingly put a supposed de-reference of a pointer on the
first line of the function, then any comparisons of that pointer with
NULL further on are OBVIOUSLY never ever going to be true and so it can
SILENTLY wipe out the whole damn security check.

I guess I'm saying that modern compiler maintainers are not sane, and at
least some of the more recent C Standards Committee members are
definitely NOT sane and/or friendly and considerate.

C's primitive nature encourages the programmer to think in terms of what
the target machine is going to do, and as such it is extremely sad and
disheartening that the standards committee chose to endanger users in so
many ways.
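
For anyone stuck living with the current rules, here is a minimal sketch
of the ordering that keeps the NULL test from being elided: test the
pointer before any 'p->member' expression appears, and take member
offsets from the <stddef.h> offsetof() rather than from arithmetic on a
null pointer.  The layout of 'struct bar' and the names clear_first(),
offset_of_s() and demo() are made up for illustration; they are not from
the kernel code quoted above.

```c
#include <assert.h>
#include <stddef.h>   /* offsetof, size_t, NULL */

/* Hypothetical stand-in for the 'struct bar' used in the examples above. */
struct bar {
    int   id;
    char *s;
};

/* Test the pointer before any expression of the form p->member appears. */
int clear_first(struct bar *p)
{
    char *lp;

    if (p == NULL)
        return -1;
    lp = p->s;          /* p is known to be non-null at this point */
    if (lp == NULL)
        return -1;
    lp[0] = '\0';
    return 0;
}

/* Member offset from <stddef.h>, with no null pointer arithmetic involved. */
size_t offset_of_s(void)
{
    return offsetof(struct bar, s);
}

/* Small self-check exercising both helpers; returns 0 if every check holds. */
int demo(void)
{
    char buf[] = "hello";
    struct bar b = { 1, buf };
    struct bar empty = { 2, NULL };

    assert(clear_first(NULL) == -1);
    assert(clear_first(&empty) == -1);
    assert(clear_first(&b) == 0 && buf[0] == '\0');
    assert(offset_of_s() == (size_t)((char *)&b.s - (char *)&b));
    return 0;
}
```

This only shows what today's rules demand of the programmer; it does not
make the rules any more reasonable.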

    [[ in modern "Standard C" ]]

    It's not that evaluating something like (1<<32) might have an
    unpredictable result, but rather that the entire execution of any
    program that evaluates such an expression is ENTIRELY meaningless!
    Indeed according to "Standard C" the execution is not even
    meaningful up to the point where undefined behaviour is encountered.

Undefined behaviour trumps ALL other behaviors of the C abstract
machine.  And it is all in the goal of attempting comprehensive maximum
possible optimization of all code at any expense INCLUDING correct
operation of the program.

Not all so-called "undefined behaviours" are quite this bad, yet, but in
general we would be infinitely better off with a more completely defined
abstract machine that might force some target architectures to jump
through hoops instead of forcing EVERY programmer to ALWAYS be more
careful than EVERY conceivable optimizer.

As Phil Pennock said:

    If I program in C, I need to defend against the compiler maintainers.
    [[ and future standards committee members!!! ]]

    If I program in Go, the language maintainers defend me from my mistakes.

And I say:

    Modern "Standard C" is actually "Useless C" and "Unusable C"

Indeed I now say if "Standard C" follows C++ then it will be safe to say
that a good optimizing compiler will soon be able to turn all C programs
into "abort()" calls.

-- 
Greg A. Woods <gwoods%acm.org@localhost>

Kelowna, BC     +1 250 762-7675
RoboHack <woods%robohack.ca@localhost>
Planix, Inc. <woods%planix.com@localhost>
Avoncote Farms <woods%avoncote.ca@localhost>