tech-userlevel archive

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index][Old Index]

Re: why cast to char* through void*



I agree fully that any inconsistency between lint and the compiler is a
bad thing.

I also agree fully that use of /*LINTED*/ comments should be limited as
much as possible and that every single one of them should send off
warning bells for everyone, and that every use should be clearly
documented so that even a relatively in-experienced programmer can
hopefully understand the issue it is "allowing".

However in this case there really still is something very tricky going
on, and both the compiler and lint should warn about it, despite the
fact that it is a "char *" conversion.  The code explicitly mixes two
different types into the same block of storage, and it does so by
explicitly foiling any ability the compiler might have to provide better
checks for correct usage.

In this case too /*LINTED*/ is probably far from ideal too since the
code could easily be rewritten to avoid the whole problem in the first
place.

I think the following demonstrates the simplest algorithm that will
avoid both the need for pointer aliasing, plus all the other associated
scanning, counting, allocating, copying, etc.

However it also seems to demonstrate that there's at least a rather
major bug in stresep(3).  Observe the escape character appearing in
FOO2's value, and the next VAR='none' line in the output.

Of course in login_cap.c I believe the input string is writable, and
that it is not going to be used after the fact either, so the malloc()
and copying can be completely avoided.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(void);

static const char * const input =
        "FOO1=bar,FOO2=\\ bar\\,\\ none,FOO3,,FOO4=boo\\ ";

int
main(void)
{
        char *var;
        char *ptr;
        char *str;
        char *freep;

        str = freep = malloc(strlen(input) + 1);
        strcpy(str, input);

        for (ptr = str; (var = stresep(&ptr, ", \t", '\\')) != NULL;) {
                char *value;

                if (!*var) {
                        printf("empty field at '%s'\n", ptr);
                        continue;
                }

                if ((value = strchr(var, '=')) != NULL) {
                        *value++ = '\0';
                } else {
                        value = NULL;
                }
                printf("VAR = '%s', VALUE='%s'\n", var, value ? value : 
"(NULL)");
        }
        free(freep);

        exit(0);
        /* NOTREACHED */
}


/*
 * Local Variables:
 * eval: (make-local-variable 'compile-command)
 * compile-command: "make 'CFLAGS=-g -fstrict-aliasing -W -Wall 
-Wstrict-aliasing=2 -Wimplicit -Wreturn-type -Wswitch -Wcomment -Wextra 
-Wcast-qual -Wpointer-arith -Wshadow -Wstrict-prototypes -Waggregate-return 
-Wcast-align -Wchar-subscripts -Wconversion -Wwrite-strings 
-Wmissing-declarations -Wmissing-prototypes -Wno-long-long -Wformat-extra-args 
-Wundef -Wbad-function-cast -Wdeclaration-after-statement -pipe' tsetuserenv && 
./tsetuserenv"
 * End:
 */ 


-*- mode: compilation; default-directory: "~/src/" -*-
Compilation started at Sat Jul  4 12:58:32

make 'CFLAGS=-g -fstrict-aliasing -W -Wall -Wstrict-aliasing=2 -Wimplicit 
-Wreturn-type -Wswitch -Wcomment -Wextra -Wcast-qual -Wpointer-arith -Wshadow 
-Wstrict-prototypes -Waggregate-return -Wcast-align -Wchar-subscripts 
-Wconversion -Wwrite-strings -Wmissing-declarations -Wmissing-prototypes 
-Wno-long-long -Wformat-extra-args -Wundef -Wbad-function-cast 
-Wdeclaration-after-statement -pipe' tsetuserenv && ./tsetuserenv
VAR = 'FOO1', VALUE='bar'
VAR = 'FOO2', VALUE=' bar,\'
VAR = 'none', VALUE='(NULL)'
VAR = 'FOO3', VALUE='(NULL)'
empty field at 'FOO4=boo\ '
VAR = 'FOO4', VALUE='boo '

Compilation finished at Sat Jul  4 12:58:33

I think the output should of course have been:

    VAR = 'FOO1', VALUE='bar'
    VAR = 'FOO2', VALUE=' bar, none'
    VAR = 'FOO3', VALUE='(NULL)'
    empty field at 'FOO4=boo\ '
    VAR = 'FOO4', VALUE='boo '



If there's any efficiency still to be gained anywhere it might be in
stresep() where an input containing multiple escape characters will
result in multiple one-byte shuffling of the rest of the whole string!
Perhaps there is a more efficient way to mark them and then do one
shuffle of just the content of the one field.

-- 
                                                Greg A. Woods
                                                Planix, Inc.

<woods%planix.com@localhost>       +1 416 218-0099        http://www.planix.com/

Attachment: pgps5CdRoNdZk.pgp
Description: PGP signature



Home | Main Index | Thread Index | Old Index