tech-userlevel archive

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index][Old Index]

VIS_HTTPSTYLE



Hi,

I think there is a mistake in vis(3) regarding VIS_HTTPSTYLE. Currently
translating URLs replaces all non-alphanumeric characters with escapes eg

ORIG: www.netbsd.org
HTTP: www%2enetbsd%2eorg

which is technically correct but not really necessary. Looking at rfc1738
it says

   Thus, only alphanumerics, the special characters "$-_.+!*'(),", and
   reserved characters used for their reserved purposes may be used
   unencoded within a URL.

so I'm thinking that the translation should not need to happen for these
special characters (which are mentioned also in rfc1808) In
lib/libc/gen/vis.c we have this:

/*
 * This is do_hvis, for HTTP style (RFC 1808)
 */
static char *
do_hvis(char *dst, int c, int flag, int nextc, const char *extra)
{
        if (!isascii(c) || !isalnum(c) || strchr("$-_.+!*'(),", c) != NULL) {
                *dst++ = '%';
                *dst++ = xtoa(((unsigned int)c >> 4) & 0xf);
                *dst++ = xtoa((unsigned int)c & 0xf);
        } else {
                dst = do_svis(dst, c, flag, nextc, extra);
        }
        return dst;
}

but I think it should actually have been "strchr() == NULL" to escape the
unmatched characters.. Also, I think there would be a bug with the use of
strchr(), as it will match '\0' which ought to be escaped .. looking at
the FreeBSD function which is arranged slightly differently, they have
used

        if (flag & VIS_HTTPSTYLE) {
                /* Described in RFC 1808 */
                if (!(isalnum(c) /* alpha-numeric */
                    /* safe */
                    || c == '$' || c == '-' || c == '_' || c == '.' || c == '+'
                    /* extra */
                    || c == '!' || c == '*' || c == '\'' || c == '('
                    || c == ')' || c == ',')) {
                        *dst++ = '%';
                        snprintf(dst, 4, (c < 16 ? "0%X" : "%X"), c);
                        dst += 2;
                        goto done;
                }
        }

And so, I would like to apply the following patch to fix that

Index: vis.c
===================================================================
RCS file: /cvsroot/src/lib/libc/gen/vis.c,v
retrieving revision 1.40
diff -u -r1.40 vis.c
--- vis.c       11 Feb 2009 13:52:28 -0000      1.40
+++ vis.c       22 Nov 2009 19:47:00 -0000
@@ -120,13 +120,20 @@
 static char *
 do_hvis(char *dst, int c, int flag, int nextc, const char *extra)
 {
-       if (!isascii(c) || !isalnum(c) || strchr("$-_.+!*'(),", c) != NULL) {
+
+       if ((isascii(c) && isalnum(c))
+           /* safe */
+           || c == '$' || c == '-' || c == '_' || c == '.' || c == '+'
+           /* extra */
+           || c == '!' || c == '*' || c == '\'' || c == '(' || c == ')'
+           || c == ',') {
+               dst = do_svis(dst, c, flag, nextc, extra);
+       } else {
                *dst++ = '%';
                *dst++ = xtoa(((unsigned int)c >> 4) & 0xf);
                *dst++ = xtoa((unsigned int)c & 0xf);
-       } else {
-               dst = do_svis(dst, c, flag, nextc, extra);
        }
+
        return dst;
 }

which gives me:

ORIG: www.netbsd.org
HTTP: www.netbsd.org

any comments or objections?

regards,
iain

(NB the VIS_HTTPSTYLE mangles more complicated URLs too, but that is much
more complex and I will send a PR about it as I've no time atm)




Home | Main Index | Thread Index | Old Index