tech-pkg archive


Re: UTF-8 cleanliness

On September 16, 2019 4:02:22 PM UTC, Jonathan Perkin wrote:
>Is it assumed that any part of pkgsrc can contain characters that are
>not UTF-8 clean?
>There are a number of places where this is always going to be the
>case, for example some of the aspell-* packages install specific
>language files, and some DESCR files contain author names.
>Is there any reason why a package name couldn't?  Would we ever want
>it to?  Same for anything else that might be meaningfully used.
>Ideally we'd have some documentation which is explicit about what
>formats are supported in various parts of the infrastructure, but I've
>not found much.

The first place to look should be the pkgsrc guide. I guess you have already looked there.

Another place to look is the pkglint source code. Sure, it's harder to read than prose, but pkglint detects, and sometimes explains, more than 100 additional pkgsrc rules. One of them is the check called Pkgname in vartypecheck.go, which allows only very few characters in package names. See Test_VarTypeCheck_Pkgname in vartypecheck_test.go for examples.
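
As a rough illustration of what such a rule looks like, here is a minimal sketch in Go. The regular expression and the function name isPlausiblePkgname are assumptions made up for this example; they are not copied from vartypecheck.go, and the character set that pkglint really accepts may differ.

```go
package main

import (
	"fmt"
	"regexp"
)

// pkgnameRe is an ASSUMED pattern for illustration only, not the one
// pkglint uses: an ASCII base name, a hyphen, and a version that starts
// with a digit.
var pkgnameRe = regexp.MustCompile(`^[A-Za-z0-9][A-Za-z0-9+._-]*-[0-9][A-Za-z0-9.]*$`)

// isPlausiblePkgname reports whether s matches the assumed pattern.
// (Hypothetical helper; pkglint has no function of this name.)
func isPlausiblePkgname(s string) bool {
	return pkgnameRe.MatchString(s)
}

func main() {
	names := []string{"aspell-de-1.0", "libfoo-2.3nb1", "caf\u00e9-1.0", "no version"}
	for _, name := range names {
		fmt.Printf("%q: %v\n", name, isPlausiblePkgname(name))
	}
}
```

With a restrictive pattern like this, a name containing a non-ASCII letter is rejected outright, so the encoding question never arises for package names.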

There's another check in pkglint that complains about non-ASCII characters in any pkgsrc file, if I remember correctly.
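
A check of that kind could be sketched roughly as follows. This is not pkglint's code; the function classify and its three result strings are hypothetical names chosen for this example. It only shows how to tell pure ASCII, valid UTF-8 and everything else apart using Go's standard unicode/utf8 package.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// classify is a hypothetical helper, not part of pkglint. It returns
// "ascii" if every byte is below 0x80, "utf-8" if the data contains
// non-ASCII bytes but is valid UTF-8, and "other" otherwise.
func classify(data []byte) string {
	ascii := true
	for _, b := range data {
		if b >= utf8.RuneSelf { // utf8.RuneSelf is 0x80
			ascii = false
			break
		}
	}
	switch {
	case ascii:
		return "ascii"
	case utf8.Valid(data):
		return "utf-8"
	default:
		return "other"
	}
}

func main() {
	fmt.Println(classify([]byte("plain DESCR text")))
	fmt.Println(classify([]byte("J\u00f6rg")))             // author name encoded as UTF-8
	fmt.Println(classify([]byte{'J', 0xf6, 'r', 'g'})) // same name in ISO 8859-1
}
```

The third case is the one the original question is about: a file that is neither ASCII nor UTF-8 clean, such as a DESCR file with an author name in a legacy 8-bit encoding.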
