On 16.09.2019 18:02, Jonathan Perkin wrote:
> Is it assumed that any part of pkgsrc can contain characters that are
> not UTF-8 clean?
>
> There are a number of places where this is always going to be the
> case, for example some of the aspell-* packages install specific
> language files, and some DESCR files contain author names.
>
> Is there any reason why a package name couldn't? Would we ever want
> it to? Same for anything else that might be meaningfully used.
>
> Ideally we'd have some documentation which is explicit about what
> formats are supported in various parts of the infrastructure, but I've
> not found much.
>
> (Background: I'm working on something that parses various things and
> am now coming up against this).
>

For package names, please keep to ASCII.

What does "not UTF-8 clean" mean? Some other encoding, such as ISO Latin-2?
I would prefer to normalize to ASCII, as such things tend to break
across filesystems, setups, archives, etc.
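
To make the distinction concrete, here is a minimal sketch of the two checks a parser might apply. It is not part of pkgsrc: the function names `is_ascii` and `is_utf8_clean` are hypothetical, and it assumes package names arrive as text while file contents arrive as raw bytes.

```python
def is_ascii(name: str) -> bool:
    """Return True if every character in the name is 7-bit ASCII."""
    return all(ord(ch) < 128 for ch in name)


def is_utf8_clean(data: bytes) -> bool:
    """Return True if the bytes decode as strict UTF-8."""
    try:
        data.decode("utf-8", errors="strict")
    except UnicodeDecodeError:
        return False
    return True


if __name__ == "__main__":
    # An ASCII-only package name passes the stricter check.
    print(is_ascii("aspell-en"))
    # A single byte 0xE9 is a valid ISO Latin-2 character but not valid UTF-8.
    print(is_utf8_clean(b"\xe9"))
```

Under these assumptions, a package name would have to pass the ASCII check, while files such as DESCR would only be tested for UTF-8 cleanliness, with failures reported rather than assumed away.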