tech-pkg archive


Re: UTF-8 cleanliness



On 16.09.2019 18:02, Jonathan Perkin wrote:
> Is it assumed that any part of pkgsrc can contain characters that are
> not UTF-8 clean?
> 
> There are a number of places where this is always going to be the
> case, for example some of the aspell-* packages install specific
> language files, and some DESCR files contain author names.
> 
> Is there any reason why a package name couldn't?  Would we ever want
> it to?  Same for anything else that might be meaningfully used.
> 
> Ideally we'd have some documentation which is explicit about what
> formats are supported in various parts of the infrastructure, but I've
> not found much.
> 
> (Background: I'm working on something that parses various things and
> am now coming up against this).
> 

For package names, please stick to ASCII.

What does "not UTF-8 clean" mean? Some other encoding, such as ISO Latin-2?

I would prefer to normalize to ASCII, since non-ASCII names tend to break
across filesystems, setups, archives, etc.
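
To make the distinction concrete, here is a rough sketch of how a parser
could sort a byte string into "plain ASCII", "valid UTF-8", or "something
else" (for example a legacy 8-bit encoding such as ISO Latin-2). This is
only an illustration and not part of pkgsrc; the function name
classify_bytes and the sample strings are made up for the example.

```python
def classify_bytes(data: bytes) -> str:
    """Return 'ascii', 'utf-8', or 'other' for a byte string.

    Hypothetical helper for illustration; not a pkgsrc API.
    """
    try:
        # Pure 7-bit data decodes as ASCII (and is therefore valid UTF-8 too).
        data.decode("ascii")
        return "ascii"
    except UnicodeDecodeError:
        pass
    try:
        # Strict decoding rejects byte sequences that are not well-formed UTF-8.
        data.decode("utf-8")
        return "utf-8"
    except UnicodeDecodeError:
        # Not UTF-8 clean: possibly a legacy encoding such as ISO Latin-2.
        return "other"


if __name__ == "__main__":
    samples = [
        b"aspell-de",                  # ASCII-only name
        "Jörg".encode("utf-8"),        # author name stored as UTF-8
        "Jörg".encode("iso-8859-2"),   # same name stored as ISO Latin-2
    ]
    for sample in samples:
        print(sample, classify_bytes(sample))
```

A check like this can only say that data is not valid UTF-8; it cannot
tell which legacy encoding was actually used, so that would still have
to be documented or guessed.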



