Re: [PATCH] replace 0xA0 to whitespace in plain text files (part 2)

To: tech-userlevel%NetBSD.org@localhost
Subject: Re: [PATCH] replace 0xA0 to whitespace in plain text files (part 2)
From: Alan Barrett <apb%cequrux.com@localhost>
Date: Thu, 11 Sep 2008 18:33:51 +0200

On Thu, 11 Sep 2008, der Mouse wrote:
> I do not want UTF-8; if I want to use Unicode, it seems
> much saner to me to use streams of hexdecets rather than encoding
> hexdecets into octet streams with a funky variable-length encoding.

Unicode is a 21-bit character set (or 31-bit in some old versions).
The 16-bit encoding (UTF-16) is just as funky and variable-length as the
8-bit encoding (UTF-8): any code point above U+FFFF takes two 16-bit
units instead of one.
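
To illustrate the variable-length point, here is a minimal Python
sketch that counts how many code units a single character needs in each
encoding.  The helper name encoded_units and the sample characters are
my own choices for illustration, not anything from the patch under
discussion.

```python
def encoded_units(ch):
    """Return (UTF-8 octets, UTF-16 16-bit units) needed for one character."""
    utf8_octets = len(ch.encode("utf-8"))
    # "utf-16-be" emits no byte-order mark, so the byte count is exactly
    # two bytes per 16-bit code unit.
    utf16_units = len(ch.encode("utf-16-be")) // 2
    return utf8_octets, utf16_units


if __name__ == "__main__":
    # Hypothetical samples: ASCII, Latin-1 range, rest of the BMP, beyond the BMP.
    for ch in ["A", "\u00e9", "\u20ac", "\U0001d11e"]:
        octets, units = encoded_units(ch)
        print(f"U+{ord(ch):04X}: {octets} octet(s) in UTF-8, "
              f"{units} unit(s) in UTF-16")
```

Characters in the Basic Multilingual Plane fit in one 16-bit unit, but
anything beyond it needs a surrogate pair, so a stream of 16-bit units
is not fixed-width either.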

> >> Not that I care so much, but are NetBSD supposed to have its files
> >> in Latin1?  Is that supposed to be the source character set, or
> >> what?
> > I think that simply is the practical reality.
> I agree.

I think the default should be either ASCII or UTF-8.  Other encodings
are too ambiguous.  For example, when you see an octet outside the ASCII
range and not part of a valid UTF-8 sequence, do you guess that it's
iso-8859-1, iso-8859-2, iso-8859-whatever, or something else entirely?
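
A rough Python sketch of that ambiguity, under my own assumptions: the
function names classify and guesses, the sample octet 0xE6, and the
particular three iso-8859 variants are all picked for illustration.
ASCII and UTF-8 can be checked mechanically, but once both checks fail
every 8-bit guess decodes without error and gives a different
character.

```python
def classify(data):
    """Return "ascii", "utf-8", or "unknown 8-bit" for a byte string."""
    try:
        data.decode("ascii")
        return "ascii"
    except UnicodeDecodeError:
        pass
    try:
        data.decode("utf-8")
        return "utf-8"
    except UnicodeDecodeError:
        # Neither check passed; nothing in the bytes says which
        # 8-bit character set was intended.
        return "unknown 8-bit"


def guesses(data, codecs=("iso-8859-1", "iso-8859-2", "iso-8859-5")):
    """Decode the same bytes under several candidate 8-bit encodings."""
    return {name: data.decode(name) for name in codecs}


if __name__ == "__main__":
    print(classify(b"plain text"))
    print(classify("caf\u00e9".encode("utf-8")))
    print(classify(b"caf\xe9"))
    # One octet, three equally "valid" readings.
    for name, text in guesses(b"\xe6").items():
        print(name, repr(text))
```

None of the iso-8859 decodes can fail for this octet, which is exactly
why the result tells you nothing about which one the author meant.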

> I think the default should be Latin-1, except that I also think tools
> such as wc should, by default, not complain about invalid Latin-1,
> instead sticking with the traditional behaviour of operating on bytes
> rather than characters.

I was talking about the default encoding used for source code and text
files supplied with the OS.  How tools should behave is a different
question, but I share your concerns.

--apb (Alan Barrett)

