Subject: language names and character sets for web pages
To: None <>
From: Klaus Heinz <>
List: netbsd-docs
Date: 11/07/2002 02:49:10

I have collected the current set of web page translations (where at
least some pages exist) and the corresponding language codes and
character sets used:

Chinese   : lang="zh-TW" charset=big5
Czech     : lang="cs"    charset=ISO-8859-2
French    : lang="fr"    charset=ISO-8859-1
German    : lang="de"    charset=ISO-8859-1
Japanese  : lang="ja"    charset=ISO-2022-JP   ???
Korean    : lang="ko"    charset=EUC-KR        ???
Lithuanian: lang="lt"    charset=ISO-8859-13
Polish    : lang="pl"    charset=ISO-8859-2
Portuguese: lang="pt-BR" charset=ISO-8859-1
Russian   : lang="ru"    charset=koi8-r
Spanish   : lang="es"    charset=ISO-8859-1    ???
Swedish   : lang="se"    charset=ISO-8859-1
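
As a rough cross-check of the list, here is a minimal Python sketch. It
assumes that Python's codec registry is a fair stand-in for "a charset name
that software recognizes". The names TRANSLATIONS, unknown_charsets and
html_head are made up for illustration and are not part of any existing
tooling. The sketch checks each charset name and renders the two places
where an HTML page would declare its language and charset:

```python
import codecs

# Language -> (lang attribute, charset), copied from the list above.
TRANSLATIONS = {
    "Chinese": ("zh-TW", "big5"),
    "Czech": ("cs", "ISO-8859-2"),
    "French": ("fr", "ISO-8859-1"),
    "German": ("de", "ISO-8859-1"),
    "Japanese": ("ja", "ISO-2022-JP"),
    "Korean": ("ko", "EUC-KR"),
    "Lithuanian": ("lt", "ISO-8859-13"),
    "Polish": ("pl", "ISO-8859-2"),
    "Portuguese": ("pt-BR", "ISO-8859-1"),
    "Russian": ("ru", "koi8-r"),
    "Spanish": ("es", "ISO-8859-1"),
    "Swedish": ("se", "ISO-8859-1"),
}


def unknown_charsets(translations):
    """Return the charset names that Python's codec registry does not know."""
    missing = []
    for _lang, charset in translations.values():
        try:
            codecs.lookup(charset)
        except LookupError:
            missing.append(charset)
    return missing


def html_head(lang, charset):
    """Render the lang attribute and the charset declaration for one page."""
    return (
        f'<html lang="{lang}">\n'
        f"<head>\n"
        f'<meta http-equiv="Content-Type" '
        f'content="text/html; charset={charset}">\n'
        f"</head>"
    )


if __name__ == "__main__":
    print("unrecognized charsets:", unknown_charsets(TRANSLATIONS))
    print(html_head(*TRANSLATIONS["Czech"]))
```

A check like this only confirms that the charset names are recognized. It
says nothing about whether a language code is the right one. For example,
ISO 639-1 lists "sv" for Swedish, while "se" is the code for Northern Sami,
so the last entry probably deserves a second look.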

I am not entirely sure about the lines marked with '???'.
Several other questions still remain:

  - Is anybody still working on a Czech translation?
  - Is Brazilian Portuguese different enough from Portuguese to _need_
    the language subtag '-BR'?
    I only know this kind of difference from de-DE, de-AT and de-CH,
    where it _seems_ not to be necessary for our purposes.
  - Who works on zh-TW?
    Maybe I am a bit ignorant, so please forgive me if this is a silly
    question: Couldn't it be 'zh' so it applies to all Chinese-speaking
    people in the world? I suppose there _is_ an official Chinese
    language (though I'm not so sure, considering the vast number of
    people :-).
  - Who works on Korean?
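
For the subtag questions, it may help to see how a reader's language
preference is commonly compared with a page's language tag. This is a
minimal sketch, assuming the prefix-matching rule that HTTP/1.1 (RFC 2616,
section 14.4) describes for Accept-Language. The function name
range_matches is made up for illustration:

```python
def range_matches(language_range: str, tag: str) -> bool:
    """Prefix-match a requested language range against a page's language tag.

    The range matches if it equals the tag, or if it is a prefix of the
    tag and the next character of the tag is "-". Case is ignored.
    """
    language_range = language_range.lower()
    tag = tag.lower()
    return tag == language_range or tag.startswith(language_range + "-")


if __name__ == "__main__":
    pairs = [("zh", "zh-TW"), ("zh-TW", "zh"), ("pt", "pt-BR"), ("de-AT", "de")]
    for wanted, page in pairs:
        print(f"range {wanted!r} vs page {page!r}: {range_matches(wanted, page)}")
```

Under this rule, a reader asking for 'zh' still matches a page tagged
'zh-TW', and one asking for 'pt' matches 'pt-BR'. A reader asking
specifically for 'zh-TW' or 'de-AT' does not match a page tagged plain 'zh'
or 'de'.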