tech-userlevel archive


Re: change proposal: nvi behavior for multi-width character



Thank you for your comment!

rin

On 2017/11/07 2:45, Christos Zoulas wrote:
In article <689cc9ef-1580-5ab3-b988-8fafe688b4ae%rk.phys.keio.ac.jp@localhost>,
Rin Okuyama  <rokuyama%rk.phys.keio.ac.jp@localhost> wrote:

Hi,

I'm planning to change the current behavior of nvi for multi-width
characters to match nvi-m17n, written by itojun.

Any suggestions or comments are welcome, especially from users
who live in non-C locales :-).

(1) cursor position (nvi-cursor.patch)

This patch fixes the cursor position when a multi-width character
does not fit at the end of a line and is displayed on the next line.

Also, when the cursor is on a multi-width character, put it on
the first column of the character, instead of the last column as
in the current implementation. Otherwise, some terminal emulators
highlight only the right-most column rather than the entire
character.
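
For illustration, the two cursor rules above could be sketched as
follows. This is only a sketch under stated assumptions, not code from
nvi-cursor.patch: cursor_cell() and struct cell are hypothetical names,
and the display width of each character (1 or 2, for example as
reported by wcwidth(3)) is assumed to be already available in an array.

```c
#include <stddef.h>

/* Screen position of the cursor, both fields counted from 0. */
struct cell {
	size_t row;
	size_t col;
};

/*
 * Hypothetical helper, not taken from nvi.
 * widths[i] is the display width of character i of the line,
 * idx is the character under the cursor, cols is the screen width.
 * Assumes every width is no larger than cols.
 */
struct cell
cursor_cell(const int *widths, size_t idx, size_t cols)
{
	struct cell c = { 0, 0 };
	size_t i;

	for (i = 0; i <= idx; i++) {
		size_t w = (size_t)widths[i];

		/* A character that does not fit moves to the next row. */
		if (c.col + w > cols) {
			c.row++;
			c.col = 0;
		}
		if (i == idx)
			break;	/* stay on the first column of idx */
		c.col += w;
	}
	return c;
}
```

With widths {1, 1, 1, 2} on a 4-column screen, the last character does
not fit in the remaining single column, so the cursor for it is
reported at the first column of the next row.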

(2) join command (nvi-join.patch)

This patch changes the amount of white space inserted when lines
ending or beginning with multi-width characters are joined:

  last char       first char      behavior
  ---             ---             ---
  multi-width     multi-width     nothing inserted
  multi-width     single-width    1 space inserted
  single-width    multi-width     1 space inserted
  single-width    single-width    original behavior

This is (basically) the same behavior as nvi-m17n. As a Japanese
speaker, I feel this is quite a reasonable choice, and I guess the
same holds for other non-European languages that leave no space
between words.
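
The table above can be written down as a small decision function. This
is a minimal sketch, not code from nvi-join.patch: join_spaces() and
JOIN_ORIGINAL are hypothetical names, and the display widths of the two
characters are assumed to be known to the caller.

```c
/* Returned when the existing (single-width) join rules should apply. */
enum { JOIN_ORIGINAL = -1 };

/*
 * Hypothetical helper, not taken from nvi.
 * last_w:  display width of the last character of the upper line.
 * first_w: display width of the first character of the lower line.
 * Returns the number of spaces to insert, or JOIN_ORIGINAL.
 */
int
join_spaces(int last_w, int first_w)
{
	int last_wide = last_w > 1;
	int first_wide = first_w > 1;

	if (last_wide && first_wide)
		return 0;		/* multi + multi: nothing inserted */
	if (last_wide || first_wide)
		return 1;		/* mixed widths: one space */
	return JOIN_ORIGINAL;		/* single + single: unchanged */
}
```

Returning a separate JOIN_ORIGINAL value keeps the single-width case
out of this function, since the original rules there depend on more
than the character widths.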

(3) word-wise movement (not yet)

At the moment, word-wise movements do not work for languages
without spaces between words. They may never work unless we have
LC_COLLATE support in our libc. (Also, morphological analysis
would be required for a full implementation for languages like
Japanese. However, that is quite a different matter...)

Tentatively, I suggest treating a change in character width as
a word boundary (display width, not character length in bytes; cf.
characters with umlauts in UTF-8). What do you think of this?

I am good with all that...

christos



