tech-userlevel archive


Re: change proposal: nvi behavior for multi-width character



In article <689cc9ef-1580-5ab3-b988-8fafe688b4ae%rk.phys.keio.ac.jp@localhost>,
Rin Okuyama  <rokuyama%rk.phys.keio.ac.jp@localhost> wrote:
>Hi,
>
>I'm planning to change the current behavior of nvi for multi-width
>characters to match nvi-m17n, written by itojun.
>
>Any suggestions or comments are welcome, especially from users
>who live in non-C locales :-).
>
>(1) cursor position (nvi-cursor.patch)
>
>This patch fixes the cursor position when a multi-width character
>does not fit on a line and is placed on the next line.
>
>Also, when the cursor is on a multi-width character, put it on
>the first column of the character, instead of the last column as
>in the current implementation. Otherwise, some terminal emulators
>do not highlight the entire character, only its right-most
>column.
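
The wrapping rule in (1) can be sketched as below. This is only an
illustration, not code from nvi-cursor.patch: `struct pos`, `cursor_pos`,
and the `widths` array (display columns per character) are hypothetical
names, and tabs, line numbers, and horizontal scrolling are ignored.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical screen position: zero-based row and column. */
struct pos {
    size_t row;
    size_t col;
};

/*
 * Return the screen position of character idx in a line whose
 * characters have the given display widths, on a screen that is
 * cols columns wide.  A character that does not fit in the columns
 * left on a row is moved to the start of the next row, and the
 * cursor is reported on the FIRST column of the character.
 */
struct pos
cursor_pos(const int *widths, size_t idx, size_t cols)
{
    struct pos p = { 0, 0 };
    size_t i;

    for (i = 0; i <= idx; i++) {
        assert(widths[i] >= 1 && (size_t)widths[i] <= cols);
        /* Not enough room left on this row: wrap first. */
        if (p.col + (size_t)widths[i] > cols) {
            p.row++;
            p.col = 0;
        }
        if (i == idx)
            break;          /* cursor sits on the first column */
        p.col += (size_t)widths[i];
    }
    return p;
}
```

With a 5-column screen and widths {1, 1, 1, 1, 2, 1}, the double-width
character at index 4 would need columns 4 and 5, so it wraps and the
cursor is reported at row 1, column 0.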
>
>(2) join command (nvi-join.patch)
>
>This patch changes the amount of white space inserted when lines
>ending or beginning with multi-width characters are joined:
>
>   last char       first char      behavior
>   ---------       ----------      --------
>   multi-width     multi-width     nothing inserted
>   multi-width     single-width    1 space inserted
>   single-width    multi-width     1 space inserted
>   single-width    single-width    original behavior
>
>This is (basically) the same behavior as nvi-m17n. As a Japanese
>speaker, I feel this is quite a reasonable choice, and I guess it
>may also be one for other non-European languages that leave no
>space between words.
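
The table in (2) can be read as a small decision function. The sketch
below is an assumption-laden illustration, not the contents of
nvi-join.patch: `join_spaces` is a hypothetical name, the widths are
display columns of the two characters, and `orig_spaces` stands for
whatever the unpatched join would insert for that pair.

```c
#include <assert.h>

/*
 * Number of spaces to insert between two joined lines.
 * last_w:      display width of the last character of the upper line
 * first_w:     display width of the first character of the lower line
 * orig_spaces: what the unpatched join would insert
 */
int
join_spaces(int last_w, int first_w, int orig_spaces)
{
    assert(last_w >= 1 && first_w >= 1);

    if (last_w > 1 && first_w > 1)
        return 0;               /* multi + multi: nothing inserted */
    if (last_w > 1 || first_w > 1)
        return 1;               /* mixed widths: exactly one space */
    return orig_spaces;         /* single + single: unchanged */
}
```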
>
>(3) word-wise movement (not yet)
>
>At the moment, word-wise movements do not work for languages
>without spaces between words. They may never work unless we have
>LC_COLLATE support in our libc. (Also, morphological analysis
>would be required for a full implementation for languages like
>Japanese. However, that is quite a different matter...)
>
>Tentatively, I suggest treating a change in character width as
>a word boundary (display width, not character length in bytes;
>cf. characters with umlauts in UTF-8). What do you think of this?
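
The tentative rule in (3) could look roughly like this. It is a sketch
under stated assumptions, not proposed nvi code: `next_width_boundary`
is a hypothetical name, `widths` holds display columns (as wcwidth(3)
would report them) rather than byte lengths, and the existing
white-space and punctuation rules are left out.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Return the index of the first character after pos whose display
 * width differs from that of its predecessor, or n if there is no
 * such character.  Only display width matters: a letter with an
 * umlaut is several bytes in UTF-8 but still one column wide, so it
 * does not start a new word.
 */
size_t
next_width_boundary(const int *widths, size_t n, size_t pos)
{
    size_t i;

    assert(widths != NULL || n == 0);

    for (i = pos + 1; i < n; i++)
        if (widths[i] != widths[i - 1])
            return i;
    return n;
}
```

For three single-width letters followed by three double-width
characters (widths {1, 1, 1, 2, 2, 2}), a forward word motion from
index 0 would stop at index 3.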

I am good with all that...

christos


