On 12-Oct-08, at 1:19 PM, Aleksey Cheusov wrote:
> libc's regexp doesn't support UTF-8, and therefore usr.bin/grep will not either.
I'd strongly suggest that's actually more of an incentive to move to a BSD grep, primarily because grep and egrep would then conform to the standard libc regexp implementation, which the current grep obviously does not.
It's also an incentive to improve the base regexp code (e.g. by adding UTF-8 support), if that's deemed desirable and possible to do without causing major confusion and bloat.
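
To make the "conform to libc" point concrete, here is a minimal sketch, assuming a grep-like tool that hands all matching to the POSIX regcomp(3)/regexec(3) interface. The name line_matches is hypothetical and not taken from any existing grep source. The idea is only that such a tool gets exactly the regexp behaviour, and any UTF-8 handling, that libc provides, no more and no less.

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>

/*
 * Hypothetical helper (not from any real grep): returns 1 if `line`
 * matches the extended regular expression `pattern`, 0 if it does not,
 * and -1 if libc rejects the pattern.
 */
int
line_matches(const char *pattern, const char *line)
{
	regex_t re;
	int rc;

	assert(pattern != NULL && line != NULL);

	/* REG_NOSUB: only match/no-match is needed, not submatch offsets. */
	if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
		return -1;

	rc = regexec(&re, line, 0, NULL, 0);
	regfree(&re);

	return rc == 0 ? 1 : 0;
}
```

With this shape, any improvement made to the base regexp code would show up in grep and in every other libc regexp consumer at the same time.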
-- Greg A. Woods; Planix, Inc. <woods%planix.ca@localhost>