NetBSD-Bugs archive
Re: bin/57616: sed(1) is unable to process multibyte unicode characters properly
The following reply was made to PR bin/57616; it has been noted by GNATS.
From: "Fege, Marc Daniel" <marc.fege%uni-bonn.de@localhost>
To: gnats-bugs%netbsd.org@localhost, gnats-admin%netbsd.org@localhost, netbsd-bugs%netbsd.org@localhost
Cc:
Subject: Re: bin/57616: sed(1) is unable to process multibyte unicode
characters properly
Date: Mon, 11 Sep 2023 17:40:15 +0200
Hello Michael,
thanks a lot for your quick reply.
> Wide char support ("NLS") from FreeBSD was integrated in 2021 and
> will be in NetBSD-10.
That's fantastic news. So it seems I'm a little late with my bug
report, even though 9.3 is the most recent stable release and the
issue is still valid for at least the 9.x branch.
However: are there plans to backport this to a possible NetBSD 9.4,
or do we have to wait for the 10 release?
> It's not actually about sed failing but what the underlying regexp
> library can do.
Since I'm just an ordinary user, not a developer in any way, I was
unable to give details beyond a surface-level diagnosis. What I see
as a user, as the frontend to all of that underlying machinery, is
just a program called sed(1). That's why I referred to it in this
particular use case.
Thanks a lot!
On Monday, 11.09.2023 at 17:05, mlelstv%serpens.de@localhost (michael
van elst) wrote:
The following reply was made to PR bin/57616; it has been noted by
GNATS.
From: mlelstv%serpens.de@localhost (Michael van Elst)
To: gnats-bugs%netbsd.org@localhost
Cc:
Subject: Re: bin/57616: sed(1) is unable to process multibyte unicode
characters properly
Date: Mon, 11 Sep 2023 15:03:24 -0000 (UTC)
marc.fege%uni-bonn.de@localhost writes:
>NetBSD rpi 9.3 NetBSD 9.3 (RPI) #0: Thu Aug  4 15:30:37 UTC 2022  mkrepro%mkrepro.NetBSD.org@localhost:/usr/src/sys/arch/evbarm/compile/RPI evbarm
>sed(1) has a problem processing multibyte unicode characters properly.
>     echo "abc???xyz" | sed 's/./& /g'
>I expect the following output format for further processing:
>     "a b c ? ? ? x y z "
It's not actually about sed failing but what the underlying regexp
library can do.
Wide char support ("NLS") from FreeBSD was integrated in 2021 and
will be in NetBSD-10.
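
For illustration, the difference between a byte-oriented and a character-aware regexp engine can be sketched in Python. This is only an analogy, not NetBSD's regex library or sed itself. The sample string "abcäöüxyz" is an assumption: the original multibyte characters were lost in the archive and appear above only as "???". The function names are hypothetical.

```python
import re

def space_out_bytes(data: bytes) -> bytes:
    # Byte-oriented matching: "." matches a single byte, so each
    # multibyte UTF-8 sequence is split apart by the inserted spaces.
    return re.sub(rb".", rb"\g<0> ", data)

def space_out_chars(text: str) -> str:
    # Character-aware matching: "." matches a whole code point,
    # so multibyte characters stay intact.
    return re.sub(r".", r"\g<0> ", text)

# Assumed sample input; the report's real characters are unknown.
sample = "abcäöüxyz"

byte_result = space_out_bytes(sample.encode("utf-8"))
char_result = space_out_chars(sample)

print(byte_result)  # the bytes of each non-ASCII character are separated
print(char_result)  # one space after every character
```

The byte-oriented result is no longer valid UTF-8, which resembles the kind of broken output described in the report, while the character-aware result has the expected "one character, one space" shape.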