NetBSD-Users archive


Re: Ideas for stripping tags from document



On 2021-01-17 10:57, Ignatios Souvatzis (GSG) wrote:


On 17 January 2021 00:01:23 CET, Johnny Billquist <bqt%update.uu.se@localhost> wrote:
On 2021-01-16 19:45, Todd Gruhn wrote:
I have a large document (18,000 lines). It is full of tags such as <93>,
<94>, <95>.

If I view the doc in a PERL editor I see \x{93}, \x{94}, \x{95} ...

Is there a pkg or command to strip these tags and leave the text ?

tr -d "\223\224\225" < infile > outfile

I'd convert them to ", ", and maybe *, if you really want pure ASCII, but yes.

Well, he did ask how to strip them.

But sure, tr can be used for replacing them with other characters as well, obviously. Trivial, in fact.
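
A minimal sketch of the replacing variant, assuming the file is in a single-byte encoding such as Windows-1252, where 0x93 and 0x94 are curly double quotes and 0x95 is a bullet (that is a guess from the byte values, not something stated in the thread). The file names are placeholders, and the sample input is generated here only so the example runs on its own:

```shell
# Build a small sample containing the three bytes (octal 223, 224, 225
# are hex 93, 94, 95).
printf '\223quoted\224 \225 item\n' > infile

# Map 0x93 -> ", 0x94 -> ", 0x95 -> * (position by position).
# LC_ALL=C is set so that tr treats the input as plain bytes.
LC_ALL=C tr '\223\224\225' '""*' < infile > outfile

cat outfile
```

If the document turns out to be UTF-8 instead, the characters are stored as two-byte sequences and a byte-wise tr would leave stray bytes behind, so check the encoding first.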

  Johnny

--
Johnny Billquist                  || "I'm on a bus
                                  ||  on a psychedelic trip
email: bqt%softjar.se@localhost             ||  Reading murder books
pdp is alive!                     ||  tryin' to stay hip" - B. Idol

