tech-userlevel archive


Re: A new modern spell(1) Implementation



On Sun, Jan 29, 2017 at 4:37 AM, Robert Elz <kre%munnari.oz.au@localhost> wrote:
>     Date:        Sun, 29 Jan 2017 01:47:18 +0530
>     From:        Abhinav Upadhyay <er.abhinav.upadhyay%gmail.com@localhost>
>     Message-ID:  <CAHwRYJm-TPnV+EaSqgF=KP_tU-6DXCtztzDnJ-eGjym2RtzfWg%mail.gmail.com@localhost>
>
>   | But a spell checker is no good if it tells you that you misspelled a
>   | word but doesn't tell you the correct spelling.
>
> With that I disagree.   Particularly if the aim is a replacement of
> spell(1) rather than yet another {a,i,hun}spell type program.
>
> One common usage of spell is
>
>         spell input files >wordlist
>
> and then edit wordlist to delete the actual misspellings, leaving in names,
> acronyms, and similar, which aren't appropriate for a dictionary, but
> which just annoy when reported as errors (and then:
>         spell +wordlist input files
> or ispell -p wordlist ... etc.)

I have added an option (-w) for providing a word list file; the words
in that file will not be marked as misspellings.
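
A minimal sketch of how such a word list could be applied. This is not
the code from the implementation under discussion; the function name,
the stand-in dictionary, and the case-insensitive matching are all
assumptions made for illustration.

```python
def unknown_words(words, dictionary, personal_words=()):
    """Return the words found in neither the dictionary nor the personal list."""
    # Assumption: matching is case-insensitive.
    known = {w.lower() for w in dictionary} | {w.lower() for w in personal_words}
    return [w for w in words if w.lower() not in known]


# Stand-in dictionary and personal word list (what a -w file might contain).
dictionary = {"the", "manual", "describes", "option"}
personal = {"netbsd", "kre"}

text = "The NetBSD manual describes teh option".split()
print(unknown_words(text, dictionary, personal))
```

Without the personal list, "NetBSD" would be reported alongside "teh";
with it, only the real misspelling remains.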

> Further, it should be possible to use spell in a Makefile, when generating
> a doc, to have the process fail if the doc contains spelling errors.
> Again, fixing things is not the objective.  Just discovering whether there
> are errors.

That should be achievable, though I have not tried it yet. spell(1)
should probably exit with a non-zero status when the document contains
spelling errors.
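
The exit-status idea could look roughly like the sketch below. The
check() helper, the stand-in dictionary, and the choice of 1 as the
failure status are assumptions for illustration, not a description of
what spell(1) does today.

```python
def check(text, dictionary):
    """Return (misspelled_words, exit_status) for the given text."""
    misspelled = [w for w in text.split() if w.lower() not in dictionary]
    # Assumed convention: 0 means no errors, 1 means at least one misspelling.
    status = 1 if misspelled else 0
    return misspelled, status


# Stand-in dictionary for the example.
dictionary = {"the", "quick", "brown", "fox"}

words, status = check("the quick brwon fox", dictionary)
print(words, status)
# A command-line wrapper would finish with sys.exit(status), so that a
# Makefile rule invoking it stops when the document has spelling errors.
```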

> And last on this point, generally, anyone with a half reasonable education
> can usually see how a word should be spelled, when it is pointed out that
> the way it is, is not correct (the hard part in proof reading is actually
> spotting the errors, not in fixing them).   And when you really don't know,
> we have dictionaries, and the ability to perform lookups.

Yes, I agree that most of the time we can fix the spelling ourselves
when told, but for those rare times when we do need to look it up in
the dictionary (or for users with poor spelling), it would still be
handy to provide the possible corrections. This, in my opinion, makes
it a much more useful tool. We can disable this by default and only
show suggestions when invoked with an option :)

Also, having the suggestions feature allows other applications to use
it, if they want to do spell checks.
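
To give a feel for what a suggestions interface might offer to other
applications, here is a small sketch. It uses Python's difflib
similarity matching purely as a stand-in; the algorithm, the suggest()
name, and the word list are assumptions, not the approach used in the
proposed implementation.

```python
import difflib


def suggest(word, dictionary, limit=3):
    """Return up to `limit` dictionary words that closely resemble `word`."""
    # get_close_matches ranks candidates by similarity, best match first.
    return difflib.get_close_matches(word.lower(), sorted(dictionary), n=limit)


# Stand-in dictionary for the example.
dictionary = {"reprimanded", "recommended", "demanded", "undoubtedly"}

print(suggest("repremanded", dictionary))
```

An application would call suggest() only for words already flagged as
misspelled, which keeps the plain checking mode unaffected.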

> ps: while I have no doubt that the algorithm that spell(1) uses could be
> improved, and that it is certainly possible to set out, and succeed, in
> fooling it with "words" like "birdlets" when you know how it works, in
> practice, if the doc was prepared by someone who was actually attempting
> to type and spell correctly, spell very rarely misses mistakes.

Yes, those examples are far-fetched, but there are some real
misspellings as well which spell(1) won't find, such as:

"appled", "coffeed" - They have an extra "d" added, which makes
spell(1) confuse it with the "ed" suffix
"undoubt" - It is not an actual word, but a non-native speaker might
not realize that
"repremanded" - a common misspelling of "reprimanded", but our
spell(1) fails to flag it
"undoubtedlys" - pretty easy for someone to type an extra "s" and
spell(1) will never complain :)

I would rather have spell(1) flag some unknown but valid words as
misspellings than have it miss mistakes like those :)
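
A toy illustration of why suffix stripping lets such words through. The
suffix list and both functions are simplified assumptions and not the
actual rules in spell(1); the point is only that stripping a suffix and
finding the stem in the dictionary is not enough to prove the word is
valid.

```python
# Assumed, heavily simplified suffix rules (not the real spell(1) rules).
SUFFIXES = ("ed", "d", "s")


def naive_accepts(word, dictionary):
    """Accept a word if it, or its stem after stripping a suffix, is known."""
    if word in dictionary:
        return True
    return any(
        word.endswith(suffix) and word[: -len(suffix)] in dictionary
        for suffix in SUFFIXES
    )


def strict_accepts(word, dictionary):
    """Accept a word only if it is listed explicitly, inflections included."""
    return word in dictionary


# Stand-in dictionary for the example.
dictionary = {"apple", "coffee", "undoubtedly", "reprimand", "reprimanded"}

for word in ("appled", "coffeed", "undoubtedlys", "reprimanded"):
    print(word, naive_accepts(word, dictionary), strict_accepts(word, dictionary))
```

The strict check rejects the three bad words but needs a dictionary that
lists inflected forms, which is the trade-off between the two approaches.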

> pps: personally I'd prefer using the OED, rather than that Webster nonsense,
> but that's a different battle.

We can, if it is available for free use like Webster's :)

