tech-userlevel archive


Re: List of Keywords for apropos(1) Which Should Not be Stemmed



On Mon, Jul 11, 2016 at 06:59:25PM +0530, Abhinav Upadhyay wrote:
> But the downside is that technical keywords (e.g. kms, lfs, ffs), are
> also stemmed down and stored (e.g. km, lf, ff) in the index. So if you
> search for kms, you will see results for both kms and km.

Interesting problem.

I expect the set of documents that contain a word ("directories") and
the set of documents containing its true stem ("directory") to overlap
widely.  I also expect the set of documents that contain a word ("kms")
and an incorrect stem ("km") to scarcely overlap.  Do the manual pages
meet these expectations?  If so, then maybe you can decide whether or not
to keep a stem by looking at the document-set overlap?
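
A minimal sketch of that overlap test, in Python, might look like the
following.  Everything in it is an assumption made for illustration:
the function names, the 0.5 threshold, the choice of the overlap
coefficient as the measure, and the toy index, whose page names are
made up and not taken from the real manual pages.

```python
def overlap(docs_a, docs_b):
    """Overlap coefficient |A & B| / min(|A|, |B|) of two document sets."""
    if not docs_a or not docs_b:
        return 0.0
    return len(docs_a & docs_b) / min(len(docs_a), len(docs_b))


def keep_stem(index, word, stem, threshold=0.5):
    """Keep the stem only if the word and the stem share enough documents.

    `index` maps a term to the set of documents containing it.
    The threshold is a hypothetical value and would need tuning.
    """
    return overlap(index.get(word, set()), index.get(stem, set())) >= threshold


# Toy inverted index; the page names are invented for this example.
index = {
    "directories": {"ls.1", "find.1", "mkdir.1", "rm.1"},
    "directory": {"ls.1", "find.1", "mkdir.1", "chdir.2"},
    "kms": {"drm.4", "radeon.4"},
    "km": {"units.1"},
}

print(keep_stem(index, "directories", "directory"))  # True: 3 of 4 pages shared
print(keep_stem(index, "kms", "km"))                 # False: no pages shared
```

The overlap coefficient is used here instead of, say, Jaccard
similarity so that a rare word is not penalized when its stem is very
common; whether that is the right measure for the manual pages is an
open question.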

Dave

-- 
David Young         //\ Trestle Technology Consulting
(217) 721-9981      Urbana, IL   http://trestle.tech/

