NetBSD-Bugs archive


Re: bin/51018: Update stopwords list for apropos(1)



The following reply was made to PR bin/51018; it has been noted by GNATS.

From: Abhinav Upadhyay <er.abhinav.upadhyay%gmail.com@localhost>
To: Joerg Sonnenberger <joerg%britannica.bec.de@localhost>
Cc: NetBSD GNATS <gnats-bugs%netbsd.org@localhost>, gnats-admin%netbsd.org@localhost, netbsd-bugs%netbsd.org@localhost
Subject: Re: bin/51018: Update stopwords list for apropos(1)
Date: Tue, 29 Mar 2016 01:17:32 +0530

 On Mon, Mar 28, 2016 at 11:46 PM, Abhinav Upadhyay
 <er.abhinav.upadhyay%gmail.com@localhost> wrote:
 > On Mon, Mar 28, 2016 at 11:16 PM, Joerg Sonnenberger
 > <joerg%britannica.bec.de@localhost> wrote:
 >> On Mon, Mar 28, 2016 at 10:40:00AM +0000, er.abhinav.upadhyay%gmail.com@localhost wrote:
 >>> The current stopwords list of apropos(1) contains some legitimate words for which man pages exist, for example, an, as, at, be etc.
 >>>
 >>> Also, I came across a larger set of stopwords list [1], which I thought would be useful and should be used with apropos.
 >>>
 >>> [1]: http://www.lextek.com/manuals/onix/stopwords2.html
 >>
 >> I don't think just dropping them from the stop word list is helpful, but
 >> there was a change a while ago to try more aggressively as a second pass.
 >
 > Yes, just noticed that. If the query consists only of stop words, we
 > retry with the same query instead of showing an error message. I think
 > there is no point in removing these stop words.
 >
 > But we can still add new ones :)
 
 I just went through the stop words removed in the patch. I think at
 least three of them (we, an, last) should still be removed from the list.
 For example:
 
 apropos -n 1 we device driver
 should give the we(4) man page at the top, but it doesn't because "we" gets
 removed from the query.
 Similarly,
 
 apropos -n 1 last login session
 should give the last(1) man page at the top, but it doesn't.
 And,
 
 apropos -n 1 an driver
 should give the an(4) page.
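
As a rough illustration of the behaviour described above, here is a minimal sketch in Python. It is not the actual apropos(1) code, which is written in C; the function name filter_query and the tiny stop word subset are made up for this example. It only shows why a term like "we" never reaches the search, and how a query made up entirely of stop words falls back to the original terms.

```python
# Hypothetical sketch only; not taken from the apropos(1) sources.
# A tiny illustrative subset of stopwords.txt.
STOPWORDS = {"we", "an", "last", "the", "of"}


def filter_query(query, stopwords=STOPWORDS):
    """Drop stop words from a query, keeping the remaining terms in order.

    If every term is a stop word, return the original terms unchanged,
    mirroring the second-pass fallback mentioned earlier in the thread.
    """
    terms = query.lower().split()
    kept = [t for t in terms if t not in stopwords]
    return kept if kept else terms


if __name__ == "__main__":
    print(filter_query("we device driver"))    # ['device', 'driver']
    print(filter_query("last login session"))  # ['login', 'session']
    print(filter_query("an"))                  # ['an']
```

In the first two queries the page name itself is dropped before the search runs, which is why we(4) and last(1) do not come out on top.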
 
 I don't feel strongly either way, so I guess it is better to
 keep them in the list. Here is the updated patch with just the new
 additions.
 
 Index: stopwords.txt
 ===================================================================
 RCS file: /cvsroot/src/usr.sbin/makemandb/stopwords.txt,v
 retrieving revision 1.1
 diff -u -r1.1 stopwords.txt
 --- stopwords.txt    7 Feb 2012 19:13:32 -0000    1.1
 +++ stopwords.txt    28 Mar 2016 19:12:34 -0000
 @@ -8,154 +8,392 @@
  7
  8
  9
 +a's
 +able
  about
 +above
 +according
 +accordingly
 +across
 +actually
 +after
 +afterwards
  again
 +against
 +ain't
  all
 +allow
 +allows
 +almost
 +alone
 +along
 +already
  also
 +although
  always
 +am
 +among
 +amongst
  an
  and
  another
  any
 +anybody
 +anyhow
 +anyone
 +anything
 +anyway
 +anyways
 +anywhere
 +apart
 +appear
 +appreciate
 +appropriate
  are
 +aren't
  around
  as
 +aside
  ask
 +asking
 +associated
  at
 +available
 +away
 +awfully
  b
  back
  be
 +became
  because
 +become
 +becomes
 +becoming
  been
  before
 +beforehand
 +behind
 +being
 +believe
  below
 +beside
 +besides
 +best
 +better
  between
 +beyond
 +both
 +brief
  but
  by
  bye
 +c'mon
 +c's
 +came
  can
 +can't
 +cannot
 +cant
  case
 +cause
 +causes
 +certain
 +certainly
 +changes
 +clearly
 +come
 +comes
 +concerning
 +consequently
 +consider
 +considering
  consist
 +contain
 +containing
 +contains
 +corresponding
  could
 +couldn't
 +course
 +currently
  d
 +definitely
 +described
 +despite
  did
 +didn't
 +different
 +do
  does
 +doesn't
 +doing
 +don't
 +done
  down
 +downwards
 +during
  e
  each
  early
 +edu
 +eight
  either
 +else
 +elsewhere
  end
  enough
 +entirely
 +especially
 +etc
  even
 +ever
  every
 +everybody
 +everyone
 +everything
 +everywhere
 +exactly
 +example
 +except
  f
  fact
  far
  few
 +fifth
 +first
 +five
  follow
 +followed
 +following
 +follows
 +for
 +former
 +formerly
 +forth
  four
  from
  full
  further
 +furthermore
  g
  general
  get
 +getting
  give
  given
 +gives
 +go
 +goes
 +going
 +gone
  good
  got
 +gotten
  great
 +greetings
  h
  had
 +hadn't
 +happens
 +hardly
  has
 +hasn't
  have
 +haven't
  having
 +he
 +he's
 +hello
 +help
 +hence
 +her
  here
 +here's
 +hereafter
 +hereby
 +herein
 +hereupon
 +hers
 +herself
 +hi
  high
  him
 +himself
  his
 +hither
 +hopefully
  how
 +howbeit
  however
  i
 +i'd
 +i'll
 +i'm
 +i've
 +ie
  if
 +ignored
 +immediate
  important
  in
 +inasmuch
 +inc
 +indeed
 +indicate
 +indicated
 +indicates
 +inner
 +insofar
 +instead
  interest
  into
 +inward
  is
 +isn't
  it
 +it'd
 +it'll
 +it's
 +its
 +itself
  j
  just
  k
  keep
  keeps
 +kept
  kind
  knew
  know
 +known
 +knows
  l
  large
  larger
  last
 +lately
  later
  latest
  latter
 +latterly
  least
 +lest
  let
 +let's
  like
 +liked
  likely
 +little
  long
  longer
 +looking
 +looks
 +ltd
  m
  made
 +mainly
  many
  may
 +maybe
  me
 +mean
 +meanwhile
 +merely
  might
 +moreover
  most
  mostly
  much
  must
  my
 +myself
  n
 +name
 +namely
  names
 +nd
 +near
 +nearly
  necessary
  need
  needs
 +neither
  never
 +nevertheless
  new
  next
 +nine
  no
 +nobody
  non
 +none
  noone
 +nor
 +normally
  not
  nothing
 +novel
 +now
 +nowhere
  o
 +obviously
  of
  off
  often
 +oh
 +ok
 +okay
  old
  older
  on
  once
 +one
 +ones
  only
 +onto
  or
  order
 +other
 +others
 +otherwise
 +ought
  our
 +ours
 +ourselves
  out
 +outside
  over
 +overall
 +own
  p
  part
 +particular
 +particularly
  per
  perhaps
 +placed
 +please
 +plus
  possible
  present
 +presumably
 +probably
  problem
 +provides
  q
 +que
  quite
 +qv
  r
  rather
 +rd
  really
 +reasonably
 +regarding
 +regardless
 +regards
 +relatively
 +respectively
  right
  room
  s
 @@ -163,89 +401,207 @@
  same
  saw
  say
 +saying
  says
  second
 +secondly
  see
 +seeing
  seem
  seemed
 +seeming
  seems
 +seen
  sees
 +self
 +selves
 +sensible
 +sent
 +serious
 +seriously
 +seven
  several
  shall
 +she
  should
 +shouldn't
  side
  sides
 +since
 +six
  small
  smaller
  so
  some
 +somebody
 +somehow
 +someone
  something
 +sometime
 +sometimes
 +somewhat
 +somewhere
 +soon
 +sorry
 +specified
 +specify
 +specifying
  state
  states
  still
 +sub
  such
  sure
  t
 +t's
  take
  taken
 +tell
 +tends
 +th
 +than
 +thank
 +thanks
 +thanx
  that
 +that's
 +thats
  the
  their
 +theirs
  them
 +themselves
  then
 +thence
  there
 +there's
 +thereafter
 +thereby
  therefore
 +therein
 +theres
 +thereupon
  these
 +they
 +they'd
 +they'll
 +they're
 +they've
  thing
  think
  thinks
 +third
  this
 +thorough
 +thoroughly
  those
  though
  three
 +through
 +throughout
 +thru
  thus
  to
  together
  too
  took
  toward
 +towards
 +tried
 +tries
 +truly
 +try
 +trying
  turn
 +twice
  two
  u
 +un
 +under
 +unfortunately
 +unless
 +unlikely
  until
 +unto
  up
  upon
  us
  use
  used
 +useful
  uses
 +using
 +usually
  v
 +value
 +various
  very
 +via
 +viz
 +vs
  w
  want
  wanted
  wants
  was
 +wasn't
  way
  ways
  we
 +we'd
 +we'll
 +we're
 +we've
 +welcome
  well
  went
  were
 +weren't
  what
 +what's
 +whatever
  when
 +whence
 +whenever
 +where
 +where's
 +whereafter
 +whereas
 +whereby
 +wherein
 +whereupon
 +wherever
  whether
 +while
 +whither
 +who's
 +whoever
 +whole
 +whom
 +whose
  why
  will
  willing
 +wish
  with
  within
  without
 +won't
 +wonder
  work
  would
 +wouldn't
  x
  y
  year
  yet
  you
 +you'd
 +you'll
 +you're
 +you've
 +your
 +yours
 +yourself
 +yourselves
  z
 

