tech-userlevel archive

Re: Fwd: GSoC Project: Replacement for Apropos

On Mon, Mar 21, 2011 at 12:40:18AM +0530, Abhinav Upadhyay wrote:
> Hi,
> My name is Abhinav Upadhyay, and I am a 4th year student of Bachelor
> of Technology from India.


A few pointers about the apropos project:

1 Don't reinvent the wheel: more than one open-source full-text search
  system exists already.  When this project was originally proposed, I
  indexed the HTML manual pages on my system with HyperEstraier (can
  be found in pkgsrc) and found that it's much better than apropos and
  *almost adequate* for searching manual pages.  Make sure that you
  understand existing systems' shortcomings before you start a new
  one. [1]

2 The user interface (UI) of a search system is of primary importance,
  and you should have a close look at existing examples of full-text
  search systems (Apple Spotlight, Google) to take ideas from and to
  see how their UIs succeed or fail.  What is the grammar for user
  queries; how will your system interpret queries?  How will your system
  present search results to the user?  How do you ensure that the most
  relevant results are listed first?  How quickly does your system
  produce results?  Does your system start to produce results before the
  user has stopped typing their query?  Does the system accommodate
  or punish mistakes such as search-term misspellings?
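
  To make the misspelling question concrete, here is a minimal Python
  sketch, not taken from apropos or HyperEstraier, of one way a system
  could accommodate a mistyped term: map it onto the closest term that
  is actually indexed.  It uses difflib from the Python standard
  library.  The vocabulary, the cutoff value, and the name suggest_term
  are all hypothetical.

```python
import difflib

# Hypothetical vocabulary; a real system would draw this from its index.
VOCABULARY = ["mutex", "socket", "realtek", "ethernet", "signal"]


def suggest_term(term, vocabulary=VOCABULARY, cutoff=0.6):
    """Return term if it is indexed, else the closest indexed term, else None."""
    term = term.lower()
    if term in vocabulary:
        return term
    # get_close_matches ranks candidates by similarity ratio, best first,
    # and drops anything scoring below the cutoff.
    matches = difflib.get_close_matches(term, vocabulary, n=1, cutoff=cutoff)
    return matches[0] if matches else None
```

  With this vocabulary, suggest_term("relatek") picks "realtek", while a
  term that resembles nothing in the index yields None, so the UI can
  say so instead of silently returning unrelated pages.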


[1] HyperEstraier, for all of its sophistication, can disappoint badly
    with its search results.  IMO, intro(2) is a more relevant result
    for EINVAL than pthread_mutex(3), yet HyperEstraier's first result
    is the latter manual page, and intro(2) is not in the first 10
    results.  Likewise, url(4) and rtw(4) are more relevant to a search
    for Realtek than mii(4), yet mii(4) appears in HyperEstraier's
    results list before either url(4) or rtw(4).
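
    One possible way to favor pages that are *about* a term over pages
    that merely mention it is to weight a match in the page's one-line
    summary more heavily than a match in the body.  The Python sketch
    below only illustrates that idea; it is not how HyperEstraier ranks
    results.  The page texts are toy stand-ins rather than the actual
    manual page contents, and SUMMARY_WEIGHT, score, and search are
    hypothetical names and values.

```python
# Toy stand-ins for indexed pages; the text is illustrative, not the
# actual contents of these manual pages.
PAGES = [
    {"name": "mii(4)", "summary": "media independent interface bus",
     "body": "supported phy drivers include several realtek parts"},
    {"name": "url(4)", "summary": "realtek usb ethernet driver",
     "body": "the url driver supports realtek usb ethernet adapters"},
    {"name": "rtw(4)", "summary": "realtek wireless network driver",
     "body": "the rtw driver supports realtek wireless adapters"},
]

SUMMARY_WEIGHT = 5.0  # assumed weight; would need tuning on real queries


def score(page, term):
    """Body term frequency, plus a bonus if the term is in the summary."""
    term = term.lower()
    body_words = page["body"].lower().split()
    tf = body_words.count(term) / len(body_words) if body_words else 0.0
    in_summary = term in page["summary"].lower().split()
    return tf + (SUMMARY_WEIGHT if in_summary else 0.0)


def search(term, pages=PAGES):
    """Return names of matching pages, highest score first."""
    scored = [(score(page, term), page["name"]) for page in pages]
    hits = [(s, name) for s, name in scored if s > 0]
    hits.sort(key=lambda item: -item[0])
    return [name for _, name in hits]
```

    On this toy data, search("Realtek") lists url(4) and rtw(4) ahead
    of mii(4), because only the two driver pages name Realtek in their
    summaries.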

David Young             OJC Technologies      Urbana, IL * (217) 344-0444 x24