tech-userlevel archive


Support for boolean queries in apropos



Hi all,

The FTS engine of SQLite supports Boolean queries, that is, complex
queries built by combining simple queries with the Boolean operators
AND, OR, and NOT. For example:

$ apropos "add new user NOT (git OR ssh)"

This returns results that match the query "add new user" but do not
match "git" or "ssh". (The quotes keep the shell from interpreting the
parentheses.)

Such queries can be grouped using parentheses. It is a powerful and
useful feature, but until now apropos could not use it because it
removed the keywords AND, OR, and NOT as stopwords. I think there are
two ways to handle this:

1. Remove the keywords AND, OR, and NOT from the list of stopwords
and treat them as Boolean operators. If the user specifies any of
these keywords in the query, they are treated as Boolean operators and
the results are evaluated accordingly. The side effect is that a user
who types one of these words without intending it as a Boolean
operator might be surprised by the results.

2. Use '||', '&&' and '~' as the symbols for the OR, AND, and NOT
Boolean operators respectively in the apropos frontend, and
pre-process the query to build a proper SQL query that SQLite can
understand and execute.
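
As a rough illustration of the second option, here is a minimal sketch
of such a pre-processing step. The helper name translate_operators()
is hypothetical and not part of apropos; it only copies the query and
rewrites '&&', '||' and '~' as AND, OR and NOT. Note that these symbols
would have to be quoted on the command line, since the shell treats
'&&' and '||' as its own operators.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper, not part of apropos: returns a newly allocated
 * copy of `query` in which "&&", "||" and "~" are replaced by " AND ",
 * " OR " and " NOT ". The caller frees the result. Returns NULL if the
 * allocation fails.
 */
static char *
translate_operators(const char *query)
{
	/* Worst case: every 1-byte "~" grows into the 5-byte " NOT ". */
	char *out = malloc(strlen(query) * 5 + 1);
	char *dst = out;

	if (out == NULL)
		return NULL;

	while (*query != '\0') {
		const char *op = NULL;
		size_t skip = 1;

		if (strncmp(query, "&&", 2) == 0) {
			op = " AND ";
			skip = 2;
		} else if (strncmp(query, "||", 2) == 0) {
			op = " OR ";
			skip = 2;
		} else if (*query == '~') {
			op = " NOT ";
		}

		if (op != NULL) {
			/* Emit the keyword, padded so it stays a separate word. */
			memcpy(dst, op, strlen(op));
			dst += strlen(op);
		} else {
			*dst++ = *query;
		}
		query += skip;
	}
	*dst = '\0';
	return out;
}
```

The blanks around each keyword keep it a separate word even when the
user writes "git||ssh". A real implementation would also have to
decide how to treat these symbols inside quoted phrases.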

I would like to hear opinions on this. I am attaching a patch for the
first option, but the second option seems more attractive to me.


Index: apropos-utils.c
===================================================================
RCS file: /cvsroot/src/usr.sbin/makemandb/apropos-utils.c,v
retrieving revision 1.2
diff -u -p -r1.2 apropos-utils.c
--- apropos-utils.c     7 Feb 2012 19:17:16 -0000       1.2
+++ apropos-utils.c     11 Mar 2012 05:36:58 -0000
@@ -93,6 +93,33 @@ lower(char *str)
        return str;
 }

+void
+build_boolean_query(char *str)
+{
+       static const char *ops[][2] = {
+               { "and", "AND" }, { "not", "NOT" }, { "or", "OR" }
+       };
+       size_t i, len;
+
+       if (str == NULL)
+               return;
+
+       /*
+        * Uppercase only whole words; strstr() would also rewrite the
+        * "or" in "format". Words end at blanks or parentheses.
+        */
+       while (*str != '\0') {
+               str += strspn(str, " \t()");
+               len = strcspn(str, " \t()");
+               for (i = 0; i < sizeof(ops) / sizeof(ops[0]); i++) {
+                       if (len == strlen(ops[i][0]) &&
+                           strncmp(str, ops[i][0], len) == 0)
+                               memcpy(str, ops[i][1], len);
+               }
+               str += len;
+       }
+}
+
 /*
 * concat--
 *  Utility function. Concatenates together: dst, a space character and src.
Index: apropos-utils.h
===================================================================
RCS file: /cvsroot/src/usr.sbin/makemandb/apropos-utils.h,v
retrieving revision 1.2
diff -u -p -r1.2 apropos-utils.h
--- apropos-utils.h     7 Feb 2012 19:17:16 -0000       1.2
+++ apropos-utils.h     11 Mar 2012 05:36:58 -0000
@@ -89,4 +89,5 @@ void close_db(sqlite3 *);
 int run_query(sqlite3 *, const char *[3], query_args *);
 int run_query_html(sqlite3 *, query_args *);
 int run_query_pager(sqlite3 *, query_args *);
+void build_boolean_query(char *);
 #endif
Index: apropos.c
===================================================================
RCS file: /cvsroot/src/usr.sbin/makemandb/apropos.c,v
retrieving revision 1.5
diff -u -p -r1.5 apropos.c
--- apropos.c   15 Feb 2012 23:53:13 -0000      1.5
+++ apropos.c   11 Mar 2012 05:37:01 -0000
@@ -146,6 +146,8 @@ main(int argc, char *argv[])
        /* Eliminate any stopwords from the query */
        query = remove_stopwords(lower(str));
        free(str);
+       if (query != NULL)
+               build_boolean_query(query);

        /* if any error occured in remove_stopwords, exit */
        if (query == NULL)
Index: stopwords.txt
===================================================================
RCS file: /cvsroot/src/usr.sbin/makemandb/stopwords.txt,v
retrieving revision 1.1
diff -u -p -r1.1 stopwords.txt
--- stopwords.txt       7 Feb 2012 19:13:32 -0000       1.1
+++ stopwords.txt       11 Mar 2012 05:37:18 -0000
@@ -14,7 +14,6 @@ all
 also
 always
 an
-and
 another
 any
 are
@@ -128,7 +127,6 @@ next
 no
 non
 noone
-not
 nothing
 o
 of
@@ -139,7 +137,6 @@ older
 on
 once
 only
-or
 order
 our
 out

