Desktop search engine
In writing this search engine and its web interface, I had the following Objectives:
Learning about indexed text search.
Becoming more familiar in working with PostgreSQL and Python.
This search engine may also serve as a good starting point for building an search engine which can be added to an online web site. I have tried to keep the design of search engine an clean and simple as possible, avoiding too many features.
Requirements
tsearch2: The text search module tsearch2 is included in the source distribution, but you may need to compile and install it separately.
psycopg2: Python-PostgreSQL database adapter
Installation
Create a database. You don't have to be superuser to do this (if the superuser has granted you the permission to create databases).
$ createdb dts
dts
is the default name for the database used.
Next, the superuser (usually called postgres
) has to execute the
script /contrib/tsearch2/tsearch2.sql
in the PostgreSQL source package.
postgres: POSTGRES_SOURCE/contrib/tsearch2$ psql dts <tsearch2.sql
Now the user can start working with the tsearch2 functions.
In order for our engine to work, we need to initialize the database dts
.
Conveniently, this can be done by:
$ ./engine.py --create
This will create a table and an index to the text search vectors in the database.
Everything is now setup for the scripts to run.
First, we can check what the statistics option of engine.py
works.
You should see something like this:
$ ./engine.py -s
Statistics ...
Indexed files: 0
Total size of files: zero
If this works, one can start indexing files in directories:
$ ./engine.py -i /some/path/ /another/path/ ...
And run queries:
$ ./engine.py hello world
Web front end
The program webapp
is a simple web interface for the search engine.
The web interface launches its own web server when invoked.
It is also possible to run the web application behind another web server
using FastCGI. In order to do so, you'll need
flup which a
Web Server Gateway Interface (WSGI) for FastCGI.
I've only tested the application with Firefox, but it uses only very basic
HTML and CSS (no Ajax), so it should work with all browsers.
Download
Source: search.tar.gz