PHASAR

Phrase-based
High Accuracy
Search and Retrieval

PHASAR is a new kind of search engine that does not treat queries and documents as bags of words; instead, it is based on deep linguistic analysis and indexing of both documents and queries. It is intended for various forms of professional search, which is poorly supported by current word-based search engines.

The PHASAR engine offers its users a wholly new way of searching. Linguistically motivated search terms give the user tight control over precision and recall (avoiding long lists of spurious hits), and information from the index and the thesauri provides unprecedented support for the search process.

The PHASAR search engine is still in the prototype stage. Its first application, an experimental literature search engine for bioinformatics giving access to Medline abstracts, is just a foretaste of the future, limited by the immaturity and incompleteness of the software and lingware.

PHASAR requires extremely accurate parsers and thesauri, and is only applicable to languages and domains for which such parsers and thesauri are available. The (further) development of suitable parsers and thesauri requires a large investment, but it will enable many professional applications of PHASAR.

Cornelis H.A. Koster
Department of Computing Science
University of Nijmegen
6525ED Nijmegen, The Netherlands
