Building textIR produces several applications, each for a different task:
- dindex : Build an inverted index from TREC documents.
- dindexCsv : Build an inverted index from a CSV file, where the first element of every row is the document id.
- dquery : Query an index with TREC-style topic queries.
- dqueryCsv : Query an index with queries in a CSV file, where the first element of each row is the id of the query.
- dlsa : Use latent semantic analysis to generate either an index or a thesaurus (requires textIR to be configured with ./configure --enable-arpack).
- dplsa : Use probabilistic latent semantic analysis to generate a thesaurus.
- dcot : Generate a thesaurus using term co-occurrence.
- dextract : Print the contents of an index.
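
As a rough illustration of the CSV layout described for dindexCsv and dqueryCsv (the first element of each row is the id), the Python sketch below builds a toy inverted index from such rows. It is not textIR's implementation: the function name build_inverted_index, the lowercasing and whitespace tokenisation, and the sample rows are all assumptions made for this example.

```python
import csv
import io
from collections import defaultdict


def build_inverted_index(csv_text):
    """Map each term to the sorted list of ids of the documents containing it.

    Hypothetical helper for illustration only; textIR's own tokenisation
    and index format are not shown here.
    """
    index = defaultdict(set)
    for row in csv.reader(io.StringIO(csv_text)):
        if not row:
            continue  # skip blank lines
        # First element is the document id; the remaining elements are text.
        doc_id, fields = row[0], row[1:]
        for field in fields:
            for term in field.lower().split():
                index[term].add(doc_id)
    return {term: sorted(ids) for term, ids in index.items()}


# Assumed sample input: two documents, one per row.
sample = "d1,the quick brown fox\nd2,the lazy dog\n"
index = build_inverted_index(sample)
print(index["the"])  # ['d1', 'd2']
print(index["fox"])  # ['d1']
```

A query file in the same layout could be read the same way, with the first element taken as the query id instead of the document id.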