Access an index file in a filtered manner.
Public Member Functions | |
SubIndex (BlockStats *stats) | |
Initialise the class. | |
void | sample (int minDocumentCount, int maxDocumentCount) |
Allow access to words whose document count lies between the minimum and maximum document count. | |
void | printSize (void) |
Print the reduction statistics to the screen. | |
void | sample (Word **queryTerms, int uniqueQueryTerms) |
Allow access to words that match the given query terms, where uniqueQueryTerms is the number of unique query terms. | |
int | realWord (int word) |
Return the real word id. | |
int | realDocument (int document) |
Return the real document id. | |
int | subWordCount (void) |
Return the number of words in the SubIndex. | |
int | subDocumentCount (void) |
Return the number of documents in the SubIndex. | |
int | totalCount (void) |
Return the sum of all frequency counts in the SubIndex. | |
int | nonZeroCount (void) |
Return the number of elements in the SubIndex. | |
void | reset (void) |
Reset the SubIndex. | |
void | nextElement (TRIPLET *subTriple) |
Store the next element in the SubIndex in subTriple. | |
Protected Member Functions | |
virtual void | generateIndexList (int minDocumentCount, int maxDocumentCount)=0 |
Compute the list of items in the SubIndex. | |
virtual void | generateIndexList (Word **queryTerms, int uniqueQueryTerms)=0 |
Compute the list of items in the SubIndex. | |
virtual void | buildWordList (void)=0 |
Read the word statistics from the stats file. | |
virtual void | buildDocumentList (void)=0 |
Read the document statistics from the stats file. | |
virtual int | wordCount (void)=0 |
Return the real number of words. | |
virtual int | documentCount (void)=0 |
Return the real number of documents. | |
virtual T * | indexStats (int word)=0 |
Return the stats structure for the index. | |
virtual int | indexElements (int word)=0 |
Return the number of elements for index list 'word'. | |
virtual int | indexLength (void)=0 |
Return the number of lists in the index. | |
Protected Attributes | |
int * | _wordSamples |
The mapping from sub to real index values. | |
int | _reducedWordCount |
The number of index lists in the SubIndex. | |
int | _reducedDocumentCount |
The number of elements in the SubIndex. | |
int | _totalFrequency |
The total frequency count in the SubIndex. | |
char * | _indexFileName |
The name of the input index file. |
This class provides an interface for accessing selected elements of an index file as if they were the only elements in the file. Other classes therefore do not have to compute which elements to select, since the selection is made at this level.
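The filtered-view idea can be illustrated with a small, self-contained sketch. This is not the real SubIndex implementation: `ToySubIndex`, `ToyIndex`, `Posting` and the layout of `Triplet` are hypothetical stand-ins, the index is held in memory rather than read from a file, the document-count bounds are assumed to be inclusive, and `nextElement` returns a `bool` here (the documented method returns `void`) so that the iteration loop is easy to follow.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the TRIPLET type named in this reference;
// the real field layout is not documented here.
struct Triplet {
    int word;       // sub (filtered) word id
    int document;   // document id
    int frequency;  // occurrences of the word in the document
};

// Hypothetical in-memory index: one posting list per real word id.
struct Posting {
    int document;
    int frequency;
};
using ToyIndex = std::vector<std::vector<Posting>>;

// Simplified sketch of a filtered view over an index.
class ToySubIndex {
public:
    // The index must outlive this object (it is held by reference).
    explicit ToySubIndex(const ToyIndex& index) : index_(index) {}

    // Keep only words whose document count lies in
    // [minDocumentCount, maxDocumentCount] (inclusive bounds are an assumption).
    void sample(int minDocumentCount, int maxDocumentCount) {
        wordSamples_.clear();
        for (std::size_t w = 0; w < index_.size(); ++w) {
            const int count = static_cast<int>(index_[w].size());
            if (count >= minDocumentCount && count <= maxDocumentCount) {
                wordSamples_.push_back(static_cast<int>(w));
            }
        }
        reset();
    }

    // Number of words visible through the filtered view.
    int subWordCount() const { return static_cast<int>(wordSamples_.size()); }

    // Map a sub word id back to the word id in the full index.
    int realWord(int word) const {
        assert(word >= 0 && word < subWordCount());
        return wordSamples_[static_cast<std::size_t>(word)];
    }

    // Restart iteration from the first element of the filtered view.
    void reset() {
        subWord_ = 0;
        position_ = 0;
    }

    // Store the next element in *out; returns false once the view is exhausted.
    bool nextElement(Triplet* out) {
        while (subWord_ < wordSamples_.size()) {
            const std::vector<Posting>& list =
                index_[static_cast<std::size_t>(wordSamples_[subWord_])];
            if (position_ < list.size()) {
                const Posting& p = list[position_++];
                out->word = static_cast<int>(subWord_);
                out->document = p.document;
                out->frequency = p.frequency;
                return true;
            }
            ++subWord_;      // move on to the next selected word
            position_ = 0;
        }
        return false;
    }

private:
    const ToyIndex& index_;
    std::vector<int> wordSamples_;  // sub word id -> real word id
    std::size_t subWord_ = 0;       // current sub word during iteration
    std::size_t position_ = 0;      // position within the current posting list
};
```

A caller would build a `ToyIndex`, call `sample(min, max)`, and then loop on `nextElement` after `reset()`. Elements come back with sub word ids starting at zero, and `realWord` recovers the original id when needed, so the caller never has to know which words were filtered out.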