[1] |
Laurence A. F. Park, Kotagiri Ramamohanarao, and Marimuthu Palaniswami.
A new implementation technique for fast spectral based document
retrieval systems.
In Vipin Kumar and Shusaku Tsumoto, editors, The Second IEEE
International Conference on Data Mining, pages 346–353, Los Alamitos,
California, USA, December 2002. IEEE Computer Society.
The traditional methods of spectral text retrieval (FDS, CDS) create an index of spatial data and convert the data to its spectral form at query time. We present a new method of implementing and querying an index containing spectral data which will conserve the high precision performance of the spectral methods, reduce the time needed to resolve the query, and maintain an acceptable size for the index. This is done by taking advantage of the properties of the discrete cosine transform and by applying ideas from vector space document ranking methods. |
[2] |
Laurence A. F. Park, Marimuthu Palaniswami, and Kotagiri Ramamohanarao.
A novel web text mining method using the discrete cosine transform.
In T. Elomaa, H. Mannila, and H. Toivonen, editors, The 6th
European Conference on Principles of Data Mining and Knowledge Discovery,
number 2431 in Lecture Notes in Artificial Intelligence, pages 385–396,
Berlin, August 2002. Springer-Verlag.
Fourier Domain Scoring (FDS) has been shown to give a 60% improvement in precision over the existing vector space methods, but its index requires a large storage space. We propose a new Web text mining method using the discrete cosine transform (DCT) to extract useful information from text documents and to provide improved document ranking, without having to store excessive data. While the new method preserves the performance of the FDS method, it gives a 40% improvement in precision over the established text mining methods when using only 20% of the storage space required by FDS. |