[1]
Yong Zhen Guo, Kotagiri Ramamohanarao, and Laurence A. F. Park.
Personalized pagerank for web page prediction based on access
time-length and frequency.
In The Proceedings of the IEEE/WIC/ACM International Conference
on Web Intelligence, pages 687--690. IEEE Computer Society, November 2007.
Web page prefetching techniques are used to address the access latency problem of the Internet. To prefetch successfully, we must be able to predict the next set of pages that users will access. The PageRank algorithm used by Google computes the popularity of a set of Web pages based on their link structure. In this paper, a novel PageRank-like algorithm is proposed for Web page prediction. Two biasing factors are adopted to personalize PageRank so that it favors the pages that are more important to users. One factor is the length of time spent visiting a page and the other is the frequency with which the page was visited. The experiments show that using these two factors simultaneously to bias PageRank results in more accurate Web page prediction than methods that use only one of the two factors.
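The idea of biasing PageRank toward pages a user values can be illustrated with a minimal sketch. It is not the algorithm from the paper: it assumes the bias enters only through the teleport vector and that time and frequency are combined by a simple product, and the names `personalized_pagerank`, `time_spent`, and `visits`, as well as the toy data, are hypothetical.

```python
def personalized_pagerank(links, bias, damping=0.85, iterations=50):
    """Power iteration for PageRank with a biased teleport vector.

    links: dict mapping each page to the list of pages it links to
           (every target must also be a key of the dict).
    bias:  dict mapping each page to a positive weight.
    """
    pages = list(links)
    total = sum(bias[p] for p in pages)
    teleport = {p: bias[p] / total for p in pages}
    rank = dict(teleport)
    for _ in range(iterations):
        # Teleport step: jump to a page in proportion to its bias.
        new_rank = {p: (1.0 - damping) * teleport[p] for p in pages}
        for p in pages:
            targets = links[p]
            if targets:
                share = damping * rank[p] / len(targets)
                for q in targets:
                    new_rank[q] += share
            else:
                # Dangling page: spread its rank according to the bias.
                for q in pages:
                    new_rank[q] += damping * rank[p] * teleport[q]
        rank = new_rank
    return rank


# Toy data (hypothetical): seconds spent on each page and visit counts.
links = {"a": ["b", "c"], "b": ["a", "c"], "c": ["a", "b"]}
time_spent = {"a": 120.0, "b": 30.0, "c": 30.0}
visits = {"a": 8, "b": 2, "c": 2}

# Assumption: combine the two factors by multiplying them.
bias = {p: time_spent[p] * visits[p] for p in links}
ranks = personalized_pagerank(links, bias)
```

Because the link graph in the toy example is symmetric, any difference between the resulting scores comes from the bias alone, so page `a` ends up ranked above `b` and `c`.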
[2]
Laurence A. F. Park and Kotagiri Ramamohanarao.
Mining web multi-resolution community-based popularity for
information retrieval.
In The Proceedings of the 2007 ACM Conference on Information and
Knowledge Management, pages 545--552, November 2007.
The PageRank algorithm is used in Web information retrieval to calculate a single popularity score for each page in the Web. These popularity scores are used to rank query results when they are presented to the user. By using the structure of the entire Web to calculate one score per document, we are calculating a general popularity score that is not particular to any community. The PageRank scores are therefore better suited to general queries. In this paper, we introduce a more general form of PageRank, using Web multi-resolution community-based popularity scores, where each document obtains a popularity score dependent on a given Web community. When a query is related to a specific community, we choose the associated set of popularity scores and order the query results accordingly. Using Web community-based popularity scores, we achieved an 11% increase in precision over PageRank.
[3]
Laurence A. F. Park and Kotagiri Ramamohanarao.
Query expansion using a collection dependent probabilistic latent
semantic thesaurus.
In Zhi-Hua Zhou, Hang Li, and Qiang Yang, editors, The Eleventh
Pacific-Asia Conference on Knowledge Discovery and Data Mining Workshop,
volume 4426 of Lecture Notes in Computer Science, pages 224--235.
Springer, 2007.
Many queries on collections of text documents are too short to produce informative results. Automatic query expansion is a method of adding terms to the query, without interaction from the user, in order to obtain more refined results. In this investigation, we examine our novel automatic query expansion method using the probabilistic latent semantic thesaurus, which is based on probabilistic latent semantic analysis. We show how to construct the thesaurus by mining text documents for probabilistic term relationships, and we show that by using the latent semantic thesaurus we can overcome many of the previously identified problems associated with latent semantic analysis on large document sets. Experiments using TREC document sets show that our term expansion method outperforms the popular probabilistic pseudo-relevance feedback method by 7.3%.
[4]
Laurence A. F. Park and Yuye Zhang.
On the distribution of user persistence for rank-biased precision.
In The Proceedings of the Twelfth Australasian Document
Computing Symposium, 2007.
Rank-biased precision (RBP) is a new method of information retrieval system evaluation that takes into account any uncertainty due to incomplete relevance judgements for a given document and query set. To do so, RBP uses a model of user persistence. In this article, we present a statistical analysis of the RBP user persistence model to observe how the user persistence value affects the user persistence distribution. We also provide a method of fitting data from existing users to the persistence model in order to compute their persistence value. Using the Microsoft MSN query log, we demonstrate a typical distribution of the user persistence value and show that it closely resembles a reverse lognormal distribution, with a mean of p = 0.78.
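For readers unfamiliar with the measure, the standard definition of RBP is (1 - p) times the sum over ranks i of r_i p^(i-1), where r_i is the relevance of the document at rank i and p is the user persistence. The sketch below computes that score together with the residual p^d that remains when judgements stop at depth d, which is how RBP expresses the uncertainty mentioned above. The function name `rbp_bounds` and the example ranking are hypothetical, and the sketch does not cover the persistence-fitting method of the paper.

```python
def rbp_bounds(relevance, p):
    """Return (lower, upper) bounds on rank-biased precision.

    relevance: relevance values in rank order (0 or 1, or graded in [0, 1]),
               judged down to depth len(relevance).
    p:         user persistence, the probability of moving on to the next
               document in the ranking.
    """
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    # Enumerate from 0 so that rank i contributes r_i * p**(i - 1).
    base = (1.0 - p) * sum(r * p ** i for i, r in enumerate(relevance))
    # Weight left for the unjudged documents below the evaluation depth.
    residual = p ** len(relevance)
    return base, base + residual


# Hypothetical ranking: relevant, not relevant, relevant.
low, high = rbp_bounds([1, 0, 1], p=0.5)
```

A larger p models a more persistent user, which shifts weight toward lower ranks and widens the gap between the two bounds for a fixed judgement depth.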