Web Site Boundary Detection Using Incremental Random Walk Clustering
by A. Alshukri, F. Coenen, M. Zito
Abstract
In this paper we describe a random walk clustering technique to address the Website Boundary Detection (WBD) problem. The technique is fully described and compared with alternative (breadth and depth first) approaches. The reported evaluation demonstrates that the random walk technique produces comparable or better results than those produced by these alternative techniques, while at the same time visiting fewer ‘noise’ pages. To demonstrate that the good results are not simply a consequence of a randomisation of the input data, we also compare with a random ordering technique.
Reference
Web-Site Boundary Detection Using Incremental Random Walk Clustering (A. Alshukri, F. Coenen, M. Zito), In Proceedings of the 31st SGAI International Conference (SGAI’11), 13-15 December, Cambridge, England, UK, Springer, pp. 255-268, 2011.
Bibtex Entry
@inproceedings{Alshukri2011b,
author = {Alshukri, A. and Coenen, F. and Zito, M.},
title = {Web-Site Boundary Detection Using Incremental Random Walk Clustering},
booktitle = {Proceedings of the 31st SGAI International Conference (SGAI'11), 13-15th December, Cambridge, England UK},
year = {2011},
address = {Cambridge, England UK},
pages = {255--268},
publisher = {Springer},
}
For further details see the full paper.