alshukri2011a - Incremental Web-Site Boundary Detection Using Random Walks
Incremental Web-Site Boundary Detection Using Random Walks
by A. Alshukri, F. Coenen, M. Zito
Abstract
The paper describes variations of the classical k-means clustering algorithm that can be used effectively to address the so-called Web-site Boundary Detection (WBD) problem. The suggested advantages offered by these techniques are that they can quickly identify most of the pages belonging to a web-site and, in the long run, return a solution of comparable (if not better) accuracy than other clustering methods. We analyze our techniques on artificial clones of the web generated using a well-known preferential attachment method.
Reference
Incremental Web-Site Boundary Detection Using Random Walks (A. Alshukri, F. Coenen, M. Zito), In Proceedings of the 7th International Conference on Machine Learning and Data Mining (MLDM’11), 30th August-3rd September, New York, USA, Springer, 2011.
Bibtex Entry
@inproceedings{Alshukri2011a,
author = {Alshukri, A. and Coenen, F. and Zito, M.},
title = {Incremental Web-Site Boundary Detection Using Random Walks},
booktitle = {Proceedings of the 7th International Conference on Machine Learning and Data Mining (MLDM'11). 30th August-3rd September, New York, USA},
year = {2011},
address = {New York, USA},
pages = {414--427},
publisher = {Springer},
isbn = {},
series = {Lecture Notes in Computer Science},
}
For further details see the full paper.