alshukri2011b - Web Site Boundary Detection Using Incremental Random Walk Clustering

Web Site Boundary Detection Using Incremental Random Walk Clustering

by A. Alshukri, F. Coenen, M. Zito

Abstract

In this paper we describe a random walk clustering technique to address the Website Boundary Detection (WBD) problem. The technique is fully described and compared with alternative (breadth and depth first) approaches. The reported evaluation demonstrates that the random walk technique produces comparable or better results than those produced by these alternative techniques, while at the same time visiting fewer ‘noise’ pages. To demonstrate that the good results are not simply a consequence of a randomisation of the input data we also compare with a random ordering technique.

Reference

Web-Site Boundary Detection Using Incremental Random Walk Clustering (A. Alshukri, F. Coenen, M. Zito), In Proceedings of the 31st SGAI International Conference (SGAI’11), 13-15th December, Cambridge, England UK, Springer, 2011.

Bibtex Entry

@inproceedings{Alshukri2011b,
	author = {Alshukri, A. and Coenen, F. and Zito, M.},
	title = {Web-Site Boundary Detection Using Incremental Random Walk Clustering},
	booktitle = {Proceedings of the 31st SGAI International Conference (SGAI'11), 13-15th December, Cambridge, England UK},
	year = {2011},
	address = {Cambridge, England UK},
	pages = {255--268},
	publisher = {Springer},
}

For further details see the full paper.
