Probabilistic heuristics for hierarchical web data clustering

Haghir Chehreghani, M ; Sharif University of Technology

  1. Type of Document: Article
  2. DOI: 10.1111/j.1467-8640.2012.00414.x
  3. Abstract: Clustering Web data is one important technique for extracting knowledge from the Web. In this paper, a novel method is presented to facilitate the clustering. The method determines the appropriate number of clusters and provides suitable representatives for each cluster by inference from a Bayesian network. Furthermore, by means of the Bayesian network, the contents of the Web pages are converted into vectors of lower dimensions. The method is also extended for hierarchical clustering, and a useful heuristic is developed to select a good hierarchy. The experimental results show that the clusters produced benefit from high quality.
  4. Keywords: Bayesian networks ; Hierarchical clustering ; Clustering web ; High quality ; Number of clusters ; Probabilistic heuristics ; Representative point ; Web clustering ; Web data clustering ; Clustering algorithms ; Data mining ; Heuristic methods ; Websites
  5. Source: Computational Intelligence ; Volume 28, Issue 2 , 2012 , Pages 209-233 ; 08247935 (ISSN)
  6. URL: http://onlinelibrary.wiley.com/doi/10.1111/j.1467-8640.2012.00414.x/abstract;jsessionid=9E25B83792AEA3EB7CBC6AE560574162.f04t04