Search for: clustering-web

    Probabilistic heuristics for hierarchical web data clustering

    Article; Computational Intelligence; Volume 28, Issue 2, 2012, Pages 209-233; 08247935 (ISSN); Haghir Chehreghani, M; Haghir Chehreghani, M; Abolhassani, H; Sharif University of Technology
    Abstract
    Clustering Web data is an important technique for extracting knowledge from the Web. In this paper, a novel method is presented to facilitate the clustering. The method determines the appropriate number of clusters and provides suitable representatives for each cluster by inference from a Bayesian network. Furthermore, by means of the Bayesian network, the contents of the Web pages are converted into lower-dimensional vectors. The method is also extended to hierarchical clustering, and a useful heuristic is developed to select a good hierarchy. The experimental results show that the clusters produced are of high quality.

    Density link-based methods for clustering web pages

    Article; Decision Support Systems; Volume 47, Issue 4, 2009, Pages 374-382; 01679236 (ISSN); Haghir Chehreghani, M; Abolhassani, H; Haghir Chehreghani, M; Sharif University of Technology
    Abstract
    The World Wide Web is a huge information space, making it a valuable resource for decision making. However, it must be managed effectively for such a purpose. One important management technique is clustering the web data. In this paper, we propose some developments in clustering methods to achieve higher quality. First, we study a new density-based method adapted for hierarchical clustering of web documents. Then, utilizing the hyperlink structure of the web, we propose a new method that combines density concepts with the web graph. These algorithms have the advantage of low complexity and, as the experimental results reveal, the resulting clusters are of high quality. © 2009 Elsevier B.V. All rights...

    Web page clustering using harmony search optimization

    Article; IEEE Canadian Conference on Electrical and Computer Engineering, CCECE 2008, Niagara Falls, ON, 4 May 2008 through 7 May 2008; 2008, Pages 1601-1604; 08407789 (ISSN); 9781424416431 (ISBN); Forsati, R; Mahdavi, M; Kangavari, M; Safarkhani, B; Sharif University of Technology
    Abstract
    Clustering has become an increasingly important task in modern application domains. Targeting useful and relevant information on the World Wide Web is a topical and highly complicated research area. Clustering techniques have been applied to categorize documents on the web and to extract knowledge from the web. In this paper, we propose novel clustering algorithms for web document clustering based on the Harmony Search (HS) optimization method. By modeling clustering as an optimization problem, we first propose a pure HS-based clustering algorithm that finds near-globally-optimal clusters within a reasonable time. Then we hybridize K-means and harmony clustering to achieve better...
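
The idea of treating clustering as an optimization problem solved by Harmony Search can be illustrated with a minimal sketch. This is not the authors' algorithm: the encoding (each harmony is a flat list of centroid coordinates), the objective (sum of squared distances to the nearest centroid), and every name and parameter value (`harmony_search_clustering`, `hmcr`, `par`, `bandwidth`, `memory_size`) are assumptions chosen for illustration, following the generic Harmony Search scheme of memory consideration, pitch adjustment, and random selection.

```python
import random


def sse(points, centroids):
    """Sum of squared distances from each point to its nearest centroid."""
    return sum(
        min(sum((a - b) ** 2 for a, b in zip(p, c)) for c in centroids)
        for p in points
    )


def harmony_search_clustering(points, k, memory_size=10, iterations=500,
                              hmcr=0.9, par=0.3, bandwidth=0.1, seed=0):
    """Generic Harmony Search over centroid positions (illustrative only)."""
    rng = random.Random(seed)
    dim = len(points[0])
    lows = [min(p[d] for p in points) for d in range(dim)]
    highs = [max(p[d] for p in points) for d in range(dim)]

    def decode(harmony):
        # A harmony is a flat list of k * dim coordinates.
        return [harmony[i * dim:(i + 1) * dim] for i in range(k)]

    def random_value(d):
        return rng.uniform(lows[d], highs[d])

    # Initialize the harmony memory with random candidate solutions.
    memory = [[random_value(i % dim) for i in range(k * dim)]
              for _ in range(memory_size)]
    costs = [sse(points, decode(h)) for h in memory]

    for _ in range(iterations):
        new = []
        for i in range(k * dim):
            d = i % dim
            if rng.random() < hmcr:
                # Memory consideration: reuse a value from a stored harmony.
                value = rng.choice(memory)[i]
                if rng.random() < par:
                    # Pitch adjustment: small perturbation, clamped to range.
                    value += rng.uniform(-bandwidth, bandwidth) * (highs[d] - lows[d])
                    value = min(max(value, lows[d]), highs[d])
            else:
                # Random selection from the allowed range.
                value = random_value(d)
            new.append(value)

        # Replace the worst stored harmony if the new one is better.
        cost = sse(points, decode(new))
        worst = max(range(memory_size), key=costs.__getitem__)
        if cost < costs[worst]:
            memory[worst], costs[worst] = new, cost

    best = min(range(memory_size), key=costs.__getitem__)
    return decode(memory[best]), costs[best]
```

A hybrid of the kind the abstract mentions could, under the same assumptions, use the returned centroids as the starting point for a K-means refinement; the abstract does not give the details, so that step is left out here.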

    Improving density-based methods for hierarchical clustering of web pages

    Article; Data and Knowledge Engineering; Volume 67, Issue 1, 2008, Pages 30-50; 0169023X (ISSN); Haghir Chehreghani, M; Abolhassani, H; Haghir Chehreghani, M; Sharif University of Technology
    Abstract
    The rapid increase of information on the web makes it necessary to improve information management techniques. One of the most important techniques is clustering web data. In this paper, we propose a new 3-phase clustering method that finds dense units in a data set using density-based algorithms. The distances in the dense units are stored in order in structures such as a min heap. In the extraction stage, these distances are extracted one by one, and their effects on the clustering process are examined. Finally, in the combination stage, clustering is completed using improved versions of well-known single and average linkage methods. All steps of the methods are performed in O(n log n) time...
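
The extraction and combination stages described above (distances kept in a min heap, popped one by one, then merged by a linkage rule) can be sketched as follows. This is a simplified illustration, not the paper's method: it pushes all pairwise distances between points rather than distances within dense units, so it runs in O(n² log n) rather than the O(n log n) reported in the abstract, and it uses plain single linkage with a union-find structure. The function name `single_linkage_with_heap` and its parameters are hypothetical.

```python
import heapq
from itertools import combinations


def single_linkage_with_heap(points, num_clusters):
    """Merge clusters in order of increasing distance until num_clusters remain."""
    n = len(points)

    # Store every pairwise Euclidean distance in a min heap.
    heap = [
        (sum((a - b) ** 2 for a, b in zip(points[i], points[j])) ** 0.5, i, j)
        for i, j in combinations(range(n), 2)
    ]
    heapq.heapify(heap)

    # Union-find tracks which cluster each point currently belongs to.
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    clusters = n
    while heap and clusters > num_clusters:
        # Extract the smallest remaining distance and examine its effect.
        _, i, j = heapq.heappop(heap)
        root_i, root_j = find(i), find(j)
        if root_i != root_j:
            parent[root_i] = root_j  # single-linkage merge
            clusters -= 1

    # Points with the same label belong to the same cluster.
    return [find(i) for i in range(n)]
```

For example, calling `single_linkage_with_heap([(0, 0), (0, 1), (5, 5), (5, 6)], 2)` groups the first two points together and the last two together.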