GDCluster: A general decentralized clustering algorithm

Article IEEE Transactions on Knowledge and Data Engineering ; Volume 27, Issue 7, 2015, Pages 1892-1905 ; 10414347 (ISSN) Mashayekhi, H ; Habibi, J ; Khalafbeigi, T ; Voulgaris, S ; Van Steen, M ; Sharif University of Technology

IEEE Computer Society 2015

Abstract

In many popular applications like peer-to-peer systems, large amounts of data are distributed among multiple sources. Analysis of this data and identifying clusters is challenging due to processing, storage, and transmission costs. In this paper, we propose GDCluster, a general fully decentralized clustering method, which is capable of clustering dynamic and distributed data sets. Nodes continuously cooperate through decentralized gossip-based communication to maintain summarized views of the data set. We customize GDCluster for execution of the partition-based and density-based clustering methods on the summarized views, and also offer enhancements to the basic algorithm. Coping with...

GDCLU: A new grid-density based clustring algorithm

Article Proceedings - 13th ACIS International Conference on Software Engineering, Artificial Intelligence, Networking, and Parallel/Distributed Computing, SNPD 2012, 8 August 2012 through 10 August 2012 ; August, 2012, Pages 102-107 ; 9780769547619 (ISBN) Esfandani, G ; Sayyadi, M ; Namadchian, A ; Sharif University of Technology

2012

Abstract

This paper addresses the density-based clustering problem in data mining, where clusters are established based on the density of regions. The most well-known algorithm proposed in this area is DBSCAN [1], which employs two parameters influencing the shape of the resulting clusters. Therefore, one of the major weaknesses of this algorithm is its inability to handle clusters in multi-density environments. In this paper, a new density-based grid clustering algorithm, GDCLU, is proposed which uses a new definition for dense regions. It determines dense grids based on the densities of their neighbors. This new definition enables GDCLU to handle different shaped clusters in multi-density environments. Also...

MSDBSCAN: Multi-density scale-independent clustering algorithm based on DBSCAN

Article Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 19 November 2010 through 21 November 2010, Chongqing ; Volume 6440 LNAI, Issue PART 1, November, 2010, Pages 202-213 ; 03029743 (ISSN) ; 3642173152 (ISBN) Esfandani, G ; Abolhassani, H ; Sharif University of Technology

2010

Abstract

A good approach in data mining is density-based clustering, in which the clusters are constructed based on the density of shape regions. The prominent algorithm proposed in the density-based clustering family is DBSCAN [1], which uses two global density parameters, namely the minimum number of points for a dense region and epsilon, indicating the neighborhood distance. Among others, one of the weaknesses of this algorithm is its unsuitability for multi-density data sets, where different regions have various densities so the same epsilon does not work. In this paper, a new density-based clustering algorithm, MSDBSCAN, is proposed. MSDBSCAN uses a new definition for core point and dense region. The...

UALM: unsupervised active learning method for clustering low-dimensional data

Article Journal of Intelligent and Fuzzy Systems ; Volume 32, Issue 3, 2017, Pages 2393-2411 ; 10641246 (ISSN) Javadian, M ; Bagheri Shouraki, S ; Sharif University of Technology

Abstract

In this paper the Unsupervised Active Learning Method (UALM), a novel clustering method based on the Active Learning Method (ALM) is introduced. ALM is an adaptive recursive fuzzy learning algorithm inspired by some behavioral features of human brain functionality. UALM is a density-based clustering algorithm that relies on discovering densely connected components of data, where it can find clusters of arbitrary shapes. This approach is a noise-robust clustering method. The algorithm first blurs the data points as ink drop patterns, then summarizes the effects of all data points, and finally puts a threshold on the resulting pattern. It uses the connected-component algorithm for finding...
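The three steps named in this abstract (blur each point as an ink drop, sum the effects, threshold, then find connected components) can be sketched as follows. This is only an illustrative toy, not the UALM algorithm itself: the function name `ink_drop_clusters`, the integer grid, the linearly fading square kernel, the 4-connectivity, and all parameter names are assumptions made for the example.

```python
def ink_drop_clusters(points, grid_size, radius, threshold):
    """Toy ink-drop clustering on a square grid (hypothetical helper).

    points: list of (x, y) integer cell coordinates in [0, grid_size).
    Returns one label per point; 0 means the point fell below the threshold.
    """
    ink = [[0.0] * grid_size for _ in range(grid_size)]

    # 1) Blur and sum: every point deposits ink that fades linearly
    #    with Chebyshev distance (assumed kernel shape).
    for px, py in points:
        for x in range(max(0, px - radius), min(grid_size, px + radius + 1)):
            for y in range(max(0, py - radius), min(grid_size, py + radius + 1)):
                d = max(abs(x - px), abs(y - py))
                ink[x][y] += 1.0 - d / (radius + 1)

    # 2) Threshold the accumulated ink pattern.
    mask = [[ink[x][y] >= threshold for y in range(grid_size)]
            for x in range(grid_size)]

    # 3) Label connected components of the mask with a flood fill.
    label = [[0] * grid_size for _ in range(grid_size)]
    next_label = 0
    for sx in range(grid_size):
        for sy in range(grid_size):
            if not mask[sx][sy] or label[sx][sy]:
                continue
            next_label += 1
            label[sx][sy] = next_label
            stack = [(sx, sy)]
            while stack:
                x, y = stack.pop()
                for nx, ny in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
                    if (0 <= nx < grid_size and 0 <= ny < grid_size
                            and mask[nx][ny] and not label[nx][ny]):
                        label[nx][ny] = next_label
                        stack.append((nx, ny))

    # Each point inherits the label of the cell it sits in.
    return [label[px][py] for px, py in points]


pts = [(2, 2), (3, 2), (2, 3), (12, 12), (13, 12), (12, 13), (7, 18)]
labels = ink_drop_clusters(pts, grid_size=20, radius=2, threshold=1.5)
```

In this toy run the two tight groups end up in two different components, while the isolated point at (7, 18) deposits too little ink to pass the threshold and is treated as noise.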

GoSCAN: Decentralized scalable data clustering

Article Computing ; Volume 95, Issue 9, 2013, Pages 759-784 ; 0010485X (ISSN) Mashayekhi, H ; Habibi, J ; Voulgaris, S ; Van Steen, M ; Sharif University of Technology

2013

Abstract

Identifying clusters is an important aspect of analyzing large datasets. Clustering algorithms classically require access to the complete dataset. However, as huge amounts of data are increasingly originating from multiple, dispersed sources in distributed systems, alternative solutions are required. Furthermore, data and network dynamicity in a distributed setting demand adaptable clustering solutions that offer accurate clustering models at a reasonable pace. In this paper, we propose GoScan, a fully decentralized density-based clustering algorithm which is capable of clustering dynamic and distributed datasets without requiring central control or message flooding. We identify two major...

A novel method to find appropriate ε for DBSCAN

Article Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 24 March 2010 through 26 March 2010 ; Volume 5990 LNAI, Issue PART 1, 2010, Pages 93-102 ; 03029743 (ISSN) ; 3642121446 (ISBN) Esmaelnejad, J ; Habibi, J ; Hassas Yeganeh, S ; Sharif University of Technology

2010

Abstract

Clustering is one of the most useful methods of data mining, in which a set of real or abstract objects is categorized into clusters. The DBSCAN clustering method, one of the best-known density-based clustering methods, groups points in dense areas into the same cluster. In DBSCAN, a point is said to be dense if the circular area of radius ε around it contains at least MinPts points. To find such dense areas, region queries are issued. Two points are defined as density connected if the distance between them is less than ε and at least one of them is dense. Finally, the density-connected parts of the data set are extracted as clusters. The significant issue of such a method is that its parameters (ε...
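The definitions in this abstract (dense points, region queries, density-connected parts) can be illustrated with a minimal sketch. It follows the description above rather than any particular library; the names `region_query` and `density_cluster` are hypothetical, and two choices are assumptions: a point counts itself toward MinPts, and the neighborhood test uses `<=`.

```python
from math import dist


def region_query(points, i, eps):
    """Indices of all points within eps of points[i], including i itself."""
    return [j for j, q in enumerate(points) if dist(points[i], q) <= eps]


def density_cluster(points, eps, min_pts):
    """Label density-connected parts; -1 marks points left as noise."""
    n = len(points)
    neighbors = [region_query(points, i, eps) for i in range(n)]
    dense = [len(nb) >= min_pts for nb in neighbors]

    labels = [-1] * n
    cluster_id = 0
    for start in range(n):
        # A new cluster is only started from an unlabeled dense point.
        if not dense[start] or labels[start] != -1:
            continue
        labels[start] = cluster_id
        stack = [start]
        while stack:
            i = stack.pop()  # i is always a dense point here
            for j in neighbors[i]:
                if labels[j] == -1:
                    labels[j] = cluster_id
                    # Only dense points keep expanding the cluster.
                    if dense[j]:
                        stack.append(j)
        cluster_id += 1
    return labels


pts = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10), (50, 50)]
print(density_cluster(pts, eps=1.5, min_pts=3))  # [0, 0, 0, 1, 1, 1, -1]
```

The sketch issues one region query per point, which makes its sensitivity to ε and MinPts easy to explore by rerunning it with different values.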

Density link-based methods for clustering web pages

Article Decision Support Systems ; Volume 47, Issue 4, 2009, Pages 374-382 ; 01679236 (ISSN) Haghir Chehreghani, M ; Abolhassani, H ; Haghir Chehreghani, M ; Sharif University of Technology

2009

Abstract

The World Wide Web is a huge information space, making it a valuable resource for decision making. However, it should be effectively managed for such a purpose. One important management technique is clustering the web data. In this paper, we propose some developments in clustering methods to achieve higher quality. First, we study a new density-based method adapted for hierarchical clustering of web documents. Then, utilizing the hyperlink structure of the web, we propose a new method that combines density concepts with the web graph. These algorithms have the advantage of low complexity and, as experimental results reveal, the resultant clusters have high quality. © 2009 Elsevier B.V. All rights...

An approximation algorithm for finding skeletal points for density based clustering approaches

Article 2009 IEEE Symposium on Computational Intelligence and Data Mining, CIDM 2009, Nashville, TN, 30 March 2009 through 2 April 2009 ; 2009, Pages 403-410 ; 9781424427659 (ISBN) Hassas Yeganeh, S ; Habibi, J ; Abolhassani, H ; Abbaspour Tehrani, M ; Esmaelnezhad, J ; Sharif University of Technology

2009

Abstract

Clustering is the problem of finding relations in a data set in an unsupervised manner. These relations can be extracted using the density of a data set, where the density of a data point is defined as the number of data points around it. To find the number of data points around another point, region queries are adopted. Region queries are the most expensive construct in density-based algorithms, so they should be optimized to enhance the performance of density-based clustering algorithms, especially on large data sets. Finding the optimum set of region queries to cover all the data points has been proven to be NP-complete. This optimum set is called the skeletal points of a data set. In this paper, we...
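To make the covering problem concrete, here is a small sketch that picks region-query centers with the generic greedy set-cover heuristic. It is not the approximation algorithm proposed in the paper (the abstract is cut off before describing it), and the name `greedy_query_cover` is hypothetical. The sketch computes every region up front, so it illustrates the problem statement, not a way to save region queries.

```python
from math import dist


def greedy_query_cover(points, eps):
    """Greedily choose centers whose eps-regions cover every point.

    Generic set-cover heuristic used purely for illustration.
    """
    n = len(points)
    # Region of point i: indices of all points within eps of it.
    regions = [
        {j for j in range(n) if dist(points[i], points[j]) <= eps}
        for i in range(n)
    ]

    uncovered = set(range(n))
    centers = []
    while uncovered:
        # Pick the region that covers the most still-uncovered points.
        best = max(range(n), key=lambda i: len(regions[i] & uncovered))
        centers.append(best)
        uncovered -= regions[best]
    return centers


pts = [(0, 0), (1, 0), (2, 0), (10, 0)]
print(greedy_query_cover(pts, eps=1.0))  # [1, 3]
```

The loop always terminates because every point lies in its own region, so each step covers at least one new point.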

DSCLU: A new data stream CLUstring algorithm for multi density environments

Article Proceedings - 13th ACIS International Conference on Software Engineering, Artificial Intelligence, Networking, and Parallel/Distributed Computing, SNPD 2012 ; 2012, Pages 83-88 ; 9780769547619 (ISBN) Namadchian, A ; Esfandani, G ; Sharif University of Technology

2012

Abstract

Recently, data streams have become popular in many contexts of data mining. Due to the high volume of incoming data, traditional clustering algorithms are not suitable for this family of problems. Many data stream clustering algorithms proposed in recent years considered the scalability of data, but most of them did not address the following issues: (1) The quality of clustering can degrade dramatically over time. (2) Some of the algorithms cannot handle arbitrary shapes of data stream and consequently the results are limited to specific regions. (3) Most of the algorithms have not been evaluated in multi-density environments. Identifying appropriate clusters for data stream by handling the...

A novel density-based fuzzy clustering algorithm for low dimensional feature space

Article Fuzzy Sets and Systems ; 2016 ; 01650114 (ISSN) Javadian, M ; Bagheri Shouraki, S ; Sheikhpour Kourabbaslou, S ; Sharif University of Technology

Elsevier B.V 2016

Abstract

In this paper, we propose a novel density-based fuzzy clustering algorithm based on the Active Learning Method (ALM), a soft computing methodology inspired by hypotheses claiming that the human brain interprets information in pattern-like images rather than numerical quantities. The proposed clustering algorithm, Fuzzy Unsupervised Active Learning Method (FUALM), is performed in two main phases. First, each data point spreads in the feature space just like an ink drop that spreads on a sheet of paper. As a result of this process, densely connected ink patterns are formed that represent clusters. In the second phase, a fuzzifying process is applied in order to summarize the effects...

A novel density-based fuzzy clustering algorithm for low dimensional feature space

Article Fuzzy Sets and Systems ; Volume 318, 2017, Pages 34-55 ; 01650114 (ISSN) Javadian, M ; Bagheri Shouraki, S ; Sheikhpour Kourabbaslou, S ; Sharif University of Technology

Abstract

In this paper, we propose a novel density-based fuzzy clustering algorithm based on the Active Learning Method (ALM), a soft computing methodology inspired by hypotheses claiming that the human brain interprets information in pattern-like images rather than numerical quantities. The proposed clustering algorithm, Fuzzy Unsupervised Active Learning Method (FUALM), is performed in two main phases. First, each data point spreads in the feature space just like an ink drop that spreads on a sheet of paper. As a result of this process, densely connected ink patterns are formed that represent clusters. In the second phase, a fuzzifying process is applied in order to summarize the effects...

Communities detection for advertising by futuristic greedy method with clustering approach

Article Big Data ; Volume 9, Issue 1, 2021, Pages 22-40 ; 21676461 (ISSN) Bakhthemmat, A ; Izadi, M ; Sharif University of Technology

Mary Ann Liebert Inc 2021

Abstract

Community detection in social networks is one of the advertising methods in electronic marketing. One of the approaches to finding communities in large social networks is to use greedy methods, because these methods run very fast. Greedy methods are generally designed based on local decisions; thus, inappropriate local decisions may result in an improper global solution. The use of an improved greedy index with a futuristic approach can, to some extent, prevent inappropriate local choices. Our proposed method determines the influential nodes in the social network based on the followers and following and a new futuristic greedy index. It classifies the nodes based on the influential nodes by...