
A streaming algorithm for 2-center with outliers in high dimensions

Article Computational Geometry: Theory and Applications ; Volume 60 , 2017 , Pages 26-36 ; 09257721 (ISSN) Hatami, B ; Zarrabi Zadeh, H ; Sharif University of Technology

Abstract

We study the 2-center problem with outliers in high-dimensional data streams. Given a stream of points in arbitrary d dimensions, the goal is to find two congruent balls of minimum radius covering all but at most z points. We present a (1.8+ε)-approximation streaming algorithm, improving over the previous (4+ε)-approximation algorithm available for the problem. The space complexity and update time of our algorithm are poly(d,z,1/ε), independent of the size of the stream. © 2016 Elsevier B.V
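The objective described in this abstract can be made concrete with a small sketch. It only evaluates the cost of a given pair of centers; it is not the streaming algorithm from the paper. The function name `two_center_radius` and the sample points are hypothetical.

```python
import math

def two_center_radius(points, c1, c2, z):
    """Smallest common radius r such that balls of radius r around c1 and c2
    cover all but at most z of the points."""
    # Distance from each point to its nearer center, in ascending order.
    dists = sorted(min(math.dist(p, c1), math.dist(p, c2)) for p in points)
    # Discard the z farthest points as outliers; the largest remaining
    # distance is the radius that is needed.
    kept = dists[:max(len(dists) - z, 0)]
    return kept[-1] if kept else 0.0

# Two tight groups plus one far-away point (illustrative data only).
points = [(0, 0), (1, 0), (10, 0), (11, 0), (100, 100)]
print(two_center_radius(points, (0.5, 0), (10.5, 0), z=1))
print(two_center_radius(points, (0.5, 0), (10.5, 0), z=0))
```

Allowing a single outlier drops the required radius from more than 100 to 0.5 in this toy input, which shows why the outlier budget z changes the problem substantially.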

UALM: unsupervised active learning method for clustering low-dimensional data

Article Journal of Intelligent and Fuzzy Systems ; Volume 32, Issue 3 , 2017 , Pages 2393-2411 ; 10641246 (ISSN) Javadian, M ; Bagheri Shouraki, S ; Sharif University of Technology

Abstract

In this paper, the Unsupervised Active Learning Method (UALM), a novel clustering method based on the Active Learning Method (ALM), is introduced. ALM is an adaptive recursive fuzzy learning algorithm inspired by some behavioral features of human brain functionality. UALM is a density-based clustering algorithm that relies on discovering densely connected components of data, which allows it to find clusters of arbitrary shapes. The approach is robust to noise. The algorithm first blurs the data points as ink drop patterns, then summarizes the effects of all data points, and finally applies a threshold to the resulting pattern. It uses the connected-component algorithm for finding...
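The blur, combine, threshold, and connected-component steps named in this abstract can be illustrated with a toy 2-D sketch. This is not the authors' UALM implementation: the cone-shaped kernel, the reading of "summarizes" as a plain sum, the integer grid, and all names (`ink_drop_clusters`, `radius`, `threshold`) are assumptions made for illustration.

```python
from collections import deque

def ink_drop_clusters(points, size, radius, threshold):
    """Toy pipeline: blur integer grid points, sum, threshold, label components."""
    grid = [[0.0] * size for _ in range(size)]
    # 1) Blur: each point deposits a cone-shaped "ink drop" around itself
    #    (assumed kernel), and the drops are summed cell by cell.
    for px, py in points:
        for x in range(max(0, px - radius), min(size, px + radius + 1)):
            for y in range(max(0, py - radius), min(size, py + radius + 1)):
                d = ((x - px) ** 2 + (y - py) ** 2) ** 0.5
                if d <= radius:
                    grid[x][y] += 1.0 - d / (radius + 1)
    # 2) Threshold the accumulated pattern.
    dense = [[grid[x][y] >= threshold for y in range(size)] for x in range(size)]
    # 3) Label connected components (4-neighbourhood) with breadth-first search.
    label = [[-1] * size for _ in range(size)]
    n_labels = 0
    for sx in range(size):
        for sy in range(size):
            if dense[sx][sy] and label[sx][sy] == -1:
                label[sx][sy] = n_labels
                queue = deque([(sx, sy)])
                while queue:
                    x, y = queue.popleft()
                    for nx, ny in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
                        if (0 <= nx < size and 0 <= ny < size
                                and dense[nx][ny] and label[nx][ny] == -1):
                            label[nx][ny] = n_labels
                            queue.append((nx, ny))
                n_labels += 1
    # Points lying on cells below the threshold keep label -1 (treated as noise).
    return [label[px][py] for px, py in points]

labels = ink_drop_clusters(
    [(2, 2), (3, 2), (2, 3), (15, 15), (16, 15), (9, 9)],
    size=20, radius=2, threshold=1.5)
print(labels)
```

In this sketch an isolated point never accumulates enough ink to pass the threshold, which is one simple way a threshold-based scheme can discard noise.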

How to extend visibility polygons by mirrors to cover invisible segments

Article 11th International Conference and Workshops on Algorithms and Computation, WALCOM 2017, 29 March 2017 through 31 March 2017 ; Volume 10167 LNCS , 2017 , Pages 42-53 ; 03029743 (ISSN); 9783319539249 (ISBN) Vaezi, A ; Ghodsi, M ; Sharif University of Technology

Springer Verlag 2017

Abstract

Given a simple polygon P with n vertices, the visibility polygon (VP) of a point q (VP(q)), or a segment (formula present) (VP(pq)) inside P can be computed in linear time. We propose a linear time algorithm to extend the VP of a viewer (point or segment), by converting some edges of P into mirrors, such that a given non-visible segment (formula present) can also be seen from the viewer. Various definitions for the visibility of a segment, such as weak, strong, or complete visibility, are considered. Our algorithm finds every edge that, when converted to a mirror, makes (formula present) visible to our viewer. We find out exactly which interval of (formula present) becomes visible, by...

Prioritized K-mean clustering hybrid GA for discounted fixed charge transportation problems

Article Computers and Industrial Engineering ; Volume 126 , 2018 , Pages 63-74 ; 03608352 (ISSN) Ghassemi Tari, F ; Hashemi, Z ; Sharif University of Technology

Elsevier Ltd 2018

Abstract

The problem of allocating different types of vehicles for transporting a set of products in an existing transportation network, to minimize the total transportation costs, is considered. The distribution network involves a heterogeneous fleet of vehicles each with the given capacity and with a variable transportation cost and a fixed cost with a discounting mechanism. Due to nonlinearity of the discounting mechanism, a nonlinear mathematical programming model is developed. A prioritized K-mean clustering encoding is introduced to designate the distribution depots distances, their demands, and the vehicles’ capacity. Using this priority clustering, a heuristic routine is developed by which...

A Task-Based Greedy Scheduling Algorithm for Minimizing Energy of MapReduce Jobs

Article Journal of Grid Computing ; Volume 16, Issue 4 , 2018 , Pages 535-551 ; 15707873 (ISSN) Yousefi, M.H.N ; Goudarzi, M ; Sharif University of Technology

Springer Netherlands 2018

Abstract

MapReduce and its open source implementation, Hadoop, have gained widespread adoption for parallel processing of big data jobs. Since the number of such big data jobs is also rapidly rising, reducing their energy consumption is increasingly important to reduce environmental impact as well as operational costs. Prior work by Mashayekhy et al. (IEEE Trans. Parallel Distributed Syst. 26, 2720–2733, 2016) has tackled the problem of energy-aware scheduling of a single MapReduce job, but we provide a far more efficient heuristic in this paper. We first model the problem as an Integer Linear Program to find the optimal solution using ILP solvers. Then we present a task-based greedy scheduling...

Designing a new procedure for reward and penalty scheme in performance-based regulation of electricity distribution companies

Article International Transactions on Electrical Energy Systems ; Volume 28, Issue 11 , 2018 ; 20507038 (ISSN) Jooshaki, M ; Abbaspour, A ; Fotuhi Firuzabad, M ; Moeini Aghtaie, M ; Lehtonen, M ; Sharif University of Technology

John Wiley and Sons Ltd 2018

Abstract

This paper introduces a new fuzzy-based design procedure for more efficient application of reward-penalty schemes in the distribution sector. To achieve a fair as well as applicable regulation scheme, the fuzzy C-means clustering algorithm is employed to efficiently determine the similarity among distribution companies. As the procedure for setting the reward-penalty scheme parameters can significantly affect the income of different companies, a new procedure based on the membership degrees obtained from the fuzzy C-means algorithm is proposed to fairly determine these parameters for each electricity distribution company. Some numerical studies are performed on the Iranian electricity distribution...
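The membership degrees mentioned in this abstract can be sketched with the standard fuzzy C-means membership formula, u_i = 1 / Σ_k (d_i / d_k)^(2/(m-1)), where d_i is the distance to cluster center i and m > 1 is the fuzzifier. How the paper maps memberships to reward-penalty parameters is not shown in the abstract and is not reproduced here; the function name and the sample values are hypothetical.

```python
import math

def fcm_memberships(point, centers, m=2.0):
    """Standard fuzzy C-means membership of one point in each cluster."""
    dists = [math.dist(point, c) for c in centers]
    # A point that coincides with one or more centers belongs fully to them.
    if any(d == 0 for d in dists):
        hits = [1.0 if d == 0 else 0.0 for d in dists]
        return [h / sum(hits) for h in hits]
    power = 2.0 / (m - 1.0)
    # u_i = 1 / sum_k (d_i / d_k)^power; the memberships sum to 1.
    return [1.0 / sum((d_i / d_k) ** power for d_k in dists) for d_i in dists]

# One feature vector and two cluster centers (illustrative values only).
print(fcm_memberships((1.0, 0.0), [(0.0, 0.0), (4.0, 0.0)]))
```

Because every item receives a graded membership in every cluster rather than a single hard label, the degrees can serve as weights, which is the property the abstract relies on.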

Slack clustering for scheduling frame-based tasks on multicore embedded systems

Article Microelectronics Journal ; Volume 81 , 2018 , Pages 144-153 ; 00262692 (ISSN) Poursafaei, F ; Bazzaz, M ; Mohajjel Kafshdooz, M ; Ejlali, A ; Sharif University of Technology

Elsevier Ltd 2018

Abstract

Adopting multicore platforms is a general trend in real-time embedded systems. However, integrating tasks with different real-time constraints into a single platform presents new design challenges. While it must be guaranteed that hard real-time tasks are able to meet their deadline even in worst case scenarios, firm real-time tasks should be scheduled in a way to achieve high system utilization in order to provide a better quality of service. In this paper, we propose a scheduling scheme for frame-based tasks on real-time multicore embedded systems which is able to guarantee the schedulability of the hard real-time tasks, while it improves the number of executed firm real-time tasks....

Cluster-based sparse topical coding for topic mining and document clustering

Article Advances in Data Analysis and Classification ; Volume 12, Issue 3 , 2018 , Pages 537-558 ; 18625347 (ISSN) Ahmadi, P ; Gholampour, I ; Tabandeh, M ; Sharif University of Technology

Springer Verlag 2018

Abstract

In this paper, we introduce a document clustering method based on Sparse Topical Coding, called Cluster-based Sparse Topical Coding. Topic modeling is capable of improving textual document clustering by describing documents via bag-of-words models and projecting them into a topic space. The latent semantic descriptions derived by the topic model can be utilized as features in a clustering process. In our proposed method, document clustering and topic modeling are integrated in a unified framework in order to achieve the highest performance. This framework includes Sparse Topical Coding, which is responsible for topic mining, and K-means that discovers the latent clusters in documents...

A clustering-based algorithm for de novo motif discovery in DNA sequences

Article 2017 24th Iranian Conference on Biomedical Engineering and 2017 2nd International Iranian Conference on Biomedical Engineering, ICBME 2017, 30 November 2017 through 1 December 2017 ; 2018 ; 9781538636091 (ISBN) Ebrahim Abadi, M. H ; Fatemizadeh, E ; Sharif University of Technology

Abstract

Motif discovery is a challenging problem in molecular biology and has been attracting researchers' attention for years. Different kinds of data and computational methods have been used to unravel this problem, but there is still room for improvement. In this study, our goal was to develop a method with the ability to identify all the TFBS signals, both known and unknown, inside the input set of sequences. As part of our algorithm, we developed a specialized clustering method which outperforms other existing clustering methods such as DNACLUST and CD-HIT-EST in clustering short sequences. A scoring system was needed to determine how close a cluster is to being a real motif. Multiple...

Application of Fuzzy C-means algorithm as a novel approach to predict solubility of hydrocarbons in carbon dioxide

Article Petroleum Science and Technology ; Volume 36, Issue 4 , 2018 , Pages 308-312 ; 10916466 (ISSN) Darvish, H ; Garmsiri, H ; Zare, M ; Hemmati, N ; Sharif University of Technology

Taylor and Francis Inc 2018

Abstract

In recent years, the decline of oil reservoirs has made research on enhanced oil recovery processes increasingly important. One widely applicable approach to enhanced oil recovery is carbon dioxide injection, which has attracted interest because of its relatively low cost, good displacement, and environmental aspects. Injecting carbon dioxide into an oil reservoir causes the lighter hydrocarbons of the crude oil to be extracted by CO2. This phenomenon can be affected by various factors, such as the solubility of hydrocarbons in carbon dioxide, so in the present investigation Fuzzy c-means (FCM) as a novel approach for estimation of solubility of alkanes in carbon dioxide in...

Partial discharges pattern recognition of transformer defect model by LBP & HOG features

Article IEEE Transactions on Power Delivery ; 2018 ; 08858977 (ISSN) Firuzi, K ; Vakilian, M ; Phung, B. T ; Blackburn, T. R ; Sharif University of Technology

Institute of Electrical and Electronics Engineers Inc 2018

Abstract

Partial discharge (PD) measurement and identification are of great importance to condition monitoring of power transformers. In this paper, a new method for recognition of single and multi-source PD based on extraction of high-level image features is introduced. A database, involving 365 samples of phase-resolved PD (PRPD) data, is developed from measurements carried out on transformer artificial defect models (having different sizes of defect) under a specific applied voltage, to be used for validation of the proposed algorithm. In the first step, each set of PRPD data is converted into grayscale images to represent different PD defects. Two “image feature extraction” methods, the Local Binary...

Using minimum matching for clustering with balancing constraints

Article 2009 Second ISECS International Colloquium on Computing, Communication, Control, and Management, CCCM 2009, Sanya, 8 August 2009 through 9 August 2009 ; Volume 1 , 2009 , Pages 225-228 ; 9781424442461 (ISBN) Shirali Shahreza, S ; Abolhassani, H ; Shirali Shahreza, M. H ; Yangzhou University; Guangdong University of Business Studies; Wuhan Institute of Technology; IEEE SMC TC on Education Technology and Training; IEEE Technology Management Council ; Sharif University of Technology

2009

Abstract

Clustering is a major task in data mining which is used in many applications. However, general clustering is inappropriate for many applications where some constraints should be applied. One category of these constraints is the cluster size constraint. In this paper, we propose a new algorithm for solving the problem of clustering with balancing constraints by using minimum matching. We compare our algorithm with the method proposed by Banerjee and Ghosh that uses stable matching and show that our algorithm converges to the final solution in fewer iterations. ©2009 IEEE

Extracting activated regions of fMRI data using unsupervised learning

Article Proceedings of the International Joint Conference on Neural Networks, 14 June 2009 through 19 June 2009, Atlanta, GA ; 2009 , Pages 641-645 ; 9781424435531 (ISBN) Davoudi, H ; Taalimi, A ; Fatemizadeh, E ; International Neural Network Society; IEEE Computational Intelligence Society ; Sharif University of Technology

2009

Abstract

Clustering approaches are increasingly used to define the activated regions of the brain in fMRI studies. However, choosing appropriate clustering algorithms and defining the optimal number of clusters are still key problems of these methods. In this paper, we apply an improved version of the Growing Neural Gas algorithm, which automatically operates on the optimal number of clusters. The decision criterion for creating new clusters at the heart of this online clustering is taken from the MB cluster validity index. Comparison with other clustering methods for fMRI data analysis shows that the proposed algorithm outperforms them on both artificial and real datasets. ©2009 IEEE

Feature-based data stream clustering

Article Proceedings of the 2009 8th IEEE/ACIS International Conference on Computer and Information Science, ICIS 2009, 1 June 2009 through 3 June 2009, Shanghai ; 2009 , Pages 363-368 ; 9780769536415 (ISBN) Jafari Asbagh, M ; Abolhassani, H ; IEEE Computer Society; International Association for Computer and Information Science, ACIS ; Sharif University of Technology

2009

Abstract

Data stream clustering has attracted considerable attention in recent years. Many one-pass and evolving algorithms have been developed in this field, but feature selection and its influence on the clustering solution have not been addressed by these algorithms. In this paper, we explain a feature-based clustering method for streaming data. Our method establishes a ranking of features based on their appropriateness in terms of clustering compactness and separateness. Then, it uses an automatic algorithm to identify unimportant features and remove them from the feature set. These two steps take place continuously during the lifetime of the clustering task. © 2009 IEEE

Clustering method for fMRI activation detection using optimal number of clusters

Article 2009 4th International IEEE/EMBS Conference on Neural Engineering, NER '09, Antalya, 29 April 2009 through 2 May 2009 ; 2009 , Pages 171-174 ; 9781424420735 (ISBN) Taalimi, A ; Bayati, H ; Fatemizadeh, E ; National Institutes of Health, NIH; National Institute of Neurological Disorders and Stroke, NINDS; National Science Foundation, NSF ; Sharif University of Technology

2009

Abstract

In this study, a clustering-based method for activation detection in functional magnetic resonance imaging (fMRI) is employed. Moreover, some features are obtained by fitting two models, namely an FIR filter and a Gamma function, to the hemodynamic response function (HRF). After applying clustering methods (that require the number of clusters as an input) to the feature space, our simulations show that the number of clusters can affect activation detection significantly. Therefore, a newly proposed clustering algorithm, namely evolving neural gas (ENG), that gives the optimal number of clusters is exploited. In addition to ENG, the result of four clustering algorithms namely k-means, fuzzy C-means, neural gas, and clara...

An approximation algorithm for finding skeletal points for density based clustering approaches

Article 2009 IEEE Symposium on Computational Intelligence and Data Mining, CIDM 2009, Nashville, TN, 30 March 2009 through 2 April 2009 ; 2009 , Pages 403-410 ; 9781424427659 (ISBN) Hassas Yeganeh, S ; Habibi, J ; Abolhassani, H ; Abbaspour Tehrani, M ; Esmaelnezhad, J ; Sharif University of Technology

2009

Abstract

Clustering is the problem of finding relations in a data set in an unsupervised manner. These relations can be extracted using the density of a data set, where the density of a data point is defined as the number of data points around it. To find the number of data points around another point, region queries are adopted. Region queries are the most expensive construct in density-based algorithms, so they should be optimized to enhance the performance of density-based clustering algorithms, especially on large data sets. Finding the optimum set of region queries to cover all the data points has been proven to be NP-complete. This optimum set is called the skeletal points of a data set. In this paper, we...
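The notion of density via region queries described in this abstract can be sketched as follows. This is a naive linear-scan version for illustration only, not the paper's skeletal-point approximation; the names `region_query`, `densities`, and `eps` are assumptions.

```python
import math

def region_query(points, center, eps):
    """Indices of all points within distance eps of center (linear scan)."""
    return [i for i, p in enumerate(points) if math.dist(p, center) <= eps]

def densities(points, eps):
    """Density of each point: the number of other points within eps of it."""
    # Each region query scans every point, so this costs O(n^2) in total,
    # which is why such queries dominate the running time on large data sets.
    return [len(region_query(points, p, eps)) - 1 for p in points]

points = [(0, 0), (0, 1), (1, 0), (5, 5)]
print(densities(points, eps=1.0))  # [2, 1, 1, 0]
```

Running one query per point is the expensive baseline; covering all points with a smaller set of queries is the optimization problem the abstract refers to.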

An FPCA-based color morphological filter for noise removal

Article Scientia Iranica ; Volume 16, Issue 1 D , 2009 , Pages 8-18 ; 10263098 (ISSN) Soleymani Baghshah, M ; Kasaei, S ; Sharif University of Technology

2009

Abstract

Morphological filtering is a useful technique for the processing and analysis of binary and gray scale images. The extension of morphological techniques to color images is not a straightforward task because this extension stems from the multivariate ordering problem. Since multivariate ordering is ambiguous, existing approaches have used known vector ordering schemes for the color ordering purpose. In the last decade, many different color morphological operators have been introduced in the literature. Some of them have focused on noise suppression. However, none has shown good performance, especially on edgy regions. In this paper, new color morphological operators, based on a...

Harmony K-means algorithm for document clustering

Article Data Mining and Knowledge Discovery ; Volume 18, Issue 3 , 2009 , Pages 370-391 ; 13845810 (ISSN) Mahdavi, M ; Abolhassani, H ; Sharif University of Technology

2009

Abstract

Fast and high-quality document clustering is a crucial task in organizing information, search engine results, enhancing web crawling, and information retrieval or filtering. Recent studies have shown that the most commonly used partition-based clustering algorithm, the K-means algorithm, is more suitable for large datasets. However, the K-means algorithm can converge to a locally optimal solution. In this paper, we propose a novel Harmony K-means Algorithm (HKA) that deals with document clustering based on the Harmony Search (HS) optimization method. It is proved by means of finite Markov chain theory that the HKA converges to the global optimum. To demonstrate the effectiveness and speed of HKA, we...

Visibility extension via mirror-edges to cover invisible segments

Article Theoretical Computer Science ; Volume 789 , 2019 , Pages 22-33 ; 03043975 (ISSN) Vaezi, A ; Ghodsi, M ; Sharif University of Technology

Elsevier B.V 2019

Abstract

Given a simple polygon P with n vertices, the visibility polygon (VP) of a point q, or a segment pq‾ inside P can be computed in linear time. We propose a linear time algorithm to extend the VP of a viewer (point or segment), by converting some edges of P into mirrors, such that a given non-visible segment uw‾ can also be seen from the viewer. Various definitions for the visibility of a segment, such as weak, strong, or complete visibility are considered. Our algorithm finds every edge that, when converted to a mirror, makes uw‾ visible to our viewer. We find out exactly which interval of uw‾ becomes visible, by every edge middling as a mirror, all in linear time. In other words, in this...

Inline high-bandwidth network analysis using a robust stream clustering algorithm

Article IET Information Security ; Volume 13, Issue 5 , 2019 , Pages 486-497 ; 17518709 (ISSN) Noferesti, M ; Jalili, R ; Sharif University of Technology

Institution of Engineering and Technology 2019

Abstract

High-bandwidth network analysis is challenging, resource consuming, and inaccurate due to the high volume, velocity, and variety characteristics of the network traffic. The infinite stream of incoming traffic forms a dynamic environment with unexpected changes, which requires analysing approaches to satisfy the high-bandwidth network processing challenges such as incremental learning, inline processing, and outlier handling. This study proposes an inline high-bandwidth network stream clustering algorithm designed to incrementally mine large amounts of continuously transmitting network traffic when some outliers can be dropped before determining the network traffic behaviour. Maintaining...