K-means-G*: Accelerating k-means clustering algorithm utilizing primitive geometric concepts

, Article Information Sciences ; Volume 618 , 2022 , Pages 298-316 ; 00200255 (ISSN) Ismkhan, H ; Izadi, M ; Sharif University of Technology

Elsevier Inc 2022

Abstract

K-means is the most popular clustering algorithm, but because it needs so many distance computations, its speed drops dramatically on high-dimensional data. Although some quite fast variants have been proposed in the literature, there is still much room for improvement on high-dimensional large-scale datasets. The method proposed here, k-means-g*, is based on a simple geometric concept: for four distinct points, if the distances between all pairs except one are known, then a lower bound can be determined for the unknown distance. Utilizing this technique in the assignment step of k-means, many high-dimensional distance computations can be skipped, where small amount...
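The pruning idea in this abstract can be illustrated with a minimal sketch. It does not reproduce the paper's four-point bound, which is not given in the truncated abstract. Instead it uses a weaker bound from the plain triangle inequality applied through two reference points, and the names `lower_bound_via_two_pivots`, `assign`, and `pivots` are hypothetical. A candidate center is skipped whenever the lower bound on its distance already reaches the best distance found so far.

```python
import math


def euclidean(p, q):
    """Full Euclidean distance between two points of equal dimension."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def lower_bound_via_two_pivots(d_x_p1, d_p1_c, d_x_p2, d_p2_c):
    """Lower bound on d(x, c) from known distances to two pivots p1 and p2.

    Assumption: this is the ordinary triangle-inequality bound
    |d(x, p) - d(p, c)| taken over both pivots, not the paper's bound.
    """
    return max(abs(d_x_p1 - d_p1_c), abs(d_x_p2 - d_p2_c))


def assign(point, centers, pivots):
    """Return (index of nearest center, number of point-center distances skipped)."""
    p1, p2 = pivots
    d_x_p1 = euclidean(point, p1)
    d_x_p2 = euclidean(point, p2)
    best_idx, best_dist, skipped = -1, math.inf, 0
    for j, c in enumerate(centers):
        # In practice the pivot-center distances would be cached once per
        # iteration and shared by all points; they are recomputed here for brevity.
        lb = lower_bound_via_two_pivots(
            d_x_p1, euclidean(p1, c), d_x_p2, euclidean(p2, c)
        )
        if lb >= best_dist:
            skipped += 1  # c cannot be strictly closer than the current best
            continue
        d = euclidean(point, c)
        if d < best_dist:
            best_idx, best_dist = j, d
    return best_idx, skipped


centers = [(1.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
pivots = ((0.0, 1.0), (1.0, 1.0))
nearest, skipped = assign((0.0, 0.0), centers, pivots)
```

Because the bound never exceeds the true distance, the pruned assignment returns the same center as a brute-force scan. The saving only pays off when the pivot distances are reused across many points.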

Preclustering algorithms for imprecise points

, Article Algorithmica ; Volume 84, Issue 6 , 2022 , Pages 1467-1489 ; 01784617 (ISSN) Abam, M. A ; de Berg, M ; Farahzad, S ; Haji Mirsadeghi, M. O ; Saghafian, M ; Sharif University of Technology

Springer 2022

Abstract

We study the problem of preclustering a set B of imprecise points in $\mathbb{R}^d$: we wish to cluster the regions specifying the potential locations of the points such that, no matter where the points are located within their regions, the resulting clustering approximates the optimal clustering for those locations. We consider k-center, k-median, and k-means clustering, and obtain the following results. Let $B := \{b_1, \ldots, b_n\}$ be a collection of disjoint balls in $\mathbb{R}^d$, where each ball $b_i$ specifies the possible locations of an input point $p_i$. A partition C of B into subsets is called an $(f(k), \alpha)$-preclustering (with respect to the specific k-clustering variant under consideration) if (i) C consists of...

Three hybrid GAs for discounted fixed charge transportation problems

, Article Cogent Engineering ; Volume 5, Issue 1 , 2018 ; 23311916 (ISSN) Ghassemi Tari, F ; Hashemi, Z ; Sharif University of Technology

Cogent OA 2018

Abstract

The problem of allocating a heterogeneous fleet of vehicles to the existing distribution network for dispensing products from a manufacturing firm to a set of depots is considered. It is assumed that a heterogeneous fleet of vehicles with given capacities and total costs consisting of a discounted fixed cost and a variable cost proportional to the amount shipped is employed for handling products. To minimize the total transportation costs, the problem is modeled in the form of a nonlinear mixed integer program. Due to the NP-hard complexity of the mathematical model, three prioritized K-means clustering hybrid GAs, incorporating two new heuristic algorithms, are proposed. The efficiency of the algorithms...

An innovative implementation of Circular Hough Transform using eigenvalues of Covariance Matrix for detecting circles

, Article Proceedings Elmar - International Symposium Electronics in Marine, 14 September 2011 through 16 September 2011, Zadar ; 2011 , Pages 397-400 ; 13342630 (ISSN) ; 9789537044121 (ISBN) Tooei, M. H. D. H ; Mianroodi, J. R ; Norouzi, N ; Khajooeizadeh, A ; Sharif University of Technology

2011

Abstract

In this paper, a fast and accurate algorithm for identifying circular objects in images is proposed. The presented method is a robust, fast and optimized adaptation of the Circular Hough Transform (CHT), eigenvalues of the covariance matrix, and K-means clustering techniques. Both runtime and quality are greatly improved by implementing an iterative K-means clustering algorithm and by establishing exponential growth instead of updating values in the parameter space of the CHT through summation. Using the eigenvalues of the covariance matrix as a validation method, a well-balanced compromise between the speed and accuracy of results is achieved. This method is tested on several real world...

List Estimation

, M.Sc. Thesis Sharif University of Technology Shahrivari, Farzad (Author) ; Amini, Arash (Supervisor) ; Aminzadeh Gohari, Amin (Co-Advisor)

Abstract

Let X be an unknown vector of size n which is to be estimated from a known $m \times 1$ vector Y. According to the MMSE criterion, the best estimator (denoted $\hat{X}(Y)$) is the estimator which minimizes the mean squared error. Now, consider a List Decoding problem in which the sender delivers a list of codes instead of a single decoder. Assume that it is allowed to use multiple parallel estimators $(\hat{X}_1(Y), \hat{X}_2(Y), \ldots, \hat{X}_k(Y))$ instead of delivering a single estimation of samples. The goal is to find the best possible list of estimators, in a way that the mean squared error is optimized between the multiple $\hat{X}_i(Y)$ $(i = 1, 2, \ldots, k)$. As a medical example, imagine an MRI device which produces three...
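For reference, the single-estimator MMSE criterion mentioned in this abstract has a standard closed form: the minimizer over all functions g of Y is the conditional mean. The second line below is only one plausible reading of the list objective, an assumption that the truncated abstract does not confirm, in which the error is measured against the best estimator in the list.

```latex
% Standard MMSE estimator (minimum over all measurable functions g of Y)
\hat{X}(Y) \;=\; \arg\min_{g}\; \mathbb{E}\!\left[\lVert X - g(Y)\rVert^{2}\right] \;=\; \mathbb{E}[X \mid Y]

% Assumed list objective (hypothetical reading, not taken from the thesis)
\min_{\hat{X}_1,\ldots,\hat{X}_k}\; \mathbb{E}\!\left[\min_{1 \le i \le k} \lVert X - \hat{X}_i(Y)\rVert^{2}\right]
```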

Home Healthcare Routing and Scheduling Problem During the Covid-19 Pandemic with Uncertainties

, M.Sc. Thesis Sharif University of Technology Nabavizadeh Rafsanjani, Najmeh (Author) ; Rafiee, Majid (Supervisor)

Abstract

In summary, this thesis presents a new mathematical model for Home Health Care (HHC) services during the Corona era, which aims to increase the efficiency and quality of services provided by these organizations while ensuring compliance with quarantine protocols. The model is an extension of the VRPPDTW formulation and considers relevant features such as patient classification, caregiver classification, work and break regulations, workload balancing, and multi-depot capabilities. The optimization can be performed with two separate objective functions, one to minimize traveling and idle costs and the other to minimize the total working time of caregivers. The contradictions between two...

Preclustering algorithms for imprecise points

, Article 17th Scandinavian Symposium and Workshops on Algorithm Theory, SWAT 2020, 22 June 2020 through 24 June 2020 ; Volume 162 , 2020 Abam, M. A ; de Berg, M ; Farahzad, S ; Haji Mirsadeghi, M. O ; Saghafian, M ; Sharif University of Technology

Schloss Dagstuhl- Leibniz-Zentrum fur Informatik GmbH, Dagstuhl Publishing 2020

Abstract

We study the problem of preclustering a set B of imprecise points in $\mathbb{R}^d$: we wish to cluster the regions specifying the potential locations of the points such that, no matter where the points are located within their regions, the resulting clustering approximates the optimal clustering for those locations. We consider k-center, k-median, and k-means clustering, and obtain the following results. Let $B := \{b_1, \ldots, b_n\}$ be a collection of disjoint balls in $\mathbb{R}^d$, where each ball $b_i$ specifies the possible locations of an input point $p_i$. A partition C of B into subsets is called an $(f(k), \alpha)$-preclustering (with respect to the specific k-clustering variant under consideration) if (i) C consists of...

Solving MEC and MEC/GI problem models, using information fusion and multiple classifiers

, Article Innovations'07: 4th International Conference on Innovations in Information Technology, IIT, Dubai, 18 November 2007 through 20 November 2007 ; 2007 , Pages 397-401 ; 9781424418411 (ISBN) Asgarian, E ; Moeinzadeh, M. H ; Mohammadzadeht, J ; Ghazinezhad, A ; Habibi, J ; Najafi Ardabili, A ; Sharif University of Technology

IEEE Computer Society 2007

Abstract

Mutations in Single Nucleotide Polymorphisms (SNPs - different variant positions (1%) from human genomes) are responsible for some genetic diseases. As a consequence, obtaining all SNPs from human populations is one of the primary goals of recent studies in human genomics. The two sequences of these SNPs in diploid human organisms are called haplotypes. In this paper, we study the problems of haplotype reconstruction from SNP fragments with and without genotype information. Designing serial and parallel classifiers was the focus of our research. A genetic algorithm and K-means were the two components of our approaches. This combination helps us to cover a single classifier's weaknesses. ©2008 IEEE

Online exams and the COVID-19 pandemic: a hybrid modified FMEA, QFD, and k-means approach to enhance fairness

, Article SN Applied Sciences ; Volume 3, Issue 10 , 2021 ; 25233971 (ISSN) Haghshenas Gorgani, H ; Shabani, S ; Sharif University of Technology

Springer Nature 2021

Abstract

The COVID-19 pandemic caused an increasing demand for online academic classes, which led to a demand for effective online exams given the limitations on time and resources. Consequently, holding online exams with sufficient reliability and effectiveness became one of the most critical and challenging subjects in higher education. Therefore, it is essential to have a preventive algorithm to allocate time and financial resources effectively. In the present study, a fair test with sufficient validity is first defined, and then by analogy with an engineering product, the design process is implemented on it. For this purpose, a hybrid method based on FMEA, which is a preventive method to...

Deep long short-term memory (LSTM) networks for ultrasonic-based distributed damage assessment in concrete

, Article Cement and Concrete Research ; Volume 162 , 2022 ; 00088846 (ISSN) Ranjbar, I ; Toufigh, V ; Sharif University of Technology

Elsevier Ltd 2022

Abstract

This paper presented a comprehensive study on developing a deep learning approach for ultrasonic-based distributed damage assessment in concrete. In particular, two architectures of long short-term memory (LSTM) networks were proposed: (1) a classification model to evaluate the concrete's damage stage; (2) a regression model to predict the concrete's absorbed energy ratio. Two input configurations were considered and compared for both architectures: (1) the input was a single signal; (2) the inputs were four signals from four sides of the specimen. A comprehensive experimental study was designed and conducted on ground granulated blast furnace slag-based geopolymer concrete, providing a...

A bi-objective Hybrid Algorithm to Reduce Noise and Data Dimension in Diabetes Disease Diagnosis Using Support Vector Machines

, M.Sc. Thesis Sharif University of Technology Alirezaei, Mahsa (Author) ; Akhavan Niaki, Taghi (Supervisor)

Abstract

There is a significant amount of data in the healthcare domain, and it is infeasible to process such a volume of data manually in order to diagnose diseases and develop a treatment method in a short time. Diabetes mellitus has attracted the attention of data miners for a couple of reasons, chief among them its significant effects on the health and well-being of affected people and the economic burden on the health care system. Researchers are trying to find a statistical correlation between the causes of this disease and factors like the patient's lifestyle, hereditary information, etc. The purpose of data mining is to discover rules that facilitate the early diagnosis...

Routing and Scheduling of Home Health Care Problem Under Uncertainty

, M.Sc. Thesis Sharif University of Technology Khodabandeh, Pouria (Author) ; Rafiee, Majid (Supervisor) ; Kayvanfar, Vahid (Co-Supervisor)

Abstract

Home health care is one of the newest methods of providing services to patients in developed societies that can respond to the individual lifestyle of the modern age and the increase in life expectancy. In this study, a new mathematical model is developed taking into account the flexibility of starting and ending locations of each nurse, according to the specific requirements of each service. In this context, there are some special services that require picking up materials and health equipment from the laboratory or force the nurse to return to the laboratory to deliver the specimens and equipment. In the next step, this model is expanded to downgrading aspects by adding the objective of minimizing...

Clustering for Large-Scale Datasets

, Ph.D. Dissertation Sharif University of Technology Ismkhan, Hassan (Author) ; Izadi, Mohammad (Supervisor)

Abstract

So far, many clustering algorithms have been proposed; however, they lose their speed on large-scale datasets. Large-scale datasets are those with many points whose dimensionality is also high. For low-dimensional datasets with many points, classic methods such as tree-based structures can easily speed up the algorithms. In this thesis, to accelerate data clustering, the number of distance computations is reduced, because the high number of distance computations is the main reason that clustering algorithms are slow. To reach this goal, the thesis takes the view that accelerating clustering algorithms requires accelerating the tasks of searching for...

A combination of PSO and K-means methods to solve haplotype reconstruction problem

, Article 2009 International Conference on Innovations in Information Technology, IIT '09, 15 December 2009 through 17 December 2009 ; 2009 , Pages 190-194 ; 9781424456987 (ISBN) Sharifian R, S ; Baharian, A ; Asgarian, E ; Rasooli, A ; Sharif University of Technology

Abstract

Disease association study is of great importance among various fields of study in bioinformatics. Computational methods happen to be advantageous specifically when experimental approaches fail to obtain accurate results. Haplotypes are believed to be the most responsible biological data for genetic diseases. In this paper, the problem of reconstructing haplotypes from error-containing SNP fragments is discussed. For this purpose, two new methods have been proposed that combine k-means clustering with the particle swarm optimization algorithm. The methods and their implementation results on real biological and simulation datasets are presented, showing that they outperform the methods...

A weighted K-means clustering approach to solve the redundancy allocation problem of systems having components with different failures

, Article Proceedings of the Institution of Mechanical Engineers, Part O: Journal of Risk and Reliability ; Volume 233, Issue 6 , 2019 , Pages 925-942 ; 1748006X (ISSN) Karimi, B ; Akhavan Niak, S. T ; Miriha, S. M ; Ghare Hasanluo, M ; Javanmard, S ; Sharif University of Technology

SAGE Publications Ltd 2019

Abstract

A nonlinear integer programming model is developed in this article to solve redundancy allocation problems with multiple components having different failure rates in the series–parallel configuration using an active strategy. The main objective of this research is to select the number and the type of each component in subsystems so that the reliability of the system is maximized under certain constraints. To this aim, a weighted K-means clustering method is proposed, in which the analytical network process is employed to assign weights to the components of each cluster. As the proposed model belongs to the class of NP-hard (nondeterministic polynomial-time hard) problems, precise solution methods...

Supervised fuzzy partitioning

, Article Pattern Recognition ; Volume 97 , 2020 Ashtari, P ; Nateghi Haredasht, F ; Beigy, H ; Sharif University of Technology

Elsevier Ltd 2020

Abstract

Centroid-based methods, including k-means and fuzzy c-means, are known as effective and easy-to-implement approaches to clustering in many applications. However, these algorithms cannot be directly applied to supervised tasks. This paper thus presents a generative model extending the centroid-based clustering approach to be applicable to classification and regression tasks. Given an arbitrary loss function, the proposed approach, termed Supervised Fuzzy Partitioning (SFP), incorporates label information into its objective function through a surrogate term penalizing the empirical risk. Entropy-based regularization is also employed to fuzzify the partition and to weight features,...

An online portfolio selection algorithm using clustering approaches and considering transaction costs

, Article Expert Systems with Applications ; Volume 159 , November , 2020 Khedmati, M ; Azin, P ; Sharif University of Technology

Elsevier Ltd 2020

Abstract

This paper presents an online portfolio selection algorithm based on the pattern-matching principle, which decides on the optimal portfolio in each period and updates it at the beginning of each period. The proposed method consists of two steps: i) sample selection, ii) portfolio optimization. First, in the sample selection, clustering algorithms including k-means, k-medoids, spectral and hierarchical clustering are applied to discover time windows (TW) similar to the recent time window. Then, after finding the similar time windows and predicting the market behavior of the next day, the optimum function along with the transaction cost is used in the portfolio...

Separation of speech sources in under-determined case using SCA and time-frequency methods

, Article 2008 International Symposium on Telecommunications, IST 2008, Tehran, 27 August 2008 through 28 August 2008 ; 2008 , Pages 533-538 ; 9781424427512 (ISBN) Mahdian, R ; Babaiezadeh, M ; Jutten, C ; Sharif University of Technology

2008

Abstract

This paper presents a new algorithm for Blind Source Separation (BSS) of instantaneous speech mixtures in the under-determined case. A demixing algorithm which exploits the sparsity of speech signals in the short time Fourier transform (STFT) domain is proposed. This algorithm combines the modified k-means clustering procedure involved in the Line Orientation Separation Technique (LOST) with the Smoothed l0-norm minimization (SL0) method. The first procedure, along with a transformation into a sparse domain, tries to estimate the mixing matrix, and the second method tries to extract the sources from the mixtures. Simulation results are presented and compared to the Degenerate Unmixing Estimation Technique...

Shrinking FPGA static power via machine learning-based power gating and enhanced routing

, Article IEEE Access ; Volume 9 , 2021 , Pages 115599-115619 ; 21693536 (ISSN) Seifoori, Z ; Asadi, H ; Stojilovic, M ; Sharif University of Technology

Institute of Electrical and Electronics Engineers Inc 2021

Abstract

Despite FPGAs rapidly evolving to support the requirements of the most demanding emerging applications, their high static power consumption, concentrated within the routing resources, still presents a major hurdle for low-power applications. Augmenting the FPGAs with power-gating ability is a promising way to effectively address the power-consumption obstacle. However, the main challenge when implementing power gating is in choosing the clusters of resources in a way that would allow the most power-saving opportunities. In this paper, we take advantage of machine learning approaches, such as K-means clustering, to propose efficient algorithms for creating power-gating clusters of FPGA...

Climate Classification of the MENA (Middle East and North Africa) by Introducing a New Index for Clustering Validation

, M.Sc. Thesis Sharif University of Technology Rajabi, Reza (Author) ; Moghim, Sanaz (Supervisor)

Abstract

Clustering provides valuable information for the discovery of climatic zones. To use clustering approaches, a similarity measure, a clustering algorithm, and a clustering validity index should be determined. To find climatic zones over the Middle East and North Africa (MENA), this study performs k-means clustering with Euclidean distance as the similarity measure on four monthly precipitation datasets (CRU, GPCC, UDEL, and PREC/L) and two monthly temperature datasets (CRU, NOAA GHCN-CAMS). This study aims to validate clustering results and find a proper number of clusters. For this purpose, five traditional validity indices are examined on experimental datasets. Results show significant differences...
