Search for: data-mining
Total 200 records
Article 2nd International Workshop on Contexts and Ontologies: Theory, Practice and Applications, C and O 2006 - Collocated with the 17th European Conference on Artificial Intelligence, ECAI 2006, Riva del Garda, 28 August 2006 through 28 August 2006 ; Volume 210 , 2006 ; 16130073 (ISSN) ; Sayyadi, H ; Abolhassani, H ; Sheykh Esmaili, K ; Sharif University of Technology
Several metrics have been proposed for recognizing relationships between the elements of two ontologies. Many of these methods select a number of such metrics and combine them to extract existing mappings. In this article, we present a method, based on data mining techniques, for selecting the more effective metrics. Furthermore, given a set of metrics, we suggest a data-mining-like means of combining them into a better ontology alignment.
Article 2011 IEEE 3rd International Conference on Communication Software and Networks, ICCSN 2011, Xi'an, 27 May 2011 through 29 May 2011 ; 2011 , Pages 329-337 ; 9781612844855 (ISBN) ; Alishahi, M ; Sharif University of Technology
Factors such as competition among companies and changes in demand lead to changes in customers' behavior. Ignoring these changes may therefore reduce company benefits and cause the loss of customers. Since data and their analysis determine the activities and decisions of companies, data quality is of paramount importance in that analysis, because misinformation leads to wrong decisions. Since data mining is designed to find frequently repeated patterns, it can be used to deal with product sales violations by salespeople and to increase the quality of data. Most available data mining models try to find patterns in one table, but the...
Article 2007 IADIS European Conference on Data Mining, DM 2007, part of the 1st IADIS Multi Conference on Computer Science and Information Systems, MCCSIS 2007, 3 July 2007 through 8 July 2007 ; 2020 , Pages 230-232 ; Arasteh, B ; Karimi, M. B ; Hoseyni, M. J ; Bouyer, A ; Movaghar, A ; Sharif University of Technology
IADIS Press 2020
Grid computing is the on-demand sharing of computing resources within a tightly coupled network to solve certain problems. One of the main topics is knowledge discovery and data mining. By using grid computing, we can solve this problem. To achieve these very ambitious goals, we present an architecture for applying association rule mining algorithms in a grid environment. Association rule mining seeks to discover associations among the transactions encoded in the database on each machine and then sends the result to the related coordinator. This new architecture is powerful and rapid. We compared our method with serial processing. Our experimental results show that by using the new architecture on...
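The abstract describes each machine mining its own transactions and sending the result to a coordinator. A minimal sketch of that idea follows, assuming a simple count-distribution scheme: each partition counts its itemsets locally, and the coordinator sums the counts and derives rules. The function names, thresholds and toy transactions are hypothetical and are not taken from the paper, whose actual architecture is not shown in this listing.

```python
from collections import Counter
from itertools import combinations


def local_counts(transactions, max_size=2):
    """Count itemsets of up to max_size items in one machine's partition."""
    counts = Counter()
    for t in transactions:
        items = sorted(set(t))
        for k in range(1, max_size + 1):
            counts.update(combinations(items, k))
    return counts


def merge_counts(partials):
    """Coordinator step: sum the per-machine counts."""
    total = Counter()
    for c in partials:
        total.update(c)
    return total


def rules(total, n_transactions, min_support=0.5, min_confidence=0.6):
    """Derive one-to-one rules (lhs -> rhs) from the merged pair counts."""
    found = []
    for itemset, count in total.items():
        support = count / n_transactions
        if len(itemset) != 2 or support < min_support:
            continue
        a, b = itemset
        for lhs, rhs in ((a, b), (b, a)):
            confidence = count / total[(lhs,)]
            if confidence >= min_confidence:
                found.append((lhs, rhs, support, confidence))
    return sorted(found)


# Hypothetical data: two machines, two transactions each.
partitions = [
    [["bread", "milk"], ["bread", "butter"]],
    [["bread", "milk", "butter"], ["milk"]],
]
partials = [local_counts(p) for p in partitions]  # would run on each machine
total = merge_counts(partials)                    # would run on the coordinator
n = sum(len(p) for p in partitions)
result = rules(total, n)
```

Only the counts travel to the coordinator, not the transactions, which is what makes the local step easy to run in parallel.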
An efficient algorithm for solving bi-objective fuzzy job-shop scheduling problems by genetic algorithms and data mining, Article Amirkabir (Journal of Science and Technology) ; Volume 15, Issue 58 D , 2004 , Pages 570-591 ; 10150951 (ISSN) ; Moghaddam, R. T ; Ranjbar, M ; Sharif University of Technology
This paper presents a meta-heuristic algorithm for solving bi-objective fuzzy job-shop scheduling problems. The objectives are to minimize the makespan and to minimize the earliness and lateness penalties. Processing times and due dates are considered as fuzzy triangular numbers. This paper also introduces a novel use of a data mining algorithm for solving combinatorial optimization problems. The proposed algorithm combines genetic algorithms and an attribute-oriented induction algorithm, which is much quicker than previous methods providing the optimal solution. By considering the structure of the proposed algorithm, the whole feasible solutions of a special job-shop scheduling problem are considered as a...
Developing an approach to evaluate stocks by forecasting effective features with data mining methods, Article Expert Systems with Applications ; Volume 42, Issue 3 , February , 2014 , Pages 1325-1339 ; 09574174 (ISSN) ; Modarres, M ; Sharif University of Technology
Elsevier Ltd 2014
In this research, a novel approach is developed to predict stock returns and risks. In this three-stage method, all features that may affect stock risk and return are first identified through a comprehensive investigation. In the next stage, risk and return are predicted by applying data mining techniques to the given features. Finally, we develop a hybrid algorithm based on filter and function-based clustering; the important features for risk and return prediction are selected, and then risk and return are re-predicted. The results show that the proposed hybrid model is a proper tool for effective feature selection and these features are good indicators for the prediction...
Article 2010 45th International Universities' Power Engineering Conference, UPEC 2010, Cardiff, 31 August 2010 through 3 September 2010 ; 2010 ; 9780956557025 (ISBN) ; Vakilian, M ; Sharif University of Technology
Partial discharge (PD) is a common phenomenon that occurs in the insulation of high-voltage equipment, such as transformers, and has a damaging effect on the insulation. If data mining techniques are used to find the specifications and features of different types of partial discharge in power transformers, the insulation condition of such equipment can be monitored online and continuously. The results can be employed to develop more precise preventive measures; consequently, maintenance would require less time and cost for the electric utility, and the lifetime expectancy of the transformers would improve. In this paper experiments are set up to create models for some types of PD that occurs in Power...
Article Educational Data Mining 2010 - 3rd International Conference on Educational Data Mining, 11 June 2010 through 13 June 2010 ; June , 2010 , Pages 241-248 ; 9780615375298 (ISBN) ; Habibi, J ; Pittsburgh Science of Learning Center DataShop; Carnegie Learning Inc ; Sharif University of Technology
In the past few years, Iranian universities have begun to use e-learning tools and technologies to extend and improve their educational services. After a few years of conducting e-learning programs, a debate took place among the executives and managers of the e-learning institutes concerning which activities have the most influence on the learning progress of online students. This research aims to investigate the impact of a number of e-learning activities on the students' learning development. The results show that participation in virtual classroom sessions has the most substantial impact on the students' final grades. This paper presents the process of applying data mining...
M.Sc. Thesis Sharif University of Technology ; Hooshmand, Mahmoud
Nowadays, because of the high volume and growth of data in industrial organizations and manufacturing plants, the recording and storage of data have moved away from manual and traditional methods, and the use of automation and mechanized systems has become necessary. To achieve this transformation, tools, facilities and methods that can fulfill this requirement are strongly needed. A high volume of data is therefore considered an advantage, because precise analysis makes it possible to take logical management decisions with less risk. In recent years, statistical and numerical methods and simulation were used to discover knowledge and information when one of...
M.Sc. Thesis Sharif University of Technology ; Gholampour, Iman
Nowadays, the use of different types of data has a significant impact on the analysis of the related systems. Growth in data volume and system complexity, together with errors and ambiguity in data collection, has increased the need for new data analysis methods. Location-based data, collected from sensors in different places, are an important data type for such analyses. These data, alongside information from other official organizations such as the municipality or Google …, provide a large volume of raw data. Such collections of raw data are mostly diverse, heterogeneous, bulky and dispersed. In spite of that, raw data with machine learning algorithms lead to considerable practical...
M.Sc. Thesis Sharif University of Technology ; Shavandi, Hassan
The healthcare industry collects huge amounts of healthcare data which, unfortunately, are not "mined" to discover hidden information for effective decision making. Discovery of hidden patterns and relationships often goes unexploited. Advanced data mining techniques can help remedy this situation. The diagnosis of diseases is a vital and intricate job in medicine. The recognition of heart disease from diverse features or signs is a multi-layered problem that is not free from false assumptions and is frequently accompanied by impulsive effects. Thus the attempt to exploit knowledge and experience of several specialists and clinical screening data of patients composed in databases to assist the...
Applying data mining techniques to business process reengineering based on simultaneous use of two novel proposed approaches, Article International Journal of Business Process Integration and Management ; Volume 6, Issue 3 , 2013 , Pages 247-267 ; 17418763 (ISSN) ; Khanbabaei, M ; Saniee Abadeh, M ; Sharif University of Technology
Business process reengineering (BPR) can help organisations to identify and improve their business processes. A major problem is the high volume of business process datasets, with characteristics such as high dimensionality, noise, uncertainty and complicated interactions among process variables. Data mining (DM) techniques facilitate the identification and analysis of business processes, and improve their performance by extracting the hidden knowledge in business process datasets. In this paper, we present the application of DM to BPR, based on two novel approaches. Based on a literature review, the first approach proposes the DMbBPR model, which mainly focuses on the applications of...
Diagnosis of coronary artery disease using data mining techniques based on symptoms and ECG features, Article European Journal of Scientific Research ; Volume 82, Issue 4 , Aug , 2012 , Pages 542-553 ; 1450216X (ISSN) ; Habibi, J ; Hosseini, M. J ; Boghrati, R ; Ghandeharioun, A ; Bahadorian, B ; Sani, Z. A ; Sharif University of Technology
EuroJournals, Inc 2012
The most common heart disease is coronary artery disease (CAD). CAD is one of the main causes of heart attacks and deaths across the globe. Early diagnosis of this disease is therefore of great importance. A large number of methods have thus far been devised for diagnosing CAD. Most of these techniques have been evaluated on the Irvine dataset (University of California), which not only has a limited number of features but is also full of missing values and thus lacks reliability. The present study was designed to seek a new set, free from missing values, comprising features such as the functional class, dyspnea, Q wave, ST elevation, ST depression, and T inversion. Information...
How realistic is static traffic assignment? Analyzing automatic number-plate recognition data and image processing of real-time traffic maps for investigation, Article Transportation Research Interdisciplinary Perspectives ; Volume 9 , 2021 ; 25901982 (ISSN) ; Gholampour, I ; Sedghi, M ; Zhu, L ; Sharif University of Technology
Elsevier Ltd 2021
Travel demand information in the form of the origin–destination (OD) matrix plays an essential role in studying urban traffic management and network design. The present study takes a novel step toward urban traffic analysis by data mining processed images of real-time traffic maps as a location-based data model. The data were analyzed with software such as KNIME and Python workspaces, and the results were compared with conventional traffic assignment results. Thus, we investigated a real-time OD matrix based on the trip-per-vehicle by automatic number-plate recognition (ANPR) cameras for the congestion charge zone (CCZ) of Tehran, Iran. The obtained matrix was assigned...
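To make the idea of an OD matrix built from ANPR sightings concrete, the following is a minimal sketch. It assumes one trip per vehicle, with the origin taken as the zone of the first sighting and the destination as the zone of the last sighting. The record layout, zone names and this first/last rule are illustrative assumptions, not the procedure used in the study.

```python
from collections import Counter


def od_matrix(sightings):
    """Build an OD count matrix from (plate, timestamp, zone) records.

    Assumption: each vehicle makes one trip, from the zone of its first
    sighting to the zone of its last sighting.
    """
    by_plate = {}
    for plate, _ts, zone in sorted(sightings, key=lambda s: (s[0], s[1])):
        by_plate.setdefault(plate, []).append(zone)

    matrix = Counter()
    for zones in by_plate.values():
        if len(zones) >= 2:  # a single sighting gives no origin/destination pair
            matrix[(zones[0], zones[-1])] += 1
    return matrix


# Hypothetical sightings: (plate, timestamp, zone)
sightings = [
    ("A", 1, "Z1"), ("A", 5, "Z3"),
    ("B", 2, "Z1"), ("B", 3, "Z2"), ("B", 9, "Z3"),
    ("C", 4, "Z2"),
]
matrix = od_matrix(sightings)
```

Vehicles seen only once are dropped here; a real pipeline would need a rule for them and for splitting multiple trips by the same vehicle.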
Article IET Conference Publications, Stockholm ; Volume 2013, Issue 615 CP , June , 2013 ; 9781849197328 (ISBN) ; Asl, M. A ; Vasigh, R ; Shafiei, H ; Sharif University of Technology
Communication between control centers, contact centers, field technicians and customers has changed over the previous decade, and traditional analogue communication devices have been replaced by integrated digital systems. Nowadays, communication and computer network infrastructure is vital to distribution companies, because an interruption of the communication system can disrupt the majority of services. Therefore, it is necessary to control any situation that can threaten the communication system. Alborz province power distribution company uses data mining methods such as classification and clustering, in addition to quantitative statistics, for analyzing historical fault data. The result of this study is...
Article Proceedings of the 2011 International Conference on Artificial Intelligence, ICAI 2011, 18 July 2011 through 21 July 2011 ; Volume 2 , July , 2011 , Pages 725-729 ; 9781601321855 (ISBN) ; Sharif University of Technology
Nowadays, discovering association rules is an important and controversial area in data mining research. These rules describe noticeable association relationships among different attributes. While most studies have focused on binary-valued transaction data, in real-world applications the data usually consist of quantitative values. With that in mind, in this paper, we propose a fuzzy data mining algorithm for extracting membership functions from quantitative transactions. This is a hybrid genetic-PSO algorithm for finding membership functions suitable for mining problems by a strong cooperation of GA and PSO. This algorithm integrates the two techniques entire run of simulation...
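As an illustration of what a membership function does to a quantitative transaction value, here is a minimal sketch using triangular membership functions. The region names and the (a, b, c) parameters are hypothetical; in an approach like the one described, a search procedure such as GA or PSO would tune those parameters rather than fix them by hand.

```python
def triangular(x, a, b, c):
    """Membership degree of x in a triangular fuzzy set.

    a and c are the feet of the triangle (degree 0) and b is the peak
    (degree 1).
    """
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)


# Hypothetical fuzzy regions for a purchased quantity.
REGIONS = {"low": (0, 3, 6), "middle": (3, 6, 9), "high": (6, 9, 12)}


def fuzzify(quantity):
    """Map a quantity to its membership degree in each fuzzy region."""
    return {name: triangular(quantity, *abc) for name, abc in REGIONS.items()}


degrees = fuzzify(4)  # partly "low", partly "middle", not "high"
```

A fuzzy mining step would then accumulate these degrees, instead of 0/1 occurrences, when computing the support of an item.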
Article International Journal of Electrical Power and Energy Systems ; Volume 71 , October , 2015 , Pages 373-382 ; 01420615 (ISSN) ; Vakilian, M ; Blackburn, T. R ; Phung, B. T ; Sharif University of Technology
Elsevier Ltd 2015
Proposing and applying a set of new features for on-line partial discharge (PD) monitoring when there is no information about the type of PD is a challenging task in the condition assessment of power equipment such as power transformers. This paper addresses that task. So far, various techniques have been employed to develop a comprehensive PD monitoring system; however, only limited success has been achieved. One of the challenging issues in this field is the discovery of proper features capable of differentiating the involvement of possible types of PD sources. In order to examine the efficiency of the method established in this paper, which is based on application of a set...
Article ICEIE 2010 - 2010 International Conference on Electronics and Information Engineering, Proceedings, 1 August 2010 through 3 August 2010 ; Volume 1 , August , 2010 , Pages V1468-V1472 ; 9781424476800 (ISBN) ; Pedram, M. M ; Alishahi, M ; Badie, K ; Sharif University of Technology
The activities and decisions of organizations and companies are based on data and the information obtained from data analysis. Data quality plays a crucial role in data analysis, because incorrect data lead to wrong decisions. Nowadays, improving data quality manually is very difficult and in many cases impossible: data quality is a complicated and unstructured concept, the data refinement process cannot be done without the help of professional domain experts, and the detection and correction of errors require thorough knowledge of the domain of the data. Thus, the necessity of using (semi-)automatic methods is discussed to find data defects and errors and...
Article SECRYPT 2009 - International Conference on Security and Cryptography, Proceedings, 7 July 2009 through 7 October 2009, Milan ; 2009 , Pages 213-218 ; 9789896740054 (ISBN) ; Movaghar, A ; Sharif University of Technology
This paper combines the cryptanalysis of RC4 with a data mining algorithm. It analyzes RC4 with a data mining algorithm (J48) for the first time and discloses more vulnerabilities of RC4. The motivation for this paper is to combine artificial intelligence and machine learning with cryptography in order to decrypt ciphertext in the shortest possible time. This analysis shows that many numbers in RC4 do not change their positions during the different permutations and substitutions and remain fixed in place. This means that KSA and PRGA are bad shuffle algorithms. In this method, information theory and decision trees are used, which are very powerful for solving hard problems and extracting information from...
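For readers unfamiliar with the two RC4 stages named in the abstract, the sketch below implements the standard key-scheduling algorithm (KSA) and pseudo-random generation algorithm (PRGA). It also adds a small helper that counts entries of the state still at their initial index after KSA. That helper is only one possible reading of "numbers that do not change their positions" and is an assumption made for illustration; it is not the J48-based analysis of the paper.

```python
def ksa(key: bytes) -> list:
    """RC4 key-scheduling algorithm: permute 0..255 under control of the key."""
    s = list(range(256))
    j = 0
    for i in range(256):
        j = (j + s[i] + key[i % len(key)]) % 256
        s[i], s[j] = s[j], s[i]
    return s


def prga(state: list, n: int) -> bytes:
    """RC4 pseudo-random generation algorithm: produce n keystream bytes."""
    s = state[:]  # work on a copy so the caller's state is untouched
    i = j = 0
    out = []
    for _ in range(n):
        i = (i + 1) % 256
        j = (j + s[i]) % 256
        s[i], s[j] = s[j], s[i]
        out.append(s[(s[i] + s[j]) % 256])
    return bytes(out)


def rc4(key: bytes, data: bytes) -> bytes:
    """Encrypt or decrypt data by XOR with the RC4 keystream."""
    stream = prga(ksa(key), len(data))
    return bytes(a ^ b for a, b in zip(data, stream))


def fixed_points(state: list) -> int:
    """Hypothetical helper: count entries with state[i] == i."""
    return sum(1 for i, v in enumerate(state) if i == v)


ciphertext = rc4(b"Key", b"Plaintext")
unchanged_after_ksa = fixed_points(ksa(b"Key"))
```

Collecting statistics such as `unchanged_after_ksa` over many keys is the kind of tabular data a decision-tree learner could then be trained on.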
M.Sc. Thesis Sharif University of Technology ; Mahdavi Amiri, Nezamoddin
M.Sc. Thesis Sharif University of Technology ; Abolhassani, Hassan
In the last decade, intrusion detection systems have attracted attention due to their importance in network security, but they still have shortcomings. The main problem is that they generate many low-level alerts, many of which are actually false positives. One suggested solution is alert correlation analysis. Because of false positives, alert correlation techniques are not able to build accurate scenarios, but the accuracy of alerts can be verified with the aid of the information logged in the host systems. In this dissertation, after surveying the current alert correlation techniques, a model will be introduced to effectively verify the generated alerts and to apply correlation techniques to...