Search for: categoricalism
Total 22 records

    Categorization of various essential datasets and methods for textual spelling detection and normalization

    , Article Iranian Journal of Information Processing Management ; Volume 32, Issue 4 , 2017 , Pages 1143-1170 ; 22518223 (ISSN) Hosseini Beheshti, M. S ; Abdi Ghavidel, H ; Sharif University of Technology
    Iranian Research Institute for Scientific Information and Documentation  2017
    Abstract
    One of the primary phases of automatic text processing is spelling error detection and grapheme normalization. Without this phase, storing textual documents faces several problems, which disrupts automatic retrieval of the documents. Therefore, specialists in natural language processing and computational linguistics usually attempt to sample various data by presenting ideal methods and algorithms in order to obtain normalized data. Several studies have been conducted on English and some other languages, followed by a number of studies on Farsi as well. Sometimes, these studies have remained to be...

    A categorization scheme for semantic web search engines

    , Article IEEE International Conference on Computer Systems and Applications, 2006, Sharjah, 8 March 2006 through 8 March 2006 ; Volume 2006 , 2006 , Pages 171-178 ; 1424402123 (ISBN); 9781424402120 (ISBN) Sheykh Esmaili, K ; Abolhassani, H ; Sharif University of Technology
    IEEE Computer Society  2006
    Abstract
    Semantic web search engines are evolving, and many prototype systems and some implementations have been developed. However, there are differing views on what a semantic search engine should do. In this paper, a categorization scheme for semantic web search engines is introduced and elaborated. For each category, its components are described according to a proposed general architecture, and the various approaches employed in these components are discussed. We also propose some factors to evaluate systems in each category. © 2006 IEEE

    Categorizing CAPTCHA

    , Article Proceedings of the ACM Conference on Computer and Communications Security, 21 October 2011 through 21 October 2011 ; October , 2011 , Pages 107-108 ; 15437221 (ISSN) ; 9781450310031 (ISBN) Shirali Shahreza, S ; Shirali Shahreza, M ; ACM SIGSAC ; Sharif University of Technology
    2011
    Abstract
    CAPTCHA (Completely Automated Public Turing test to tell Computers and Humans Apart) systems are used to distinguish human users from computer programs automatically. Their goal is to ask questions that human users can easily answer but current computers cannot. Most current CAPTCHA methods are based on the weak points of OCR (Optical Character Recognition) systems. In this paper, a new CAPTCHA method is presented on the basis of object categorization. In this method, a number of objects are chosen randomly, and pictures of these objects are searched for on the Internet and downloaded. The pictures are then shown to the user, and the user is asked to mark the objects which belong to a...

    Feature Ranking in Text Classification

    , M.Sc. Thesis Sharif University of Technology Sadeghi, Sabereh (Author) ; Beigy, Hamid (Supervisor)
    Abstract
    Text classification is one of the widest and most important applications in data mining. Because of the huge number of features in these applications, a method for dimensionality reduction is needed before applying the classification algorithm. Various methods for dimensionality reduction and feature selection have been proposed. Feature selection based on feature ranking has received much attention from researchers. The major reasons are its scalability, ease of use, and fast computation. Feature ranking methods are divided into different categories and use different measures for scoring features. Recently, ensemble methods have entered the field of ranking and achieved more accuracy...

    Automatic Blank Verse Poet Identification Using Linguistic Features

    , M.Sc. Thesis Sharif University of Technology Azin, Zahra (Author) ; Bahrani, Mohammad (Supervisor) ; Khosravi Zadeh, Parvaneh (Co-Advisor)
    Abstract
    Author identification using statistical methods is a branch of authorship attribution, which is one of the important problems in natural language processing. Using different statistical methods, an anonymous text is attributed to an author. One of the primary parts of the task is to choose the appropriate stylistic features of the text in order to study the significance of style. These features must be studied quantitatively and can be extracted at the lexical, character, and syntactic or semantic levels. The next step is text classification, in which different machine learning methods such as decision trees, Artificial Neural Networks, Naïve Bayes, and other methods can be used....

    Human Action Categorization using Spatiotemporal Features

    , M.Sc. Thesis Sharif University of Technology Ghodrati, Amir (Author) ; Kasaei, Shohre (Supervisor)
    Abstract
    Recognizing human actions is an important and challenging topic in computer vision, with important applications such as video surveillance and indexing. From a computational perspective, actions can be defined as three-dimensional patterns in space and time, which can be modeled using several representations. Action representations differ in the visual information used in the spatial dimensions (e.g., shape or appearance) and in the representation of dynamics in time. The goal of this thesis is to develop new techniques and improve current results in action categorization. To this end, three methods are proposed using a general structure. In this structure, local spatio-temporal features are...

    Hierarchical co-clustering for web queries and selected URLs

    , Article 8th International Conference on Web Information Systems Engineering, WISE 2007, Nancy, 3 December 2007 through 7 December 2007 ; Volume 4831 LNCS , 2007 , Pages 653-662 ; 03029743 (ISSN); 9783540769927 (ISBN) Hosseini, M ; Abolhassani, H ; Sharif University of Technology
    Springer Verlag  2007
    Abstract
    Recently, query log mining has been used extensively by web information systems. In this paper, a new hierarchical co-clustering method for the queries and URLs of a search engine log is introduced. In this method, we first construct a bipartite graph of queries and visited URLs; then, to discover noise-free clusters, all queries and related URLs are projected into a reduced-dimensional space by applying singular value decomposition. Finally, all queries and URLs are iteratively clustered to construct a hierarchical categorization. The method has been evaluated using a real-world data set and shows promising results. © Springer-Verlag Berlin Heidelberg 2007

    Sensitivity of a real-time freeway crash prediction model to calibration optimality

    , Article European Transport Research Review ; Volume 4, Issue 3 , September , 2012 , Pages 167-174 ; 18670717 (ISSN) Samimi, A ; Hellinga, B ; Sharif University of Technology
    2012
    Abstract
    Background: Real-time crash prediction models are often structured as general log-linear categorical models which must be calibrated using an extensive database. However, there is no method to optimally select the number of categories and the values that define the boundaries between categories when representing continuous measures as categorical variables within the log-linear model. This raises the question of how important the calibration is to the safety impacts estimated when using the crash prediction model. In this paper, we examined the impact that the process used to calibrate the crash prediction model has on estimates of safety impacts of a variable speed limit system. Methods:... 

    Metaphysics of Bohm-Bub's Modal Interpretation of Quantum Mechanics

    , M.Sc. Thesis Sharif University of Technology Yaghmaie, Aboutorab (Author) ; Shafiee, Afshin (Supervisor)
    Abstract
    According to modal interpretations of quantum mechanics, a state possesses definite values even if it is not an eigenstate. On the other hand, the Kochen-Specker theorem holds that the values possibly possessed by a system cannot all be definite. Consequently, to block this theorem, modal interpretations adopt a new metaphysics of properties. In this dissertation, I will show that dispositionalism is that metaphysics. According to dispositionalism, the causal powers of a property are constituents of the property. In line with this outline, the thesis is divided into five chapters. In chapter one, the presuppositions of later arguments are clarified. Chapter two is related to the algebra of properties and finally...

    An Open Domain Question Answering Method Based on Document Categorization

    , M.Sc. Thesis Sharif University of Technology Anvari, Hamid Reza (Author) ; Abolhassani, Hassan (Supervisor)
    Abstract
    One of the new paradigms in information retrieval is the development of textual Question-Answering systems. Question-Answering (QA) is an advanced IR process in which, for a natural language question, the answer is extracted and returned in natural language. QA systems are divided into two general groups: Open-Domain QA and Restricted-Domain QA.
    In this research field, a number of different models and methods have been developed in which a document collection is used to retrieve candidate answers, and then different methods are deployed to detect and eliminate irrelevant ones from the answer set. Most of these methods decide based on the expected semantic answer type, which is determined using pre-defined...

    Incremental Learning Approach in Spam Detection

    , M.Sc. Thesis Sharif University of Technology Ghanbari, Elham (Author) ; Beygi, Hamid (Supervisor)
    Abstract
    Studies show that a large proportion of sent emails are spam. Spam is one of the major problems for e-mail users and results in wasted time and cost. Different approaches are used to overcome this problem; one of the best is detecting spam based on message content. Separating legitimate e-mails from spam based on their contents can be treated as text classification. Since machine-learning approaches are widely applied in text classification, machine-learning algorithms can be used for spam classification. However, in the majority of these algorithms, the training phase is performed in batch mode, whereas incremental learning algorithms are preferred in many applications, especially spam detection....

    Human Action Recognition Using Expandable Graphical Models

    , M.Sc. Thesis Sharif University of Technology Moradi, Reza (Author) ; Kasaei, Shohreh (Supervisor)
    Abstract
    In recent years, the ability of computers to recognize human actions has attracted scientists because of its numerous applications. Surveillance systems in homes, workplaces, and public places, human-computer interaction, the study of human movement problems, remote supervision of ill or elderly people, and sports training are only some of these applications. In this thesis, 10 actions are considered. These actions are Walking, Running, Galloping side, Bending, Jump jacking, Jumping, Jumping in place, Skipping, Waving one hand, and Waving two hands. All of these actions exist in the Weizmann dataset, so this dataset is used as the training and testing dataset. Here important objectives are recognising human action so that it is...

    Doc2vec Natural Language Model of Farsi

    , M.Sc. Thesis Sharif University of Technology Fazeli, Mohammad (Author) ; Moghadasi, Reza (Supervisor)
    Abstract
    Due to the immense increase in the availability of text data, interest in using machine learning models to solve problems that were previously prohibitively costly has increased significantly. The first step is to represent natural language in a form that is easy for machine learning algorithms to work on. Recent advances in learned representations of text data using simple neural networks (e.g., word2vec and doc2vec) have helped increase the performance of natural language processing on downstream tasks. Here we show that methods like doc2vec, which have been examined mostly in the English language, can be used on Persian (Farsi) with little modification. To demonstrate this, we use text classification tasks, and train...

    New approaches in monitoring multivariate categorical processes based on contingency tables in phase II

    , Article Quality and Reliability Engineering International ; 2016 ; 07488017 (ISSN) Kamranrad, R ; Amiri, A ; Niaki, S. T. A ; Sharif University of Technology
    John Wiley and Sons Ltd  2016
    Abstract
    In some statistical process control (SPC) applications, the quality of a process or product is characterized by a contingency table. Contingency tables describe the relation between two or more categorical quality characteristics. In this paper, two new control charts based on the Wald and Stuart score test statistics are designed for monitoring contingency table-based processes in Phase II. The performance of the proposed control charts is compared with that of the generalized linear test (GLT) control chart proposed in the literature. The results show the better performance of the proposed control charts under small and moderate shifts. Moreover, new schemes are proposed to diagnose which cell...

    New approaches in monitoring multivariate categorical processes based on contingency tables in phase II

    , Article Quality and Reliability Engineering International ; Volume 33, Issue 5 , 2017 , Pages 1105-1129 ; 07488017 (ISSN) Kamranrad, R ; Amiri, A ; Akhavan Niaki, S. T ; Sharif University of Technology
    Abstract
    In some statistical process control (SPC) applications, the quality of a process or product is characterized by a contingency table. Contingency tables describe the relation between two or more categorical quality characteristics. In this paper, two new control charts based on the Wald and Stuart score test statistics are designed for monitoring contingency table-based processes in Phase II. The performance of the proposed control charts is compared with that of the generalized linear test (GLT) control chart proposed in the literature. The results show the better performance of the proposed control charts under small and moderate shifts. Moreover, new schemes are proposed to diagnose which cell...

    Machine Learning in Automated Spam Detection

    , M.Sc. Thesis Sharif University of Technology Famil Saeedian, Mehrnoush (Author) ; Beigy, Hamid (Supervisor)
    Abstract
    Nowadays, spam has become a universal problem with which all email users are familiar. Studies show that a large proportion of sent emails are spam. Obviously, this wastes a vast range of resources. There are different ways to fight spam, each with its own strengths and weaknesses. The most common filtering technique is content-based filtering. This problem has been addressed as a text classification problem. Two main defects of spam filtering techniques are that rules must be defined manually and that they can be circumvented; one solution for overcoming this problem is applying machine learning algorithms. Spam classification using machine learning techniques is very successful and...

    Content Based Video Classification

    , M.Sc. Thesis Sharif University of Technology Zarrin Kolah, Majid (Author) ; Manzuri Shalmani, Mohammad Taghi (Supervisor)
    Abstract
    The simultaneous development of technology and social networks, together with universal access to them, has led to the production and distribution of a huge volume of videos whose content is very hard to recognize without machine vision. This thesis examines some video classification algorithms in order to improve them. The algorithm chosen for improvement is based on one of the local descriptor algorithms. First, using STIP tools, local interest points are found by Harris3D and described by HOG/HOF. Then, using Bag of Features, all local descriptors in a video produce one descriptor per video. Bag of Features divides the domain of all local descriptors from all videos into K clusters and produces a vector per video...

    Human action categorization using discriminative local spatio-temporal feature weighting

    , Article Intelligent Data Analysis ; Volume 16, Issue 4 , July , 2012 , Pages 537-550 ; 1088467X (ISSN) Ghodrati, A ; Kasaei, S ; Sharif University of Technology
    IOP  2012
    Abstract
    New methods based on local spatio-temporal features have exhibited significant performance in action recognition. In these methods, feature selection plays an important role in achieving superior performance. Actions are represented by local spatio-temporal features extracted from action videos. Action representations are then classified by applying a classifier (such as k-nearest neighbor or SVM). In this paper, we propose two feature weighting methods to better discriminate similar actions. We also propose a definition of feature discrimination power to be used in the feature selection process. Our proposed weighting schemes have greatly improved the final categorization accuracy on...

    Bug localization using revision log analysis and open bug repository text categorization

    , Article 6th International IFIP WG 2.13 Conference on Open Source Systems, OSS 2010, Notre Dame, IN, 30 May 2010 through 2 June 2010 ; Volume 319 AICT , 2010 , Pages 188-199 ; 18684238 (ISSN) ; 9783642132438 (ISBN) Moin, A. H ; Khansari, M ; Sharif University of Technology
    2010
    Abstract
    In this paper, we present a new approach to localize a bug in the software source file hierarchy. The proposed approach uses log files of the revision control system and bug report information in the open bug repositories of open source projects to train a Support Vector Machine (SVM) classifier. Our approach employs textual information in the summary and description of bugs reported to the bug repository in order to form machine learning features. The class labels are revision paths of fixed issues, as recorded in the log file of the revision control system. Given an unseen bug instance, the trained classifier can predict which part of the software source file hierarchy (revision path) is more...

    Phase-II monitoring and diagnosing of multivariate categorical processes using generalized linear test-based control charts

    , Article Communications in Statistics: Simulation and Computation ; Volume 46, Issue 8 , 2017 , Pages 5951-5980 ; 03610918 (ISSN) Kamranrad, R ; Amiri, A ; Akhavan Niaki, S. T ; Sharif University of Technology
    Abstract
    In this paper, two control charts based on the generalized linear test (GLT) and contingency table are proposed for Phase-II monitoring of multivariate categorical processes. The performances of the proposed methods are compared with the exponentially weighted moving average-generalized likelihood ratio test (EWMA-GLRT) control chart proposed in the literature. The results show the better performance of the proposed control charts under moderate and large shifts. Moreover, a new scheme is proposed to identify the parameter responsible for an out-of-control signal. The performance of the proposed diagnosing procedure is evaluated through some simulation experiments. © 2017 Taylor & Francis...