Search for: information-retrieval

    Fundamental Bounds for Clustering of Bernoulli Mixture Models

    , M.Sc. Thesis Sharif University of Technology Behjati, Amin (Author) ; Motahari, Abolfazl (Supervisor)
    Abstract
    A random vector with binary components that are independent of each other is referred to as a Bernoulli random vector. A Bernoulli Mixture Model (BMM) is a combination of a finite number of Bernoulli models, where each sample is generated randomly according to one of these models. The important challenge is to estimate the parameters of a Bernoulli Mixture Model or to cluster samples based on their source models. This problem has applications in bioinformatics, image recognition, text classification, social networks, and more. For example, in bioinformatics, it pertains to clustering ethnic groups based on genetic data. Many studies have introduced algorithms for solving this problem without... 
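
The generative model described in this abstract can be illustrated with a minimal sampling sketch. This is not the thesis's method. The function name `sample_bmm`, its parameters, and the example numbers are all hypothetical, chosen only to show how each sample is drawn from one of several Bernoulli models.

```python
import random


def sample_bmm(weights, probs, n, seed=0):
    """Draw n samples from a Bernoulli mixture (illustrative sketch).

    weights[k]  : probability of picking component k (assumed to sum to 1)
    probs[k][j] : probability that bit j equals 1 under component k
    Returns (samples, labels), where labels[i] is the source component.
    """
    rng = random.Random(seed)
    samples, labels = [], []
    for _ in range(n):
        # Step 1: choose which Bernoulli model generates this sample.
        k = rng.choices(range(len(weights)), weights=weights)[0]
        # Step 2: draw every binary component independently.
        x = [1 if rng.random() < p else 0 for p in probs[k]]
        samples.append(x)
        labels.append(k)
    return samples, labels


# Hypothetical two-component mixture over three binary features.
samples, labels = sample_bmm(
    weights=[0.5, 0.5],
    probs=[[0.9, 0.9, 0.1], [0.1, 0.2, 0.8]],
    n=5,
)
```

Clustering, in this setting, means recovering `labels` from `samples` alone. Parameter estimation means recovering `weights` and `probs`.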

    Improving Reasoning in Question Answering Systems Using Deep Learning

    , M.Sc. Thesis Sharif University of Technology Rahimi, Zahra (Author) ; Sameti, Hossein (Supervisor)
    Abstract
    Nowadays Artificial Intelligence systems are ubiquitous. One of the important applications is textual question-answering systems, which provide a means of information retrieval in a user-friendly manner. Reasoning is an inseparable part of human daily life, and people use reasoning to judge and find rational and correct answers to questions. To get the desired output from question-answering systems, these systems must be equipped with reasoning. This research focuses on improving question answering by considering Commonsense Reasoning. The two most important weaknesses of the existing question-answering systems are the questions being in the form of multiple-choice, which is far from a... 

    Music emotion recognition using two level classification

    , Article 2014 Iranian Conference on Intelligent Systems, ICIS 2014 ; Feb , 2014 ; 9781479933501 Pouyanfar, S ; Sameti, H ; Sharif University of Technology
    Abstract
    The rapid growth of digital music data on the Internet in recent years has increased user demand for search based on different types of metadata. One kind of metadata that we focus on in this paper is the emotion or mood of music. Music emotion recognition is a prevalent research topic today. We collected a database of 280 pieces of popular music covering the four basic emotions of Thayer's two-dimensional model. We used a two-level classifier, whose process can be briefly summarized in three steps: 1) extracting the most suitable features from the pieces of music in the database to describe each song; 2) applying feature selection approaches to decrease correlations...

    Efficient stochastic algorithms for document clustering

    , Article Information Sciences ; Volume 220 , 2013 , Pages 269-291 ; 00200255 (ISSN) Forsati, R ; Mahdavi, M ; Shamsfard, M ; Meybodi, M. R ; Sharif University of Technology
    2013
    Abstract
    Clustering has become an increasingly important and highly complicated research area for targeting useful and relevant information in modern application domains such as the World Wide Web. Recent studies have shown that the most commonly used partitioning-based clustering algorithm, the K-means algorithm, is more suitable for large datasets. However, the K-means algorithm may generate a local optimal clustering. In this paper, we present novel document clustering algorithms based on the Harmony Search (HS) optimization method. By modeling clustering as an optimization problem, we first propose a pure HS based clustering algorithm that finds near-optimal clusters within a reasonable time.... 

    ISO-TimeML event extraction in persian text

    , Article 24th International Conference on Computational Linguistics - Proceedings of COLING 2012: Technical Papers, 8 December 2012 through 15 December 2012 ; December , 2012 , Pages 2931-2944 Yaghoobzadeh, Y ; Ghassem-Sani, G ; Mirroshandel, S. A ; Eshaghzadeh, M ; Sharif University of Technology
    2012
    Abstract
    Recognizing TimeML events and identifying their attributes are important tasks in natural language processing (NLP). Several NLP applications, such as question answering, information retrieval, summarization, and temporal information extraction, need some knowledge about the events in the input documents. Existing methods developed for this task are restricted to a limited number of languages, and for many other languages, including Persian, there has not been any effort yet. In this paper, we introduce two different approaches for automatic event recognition and classification in Persian. For this purpose, a corpus of events has been built based on a specific version of ISO-TimeML for Persian....

    PEN: Parallel English-Persian news corpus

    , Article Proceedings of the 2011 International Conference on Artificial Intelligence, ICAI 2011, 18 July 2011 through 21 July 2011 ; Volume 2 , July , 2011 , Pages 523-528 ; 9781601321855 (ISBN) Farajian, M. A ; ICAI 2011
    2011
    Abstract
    Parallel corpora are necessary resources in many multilingual natural language processing applications, including machine translation and cross-lingual information retrieval. Manual preparation of a large-scale parallel corpus is a very time-consuming and costly procedure. In this paper, the work towards building a sentence-level aligned English-Persian corpus in a semi-automated manner is presented. The design of the corpus and the collection and alignment of the sentences are described. Two statistical similarity measures were used to find the similarities of sentence pairs. To verify the alignment process automatically, Google Translator was used. The corpus is based on news...

    CFM: A file manager with multiple categorization support

    , Article SEKE 2010 - Proceedings of the 22nd International Conference on Software Engineering and Knowledge Engineering, 1 July 2010 through 3 July 2010 ; 2010 , Pages 748-751 ; 1891706268 (ISBN); 9781891706264 (ISBN) Badashian, A. S ; Afzali, H ; Khalkhali, I ; Delcheh, M. A ; Shafiei, M. S ; Mahdavi, M ; Sharif University of Technology
    Abstract
    This paper introduces a new file manager that supports multiple categorization. The proposed file manager is based on an idea named Conceptual File Management (CFM). In this approach, files are not contained in folders; instead, each file can be a member of one or more folders (concepts). A prototype file manager is designed and implemented based on the new approach. Filtering by set operations and manual concept selection improve retrieval of the files. CFM improves the file system's clarity and avoids ambiguity and redundancy. As a result, it reduces the size of the file system and enhances file access
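
The idea of files belonging to several concepts and being filtered by set operations can be sketched as follows. This is an illustration only, not the CFM prototype. The class `ConceptIndex`, its method names, and the example file names are hypothetical.

```python
from collections import defaultdict


class ConceptIndex:
    """Maps each concept to the set of files tagged with it (sketch)."""

    def __init__(self):
        self._members = defaultdict(set)  # concept -> set of file names

    def tag(self, file_name, *concepts):
        # A file may be a member of any number of concepts.
        for concept in concepts:
            self._members[concept].add(file_name)

    def all_of(self, *concepts):
        # Intersection: files that belong to every listed concept.
        sets = [self._members.get(c, set()) for c in concepts]
        return set.intersection(*sets) if sets else set()

    def any_of(self, *concepts):
        # Union: files that belong to at least one listed concept.
        return set().union(*(self._members.get(c, set()) for c in concepts))

    def without(self, include, exclude):
        # Difference: files in `include` that are not in `exclude`.
        return self._members.get(include, set()) - self._members.get(exclude, set())


index = ConceptIndex()
index.tag("report.pdf", "work", "2010")
index.tag("photo.jpg", "family", "2010")
index.tag("budget.xls", "work")

work_2010 = index.all_of("work", "2010")
work_or_family = index.any_of("work", "family")
personal_2010 = index.without("2010", "work")
```

Because each file name is stored once per concept rather than copied into several folders, a single file can appear in many views without being duplicated.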

    Rate-power-interference optimization in underlay OFDMA CRNs with imperfect CSI

    , Article IEEE Communications Letters ; Volume 21, Issue 7 , 2017 , Pages 1657-1660 ; 10897798 (ISSN) Robat Mili, M ; Musavian, L ; Ng, D. W. K ; Sharif University of Technology
    Abstract
    Achieving higher transmission rate while reducing transmission power and induced interference on neighboring receivers is deemed necessary for the advancement of future generation networks and is particularly challenging, since these directions could be conflicting in nature. This letter adopts a multiobjective optimization (MOOP) approach to settle the tradeoffs between these three conflicting objectives in orthogonal frequency-division multiple access-based cognitive radio networks. Besides, unlike most of the work in the literature that studied the imperfect channel side information (CSI) of the link between the secondary transmitter and the primary receiver to evaluate ergodic capacity,... 

    Cluster-based sparse topical coding for topic mining and document clustering

    , Article Advances in Data Analysis and Classification ; 2017 , Pages 1-22 ; 18625347 (ISSN) Ahmadi, P ; Gholampour, I ; Tabandeh, M ; Sharif University of Technology
    Abstract
    In this paper, we introduce a document clustering method based on Sparse Topical Coding, called Cluster-based Sparse Topical Coding. Topic modeling is capable of improving textual document clustering by describing documents via bag-of-words models and projecting them into a topic space. The latent semantic descriptions derived by the topic model can be utilized as features in a clustering process. In our proposed method, document clustering and topic modeling are integrated in a unified framework in order to achieve the highest performance. This framework includes Sparse Topical Coding, which is responsible for topic mining, and K-means that discovers the latent clusters in documents... 

    A generalized audio identification system using adaptive filters

    , Article 26th Iranian Conference on Electrical Engineering, ICEE 2018, 8 May 2018 through 10 May 2018 ; 2018 , Pages 1641-1646 ; 9781538649169 (ISBN) Yazdanian, S ; Sameti, H ; Alidoust, A ; Sharif University of Technology
    Abstract
    From searching music with smartphones to broadcast monitoring by radio channels, audio identification systems have been used increasingly in recent years. The design of such systems may differ when the problem domain changes, since each environment has its own conflicting constraints to consider, such as required speed and robustness to signal degradation. In this paper, a widely used audio identification system originally developed by Haitsma and Kalker is analyzed from a signal processing point of view, and the fingerprint (audio feature) extraction method is modified. By adding a flexible filter to the fingerprint extraction method, the original system can be tuned to work in different domains. In order...

    Cluster-based sparse topical coding for topic mining and document clustering

    , Article Advances in Data Analysis and Classification ; Volume 12, Issue 3 , 2018 , Pages 537-558 ; 18625347 (ISSN) Ahmadi, P ; Gholampour, I ; Tabandeh, M ; Sharif University of Technology
    Springer Verlag  2018
    Abstract
    In this paper, we introduce a document clustering method based on Sparse Topical Coding, called Cluster-based Sparse Topical Coding. Topic modeling is capable of improving textual document clustering by describing documents via bag-of-words models and projecting them into a topic space. The latent semantic descriptions derived by the topic model can be utilized as features in a clustering process. In our proposed method, document clustering and topic modeling are integrated in a unified framework in order to achieve the highest performance. This framework includes Sparse Topical Coding, which is responsible for topic mining, and K-means that discovers the latent clusters in documents... 

    Duality in bipolar triangular fuzzy number quadratic programming problems

    , Article Proceedings of the International Conference on Intelligent Sustainable Systems, ICISS 2017, 7 December 2017 through 8 December 2017 ; 19 June , 2018 , Pages 1236-1238 ; 9781538619599 (ISBN) Ghorbani Moghadam, K ; Ghanbari, R ; Mahdavi Amiri, N ; Sharif University of Technology
    Institute of Electrical and Electronics Engineers Inc  2018
    Abstract
    We discuss how to solve bipolar fuzzy quadratic programming problems, where the parameters are bipolar triangular fuzzy numbers, making use of linear ranking functions. Also, we explore some duality properties of bipolar triangular fuzzy number quadratic programming problem (BTFNQPP). © 2017 IEEE  

    Content based image retrieval using the knowledge of texture, color and binary tree structure

    , Article 2009 Canadian Conference on Electrical and Computer Engineering, CCECE '09, St. Johns, NL, 3 May 2009 through 6 May 2009 ; 2009 , Pages 999-1003 ; 08407789 (ISSN); 9781424435081 (ISBN) Mansoori, Z ; Jamzad, M ; Sharif University of Technology
    2009
    Abstract
    Content-based image retrieval is an important research field with many applications. In this paper we present a new approach for finding images similar to a given query in a general-purpose image database using content-based image retrieval. Color and texture are used as basic features to describe images. In addition, a binary tree structure is used to describe higher-level features of an image. It is used to keep information about separate segments of the images. The performance of the proposed system has been compared with the SIMPLIcity system using the COREL image database. Our experimental results showed that among the 10 image categories available in the COREL database, our system had a...

    On the uniform sampling of the web: An improvement on bucket based sampling

    , Article 2009 International Conference on Communication Software and Networks, ICCSN 2009, Macau, 27 February 2009 through 28 February 2009 ; 2009 , Pages 205-209 ; 9780769535227 (ISBN) Heidari, S ; Mousavi, H ; Movaghar, A ; Sharif University of Technology
    2009
    Abstract
    The Web is one of the biggest sources of information. The tremendous size, dynamicity, and structure of the Web have made information retrieval on the Web a challenging issue. Web Search Engines (WSEs) have started to help users with this matter. However, to perform more effectively, these applications always need current information about many characteristics of the Web. One way to determine these characteristics is statistical sampling of Web pages. In this kind of approach, instead of analyzing a large number of Web pages, a smaller and more uniform set of Web pages is used. This research attempts to analyze the presented methods for...

    Solving fuzzy quadratic programming problems based on ABS algorithm

    , Article Soft Computing ; Volume 23, Issue 22 , 2019 , Pages 11343-11349 ; 14327643 (ISSN) Ghanbari, R ; Ghorbani Moghadam, K ; Sharif University of Technology
    Springer Verlag  2019
    Abstract
    Recently, Ghanbari and Mahdavi-Amiri (Appl Math Model 34:3363–3375, 2010) gave the general compromised solution of an LR fuzzy linear system using the ABS algorithm. Here, using this general solution, we solve quadratic programming problems with fuzzy LR variables. We convert the fuzzy quadratic programming problem to a crisp quadratic problem by using the general solution of the fuzzy linear system. With this method, the crisp optimization problem has fewer variables than with other methods, especially when the coefficient matrix has full rank. Thus, solving the fuzzy quadratic programming problem by using our proposed method is computationally easier than the solving fuzzy quadratic...

    Performance evaluation of epidemic content retrieval in DTNs with restricted mobility

    , Article IEEE Transactions on Network and Service Management ; Volume 16, Issue 2 , 2019 , Pages 701-714 ; 19324537 (ISSN) Rashidi, L ; Entezari Maleki, R ; Chatzopoulos, D ; Hui, P ; Trivedi, K. S ; Movaghar, A ; Sharif University of Technology
    Institute of Electrical and Electronics Engineers Inc  2019
    Abstract
    In some applicable scenarios, such as community patrolling, mobile nodes are restricted to move only in their own communities. Exploiting the meetings of the nodes within the same community and the nodes within the neighboring communities, a delay tolerant network (DTN) can provide communication between any two nodes. In this paper, two analytical models based on stochastic reward nets (SRNs) are proposed to evaluate the performance of the epidemic content retrieval in such multi-community DTNs. Performance measures computed by the proposed models are the average retrieval delay and the average number of transmissions. The monolithic SRN model proposed in the first step is not scalable, in... 

    Fuzzy linear programming problems: models and solutions

    , Article Soft Computing ; Volume 24, Issue 13 , 2020 , Pages 10043-10073 Ghanbari, R ; Ghorbani Moghadam, K ; Mahdavi Amiri, N ; De Baets, B ; Sharif University of Technology
    Springer  2020
    Abstract
    We investigate various types of fuzzy linear programming problems based on models and solution methods. First, we review fuzzy linear programming problems with fuzzy decision variables and fuzzy linear programming problems with fuzzy parameters (fuzzy numbers in the definition of the objective function or constraints) along with the associated duality results. Then, we review the fully fuzzy linear programming problems with all variables and parameters being allowed to be fuzzy. Most methods used for solving such problems are based on ranking functions, α-cuts, using duality results or penalty functions. In these methods, authors deal with crisp formulations of the fuzzy problems. Recently,... 

    Using social annotations for search results clustering

    , Article 13th International Computer Society of Iran Computer Conference on Advances in Computer Science and Engineering, CSICC 2008, Kish Island, 9 March 2008 through 11 March 2008 ; Volume 6 CCIS , 2008 , Pages 976-980 ; 18650929 (ISSN); 3540899847 (ISBN); 9783540899846 (ISBN) Aliakbary, S ; Khayyamian, M ; Abolhassani, H ; Sharif University of Technology
    2008
    Abstract
    Clustering search results helps the user get an overview of the returned results and focus on the desired clusters. Most search result clustering methods use the title, URL, and snippets returned by a search engine as the source of information for creating the clusters. In this paper we propose a new method for search results clustering (SRC) that uses social annotations as the main source of information about web pages. Social annotations are high-level descriptions of web pages and, as the experiments show, clustering based on social annotations yields good clusters with informative labels. © 2008 Springer-Verlag

    Challenges in using peer-to-peer structures in order to design a large-scale web search engine

    , Article 13th International Computer Society of Iran Computer Conference on Advances in Computer Science and Engineering, CSICC 2008, Kish Island, 9 March 2008 through 11 March 2008 ; Volume 6 CCIS , 2008 , Pages 461-468 ; 18650929 (ISSN); 3540899847 (ISBN); 9783540899846 (ISBN) Mousavi, H ; Movaghar, A ; Sharif University of Technology
    2008
    Abstract
    One of the distributed solutions for scaling Web Search Engines (WSEs) may be peer-to-peer (P2P) structures. P2P structures are successfully being used in many systems with lower cost than ordinary distributed solutions. However, the fact that they can also be beneficial for large-scale WSEs is still a controversial subject. In this paper, we introduce challenges in using P2P structures to design a large-scale WSE. Considering different types of P2P systems, we introduce possible P2P models for this purpose. Using some quantitative evaluation, we compare these models from different aspects to find out which one is the best in order to construct a large-scale WSE. Our studies indicate that... 

    Removing noises similar to dots from persian scanned documents

    , Article ISECS International Colloquium on Computing, Communication, Control, and Management, CCCM 2008, Guangzhou, 3 August 2008 through 4 August 2008 ; Volume 2 , 2008 , Pages 313-317 ; 9780769532905 (ISBN) Shirali Shahreza, M. H ; Shirali Shahreza, S ; Sharif University of Technology
    2008
    Abstract
    Nowadays, computers are used in many aspects of human life. A consequence of computer use is electronic documents. Computers cannot understand written documents, so we need to convert written documents to electronic documents in order to process them with computers. One of the common methods for converting written text to electronic text is Optical Character Recognition (OCR). A lot of work has been done on English OCR, but Persian/Arabic OCR is still under development. One of the major problems in Persian/Arabic OCR is noise removal. Because dots are very important in the Persian and Arabic languages and are very similar to noise, noise removal from Persian/Arabic...