Search for:
ghasem-sani--gholamreza
Total 460 records
Implementation of a Statistical Persian-English Translator Prototype
, M.Sc. Thesis Sharif University of Technology ; Ghasem Sani, Gholamreza (Supervisor)
Abstract
Machine translation has been an important subject in the field of natural language processing (NLP). In recent years, as essential linguistic data resources have become available, statistical approaches have been deployed in machine translation. Although there have been several attempts to create English-to-Persian automatic translators, there has not been sufficient effort in the reverse direction. In this project, we reviewed previous work on machine translation for Persian and implemented a statistical machine translator from Persian to English. We needed a bilingual corpus for building the translator. For this purpose, we used a corpus of Ph.D. and M.Sc. abstracts in Persian and their translation...
Improving the Efficiency of Sat-Based Planning by Enhancing the Representations
, M.Sc. Thesis Sharif University of Technology ; Ghasem Sani, Gholamreza (Supervisor)
Abstract
Automated planning is a branch of artificial intelligence that studies intelligent agents’ decision-making process. The objective is to design agents that are able to decide on their own how to perform tasks that are assigned to them. In the past 20 years, a popular and appealing method for solving planning problems has been to use satisfiability (SAT) techniques. In this method, the planning problem with a preset length is encoded into a satisfiability problem, which is then solved by a general satisfiability solver. The solution to the planning problem is then extracted from the solution of the SAT problem. The length of the problem is proportional to the number of steps in...
Using Satisfiability in Solving Planning Problems having Numerical Values
, M.Sc. Thesis Sharif University of Technology ; Ghasem Sani, Gholamreza (Supervisor)
Abstract
Considering numerical values is an important step toward real-world problems in planning. Although the planning community has been aware of this fact for many years, the complexity involved in reasoning with numerical values has made this challenge too difficult; thus, very little and only occasional research has been done on this issue. This dissertation is an effort to find an efficient method for solving numerical planning problems; in this regard, we use the “planning as satisfiability” approach. Planning as satisfiability is one of the most important and successful approaches for solving planning problems. Furthermore, developing SAT solvers with the capability of considering numerical...
Event Extraction in Persian Texts By Learning Methods
, M.Sc. Thesis Sharif University of Technology ; Ghasem Sani, Gholamreza (Supervisor)
Abstract
Event extraction in texts is one of the main challenges of natural language processing. Event extraction is one of the necessary components of question answering, summarization, and information extraction systems. The purpose of this project has been the design and implementation of different statistical methods for event extraction in Persian, and also correcting and expanding an existing corpus named PresTimeBank. The new system is composed of a preliminary rule-based module that annotates events and finds their features based on a predefined set of rules. The result of this stage is then revised in a subsequent manual annotation process. The output is a corpus that is compliant with the ISO...
Persian Aspect-based Sentiment Analysis using Unsupervised Learning Methods
, M.Sc. Thesis Sharif University of Technology ; Ghasem-Sani, Gholamreza (Supervisor)
Abstract
Sentiment analysis is a subfield of natural language processing that aims at opinion mining to analyze the thoughts, orientation, and evaluation of users within some texts. Different organizations in multiple social domains use this approach as a tool to assess their strengths and shortcomings. In sentiment analysis, the goal is to use machine learning techniques to specify users’ positive or negative orientation toward a product or merchandise. The solution to this problem includes two main steps: extracting aspects and determining users’ positive or negative sentiments with respect to the aspects. Two main challenges of sentiment analysis in Farsi are the lack of comprehensive...
Design and Development of a Persian to English Translator Prototype
, M.Sc. Thesis Sharif University of Technology ; Ghasem Sani, Gholamreza (Supervisor)
Abstract
Increasing relations between different cultures necessitate easier and more affordable methods of translation between different languages. Hence, using computers as translators has been very attractive to many governmental and commercial organizations, as well as the scientific community, since the very beginning of the computer era. So far, different approaches to machine translation (MT) have been proposed. Two of the main approaches to MT are Statistical MT and Rule-Based MT. Unlike Statistical MT, which uses statistical information for translation, Rule-Based MT utilizes precise linguistic information to understand the source and generate the target language. This linguistic information is usually encoded as a...
Temporal Relation Extraction of Persian Texts by Learning Methods
, M.Sc. Thesis Sharif University of Technology ; Ghasem Sani, Gholamreza (Supervisor)
Abstract
To fully understand a text written in a natural language, we need to comprehend the events within that text. Temporal relation extraction has always been one of the main challenges in natural language processing at the semantic level. Temporal relation extraction makes the understanding and interpretation of text easier, and the extracted information can be used in many natural language systems, such as question answering, summarization, and information retrieval systems. Early research on temporal relation extraction was mainly on English and limited to rule-based systems. However, with the extension of English corpora and the availability of temporal corpora in other languages, more attention has...
Designing a Hybrid Approach to Persian-English Machine Translation
, M.Sc. Thesis Sharif University of Technology ; Ghasem Sani, Gholamreza (Supervisor)
Abstract
Nowadays, because of the growing web and the consequent increase in data in different languages, the need for machine translation is inevitable. Machine translators are created to speed up the translation process. Machine translation methods are generally divided into three categories: rule-based, corpus-based, and hybrid. Rule-based machine translation uses grammar for translation, but it needs a complete grammar of the language for correct translation. The corpus-based method has many variations. One of those variations is statistical machine translation, which uses probabilistic and statistical rules for translation and is nowadays frequently used. Hybrid machine translation benefits from the...
Persian Grammar Induction Based on a Dependency Corpus
, M.Sc. Thesis Sharif University of Technology ; Ghasem Sani, Gholamreza (Supervisor)
Abstract
Grammar induction is one of the research topics of natural language processing. Grammar induction methods can be categorized into three main groups: supervised, semi-supervised, and unsupervised methods. Recently, the development of treebanks in different languages has motivated supervised methods. The main goal of this project has been extracting a dependency grammar based on a dependency treebank. In a treebank, the structure of every sentence is represented as a dependency tree in which the relations between words are specified. In this structure, synonymous sentences with free word order have the same dependency structure. Because of this property, dependency parsers' accuracy does not decrease on Persian...
Cross-Lingual Sentiment Analysis of Persian Text Using Deep Learning
, M.Sc. Thesis Sharif University of Technology ; Ghasem-Sani, Gholamreza (Supervisor)
Abstract
One of the subfields of natural language processing is sentiment analysis. Generally, sentiment analysis analyzes the positivity or negativity of an opinion expressed in a sentence or document. Because each person's opinions have a huge impact on the decisions of other people and businesses, automatic analysis of texts is particularly important, and extensive research has been conducted on it in recent years. One of the common problems in sentiment analysis of some languages, including Persian, is the lack of proper resources. Cross-lingual sentiment analysis is one solution to this problem. In these methods, the goal is, through using the rich resources available in a source...
Using Machine Learning Approaches for Persian Pronoun Resolution
, M.Sc. Thesis Sharif University of Technology ; Ghasem Sani, Gholamreza (Supervisor)
Abstract
Coreference resolution is an essential step toward understanding discourse, and it is needed by many NLP tasks such as summarization, machine translation, and question answering. Pronoun resolution is a major and challenging subpart of coreference resolution, in which only the resolution of pronouns is considered. The existing coreference resolution approaches can be classified into two broad categories: linguistic and machine learning approaches. Linguistic approaches need a lot of linguistic information for the resolution process. Acquisition of such information is an error-prone and time-consuming process. In contrast, learning approaches need less linguistic information and provide...
A Hybrid Approach for Normalization of Non-Standard Persian Texts
, M.Sc. Thesis Sharif University of Technology ; Sameti, Hossein (Supervisor) ; Ghasem-Sani, Gholamreza (Co-Advisor)
Abstract
With the increase in internet usage and the volume of available data, the need for data mining and text processing has grown. One of the common obstacles to using these methods is the use of colloquial and non-standard language in writing. Because of this, and because NLP tasks in the Persian language have always faced data shortage issues, in this thesis we first collect and construct a parallel data set consisting of colloquial texts used in social media. Then, after examining various methods used in other languages for text normalization, we propose a combination of new hybrid methods, involving Statistical Machine Translation methodology with some modification, to normalize...
Improving the Performance of the Fast Downward Planning System
, M.Sc. Thesis Sharif University of Technology ; Ghasem Sani, Gholamreza (Supervisor)
Abstract
The so-called “Fast Downward” is a successful heuristic planner. This planner extracts informative data structures during planning. A number of efforts have been made to detect constraints for guiding search toward the goal, in the hope of speeding up the planning process. These constraints usually result in a number of sub-goals and a proper ordering among them. These sub-goals are called landmarks, which must be true at some point in every valid solution plan. Landmarks can be used to decompose a given planning task into several smaller sub-tasks. In this dissertation, a new method is proposed that extracts landmarks and recognizes their ordering based on Fast Downward basic data...
Temporal Planning using Satisfiability
, M.Sc. Thesis Sharif University of Technology ; Ghassem Sani, Gholamreza (Supervisor)
Abstract
Automated planning is an active research area in artificial intelligence. In classical planning, for simplicity, time is considered as the order of actions in a plan. In temporal planning, due to the importance of time in real-world problems, this simplifying assumption is not made, and time is explicitly used in the planning process. Most current methods for temporal planning are extensions of classical planning methods to include the explicit definition of time. Planning using satisfiability is used as an efficient method to find optimal solutions for classical planning problems. In this dissertation, a temporal planner based on satisfiability has been developed. This planner, as we...
Partial Order Planning using Machine Learning Techniques
, M.Sc. Thesis Sharif University of Technology ; Ghassem-Sani, Gholamreza (Supervisor)
Abstract
Automated planning is a branch of artificial intelligence that studies intelligent agents’ decision-making process. In planning, we can design agents that can decide on their own how to perform tasks that are assigned to them. In classical planning, there is a restrictive assumption that actions in plans are totally ordered. Relaxing this restrictive assumption gave rise to partial order planning. Partial order planning uses a general principle, called the least commitment principle, that results in better performance than other classical planning methods. Yet, this branch of planning cannot compete with newer planning methods like heuristic search planning. That is why there has...
Persian Grammar Induction Using Unsupervised Data Oriented Parsing
, M.Sc. Thesis Sharif University of Technology ; Ghassem Sani, Gholamreza (Supervisor)
Abstract
Automatic grammar induction is one of the attractive research topics in the field of natural language processing. Automatic grammar induction methods can be categorized into three main groups, supervised, semi-supervised, and unsupervised, based on the type of training data that they need. Unsupervised methods are more difficult than the other two. Data Oriented Parsing (DOP) is one of the successful methods in the unsupervised group. This method is trained on examples of language, much as a child is, and then parses new sentences based on what it has learned. The aim of this project is to determine and improve the performance of the UDOP method on Persian, a free-word-order language. Results of...
Towards Unsupervised Temporal Relation Extraction Between Events
, M.Sc. Thesis Sharif University of Technology ; Ghassem-Sani, Gholamreza (Supervisor)
Abstract
Temporal relation classification is one of the demanding contemporary tasks in natural language processing. This task can be used in various applications such as question answering, summarization, and language-specific information retrieval. Temporal relation classification methods can be categorized into three main groups, supervised, semi-supervised, and unsupervised, based on the type of training data that they need. In this thesis, we have two main goals: first, improving the accuracy of temporal relation learning, and second, reducing the supervision required by the algorithm as much as possible. To achieve these goals, three main steps are proposed. In the first step, we propose an improved...
Automatic Headline Generation for Persian News Texts
, M.Sc. Thesis Sharif University of Technology ; Ghassem-Sani, Gholamreza (Supervisor)
Abstract
News headlines should represent the main and most important topics of their stories. The task of selecting an appropriate headline for a news story is mainly done by journalists. The goal of this project has been the design and implementation of a system to automate this task, that is, generating headlines for news. This task has been done for Persian news stories. There are various methods for automatic headline generation in English and some other languages, but no work has yet been done for Persian. Thus, we have adopted some of the ideas from those methods and developed the rest ourselves. Our proposed method consists of three main parts: keyword extraction, most important...
Persian Aspect-based Sentiment Analysis Using Learning Methods
, M.Sc. Thesis Sharif University of Technology ; Ghassem Sani, Gholamreza (Supervisor)
Abstract
As digital content grows rapidly due to the internet, user reviews about different topics such as product quality can be used as a rich source to check and analyze product quality and performance. Automatic methods are widely used to extract this information because of the massive amount of available resources. Sentiment analysis is one of the important fields in natural language processing, which uses a combination of learning and rule-based methods to extract subjective information from documents. Aspect-based sentiment analysis deals with sentiment analysis based on each aspect of the product. It consists of two main steps: first, aspects should be extracted from the reviews and...
Representation Based Multi-hop Question Answering
, Ph.D. Dissertation Sharif University of Technology ; Ghassem Sani, Gholamreza (Supervisor)
Abstract
The Question-Answering (QA) problem has long been a significant focus of researchers. Its connection with natural language understanding and knowledge retrieval makes it one of the most critical issues in Natural Language Processing (NLP). Given the inefficiency of simple question-answering methods, multi-hop question-answering (Multi-hop QA) across multiple documents has become one of the most attractive problems in recent years. In general, multi-hop question-answering is supposed to answer natural language questions that require extracting and combining information contained in several documents and performing reasoning about that information. The ability to answer questions and perform...