Search for: text-mining
Total 25 records

    Pattern Based Relation Extraction on Persian News Articles

    , M.Sc. Thesis Sharif University of Technology Cholmaghani Qaheh, Ali (Author) ; Bahrani, Mohammad (Supervisor) ; Sameti, Hossein (Co-Advisor)
    Abstract
    Relation extraction is a main task in information extraction. There are two main approaches in this field: rule-based and statistical. This thesis applies a rule-based relation extraction approach. In this research we tried to recognize Persian syntactic and morphological patterns to extract relations between named entities. First, we annotated a news dataset of more than 100 thousand tokens with person, organization, and location named-entity tags. We then found 1037 relations and 2197 candidate relations. Candidate and labeled relations are extracted between two entities that are located in a clause. These relations are "PERS_PERS-COMMENTING",... 

    Text Mining in Biological data for Protein-Protein Interaction

    , M.Sc. Thesis Sharif University of Technology Taheri, Nooshin (Author) ; Ghorshi, Ali (Supervisor) ; Kavousi, Kaveh (Supervisor)
    Abstract
    Decades ago, scientists and researchers found that proteins do not function in isolation but act in multi-protein complexes as complex networks. They therefore began to study proteins and their interactions in terms of protein-protein interaction, and the number of publications in this field has grown rapidly. This large body of published articles (in scientific journals, web pages, or books) is unstructured and hard to classify manually. Studying and reading all of these documents is also difficult for one person. Hence, it is better to find a way to help scientists and researchers study this unstructured or semi-structured information more easily. The best way to... 

    Aspect-based Opinion Mining for Product Reviews

    , M.Sc. Thesis Sharif University of Technology Ezami, Sahba (Author) ; Beigy, Hamid (Supervisor)
    Abstract
    Other people’s opinions are an important piece of information for making informed decisions. With the expansion of internet use, the web has become an excellent source of users’ viewpoints in different domains. However, the growing volume of opinionated text on the one hand, and the complexity caused by contrasting user opinions on the other, make it almost impossible to read all of these reviews and make an informed decision. These needs have inspired a new line of research on mining user reviews, which is called “opinion mining”. One of the most challenging sub-problems of opinion mining is “aspect-based opinion mining”. The goal of aspect-based opinion mining is to extract different... 

    Online Big Data Analytics in Tourism Supply Chain

    , M.Sc. Thesis Sharif University of Technology Khorsand, Ramina (Author) ; Rafiee, Majid (Supervisor) ; Kayvanfar, Vahid (Supervisor)
    Abstract
    User-generated data on TripAdvisor.com contains a considerable amount of useful information that can help managers provide better services to their customers. In this study, reviews of all hotels in Tehran, Iran and Auckland, New Zealand on TripAdvisor.com are selected as real data and scraped using the Java programming language. Information about hotels (e.g. name of hotel, overall rating, hotel’s class, total reviews, and hotel’s amenities) and information about the reviews and reviewers (e.g. date of review, country of the reviewer, contribution rate, rate to the hotel, text of review, date of stay, trip type, and years in TripAdvisor) are extracted as well. 64 and 190 hotels of... 

    Stock Market Prediction Based on Analysis of Textual and Numerical Data

    , M.Sc. Thesis Sharif University of Technology Taleb, Mohsen (Author) ; Akhavan Niaki, Taghi (Supervisor)
    Abstract
    Unstructured data are an important resource in data mining that, in spite of their large volume, have not been analyzed much. Natural language data are a typical kind of unstructured data: humans can easily understand them, but machines normally cannot process this kind of data. To make these data usable for prediction, pre-processing is required to prepare them for feeding into machine learning algorithms. Therefore, feature extraction is needed to derive representative features from texts that can unveil the hidden patterns. In this study, in addition to the variables extracted from the technical indicators, the texts from Telegram channels... 

    News Text Mining for Gold Price Prediction

    , M.Sc. Thesis Sharif University of Technology Farzam, Mohammad Sina (Author) ; Izadi, Mohammad (Supervisor)
    Abstract
    Textual news published in the media on a daily basis is a large and valuable source of unstructured data that can be used to analyze and model the financial market by using text mining methods. The purpose of this study is to design a news-reading system for economic analysis and modeling of gold prices using features extracted from textual news and text mining methods; it seeks to enable the machine to read news like a financial analyst and then analyze and forecast the economic situation and market trends. For this purpose, we collected news from the website of an Iranian economic news agency. To design the economic analyzer, we extracted important economic, political, and social factors from... 

    Design of Decision Support System for Stock Exchange Using Text Mining Techniques

    , M.Sc. Thesis Sharif University of Technology Taheri Nastooh, Ali (Author) ; Habibi, Jafar (Supervisor)
    Abstract
    Today, the stock market has a significant impact on a country's economy and increases the country's gross production. Many factors such as financial statements, political and economic changes can significantly impact the market capitalization. Working in the stock market requires specialized knowledge, but the attractiveness of this market attracts even non-experts. These people are looking for easy solutions to invest in this market. Due to the dynamics of this market and various factors which affect the price, it is difficult to predict stock price through raw price data. In addition, today, easy access to social media, sharing opinions and ideas related to various topics has become very... 

    A novel algorithm for using GA in concept weighting for text mining

    , Article WSEAS Transactions on Computers ; Volume 5, Issue 12 , 2006 , Pages 2992-2999 ; 11092750 (ISSN) Zaefarian, R ; Akhgar, B ; Siddiqi, J. I ; Zaefarian, G ; Gruzdz, A ; Ihnatowicz, A ; Sharif University of Technology
    2006
    Abstract
    The importance of a good weighting methodology in information retrieval methods - the method that affects the most useful features of a document or query representative - is examined. Many information retrieval systems regard feature weighting as being of minor importance compared to finding the features, and the experiments confirm this. Different term-weighting methods, such as TF*IDF and Information Gain Ratio, have been used in information retrieval systems; the paper provides a brief review of the related literature. This paper explores using GA for concept weighting, which is a novel application in the field of text mining. It proposes... 

    Election vote share prediction using a sentiment-based fusion of Twitter data with Google trends and online polls

    , Article 6th International Conference on Data Science, Technology and Applications, DATA 2017, 24 July 2017 through 26 July 2017 ; 2017 , Pages 363-370 ; 9789897582554 (ISBN) Kassraie, P ; Modirshanechi, A ; Aghajan, H. K ; Institute for Systems and Technologies of Information, Control and Communication (INSTICC) ; Sharif University of Technology
    SciTePress  2017
    Abstract
    It is common to use online social content for analyzing political events. Twitter-based data by itself is not necessarily a representative sample of the society due to non-uniform participation. This fact should be taken into account when predicting real-world events from social media trends. Moreover, each tweet may bear a positive or negative sentiment towards the subject, which also needs to be considered. By gathering a large dataset of more than 370,000 tweets on the 2016 US elections and carefully validating the resulting key trends against Google Trends, a legitimate dataset is created. A Gaussian process regression model is used to predict the election outcome; we bring in the novel idea of... 

    Prediction of Hotel Customers’ Revisit Behavior by Determining the Appropriate Marketing Mix Using Customer Review Analysis

    , M.Sc. Thesis Sharif University of Technology Marandi, Ali Akbar (Author) ; Najmi, Manoochehr (Supervisor) ; Tasavori, Misagh (Supervisor)
    Abstract
    In recent years, the tourism industry has become one of the most influential industries in the income generation of countries and has attracted the attention of researchers. Identifying the important features of a hotel from the users' point of view is one of the areas that have been considered, while the segmentation of hotel customers based on the extracted features has received less attention from researchers, and experts have felt the need for more research in this field. Given that travelers' desire to return to a hotel has always been one of the factors affecting the financial performance of hotels, the factors affecting it are of great importance. It can be very valuable... 

    Contextual Data Analysis in Online Hotel Businesses

    , M.Sc. Thesis Sharif University of Technology Kookhahi, Ahmad (Author) ; Rafiee, Majid (Supervisor)
    Abstract
    In this study we intend to build a recommender system; more specifically, we try to build a multi-criteria collaborative filtering system. Collaborative filtering is one of the methods used in building recommender systems. In this study, we use technical attributes to build a recommender system. Technical attributes refer to the attributes which focus on the writing style of the texts. After building the recommender system based on technical attributes, we also build a recommender system based on the conventional criteria in order to make a comparison between these two criteria. Collaborative filtering consists of two major categories, namely memory-based and model-based, both of which have been... 

    News-based Stock Forecasting Using Text Mining Methods

    , M.Sc. Thesis Sharif University of Technology Ashtiani, Mohammad Hossein (Author) ; Rafiee, Majid (Supervisor)
    Abstract
    Predicting the trend of stock prices is always one of the concerns that stock market analysts and investors face, and it plays a critical role in maximizing the profit from investing in stocks. Past stock price charts, raw material prices, the value of the company's assets, the impact of global markets, the company's products, the organization's development plans, and other similar factors influence the stock price. News reports are an important source of information for people. Recently, a lot of research has been done to examine the impact of news on stock price trends. This research has two phases. In the first phase, a text mining method is presented which is using... 

    Personal Name Disambiguation in Persian Written News

    , M.Sc. Thesis Sharif University of Technology Saneei, Sara (Author) ; Sameti, Hossein (Supervisor)
    Abstract
    Diverse personal names are mentioned in everyday news, but news agencies do not separate entities with the same or identical names. This can make irrelevant news appear when searching for an ambiguous name. Personal Name Disambiguation in news seeks to partition a significant amount of news into distinct classes, each of which belongs to a single entity in the real world. In this thesis, which to the researcher's knowledge is the first of its kind at least in Persian, the researcher used the FarsiYar News Dataset, specifically 50,000 news items from the FarsNews dataset that were published in the year 1397. First of all, a database was built using these news data and then the unstructured news were... 

    Snappfood UGC Classification Using Machine Learning and Comparison of SVM and NB Methods

    , M.Sc. Thesis Sharif University of Technology Honarvar, Mohsen (Author) ; Najmi, Manoochehr (Supervisor)
    Abstract
    One way for businesses to grow and compete, in any age (especially the digital age), is to create brand relevance through creating or finding, and then owning, new categories or subcategories. In this way, instead of beating competitors, firms make them irrelevant by enticing customers to buy a new category or subcategory for which other alternative brands are not considered relevant. Firms traditionally rely on interviews and focus groups to identify these subcategories and customer needs. Nowadays, with the growth of social media, user-generated content (UGC) is also a good alternative source. However, due to the large size of UGC and the non-informative or repetitive data it contains,... 

    Design a Recommender System for Purchasing Cosmetics using Text Mining Methods

    , M.Sc. Thesis Sharif University of Technology Ramezani Khozestani, Fatemeh (Author) ; Rafiee, Majid (Supervisor)
    Abstract
    In recent years, the cosmetics industry has dramatically grown in e-commerce. In e-commerce platforms, where multiple choices are available, an efficient recommender system is required to sort, order, and effectively transfer relevant content or product information to users. Recommender systems have attracted a lot of attention from retailers because they provide consumers with a personalized shopping experience. With technological advancements, this branch of artificial intelligence exhibits great potential in imaging, analysis, classification, and segmentation. Despite the great potential, the academic articles in this field are limited. Therefore, we conducted research in this context, in... 

    Constructing Brand Perceptual Maps from Consumer Reviews of Online Shops Using Machine Learning

    , M.Sc. Thesis Sharif University of Technology Ghadamyari, Mostafa (Author) ; Najmi, Manoochehr (Supervisor)
    Abstract
    A brand perceptual map is a practical tool for visualizing the position of a brand and its competitors in the minds of customers. In traditional ways of building a brand perceptual map, the researcher identifies important aspects of the product and designs a questionnaire to measure the scores of different brands and gather the required information from users. With the increasing development of online shopping, users have been voluntarily registering their reviews in online shops, in a free and unstructured manner, and have created valuable data sources in these online stores. Due to the large number of registered reviews, processing them requires automated, computer-based methods. In... 

    Proposing a Hybrid Approach based on Deep Learning Algorithms for Stock Market Prediction

    , M.Sc. Thesis Sharif University of Technology Mobasseri, Niloofar (Author) ; Khedmati, Majid (Supervisor)
    Abstract
    Nowadays, stock price prediction is known as one of the most challenging activities in the financial field. Research in price prediction models in financial markets, despite its many challenges, is still one of the most active areas for research. The price of non-linear financial assets is dynamic and unpredictable. Therefore, it is very difficult to arrange and predict financial time series. Recently, many studies have demonstrated that checking the news published in relation to a stock can significantly improve the accuracy of the prediction model. Among the latest techniques available for stock price prediction, we can mention deep learning models, which due to their high ability to... 

    A New Approach in Text Analysis in Order to Improve the Process of Gaining Information from Customer Reviews

    , M.Sc. Thesis Sharif University of Technology Partovizadeh Benam, Aylar (Author) ; Akhavan Niaki, Taghi (Supervisor)
    Abstract
    What people write on web pages or social media about their experience with a product they have used or a service they have received can greatly influence the reputation and popularity of a brand. If the reviews that exist about a product or a service of a certain company are mainly positive, they can increase the profit and improve the image of the company. On the other hand, mostly negative reviews can decrease the profit and destroy a company's image irreversibly. Unfortunately, because of this great influence that online reviews have over the general public's decision to use a product or a service of a brand, some companies hire people to write undeserving positive... 

    Collecting positive instances of "instance-of" relationship in the Persian language

    , Article ICECT 2010 - Proceedings of the 2010 2nd International Conference on Electronic Computer Technology, 7 May 2010 through 10 May 2010, Kuala Lumpur ; May , 2010 , Pages 46-49 ; 9781424474059 (ISBN) Rastegari, Y ; Abolhassani, H ; Zibanezhad, B ; Sayadiharikandeh, M ; Sharif University of Technology
    2010
    Abstract
    Fetching lexico-syntactic patterns from text relies on pairs of words (positive instances) that represent the target relation, and on finding their simultaneous occurrence in a text corpus. Thanks to the existence of the WordNet thesaurus (which contains the semantic relationships between words), collecting positive instances is easy. In non-English languages, it is hard to collect a large number of positive instances in various contexts. We investigated some new ideas for collecting them in the Persian language, and finally ran the best one and collected approximately 6,000 positive instances.

    Persian sentiment lexicon expansion using unsupervised learning methods

    , Article 9th International Conference on Computer and Knowledge Engineering, ICCKE 2019, 24 October 2019 through 25 October 2019 ; 2019 , Pages 461-465 ; 9781728150758 (ISBN) Akhoundzade, R ; Hashemi Devin, K ; Sharif University of Technology
    Institute of Electrical and Electronics Engineers Inc  2019
    Abstract
    Sentiment analysis is a subfield of natural language processing that aims at opinion mining to analyze the thoughts, orientation, and evaluation of users within some texts. The solution to this problem includes two main steps: extracting aspects and determining users' positive or negative sentiments with respect to the aspects. Two main challenges of sentiment analysis in the Persian language are the lack of comprehensive tagged data sets and the use of colloquial language in texts. In this paper, we propose a system to specify and extract sentiment words using unsupervised methods in the Persian language that also supports colloquial words. Additionally, we proposed and implemented a state-of-the-art...