Stock Market Prediction Based on Analysis of Textual and Numerical Data

Taleb, Mohsen | 2020

  1. Type of Document: M.Sc. Thesis
  2. Language: Farsi
  3. Document No: 53398 (01)
  4. University: Sharif University of Technology
  5. Department: Industrial Engineering
  6. Advisor(s): Akhavan Niaki, Taghi
  7. Abstract:
  8. Unstructured data is an important resource in data mining, yet despite its large volume it has received comparatively little analysis. Natural language is a typical kind of unstructured data: humans understand it easily, but machines normally cannot process it directly. To make such data usable for prediction, pre-processing is required before it is fed into machine learning algorithms, and feature extraction is needed to derive representative features from the texts that can reveal hidden patterns. In this study, in addition to variables extracted from technical indicators, texts from Telegram channels related to the Tehran stock market are used to improve the forecast of each stock's price movement over the next three days. To prepare these data for the algorithms, we cluster the texts into five groups and extract the number of texts published per day in each cluster. We also take the polarity of these texts into account; the polarity is determined by a dictionary and by machine learning algorithms. Finally, a graph showing the relationships between stocks is built from the textual data. Six communities are extracted from the graph, and the stocks within each community are highly correlated. Based on these communities, a new index is defined that captures the movement of related stocks and helps the prediction. A number of machine learning algorithms are used to measure the improvement contributed by the variables obtained from the textual and numerical data, and the results are analyzed.
  9. Keywords:
  10. Data Mining ; Text Mining ; Feature Extraction ; Prediction ; Stock Market ; Data Analysis ; Unstructured Data
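
The per-day, per-cluster text counts described in the abstract could be sketched roughly as follows. This is only an illustration under stated assumptions, not the thesis's implementation: the function names, the input layout (pairs of day and cluster id), and the toy data are all hypothetical, the cluster ids are assumed to come from a clustering step not shown here, and the three-day label is one possible reading of "price movement over the next three days".

```python
from collections import defaultdict
from datetime import date

N_CLUSTERS = 5  # the abstract mentions five text clusters


def daily_cluster_counts(messages, n_clusters=N_CLUSTERS):
    """Return {day: [number of messages in cluster 0, ..., cluster n-1]}.

    `messages` is an iterable of (day, cluster_id) pairs. The cluster ids
    are assumed to come from an earlier clustering step (hypothetical here).
    """
    counts = defaultdict(lambda: [0] * n_clusters)
    for day, cluster_id in messages:
        counts[day][cluster_id] += 1
    return dict(counts)


def movement_labels(closes, horizon=3):
    """Label day i with 1 if the close `horizon` days later is higher, else 0.

    This is an assumed definition of the prediction target, not one taken
    from the thesis.
    """
    return [
        int(closes[i + horizon] > closes[i])
        for i in range(len(closes) - horizon)
    ]


# Made-up toy data, purely for illustration.
messages = [
    (date(2020, 1, 4), 0),
    (date(2020, 1, 4), 2),
    (date(2020, 1, 4), 2),
    (date(2020, 1, 5), 4),
]
features = daily_cluster_counts(messages)
labels = movement_labels([100, 102, 101, 99, 105])
```

Each day's count vector could then be joined with technical-indicator variables and polarity scores to form one feature row per stock and day.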
