
Online Big Data Analytics in Tourism Supply Chain

Khorsand, Ramina | 2020

  1. Type of Document: M.Sc. Thesis
  2. Language: Farsi
  3. Document No: 53100 (01)
  4. University: Sharif University of Technology
  5. Department: Industrial Engineering
  6. Advisor(s): Rafiee, Majid; Kayvanfar, Vahid
  7. Abstract:
  8. User-generated data on TripAdvisor.com contains a considerable amount of useful information that can help managers provide better services to their customers. In this study, reviews of all hotels in Tehran, Iran, and Auckland, New Zealand, on TripAdvisor.com are selected as real data and scraped using the Java programming language. In addition, information about the hotels (e.g., hotel name, overall rating, hotel class, total number of reviews, and amenities) and about the reviews and reviewers (e.g., date of review, reviewer’s country, contribution rate, rating given to the hotel, review text, date of stay, trip type, and years on TripAdvisor) is extracted as well. In total, 64 hotels in Tehran and 190 hotels in Auckland had profiles on TripAdvisor, with 4,736 and 55,458 scraped reviews, respectively. The study consists of two main parts. The first part is a quantitative analysis of the data, in which eight supervised machine learning models (K-nearest neighbors (KNN), Naïve Bayes, decision tree, logistic regression, support vector machine, neural network, random forest, and gradient boosting) are applied to the data to select the best method for predicting a new user’s rating of a specific hotel. The KNN algorithm, which uses similarity and distance measures for classification, is selected as the best method on the basis of comprehensive comparisons and statistical analysis, and a data-based sensitivity analysis is then applied to this best model. The second part is text mining of the reviews to better understand the reviewers’ opinions about these 254 hotels. In this regard, basic text analysis, text visualization, sentiment analysis, and Latent Dirichlet Allocation (LDA) are applied to the review text. Since this study investigates an extensive data set covering all hotels in two cities over all time, some valuable managerial insights are presented.
  9. Keywords:
  10. Big Data; Data Mining; Supply Chain; Tourism; K-Nearest Neighbor Method; Machine Learning; Text Mining
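
The abstract describes KNN as a classifier based on similarity and distance measures. As an illustration only, the idea can be sketched in a few lines of Python. This is not the thesis implementation: the function name `knn_predict`, the two features (hotel overall rating and hotel class), and every data point below are made-up assumptions chosen to keep the example small.

```python
import math
from collections import Counter


def knn_predict(train, query, k=3):
    """Return the majority label among the k training points nearest to `query`.

    `train` is a list of (feature_tuple, label) pairs; distance is Euclidean.
    """
    # Rank training points by distance to the query and keep the k closest.
    neighbors = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    # Count the labels of those neighbors and return the most frequent one.
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]


# Hypothetical toy data: (hotel overall rating, hotel class) -> rating given in a review.
train = [
    ((4.5, 5.0), 5),
    ((4.0, 4.0), 5),
    ((4.5, 4.0), 4),
    ((3.0, 3.0), 3),
    ((2.5, 2.0), 2),
    ((3.0, 2.0), 3),
]

prediction = knn_predict(train, (4.4, 4.5), k=3)
print(prediction)
```

In practice the features would need scaling so that no single attribute dominates the distance, and `k` would be chosen by validation; the sketch leaves both out for brevity.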
