Incremental Representative Words Extraction of Persian Weblogs with Change of Theme Detection Using Graph Approach
M.Sc. Thesis, Sharif University of Technology; hodsi, Mohammad (Supervisor)
Abstract
Although dimension reduction techniques for text documents can be used to preprocess blogs, they are more effective when they account for the nature of blogs. In this project we propose a novel algorithm called PostRank, which uses a shallow approach to identify the theme of the blog, or the blog's representative words, in order to reduce the dimensions of blogs. PostRank uses a graph-based syntactic representation of the weblog that takes some of the weblog's structural features into account. In the first step, it models the blog as a complete graph, treating the theme of the blog as a query submitted to a search engine such as Google and each post as a search result. It tries to rank the...
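The abstract is cut off before it describes how PostRank ranks posts, so the following is only a minimal sketch of the general idea, not the thesis's algorithm. It assumes that posts are nodes of a complete graph, that edges are weighted by cosine similarity of word counts, that posts are scored with a PageRank-style iteration, and that words are scored by rank-weighted frequency. The names `rank_posts` and `representative_words`, the damping factor, and the similarity measure are all hypothetical choices made for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    # Unicode-aware word split; a real system would need a proper Persian tokenizer.
    return re.findall(r"\w+", text.lower())


def cosine(a, b):
    # Cosine similarity between two bag-of-words Counters.
    dot = sum(a[t] * b[t] for t in a if t in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def rank_posts(posts, damping=0.85, iterations=50):
    # ASSUMPTION: complete graph over posts, edge weight = cosine similarity.
    bags = [Counter(tokenize(p)) for p in posts]
    n = len(bags)
    if n == 0:
        return [], []
    w = [[cosine(bags[i], bags[j]) if i != j else 0.0 for j in range(n)]
         for i in range(n)]
    out = [sum(row) for row in w]
    scores = [1.0 / n] * n
    # ASSUMPTION: PageRank-style update; posts with no similar neighbours
    # pass on no score, so the scores need not sum to 1.
    for _ in range(iterations):
        scores = [
            (1 - damping) / n
            + damping * sum(scores[i] * w[i][j] / out[i]
                            for i in range(n) if out[i] > 0)
            for j in range(n)
        ]
    return scores, bags


def representative_words(posts, top_k=5):
    # ASSUMPTION: a word's weight is its frequency weighted by each post's rank.
    scores, bags = rank_posts(posts)
    totals = Counter()
    for score, bag in zip(scores, bags):
        for term, count in bag.items():
            totals[term] += score * count
    return [term for term, _ in totals.most_common(top_k)]


posts = [
    "graph ranking of weblog posts",
    "weblog posts ranking with graph",
    "my cat sat on the mat",
]
scores, _ = rank_posts(posts)
top_words = representative_words(posts, top_k=4)
```

In this toy input the off-topic third post shares no words with the others, so it receives the lowest score and contributes little to the selected words. Keeping only the top-ranked words is one way such a ranking could reduce the dimensionality of a blog's representation.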