
Sampling in Large-Scale Complex Networks

Salehi, Mostafa | 2012

  1. Type of Document: Ph.D. Dissertation
  2. Language: Farsi
  3. Document No: 43636 (19)
  4. University: Sharif University of Technology
  5. Department: Computer Engineering
  6. Advisor(s): Rabiei, Hamid Reza
  7. Abstract:
  8. Many real-world communication systems such as the Internet, online social networks, and brain networks can be modeled as a complex network of interacting dynamical nodes. These networks have non-trivial topological features, i.e., features that do not occur in simple networks such as lattices or random networks. The tremendous growth of the Internet and its applications in recent years has resulted in the creation of large-scale complex networks involving tens or hundreds of millions of nodes and links. It may therefore be impossible or costly to obtain a complete picture of these large networks, and sampling methods are essential for practical estimation of network properties. In this thesis, we focus on the use of sampling to estimate network properties from incomplete (sampled) data. In the first part, we propose a framework to measure nodal characteristics in an arbitrary directed network. To this end, we introduce a personalized PageRank-based algorithm to sample nodes. Comprehensive theoretical and empirical analysis demonstrates that it is nearly unbiased even in situations where the stationary distribution of PageRank is poorly approximated. In the second part, we propose a novel link-tracing algorithm that considers community structures in the process of network sampling. Empirical studies on several synthetic and real-world networks show that the proposed method improves the performance of network sampling compared to the popular link-based sampling methods in terms of accuracy and the number of visited communities. In the third part, we propose a diffusion-aware sampling method which uses the infection times (as local information) to explore an information diffusion network. Our empirical analysis demonstrates that, on average, the proposed framework outperforms the common sampling methods in terms of link-based characteristics.
  9. Keywords:
  10. Complex Network ; Social Networks ; Emission ; Sampling ; Estimating
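
The abstract's core idea, estimating a network property from sampled nodes rather than from the full graph, can be illustrated with a minimal sketch. This is not any of the dissertation's algorithms: it is a plain random walk on a small undirected graph, a standard link-tracing baseline, with the usual re-weighting correction. The toy graph `adj` and the function names `random_walk_sample`, `naive_mean_degree`, and `reweighted_mean_degree` are hypothetical and chosen only for illustration.

```python
import random

def random_walk_sample(adj, start, steps, rng):
    """Walk `steps` hops from `start`, returning the visited nodes (with repeats)."""
    node = start
    visited = []
    for _ in range(steps):
        node = rng.choice(adj[node])  # move to a uniformly chosen neighbor
        visited.append(node)
    return visited

def naive_mean_degree(adj, walk):
    """Plain average of sampled degrees; biased toward high-degree nodes."""
    return sum(len(adj[v]) for v in walk) / len(walk)

def reweighted_mean_degree(adj, walk):
    """Weight each visit by 1/degree to undo the walk's degree bias."""
    # On a connected, non-bipartite undirected graph, a simple random walk
    # visits node v with long-run frequency proportional to deg(v).
    inverse_degree_sum = sum(1.0 / len(adj[v]) for v in walk)
    return len(walk) / inverse_degree_sum

# Hypothetical toy graph (assumption): a triangle 0-1-2 with two extra
# leaves attached to node 0. It has 5 nodes and 5 edges, so the true
# mean degree is 2 * 5 / 5 = 2.0.
adj = {
    0: [1, 2, 3, 4],
    1: [0, 2],
    2: [0, 1],
    3: [0],
    4: [0],
}

rng = random.Random(0)
walk = random_walk_sample(adj, start=0, steps=50_000, rng=rng)
naive = naive_mean_degree(adj, walk)
corrected = reweighted_mean_degree(adj, walk)
print(f"naive estimate:       {naive:.3f}")
print(f"re-weighted estimate: {corrected:.3f}")
```

In this sketch the naive average overestimates the mean degree because the walk visits the hub more often, while the re-weighted estimate approaches the true value of 2.0 as the walk lengthens. Directed networks, which the first part of the thesis addresses, need a different correction because the stationary distribution of a walk is no longer proportional to degree.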
