- Type of Document: Ph.D. Dissertation
- Language: Farsi
- Document No: 58392 (19)
- University: Sharif University of Technology
- Department: Computer Engineering
- Advisor(s): Beigy, Hamid; Abam, Mohammad Ali
- Abstract:
- In many real-world problems, graphlets in graph theory and simplets in simplicial complexes (SCs) are essential building blocks for analyzing topological structures. Graphlet analysis, which involves calculating the frequency distributions of graphlets, poses significant challenges in large networks. Two crucial concepts in this context are the graphlet frequency distribution (GFD) vector for entire networks and the graphlet degree vector (GDV) for individual nodes. Although previous studies have attempted to approximate the GFD and GDV, little attention has been given to determining the sample complexity required to achieve specific error bounds and confidence levels. This article addresses this gap by investigating the sample complexity needed to accurately approximate the GFD and GDV using sampling-based algorithms. We provide upper bounds for their sample complexity and extend our findings to graphlet degree centrality (GDC). We propose algorithms for identifying the top $k$ most frequent graphlet types and the $k$ most topologically central vertices in networks. Similarly, simplets, the fundamental components of SCs, play a crucial role in understanding the structure of SCs. While prior research has mainly focused on counting or approximating the number of simplets, analyzing their frequency distribution is more practical for large-scale SCs. This article introduces the Simplet Frequency Distribution (SFD) vector, which facilitates the analysis of simplet frequencies within SCs. We also present a method for approximating the SFD vector using uniform sampling-based algorithms, establishing bounds on the sample complexity required for accurate approximation. Additionally, we extend the concept of simplet frequency distribution to simplices by introducing the Simplet Degree Vector (SDV) and Simplet Degree Centrality (SDC) for every simplex in SCs. We provide bounds for the sample complexity necessary to approximate the SDV and SDC and demonstrate the effectiveness of our methods through algorithms for approximating the SFD, geometric SFD, SDV, and SDC. The theoretical results are validated through experiments on both random and real-world networks and SCs, showing that the number of required samples is independent of the size of the underlying structure.
- Keywords:
- Graphlet-Based Method ; Graphlet Degree Centrality (GDC) ; Graphlet Degree Vector (GDV) ; Simplet Degree Vector (SDV) ; Simplet ; Simplet Frequency Distribution (SFD) ; Graphlet Frequency Distribution (GFD) ; Simplet Degree Centrality (SDC)
Table of Contents
- Introduction
- Background
- Previous Work
- Structural Signature of Graphlets and Simplets
- Structural Signature for Vertices and Simplices
- Graphlet and Simplet Degree Centrality
- Case Study and Experimental Evaluation
- Summary and Conclusion
- References
- Glossary
