Network Analysis


Network Science

Networks have become a worthy modeling language for diverse phenomena in nature and society. A broad interdisciplinary field — under the name network science — has emerged as a result. Network science seeks to understand the structure and behavior of complex networks and to reason about dynamic processes that take place on them. At SCADS, we are primarily interested in the synergetic use of graph-theoretic and matrix algorithms in network analysis.


Maximum Clique Algorithms and Applications

Finding tightly connected subnetworks is a fundamental problem in network analysis. A special case is the problem of finding the largest clique — a subnetwork in which every node is connected with every other node. The maximum clique problem is a classical NP-hard (also to approximate) in graph theory. In recent work, in collaborations with colleagues, we have developed a fast, branch-and-bound type maximum clique finding algorithm that has proven effective on large sparse, networks. In a follow-up work we further improved the clique finder’s performance and utility and applied it to analyze various social and information networks, to compute strongly connected components in temporal networks, and to compress large graphs.


ScaWL: Scaling Weisfeiler–Leman Refinement

The k-Weisfeiler–Leman (k-WL) refinement process is widely used to generate powerful graph structural embeddings and to perform isomorphism tests. However, its memory demands have limited its use on large graphs. ScaWL overcomes this bottleneck with a cache-conscious, partitioned implementation that runs entirely in-core on shared- and distributed-memory systems, reducing memory requirements by orders of magnitude—all while matching the accuracy of standard k-WL.


Community Detection

Communities reveal mesoscale structure in social, biological, and information networks. SCADS developed a direction-optimizing label-propagation framework that alternates push- and pull-style traversals to minimise message traffic while maximising cache and NUMA locality. The implementation scales from a laptop to clusters, doubles traversal throughput on billion-edge graphs, and consistently yields higher-modularity clusters than standard label propagation—making web-scale community detection a practical routine.


Graph Query Performance Prediction 

Query performance prediction has shown benefits to query optimization and resource allocation for relational databases. Emerging applications are leading to search scenarios where workloads with heterogeneous, structure-less analytical queries are processed over large-scale graph and network data. This calls for effective models to predict the performance of graph analytical queries, which are often more involved than their relational counterparts. In a recent effort, we have begun to study and evaluate techniques for graph query performance prediction.


Papers

  • R.A. Rossi, D.F. Gleich and A.H. Gebremedhin, Parallel Maximum Clique Algorithms with Applications to Network Analysis, SIAM Journal on Scientific Computing, Vol 37, Issue 5, pages C589-C618, 2015.
    Abstract    Paper in PDF
  • B. Pattabiraman, M.M.A Patwary, A.H. Gebremedhin, W.K. Liao, A. Choudhary, Fast Algorithms for the Maximum Clique Problem on Massive Graphs with Applications to Overlapping Community Detection, Internet Mathematics, Vol 11, No 4-5, pp 421-448, 2015.
    Abstract    Paper in PDF
  • R.A. Rossi, D.F. Gleich, A.H. Gebremedhin and M.M.A Patwary, Fast Maximum Clique Algorithms for Large Graphs, Proceedings of WWW2014.
    Abstract    Paper in PDF
  • B. Pattabiraman, M.M.A Patwary, A.H. Gebremedhin, W.K. Liao, A. Choudhary, Fast Algorithms for the Maximum Clique Problem on Massive Sparse Graphs, WAW 2013: 10th Workshop on Algorithms and Models for the Web Graph, Lecture Notes in Computer Science 8305, pp 156-169, 2013.
    Abstract    Paper in PDF