Communication Efficient Algorithms for Generating Massive Networks
Massive complex systems are prevalent throughout all of our lives, from various biological
systems as the human genome to technological networks such as Facebook or Twitter.
Rapid advances in technology allow us to gather more and more data that is connected to
these systems. Analyzing and extracting this huge amount of information is a crucial task
for a variety of scientific disciplines.
A common abstraction for handling complex systems are networks (graphs) made up of
entities and their relationships. For example, we can represent wireless ad hoc networks in
terms of nodes and their connections with each other.We then identify the nodes as vertices
and their connections as edges between the vertices. This abstraction allows us to develop
algorithms that are independent of the underlying domain.
Designing algorithms for massive networks is a challenging task that requires thorough
analysis and experimental evaluation. A major hurdle for this task is the scarcity of publicly
available large-scale datasets. To approach this issue, we can make use of network generators
. These generators allow us to produce synthetic instances that exhibit properties
found in many real-world networks.
In this thesis we develop a set of novel graph generators that have a focus on scalability.
In particular, we cover the classic Erd˝os-Rényi model, random geometric graphs and
random hyperbolic graphs. These models represent different real-world systems, from the
aforementioned wireless ad-hoc networks  to social networks .We ensure scalability
by making use of pseudorandomization via hash functions and redundant computations.
The resulting network generators are communication agnostic, i.e. they require no communication.
This allows us to generate massive instances of up to 243 vertices and 247 edges
in less than 22 minutes on 32:768 processors.
In addition to proving theoretical bounds for each generator, we perform an extensive
experimental evaluation. We cover both their sequential performance, as well as scaling
behavior.We are able to show that our algorithms are competitive to state-of-the-art implementations
found in network analysis libraries. Additionally, our generators exhibit near
optimal scaling behavior for large instances. Finally, we show that pseudorandomization
has little to no measurable impact on the quality of our generated instances.
|Zugehörige Institution(en) am KIT
||Institut für Theoretische Informatik (ITI)
KITopen ID: 1000068617
||X, 98 S.
||Abschlussarbeit - Master
KIT – Die Forschungsuniversität in der Helmholtz-Gemeinschaft
KITopen Landing Page