Finding Subgraphs and Triads in Networks #
Before we dive in, just a heads-up: this guide hits close to home for me since some of the work featured in my PhD focuses on subgraphs in social networks. I know a lot about this, so there’s a good chance I could get a little too deep into it. I’ll do my best to keep it straightforward and not ramble on!
What are subgraphs? #
To keep things official, here’s a formal definition from Wikipedia:
A subgraph of a graph G is another graph formed from a subset of the vertices and edges of G. The vertex subset must include all endpoints of the edge subset, but may also include additional vertices.
Wikipedia editors
In simpler terms, a subgraph is just a smaller part of a larger graph. You can think of it as a mini version of the full graph, where we’re interested in seeing specific patterns or groups of nodes.
Why are subgraphs useful? #
Subgraphs are like little windows into the overall network. They’re the building blocks that reveal how the network is structured. For example, the subgraph I mentioned above is called a feed-forward formation. It’s a big deal in social sciences because it resembles the “enemy of my enemy is my friend” situation, also known as indirect reciprocity.
How are subgraphs extracted? #
There are many ways to extract subgraphs, but the most common method is by counting them based on their structure. Essentially, a subgraph is identified by the number of nodes it contains, which could range from just a couple to hundreds. To keep things simple, we often stick to smaller numbers, like three nodes—which is why triads come into play.
What are the challenges? #
Finding subgraphs falls into what we call an “NP-complete” problem (you can read more on the subgraph isomorphism problem). In plain terms: the more nodes you want to include in a subgraph, the more complex it becomes, and the number of combinations skyrockets. This means that analyzing larger subgraphs takes longer. Here’s a quick table showing how the number of possible subgraphs increases as the number of nodes grows:
No. Nodes | No. Combinations (undirected) | No. Combinations (directed) |
---|---|---|
3 | 2 | 13 |
4 | 6 | 199 |
5 | 21 | 9,364 |
6 | 112 | 1,530,843 |
Implementations #
If you’re using networkx
, there’s a super handy function for counting triads in a network. Just remember that this works best if your network has directed edges. Here’s how you can use it with a random graph generator:
>>> G = nx.fast_gnp_random_graph(n=50, p=0.15, directed=True)
>>> nx.triadic_census(G)
{'003': 7538, '012': 7896, '102': 636, '021D': 647, '021U': 662, '021C': 1375, '111D': 220, '111U': 217, '030T': 227, '030C': 79, '201': 17, '120D': 23, '120U': 21, '120C': 35, '210': 7, '300': 0}
So, what does this give you? The result is a dictionary where each key represents a specific type of triad, and the value tells you how many times that triad shows up in your network. If you’re curious about what these codes mean, you can check them out here along with some visuals.
Applications #
Subgraphs are a pretty big deal in network science. Here are a few ways they can help you analyse complex networks:
Motifs #
Motifs are subgraphs that appear either more or less often than you’d expect by chance. The idea is to count the subgraphs, compare the counts to a random baseline, and see which structures stand out. This method has roots in biology (like analyzing neural circuits), but it’s also widely used in social network studies—my own research included!
Cliques #
Remember cliques? They’re groups of tightly connected nodes. Subgraph analysis can help you find these cliques by spotting specific patterns. For example, the triad labeled 300 represents a fully connected subgraph.
Path Finding #
Subgraph analysis is also useful for tracking connections between nodes in a network. This is great if you’re trying to find simple paths or cycles within the network. These substructures help you figure out how nodes link up from point A to point B.
Conclusion and Final Thoughts #
Subgraphs are a crucial part of network analysis. Most of what we’ve covered deals with global metrics, which describe the whole network. But by looking at subgraphs, we focus on smaller parts of the network, which is key to understanding how it’s structured.
Prev: Core Analysis Next: Paths and Shortest Routes