Plotting networks

In this notebook, we’ll look at two main ways to plot networks: directly plotting the adjacency matrix, and drawing network layouts.

Plotting adjacency matrices

Let’s start by plotting a very simple network: the Karate Club dataset, one of the most famous toy datasets in network science.

import networkx as nx

g = nx.karate_club_graph()

First, we’ll extract the adjacency matrix representation.

nodelist = list(g.nodes)

A = nx.to_numpy_array(g, nodelist=nodelist)

A
array([[0., 1., 1., ..., 1., 0., 0.],
       [1., 0., 1., ..., 0., 0., 0.],
       [1., 1., 0., ..., 0., 1., 0.],
       ...,
       [1., 0., 0., ..., 0., 1., 1.],
       [0., 0., 1., ..., 1., 0., 1.],
       [0., 0., 0., ..., 1., 1., 0.]])
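To make the link between a graph and its adjacency matrix concrete, here is a minimal sketch on a hypothetical four-node toy graph (not the Karate Club data). The names `toy`, `toy_nodelist`, and `toy_A` are made up for this illustration; the only behavior relied on is that `nx.to_numpy_array` orders rows and columns according to `nodelist`, as in the cell above.

```python
import networkx as nx
import numpy as np

# Hypothetical toy graph, used only for illustration
toy = nx.Graph()
toy.add_edges_from([("a", "b"), ("b", "c"), ("c", "d")])

# Rows and columns of the matrix follow the order given in nodelist
toy_nodelist = ["a", "b", "c", "d"]
toy_A = nx.to_numpy_array(toy, nodelist=toy_nodelist)

# Entry (i, j) is nonzero exactly when nodes i and j share an edge
i, j = toy_nodelist.index("b"), toy_nodelist.index("c")
has_edge_in_matrix = toy_A[i, j] != 0

# An undirected graph yields a symmetric adjacency matrix
is_symmetric = np.array_equal(toy_A, toy_A.T)
```

Keeping `nodelist` around, as the notebook does, is what lets you map a row or column index back to a node later on.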

We can plot the adjacency matrix using a visualization commonly called a heatmap, for instance with the heatmap function from seaborn.

import seaborn as sns

sns.heatmap(A)
<AxesSubplot:>
_images/6312bf974707a372bbfb955ac3510eb9b53977fef53f5596d686a1c71d33305b.png

We can also use the heatmap function in graspologic, which essentially wraps seaborn’s heatmap and adds a few features that are useful when plotting adjacency matrices. You can read more about these in the graspologic documentation for heatmap.

from graspologic.plot import heatmap

heatmap(A, cbar=False)
<AxesSubplot:>
_images/db139df1401a5ca0ad0203fc263ef28d12935bef43d9b7fae58eb8cdce502253.png

As we mentioned in Representing networks, any permutation of this adjacency matrix represents the same graph. Let’s see how this same plot looks with a different permutation.

# generate a random permutation

import numpy as np

rng = np.random.default_rng(8888)

n = len(A)  # n is the number of nodes
perm = rng.permutation(n)

perm
array([ 1, 12,  9,  2,  0, 13, 30,  7, 24, 28, 16, 14, 32, 15, 17, 10,  4,
       33,  5, 27, 26, 29, 18, 19,  3, 21, 25, 31, 11, 23,  6,  8, 20, 22])

Question

The operation A[perm] permutes only the rows of an adjacency matrix. In general, will this permutation of the rows of A always still represent the same network?

A_perm = A[perm][:, perm]

heatmap(A_perm, cbar=False)
<AxesSubplot:>
_images/0b064d1c3907ae8c293407c247fae01a1cfc10700e9820ade570c477331f6306.png

This highlights part of why plotting with adjacency matrices can be difficult - depending on the permutation you use, the perception of the network can be very different. It is important to keep this in mind when plotting or looking at plots of adjacency matrices.
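One way to explore the difference between permuting rows only and permuting rows and columns together is to try both on a tiny example. The sketch below uses a hypothetical four-node path graph and an arbitrary permutation (`A_toy` and `perm_toy` are illustrative names, not part of the notebook) and checks two simple properties of the result.

```python
import numpy as np

# Hypothetical 4-node path graph 0-1-2-3, used only for illustration
A_toy = np.array(
    [
        [0, 1, 0, 0],
        [1, 0, 1, 0],
        [0, 1, 0, 1],
        [0, 0, 1, 0],
    ]
)
perm_toy = np.array([2, 0, 3, 1])

# Permuting rows only: row i now describes node perm_toy[i], but column j
# still describes node j, so rows and columns no longer refer to the same nodes
rows_only = A_toy[perm_toy]

# Permuting rows and columns together relabels the nodes consistently
both = A_toy[perm_toy][:, perm_toy]

rows_only_symmetric = np.array_equal(rows_only, rows_only.T)
both_symmetric = np.array_equal(both, both.T)

# The degree sequence (column sums) is preserved up to reordering
same_degrees = np.array_equal(
    np.sort(both.sum(axis=0)), np.sort(A_toy.sum(axis=0))
)
```

In this toy case the rows-only version is no longer symmetric, so it cannot be the adjacency matrix of the same undirected graph, while the version with both rows and columns permuted keeps the symmetry and the degree sequence.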

Often, it can be a good idea to sort the adjacency matrix in some specific way. Here, I infer some groups or communities in the network and use those to partition the adjacency matrix. Within each community, I also sort the nodes in descending order by degree. We’ll talk more about both of these concepts later in the course.

from graspologic.partition import leiden

partition_map = leiden(g, trials=100)

labels = np.vectorize(partition_map.get)(nodelist)

labels
array([1, 1, 1, 1, 0, 0, 0, 1, 2, 2, 0, 1, 1, 1, 2, 2, 0, 1, 2, 1, 2, 1,
       2, 3, 3, 3, 2, 3, 3, 2, 2, 3, 2, 2])

heatmap(A, inner_hier_labels=labels, sort_nodes=True, cbar=False)
<AxesSubplot:>
_images/30405aa011bd71185cf22cee0c8d35ab2961230c7bc727157054597acc7ff61b.png
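The sorting described above (group by community, then descending degree within each community) can also be done by hand. The following is a minimal NumPy sketch on a hypothetical five-node network; it is not how graspologic's heatmap implements its sorting internally, and `A_toy`, `labels_toy`, and `order` are illustrative names.

```python
import numpy as np

# Hypothetical 5-node undirected network and community labels
A_toy = np.array(
    [
        [0, 1, 1, 0, 0],
        [1, 0, 0, 0, 0],
        [1, 0, 0, 1, 1],
        [0, 0, 1, 0, 0],
        [0, 0, 1, 0, 0],
    ]
)
labels_toy = np.array([1, 1, 0, 0, 0])

# Degree of each node (column sums of an unweighted, undirected matrix)
degrees = A_toy.sum(axis=0)

# np.lexsort treats the LAST key as the primary one: nodes are grouped by
# label, and ties are broken by negative degree (i.e. descending degree)
order = np.lexsort((-degrees, labels_toy))

# Apply the same ordering to both rows and columns
A_sorted = A_toy[np.ix_(order, order)]
```

Because the same `order` is applied to rows and columns, `A_sorted` is just another permutation of the same network.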

Let’s try plotting a more interesting, real-world dataset. This one is a connectome dataset from a region of the Drosophila larva brain called the mushroom body.

from graspologic.datasets import load_drosophila_left

A, labels = load_drosophila_left(return_labels=True)

heatmap(A, inner_hier_labels=labels, sort_nodes=True)
<AxesSubplot:>
_images/0aec36ffae4d215619f90d3aacb3dcd061548ec380411044930fca9a00c85a0a.png

Note that this is a weighted network (hence the color bar to the right) and that we’ve used some node labels to sort the adjacency matrix. When weights cover multiple orders of magnitude, it can be helpful to ignore or transform the weights of a network to make the visualization clearer.

heatmap(A, inner_hier_labels=labels, sort_nodes=True, transform='simple-all')
<AxesSubplot:>
_images/a8430d15000d285cf30e05264e486921984c9355d5622a9942836c767c22f0a5.png
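To give a feel for what "ignoring or transforming the weights" can mean, here is a small NumPy sketch of three generic options on a hypothetical weighted matrix `W`. These are illustrative transforms only; they are not claimed to match what graspologic's `transform='simple-all'` computes.

```python
import numpy as np

# Hypothetical weighted adjacency matrix with weights spanning several
# orders of magnitude
W = np.array(
    [
        [0.0, 1.0, 1000.0],
        [1.0, 0.0, 10.0],
        [1000.0, 10.0, 0.0],
    ]
)

# Option 1: ignore weights entirely, keeping only edge presence
W_binary = (W != 0).astype(float)

# Option 2: compress the range with log(1 + w), which keeps zeros at zero
W_log = np.log1p(W)

# Option 3: replace each nonzero weight by its rank among the distinct
# nonzero weights, scaled to (0, 1]; equal weights share the same rank
nonzero = W != 0
unique_weights = np.unique(W[nonzero])  # sorted ascending
W_rank = np.zeros_like(W)
W_rank[nonzero] = (
    np.searchsorted(unique_weights, W[nonzero]) + 1
) / len(unique_weights)
```

All three keep the zero pattern of the matrix intact, so the same edges are drawn; only the color scale changes.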

For large, sparse networks, heatmaps can be difficult to use. This is because with enough nodes, there aren’t enough pixels on your screen to give each row/column of the adjacency matrix its own pixel. Also, there is often lots of metadata associated with the nodes of a network which you may want to incorporate into an adjacency matrix visualization. graspologic has a more complex function called adjplot for plotting adjacency matrices with associated metadata; see the adjplot tutorial in the graspologic documentation if you are interested in more complex adjacency matrix visualizations.

Plotting network layouts

Another common way to look at networks is via network layouts, sometimes called ball-and-stick diagrams, among many other names.

NetworkX has a few simple functions for drawing networks.

nx.draw_networkx(g)
_images/a38d8d1e68f56023a9ddc352b3e29fdccf86c8721e7af8a55ed0a41b927bbd77.png

We can use matplotlib to make things look a little nicer.

import matplotlib.pyplot as plt

fig, ax = plt.subplots(1, 1, figsize=(10, 10))

nx.draw_networkx(g, ax=ax)
_images/fbd0a1882cab535832d35ca886ca54a6c7c24c0a3605f55e8987042e486a505d.png

NetworkX has a few functions for computing nice-looking positions for each node - one of my favorites for small-ish networks is kamada_kawai_layout.

pos = nx.kamada_kawai_layout(g)
pos
{0: array([0.02595264, 0.33261791]),
 1: array([-0.15415403,  0.25105994]),
 2: array([ 0.07281129, -0.00339833]),
 3: array([0.15120782, 0.23250616]),
 4: array([0.19313543, 0.57445429]),
 5: array([0.1899636 , 0.67901702]),
 6: array([-0.00442196,  0.69195753]),
 7: array([0.28351952, 0.21023576]),
 8: array([-0.17565775, -0.00990143]),
 9: array([ 0.0706129 , -0.28925159]),
 10: array([0.37980825, 0.53262302]),
 11: array([-0.18776547,  0.6332519 ]),
 12: array([0.43374412, 0.37950472]),
 13: array([-0.0851201 ,  0.05268196]),
 14: array([-0.50006702, -0.31244434]),
 15: array([-0.46577851, -0.42902514]),
 16: array([0.15957838, 1.        ]),
 17: array([-0.279331  ,  0.51420533]),
 18: array([-0.39589977, -0.53439666]),
 19: array([-0.2753847 ,  0.06921985]),
 20: array([-0.293375  , -0.61554259]),
 21: array([-0.37319088,  0.43635094]),
 22: array([-0.15864175, -0.64280203]),
 23: array([ 0.23606483, -0.5155009 ]),
 24: array([ 0.57008117, -0.29665386]),
 25: array([ 0.54879029, -0.16639415]),
 26: array([-0.0475633 , -0.75792026]),
 27: array([ 0.28462741, -0.28982154]),
 28: array([ 0.18539659, -0.29276329]),
 29: array([ 0.0685199 , -0.62042602]),
 30: array([-0.38137478, -0.08940135]),
 31: array([ 0.19717886, -0.07218863]),
 32: array([-0.17231283, -0.31221451]),
 33: array([-0.10095415, -0.33963971])}

fig, ax = plt.subplots(1, 1, figsize=(10, 10))

nx.draw_networkx(g, pos=pos, ax=ax)
_images/77a6d02097a052ad0fdfd3ec8e8d46404f92ac910568edb79a75801a4f2c88be.png
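As the printed output of `pos` suggests, a layout is simply a dict mapping each node to an (x, y) coordinate pair, so you can also build one yourself. The sketch below does this for a hypothetical four-node path graph, with coordinates chosen by hand; `toy` and `custom_pos` are illustrative names.

```python
import networkx as nx
import numpy as np

# Hypothetical toy graph, used only for illustration
toy = nx.path_graph(4)

# A layout is a dict mapping each node to an (x, y) coordinate pair.
# Here the coordinates are chosen by hand: x is the node's position in the
# node list and y is its degree.
custom_pos = {
    node: np.array([float(i), float(toy.degree[node])])
    for i, node in enumerate(toy.nodes)
}
```

A dict like `custom_pos` can be passed as `pos=` to `nx.draw_networkx`, in the same way as the output of `kamada_kawai_layout`.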

With a little tweaking of colors, text, etc., these layouts can be somewhat informative for smaller networks. Here is a plot I made of a network of neuroscience researchers at Hopkins, and how they were linked via various collaborations. The code can be found here.

graspologic also has a similar function for plotting network layouts, networkplot, which is useful when one has node attributes (such as community labels) that should be shown in the plot.

from graspologic.plot import networkplot

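# Unpack the layout into x and y coordinate arrays, ordered like nodelist
xs = np.array([pos[node][0] for node in nodelist])
ys = np.array([pos[node][1] for node in nodelist])

# Recompute A and labels for the karate club graph, since both names were
# reassigned when loading the Drosophila dataset above
A = nx.to_numpy_array(g, nodelist=nodelist)
partition_map = leiden(g, trials=100)
labels = np.vectorize(partition_map.get)(nodelist)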

ax = networkplot(
    A,
    x=xs,
    y=ys,
    node_alpha=1.0,
    edge_alpha=1.0,
    edge_linewidth=1.0,
    node_hue=labels,
    node_kws=dict(s=200, linewidth=2),
)
_ = ax.axis('off')
_images/c7318c6987092d9698a3680ec39cad01eaf59e1eeeb9ace214253706702cb8d9.png

For larger networks, it is often helpful to use something called a network embedding to compute positions for each node. The plots often have to be modified as well, since it can become unwieldy to try to view every node and edge when there are thousands or millions of each. Below is an example of one of these higher-level network layouts computed on the hemibrain connectome dataset. The code to generate this plot can be found here. We’ll talk about all of the ingredients that go into the plot below throughout the course.
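To give a rough idea of what "using an embedding to compute positions" can look like, here is a minimal spectral sketch in NumPy on a hypothetical six-node network (two triangles joined by one edge). This is a generic illustration under simple assumptions (undirected, unweighted, two dimensions); it is not the method used to produce the hemibrain layout, and `A_toy` and `embedding` are illustrative names.

```python
import numpy as np

# Hypothetical network: two triangles (0-1-2 and 3-4-5) joined by edge 2-3
A_toy = np.array(
    [
        [0, 1, 1, 0, 0, 0],
        [1, 0, 1, 0, 0, 0],
        [1, 1, 0, 1, 0, 0],
        [0, 0, 1, 0, 1, 1],
        [0, 0, 0, 1, 0, 1],
        [0, 0, 0, 1, 1, 0],
    ],
    dtype=float,
)

# Eigendecomposition of the symmetric adjacency matrix;
# np.linalg.eigh returns eigenvalues in ascending order
eigvals, eigvecs = np.linalg.eigh(A_toy)

# Keep the two eigenvectors with the largest eigenvalues and scale each
# by the square root of the eigenvalue's magnitude
top_vals = eigvals[-2:]
top_vecs = eigvecs[:, -2:]
embedding = top_vecs * np.sqrt(np.abs(top_vals))

# Each row of the embedding is a 2D position for one node
xs_toy, ys_toy = embedding[:, 0], embedding[:, 1]
```

In this toy example, the sign of the first coordinate separates the two triangles, which hints at why embedding-based positions tend to place densely connected groups of nodes near each other.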