<h1>Using Joblib and reproducible random numbers</h1>
<p>Benjamin D. Pedigo (bpedigo@jhu.edu), 2020-02-18. Original post: https://bdpedigo.github.io/posts/2020/02/demo-parallel</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kn">import</span> <span class="nn">matplotlib.pyplot</span> <span class="k">as</span> <span class="n">plt</span>
<span class="kn">import</span> <span class="nn">numpy</span> <span class="k">as</span> <span class="n">np</span>
<span class="kn">import</span> <span class="nn">pandas</span> <span class="k">as</span> <span class="n">pd</span>
<span class="kn">import</span> <span class="nn">seaborn</span> <span class="k">as</span> <span class="n">sns</span>
<span class="kn">from</span> <span class="nn">joblib</span> <span class="kn">import</span> <span class="n">Parallel</span><span class="p">,</span> <span class="n">delayed</span>
<span class="kn">from</span> <span class="nn">sklearn.cluster</span> <span class="kn">import</span> <span class="n">KMeans</span>
<span class="kn">from</span> <span class="nn">sklearn.datasets</span> <span class="kn">import</span> <span class="n">make_blobs</span>
<span class="kn">from</span> <span class="nn">sklearn.metrics</span> <span class="kn">import</span> <span class="n">adjusted_rand_score</span>
<span class="kn">from</span> <span class="nn">sklearn.mixture</span> <span class="kn">import</span> <span class="n">GaussianMixture</span>
</code></pre></div></div>
<h2 id="getting-random-numbers">Getting random numbers</h2>
<p>The following is one of the simplest ways to generate many random numbers,
which are useful in almost any scientific computing application.</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">np</span><span class="p">.</span><span class="n">random</span><span class="p">.</span><span class="n">seed</span><span class="p">(</span><span class="mi">8888</span><span class="p">)</span>
<span class="n">n_numbers</span> <span class="o">=</span> <span class="mi">20</span>
<span class="n">outs</span> <span class="o">=</span> <span class="p">[]</span>
<span class="k">for</span> <span class="n">i</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="n">n_numbers</span><span class="p">):</span>
    <span class="n">big_number</span> <span class="o">=</span> <span class="n">np</span><span class="p">.</span><span class="n">random</span><span class="p">.</span><span class="n">randint</span><span class="p">(</span><span class="mf">1e8</span><span class="p">)</span>
    <span class="n">outs</span><span class="p">.</span><span class="n">append</span><span class="p">(</span><span class="n">big_number</span><span class="p">)</span>
<span class="k">print</span><span class="p">(</span><span class="n">outs</span><span class="p">)</span>
</code></pre></div></div>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>[28863715, 20470441, 59601088, 35178672, 31953535, 54920184, 6437049, 49557007, 32591667, 33196361, 14963174, 59717179, 32480075, 70590040, 82187373, 67242005, 76389711, 43332706, 66541139, 8632395]
</code></pre></div></div>
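<p>As an aside, the same kind of draw can be written without a loop. The following is a minimal sketch, not the approach used in this post: it assumes NumPy 1.17 or later and uses the <code>Generator</code> API (<code>np.random.default_rng</code>) instead of the global <code>np.random.seed</code> state, so it produces different numbers than the output above. The helper name <code>draw_big_numbers</code> is made up for illustration.</p>

```python
import numpy as np


def draw_big_numbers(seed, n_numbers=20):
    # A local generator keeps the random state out of the global namespace.
    rng = np.random.default_rng(seed)
    # integers() samples from [0, 10**8), the same range as randint(1e8).
    return rng.integers(10**8, size=n_numbers)


first = draw_big_numbers(8888)
second = draw_big_numbers(8888)
print(np.array_equal(first, second))
```

<p>Because each call builds its own generator from the seed, two calls with the same seed should return the same array.</p>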
<h2 id="make-the-above-stuff-in-my-for-loop-into-a-function">Make the above stuff in my for loop into a function</h2>
<p>The first step toward something we can run with Joblib is to turn the body
of the for loop into a function.</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">def</span> <span class="nf">_get_big_number</span><span class="p">():</span>
    <span class="n">big_number</span> <span class="o">=</span> <span class="n">np</span><span class="p">.</span><span class="n">random</span><span class="p">.</span><span class="n">randint</span><span class="p">(</span><span class="mf">1e8</span><span class="p">)</span>
    <span class="k">return</span> <span class="n">big_number</span>
<span class="n">np</span><span class="p">.</span><span class="n">random</span><span class="p">.</span><span class="n">seed</span><span class="p">(</span><span class="mi">8888</span><span class="p">)</span>
<span class="n">outs</span> <span class="o">=</span> <span class="p">[]</span>
<span class="k">for</span> <span class="n">i</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="n">n_numbers</span><span class="p">):</span>
    <span class="n">big_number</span> <span class="o">=</span> <span class="n">_get_big_number</span><span class="p">()</span>
    <span class="n">outs</span><span class="p">.</span><span class="n">append</span><span class="p">(</span><span class="n">big_number</span><span class="p">)</span>
<span class="k">print</span><span class="p">(</span><span class="n">outs</span><span class="p">)</span>
</code></pre></div></div>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>[28863715, 20470441, 59601088, 35178672, 31953535, 54920184, 6437049, 49557007, 32591667, 33196361, 14963174, 59717179, 32480075, 70590040, 82187373, 67242005, 76389711, 43332706, 66541139, 8632395]
</code></pre></div></div>
<h2 id="now-do-it-in-parallel">Now, do it in parallel</h2>
<p>With Joblib, parallelizing the above is super easy!</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">np</span><span class="p">.</span><span class="n">random</span><span class="p">.</span><span class="n">seed</span><span class="p">(</span><span class="mi">8888</span><span class="p">)</span>
<span class="n">par</span> <span class="o">=</span> <span class="n">Parallel</span><span class="p">(</span><span class="n">n_jobs</span><span class="o">=</span><span class="mi">8</span><span class="p">)</span>
<span class="n">outs</span> <span class="o">=</span> <span class="n">par</span><span class="p">(</span><span class="n">delayed</span><span class="p">(</span><span class="n">_get_big_number</span><span class="p">)()</span> <span class="k">for</span> <span class="n">_</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="n">n_numbers</span><span class="p">))</span>
<span class="k">print</span><span class="p">(</span><span class="n">outs</span><span class="p">)</span>
</code></pre></div></div>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>[49298211, 960679, 7840371, 75202960, 28282479, 7528719, 99466831, 76798086, 61512489, 60087525, 55473392, 85450202, 13124747, 14944763, 92530329, 29784778, 16918058, 67836125, 81264675, 73960372]
</code></pre></div></div>
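<p>For readers new to Joblib, the pattern above can be tried on a deterministic function first. This is a small sketch with a made-up <code>square</code> helper: <code>delayed(square)(i)</code> captures the function and its argument without calling it, and <code>Parallel</code> runs the captured tasks and returns the results in the order they were submitted.</p>

```python
from joblib import Parallel, delayed


def square(x):
    # Stand-in for any function whose calls do not depend on each other.
    return x * x


par = Parallel(n_jobs=2)
squares = par(delayed(square)(i) for i in range(5))
print(squares)
```

<p>Since <code>square</code> involves no randomness, the printed list is the same on every run, which makes it a handy sanity check before moving on to random draws.</p>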
<h2 id="but-is-it-reproducible">But is it reproducible?</h2>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">np</span><span class="p">.</span><span class="n">random</span><span class="p">.</span><span class="n">seed</span><span class="p">(</span><span class="mi">8888</span><span class="p">)</span>
<span class="n">par</span> <span class="o">=</span> <span class="n">Parallel</span><span class="p">(</span><span class="n">n_jobs</span><span class="o">=</span><span class="mi">8</span><span class="p">)</span>
<span class="n">outs</span> <span class="o">=</span> <span class="n">par</span><span class="p">(</span><span class="n">delayed</span><span class="p">(</span><span class="n">_get_big_number</span><span class="p">)()</span> <span class="k">for</span> <span class="n">_</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="n">n_numbers</span><span class="p">))</span>
<span class="k">print</span><span class="p">(</span><span class="n">outs</span><span class="p">)</span> <span class="c1"># note that now we don't get reproducible results!
</span>
</code></pre></div></div>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>[11772194, 70427031, 91672483, 58518373, 42038149, 42933984, 3123949, 96178412, 41251378, 27098387, 98151772, 22073752, 87024182, 39619220, 78539229, 47795066, 77917795, 25815037, 54690881, 72367281]
</code></pre></div></div>
<h2 id="get-random-numbers-in-parallel-reproducibly">Get random numbers in parallel, reproducibly</h2>
<p>Even though we set the random seed above, we did not get reproducible results:
the seed set in the parent process does not carry over to the worker processes
that Joblib starts. To make the results reproducible, I usually start by generating
a long list of random seeds (starting from a random seed, of course) and then pass
those seeds down to the individual jobs.</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">np</span><span class="p">.</span><span class="n">random</span><span class="p">.</span><span class="n">seed</span><span class="p">(</span><span class="mi">8888</span><span class="p">)</span>
<span class="n">seeds</span> <span class="o">=</span> <span class="n">np</span><span class="p">.</span><span class="n">random</span><span class="p">.</span><span class="n">randint</span><span class="p">(</span><span class="mf">1e8</span><span class="p">,</span> <span class="n">size</span><span class="o">=</span><span class="n">n_numbers</span><span class="p">)</span>
<span class="k">def</span> <span class="nf">_get_big_reproducible_number</span><span class="p">(</span><span class="n">seed</span><span class="p">):</span>
    <span class="n">np</span><span class="p">.</span><span class="n">random</span><span class="p">.</span><span class="n">seed</span><span class="p">(</span><span class="n">seed</span><span class="p">)</span>
    <span class="k">return</span> <span class="n">_get_big_number</span><span class="p">()</span>
<span class="n">par</span> <span class="o">=</span> <span class="n">Parallel</span><span class="p">(</span><span class="n">n_jobs</span><span class="o">=</span><span class="mi">4</span><span class="p">)</span>
<span class="n">outs</span> <span class="o">=</span> <span class="n">par</span><span class="p">(</span><span class="n">delayed</span><span class="p">(</span><span class="n">_get_big_reproducible_number</span><span class="p">)(</span><span class="n">seed</span><span class="p">)</span> <span class="k">for</span> <span class="n">seed</span> <span class="ow">in</span> <span class="n">seeds</span><span class="p">)</span>
<span class="k">print</span><span class="p">(</span><span class="n">outs</span><span class="p">)</span>
</code></pre></div></div>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>[46621082, 97292465, 84093849, 69981988, 69717233, 28029841, 53811122, 71335538, 50534020, 6092775, 87017978, 84213854, 7721363, 31245923, 89469332, 30208313, 81965930, 74720508, 80658938, 82125302]
</code></pre></div></div>
<h2 id="check-that-we-now-get-reproducible-results">Check that we now get reproducible results</h2>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">np</span><span class="p">.</span><span class="n">random</span><span class="p">.</span><span class="n">seed</span><span class="p">(</span><span class="mi">8888</span><span class="p">)</span>
<span class="n">seeds</span> <span class="o">=</span> <span class="n">np</span><span class="p">.</span><span class="n">random</span><span class="p">.</span><span class="n">randint</span><span class="p">(</span><span class="mf">1e8</span><span class="p">,</span> <span class="n">size</span><span class="o">=</span><span class="n">n_numbers</span><span class="p">)</span>
<span class="n">par</span> <span class="o">=</span> <span class="n">Parallel</span><span class="p">(</span><span class="n">n_jobs</span><span class="o">=</span><span class="mi">4</span><span class="p">)</span>
<span class="n">outs</span> <span class="o">=</span> <span class="n">par</span><span class="p">(</span><span class="n">delayed</span><span class="p">(</span><span class="n">_get_big_reproducible_number</span><span class="p">)(</span><span class="n">seed</span><span class="p">)</span> <span class="k">for</span> <span class="n">seed</span> <span class="ow">in</span> <span class="n">seeds</span><span class="p">)</span>
<span class="k">print</span><span class="p">(</span><span class="n">outs</span><span class="p">)</span>
</code></pre></div></div>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>[46621082, 97292465, 84093849, 69981988, 69717233, 28029841, 53811122, 71335538, 50534020, 6092775, 87017978, 84213854, 7721363, 31245923, 89469332, 30208313, 81965930, 74720508, 80658938, 82125302]
</code></pre></div></div>
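<p>The same idea of handing each job its own seed can also be sketched with NumPy's <code>SeedSequence</code>. This is an alternative to the approach above, not a description of it: it assumes NumPy 1.17 or later, the function names <code>draw_big_number</code> and <code>draw_in_parallel</code> are made up for illustration, and the numbers it produces will differ from the output shown above.</p>

```python
import numpy as np
from joblib import Parallel, delayed


def draw_big_number(child_seed):
    # Each task builds its own generator from the seed it was handed,
    # so no task relies on global random state.
    rng = np.random.default_rng(child_seed)
    return int(rng.integers(10**8))


def draw_in_parallel(root_seed, n_numbers, n_jobs):
    # spawn() derives one child seed per task from a single root seed.
    child_seeds = np.random.SeedSequence(root_seed).spawn(n_numbers)
    return Parallel(n_jobs=n_jobs)(
        delayed(draw_big_number)(child_seed) for child_seed in child_seeds
    )


first = draw_in_parallel(8888, n_numbers=20, n_jobs=4)
second = draw_in_parallel(8888, n_numbers=20, n_jobs=2)
print(first == second)
```

<p>Because every task depends only on its own child seed, the result should not change with the number of workers, which is why the two calls use different <code>n_jobs</code> values.</p>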
<h2 id="simple-demo-with-gaussian-blobs">Simple demo with Gaussian blobs</h2>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">def</span> <span class="nf">generate_data</span><span class="p">(</span><span class="n">n_samples</span><span class="o">=</span><span class="mi">300</span><span class="p">):</span>
    <span class="n">X</span><span class="p">,</span> <span class="n">y</span> <span class="o">=</span> <span class="n">make_blobs</span><span class="p">(</span><span class="n">n_samples</span><span class="o">=</span><span class="n">n_samples</span><span class="p">,</span> <span class="n">cluster_std</span><span class="o">=</span><span class="mf">2.5</span><span class="p">)</span>
    <span class="n">transformation</span> <span class="o">=</span> <span class="p">[[</span><span class="mf">0.6</span><span class="p">,</span> <span class="o">-</span><span class="mf">0.6</span><span class="p">],</span> <span class="p">[</span><span class="o">-</span><span class="mf">0.4</span><span class="p">,</span> <span class="mf">0.8</span><span class="p">]]</span>
    <span class="n">X_aniso</span> <span class="o">=</span> <span class="n">np</span><span class="p">.</span><span class="n">dot</span><span class="p">(</span><span class="n">X</span><span class="p">,</span> <span class="n">transformation</span><span class="p">)</span>
    <span class="n">aniso</span> <span class="o">=</span> <span class="p">(</span><span class="n">X_aniso</span><span class="p">,</span> <span class="n">y</span><span class="p">)</span>
    <span class="k">return</span> <span class="n">aniso</span>
<span class="n">X</span><span class="p">,</span> <span class="n">y</span> <span class="o">=</span> <span class="n">generate_data</span><span class="p">()</span>
<span class="n">plot_df</span> <span class="o">=</span> <span class="n">pd</span><span class="p">.</span><span class="n">DataFrame</span><span class="p">(</span><span class="n">data</span><span class="o">=</span><span class="n">X</span><span class="p">)</span>
<span class="n">plot_df</span><span class="p">[</span><span class="s">"Label"</span><span class="p">]</span> <span class="o">=</span> <span class="n">y</span>
<span class="n">sns</span><span class="p">.</span><span class="n">set_context</span><span class="p">(</span><span class="s">"talk"</span><span class="p">)</span>
<span class="n">fig</span><span class="p">,</span> <span class="n">ax</span> <span class="o">=</span> <span class="n">plt</span><span class="p">.</span><span class="n">subplots</span><span class="p">(</span><span class="mi">1</span><span class="p">,</span> <span class="mi">1</span><span class="p">,</span> <span class="n">figsize</span><span class="o">=</span><span class="p">(</span><span class="mi">8</span><span class="p">,</span> <span class="mi">8</span><span class="p">))</span>
<span class="n">sns</span><span class="p">.</span><span class="n">scatterplot</span><span class="p">(</span>
    <span class="n">data</span><span class="o">=</span><span class="n">plot_df</span><span class="p">,</span>
    <span class="n">x</span><span class="o">=</span><span class="mi">0</span><span class="p">,</span>
    <span class="n">y</span><span class="o">=</span><span class="mi">1</span><span class="p">,</span>
    <span class="n">ax</span><span class="o">=</span><span class="n">ax</span><span class="p">,</span>
    <span class="n">hue</span><span class="o">=</span><span class="s">"Label"</span><span class="p">,</span>
    <span class="n">palette</span><span class="o">=</span><span class="n">sns</span><span class="p">.</span><span class="n">color_palette</span><span class="p">(</span><span class="s">"Set1"</span><span class="p">,</span> <span class="n">plot_df</span><span class="p">[</span><span class="s">"Label"</span><span class="p">].</span><span class="n">nunique</span><span class="p">()),</span>
<span class="p">)</span>
<span class="n">ax</span><span class="p">.</span><span class="n">axis</span><span class="p">(</span><span class="s">"off"</span><span class="p">)</span>
</code></pre></div></div>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>(-8.551714730149058, 7.03254519805067, -5.95587662068765, 12.951742653648202)
</code></pre></div></div>
<p><img src="/images/demo_parallel_files/demo_parallel_16_1.png" alt="png" /></p>
<h2 id="look-at-the-performance-of-two-different-clustering-algorithms">Look at the performance of two different clustering algorithms</h2>
<p>Here we’ll just look at a single dataset and see how K-means and GMM perform.</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">gmm</span> <span class="o">=</span> <span class="n">GaussianMixture</span><span class="p">(</span><span class="n">n_components</span><span class="o">=</span><span class="mi">3</span><span class="p">,</span> <span class="n">covariance_type</span><span class="o">=</span><span class="s">"full"</span><span class="p">)</span>
<span class="n">gmm_pred_labels</span> <span class="o">=</span> <span class="n">gmm</span><span class="p">.</span><span class="n">fit_predict</span><span class="p">(</span><span class="n">X</span><span class="p">)</span>
<span class="n">gmm_ari</span> <span class="o">=</span> <span class="n">adjusted_rand_score</span><span class="p">(</span><span class="n">y</span><span class="p">,</span> <span class="n">gmm_pred_labels</span><span class="p">)</span>
<span class="k">print</span><span class="p">(</span><span class="sa">f</span><span class="s">"GMM ARI:</span><span class="si">{</span><span class="n">gmm_ari</span><span class="si">}</span><span class="s">"</span><span class="p">)</span>
<span class="n">kmeans</span> <span class="o">=</span> <span class="n">KMeans</span><span class="p">(</span><span class="n">n_clusters</span><span class="o">=</span><span class="mi">3</span><span class="p">)</span>
<span class="n">kmeans_pred_labels</span> <span class="o">=</span> <span class="n">kmeans</span><span class="p">.</span><span class="n">fit_predict</span><span class="p">(</span><span class="n">X</span><span class="p">)</span>
<span class="n">kmeans_ari</span> <span class="o">=</span> <span class="n">adjusted_rand_score</span><span class="p">(</span><span class="n">y</span><span class="p">,</span> <span class="n">kmeans_pred_labels</span><span class="p">)</span>
<span class="k">print</span><span class="p">(</span><span class="sa">f</span><span class="s">"K-means ARI: </span><span class="si">{</span><span class="n">kmeans_ari</span><span class="si">}</span><span class="s">"</span><span class="p">)</span>
<span class="n">plot_df</span><span class="p">[</span><span class="s">"KMeans"</span><span class="p">]</span> <span class="o">=</span> <span class="n">kmeans_pred_labels</span>
<span class="n">plot_df</span><span class="p">[</span><span class="s">"GMM"</span><span class="p">]</span> <span class="o">=</span> <span class="n">gmm_pred_labels</span>
<span class="n">fig</span><span class="p">,</span> <span class="n">axs</span> <span class="o">=</span> <span class="n">plt</span><span class="p">.</span><span class="n">subplots</span><span class="p">(</span><span class="mi">1</span><span class="p">,</span> <span class="mi">2</span><span class="p">,</span> <span class="n">figsize</span><span class="o">=</span><span class="p">(</span><span class="mi">10</span><span class="p">,</span> <span class="mi">5</span><span class="p">))</span>
<span class="n">sns</span><span class="p">.</span><span class="n">scatterplot</span><span class="p">(</span>
    <span class="n">data</span><span class="o">=</span><span class="n">plot_df</span><span class="p">,</span>
    <span class="n">x</span><span class="o">=</span><span class="mi">0</span><span class="p">,</span>
    <span class="n">y</span><span class="o">=</span><span class="mi">1</span><span class="p">,</span>
    <span class="n">hue</span><span class="o">=</span><span class="s">"KMeans"</span><span class="p">,</span>
    <span class="n">ax</span><span class="o">=</span><span class="n">axs</span><span class="p">[</span><span class="mi">0</span><span class="p">],</span>
    <span class="n">palette</span><span class="o">=</span><span class="n">sns</span><span class="p">.</span><span class="n">color_palette</span><span class="p">(</span><span class="s">"Set1"</span><span class="p">,</span> <span class="n">plot_df</span><span class="p">[</span><span class="s">"KMeans"</span><span class="p">].</span><span class="n">nunique</span><span class="p">()),</span>
    <span class="n">s</span><span class="o">=</span><span class="mi">40</span><span class="p">,</span>
<span class="p">)</span>
<span class="n">sns</span><span class="p">.</span><span class="n">scatterplot</span><span class="p">(</span>
    <span class="n">data</span><span class="o">=</span><span class="n">plot_df</span><span class="p">,</span>
    <span class="n">x</span><span class="o">=</span><span class="mi">0</span><span class="p">,</span>
    <span class="n">y</span><span class="o">=</span><span class="mi">1</span><span class="p">,</span>
    <span class="n">hue</span><span class="o">=</span><span class="s">"GMM"</span><span class="p">,</span>
    <span class="n">ax</span><span class="o">=</span><span class="n">axs</span><span class="p">[</span><span class="mi">1</span><span class="p">],</span>
    <span class="n">palette</span><span class="o">=</span><span class="n">sns</span><span class="p">.</span><span class="n">color_palette</span><span class="p">(</span><span class="s">"Set1"</span><span class="p">,</span> <span class="n">plot_df</span><span class="p">[</span><span class="s">"GMM"</span><span class="p">].</span><span class="n">nunique</span><span class="p">()),</span>
    <span class="n">s</span><span class="o">=</span><span class="mi">40</span><span class="p">,</span>
<span class="p">)</span>
<span class="n">axs</span><span class="p">[</span><span class="mi">0</span><span class="p">].</span><span class="n">axis</span><span class="p">(</span><span class="s">"off"</span><span class="p">)</span>
<span class="n">axs</span><span class="p">[</span><span class="mi">0</span><span class="p">].</span><span class="n">set_title</span><span class="p">(</span><span class="sa">f</span><span class="s">"ARI: </span><span class="si">{</span><span class="n">kmeans_ari</span><span class="si">}</span><span class="s">"</span><span class="p">)</span>
<span class="n">axs</span><span class="p">[</span><span class="mi">1</span><span class="p">].</span><span class="n">axis</span><span class="p">(</span><span class="s">"off"</span><span class="p">)</span>
<span class="n">axs</span><span class="p">[</span><span class="mi">1</span><span class="p">].</span><span class="n">set_title</span><span class="p">(</span><span class="sa">f</span><span class="s">"ARI: </span><span class="si">{</span><span class="n">gmm_ari</span><span class="si">}</span><span class="s">"</span><span class="p">)</span>
</code></pre></div></div>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>GMM ARI:0.41269325786287725
K-means ARI: 0.21338447619753562
Text(0.5, 1.0, 'ARI: 0.41269325786287725')
</code></pre></div></div>
<p><img src="/images/demo_parallel_files/demo_parallel_18_2.png" alt="png" /></p>
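<p>The fits above draw on NumPy's global random state, so rerunning the cell gives a different dataset and different ARI values. One way to pin a single comparison down is to pass an explicit <code>random_state</code> to each scikit-learn call. The following is a rough sketch under that assumption: the helper name <code>compare_once</code> and the choice of <code>n_init=10</code> are mine, not from the code above, and the numbers it produces will not match the output shown.</p>

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import adjusted_rand_score
from sklearn.mixture import GaussianMixture


def compare_once(seed, n_samples=300):
    # Same data recipe as generate_data, but seeded explicitly.
    X, y = make_blobs(n_samples=n_samples, cluster_std=2.5, random_state=seed)
    X = X @ np.array([[0.6, -0.6], [-0.4, 0.8]])
    # Each estimator gets its own random_state instead of using global state.
    gmm = GaussianMixture(n_components=3, covariance_type="full", random_state=seed)
    kmeans = KMeans(n_clusters=3, n_init=10, random_state=seed)
    gmm_ari = adjusted_rand_score(y, gmm.fit_predict(X))
    kmeans_ari = adjusted_rand_score(y, kmeans.fit_predict(X))
    return gmm_ari, kmeans_ari


first = compare_once(8888)
second = compare_once(8888)
print(first == second)
```

<p>A function like this takes the seed as its only source of randomness, which also makes it easy to hand to Joblib with one seed per job.</p>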
<h2 id="now-run-an-actual-experiment-over-many-random-inits">Now run an actual experiment over many random inits</h2>
<p>Here is an example of how we could compare the performance of these two algorithms
over many randomly generated datasets using Joblib.</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">def</span> <span class="nf">run_experiment</span><span class="p">(</span><span class="n">seed</span><span class="p">):</span>
    <span class="n">np</span><span class="p">.</span><span class="n">random</span><span class="p">.</span><span class="n">seed</span><span class="p">(</span><span class="n">seed</span><span class="p">)</span>
    <span class="n">X</span><span class="p">,</span> <span class="n">y</span> <span class="o">=</span> <span class="n">generate_data</span><span class="p">()</span>
    <span class="n">gmm</span> <span class="o">=</span> <span class="n">GaussianMixture</span><span class="p">(</span><span class="n">n_components</span><span class="o">=</span><span class="mi">3</span><span class="p">,</span> <span class="n">covariance_type</span><span class="o">=</span><span class="s">"full"</span><span class="p">,</span> <span class="n">n_init</span><span class="o">=</span><span class="mi">20</span><span class="p">)</span>
    <span class="n">gmm_pred_labels</span> <span class="o">=</span> <span class="n">gmm</span><span class="p">.</span><span class="n">fit_predict</span><span class="p">(</span><span class="n">X</span><span class="p">)</span>
    <span class="n">gmm_ari</span> <span class="o">=</span> <span class="n">adjusted_rand_score</span><span class="p">(</span><span class="n">y</span><span class="p">,</span> <span class="n">gmm_pred_labels</span><span class="p">)</span>
    <span class="n">kmeans</span> <span class="o">=</span> <span class="n">KMeans</span><span class="p">(</span><span class="n">n_clusters</span><span class="o">=</span><span class="mi">3</span><span class="p">,</span> <span class="n">n_init</span><span class="o">=</span><span class="mi">20</span><span class="p">)</span>
    <span class="n">kmeans_pred_labels</span> <span class="o">=</span> <span class="n">kmeans</span><span class="p">.</span><span class="n">fit_predict</span><span class="p">(</span><span class="n">X</span><span class="p">)</span>
    <span class="n">kmeans_ari</span> <span class="o">=</span> <span class="n">adjusted_rand_score</span><span class="p">(</span><span class="n">y</span><span class="p">,</span> <span class="n">kmeans_pred_labels</span><span class="p">)</span>
    <span class="k">return</span> <span class="n">gmm_ari</span> <span class="o">-</span> <span class="n">kmeans_ari</span>
<span class="n">np</span><span class="p">.</span><span class="n">random</span><span class="p">.</span><span class="n">seed</span><span class="p">(</span><span class="mi">8888</span><span class="p">)</span>
<span class="n">n_sims</span> <span class="o">=</span> <span class="mi">40</span>
<span class="n">seeds</span> <span class="o">=</span> <span class="n">np</span><span class="p">.</span><span class="n">random</span><span class="p">.</span><span class="n">randint</span><span class="p">(</span><span class="mf">1e8</span><span class="p">,</span> <span class="n">size</span><span class="o">=</span><span class="n">n_sims</span><span class="p">)</span> <span class="c1"># random
# seeds = np.ones(n_sims, dtype=int) # uncomment for not random
</span><span class="n">par</span> <span class="o">=</span> <span class="n">Parallel</span><span class="p">(</span><span class="n">n_jobs</span><span class="o">=</span><span class="mi">2</span><span class="p">)</span>
<span class="n">ari_diffs</span> <span class="o">=</span> <span class="n">par</span><span class="p">(</span><span class="n">delayed</span><span class="p">(</span><span class="n">run_experiment</span><span class="p">)(</span><span class="n">seed</span><span class="p">)</span> <span class="k">for</span> <span class="n">seed</span> <span class="ow">in</span> <span class="n">seeds</span><span class="p">)</span>
<span class="n">fig</span><span class="p">,</span> <span class="n">ax</span> <span class="o">=</span> <span class="n">plt</span><span class="p">.</span><span class="n">subplots</span><span class="p">(</span><span class="mi">1</span><span class="p">,</span> <span class="mi">1</span><span class="p">,</span> <span class="n">figsize</span><span class="o">=</span><span class="p">(</span><span class="mi">8</span><span class="p">,</span> <span class="mi">4</span><span class="p">))</span>
<span class="n">ax</span><span class="p">.</span><span class="n">axvline</span><span class="p">(</span><span class="mi">0</span><span class="p">,</span> <span class="n">linewidth</span><span class="o">=</span><span class="mi">2</span><span class="p">,</span> <span class="n">linestyle</span><span class="o">=</span><span class="s">"--"</span><span class="p">,</span> <span class="n">color</span><span class="o">=</span><span class="s">"red"</span><span class="p">)</span>
<span class="n">sns</span><span class="p">.</span><span class="n">distplot</span><span class="p">(</span><span class="n">ari_diffs</span><span class="p">,</span> <span class="n">norm_hist</span><span class="o">=</span><span class="bp">False</span><span class="p">)</span>
<span class="n">sns</span><span class="p">.</span><span class="n">rugplot</span><span class="p">(</span><span class="n">ari_diffs</span><span class="p">)</span>
<span class="n">xlim</span> <span class="o">=</span> <span class="n">ax</span><span class="p">.</span><span class="n">get_xlim</span><span class="p">()</span>
<span class="n">ylim</span> <span class="o">=</span> <span class="n">ax</span><span class="p">.</span><span class="n">get_ylim</span><span class="p">()</span>
<span class="n">y_range</span> <span class="o">=</span> <span class="n">ylim</span><span class="p">[</span><span class="mi">1</span><span class="p">]</span> <span class="o">-</span> <span class="n">ylim</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span>
<span class="n">ypos</span> <span class="o">=</span> <span class="n">ylim</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span> <span class="o">+</span> <span class="n">y_range</span> <span class="o">*</span> <span class="mf">0.75</span>
<span class="n">x_range</span> <span class="o">=</span> <span class="n">xlim</span><span class="p">[</span><span class="mi">1</span><span class="p">]</span> <span class="o">-</span> <span class="n">xlim</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span>
<span class="n">ax</span><span class="p">.</span><span class="n">text</span><span class="p">(</span><span class="n">xlim</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span> <span class="o">+</span> <span class="mf">0.05</span> <span class="o">*</span> <span class="n">x_range</span><span class="p">,</span> <span class="n">ypos</span><span class="p">,</span> <span class="s">"KMeans </span><span class="se">\n</span><span class="s"> better"</span><span class="p">)</span>
<span class="n">ax</span><span class="p">.</span><span class="n">text</span><span class="p">(</span><span class="n">xlim</span><span class="p">[</span><span class="mi">1</span><span class="p">]</span> <span class="o">-</span> <span class="mf">0.05</span> <span class="o">*</span> <span class="n">x_range</span><span class="p">,</span> <span class="n">ypos</span><span class="p">,</span> <span class="s">"GMM </span><span class="se">\n</span><span class="s"> better"</span><span class="p">,</span> <span class="n">horizontalalignment</span><span class="o">=</span><span class="s">"right"</span><span class="p">)</span>
<span class="n">ax</span><span class="p">.</span><span class="n">spines</span><span class="p">[</span><span class="s">"left"</span><span class="p">].</span><span class="n">set_visible</span><span class="p">(</span><span class="bp">False</span><span class="p">)</span>
<span class="n">ax</span><span class="p">.</span><span class="n">spines</span><span class="p">[</span><span class="s">"right"</span><span class="p">].</span><span class="n">set_visible</span><span class="p">(</span><span class="bp">False</span><span class="p">)</span>
<span class="n">ax</span><span class="p">.</span><span class="n">spines</span><span class="p">[</span><span class="s">"top"</span><span class="p">].</span><span class="n">set_visible</span><span class="p">(</span><span class="bp">False</span><span class="p">)</span>
<span class="n">ax</span><span class="p">.</span><span class="n">set_yticks</span><span class="p">([])</span>
<span class="n">ax</span><span class="p">.</span><span class="n">set_xlabel</span><span class="p">(</span><span class="s">"(GMM - KMeans) ARI"</span><span class="p">)</span>
</code></pre></div></div>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>Text(0.5, 0, '(GMM - KMeans) ARI')
</code></pre></div></div>
<p><img src="/images/demo_parallel_files/demo_parallel_20_1.png" alt="png" /></p>Benjamin D. Pedigobpedigo@jhu.edupython import matplotlib.pyplot as plt import numpy as np import pandas as pd import seaborn as sns from joblib import Parallel, delayed from sklearn.cluster import KMeans from sklearn.datasets import make_blobs from sklearn.metrics import adjusted_rand_score from sklearn.mixture import GaussianMixtureSignal Flow2019-12-16T00:00:00-08:002019-12-16T00:00:00-08:00https://bdpedigo.github.io/posts/2019/12/signal-flow<p>The following is some derivation of the “signal flow” calculation for a directed network used in <em>Varshney et al. 2011</em> [1], as well as my own implementation and some simple simulations to attempt to understand this function better.</p>
<p>Feedback welcome!</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kn">import</span> <span class="nn">numpy</span> <span class="k">as</span> <span class="n">np</span>
<span class="kn">import</span> <span class="nn">matplotlib.pyplot</span> <span class="k">as</span> <span class="n">plt</span>
<span class="kn">import</span> <span class="nn">seaborn</span> <span class="k">as</span> <span class="n">sns</span>
<span class="kn">import</span> <span class="nn">pandas</span> <span class="k">as</span> <span class="n">pd</span>
<span class="n">sns</span><span class="p">.</span><span class="n">set_context</span><span class="p">(</span><span class="s">'talk'</span><span class="p">)</span>
</code></pre></div></div>
<h3 id="problem-statement">Problem statement</h3>
<p>Given a directed graph, represented by adjacency matrix $A$, we are interested in understanding the feedforward structure of the graph.</p>
<p>For example, in the case of a connectome, we would like to answer questions such as:</p>
<ul>
<li>How “high” is an individual neuron in the sensory $\rightarrow$ motor pathway?</li>
<li>To what extent can the entire graph be thought of as a feedforward pathway?</li>
</ul>
<p>For now, we will focus on the first of these questions. We seek a vector $z \in \mathbb{R}^n$ such that nodes which are “high” in the feedforward structure of the graph have a high $z_i$ associated with node $i$. We will call this value $z_i$ the signal flow for node $i$.</p>
<h3 id="notation">Notation</h3>
<p>$A \in \mathbb{R}^{n \times n}$: a (possibly weighted) loopless adjacency matrix representing a directed graph with a single (weakly) connected component</p>
<p>$W \in \mathbb{R}^{n \times n}$: $\frac{A + A^T}{2}$, the symmetrized adjacency matrix</p>
<p>$\Delta \in \mathbb{R}^{n \times n}$: an antisymmetric matrix, which we will specify later</p>
<p>$z \in \mathbb{R}^{n}$: the vector of signal flows for the nodes $\{1, 2, \dots, n\}$</p>
<h3 id="generating-some-data">Generating some data</h3>
<p>Here we create a perfect feedforward network, in which every node in a block projects to every node in the “next” block.</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">low_p</span> <span class="o">=</span> <span class="mi">0</span> <span class="c1"># probability of random edges anywhere in the graph
</span><span class="n">diag_p</span> <span class="o">=</span> <span class="mi">0</span> <span class="c1"># probability of edges within a block
</span><span class="n">feedforward_p</span> <span class="o">=</span> <span class="mi">1</span> <span class="c1"># probability of edges projecting to the next block
</span><span class="n">n_blocks</span> <span class="o">=</span> <span class="mi">5</span>
<span class="n">n_per_block</span> <span class="o">=</span> <span class="mi">50</span>
<span class="n">block_sizes</span> <span class="o">=</span> <span class="n">n_blocks</span><span class="o">*</span><span class="p">[</span><span class="n">n_per_block</span><span class="p">]</span>
<span class="n">n_verts</span> <span class="o">=</span> <span class="nb">sum</span><span class="p">(</span><span class="n">block_sizes</span><span class="p">)</span>
<span class="k">def</span> <span class="nf">get_feedforward_B</span><span class="p">(</span><span class="n">low_p</span><span class="p">,</span> <span class="n">diag_p</span><span class="p">,</span> <span class="n">feedforward_p</span><span class="p">,</span> <span class="n">n_blocks</span><span class="o">=</span><span class="mi">5</span><span class="p">):</span>
    <span class="n">B</span> <span class="o">=</span> <span class="n">np</span><span class="p">.</span><span class="n">zeros</span><span class="p">((</span><span class="n">n_blocks</span><span class="p">,</span> <span class="n">n_blocks</span><span class="p">))</span>
    <span class="n">B</span> <span class="o">+=</span> <span class="n">low_p</span>
    <span class="n">B</span> <span class="o">-=</span> <span class="n">np</span><span class="p">.</span><span class="n">diag</span><span class="p">(</span><span class="n">np</span><span class="p">.</span><span class="n">diag</span><span class="p">(</span><span class="n">B</span><span class="p">))</span>
    <span class="n">B</span> <span class="o">-=</span> <span class="n">np</span><span class="p">.</span><span class="n">diag</span><span class="p">(</span><span class="n">np</span><span class="p">.</span><span class="n">diag</span><span class="p">(</span><span class="n">B</span><span class="p">,</span> <span class="n">k</span><span class="o">=</span><span class="mi">1</span><span class="p">),</span> <span class="n">k</span><span class="o">=</span><span class="mi">1</span><span class="p">)</span>
    <span class="n">B</span> <span class="o">+=</span> <span class="n">np</span><span class="p">.</span><span class="n">diag</span><span class="p">(</span><span class="n">diag_p</span> <span class="o">*</span> <span class="n">np</span><span class="p">.</span><span class="n">ones</span><span class="p">(</span><span class="n">n_blocks</span><span class="p">))</span>
    <span class="n">B</span> <span class="o">+=</span> <span class="n">np</span><span class="p">.</span><span class="n">diag</span><span class="p">(</span><span class="n">feedforward_p</span> <span class="o">*</span> <span class="n">np</span><span class="p">.</span><span class="n">ones</span><span class="p">(</span><span class="n">n_blocks</span> <span class="o">-</span> <span class="mi">1</span><span class="p">),</span> <span class="n">k</span><span class="o">=</span><span class="mi">1</span><span class="p">)</span>
    <span class="k">return</span> <span class="n">B</span>
<span class="k">def</span> <span class="nf">get_block_labels</span><span class="p">(</span><span class="n">n</span><span class="p">):</span>
    <span class="n">n</span> <span class="o">=</span> <span class="n">np</span><span class="p">.</span><span class="n">array</span><span class="p">(</span><span class="n">n</span><span class="p">)</span>
    <span class="n">n_cumsum</span> <span class="o">=</span> <span class="n">n</span><span class="p">.</span><span class="n">cumsum</span><span class="p">()</span>
    <span class="n">labels</span> <span class="o">=</span> <span class="n">np</span><span class="p">.</span><span class="n">zeros</span><span class="p">(</span><span class="n">n</span><span class="p">.</span><span class="nb">sum</span><span class="p">(),</span> <span class="n">dtype</span><span class="o">=</span><span class="n">np</span><span class="p">.</span><span class="n">int64</span><span class="p">)</span>
    <span class="k">for</span> <span class="n">i</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="mi">1</span><span class="p">,</span> <span class="nb">len</span><span class="p">(</span><span class="n">n</span><span class="p">)):</span>
        <span class="n">labels</span><span class="p">[</span><span class="n">n_cumsum</span><span class="p">[</span><span class="n">i</span> <span class="o">-</span> <span class="mi">1</span><span class="p">]</span> <span class="p">:</span> <span class="n">n_cumsum</span><span class="p">[</span><span class="n">i</span><span class="p">]]</span> <span class="o">=</span> <span class="n">i</span>
    <span class="k">return</span> <span class="n">labels</span>
<span class="n">block_probs</span> <span class="o">=</span> <span class="n">get_feedforward_B</span><span class="p">(</span><span class="n">low_p</span><span class="p">,</span>
<span class="n">diag_p</span><span class="p">,</span>
<span class="n">feedforward_p</span><span class="p">,</span>
<span class="n">n_blocks</span><span class="o">=</span><span class="n">n_blocks</span><span class="p">)</span>
<span class="n">block_labels</span> <span class="o">=</span> <span class="n">get_block_labels</span><span class="p">(</span><span class="n">block_sizes</span><span class="p">)</span>
<span class="n">plt</span><span class="p">.</span><span class="n">figure</span><span class="p">(</span><span class="n">figsize</span><span class="o">=</span><span class="p">(</span><span class="mi">10</span><span class="p">,</span> <span class="mi">10</span><span class="p">))</span>
<span class="n">sns</span><span class="p">.</span><span class="n">heatmap</span><span class="p">(</span><span class="n">block_probs</span><span class="p">,</span> <span class="n">annot</span><span class="o">=</span><span class="bp">True</span><span class="p">,</span> <span class="n">cmap</span><span class="o">=</span><span class="s">"Reds"</span><span class="p">,</span> <span class="n">cbar</span><span class="o">=</span><span class="bp">False</span><span class="p">)</span>
<span class="n">plt</span><span class="p">.</span><span class="n">title</span><span class="p">(</span><span class="s">"Feedforward block probability matrix"</span><span class="p">);</span>
</code></pre></div></div>
<p><img src="/images/signal_flow_files/signal_flow_5_0.png" alt="png" /></p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kn">from</span> <span class="nn">graspy.simulations</span> <span class="kn">import</span> <span class="n">sbm</span>
<span class="kn">from</span> <span class="nn">graspy.plot</span> <span class="kn">import</span> <span class="n">heatmap</span>
<span class="n">A</span> <span class="o">=</span> <span class="n">sbm</span><span class="p">(</span><span class="n">block_sizes</span><span class="p">,</span> <span class="n">block_probs</span><span class="p">,</span> <span class="n">directed</span><span class="o">=</span><span class="bp">True</span><span class="p">,</span> <span class="n">loops</span><span class="o">=</span><span class="bp">False</span><span class="p">)</span>
<span class="n">heatmap</span><span class="p">(</span><span class="n">A</span><span class="p">,</span>
<span class="n">cbar</span><span class="o">=</span><span class="bp">False</span><span class="p">,</span>
<span class="n">title</span><span class="o">=</span><span class="s">"Perfect feedforward SBM"</span><span class="p">,</span>
<span class="n">inner_hier_labels</span><span class="o">=</span><span class="n">block_labels</span><span class="p">,</span>
<span class="n">hier_label_fontsize</span><span class="o">=</span><span class="mi">15</span><span class="p">);</span>
</code></pre></div></div>
<p><img src="/images/signal_flow_files/signal_flow_6_0.png" alt="png" /></p>
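<p>To make the sampling step concrete, here is a minimal NumPy-only sketch of what drawing a loopless, directed stochastic block model amounts to. The function name <code>sample_directed_sbm</code> is hypothetical and this is not graspy’s implementation; it only assumes that each potential edge is an independent Bernoulli draw with the probability given by the blocks of its two endpoints.</p>

```python
import numpy as np


def sample_directed_sbm(block_sizes, block_probs, rng):
    """Sample a loopless directed SBM adjacency matrix (illustrative sketch)."""
    # Block label of every node, e.g. [0, 0, 1, 1, 2, 2]
    labels = np.repeat(np.arange(len(block_sizes)), block_sizes)
    # P[i, j] is the probability of an edge from node i to node j
    P = np.asarray(block_probs)[np.ix_(labels, labels)]
    # Each potential edge is an independent Bernoulli draw
    A = (rng.random(P.shape) < P).astype(float)
    np.fill_diagonal(A, 0)  # no self-loops
    return A


rng = np.random.default_rng(8888)
B = np.diag(np.ones(2), k=1)  # 3 blocks, each projecting to the next with p = 1
A_small = sample_directed_sbm([2, 2, 2], B, rng)
print(A_small)
```

<p>With probabilities of exactly 0 and 1 the draw is deterministic, which is why the “perfect” network above is not really random.</p>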
<p><em>NB:</em> If you had been handed a shuffled adjacency matrix, it would be hard to tell it’s feedforward:</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">shuffle_inds</span> <span class="o">=</span> <span class="n">np</span><span class="p">.</span><span class="n">random</span><span class="p">.</span><span class="n">permutation</span><span class="p">(</span><span class="n">n_verts</span><span class="p">)</span>
<span class="n">heatmap</span><span class="p">(</span><span class="n">A</span><span class="p">[</span><span class="n">np</span><span class="p">.</span><span class="n">ix_</span><span class="p">(</span><span class="n">shuffle_inds</span><span class="p">,</span> <span class="n">shuffle_inds</span><span class="p">)],</span>
<span class="n">cbar</span><span class="o">=</span><span class="bp">False</span><span class="p">,</span>
<span class="n">title</span><span class="o">=</span><span class="s">"Shuffled feedforward SBM"</span><span class="p">);</span>
</code></pre></div></div>
<p><img src="/images/signal_flow_files/signal_flow_8_0.png" alt="png" /></p>
<h3 id="defining-an-energy-objective-function">Defining an energy (objective) function</h3>
<p>Define the energy function</p>
\[E(z) = \frac{1}{2} \sum_{i, j = 1}^n w_{ij} (z_i - z_j - \delta_{ij})^2\]
\[E(z) = \frac{1}{2} \sum_{i, j = 1}^n w_{ij} (z_i - z_j)^2 - \sum_{i, j = 1}^n w_{ij} \delta_{ij} (z_i - z_j) + \frac{1}{2} \sum_{i, j = 1}^n w_{ij} \delta_{ij}^2\]
<p>Let $E_0 = \frac{1}{2} \sum_{i, j = 1}^n w_{ij} \delta_{ij}^2$.</p>
<p>Note that $\frac{1}{2} \sum_{i, j = 1}^n w_{ij} (z_i - z_j)^2$ is related to the unnormalized graph Laplacian $L = D - W$, where $D$ is the diagonal matrix of node degrees:</p>
\[\frac{1}{2} \sum_{i, j = 1}^n w_{ij} (z_i - z_j)^2\]
\[\frac{1}{2} \sum_{i, j = 1}^n w_{ij} (z_i^2 - 2z_i z_j + z_j^2)\]
<p>Let $d_i = \sum_{j = 1}^n w_{ij}$, the degree of node $i$. Since $W$ is symmetric, this becomes</p>
\[\frac{1}{2} \sum_{i = 1}^n d_i z_i^2 - \sum_{i, j = 1}^n w_{ij} z_i z_j + \frac{1}{2} \sum_{j = 1}^n d_j z_j^2\]
\[\sum_{i = 1}^n d_i z_i^2 - \sum_{i, j = 1}^n w_{ij} z_i z_j\]
\[z^T D z - z^T W z\]
\[z^T (D - W) z\]
\[z^T L z\]
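<p>The identity above is easy to check numerically. The following is a small sketch on a randomly weighted graph; the matrix <code>A</code> and the vector <code>z</code> are made up purely for illustration.</p>

```python
import numpy as np

rng = np.random.default_rng(0)
n = 6

# Hypothetical small weighted digraph: nonnegative weights, no self-loops
A = rng.random((n, n))
np.fill_diagonal(A, 0)

W = (A + A.T) / 2            # symmetrized adjacency matrix
D = np.diag(W.sum(axis=1))   # diagonal degree matrix
L = D - W                    # unnormalized graph Laplacian

z = rng.normal(size=n)

# Left-hand side: the double sum over all (i, j) pairs
diffs = z[:, None] - z[None, :]          # diffs[i, j] = z_i - z_j
double_sum = 0.5 * np.sum(W * diffs**2)

# Right-hand side: the quadratic form z^T L z
quad_form = z @ L @ z

print(np.isclose(double_sum, quad_form))
```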
<p>Now, consider the term</p>
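\[\sum_{i, j = 1}^n w_{ij} \delta_{ij} (z_i - z_j)\]
\[\sum_{i, j = 1}^n w_{ij} \delta_{ij} z_i - \sum_{i, j = 1}^n w_{ij} \delta_{ij}z_j\]
<p>Swapping the index names in the second sum, and then using $w_{ji} = w_{ij}$ ($W$ is symmetric) and $\delta_{ji} = -\delta_{ij}$ ($\Delta$ is antisymmetric),</p>
\[\sum_{i, j = 1}^n w_{ij} \delta_{ij} z_i - \sum_{i, j = 1}^n w_{ji} \delta_{ji} z_i\]
\[\sum_{i, j = 1}^n w_{ij} \delta_{ij} z_i + \sum_{i, j = 1}^n w_{ij} \delta_{ij} z_i\]
<p>Define $b_i = \sum_{j=1}^n w_{ij}\delta_{ij}$. The above then reduces to \(z^Tb + z^Tb = 2z^Tb\), and since this term enters $E(z)$ with a negative sign, it contributes $-2z^Tb$. Putting the pieces together,</p>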
\[E(z) = \frac{1}{2} \sum_{i, j = 1}^n w_{ij} (z_i - z_j)^2 - \sum_{i, j = 1}^n w_{ij} \delta_{ij} (z_i - z_j) + \frac{1}{2} \sum_{i, j = 1}^n w_{ij} \delta_{ij}^2 = z^T L z - 2z^T b + E_0\]
<p>where $E_0$ is the remaining constant (which does not depend on $z$).</p>
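<p>As a sanity check on the algebra, the sketch below compares the double-sum definition of $E(z)$ with the matrix form $z^T L z - 2z^T b + E_0$. The graph, the antisymmetric matrix <code>Delta</code>, and the vector <code>z</code> are all arbitrary random choices made for illustration.</p>

```python
import numpy as np

rng = np.random.default_rng(1)
n = 5

A = rng.random((n, n))          # hypothetical weighted digraph
np.fill_diagonal(A, 0)
W = (A + A.T) / 2

M = rng.normal(size=(n, n))
Delta = M - M.T                 # any antisymmetric matrix will do here

L = np.diag(W.sum(axis=1)) - W
b = np.sum(W * Delta, axis=1)   # b_i = sum_j w_ij * delta_ij
E0 = 0.5 * np.sum(W * Delta**2)

z = rng.normal(size=n)
diffs = z[:, None] - z[None, :]  # diffs[i, j] = z_i - z_j

E_sum = 0.5 * np.sum(W * (diffs - Delta) ** 2)   # definition of E(z)
E_matrix = z @ L @ z - 2 * z @ b + E0            # matrix form

print(np.isclose(E_sum, E_matrix))
```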
<h3 id="solving-for-z">Solving for $z$</h3>
<p>Taking the derivative of $E(z)$,</p>
\[\frac{dE(z)}{dz} = \frac{d}{dz} (z^T L z - 2z^T b + E_0) = 2 L z - 2 b\]
<p>Setting equal to 0,</p>
\[L z = b\]
<p>$L$ is singular. To see this, recall that</p>
\[Lx = (D - W)x\]
<p>Let $x$ be the vector of all ones, then</p>
\[(Dx - Wx)_i = d_i - \sum_{j=1}^n w_{ij} = 0\]
<p>Thus $Lx = 0 = 0x$: the vector of all ones is an eigenvector of $L$ with eigenvalue $0$, and so $L$ is not invertible.</p>
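<p>A quick numerical illustration of this, sketched on a hypothetical five-node directed chain (chosen only because it is small and connected):</p>

```python
import numpy as np

n = 5
A = np.diag(np.ones(n - 1), k=1)   # directed chain 0 -> 1 -> ... -> 4
W = (A + A.T) / 2
L = np.diag(W.sum(axis=1)) - W

ones = np.ones(n)
print(L @ ones)                    # the all-ones vector is mapped to zero
print(np.linalg.matrix_rank(L))    # rank is n - 1, so L is singular
```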
<p>However, we can solve</p>
\[\arg\min_z \| Lz - b \|_2\]
<p>via the Moore-Penrose inverse of $L$, $L^\dagger$.</p>
\[z^* = L^\dagger b\]
<p>The Moore-Penrose inverse yields the <em>unique</em> solution to $\arg\min_z \| Lz - b \|_2$ <em>with minimum 2-norm</em>. However, there are many solutions, all of the form</p>
\[z^* = L^\dagger b + y\]
<p>where $y \in Null(L)$. What is in $Null(L)$? Because the graph has a single connected component, $Null(L)$ consists exactly of the scalar multiples of the vector of all ones, which we showed above to be in the null space. This means that all of the values of the signal flow vector $z$ could be shifted by a constant and the value of the objective function $E(z)$ would remain the same. Signal flow is not an absolute measure, but rather a measure of where a node lives in the graph relative to its peers.</p>
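<p>The shift invariance can be checked directly. In the sketch below the graph and <code>Delta</code> are again arbitrary random choices, and <code>rcond</code> is passed explicitly (my choice, not something the derivation requires) so that the numerically zero eigenvalue of $L$ is reliably treated as zero.</p>

```python
import numpy as np

rng = np.random.default_rng(2)
n = 6
A = rng.random((n, n))             # hypothetical dense weighted digraph (connected)
np.fill_diagonal(A, 0)
W = (A + A.T) / 2
M = rng.normal(size=(n, n))
Delta = M - M.T                    # arbitrary antisymmetric matrix

L = np.diag(W.sum(axis=1)) - W
b = np.sum(W * Delta, axis=1)


def energy(z):
    """E(z) from its double-sum definition."""
    diffs = z[:, None] - z[None, :]
    return 0.5 * np.sum(W * (diffs - Delta) ** 2)


z_star = np.linalg.pinv(L, rcond=1e-10) @ b   # minimum-norm solution
z_shifted = z_star + 3.0                      # add a multiple of the ones vector

print(np.isclose(energy(z_star), energy(z_shifted)))       # same energy
print(np.isclose(z_star.sum(), 0))                         # min-norm solution is centered
print(np.linalg.norm(z_star) < np.linalg.norm(z_shifted))  # and has the smaller norm
```

<p>The minimum-norm solution sums to zero because it is orthogonal to $Null(L)$, which is one natural way of fixing the arbitrary offset.</p>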
<h3 id="defining-delta">Defining $\Delta$</h3>
<p>What we have seen is that</p>
\[E(z) = \frac{1}{2} \sum_{i, j = 1}^n w_{ij} (z_i - z_j - \delta_{ij})^2\]
<p>is minimized by $z = L^\dagger b$, where $b$ is a vector such that $b_i = \sum_{j=1}^n w_{ij}\delta_{ij}$.</p>
<p>$\Delta$ can be whatever we choose, as long as it is antisymmetric. One choice, used in <em>Varshney et al. 2011</em>, is $\delta_{ij} = sgn(A_{ij} - A_{ji})$. This choice makes some intuitive sense.</p>
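<p>For a concrete feel for this choice, here is a tiny sketch on a made-up three-node weighted graph, computing $\Delta$ and the resulting vector $b$:</p>

```python
import numpy as np

# Hypothetical weights: 0 -> 1 (weight 2), 1 -> 0 (weight 1), 1 -> 2 (weight 3)
A = np.array([[0.0, 2.0, 0.0],
              [1.0, 0.0, 3.0],
              [0.0, 0.0, 0.0]])

W = (A + A.T) / 2
Delta = np.sign(A - A.T)          # +1 where i projects more strongly to j than j to i
b = np.sum(W * Delta, axis=1)     # b_i = sum_j w_ij * delta_ij

print(Delta)
print(b)
```

<p>Note that a pair of nodes with equal weights in both directions (including no edges at all) gets $\delta_{ij} = 0$, and that $\Delta$ built this way is antisymmetric by construction.</p>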
<h3 id="some-intuition">Some intuition</h3>
<p>Considering a single pair of nodes, $i, j$, the energy function looks like</p>
\[w_{ij} (z_i - z_j - sgn(A_{ij} - A_{ji}))^2\]
<p>If $A_{ij} > A_{ji}$, then node $i$ projects more strongly to $j$ than $j$ does to it. $sgn(A_{ij} - A_{ji})$ returns 1, and so the optimal configuration of $z_i$ and $z_j$ is to have $z_i = z_j + 1$.</p>
<p>$z$ is chosen by weighting each of these squared terms by the average projection weight $w_{ij}$ and finding the $z$ that minimizes their sum over all $i, j$ pairs.</p>
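<p>As a small worked example (my own, not taken from the paper), consider a two-node graph with a single unit-weight edge from node 1 to node 2, so that $w_{12} = w_{21} = \frac{1}{2}$, $\delta_{12} = 1$ and $\delta_{21} = -1$. The energy is then</p>
\[E(z) = \frac{1}{2} \left[ \frac{1}{2} (z_1 - z_2 - 1)^2 + \frac{1}{2} (z_2 - z_1 + 1)^2 \right] = \frac{1}{2} (z_1 - z_2 - 1)^2\]
<p>which is zero exactly when $z_1 = z_2 + 1$. Among all such minimizers, the one with minimum 2-norm is $z_1 = \frac{1}{2}$, $z_2 = -\frac{1}{2}$.</p>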
<h3 id="computing-the-solution">Computing the solution</h3>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">W</span> <span class="o">=</span> <span class="p">(</span><span class="n">A</span> <span class="o">+</span> <span class="n">A</span><span class="p">.</span><span class="n">T</span><span class="p">)</span> <span class="o">/</span> <span class="mi">2</span>
<span class="n">D</span> <span class="o">=</span> <span class="n">np</span><span class="p">.</span><span class="n">diag</span><span class="p">(</span><span class="n">np</span><span class="p">.</span><span class="nb">sum</span><span class="p">(</span><span class="n">W</span><span class="p">,</span> <span class="n">axis</span><span class="o">=</span><span class="mi">1</span><span class="p">))</span>
<span class="n">L</span> <span class="o">=</span> <span class="n">D</span> <span class="o">-</span> <span class="n">W</span>
<span class="n">b</span> <span class="o">=</span> <span class="n">np</span><span class="p">.</span><span class="nb">sum</span><span class="p">(</span><span class="n">W</span> <span class="o">*</span> <span class="n">np</span><span class="p">.</span><span class="n">sign</span><span class="p">(</span><span class="n">A</span> <span class="o">-</span> <span class="n">A</span><span class="p">.</span><span class="n">T</span><span class="p">),</span> <span class="n">axis</span><span class="o">=</span><span class="mi">1</span><span class="p">)</span>
<span class="n">L_dagger</span> <span class="o">=</span> <span class="n">np</span><span class="p">.</span><span class="n">linalg</span><span class="p">.</span><span class="n">pinv</span><span class="p">(</span><span class="n">L</span><span class="p">)</span> <span class="c1"># this is the Moore-Penrose inverse
</span><span class="n">z</span> <span class="o">=</span> <span class="n">L_dagger</span> <span class="o">@</span> <span class="n">b</span>
</code></pre></div></div>
<h3 id="examining-the-solution">Examining the solution</h3>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">plt</span><span class="p">.</span><span class="n">figure</span><span class="p">(</span><span class="n">figsize</span><span class="o">=</span><span class="p">(</span><span class="mi">10</span><span class="p">,</span> <span class="mi">5</span><span class="p">))</span>
<span class="k">for</span> <span class="n">b</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="n">n_blocks</span><span class="p">):</span>
    <span class="n">ax</span> <span class="o">=</span> <span class="n">sns</span><span class="p">.</span><span class="n">distplot</span><span class="p">(</span><span class="n">z</span><span class="p">[</span><span class="n">block_labels</span> <span class="o">==</span> <span class="n">b</span><span class="p">],</span>
                      <span class="n">label</span><span class="o">=</span><span class="n">b</span><span class="p">,</span>
                      <span class="n">kde</span><span class="o">=</span><span class="bp">False</span><span class="p">,</span>
                      <span class="n">bins</span><span class="o">=</span><span class="n">np</span><span class="p">.</span><span class="n">linspace</span><span class="p">(</span><span class="o">-</span><span class="mf">2.5</span><span class="p">,</span><span class="mf">2.5</span><span class="p">,</span><span class="mi">100</span><span class="p">),</span>
                      <span class="n">norm_hist</span><span class="o">=</span><span class="bp">False</span><span class="p">)</span>
<span class="n">ax</span><span class="p">.</span><span class="n">set_xlabel</span><span class="p">(</span><span class="sa">r</span><span class="s">"Signal flow ($z$)"</span><span class="p">)</span>
<span class="n">ax</span><span class="p">.</span><span class="n">set_yticks</span><span class="p">(())</span>
<span class="n">ax</span><span class="p">.</span><span class="n">set_ylabel</span><span class="p">(</span><span class="s">"Frequency"</span><span class="p">)</span>
<span class="n">ax</span><span class="p">.</span><span class="n">legend</span><span class="p">(</span><span class="n">bbox_to_anchor</span><span class="o">=</span><span class="p">(</span><span class="mf">1.0</span><span class="p">,</span> <span class="mi">1</span><span class="p">),</span> <span class="n">loc</span><span class="o">=</span><span class="mi">2</span><span class="p">,</span> <span class="n">borderaxespad</span><span class="o">=</span><span class="p">.</span><span class="mi">5</span><span class="p">,</span> <span class="n">title</span><span class="o">=</span><span class="s">"Block"</span><span class="p">);</span>
</code></pre></div></div>
<p><img src="/images/signal_flow_files/signal_flow_20_0.png" alt="png" /></p>
<p>For this perfect feedforward network, the signal flow of every node in a block is exactly 1 greater than that of the nodes in the block it projects to, as we hoped it would be.</p>
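<p>This can also be confirmed without sampling anything. The sketch below builds a small perfect feedforward adjacency matrix deterministically with a Kronecker product and checks the spacing of the block means. The helper name <code>signal_flow_sketch</code> and the explicit <code>rcond</code> are my own choices; the calculation itself follows the steps in “Computing the solution”.</p>

```python
import numpy as np


def signal_flow_sketch(A):
    """Signal flow via the pseudoinverse of the symmetrized graph Laplacian."""
    W = (A + A.T) / 2
    L = np.diag(W.sum(axis=1)) - W
    b = np.sum(W * np.sign(A - A.T), axis=1)
    # rcond set explicitly so the zero eigenvalue of L is treated as zero
    return np.linalg.pinv(L, rcond=1e-10) @ b


n_blocks, n_per_block = 5, 4
B = np.diag(np.ones(n_blocks - 1), k=1)               # block k projects to block k + 1
A = np.kron(B, np.ones((n_per_block, n_per_block)))   # every node to every node in next block
labels = np.repeat(np.arange(n_blocks), n_per_block)

z = signal_flow_sketch(A)
block_means = np.array([z[labels == k].mean() for k in range(n_blocks)])

print(np.round(block_means, 6))
print(np.round(np.diff(block_means), 6))   # spacing between consecutive blocks
```

<p>Up to floating point error, consecutive block means differ by 1 and the values are centered on zero, matching the histogram above.</p>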
<h3 id="what-if-the-graph-wasnt-perfectly-feedforward">What if the graph wasn’t perfectly feedforward?</h3>
<p>Here we sample a graph randomly (the first one wasn’t really random since the probability of feedforward was 1). We set the connection probability within a block to $p=0.2$ and to the next block to $p=0.3$.</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">low_p</span> <span class="o">=</span> <span class="mi">0</span> <span class="c1"># probability of random edges anywhere in the graph
</span><span class="n">diag_p</span> <span class="o">=</span> <span class="mf">0.2</span> <span class="c1"># probability of edges within a block
</span><span class="n">feedforward_p</span> <span class="o">=</span> <span class="mf">0.3</span> <span class="c1"># probability of edges projecting to the next block
</span>
<span class="n">block_probs</span> <span class="o">=</span> <span class="n">get_feedforward_B</span><span class="p">(</span><span class="n">low_p</span><span class="p">,</span>
<span class="n">diag_p</span><span class="p">,</span>
<span class="n">feedforward_p</span><span class="p">,</span>
<span class="n">n_blocks</span><span class="o">=</span><span class="n">n_blocks</span><span class="p">)</span>
<span class="n">A</span> <span class="o">=</span> <span class="n">sbm</span><span class="p">(</span><span class="n">block_sizes</span><span class="p">,</span> <span class="n">block_probs</span><span class="p">,</span> <span class="n">directed</span><span class="o">=</span><span class="bp">True</span><span class="p">,</span> <span class="n">loops</span><span class="o">=</span><span class="bp">False</span><span class="p">)</span>
<span class="n">heatmap</span><span class="p">(</span><span class="n">A</span><span class="p">,</span>
<span class="n">cbar</span><span class="o">=</span><span class="bp">False</span><span class="p">,</span>
<span class="n">title</span><span class="o">=</span><span class="s">"Less perfect feedforward SBM"</span><span class="p">,</span>
<span class="n">inner_hier_labels</span><span class="o">=</span><span class="n">block_labels</span><span class="p">,</span>
<span class="n">hier_label_fontsize</span><span class="o">=</span><span class="mi">15</span><span class="p">);</span>
</code></pre></div></div>
<p><img src="/images/signal_flow_files/signal_flow_23_0.png" alt="png" /></p>
<p>Let’s turn our signal flow calculation into an actual function, and use it on the graph above.</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">def</span> <span class="nf">calc_signal_flow</span><span class="p">(</span><span class="n">A</span><span class="p">):</span>
    <span class="n">W</span> <span class="o">=</span> <span class="p">(</span><span class="n">A</span> <span class="o">+</span> <span class="n">A</span><span class="p">.</span><span class="n">T</span><span class="p">)</span> <span class="o">/</span> <span class="mi">2</span>
    <span class="n">D</span> <span class="o">=</span> <span class="n">np</span><span class="p">.</span><span class="n">diag</span><span class="p">(</span><span class="n">np</span><span class="p">.</span><span class="nb">sum</span><span class="p">(</span><span class="n">W</span><span class="p">,</span> <span class="n">axis</span><span class="o">=</span><span class="mi">1</span><span class="p">))</span>
    <span class="n">L</span> <span class="o">=</span> <span class="n">D</span> <span class="o">-</span> <span class="n">W</span>
    <span class="n">b</span> <span class="o">=</span> <span class="n">np</span><span class="p">.</span><span class="nb">sum</span><span class="p">(</span><span class="n">W</span> <span class="o">*</span> <span class="n">np</span><span class="p">.</span><span class="n">sign</span><span class="p">(</span><span class="n">A</span> <span class="o">-</span> <span class="n">A</span><span class="p">.</span><span class="n">T</span><span class="p">),</span> <span class="n">axis</span><span class="o">=</span><span class="mi">1</span><span class="p">)</span>
    <span class="n">L_dagger</span> <span class="o">=</span> <span class="n">np</span><span class="p">.</span><span class="n">linalg</span><span class="p">.</span><span class="n">pinv</span><span class="p">(</span><span class="n">L</span><span class="p">)</span> <span class="c1"># this is the Moore-Penrose inverse
</span>    <span class="n">z</span> <span class="o">=</span> <span class="n">L_dagger</span> <span class="o">@</span> <span class="n">b</span>
    <span class="k">return</span> <span class="n">z</span>
<span class="n">z</span> <span class="o">=</span> <span class="n">calc_signal_flow</span><span class="p">(</span><span class="n">A</span><span class="p">)</span>
<span class="n">plt</span><span class="p">.</span><span class="n">figure</span><span class="p">(</span><span class="n">figsize</span><span class="o">=</span><span class="p">(</span><span class="mi">10</span><span class="p">,</span> <span class="mi">5</span><span class="p">))</span>
<span class="k">for</span> <span class="n">b</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="n">n_blocks</span><span class="p">):</span>
    <span class="n">ax</span> <span class="o">=</span> <span class="n">sns</span><span class="p">.</span><span class="n">distplot</span><span class="p">(</span><span class="n">z</span><span class="p">[</span><span class="n">block_labels</span> <span class="o">==</span> <span class="n">b</span><span class="p">],</span> <span class="n">label</span><span class="o">=</span><span class="n">b</span><span class="p">)</span>
<span class="n">ax</span><span class="p">.</span><span class="n">set_xlabel</span><span class="p">(</span><span class="sa">r</span><span class="s">"Signal flow ($z$)"</span><span class="p">)</span>
<span class="n">ax</span><span class="p">.</span><span class="n">set_yticks</span><span class="p">(())</span>
<span class="n">ax</span><span class="p">.</span><span class="n">set_ylabel</span><span class="p">(</span><span class="s">"Frequency"</span><span class="p">)</span>
<span class="n">ax</span><span class="p">.</span><span class="n">legend</span><span class="p">(</span><span class="n">bbox_to_anchor</span><span class="o">=</span><span class="p">(</span><span class="mf">1.0</span><span class="p">,</span> <span class="mi">1</span><span class="p">),</span> <span class="n">loc</span><span class="o">=</span><span class="mi">2</span><span class="p">,</span> <span class="n">borderaxespad</span><span class="o">=</span><span class="p">.</span><span class="mi">5</span><span class="p">,</span> <span class="n">title</span><span class="o">=</span><span class="s">"Block"</span><span class="p">);</span>
</code></pre></div></div>
<p><img src="/images/signal_flow_files/signal_flow_25_0.png" alt="png" /></p>
<h3 id="what-if-we-start-to-add-noise-to-the-feedforward-pattern">What if we start to add noise to the feedforward pattern?</h3>
<p>Here we keep the probability within block the same, as well as
the feedforward probability, but let any two nodes connect with
probability $p = 0.05$.</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">low_p</span> <span class="o">=</span> <span class="mf">0.05</span> <span class="c1"># probability of random edges anywhere in the graph
</span><span class="n">diag_p</span> <span class="o">=</span> <span class="mf">0.2</span> <span class="c1"># probability of edges within a block
</span><span class="n">feedforward_p</span> <span class="o">=</span> <span class="mf">0.3</span> <span class="c1"># probability of edges projecting to the next block
</span>
<span class="n">block_probs</span> <span class="o">=</span> <span class="n">get_feedforward_B</span><span class="p">(</span><span class="n">low_p</span><span class="p">,</span>
<span class="n">diag_p</span><span class="p">,</span>
<span class="n">feedforward_p</span><span class="p">,</span>
<span class="n">n_blocks</span><span class="o">=</span><span class="n">n_blocks</span><span class="p">)</span>
<span class="n">A</span> <span class="o">=</span> <span class="n">sbm</span><span class="p">(</span><span class="n">block_sizes</span><span class="p">,</span> <span class="n">block_probs</span><span class="p">,</span> <span class="n">directed</span><span class="o">=</span><span class="bp">True</span><span class="p">,</span> <span class="n">loops</span><span class="o">=</span><span class="bp">False</span><span class="p">)</span>
<span class="n">heatmap</span><span class="p">(</span><span class="n">A</span><span class="p">,</span>
<span class="n">cbar</span><span class="o">=</span><span class="bp">False</span><span class="p">,</span>
<span class="n">title</span><span class="o">=</span><span class="s">"Feedforward SBM with non-feedforward noise"</span><span class="p">,</span>
<span class="n">inner_hier_labels</span><span class="o">=</span><span class="n">block_labels</span><span class="p">,</span>
<span class="n">hier_label_fontsize</span><span class="o">=</span><span class="mi">15</span><span class="p">);</span>
<span class="n">z</span> <span class="o">=</span> <span class="n">calc_signal_flow</span><span class="p">(</span><span class="n">A</span><span class="p">)</span>
<span class="n">plt</span><span class="p">.</span><span class="n">figure</span><span class="p">(</span><span class="n">figsize</span><span class="o">=</span><span class="p">(</span><span class="mi">10</span><span class="p">,</span> <span class="mi">5</span><span class="p">))</span>
<span class="k">for</span> <span class="n">b</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="n">n_blocks</span><span class="p">):</span>
    <span class="n">ax</span> <span class="o">=</span> <span class="n">sns</span><span class="p">.</span><span class="n">distplot</span><span class="p">(</span><span class="n">z</span><span class="p">[</span><span class="n">block_labels</span> <span class="o">==</span> <span class="n">b</span><span class="p">],</span> <span class="n">label</span><span class="o">=</span><span class="n">b</span><span class="p">)</span>
<span class="n">ax</span><span class="p">.</span><span class="n">set_xlabel</span><span class="p">(</span><span class="sa">r</span><span class="s">"Signal flow ($z$)"</span><span class="p">)</span>
<span class="n">ax</span><span class="p">.</span><span class="n">set_yticks</span><span class="p">(())</span>
<span class="n">ax</span><span class="p">.</span><span class="n">set_ylabel</span><span class="p">(</span><span class="s">"Frequency"</span><span class="p">)</span>
<span class="n">ax</span><span class="p">.</span><span class="n">legend</span><span class="p">(</span><span class="n">bbox_to_anchor</span><span class="o">=</span><span class="p">(</span><span class="mf">1.0</span><span class="p">,</span> <span class="mi">1</span><span class="p">),</span> <span class="n">loc</span><span class="o">=</span><span class="mi">2</span><span class="p">,</span> <span class="n">borderaxespad</span><span class="o">=</span><span class="p">.</span><span class="mi">5</span><span class="p">,</span> <span class="n">title</span><span class="o">=</span><span class="s">"Block"</span><span class="p">);</span>
</code></pre></div></div>
<p><img src="/images/signal_flow_files/signal_flow_27_0.png" alt="png" /></p>
<p><img src="/images/signal_flow_files/signal_flow_27_1.png" alt="png" /></p>
<h3 id="but-what-if-we-had-more-data">But what if we had more data?</h3>
<p>Let’s make each block 10 times larger.</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">n_per_block</span> <span class="o">=</span> <span class="mi">500</span>
<span class="n">block_sizes</span> <span class="o">=</span> <span class="n">n_blocks</span><span class="o">*</span><span class="p">[</span><span class="n">n_per_block</span><span class="p">]</span>
<span class="n">n_verts</span> <span class="o">=</span> <span class="nb">sum</span><span class="p">(</span><span class="n">block_sizes</span><span class="p">)</span>
<span class="n">block_labels</span> <span class="o">=</span> <span class="n">get_block_labels</span><span class="p">(</span><span class="n">block_sizes</span><span class="p">)</span>
<span class="n">low_p</span> <span class="o">=</span> <span class="mf">0.05</span> <span class="c1"># probability of random edges anywhere in the graph
</span><span class="n">diag_p</span> <span class="o">=</span> <span class="mf">0.2</span> <span class="c1"># probability of edges within a block
</span><span class="n">feedforward_p</span> <span class="o">=</span> <span class="mf">0.3</span> <span class="c1"># probability of edges projecting to the next block
</span>
<span class="n">block_probs</span> <span class="o">=</span> <span class="n">get_feedforward_B</span><span class="p">(</span><span class="n">low_p</span><span class="p">,</span>
<span class="n">diag_p</span><span class="p">,</span>
<span class="n">feedforward_p</span><span class="p">,</span>
<span class="n">n_blocks</span><span class="o">=</span><span class="n">n_blocks</span><span class="p">)</span>
<span class="n">A</span> <span class="o">=</span> <span class="n">sbm</span><span class="p">(</span><span class="n">block_sizes</span><span class="p">,</span> <span class="n">block_probs</span><span class="p">,</span> <span class="n">directed</span><span class="o">=</span><span class="bp">True</span><span class="p">,</span> <span class="n">loops</span><span class="o">=</span><span class="bp">False</span><span class="p">)</span>
<span class="n">heatmap</span><span class="p">(</span><span class="n">A</span><span class="p">,</span>
<span class="n">cbar</span><span class="o">=</span><span class="bp">False</span><span class="p">,</span>
<span class="n">title</span><span class="o">=</span><span class="sa">f</span><span class="s">"Feedforward SBM with non-feedforward noise n=</span><span class="si">{</span><span class="n">n_verts</span><span class="si">}</span><span class="s">"</span><span class="p">,</span>
<span class="n">inner_hier_labels</span><span class="o">=</span><span class="n">block_labels</span><span class="p">,</span>
<span class="n">hier_label_fontsize</span><span class="o">=</span><span class="mi">15</span><span class="p">);</span>
<span class="n">z</span> <span class="o">=</span> <span class="n">calc_signal_flow</span><span class="p">(</span><span class="n">A</span><span class="p">)</span>
<span class="n">plt</span><span class="p">.</span><span class="n">figure</span><span class="p">(</span><span class="n">figsize</span><span class="o">=</span><span class="p">(</span><span class="mi">10</span><span class="p">,</span> <span class="mi">5</span><span class="p">))</span>
<span class="k">for</span> <span class="n">b</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="n">n_blocks</span><span class="p">):</span>
    <span class="n">ax</span> <span class="o">=</span> <span class="n">sns</span><span class="p">.</span><span class="n">distplot</span><span class="p">(</span><span class="n">z</span><span class="p">[</span><span class="n">block_labels</span> <span class="o">==</span> <span class="n">b</span><span class="p">],</span> <span class="n">label</span><span class="o">=</span><span class="n">b</span><span class="p">)</span>
<span class="n">ax</span><span class="p">.</span><span class="n">set_xlabel</span><span class="p">(</span><span class="sa">r</span><span class="s">"Signal flow ($z$)"</span><span class="p">)</span>
<span class="n">ax</span><span class="p">.</span><span class="n">set_yticks</span><span class="p">(())</span>
<span class="n">ax</span><span class="p">.</span><span class="n">set_ylabel</span><span class="p">(</span><span class="s">"Frequency"</span><span class="p">)</span>
<span class="n">ax</span><span class="p">.</span><span class="n">legend</span><span class="p">(</span><span class="n">bbox_to_anchor</span><span class="o">=</span><span class="p">(</span><span class="mf">1.0</span><span class="p">,</span> <span class="mi">1</span><span class="p">),</span> <span class="n">loc</span><span class="o">=</span><span class="mi">2</span><span class="p">,</span> <span class="n">borderaxespad</span><span class="o">=</span><span class="p">.</span><span class="mi">5</span><span class="p">,</span> <span class="n">title</span><span class="o">=</span><span class="s">"Block"</span><span class="p">);</span>
</code></pre></div></div>
<p><img src="/images/signal_flow_files/signal_flow_29_0.png" alt="png" /></p>
<p><img src="/images/signal_flow_files/signal_flow_29_1.png" alt="png" /></p>
<h3 id="how-does-this-look-with-more-blocks">How does this look with more blocks?</h3>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">n_per_block</span> <span class="o">=</span> <span class="mi">500</span>
<span class="n">n_blocks</span> <span class="o">=</span> <span class="mi">10</span>
<span class="n">block_sizes</span> <span class="o">=</span> <span class="n">n_blocks</span><span class="o">*</span><span class="p">[</span><span class="n">n_per_block</span><span class="p">]</span>
<span class="n">n_verts</span> <span class="o">=</span> <span class="nb">sum</span><span class="p">(</span><span class="n">block_sizes</span><span class="p">)</span>
<span class="n">block_labels</span> <span class="o">=</span> <span class="n">get_block_labels</span><span class="p">(</span><span class="n">block_sizes</span><span class="p">)</span>
<span class="n">low_p</span> <span class="o">=</span> <span class="mf">0.05</span> <span class="c1"># probability of random edges anywhere in the graph
</span><span class="n">diag_p</span> <span class="o">=</span> <span class="mf">0.2</span> <span class="c1"># probability of edges within a block
</span><span class="n">feedforward_p</span> <span class="o">=</span> <span class="mf">0.3</span> <span class="c1"># probability of edges projecting to the next block
</span>
<span class="n">block_probs</span> <span class="o">=</span> <span class="n">get_feedforward_B</span><span class="p">(</span><span class="n">low_p</span><span class="p">,</span>
<span class="n">diag_p</span><span class="p">,</span>
<span class="n">feedforward_p</span><span class="p">,</span>
<span class="n">n_blocks</span><span class="o">=</span><span class="n">n_blocks</span><span class="p">)</span>
<span class="n">A</span> <span class="o">=</span> <span class="n">sbm</span><span class="p">(</span><span class="n">block_sizes</span><span class="p">,</span> <span class="n">block_probs</span><span class="p">,</span> <span class="n">directed</span><span class="o">=</span><span class="bp">True</span><span class="p">,</span> <span class="n">loops</span><span class="o">=</span><span class="bp">False</span><span class="p">)</span>
<span class="n">heatmap</span><span class="p">(</span><span class="n">A</span><span class="p">,</span>
<span class="n">cbar</span><span class="o">=</span><span class="bp">False</span><span class="p">,</span>
<span class="n">title</span><span class="o">=</span><span class="sa">f</span><span class="s">"Feedforward SBM with non-feedforward noise n=</span><span class="si">{</span><span class="n">n_verts</span><span class="si">}</span><span class="s">"</span><span class="p">,</span>
<span class="n">inner_hier_labels</span><span class="o">=</span><span class="n">block_labels</span><span class="p">,</span>
<span class="n">hier_label_fontsize</span><span class="o">=</span><span class="mi">15</span><span class="p">);</span>
<span class="n">z</span> <span class="o">=</span> <span class="n">calc_signal_flow</span><span class="p">(</span><span class="n">A</span><span class="p">)</span>
<span class="n">plt</span><span class="p">.</span><span class="n">figure</span><span class="p">(</span><span class="n">figsize</span><span class="o">=</span><span class="p">(</span><span class="mi">10</span><span class="p">,</span> <span class="mi">5</span><span class="p">))</span>
<span class="k">for</span> <span class="n">b</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="n">n_blocks</span><span class="p">):</span>
    <span class="n">ax</span> <span class="o">=</span> <span class="n">sns</span><span class="p">.</span><span class="n">distplot</span><span class="p">(</span><span class="n">z</span><span class="p">[</span><span class="n">block_labels</span> <span class="o">==</span> <span class="n">b</span><span class="p">],</span> <span class="n">label</span><span class="o">=</span><span class="n">b</span><span class="p">)</span>
<span class="n">ax</span><span class="p">.</span><span class="n">set_xlabel</span><span class="p">(</span><span class="sa">r</span><span class="s">"Signal flow ($z$)"</span><span class="p">)</span>
<span class="n">ax</span><span class="p">.</span><span class="n">set_yticks</span><span class="p">(())</span>
<span class="n">ax</span><span class="p">.</span><span class="n">set_ylabel</span><span class="p">(</span><span class="s">"Frequency"</span><span class="p">)</span>
<span class="n">ax</span><span class="p">.</span><span class="n">legend</span><span class="p">(</span><span class="n">bbox_to_anchor</span><span class="o">=</span><span class="p">(</span><span class="mf">1.0</span><span class="p">,</span> <span class="mi">1</span><span class="p">),</span> <span class="n">loc</span><span class="o">=</span><span class="mi">2</span><span class="p">,</span> <span class="n">borderaxespad</span><span class="o">=</span><span class="p">.</span><span class="mi">5</span><span class="p">,</span> <span class="n">title</span><span class="o">=</span><span class="s">"Block"</span><span class="p">);</span>
</code></pre></div></div>
<p><img src="/images/signal_flow_files/signal_flow_31_0.png" alt="png" /></p>
<p><img src="/images/signal_flow_files/signal_flow_31_1.png" alt="png" /></p>
<p>The first and last blocks appear clearly separated in signal flow, and blocks 1 and 8 (second from first and second from last in the feedforward path, respectively) are somewhat separated as well. I still need to investigate how to recover separable Gaussians like those seen before when the number of blocks in the feedforward pathway grows and the “off-feedforward” noise is modest.</p>
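<p>One way to make “clearly separated” concrete is to compare the block means of the signal flow against the within-block spread. The sketch below is not part of the original analysis: <code>block_separation</code> is a hypothetical helper, the d′ index (difference in means divided by the pooled standard deviation) is just one assumed choice of separation measure, and the demo uses synthetic values standing in for the <code>z</code> returned by <code>calc_signal_flow</code>.</p>

```python
import numpy as np


def block_separation(z, block_labels):
    """Summarize signal flow per block and the gap between consecutive blocks.

    Hypothetical helper: returns the block ids, per-block means, per-block
    standard deviations, and a d' index for each pair of consecutive blocks.
    """
    z = np.asarray(z, dtype=float)
    block_labels = np.asarray(block_labels)
    blocks = np.unique(block_labels)
    means = np.array([z[block_labels == b].mean() for b in blocks])
    stds = np.array([z[block_labels == b].std(ddof=1) for b in blocks])
    # d' between consecutive blocks: |difference in means| / pooled std
    pooled = np.sqrt((stds[:-1] ** 2 + stds[1:] ** 2) / 2)
    d_prime = np.abs(np.diff(means)) / pooled
    return blocks, means, stds, d_prime


# Synthetic stand-in for z (NOT output of calc_signal_flow): three blocks of
# 200 nodes each, drawn with means 2, 0, -2 and unit standard deviation.
rng = np.random.default_rng(0)
demo_labels = np.repeat(np.arange(3), 200)
demo_z = rng.normal(loc=np.array([2.0, 0.0, -2.0])[demo_labels], scale=1.0)

blocks, means, stds, d_prime = block_separation(demo_z, demo_labels)
for b, m, s in zip(blocks, means, stds):
    print(f"block {b}: mean={m:.2f}, std={s:.2f}")
print("d' between consecutive blocks:", np.round(d_prime, 2))
```

<p>With the variables from the cells above, the call would be <code>block_separation(z, block_labels)</code>. Values of d′ near zero for the middle blocks would indicate that their signal flow distributions overlap heavily, while larger values would indicate separation.</p>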
<h3 id="references">References</h3>
<p>[1] Varshney, Lav R., et al. “Structural properties of the Caenorhabditis elegans neuronal network.” PLoS computational biology 7.2 (2011): e1001066.</p>
<p>[2] Carmel, Liran, David Harel, and Yehuda Koren. “Combining hierarchy and energy for drawing directed graphs.” IEEE Transactions on Visualization and Computer Graphics 10.1 (2004): 46-57.</p>
<p>[3] Von Luxburg, Ulrike. “A tutorial on spectral clustering.” Statistics and computing 17.4 (2007): 395-416.</p>Benjamin D. Pedigobpedigo@jhu.eduThe following is some derivation of the “signal flow” calculation for a directed network used in Varshney et al. 2011 [1], as well as my own implementation and some simple simulations to attempt to understand this function better.