<h1><a href="http://www.stefanmesken.info/machine%20learning/tensorflow-without-tears">TensorFlow Without Tears</a></h1>
<p>Stefan Mesken, 2019-06-11</p>
<p>Life is pretty crazy right now. Finishing a PhD, changing from set theory to applied data science, forming connections to other data scientists, learning as much about machine learning (both theory and practice) as I possibly can, finding a new apartment and preparing to move, dealing with German bureaucracy… This doesn’t leave a lot of spare time. Truth be told, it doesn’t leave any.</p>
<p>I’m not complaining – the last couple of months have been immensely enjoyable (well, except for the bureaucracy part). But being pressed for time also made me decide to postpone my planned blog series on the importance of probability distributions in machine learning as I’m currently lacking the spare mental capacity to do this rather broad, open-ended topic justice.</p>
<h1 id="tensorflow-2">TensorFlow 2</h1>
<p>Instead, to finally return the ball of machine learning conversation to my readers, I’d like to share my excitement about <a href="https://www.tensorflow.org/beta/">TensorFlow 2</a>. The beta of TF2 was announced 4 days ago and, while I didn’t take part in the alpha at all, I set aside a bit of time over the weekend to play around with the newly released beta. And it’s been incredibly exciting!</p>
<p>See, what often sets me apart in a group of people is that I really like to get down to the nuts and bolts of anything I lay my hands on. As a young child I disassembled pretty much every electronic device in our household, I had worked on <a href="https://en.wikipedia.org/wiki/Trabant">charmingly simple cars</a> by the time I graduated from kindergarten, and I spent most of my childhood in the workshop of my grandfather. This hands-on, detail-focused, borderline obsessive personality trait never left me.</p>
<p>And it’s precisely what made me fall in love with TF1 when I first discovered it. TF1, at least to me, isn’t really a machine learning framework at heart. Admittedly, it has a bunch of utilities built into it that are useful in machine learning applications, but what TF1 does at its core is to provide a framework for the manipulation of dataflow graphs. This philosophy makes TF1 very versatile, performant and transparent. But it also sets up a learning curve that, especially to someone without a background in mathematics or computer science, looks more like a learning wall. And like so many other learning walls (vi anyone?), it naturally turns away a lot of people. Especially those people that I’ve benefited the most from during my entire life: People who, unlike me, don’t care much about bolts and nuts but who naturally focus much more on the greater picture, who provide a sense of style, a desire for polish that benefits the entire community.</p>
<p><a href="https://keras.io/">Keras</a> has done a lot to attract a more diverse group of users to TensorFlow 1. To quote their <a href="https://keras.io/why-use-keras/">FAQ</a>:</p>
<blockquote>
<p><strong>Keras prioritizes developer experience</strong></p>
<ul>
<li>Keras is an API designed for human beings, not machines. Keras follows best practices for reducing cognitive load: it offers consistent & simple APIs, it minimizes the number of user actions required for common use cases, and it provides clear and actionable feedback upon user error.</li>
<li>This makes Keras easy to learn and easy to use. As a Keras user, you are more productive, allowing you to try more ideas than your competition, faster – which in turn helps you win machine learning competitions.</li>
<li>This ease of use does not come at the cost of reduced flexibility: because Keras integrates with lower-level deep learning languages (in particular TensorFlow), it enables you to implement anything you could have built in the base language. In particular, as tf.keras, the Keras API integrates seamlessly with your TensorFlow workflows.</li>
</ul>
</blockquote>
<p>This amazing body of work has enabled TF2 to pull off what so rarely seems attainable: By embracing Keras’ philosophy, TF2 became a piece of software that is both user-friendly and still provides all the functionality, all the performance and all the transparency any user could ask for. This almost never happens but when it does, it makes me really, really happy.</p>
<h1 id="hello-im-tensorflow">Hello, I’m TensorFlow.</h1>
<p>In an effort to invite as many people as possible to share my excitement about TF2 and join the discussion, I’d like to end this post with a very simple regression task completed in TensorFlow 2. We aim to approximate the function <script type="math/tex">\sin \colon [0, 2 \pi] \to [-1, 1], x \mapsto \sin (x)</script> with a neural network. You can find the <a href="https://colab.research.google.com/drive/1b1Lil2bfpT--axFKgd2z-2LeIkCv0z5Y">interactive Jupyter notebook over at Google Colaboratory</a> or just continue reading for a slightly more verbose version.</p>
<p>The first part of any machine learning project is to load our dependencies.</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="err">!</span><span class="n">pip</span> <span class="n">install</span> <span class="n">tf</span><span class="o">-</span><span class="n">nightly</span><span class="o">-</span><span class="mf">2.0</span><span class="o">-</span><span class="n">preview</span>
<span class="kn">import</span> <span class="nn">tensorflow</span> <span class="k">as</span> <span class="n">tf</span>
<span class="kn">import</span> <span class="nn">numpy</span> <span class="k">as</span> <span class="n">np</span>
<span class="kn">import</span> <span class="nn">matplotlib.pyplot</span> <span class="k">as</span> <span class="n">plt</span>
</code></pre></div></div>
<p>Next, we load our data. In this case, to keep things as simple as possible, we will simply create the sine function on the interval <script type="math/tex">[0, 2 \pi]</script> and approximate it.</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">x</span> <span class="o">=</span> <span class="n">np</span><span class="o">.</span><span class="n">arange</span><span class="p">(</span><span class="mi">0</span><span class="p">,</span> <span class="mi">2</span> <span class="o">*</span> <span class="n">np</span><span class="o">.</span><span class="n">pi</span><span class="p">,</span> <span class="mi">2</span><span class="o">*</span><span class="n">np</span><span class="o">.</span><span class="n">pi</span> <span class="o">/</span> <span class="mi">10000</span><span class="p">)</span>
<span class="n">y</span> <span class="o">=</span> <span class="n">np</span><span class="o">.</span><span class="n">sin</span><span class="p">(</span><span class="n">x</span><span class="p">)</span>
</code></pre></div></div>
<p>When possible, it’s always a good idea to look at a visual representation of your data. This serves as a baseline sanity check to ensure there is no immediately obvious problem.</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">plt</span><span class="o">.</span><span class="n">plot</span><span class="p">(</span><span class="n">x</span><span class="p">,</span><span class="n">y</span><span class="p">)</span>
</code></pre></div></div>
<figure style="align: center">
<img src="/images/diagrams/sine.png" style="max-width: 400px;" alt="graph of the sine function" />
</figure>
<p>That looks okay, so let’s continue by building our neural network. The integration of Keras makes this very simple:</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c"># Build a neural network with 2 hidden layers, each of size 128</span>
<span class="n">model</span> <span class="o">=</span> <span class="n">tf</span><span class="o">.</span><span class="n">keras</span><span class="o">.</span><span class="n">models</span><span class="o">.</span><span class="n">Sequential</span><span class="p">([</span>
<span class="n">tf</span><span class="o">.</span><span class="n">keras</span><span class="o">.</span><span class="n">layers</span><span class="o">.</span><span class="n">Dense</span><span class="p">(</span><span class="mi">128</span><span class="p">,</span> <span class="n">activation</span><span class="o">=</span><span class="s">'relu'</span><span class="p">,</span> <span class="n">input_shape</span><span class="o">=</span><span class="p">[</span><span class="mi">1</span><span class="p">]),</span>
<span class="n">tf</span><span class="o">.</span><span class="n">keras</span><span class="o">.</span><span class="n">layers</span><span class="o">.</span><span class="n">Dense</span><span class="p">(</span><span class="mi">128</span><span class="p">,</span> <span class="n">activation</span><span class="o">=</span><span class="s">'relu'</span><span class="p">),</span>
<span class="n">tf</span><span class="o">.</span><span class="n">keras</span><span class="o">.</span><span class="n">layers</span><span class="o">.</span><span class="n">Dense</span><span class="p">(</span><span class="mi">1</span><span class="p">,</span> <span class="n">activation</span><span class="o">=</span><span class="s">'linear'</span><span class="p">)</span>
<span class="p">])</span>
</code></pre></div></div>
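<p>To get a feel for how large this network is, its trainable parameters can be counted by hand. The following is a minimal sketch that is not part of the original notebook; <code>dense_params</code> is a helper name introduced here purely for illustration. Each dense layer has one weight per input/unit pair plus one bias per unit.</p>

```python
def dense_params(inputs: int, units: int) -> int:
    """Number of weights plus biases in a fully connected layer."""
    return inputs * units + units


# Layer widths of the model above: 1 input, two hidden layers of 128, 1 output.
layer_sizes = [1, 128, 128, 1]

# One entry per Dense layer, pairing each width with the next one.
per_layer = [dense_params(a, b) for a, b in zip(layer_sizes, layer_sizes[1:])]
total_params = sum(per_layer)

print(per_layer, total_params)
```

<p>The total should agree with the parameter count that <code>model.summary()</code> reports for the network.</p>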
<p>Compiling the model is similarly trivial.</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">model</span><span class="o">.</span><span class="nb">compile</span><span class="p">(</span><span class="n">optimizer</span><span class="o">=</span><span class="s">'adam'</span><span class="p">,</span>
<span class="n">loss</span><span class="o">=</span><span class="s">'mean_squared_error'</span><span class="p">)</span>
</code></pre></div></div>
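<p>The <code>'mean_squared_error'</code> loss averages the squared difference between target and prediction. As a rough illustration in plain Python (a sketch written for this post, not Keras' actual implementation), here is the loss of a deliberately bad "model" that always predicts zero, evaluated on eight evenly spaced points of the sine curve:</p>

```python
import math


def mean_squared_error(y_true, y_pred):
    """Average of the squared differences between targets and predictions."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


# Eight evenly spaced sample points on [0, 2*pi).
xs = [2 * math.pi * i / 8 for i in range(8)]
targets = [math.sin(x) for x in xs]

# A constant-zero prediction serves as a naive baseline.
baseline_loss = mean_squared_error(targets, [0.0] * len(xs))
print(baseline_loss)
```

<p>A trained network should end up far below such a baseline.</p>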
<p>Finally, let’s fit the neural network to our training data.</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">model</span><span class="o">.</span><span class="n">fit</span><span class="p">(</span><span class="n">x</span><span class="p">,</span><span class="n">y</span><span class="p">,</span> <span class="n">epochs</span><span class="o">=</span><span class="mi">5</span><span class="p">)</span>
</code></pre></div></div>
<blockquote>
<p>Train on 10000 samples</p>
<p>Epoch 1/5
10000/10000 [==============================] - 1s 72us/sample - loss: 0.1698</p>
<p>Epoch 2/5
10000/10000 [==============================] - 1s 61us/sample - loss: 0.0904</p>
<p>Epoch 3/5
10000/10000 [==============================] - 1s 58us/sample - loss: 0.0415</p>
<p>Epoch 4/5
10000/10000 [==============================] - 1s 53us/sample - loss: 0.0109</p>
<p>Epoch 5/5
10000/10000 [==============================] - 1s 52us/sample - loss: 0.0023</p>
<p>&lt;tensorflow.python.keras.callbacks.History at 0x7fa30305c5f8&gt;</p>
</blockquote>
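<p>Because the loss is a mean squared error, its square root gives a typical prediction error in the same units as <script type="math/tex">\sin(x)</script>. A small sketch, using the per-epoch losses copied from the log above:</p>

```python
import math

# Per-epoch training losses (mean squared error) from the training log.
epoch_losses = [0.1698, 0.0904, 0.0415, 0.0109, 0.0023]

# Root mean squared error: the typical deviation from the true sine value.
epoch_rmse = [math.sqrt(loss) for loss in epoch_losses]

for epoch, (loss, rmse) in enumerate(zip(epoch_losses, epoch_rmse), start=1):
    print(f"epoch {epoch}: mse={loss:.4f} rmse={rmse:.3f}")
```

<p>After the fifth epoch the typical error is a bit under 0.05, on a function whose values range over <script type="math/tex">[-1, 1]</script>.</p>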
<p>That’s it! We’ve created and successfully trained a neural network to approximate the sine function on <script type="math/tex">[0, 2 \pi]</script> and the training output seems encouraging. All that’s left to do is to visualize how well we’ve done:</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">pred</span> <span class="o">=</span> <span class="n">model</span><span class="o">.</span><span class="n">predict</span><span class="p">(</span><span class="n">x</span><span class="p">)</span>
<span class="n">plt</span><span class="o">.</span><span class="n">plot</span><span class="p">(</span><span class="n">x</span><span class="p">,</span><span class="n">pred</span><span class="p">,</span> <span class="n">label</span><span class="o">=</span><span class="s">'model prediction'</span><span class="p">)</span>
<span class="n">plt</span><span class="o">.</span><span class="n">legend</span><span class="p">()</span>
</code></pre></div></div>
<figure style="align: center">
<img src="/images/diagrams/since_prediction.png" style="max-width: 400px;" alt="prediction of the sine function" />
</figure>
<p>Not bad, not bad at all! The interval <script type="math/tex">[5,2 \pi]</script> needs some work (and I encourage you to think about why our model performed so much worse on that section) but other than that, I’d say that our model does know how to create a sine wave.</p>
<p>I’ve decided to give a few hints as to why our model performs worse on the tail end of our data set: Consider the following animation of the learning process</p>
<figure style="align: center">
<img src="/images/diagrams/adam_animation.gif" style="max-width: 400px;" alt="animation of the learning process" />
</figure>
<p>and keep in mind that we are using the ‘Adam’ optimizer.</p>
<h1><a href="http://www.stefanmesken.info/machine%20learning/how-to-beat-kaggle-(the-easy-way)">How To Beat Kaggle (the Easy Way)</a></h1>
<p>Stefan Mesken, 2019-05-25</p>
<p>A few nights ago, I found myself tinkering with the <a href="https://www.kaggle.com/c/titanic">Titanic data set on Kaggle</a> and couldn’t help but notice the number of people with a <a href="https://www.kaggle.com/c/titanic/leaderboard">perfect score</a> – many of whom have a single entry.</p>
<p>So, I thought to myself: “Clearly, they must be cheating. But how do you cheat efficiently?”</p>
<h2 id="a-mathematicians-perspective">A Mathematician’s Perspective</h2>
<p>The Kaggle competition ‘Titanic: Machine Learning from Disaster’ (and in fact any classification competition on Kaggle) can be modelled as follows:</p>
<p>Kaggle asks you to find a secret <script type="math/tex">n</script>-dimensional vector <script type="math/tex">\vec{k} = (k_1, k_2, \ldots, k_n)</script> (with <script type="math/tex">n=418</script> for the Titanic data set) where each number <script type="math/tex">k_i</script> is either <script type="math/tex">0</script> (the passenger with ID <script type="math/tex">i</script> didn’t survive) or <script type="math/tex">1</script> (the passenger with ID <script type="math/tex">i</script> did survive). So all we really have to do is to guess <script type="math/tex">\vec{k}</script> – no need for fancy machine learning techniques!</p>
<p>There are <script type="math/tex">2^n</script> many possible values for <script type="math/tex">\vec{k}</script> and guessing all of them would be practically impossible. Fortunately, there’s one more ingredient to Kaggle that we can exploit: If we submit a guess <script type="math/tex">\vec{g} = (g_1, \ldots, g_n)</script> to Kaggle, it will return a score – the number of correct guesses, that is, the number of indices <script type="math/tex">i</script> such that <script type="math/tex">g_i = k_i</script>.</p>
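<p>To make the setup concrete, here is a toy stand-in for Kaggle's scoring, written as a minimal sketch. The function name <code>score</code> and the five-passenger <code>secret</code> are made up for illustration; the real leaderboard reports the same information as a fraction of correct entries.</p>

```python
def score(guess, secret):
    """Number of positions in which the guess agrees with the secret vector."""
    if len(guess) != len(secret):
        raise ValueError("guess and secret must have the same length")
    return sum(g == k for g, k in zip(guess, secret))


# Hypothetical secret survival vector for five passengers.
secret = (0, 1, 1, 0, 1)

all_zero_score = score((0, 0, 0, 0, 0), secret)
perfect_score = score(secret, secret)
print(all_zero_score, perfect_score)
```

<p>Note that the number of wrong entries, <script type="math/tex">n</script> minus the score, is exactly the number of positions in which guess and secret differ.</p>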
<h2 id="a-naive-approach">A Naive Approach</h2>
<p>This allows for a naive solution in (at most) <script type="math/tex">n+1</script> many guesses: First guess <script type="math/tex">\vec{g}_0 = (0,0, \ldots, 0)</script> resulting in a score of correct entries <script type="math/tex">s_0</script> and then, for each <script type="math/tex">i = 1, \ldots, n</script> guess <script type="math/tex">\vec{g}_i</script> – the binary vector with only one <script type="math/tex">1</script> in position <script type="math/tex">i</script>. Let <script type="math/tex">s_i</script> be the resulting score. If <script type="math/tex">s_i > s_0</script>, then the <script type="math/tex">i</script>th entry of <script type="math/tex">\vec{k}</script> must be a <script type="math/tex">1</script>. Otherwise it is a <script type="math/tex">0</script>.</p>
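<p>The naive strategy can be sketched in a few lines. The <code>Oracle</code> class below is a hypothetical stand-in for Kaggle that also counts submissions; it is an assumption made for this example and not a real API.</p>

```python
class Oracle:
    """Toy replacement for the leaderboard: scores guesses and counts them."""

    def __init__(self, secret):
        self._secret = tuple(secret)
        self.submissions = 0

    def score(self, guess):
        self.submissions += 1
        return sum(g == k for g, k in zip(guess, self._secret))


def naive_attack(oracle, n):
    """Recover the secret with n + 1 submissions, one bit per submission."""
    baseline = oracle.score([0] * n)  # s_0: score of the all-zero guess
    recovered = []
    for i in range(n):
        guess = [0] * n
        guess[i] = 1  # the binary vector with a single 1 in position i
        # The score rises above s_0 exactly when the secret has a 1 here.
        recovered.append(1 if oracle.score(guess) > baseline else 0)
    return recovered


secret = [0, 1, 1, 0, 1, 0, 0, 1]
oracle = Oracle(secret)
recovered = naive_attack(oracle, len(secret))
print(recovered, oracle.submissions)
```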
<p>While this is certainly possible to pull off in practice <sup id="fnref:1"><a href="#fn:1" class="footnote">1</a></sup>, it <em>does not</em> satisfy my urge for efficiency.</p>
<p>Can we do better?</p>
<p>Indeed! But making significant progress will require some work.</p>
<h2 id="insert-graph-theory">Insert Graph Theory</h2>
<h3 id="definition">Definition</h3>
<p>Let <script type="math/tex">G = (V;E)</script> be an undirected <a href="https://en.wikipedia.org/wiki/Graph_(discrete_mathematics)">graph</a>. Let <script type="math/tex">u,v \in V</script> be vertices. The <em>distance of <script type="math/tex">v</script> and <script type="math/tex">u</script></em> in <script type="math/tex">G</script> (in symbols <script type="math/tex">d_G(v,u)</script>) is the length of a shortest path of edges in <script type="math/tex">G</script> that connects <script type="math/tex">v</script> and <script type="math/tex">u</script>. If no such path exists, we let <script type="math/tex">d_G(v,u) := \infty</script>. <sup id="fnref:2"><a href="#fn:2" class="footnote">2</a></sup></p>
<figure>
<img src="/images/diagrams/distance.png" alt="distance of two nodes in a graph" />
<figcaption>A shortest path of length 2 between u and v</figcaption>
</figure>
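<p>In an unweighted graph this distance can be computed with a breadth-first search. The sketch below assumes the graph is given as a dictionary of adjacency lists; the vertex names in <code>example_graph</code> are invented for illustration.</p>

```python
import math
from collections import deque


def graph_distance(adjacency, source, target):
    """Length of a shortest path from source to target, or infinity if none exists."""
    distances = {source: 0}
    queue = deque([source])
    while queue:
        vertex = queue.popleft()
        if vertex == target:
            return distances[vertex]
        for neighbour in adjacency[vertex]:
            if neighbour not in distances:
                distances[neighbour] = distances[vertex] + 1
                queue.append(neighbour)
    return math.inf


# u and v are joined by a path of two edges through a; z is isolated.
example_graph = {"u": ["a"], "a": ["u", "v"], "v": ["a"], "z": []}

print(graph_distance(example_graph, "u", "v"))
print(graph_distance(example_graph, "u", "z"))
```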
<h3 id="definition-1">Definition</h3>
<p>Let <script type="math/tex">G = (V;E)</script> be an undirected graph and let <script type="math/tex">R \subseteq V</script> be a set of vertices. <em><script type="math/tex">R = \{r_1, \ldots, r_k \}</script> resolves <script type="math/tex">G</script></em> if every vertex <script type="math/tex">v \in V</script> is uniquely determined by its vector of distances to members of <script type="math/tex">R</script>. To put it in mathematical terms: <script type="math/tex">R</script> resolves <script type="math/tex">G</script> if the function</p>
<p>\[
d_R \colon V \to [0, \infty]^k, v \mapsto (d_G(v,r_1), d_G(v,r_2), \ldots, d_G(v,r_k))
\]</p>
<p>is <a href="https://en.wikipedia.org/wiki/Injective_function">injective</a>.</p>
<p>Note that <script type="math/tex">R = V = \{ v_1, \ldots, v_n \}</script> trivially resolves <script type="math/tex">G</script> as every <script type="math/tex">v \in V</script> is uniquely determined by <script type="math/tex">i \in \{1, \ldots, n \}</script> with <script type="math/tex">d_G(v,v_i) = 0</script> since in this case we have <script type="math/tex">v = v_i</script>.</p>
<p>Consider the following example:</p>
<figure>
<img src="/images/diagrams/resolving_set.png" alt="a resolving set of size 3 for H(3)" />
<figcaption>A resolving set of size 3 for H(3)</figcaption>
</figure>
<p>Here, the vertex <script type="math/tex">(0,1,1)</script> is uniquely identified by having distance <script type="math/tex">2</script> to <script type="math/tex">r_1 = (0,0,0)</script>, distance <script type="math/tex">3</script> to <script type="math/tex">r_2 = (1,0,0)</script> and distance <script type="math/tex">2</script> to <script type="math/tex">r_3 = (1,1,0)</script>. In other words, <script type="math/tex">(0,1,1)</script> is the unique vertex with distance vector <script type="math/tex">d_R(0,1,1) = (2,3,2)</script> to the highlighted resolving set.</p>
<p>I encourage the reader to check that <script type="math/tex">R = \{ (0,0,0), (1,0,0), (1,1,0) \}</script> is indeed a resolving set for <script type="math/tex">H(3)</script> to get a better feel for this rather abstract concept before moving on.</p>
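<p>For readers who would rather let a computer do the checking, here is a small sketch. It relies on the fact that the graph distance between two binary vectors in <script type="math/tex">H(3)</script> equals the number of coordinates in which they differ; the function names are chosen for this example.</p>

```python
from itertools import product


def hamming_distance(u, v):
    """Number of coordinates in which two binary vectors differ."""
    return sum(a != b for a, b in zip(u, v))


def resolves(R, n):
    """True if every vertex of H(n) has a distinct distance vector to R."""
    seen = set()
    for vertex in product((0, 1), repeat=n):
        distance_vector = tuple(hamming_distance(vertex, r) for r in R)
        if distance_vector in seen:
            return False  # two vertices share a distance vector
        seen.add(distance_vector)
    return True


R = [(0, 0, 0), (1, 0, 0), (1, 1, 0)]
print(resolves(R, 3))
print(tuple(hamming_distance((0, 1, 1), r) for r in R))
```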
<h3 id="definition-2">Definition</h3>
<p>Let <script type="math/tex">G = (V;E)</script> be an undirected graph. The <em><a href="https://en.wikipedia.org/wiki/Metric_dimension_(graph_theory)">metric dimension</a> of <script type="math/tex">G</script></em> is the minimal size of some <script type="math/tex">R \subseteq V</script> that resolves <script type="math/tex">G</script>.</p>
<p>Returning to our example <script type="math/tex">H(3)</script>: We’ve already found a resolving set of size <script type="math/tex">3</script> and a bit of computation confirms that there is no resolving set of size <script type="math/tex">2</script>. Therefore the metric dimension of <script type="math/tex">H(3)</script> is <script type="math/tex">3</script>.</p>
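<p>The "bit of computation" can be done by brute force for very small <script type="math/tex">n</script>. The sketch below tries all vertex subsets in order of increasing size, again using the number of differing coordinates as the distance in <script type="math/tex">H(n)</script>. It is only meant for tiny cases, since the number of subsets grows extremely quickly.</p>

```python
from itertools import combinations, product


def hamming_distance(u, v):
    """Number of coordinates in which two binary vectors differ."""
    return sum(a != b for a, b in zip(u, v))


def metric_dimension(n):
    """Smallest size of a resolving set of H(n), together with one such set."""
    vertices = list(product((0, 1), repeat=n))
    for size in range(1, len(vertices) + 1):
        for R in combinations(vertices, size):
            vectors = {tuple(hamming_distance(v, r) for r in R) for v in vertices}
            if len(vectors) == len(vertices):  # all distance vectors distinct
                return size, R
    raise ValueError("unreachable: the full vertex set always resolves")


dimension, witness = metric_dimension(3)
print(dimension, witness)
```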
<p>It is now time to meet the hero of our advanced guessing strategy.</p>
<h3 id="definition-3">Definition</h3>
<p>The <em><script type="math/tex">n</script>-dimensional <a href="https://en.wikipedia.org/wiki/Hamming_graph">Hamming graph</a></em> is the undirected graph <script type="math/tex">H(n) = (V;E)</script> whose vertices are all <script type="math/tex">n</script>-dimensional binary vectors <script type="math/tex">\vec{v} = (v_1, \ldots, v_n)</script> and in which two vertices are joined by an edge, <script type="math/tex">\{\vec{v},\vec{u}\} \in E</script>, iff they differ in exactly one position, i.e.</p>
<p>\[
E = \{ \{ \vec{v}, \vec{u} \} \mid \{i \mid v_i \neq u_i \} \text{ has size 1 } \}.
\]</p>
<p>If you look back to the diagram of <script type="math/tex">H(3)</script>, you will see that this is indeed the <script type="math/tex">3</script>-dimensional Hamming graph. Its vertices are binary vectors of length <script type="math/tex">3</script> and they are connected via a single edge if and only if they differ in exactly one coordinate.</p>
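<p>As a sanity check, the following sketch builds <script type="math/tex">H(3)</script> from this definition and computes all distances from a given vertex by breadth-first search. Comparing them with the number of differing coordinates shows that both notions of distance coincide. The function names are chosen for this example.</p>

```python
from collections import deque
from itertools import product


def hamming_graph(n):
    """Adjacency lists of H(n): each neighbour is obtained by flipping one coordinate."""
    return {
        v: [v[:i] + (1 - v[i],) + v[i + 1:] for i in range(n)]
        for v in product((0, 1), repeat=n)
    }


def bfs_distances(adjacency, source):
    """Graph distance from source to every vertex reachable from it."""
    distances = {source: 0}
    queue = deque([source])
    while queue:
        vertex = queue.popleft()
        for neighbour in adjacency[vertex]:
            if neighbour not in distances:
                distances[neighbour] = distances[vertex] + 1
                queue.append(neighbour)
    return distances


H3 = hamming_graph(3)
# Every edge appears in two adjacency lists, hence the division by 2.
edge_count = sum(len(neighbours) for neighbours in H3.values()) // 2
print(len(H3), edge_count)
```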
<h2 id="a-graph-theoretic-guessing-strategy">A Graph Theoretic Guessing Strategy</h2>
<p>Here is our graph theoretic guessing strategy: Let <script type="math/tex">R</script> be a small resolving set for <script type="math/tex">H(n)</script>. Submit each <script type="math/tex">\vec{r} \in R</script> as a guess to Kaggle which sends back its score <script type="math/tex">s(\vec{r})</script>. If <script type="math/tex">s(\vec{r}) = n</script> for some <script type="math/tex">\vec{r}</script>, we’ve achieved a perfect score and are done with this competition. Otherwise, since <script type="math/tex">R</script> is a resolving set, there is a unique vertex <script type="math/tex">\vec{g}</script> in <script type="math/tex">H(n)</script> such that</p>
<p>\[
d_{H(n)}(\vec{g},\vec{r}) = n - s(\vec{r})
\]</p>
<p>for all <script type="math/tex">\vec{r} \in R</script>.</p>
<p>However, by the construction of <script type="math/tex">H(n)</script>, we have <script type="math/tex">d_{H(n)} (\vec{k},\vec{r}) = n - s(\vec{r})</script> (where <script type="math/tex">\vec{k}</script> is Kaggle’s secret solution vector) for all <script type="math/tex">\vec{r} \in R</script>.</p>
<p>Since <script type="math/tex">\vec{g}</script> is the <em>unique</em> vector with this property, we must have that <script type="math/tex">\vec{g} = \vec{k}</script> is the desired solution. This strategy therefore guarantees a perfect score in at most <script type="math/tex">\mathrm{size}(R) + 1</script> many submissions!</p>
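<p>Here is a minimal sketch of the whole strategy on <script type="math/tex">H(3)</script>, using the resolving set from the example above. The decoding step simply searches all <script type="math/tex">2^n</script> vertices, which is an assumption made to keep the example short and is only feasible for tiny <script type="math/tex">n</script>; the <code>submit</code> callable stands in for Kaggle.</p>

```python
from itertools import product


def hamming_distance(u, v):
    """Number of coordinates in which two binary vectors differ."""
    return sum(a != b for a, b in zip(u, v))


def recover_secret(submit, R, n):
    """Submit every member of R, then find the unique vertex matching the scores."""
    scores = [submit(r) for r in R]
    wanted = tuple(n - s for s in scores)  # distances of the secret to R
    matches = [
        v for v in product((0, 1), repeat=n)
        if tuple(hamming_distance(v, r) for r in R) == wanted
    ]
    if len(matches) != 1:
        raise ValueError("R does not resolve H(n) or the scores are inconsistent")
    return matches[0]


R = [(0, 0, 0), (1, 0, 0), (1, 1, 0)]
secret = (0, 1, 1)


def submit(guess):
    """Toy scoring function: number of entries that agree with the secret."""
    return len(secret) - hamming_distance(guess, secret)


recovered = recover_secret(submit, R, 3)
print(recovered)
```

<p>Submitting <code>recovered</code> as the final guess then yields the perfect score, for a total of <script type="math/tex">\mathrm{size}(R) + 1</script> submissions.</p>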
<p>Now, because <script type="math/tex">H(n)</script> has a metric dimension of <script type="math/tex">\frac{(2+ o(1))n}{\log_2(n)}</script>, we will be able to crack the Titanic dataset in <script type="math/tex">\sim 97</script> <sup id="fnref:3"><a href="#fn:3" class="footnote">3</a></sup> submissions and furthermore, if we commit to submitting all but the final guess at once, this is known to be an optimal guessing strategy.</p>
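<p>The estimate for the Titanic data set comes from plugging <script type="math/tex">n = 418</script> into the leading term of this formula. The sketch below ignores the <script type="math/tex">o(1)</script> term, which is an assumption; the true metric dimension of <script type="math/tex">H(418)</script> may differ from this value.</p>

```python
import math

n = 418  # number of passengers in the Titanic test set

# Leading term 2n / log2(n) of the asymptotic metric dimension of H(n).
leading_term = 2 * n / math.log2(n)

# One extra submission for the final, correct guess.
estimated_submissions = round(leading_term) + 1

print(leading_term, estimated_submissions)
```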
<h2 id="confessions">Confessions</h2>
<p>While this graph theoretic approach to beat Kaggle, from a purely mathematical point of view, is very pleasing, it doesn’t seem that practical after all. For instance, while the metric dimension of <script type="math/tex">H(n)</script> is known asymptotically, <a href="https://mathoverflow.net/questions/332434/explicit-small-resolving-sets-for-hamming-graphs">it’s not clear to me how to find small resolving sets for <script type="math/tex">H(n)</script> in practice</a>. Furthermore, even if you were given a small resolving set <script type="math/tex">R</script>, calculating <script type="math/tex">\vec{k}</script> from <script type="math/tex">\{ d_{H(n)}(\vec{k}, \vec{r}) \mid \vec{r} \in R \}</script> requires some serious computational power. Granted, it doesn’t increase the number of guesses required, the metric I’ve chosen to optimize for, but it still results in so much computational overhead that the naive approach will win out.</p>
<p>What’s worse: We are not taking full advantage of Kaggle’s feedback to our guesses. When guessing the entries of our resolving set one by one, we don’t use the information gained to guide our future guesses. Instead we could just as well have submitted them all in one go, collected the scores and then computed the correct solution. If we add this further restriction, the solution presented here is optimal. However, without this artificial restriction, I suspect that a better performing, general solution should be possible and I’d very much like to find one.</p>
<p>In practice, if you want to do better than this graph theoretic guessing strategy, I’d suggest cooking up a reasonably well-performing model. You then adapt the naive approach by prioritizing those bits that your model is least certain about. This, if I had to guess, is what people actually do to obtain perfect scores. Finally, to get a perfect score with a single entry, just calculate the solution on a different account and, once you’ve obtained it, submit it on your main account.</p>
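<p>One way to read this suggestion is sketched below. It is a guess at how such an attack might look, not a description of what anyone actually does: start from the model's prediction, flip one bit at a time in order of increasing confidence, and keep a flip only if the score improves. The names <code>guided_attack</code>, <code>prediction</code> and <code>confidence</code> are hypothetical.</p>

```python
def guided_attack(submit, prediction, confidence):
    """Correct a model's prediction bit by bit, least confident bits first.

    Returns the corrected vector and the number of submissions used.
    """
    n = len(prediction)
    current = list(prediction)
    current_score = submit(current)
    submissions = 1
    for i in sorted(range(n), key=lambda j: confidence[j]):
        if current_score == n:
            break  # the last submission already achieved a perfect score
        current[i] = 1 - current[i]
        new_score = submit(current)
        submissions += 1
        if new_score > current_score:
            current_score = new_score  # the model was wrong here: keep the flip
        else:
            current[i] = 1 - current[i]  # the model was right: undo the flip
    return current, submissions


secret = [1, 0, 1, 1, 0, 0, 1, 0]
prediction = [1, 0, 0, 1, 0, 1, 1, 0]  # wrong in positions 2 and 5
confidence = [0.9, 0.8, 0.55, 0.95, 0.7, 0.6, 0.85, 0.99]


def submit(guess):
    """Toy scoring function: number of entries that agree with the secret."""
    return sum(g == k for g, k in zip(guess, secret))


solution, used = guided_attack(submit, prediction, confidence)
print(solution, used)
```

<p>If the model's errors sit among its least confident predictions, as in this toy example, only a handful of submissions are needed; in the worst case the procedure degrades to roughly the naive approach.</p>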
<p>From Kaggle’s point of view, at least in the case of the Titanic dataset, this attack vector is pretty much impossible to defend against. For larger datasets, on the other hand, there certainly is a lot they can do to prevent successful guessing strategies. That, however, is a story for another day…</p>
<h2 id="sources">Sources</h2>
<ul>
<li><a href="https://mathoverflow.net/questions/58600">Math Overflow. Guessing a subset of <script type="math/tex">\{1, \ldots, N \}</script></a></li>
<li><a href="https://epubs.siam.org/doi/pdf/10.1137/1.9781611975482.74">Jiang, Polyanskii. How to guess an <script type="math/tex">n</script>-digit number</a></li>
<li><a href="https://arxiv.org/pdf/1712.02723.pdf">Jiang, Polyanskii. On the metric dimension of Cartesian powers of a graph</a></li>
<li><a href="https://arxiv.org/pdf/math/0507527.pdf">Caceres, Hernando, Mora, Pelayo, Puertas, Seara and Wood. On the metric dimension of Cartesian products of graphs</a></li>
<li>Bernt Lindström. On a combinatory detection problem. I (Magyar Tud. Akad. Mat. Kutato Int. Közl., 9:195–207, 1964)</li>
<li><a href="https://mathoverflow.net/questions/332434/explicit-small-resolving-sets-for-hamming-graphs">Math Overflow. Explicit, small resolving sets for Hamming graphs</a></li>
</ul>
<div class="footnotes">
<ol>
<li id="fn:1">
<p>Kaggle’s daily submission limit poses only the tiniest of hurdles to any programmer… <a href="#fnref:1" class="reversefootnote">↩</a></p>
</li>
<li id="fn:2">
<p><script type="math/tex">d_G</script> is the usual <a href="https://en.wikipedia.org/wiki/Distance_(graph_theory)">graph metric</a> <a href="#fnref:2" class="reversefootnote">↩</a></p>
</li>
<li id="fn:3">
<p>The exact number depends on the precise metric dimension of <script type="math/tex">H(418)</script> which I do not yet know at the time of writing this post. All I know for certain is that it is <script type="math/tex">\le 418</script>. <a href="#fnref:3" class="reversefootnote">↩</a></p>
</li>
</ol>
</div>