Add coding posts

Joseph Weston authored on 28/03/2021 06:12:17
Showing 17 changed files

config.toml index 67b95aa..00e456e 100644
content/posts/coding/adaptive-vs-parallel.md index 0000000..e68ce60
content/posts/coding/gcc-bug.md index 0000000..fecb100
content/posts/coding/git-rebase.md index 0000000..23d23ac
content/posts/coding/google-sheets-auth.md index 0000000..de90c60
content/posts/coding/haskell-fizzbuzz.md index 0000000..32de188
content/posts/coding/haskell-snake.md index 0000000..bec6b88
content/posts/coding/haskell-snake2.md index 0000000..4ecd77d
content/posts/coding/haskell.md index 0000000..585b8ba
content/posts/coding/isolating-docker-containers.md index 0000000..e39203a
content/posts/coding/kwant-tutorial.md index 0000000..b59b108
content/posts/coding/markov-chain-decrypter.md index 0000000..7857fd4
content/posts/coding/postscript.md index 0000000..50e57fe
content/posts/coding/stop-squashing-your-commits.md index 0000000..1caa32d
content/posts/coding/stop-teaching-git-pull.md index 0000000..904e0d2
content/posts/coding/think-differently.md index 0000000..1f1d14a
content/posts/coding/trolling-physicists.md index 0000000..f301cfc

@@ -81,17 +81,17 @@ type = "application/rss+xml"
                        url = "about"
                      [[menu.main]]
                        name = "Contact"
                     -  weight = 1
                     +  weight = 2
                        url = "contact"
                     -# [[menu.main]]
                     -#   name = "Blog"
                     -#   weight = 1
                     -#   url  = "posts"
                     +[[menu.main]]
                     +  name = "Blog"
                     +  weight = 3
                     +  url  = "posts"
                      [[menu.main]]
                        name = "Publications"
                     -  weight = 1
                     +  weight = 4
                        url  = "publications"
                      [[menu.main]]
                        name = "CV"
                     -  weight = 2
                     +  weight = 5
                        url  = "cv.pdf"

content/posts/coding/adaptive-vs-parallel.md

History View file @ 8bfcbb8

                     new file mode 100644
@@ -0,0 +1,24 @@
                     +---
                     +title: Adaptive vs. Parallel computation
                     +date: 2017-11-08
                     +tags:
                     +  - coding
                     +draft: true
                     +---
+                    +
                     +Often we will run simulations for several values of a parameter.
                     +Typically we want to *sweep* over a parameter space, and look for *features* in
                     +the simulation results (e.g. when the simulated quantity changes abruptly).
+                    +
                     +Homogeneous sampling (evaluating on a regular grid) is simple but kind of dumb. We are using a computer -- can't
                     +we do better?
+                    +
                     +Yes we can! We can try and sample in an *adaptive* manner, that is, we choose points
                     +in "interesting" regions of parameter space, by inspecting the values of the function
                     +that we have evaluated thus far.
+                    +
                     +There are of course challenges to making a "good" adaptive sampler, but essentially
                     +any problems that people have with any particular method can all be summarized as
                     +"my idea of what constitutes an 'interesting region of parameter space' differs
                     +from yours". This is nevertheless an interesting discussion, and will probably appear
                     +in the form of a blog post at a later stage.
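
To make the idea concrete, here is a minimal sketch of one possible adaptive sampler in one dimension. It is an illustration only: the notion of "interesting" used here (the interval whose endpoints are furthest apart in the (x, y) plane) is an assumption, as are the names `adaptive_sample` and `step`. The post does not prescribe any particular method.

```python
import math


def adaptive_sample(f, a, b, n_points):
    """Sample f on [a, b], repeatedly bisecting the most 'interesting' interval.

    The interest of an interval is taken to be the Euclidean distance
    between its two endpoints in the (x, y) plane (an assumed criterion).
    """
    xs = [a, b]
    ys = [f(a), f(b)]
    while len(xs) < n_points:
        losses = [
            math.hypot(xs[i + 1] - xs[i], ys[i + 1] - ys[i])
            for i in range(len(xs) - 1)
        ]
        # Index of the interval with the largest loss
        i = max(range(len(losses)), key=losses.__getitem__)
        x_new = 0.5 * (xs[i] + xs[i + 1])
        xs.insert(i + 1, x_new)
        ys.insert(i + 1, f(x_new))
    return xs, ys


def step(x):
    """Hypothetical test function with an abrupt change near x = 0.3."""
    return math.tanh(50 * (x - 0.3))


xs, ys = adaptive_sample(step, 0.0, 1.0, 41)
# Count how many of the 41 points landed close to the abrupt change
near_feature = sum(1 for x in xs if 0.2 <= x <= 0.4)
print(f"{near_feature} of {len(xs)} points lie in [0.2, 0.4]")
```

A homogeneous grid of 41 points would put only about a fifth of them in that window; with this criterion most of the points cluster around the jump. A different criterion would cluster them elsewhere, which is exactly the disagreement described above.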

content/posts/coding/gcc-bug.md

                     new file mode 100644
@@ -0,0 +1,120 @@
                     +---
                     +title: Tracking down bugs in GCC
                     +date: 2016-03-01
                     +tags:
                     +  - coding
                     +  - C
                     +---
+                    +
                     +I *think* I may have found a bug
                     +in [msp430-gcc][mspgcc], which is the GNU C compiler for the MSP430 series of
                     +microcontrollers. While I was
                     +hacking on a tiny event loop to power the devices in my personal intranet of
                     +things I discovered that something was going a bit crazy.
                     +Cracking out [mspdebug][mspdebug] I noticed that at a certain point control was
                     +jumping to a seemingly random location in memory that did not have valid
                     +instructions, causing the microcontroller to reset. Weird! Where was this
                     +happening, and why?
+                    +
                     +I managed to track the problem down to the main event loop that pops events off
                     +a FIFO and acts on them. An "event" in this context is a two-element C-struct
                     +consisting of a function pointer and a data pointer. Even when I provided a
                     +perfectly valid function pointer, my code was still jumping to an arbitrary
                     +position in memory and resetting. The plot thickens; it looks like I'm going to
                     +have to get my hands dirty and dig around a bit in the generated assembly!
                     +After some more back-and-forth between my C source and the assembly I
                     +managed to construct a minimal example that illustrates the problem that I
                     +am having:
+                    +
                     +```C
                     +// problem_test.c
                     +typedef struct {
                     +    void (*function)(void*) ;
                     +    void *data ;
                     +} event_t ;
+                    +
                     +extern void placeholder(event_t*) ;
                     +extern void test_function(void*) ;
+                    +
                     +int main(void) {
                     +    event_t e ;
                     +    e.function = test_function ;  // set to valid function pointer
                     +    e.data = (void*) 0x03 ;  // arbitrary data
                     +    placeholder(&e) ;  // prevent everything from being optimised away
                     +    e.function(e.data) ;
                     +}
                     +```
+                    +
                     +When the above code is compiled with optimisations disabled it produces correct
                     +output. The output of `msp430-gcc -O0 -S -c problem_test.c` is shown below.
                     +For clarity I have removed the assembler directives and have added in-line
                     +comments.
+                    +
                     +```nasm
                     +main:
                     +; stack setup and allocation of space for `event_t e`
                     +mov r1, r4
                     +add #2, r4
                     +sub #4, r1
                     +; `e.function = test_function`
                     +mov #test_function, -6(r4)
                     +; `e.data = 0x03`
                     +mov #3, -4(r4)
                     +; call `placeholder(&e)`
                     +mov r4, r15
                     +add #llo(-6), r15
                     +call    #placeholder
                     +; call `e.function(e.data)`
                     +mov -6(r4), r14
                     +mov -4(r4), r15
                     +call    r14
                     +; de-allocate stack space for `e`
                     +add #4, r1
                     +```
+                    +
                     +This code is correct. However, if we now enable optimisations, compiling
                     +with `msp430-gcc -O1 -S -c problem_test.c` (`-O1` and `-O2`
                     +produce the same output for the above C code), we get the following
                     +assembly:
+                    +
                     +```nasm
                     +main:
                     +; allocate space for `event_t e` on the stack
                     +sub #4, r1
                     +; `e.function = test_function`
                     +mov #test_function, @r1
                     +; `e.data = 0x03`
                     +mov #3, 2(r1)
                     +; call `placeholder(&e)`
                     +mov r1, r15
                     +call    #placeholder
                     +; move `e.data` into r15
                     +mov 2(r1), r15
                     +; ??? call `e.data(e.data)` ???
                     +call    2(r1)
                     +; de-allocate stack space for `e`
                     +add #4, r1
                     +```
+                    +
                     +The second- and third-to-last instructions are the most important ones.
                     +We know that `r1` points to the top of the stack, and so the values
                     +of `e.function` and `e.data` can be found with `0(r1)` and `2(r1)`
                     +respectively, as each is a pointer, and hence 2 bytes wide on the
                     +MSP430 architecture. Despite this we clearly see that there is
                     +a `call 2(r1)` -- the program is going to jump to the address in
                     +`e.data` and start executing the data it finds there as if they
                     +were machine code! Clearly for sufficiently arbitrary data we will
                     +very quickly run into something that is not a valid machine instruction
                     +and the microcontroller will reset.
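
The offset argument can be spelled out with a toy model of the stack frame. This is only a sketch of the memory layout described above, not a model of the MSP430 itself; the addresses `0xC000` and `0x0400` are made-up values chosen for illustration.

```python
# Toy model of the stack frame after `sub #4, r1`:
# two 16-bit words, one at r1 and one at r1 + 2.
TEST_FUNCTION_ADDR = 0xC000  # hypothetical address of `test_function`

r1 = 0x0400  # hypothetical stack pointer value
memory = {
    r1 + 0: TEST_FUNCTION_ADDR,  # e.function, written by `mov #test_function, @r1`
    r1 + 2: 0x0003,              # e.data, written by `mov #3, 2(r1)`
}

expected_target = memory[r1 + 0]  # where `e.function(e.data)` should jump
actual_target = memory[r1 + 2]    # where the emitted `call 2(r1)` jumps
argument = memory[r1 + 2]         # r15, loaded by `mov 2(r1), r15`

print(f"expected call target: {expected_target:#06x}")
print(f"actual call target:   {actual_target:#06x}")
```

In this model the argument is loaded from the right slot, but the call target is read from the slot holding `e.data` rather than `e.function`.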
+                    +
                     +So, it appears that we have found the source of the problem, although it
                     +is still not clear why the wrong offsets are calculated when optimisations
                     +are enabled; I will submit a bug report when I have a moment.
                     +As a workaround I noticed that if I use a global variable for the
                     +`event_t` then everything works correctly, even with optimisations enabled.
                     +Luckily for my actual use case this is a viable option, so I will be
                     +able to keep working until a fix is released.
+                    +
+                    +
                     +[mspgcc]: http://www.ti.com/tool/msp430-gcc-opensource
                     +[mspdebug]: https://github.com/dlbeer/mspdebug

content/posts/coding/git-rebase.md

                     new file mode 100644
@@ -0,0 +1,133 @@
                     +---
                     +title: How I learned to stop worrying and love the rebase
                     +date: 2018-10-20
                     +tags:
                     +  - coding
                     +  - git
                     +---
+                    +
                     +I've been using Git for nearly ten years now. Ten years is a long time, and I've been able to try
                     +different approaches and evaluate how effective they are in my workflow. I've also had the opportunity to
                     +teach Git to others; both to colleagues in an informal environment, and to students in the more structured
                     +environment of the Casimir graduate school programming course. This experience has given me the chance to reflect on the
                     +Git workflow and how best to use the tool.
+                    +
                     +There's one question in particular which often comes up among people who have used Git for a while, and
                     +there never seems to be any consensus on how to use it properly: `git rebase`.
+                    +
                     +## What is `rebase`?
+                    +
                     +Let's start with a quick recap of what `git rebase` does for us. Let's say that we're developing a new
                     +feature on an aptly-named branch:
+                    +
                     +                                              ◯—◯ ← feature
                     +                                             ╱
                     +                                        ◯—◯—◯ ← master
+                    +
                     +We then pull in some changes from master, so that the histories for the master and feature
                     +branches are now divergent:
+                    +
                     +                                              ◯—◯ ← feature
                     +                                             ╱
                     +                                        ◯—◯—◯—◯—◯ ← master
+                    +
                     +Now, if the changes made on `master` were made to the same places in the same files as the
                     +changes on `feature`, then we know that when we finally merge our feature branch we're going
                     +to get conflicts. It's a general rule that the longer that you leave a branch un-merged, the
                     +more likely it is that you are going to get conflicts. Generally, while we're developing on
                     +`feature` we're going to want to incorporate the changes from `master` every so often, so
                     +that we don't have to deal with all the merge conflicts at once during the final merge.
                     +At this point we have 2 options for incorporating the changes from `master`:
+                    +
                     +                                          ◯—◯—◯ ← feature      ╮
                     +                                         ╱   ╱                 │ merge
                     +                                    ◯—◯—◯—◯—◯ ← master         ╯
+                    +
                     +                                              ◯—◯ ← feature    ╮
                     +                                             ╱                 │ rebase
                     +                                    ◯—◯—◯—◯—◯ ← master         ╯
+                    +
                     +See what we did? Rebase allows us to "chop" the link attaching the base
                     +of the `feature` branch and re-attach it (re-*base* geddit?) to the commit
                     +where `master` is pointing now.
+                    +
                     +Then we add a couple more commits and merge:
+                    +
                     +                                      ◯—◯—◯—◯—◯ ← feature      ╮
                     +                                     ╱   ╱     ╲               │ merge
                     +                                ◯—◯—◯—◯—◯———————◯ ← master     ╯
+                    +
                     +                                          ◯—◯—◯—◯ ← feature    ╮
                     +                                         ╱       ╲             │ rebase
                     +                                ◯—◯—◯—◯—◯—————————◯ ← master   ╯
+                    +
                     +Using `rebase` in this way allows us to maintain an almost-linear history (i.e. we could
                     +always fast-forward when merging instead of creating an explicit merge commit), which makes
                     +it easier to understand what we've done.
+                    +
                     +### Interactive `rebase`
+                    +
                     +The above usage of rebase is pretty uncontentious; you start to get divided opinions when
                     +you start talking about *interactive rebase*, which allows us to rewrite history in more
                     +exotic ways. For example, we can use interactive rebase to re-order commits or squash them
                     +together:
+                    +
                     +                                              A B C D
                     +                                              ◯—◯—◯—◯ ← feature
                     +                                             ╱
                     +                                    ◯—◯—◯—◯—◯ ← master
+                    +
                     +                                              C' B' A+D
                     +                                              ◯——◯———◯ ← feature
                     +                                             ╱
                     +                                    ◯—◯—◯—◯—◯ ← master
+                    +
                     +Developing is an inherently iterative process; your understanding of a problem evolves
                     +as you work on the solution. This means that the logical separation of ideas may not
                     +become apparent until *after* the fact. Git rebase can help us express the *logical*
                     +set of changes, rather than the (convoluted) set of changes as they actually happened.
+                    +
                     +### So what's the problem?
+                    +
                     +Rebase *rewrites history*. Each git commit contains a pointer to the parent commit(s), so
                     +when we rebase a set of commits they won't hash to the same values as they did before the
                     +rebase, even though the *changeset* may be the same.
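
A minimal sketch of why the hashes change, using a deliberately simplified commit model. Real Git hashes more fields (tree, author, committer, message); here a commit id depends only on the parent id and a changeset string, and the names `commit_id` and `build_chain` are invented for this example.

```python
import hashlib


def commit_id(parent_id, changeset):
    """Hash a simplified commit: the parent pointer plus the changes."""
    payload = f"parent {parent_id}\n{changeset}".encode("utf-8")
    return hashlib.sha1(payload).hexdigest()


def build_chain(base_id, changesets):
    """Stack changesets on top of `base_id` and return the resulting commit ids."""
    ids = []
    parent = base_id
    for changeset in changesets:
        parent = commit_id(parent, changeset)
        ids.append(parent)
    return ids


feature_changes = ["add parser", "add tests"]  # hypothetical changesets

old_base = commit_id("root", "initial commit")
new_base = commit_id(old_base, "fix on master")

before_rebase = build_chain(old_base, feature_changes)
after_rebase = build_chain(new_base, feature_changes)

for old, new in zip(before_rebase, after_rebase):
    print(old[:7], "->", new[:7])
```

The changesets are identical in both chains, but because the first commit's parent differs, every id downstream of it differs too.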
+                    +
                     +This rewriting of history makes it problematic to use rebase on branches that are also being
                     +worked on by other people, and it's the generally accepted wisdom not to use rebase with any
                     +branch that you've pushed to a remote repository (i.e. made public).
+                    +
+                    +
                     +## My Git workflow
+                    +
                     +When conducting scientific experiments, one will typically
                     +keep a lab book, which contains notes, observations and key results as they occur. The
                     +goal of keeping a lab book is to make sure that *you don't forget what you were doing*.
                     +The goal of a lab book is, however, *not* to communicate results to a wider community.
                     +A lab book — despite being an accurate record — requires *context* to understand; it
                     +is messy, and does not present information in a way that someone without the relevant
                     +context can easily understand. A *scientific article*
                     +— on the other hand — is designed to disseminate information to a wide audience, and to give
                     +the necessary context to understand any conclusions. When doing science, *both* of these
                     +ways of working are necessary: an *accurate recollection* of what has been done, and then
                     +a *reorganisation* and *reinterpretation* of what was done.
+                    +
                     +In my daily work I use Git as both a *lab book* and a *scientific article*. When I am developing
                     +a new feature or fixing a bug I will create a new branch, and then start experimenting; committing
                     +whenever I make incremental progress towards my goal. This incremental progress will certainly include
                     +many dead-ends and false starts, and that's fine. By committing early and committing often I can ensure
                     +that any work I do won't be lost. However, when it's time to explain to other people what I've done, it's
                     +time to *make sense* of that history. This is when I'll go through my lab book of commits and use the
                     +power of `rebase` to sequence everything into *logical* changes. When my changes are reviewed there will
                     +typically be small fixups (refactoring, naming fixes etc.). During the review I make these changes
                     +as separate commits, which makes it easier for the reviewer to see that I have applied their suggestions.
                     +Once the reviewer is happy I do one final pass with interactive rebase to incorporate the changes
                     +into the commits where they make the most sense. I then rebase on top of the branch into which I'm
                     +merging and perform the merge using the `--no-ff` option (to ensure that an explicit merge commit is made).
+                    +
                     +Enforcing this strategy for merging in changes has a few nice features. Firstly, the history is essentially
                     +linear — any merges could have been "fast-forward" — which makes it easier to visualise in tools like `tig`
                     +or `gitk`. Secondly, preserving the individual commits from each merge means that anyone looking back in
                     +history can see the logical set of changes that went into implementing a particular feature or bugfix.
                     +Finally, cleaning up the commits (i.e. not merging the "lab book" into the master branch) means that
                     +anyone looking back in history will not have to sift through endless trivia to get to the meat of a changeset.

content/posts/coding/google-sheets-auth.md

                     new file mode 100644
@@ -0,0 +1,142 @@
                     +---
                     +title: Google sheets authenticator for Jupyterhub
                     +date: 2018-01-17
                     +tags:
                     +  - coding
                     +  - jupyter
                     +---
+                    +
                     +Back in November I was again involved in running the
                     +programming course for the [Casimir graduate school][casimir]
                     +of the universities of Delft and Leiden. In addition
                     +to the usual tweaks to the material in response to the previous year's
                     +feedback, we also wanted to tweak the setup of our programming environment.
+                    +
                     +[casimir]: https://casimir.researchschool.nl/
+                    +
                     +The course is taught in Python and we provide a [Jupyter][jupyter]-based
                     +environment for our learners for the duration of the course by running our own
                     +deployment of [JupyterHub][jupyterhub].
                     +We've found that it's very effective in getting everyone up and running as quickly as
                     +possible, as everyone has the same environment and it's super easy to push updates
                     +to the course materials (though that's more due to the fact that we use Docker).
+                    +
                     +[jupyter]: https://jupyter.org/
                     +[jupyterhub]: https://jupyterhub.readthedocs.io/en/latest/
+                    +
                     +When we ran the course in 2016 we were still relative noobs when it came
                     +to Jupyterhub deployments, but after a year of experience setting up around 10 different
                     +Jupyterhubs (with the help of our ever-evolving Ansible role!)
                     +we were starting to get the hang of things. One thing in particular that we wanted
                     +to streamline was the signup process.
+                    +
                     +When signing up for the course people give their
                     +Github username (which we use in the Git portion of the course). This means that
                     +we can use the [OAuthenticator][oauthenticator] module. However,
                     +we still need to whitelist the usernames of participants, otherwise we'd be letting
                     +anyone with a Github account access our environment!
+                    +
                     +[oauthenticator]: https://github.com/jupyterhub/oauthenticator
+                    +
                     +We had a few options as to how to do this. Last year we just manually added the names
                     +to a whitelist file, but this is not optimal because the file is only read when the
                     +hub starts, meaning that any people who sign up late need to be added manually (or
                     +we'd have to bounce the hub just to update the whitelist).
                     +In addition we wanted to be able to give people access to the hub as soon as
                     +they signed up, so they could have time to get used to it and work through some of
                     +the preliminary material if they wanted. Manually adding people just wasn't going
                     +to cut it.
                     +Another possibility was to make all the participants request access to a Github
                     +organization (which we would set up specifically for the course) and use the new
                     +"group whitelisting" functionality of OAuthenticator to whitelist everyone in that
                     +organization. This was not ideal either, as we would need to manually accept each
                     +participant's request to join the organization, and the whole point was to avoid
                     +`O(N_participants)` effort!
+                    +
                     +The solution that we came up with was pretty hacky, but actually ended up
                     +working perfectly for us. Learners would sign up using a google form that we
                     +had prepared and the submitted form data is magically added to a google docs
                     +spreadsheet set up for the purpose.
                     +Our idea was to "share" the google sheet via a web link,
                     +which we could then fetch from within our whitelisting logic. While this might
                     +seem insanely insecure (it seems like we're making private data public by sharing
                     +using the web link), it's actually not that bad. The URLs that google docs
                     +generates contain a random string of 20 or so alphanumeric characters that's
                     +probably got as much entropy as a reasonable passphrase (sounds like a good topic
                     +for a future blog post!). It goes without saying that we only hit this URL using
                     +HTTPS and don't ever share it around in non-secure channels.
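
As a rough back-of-the-envelope check of the entropy claim, assuming the token is 20 characters drawn uniformly at random from upper-case letters, lower-case letters and digits. Both the alphabet and the six-word passphrase used for comparison are assumptions made for this sketch.

```python
import math

ALPHABET_SIZE = 26 + 26 + 10  # a-z, A-Z, 0-9 (assumed alphabet)
URL_TOKEN_LENGTH = 20         # "20 or so" characters, as estimated above

# Entropy of a uniformly random string: length * log2(alphabet size)
url_bits = URL_TOKEN_LENGTH * math.log2(ALPHABET_SIZE)

# Comparison point (an assumption): 6 words picked from a 7776-word list
passphrase_bits = 6 * math.log2(7776)

print(f"URL token:  {url_bits:.0f} bits")
print(f"Passphrase: {passphrase_bits:.0f} bits")
```

Under these assumptions the URL token carries roughly 119 bits, against roughly 78 bits for the passphrase.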
+                    +
                     +The following 50(ish) line snippet is the whole thing! (also available
                     +as a [gist][gist]).
+                    +
                     +[gist]: https://gist.github.com/jbweston/389fad330108f12c816b21da162fb123
+                    +
                     +```python
                     +import csv
                     +import subprocess
+                    +
                     +from tornado import gen
                     +from tornado.httpclient import AsyncHTTPClient
+                    +
+                    +
                     +@gen.coroutine
                     +def get_whitelist(sheets_url, usernames_field):
                     +    # Get CSV from sheet
                     +    client = AsyncHTTPClient()
                     +    resp = yield client.fetch(sheets_url)
                     +    raw_csv = resp.body.decode('utf-8', 'replace').split('\n')
+                    +
                     +    reader = csv.reader(raw_csv)
+                    +
                     +    # Extract column index of usernames
                     +    headers = next(reader)
                     +    try:
                     +        username_column = headers.index(usernames_field)
                     +    except ValueError:
                     +        raise ValueError('header field "{}" not found in sheet {}'
                     +                         .format(usernames_field, sheets_url))
+                    +
                     +    usernames = [row[username_column] for row in reader]
                     +    return usernames
+                    +
+                    +
                     +class SheetWhitelister:
+                    +
                     +    sheets_url = 'https://docs.google.com/spreadsheets/d/xxxxxx'
                     +    usernames_column = 'Github username'
+                    +
                     +    @gen.coroutine
                     +    def check_whitelist(self, username):
                     +        if super().check_whitelist(username):
                     +            return True
                     +        try:
                     +            whitelist = yield get_whitelist(self.sheets_url,
                     +                                            self.usernames_column)
                     +            self.log.info('Retrieved users from spreadsheet: {}'
                     +                          .format(whitelist))
                     +            self.whitelist.update(whitelist)
                     +        except Exception:
                     +            self.log.error('Failed to fetch usernames from spreadsheet',
                     +                           exc_info=True)
                     +        return (username in self.whitelist)
                     +```
+                    +
                     +The above defines a mixin class, `SheetWhitelister`, that we can use with an
                     +existing Jupyterhub authenticator to "plug in" the custom whitelisting
                     +logic. To actually use it in the Jupyterhub config we'd need to combine
                     +it with an existing authenticator (e.g. Github), as below:
+                    +
                     +```python
                     +from oauthenticator.github import GithubOAuthenticator
+                    +
                     +class GithubWithSheets(SheetWhitelister, GithubOAuthenticator):
                     +    pass
+                    +
                     +c.JupyterHub.authenticator_class = GithubWithSheets
                     +```
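
The order of the base classes matters here: the `super().check_whitelist(...)` call inside the mixin resolves to the next class in the method resolution order (MRO). The sketch below shows this with stand-in classes. `BaseAuthenticator`, `ExtraUsersMixin` and `Combined` are invented for illustration and are not the JupyterHub or OAuthenticator classes.

```python
class BaseAuthenticator:
    """Stand-in for a real authenticator (hypothetical)."""

    def __init__(self):
        self.whitelist = {"alice"}

    def check_whitelist(self, username):
        return username in self.whitelist


class ExtraUsersMixin:
    """Mixin that consults an extra source when the base check fails."""

    extra_users = {"bob"}  # stands in for usernames fetched from the sheet

    def check_whitelist(self, username):
        # super() resolves to the next class in the MRO, here BaseAuthenticator
        if super().check_whitelist(username):
            return True
        self.whitelist.update(self.extra_users)
        return username in self.whitelist


class Combined(ExtraUsersMixin, BaseAuthenticator):
    pass


auth = Combined()
mro_names = [cls.__name__ for cls in Combined.__mro__]
print(mro_names)
print(auth.check_whitelist("alice"), auth.check_whitelist("bob"),
      auth.check_whitelist("mallory"))
```

Listing the mixin first is what puts its `check_whitelist` ahead of the base implementation. With the bases swapped, the mixin's method would never be called.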
+                    +
                     +I'm really not a fan of the mixin class pattern because you always need
                     +to make these boilerplate classes that combine all the required
                     +functionality, and combining these behaviours at runtime
                     +is more cumbersome. Give me a nice functional strategy pattern any day!
                     +But hey, it works so I can't complain, and hopefully somebody on the internet
                     +will find this useful.
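
For comparison, a function-based ("strategy") version might look something like the sketch below. This is purely hypothetical: the `Authenticator` class and its `fetch_extra_users` parameter are invented for illustration and do not correspond to any JupyterHub API.

```python
from typing import Callable, Iterable, Set


class Authenticator:
    """Hypothetical authenticator that takes its whitelist source as a function."""

    def __init__(self, fetch_extra_users: Callable[[], Iterable[str]]):
        self.whitelist: Set[str] = set()
        self.fetch_extra_users = fetch_extra_users

    def check_whitelist(self, username: str) -> bool:
        if username in self.whitelist:
            return True
        # Refresh from the injected strategy only on a miss
        self.whitelist.update(self.fetch_extra_users())
        return username in self.whitelist


def users_from_sheet():
    """Placeholder strategy; a real one would download and parse the spreadsheet."""
    return ["carol", "dave"]


auth = Authenticator(fetch_extra_users=users_from_sheet)
print(auth.check_whitelist("carol"), auth.check_whitelist("eve"))
```

The behaviour is chosen by passing a different function at construction time, so no combining class is needed.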

content/posts/coding/haskell-fizzbuzz.md

                     new file mode 100644
@@ -0,0 +1,350 @@
                     +---
                     +title: Fizzbuzz in Haskell
                     +date: 2018-02-20
                     +tags:
                     +  - coding
                     +  - haskell
                     +---
+                    +
                     +Continuing in the vein of cool Haskell examples I find on the internet, this
                     +post is going to be about a particularly epic [fizzbuzz][fb] implementation that
                     +I saw in a [three-year-old Reddit thread][reddit]. Now, the OP in that thread
                     +had a serviceable but run of the mill fizzbuzz implementation, but what caught
                     +my eye was the top-voted comment. The author (who has since, sadly,
                     +deleted their account, or I would have credited them here) had accomplished
                     +fizzbuzz in a mere 2 lines of code! Here is the snippet copied verbatim:
+                    +
                     +```haskell
                     +let (m ~> str) x = str <$ guard (x `mod` m == 0)
                     +in map (fromMaybe . show <*> 3 ~> "fizz" <> 5 ~> "buzz")
                     +```
                     +[reddit]: https://www.reddit.com/r/haskell/comments/2cum9p/i_did_a_haskell_fizzbuzz/
                     +[fb]: http://wiki.c2.com/?FizzBuzzTest
+                    +
                     +Seeing this was one of those moments where you just say "oh man, I *have* to
                     +understand how this works!". Luckily there were a few people in that thread who
                     +had already hashed out explanations, so I could already get the gist of what was
                     +going on. This post is going to be an attempt to explain the above two lines to
                     +myself.
+                    +
                     +#### Let's go
                     +The fizzbuzz two-liner is a single expression with a `let` binding that defines
                     +an operator called `~>`. We shall put the `let` binding to one side for the
                     +moment and concentrate just on the core expression:
+                    +
                     +```haskell
                     +map (fromMaybe . show <*> 3 ~> "fizz" <> 5 ~> "buzz")
                     +```
                     +OK so we're using the function `map`, which has the signature `map :: (a -> b) ->
                     +[a] -> [b]`, and we've applied it to a single argument, meaning that the
                     +bit in parentheses must be a function `a -> b`. Now, the core of fizzbuzz is all
                     +about turning integers into strings (arbitrary integers into their string
                     +representation, multiples of 3 into "fizz" etc.) so we can probably assume that we
                     +will be mapping over a list of integers and producing a list of strings.
+                    +
                     +We can test this hypothesis by loading the two-liner into GHCi (we have to add
                     +the imports -- which I got by [hoogling][hoogle] the function names that GHCi
                     +didn't know about).
+                    +
                     +```haskell
                     +λ> import Control.Monad (guard)
                     +λ> import Data.Monoid ((<>))
                     +λ> import Data.Maybe (fromMaybe)
                     +λ> let (m ~> str) x = str <$ guard (x `mod` m == 0)
                     +λ> let core = (fromMaybe . show <*> 3 ~> "fizz" <> 5 ~> "buzz")
                     +λ> :t core
                     +core :: (Show a, Integral a) => a -> String
                     +```
                     +This seems to check out; the type signature looks a bit weird because Haskell
                     +derives the most general signature it can, but we can interpret it as `core ::
                     +Integer -> String`.
+                    +
                     +[hoogle]: https://www.haskell.org/hoogle/
+                    +
                     +#### From abstract to concrete
                     +Ok, so now we're going to start from the `core` expression (adding clarifying
                     +parentheses):
+                    +
                     +```haskell
                     +(fromMaybe . show) <*> (3 ~> "fizz" <> 5 ~> "buzz")
                     +```
                     +Let's analyse this from the outside in by first looking at the types of the
                     +arguments on either side of the `<*>`:
+                    +
                     +```haskell
                     +λ> :t (fromMaybe . show)
                     +fromMaybe . show :: Show a => a -> Maybe String -> String
                     +λ> :t (3 ~> "fizz" <> 5 ~> "buzz")
                     +... :: (Alternative f, Integral a) => a -> f String
                     +```
                     +Hmm, the first one is kind of understandable, but the second one is still quite
                     +abstract. In order to make this more concrete we could try to glue these pieces
                     +together with `<*>`. Let's remind ourselves of the signature for `<*>`:
+                    +
                     +```haskell
                     +λ> :t (<*>)
                     +(<*>) :: Applicative f => f (a -> b) -> f a -> f b
                     +```
                     +Now we have all the ingredients; let's try and match the type signatures for
                     +the previous expressions with the (very abstract) one for `<*>`:
+                    +
                     +```haskell
                     +f     (a            -> b)      -> f     a         -> f     b
                     +a' -> (Maybe String -> String) -> a' -> f' String -> a' -> String
                     +```
                     +So the `Applicative` structure `f` matches up with the `a' ->`, and the `f'`
                     +matches up with the `Maybe`. Given that we know that the whole combination needs
                     +to give something of type `Integer -> String`, this fixes `a'` to be `Integer`,
                     +so the `Applicative` in play here is "a function that takes an integer".
+                    +
                     +Just to make things crystal clear let's rewrite the signatures for the two
                     +sub-expressions using the concrete types that we managed to deduce:
+                    +
                     +```haskell
                     +(fromMaybe . show) :: Integer -> Maybe String -> String
                     +(3 ~> "fizz" <> 5 ~> "buzz") :: Integer -> Maybe String
                     +```
                     +This is pretty cool; by combining several expressions that individually have
                     +very abstract types we've managed to deduce *concrete* types for these
                     +expressions!
+                    +
                     +We can also see that by using `<*>` we're using the [`Applicative` instance of
                     +functions][app] to elide the `Integer` parameter to the two sub-expressions. We
                     +could rewrite `core` like so:
+                    +
                     +```haskell
                     +core n = fromMaybe (show n) $ (3 ~> "fizz" <> 5 ~> "buzz") n
                     +```
                     +which is, in my opinion, more explicit but much less readable!
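+                    +
                     +To make the function `Applicative` a bit less magical, here is a minimal sketch
                     +of what `<*>` does in this case. The names `apFn` and `onlyFizz` are made up
                     +for illustration (they come neither from the Reddit snippet nor from `base`),
                     +and `onlyFizz` is a stand-in that only knows about multiples of 3:
+                    +
                     +```haskell
                     +import Data.Maybe (fromMaybe)
+                    +
                     +-- Hypothetical helper: what <*> amounts to when the Applicative is
                     +-- "functions from r". Both sides receive the same argument x.
                     +apFn :: (r -> a -> b) -> (r -> a) -> r -> b
                     +apFn f g x = f x (g x)
+                    +
                     +-- Hypothetical stand-in for the fizz/buzz part.
                     +onlyFizz :: Integer -> Maybe String
                     +onlyFizz n = if n `mod` 3 == 0 then Just "fizz" else Nothing
+                    +
                     +-- The same function written with the real operator and with the helper.
                     +viaOperator, viaHelper :: Integer -> String
                     +viaOperator = (fromMaybe . show) <*> onlyFizz
                     +viaHelper   = apFn (fromMaybe . show) onlyFizz
                     +```
                     +Both versions should agree: applied to 3 they give `"fizz"`, and applied to 4
                     +they fall back to `"4"`.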
+                    +
                     +[app]: https://hackage.haskell.org/package/base-4.10.1.0/docs/src/GHC.Base.html#local-6989586621679017723
+                    +
                     +#### Down the layers
                     +Now that we have these concrete types we can start understanding how everything
                     +fits together.
+                    +
                     +`fromMaybe` has signature `a -> Maybe a -> a`; it takes a default value and a
                     +`Maybe` value, and returns the default if the `Maybe` is `Nothing` (and the
                     +wrapped value otherwise). In code:
+                    +
                     +```haskell
                     +fromMaybe a (Just b) = b
                     +fromMaybe a Nothing = a
                     +```
                     +In `core` the default value is `show n`, where `n`
                     +is the number we're fizz-buzzing. This makes sense, as if `n` is not divisible
                     +by 3 or 5 then we should show just the number itself.
+                    +
                     +We can therefore see that `3 ~> "fizz" <> 5 ~> "buzz"` takes `n` and should
                     +return `Nothing` if `n` is not divisible by 3 or 5, and `Just "something"`
                     +otherwise.
+                    +
                     +Given this, it makes sense to first look at `3 ~> "fizz"` in
                     +isolation. If we look at the type signature for `<>`:
+                    +
                     +```haskell
                     +λ> :t (<>)
                     +(<>) :: Monoid m => m -> m -> m
                     +```
                     +we see that it takes two things of type `m` and produces a third thing of the
                     +same type. We can therefore deduce that the type of `3 ~> "fizz"` is the same as
                     +the whole expression `3 ~> "fizz" <> 5 ~> "buzz"`, and is therefore `Integer ->
                     +Maybe String`.
+                    +
                     +To understand how `3 ~> "fizz"` works we'll first have to look at the definition
                     +of `~>` again:
+                    +
                     +```haskell
                     +(m ~> str) x = str <$ guard (x `mod` m == 0)
                     +```
                     +Ok, the last bit, ``x `mod` m == 0``, is clearly checking whether `x` is
                     +divisible by `m`. Let's look at the signatures of `<$` and `guard`:
+                    +
                     +```haskell
                     +λ> :t (<$)
                     +(<$) :: Functor f => a -> f b -> f a
                     +λ> :t guard
                     +guard :: Alternative f => Bool -> f ()
                     +```
                     +Ok, so `<$` seems to take two arguments, the second one being a functorial one,
                     +and returns the first value in the functorial context of the second value. If I
                     +had to guess I would say that it's implemented like so:
+                    +
                     +```haskell
                     +a <$ fb = fmap (const a) fb
                     +```
                     +or, in point free style:
+                    +
                     +```haskell
                     +(<$) = fmap . const
                     +```
                     +Looking at the definition of `~>` again we can see that the expression evaluates
                     +to `str` put into the functorial context of ``guard (x `mod` m ==
                     +0)``. What the hell does that mean?
+                    +
                     +Once again we're getting hit by the fact that the type signatures of the
                     +individual pieces are too general; we need to put stuff back into context and
                     +"match up the types" to understand what is really going on.
+                    +
                     +We know that ``str <$ guard (x `mod` m == 0)`` must have type `Maybe String`,
                     +and we know that `str` has type `String` and `guard` returns an `f ()` where `f`
                     +is some functor (`Alternative` being a subclass of `Functor`). We can therefore
                     +see that ``guard (x `mod` m == 0)`` must have type `Maybe ()`. This
                     +means that the only values this expression can have are `Just ()` and `Nothing`.
+                    +
                     +Combined with the `<$` we can therefore see that `(m ~> str) x` evaluates to
                     +`Just str` when `m` divides `x`, and `Nothing` otherwise.
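+                    +
                     +Pinning the types down to `Maybe` makes this easy to poke at in GHCi. In the
                     +sketch below `guardMaybe` and `divisibleThen` are hypothetical names introduced
                     +here; the latter is just `~>` written as an ordinary function with the concrete
                     +types deduced above:
+                    +
                     +```haskell
                     +import Control.Monad (guard)
+                    +
                     +-- guard, specialised to Maybe: True becomes Just (), False becomes Nothing.
                     +guardMaybe :: Bool -> Maybe ()
                     +guardMaybe = guard
+                    +
                     +-- Hypothetical name for `(m ~> str) x` with concrete types.
                     +-- (<$) swaps the () inside a Just for str and leaves Nothing untouched.
                     +divisibleThen :: Integer -> String -> Integer -> Maybe String
                     +divisibleThen m str x = str <$ guardMaybe (x `mod` m == 0)
                     +```
                     +With these definitions `divisibleThen 3 "fizz" 9` is `Just "fizz"` and
                     +`divisibleThen 3 "fizz" 10` is `Nothing`.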
+                    +
                     +##### Down, down, down
+                    +
                     +So now we've understood *that* layer of structure, let's see if we can
                     +understand the combination `3 ~> "fizz" <> 5 ~> "buzz"`. Because we'll be
                     +referring to this thing a few times, I'm going to give it the name `buzzer`, so
+                    +
                     +```haskell
                     +buzzer :: Integer -> Maybe String
                     +buzzer = 3 ~> "fizz" <> 5 ~> "buzz"
                     +```
                     +The expressions on either side of the `<>` are *functions* from `Integer` to
                     +`Maybe String`. `<>` is [defined as follows][func] between functions:
+                    +
                     +```haskell
                     +(f <> g) x = f x <> g x
                     +```
                     +[func]: https://hackage.haskell.org/package/base-4.6.0.1/docs/src/Data-Monoid.html#line-105
+                    +
                     +so clearly for this to work `f` and `g` must have the same signature, *and*
                     +the return value must itself be a monoid. We know that `f` and `g` return
                     +`Maybe String` for our case. `Maybe` is indeed a monoid if the thing that
                     +it contains is also a monoid; we just identify `Nothing` with the monoidal
                     +identity for the contained values and we're done. `String` is, of course,
                     +a monoid with the empty string as its identity element and concatenation
                     +as its `<>`.
+                    +
                     +Putting all this together we can see how `buzzer` actually
                     +works. We can explicitly treat each case: not divisible by 3 or 5, divisible
                     +by exactly one of 3 and 5, divisible by both 3 and 5.
+                    +
                     +When we apply `buzzer` to a number that is neither divisible by 3 nor by 5
                     +then both of the subexpressions evaluate to `Nothing` and we get
                     +`Nothing <> Nothing`, which is just `Nothing`. In the second case we get
                     +either `Just "fizz" <> Nothing` or `Nothing <> Just "buzz"`, which evaluate
                     +to `Just "fizz"` and `Just "buzz"` respectively (thanks to the monoid on
                     +`Maybe`). In the final case we get `Just "fizz" <> Just "buzz"`, which
                     +evaluates to `Just ("fizz" <> "buzz")` which is `Just "fizzbuzz"`.
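+                    +
                     +The three cases can be checked directly with a small sketch. `fizzPart` and
                     +`buzzPart` are hypothetical stand-ins for `3 ~> "fizz"` and `5 ~> "buzz"`,
                     +written out with plain `if` so that the only abstraction left is the `<>`:
+                    +
                     +```haskell
                     +import Data.Monoid ((<>))
+                    +
                     +-- Hypothetical stand-ins for the two halves of buzzer.
                     +fizzPart, buzzPart :: Integer -> Maybe String
                     +fizzPart n = if n `mod` 3 == 0 then Just "fizz" else Nothing
                     +buzzPart n = if n `mod` 5 == 0 then Just "buzz" else Nothing
+                    +
                     +-- <> on functions applies both to the same argument and then
                     +-- combines the two Maybe String results with <> again.
                     +buzzer :: Integer -> Maybe String
                     +buzzer = fizzPart <> buzzPart
+                    +
                     +-- One input per case: neither, only 3, only 5, both.
                     +cases :: [Maybe String]
                     +cases = map buzzer [7, 9, 10, 15]
                     +-- [Nothing, Just "fizz", Just "buzz", Just "fizzbuzz"]
                     +```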
+                    +
                     +#### Putting it all together
                     +Now comes the question of how we would rewrite this fizzbuzz so that it's
                     +easier to understand. On one hand we want to use abstraction to help us reveal
                     +the actual structure of the problem (without getting bogged down in the messy
                     +details) and on the other hand we don't want to abstract into the stratosphere
                     +so that it's no longer clear what our intention is.
+                    +
                     +My compromise would probably look something like this:
+                    +
                     +```haskell
                     +import Data.Monoid ((<>))
                     +import Data.Maybe (fromMaybe)
+                    +
                     +(m ~> str) x = if x `mod` m == 0
                     +    then Just str
                     +    else Nothing
+                    +
                     +fizz_or_buzz :: Integer -> Maybe String
                     +fizz_or_buzz =
                     +        3 ~> "fizz"
                     +    <>  5 ~> "buzz"
+                    +
                     +fizzbuzz :: Integer -> String
                     +fizzbuzz = fromMaybe <$> show <*> fizz_or_buzz
+                    +
                     +main = mapM_ putStrLn $ map fizzbuzz [1..100]
                     +```
+                    +
                     +Essentially I made the following changes:
+                    +
                     ++ I preferred an explicit 'if-then-else' over the use of `guard` and `<$`,
                     +  but did not apply a type signature to `~>` as I feel it would obscure, rather
                     +  than clarify, meaning.
                     ++ I put an explicit type signature on the piece that handles the fizzing and
                     +  buzzing, but kept the abstract monoidal composition. I think that even if
                     +  someone is not 100% clear on how all the monoid instances interact, the
                     +  signature and definition make it obvious what this piece is doing. In addition
                     +  the formatting makes it easy for someone else to modify the code, say to
                     +  add printing of "baz" if the number is divisible by 7, or to reverse the
                     +  order of "fizz" and "buzz".
                     ++ I prefer using applicative style for both of the arguments for `fromMaybe`;
                     +  in my opinion this clarifies intent drastically.
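+                    +
                     +As a rough illustration of the "easy to modify" claim, here is a sketch of the
                     +"baz" variant mentioned in the list. The name `fizzBuzzBaz` is made up, and the
                     +type signature on `~>` is added only so that the sketch stands alone:
+                    +
                     +```haskell
                     +import Data.Maybe (fromMaybe)
                     +import Data.Monoid ((<>))
+                    +
                     +-- Same operator as in the rewrite, with a signature for self-containedness.
                     +(~>) :: Integer -> String -> Integer -> Maybe String
                     +(m ~> str) x = if x `mod` m == 0 then Just str else Nothing
+                    +
                     +-- Adding a rule is one more line in the monoidal chain.
                     +fizzBuzzBaz :: Integer -> String
                     +fizzBuzzBaz = fromMaybe <$> show <*> rules
                     +  where
                     +    rules =
                     +            3 ~> "fizz"
                     +        <>  5 ~> "buzz"
                     +        <>  7 ~> "baz"
                     +```
                     +For example, `fizzBuzzBaz 21` should give `"fizzbaz"` and `fizzBuzzBaz 4`
                     +should give `"4"`.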
+                    +
                     +So in the end we have not actually changed too much: the code still works in
                     +essentially the same way; I just clarified intent by adding
                     +explicit names to things, adding type signatures, and using explicit
                     +language features as opposed to what I consider excessive abstract logic.
+                    +
                     +Of course, the changes I made are coming from a place of ignorance; I am a
                     +total Haskell noob, so the things that are not obvious to me could well be
                     +obvious for a Haskell veteran. For example, the fact that I chose to keep the
                     +`fromMaybe <$> show <*> fizz_or_buzz` is due to the fact that I understand and
                     +know how to use the applicative instance of functions; maybe if I had more
                     +experience using `guard` and `<$` I would find the initial two-liner clearer
                     +than my explicit 'if-then-else'. I guess only time will tell.
+                    +
+                    +
                     +### Thoughts
+                    +
                     +#### Spaghetti
                     +People complain about object oriented programming because when you make a method
                     +call you have no idea what code is actually getting called ('cause dynamic
                     +dispatch + having to follow the object's method resolution order). I would posit
                     +that finding the definition of any of the functionality defined in a typeclass
                     +is the same thing. From a function definition it is sometimes impossible to know
                     +what code will actually be run because it can depend on the type of the
                     +arguments; you need to go to the call site to find out what will happen.
+                    +
                     +In addition, I find that the abstract level at which Haskell operates sometimes
                     +confuses more than it helps. Even though `a -> b` and `Maybe a` both have
                     +monoid instances, the meaning is *totally* different for the two. In my opinion
                     +this is a case where treating things too generally can actually obscure meaning.
+                    +
                     +#### Work from the outside in
                     +I found that the complexity from overgeneralising can be combated by working
                     +top-down. You first need to figure out the type of the top-level/outermost
                     +expression and work inwards. If you start out trying to understand the types for
                     +the constituent expressions, often they will be too general for you to be able to
                     +understand why they are being used in the first place.
+                    +
                     +By starting from the outermost expression you can apply the technique of
                     +"matching up the types" to figure out what is going on one layer down, and then
                     +carry on recursively like this until you have the concrete types for the
                     +innermost expressions.
+                    +
                     +#### Is there such a thing as being *too* general?
                     +Abstraction is in some sense the essence of programming computers. It allows us
                     +to see the forest instead of the trees and often enables thinking about
                     +problems in a more fruitful way, i.e. *closer to the domain in which the problem
                     +was originally defined*. Many languages define abstract (as opposed to concrete)
                     +concepts. Python (my go-to language) has the concept of a `sequence`, an
                     +`iterable`, a `mapping` etc. These are all useful concepts, as they signal
                     +*intent*; we can define an algorithm that works on any `iterable`, and this
                     +gives us the freedom to pass it an array, linked list, or anything else that can
                     +be iterated over. Someone reading the algorithm doesn't need to care about the
                     +actual type that is passed in to understand what is going on.
+                    +
                     +Haskell takes this one step further with `Functor`, `Applicative` and `Monad`, and
                     +I am yet to be convinced that this is actually useful for a wide variety of
                     +cases. Even if `Maybe` and `List` can both formally be considered as applicative
                     +functors, the applicative instances for these two types *mean totally different
                     +things*. `Maybe` represents computations that can fail, whereas the `List`
                     +`Applicative` instance represents all possible combinations of the provided
                     +computations. If I write some code that does something with a general
                     +`Applicative`, I don't really know what the code *means* before I apply it to
                     +concrete types. This means that *even if* I can formulate an algorithm using
                     +only `Applicative`, *naming* this thing sensibly is going to be a real
                     +challenge.
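+                    +
                     +A tiny sketch of what I mean. `addBoth` is a hypothetical function that only
                     +asks for an `Applicative`; what it "means" depends entirely on the type it is
                     +used at:
+                    +
                     +```haskell
                     +import Control.Applicative (liftA2)
+                    +
                     +-- Written against any Applicative; the name is the hard part.
                     +addBoth :: Applicative f => f Int -> f Int -> f Int
                     +addBoth = liftA2 (+)
+                    +
                     +-- With Maybe: "add, unless either computation failed".
                     +maybeSum, maybeFailed :: Maybe Int
                     +maybeSum    = addBoth (Just 1) (Just 2)   -- Just 3
                     +maybeFailed = addBoth (Just 1) Nothing    -- Nothing
+                    +
                     +-- With lists: "add every combination of the two inputs".
                     +listSums :: [Int]
                     +listSums = addBoth [1, 2] [10, 20]        -- [11, 21, 12, 22]
                     +```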
+                    +
                     +On the other hand, some very smart people clearly think that thinking at this
                     +level of abstraction *does* produce better software, and I am still very new to
                     +Haskell and functional programming in general. I would really like to see a good
                     +set of concrete examples that show how abstracting into the stratosphere like
                     +this is actually beneficial and produces code that is more maintainable.

content/posts/coding/haskell-snake.md

                     new file mode 100644
@@ -0,0 +1,202 @@
                     +---
                     +title: Writing a snake clone in Haskell, part 1
                     +date: 2017-11-16
                     +tags:
                     +  - coding
                     +  - haskell
                     +---
+                    +
                     +After my recent dive into Haskell I was keen to try a small project to
                     +test out what I had learned. After watching a bunch of YouTube videos
                     +from various Haskell conferences I came across one by
                     +[Moss Collum](https://github.com/moss) where he describes how he built
                     +a series of Rogue-like games in Haskell over the course of a week.
+                    +
                     +I took a look at Moss' [code](https://github.com/moss/haskell-roguelike-challenge)
                     +and thought it was a pretty neat idea; however, I wanted to try and make
                     +a snake game instead (nostalgia for the Nokia 3310, I guess). If you don't
                     +know what snake is, here's a sweet GIF of a Russian guy getting a perfect game:
+                    +
                     +![snake](https://media.giphy.com/media/8D0yR4ylkAC1G/giphy.gif)
+                    +
                     +You move a snake around the screen trying to gobble up pieces of food. The snake
                     +moves forward 1 space every second or so autonomously, and each piece
                     +of food you eat makes the snake grow 1 space longer. You die if the snake hits
                     +the walls or its own tail.
+                    +
                     +The snake games that I saw on Hackage all seemed to be projects for the author
                     +to learn how to use a specific library, and I found that as a consequence the
                     +code logic was somewhat obscured. I wanted something *much* simpler:
                     +a terminal application controlled by sending keypresses to `stdin`, and with
                     +ASCII "graphics". I specifically wanted to avoid using game libraries; after all, my aim
                     +was to exercise my Haskell knowledge, not to make a novel gaming experience!
+                    +
+                    +
                     +### Let's go
+                    +
                     +All of the code described here is available [on GitHub](https://github.com/jbweston/haskell-snake).
+                    +
                     +#### First iteration
+                    +
                     +I started out by looking at some of Moss' code to see an example of how
                     +I could proceed. I decided that the first thing I would do
                     +would be to have a snake of fixed length moving around the screen
                     +in response to keypresses: no "food" to grow the snake, no boundaries,
                     +no collision detection and (most importantly) the snake does not move
                     +by itself.
+                    +
                     +The best way of proceeding seemed to be to model the game as a sequence
                     +of transformations on an initial state of the game's world. The
                     +transformations to apply are determined by the commands typed by the
                     +player. Moss' code took advantage of Haskell's lazy IO to get an
                     +(infinite) list of keypresses from `stdin` and then used this
                     +as the sequence of transformations. This is captured by the
                     +following code:
+                    +
                     +```haskell
                     +parseInput :: [Char] -> [Direction]
                     +...
                     +advance :: World -> Direction -> World
                     +...
                     +input <- getContents
                     +let states = scanl advance initialWorld (parseInput input)
                     +```
+                    +
                     +The last two lines are from the `main` function, and the preceding
                     +lines are the type signatures necessary to understand them. We
                     +can see that we first take the raw input from the user (via the
                     +`getContents` IO action) and parse the sequence of raw keypresses
                     +(the infinite list of `Char`) into a sequence of `Direction`s in
                     +which to move the snake. We then do a left scan of the `advance`
                     +function over this sequence of directions, starting with the
                     +world in its initial state, to generate a sequence of states
                     +of the world! `parseInput` also handles quitting the game when
                     +the user presses `q`. We model this by terminating the
                     +sequence of directions when we detect that `q` was typed.
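+                    +
                     +To show the shape of this without the real game, here is a self-contained
                     +sketch. Everything in it is an assumption made for illustration rather than the
                     +code from the repository: the `World` is reduced to the position of the
                     +snake's head, the key bindings are w/a/s/d, and `statesFor` is a made-up name
                     +for the `scanl` line above with the input passed in as a `String`:
+                    +
                     +```haskell
                     +import Data.Maybe (mapMaybe)
+                    +
                     +-- Hypothetical stand-ins for the real types in the game.
                     +data Direction = North | South | East | West deriving (Eq, Show)
                     +type World = (Int, Int)  -- just the position of the snake's head
+                    +
                     +-- Assumed key bindings; the sequence ends at the first 'q'.
                     +parseInput :: [Char] -> [Direction]
                     +parseInput = mapMaybe toDirection . takeWhile (/= 'q')
                     +  where
                     +    toDirection 'w' = Just North
                     +    toDirection 's' = Just South
                     +    toDirection 'd' = Just East
                     +    toDirection 'a' = Just West
                     +    toDirection _   = Nothing  -- ignore any other key
+                    +
                     +-- Move the head one cell; y grows downwards, as on a terminal.
                     +advance :: World -> Direction -> World
                     +advance (x, y) North = (x, y - 1)
                     +advance (x, y) South = (x, y + 1)
                     +advance (x, y) East  = (x + 1, y)
                     +advance (x, y) West  = (x - 1, y)
+                    +
                     +-- Every intermediate world, starting with the initial one.
                     +statesFor :: String -> [World]
                     +statesFor = scanl advance (0, 0) . parseInput
                     +```
                     +Here `statesFor "ddsqw"` should give `[(0,0),(1,0),(2,0),(2,1)]`: the `w` after
                     +the `q` is never seen. Because `takeWhile`, `mapMaybe` and `scanl` are all lazy,
                     +the same pipeline also works on the lazily read contents of `stdin`.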
+                    +
                     +Once we have this sequence of game worlds we just need to
                     +draw them to the screen. Naively I initially did the
                     +following [^1]:
+                    +
                     +```haskell
                     +drawWorld :: World -> IO ()
                     +...
                     +mapM_ (\s -> clearScreen >> drawWorld s) states
                     +```
+                    +
                     +i.e. I cleared the screen before drawing the new state. Unfortunately
                     +this caused the screen to flicker every time the world state
                     +updated, and I guessed (correctly) that it was because of the
                     +`clearScreen` taking just long enough to be noticeable. My solution
                     +was instead to "update" the screen:
+                    +
                     +```haskell
                     +drawUpdate :: (World, World) -> IO ()
                     +...
                     +mapM_ drawUpdate $ zip states (tail states)
                     +```
+                    +
                     +`drawUpdate` is actually pretty dumb; it just "deletes" the snake
                     +in the previous world by writing a space character to every position
                     +the snake occupied, then draws the snake position in the new world
                     +by writing a `@` at every position it occupies.
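+                    +
                     +The idea can be sketched without any terminal code. In the snippet below the
                     +snake is assumed to be a plain list of positions, and `cellWrites` and
                     +`consecutive` are hypothetical names; the real `drawUpdate` performs the writes
                     +as IO actions instead of returning them:
+                    +
                     +```haskell
                     +type Pos = (Int, Int)
                     +type Snake = [Pos]  -- hypothetical: every cell the snake occupies
+                    +
                     +-- The characters to write when going from the old snake to the new one:
                     +-- first blank out the old cells, then draw the new ones.
                     +cellWrites :: (Snake, Snake) -> [(Pos, Char)]
                     +cellWrites (old, new) = [(p, ' ') | p <- old] ++ [(p, '@') | p <- new]
+                    +
                     +-- Pair each state with its successor, like `zip states (tail states)`;
                     +-- `drop 1` is used here only to avoid the partial function `tail`.
                     +consecutive :: [a] -> [(a, a)]
                     +consecutive xs = zip xs (drop 1 xs)
                     +```
                     +Later writes win, so a cell that is in both the old and the new snake ends up
                     +showing `@`.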
+                    +
                     +The result can be seen below:
+                    +
                     +<video src="/images/snake/basic.webm" autoplay loop></video>
+                    +
                     +This is smashing, but is clearly not really a snake game yet!
                     +We have to add a few more ingredients to make it more like
                     +the game I remember from the old Nokia phones.
+                    +
                     +[^1]: `mapM_` maps a function that returns an IO action (more generally,
                     +     any monadic value) over a list, and then sequences those actions,
                     +     discarding their results.
+                    +
                     +#### Adding extra ingredients
+                    +
                     +The first thing to do was to actually make it possible to lose the game.
                     +This involved detecting collisions between the snake and itself or with
                     +the boundary. After these additions I had something that looks like this:
+                    +
                     +<video src="/images/snake/with-walls.webm" autoplay loop></video>
+                    +
                     +The final piece of the puzzle (for now) was to add the food that could
                     +be eaten and would reappear in a random location. This led to the
                     +final iteration:
+                    +
                     +<video src="/images/snake/simple-complete.webm" autoplay loop></video>
+                    +
                     +This is already starting to look a lot like what I had initially envisioned!
                     +The next step (which I will detail in a subsequent post) is to make the
                     +snake move in the last direction selected every second or so. This will
                     +probably require a rewrite of much of the code; we'll need to have
                     +another "source" for direction commands, and probably different threads
                     +to do the waiting.
+                    +
+                    +
                     +### Thoughts
+                    +
                     +Writing this short program was really a lot of fun. In addition, it
                     +taught me a bunch of stuff about writing Haskell programs! Below
                     +are a few points that I came to appreciate during this project.
+                    +
+                    +
                     +#### Type signatures are your documentation
+                    +
                     +I was startled by how much intention could be gleaned just from the
                     +type signatures and sensibly naming the functions. Given that I had
                     +some context about the program as a whole I found that the meaning
                     +of most of the functions became self-evident. For example, given that
                     +I know that the `World` datatype contains the state of the game world,
                     +and `Direction` is an order to move the snake in a particular
                     +direction, the meaning of
+                    +
                     +```haskell
                     +advance :: World -> Direction -> World
                     +```
+                    +
                     +is obviously "advance the state of the game world in response to
                     +an order to move in a particular direction".
+                    +
                     +I realise that my perspective on this is pretty skewed due to the short length of
                     +the program I was writing (you can hold the context of the whole program in
                     +your mind at once), but I get the impression that even with longer programs
                     +this concept that the type declarations *are* (for many functions) your documentation
                     +is quite prevalent. This is very different to Python, for example, where we are
                     +encouraged to document every function and detail its parameters.
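For contrast, here is roughly what the Python convention looks like for the same function. This is only a sketch: the Python `World` and `Direction` types, the four direction names and the head-first tuple of grid cells are assumptions made for illustration, not a translation of the actual Haskell code.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Tuple


class Direction(Enum):
    """Unit steps on the grid (names and vectors are assumed)."""
    NORTH = (0, 1)
    EAST = (1, 0)
    SOUTH = (0, -1)
    WEST = (-1, 0)


@dataclass(frozen=True)
class World:
    snake: Tuple[Tuple[int, int], ...]  # head first; representation is assumed
    direction: Direction


def advance(world: World, new_direction: Direction) -> World:
    """Advance the game world in response to an order to move.

    Parameters
    ----------
    world : World
        The current state of the game.
    new_direction : Direction
        The direction in which the snake is ordered to move.

    Returns
    -------
    World
        The new state. An order to reverse direction is ignored and
        ``world`` is returned unchanged.
    """
    dx, dy = new_direction.value
    old_dx, old_dy = world.direction.value
    if (dx, dy) == (-old_dx, -old_dy):
        return world
    head_x, head_y = world.snake[0]
    new_head = (head_x + dx, head_y + dy)
    # Move the head one step and drop the last segment of the tail.
    return World(snake=(new_head,) + world.snake[:-1], direction=new_direction)
```

Most of the docstring restates what `World -> Direction -> World` and the function name already say, which is the point made above.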
+                    +
+                    +
                     +#### If your program compiles, chances are it is correct
+                    +
                     +I had read this claim in several places on the web and was initially sceptical,
                     +but can now anecdotally confirm it to be true! I reckon this surprising property
                     +of Haskell is due to the fact that Haskell programs naturally need to be decomposed
                     +into teeny weeny functions that do literally only one thing. As far as I can tell
                     +this need to decompose your program into much smaller pieces than you otherwise
                     +would is a consequence of Haskell's purity. We can't hold any mutable state in "variables",
                     +and each function returns an output and does nothing else, so there's not really much
                     +"room" to do much else than just quickly compute a value and return it. It thus
                     +becomes abundantly clear when you are writing a function whether it is correct or not,
                     +as you often only have to verify that a few expressions are correct.
+                    +
                     +Let's take the `advance` function (before we added the food) as an example.
                     +We want it to move the snake
                     +in a particular direction, unless that direction is the *opposite* direction
                     +to the direction in which the snake is currently moving (in which case
                     +`advance` should not change the state of the world). In code:
+                    +
                     +```haskell
                     +advance :: World -> Direction -> World
                     +advance w newDir
                     +    | newDir == opposite (direction w) = w
                     +    | otherwise = World { snake = slither (snake w) newDir
                     +                        , direction = newDir
                     +                        }
                     +```
+                    +
                     +The above code is obviously correct; if the new direction is opposite to the
                     +current direction, then just return the current world state, otherwise
                     +return a state of the world where the snake has slithered in the new direction.
                     +Of course we still need to verify that `opposite` and `slither` are implemented
                     +correctly, but because they have similarly restricted scopes it becomes just as
                     +easy to verify their correctness.

content/posts/coding/haskell-snake2.md

                     new file mode 100644
@@ -0,0 +1,143 @@
                     +---
                     +title: Writing a snake clone in Haskell, part 2
                     +date: 2018-01-27
                     +tags:
                     +  - coding
                     +  - haskell
                     +---
+                    +
                     +In a [previous post](haskell-snake) I talked a bit about writing a snake game in
                     +Haskell. At the end of the post we had a working game, but there was 1 ingredient
                     +missing; the snake would not go anywhere by itself! The fundamental problem was that
                     +our game was being driven by Haskell's [lazy IO][lazy-io]. Whenever a new character
                     +appeared on `stdin` the runtime would crank the handle on our Haskell code,
                     +transforming this character into a sequence of IO actions that the runtime evaluates
                     +to print the game world to the screen.
                     +This use of lazy IO meant that basically all of the logic (except
                     +drawing to the screen) could take place outside the IO monad in nice, pure code.
+                    +
                     +[lazy-io]: http://book.realworldhaskell.org/read/io.html#io.lazy
+                    +
                     +The challenge now was to find a way of inserting an extra stream of "fake messages
                     +from the keyboard" that would be delivered at regular intervals (these would make
                     +the snake move forward without me having to type a key). It seemed to
                     +make sense to retain the "pipeline" structure of the code, so I thought about
                     +modifying it as illustrated by the following ascii-art:
+                    +
                     +    directions from  >-+-------------------------+-> update game world
                     +       keyboard        |                         |    and draw update
                     +                       +-> forward most recent >-+
                     +                             every X seconds
+                    +
                     +I came across the [Pipes](https://wiki.haskell.org/Pipes) library pretty
                     +quickly, and was delighted to see that the *first example* in the
                     +`pipes-concurrency` tutorial [is a game][pipes]! Essentially all I
                     +had to do was launch 3 threads that would run the above 3 components,
                     +with each one either feeding messages to, or reading messages from,
                     +a mailbox. The above diagram translates into the following Haskell
                     +(inside the IO monad):
+                    +
                     +```haskell
                     +(mO, mI) <- spawn unbounded
                     +(dO, dI) <- spawn $ latest West
+                    +
                     +let inputTask = getDirections >-> to (mO <> dO)
                     +    delayedTask = from dI >-> rateLimit 1 >-> to mO
                     +    drawingTask = for (from mI >-> transitions initialWorld)
                     +                      (lift . drawUpdate)
                     +```
+                    +
                     +We first create some mailboxes: the main one (`mO` and `mI`), which
                     +`drawingTask` will draw directions from, and the one that will handle
                     +the delayed directions (`dO` and `dI`). Then we build up some pipelines
                     +that feed messages into, and consume messages from, these mailboxes.
                     +All we need to do now is to run each of these pipelines in a separate
                     +thread using the `async` function. This is a bit involved
                     +because we first need to "unwrap" the pipeline into an IO action using
                     +`runEffect` (and perform garbage collection ¯\\\_(ツ)\_/¯).
+                    +
                     +```haskell
                     +let run p = async $ runEffect p >> performGC
                     +tasks <- mapM run [inputTask, delayedTask, drawingTask]
                     +waitAny tasks
                     +```
+                    +
                     +[pipes]: https://hackage.haskell.org/package/pipes-concurrency-2.0.0/docs/Pipes-Concurrent-Tutorial.html
+                    +
+                    +
                     +The full code is [on Github][snake].
+                    +
                     +[snake]: https://github.com/jbweston/haskell-snake
+                    +
+                    +
                     +### Thoughts
+                    +
                     +#### Lots of stuff happens in monads
                     +I previously had the impression that Haskell code was super readable because
                     +it was composed of teeny tiny functions that only do one thing. However, after
                     +reading a bit of Haskell code (for example the [`Pipes.Concurrent`][concurrent]
                     +library) I realised that a lot of Haskell code is written inside monads which,
                     +in my opinion, harms readability. When I say that the code "happens in monads"
                     +what I really mean is that code is written using Haskell's [do notation][do]
                     +that allows you to write code that looks like it's imperative, but is really
                     +just a bunch of monadic compositions:
+                    +
                     +```haskell
                     +do
                     +    x <- x_monad
                     +    y <- returns_a_monad(x)
                     +    return (x + y)
                     +```
+                    +
                     +The above contrived example is equivalent to the following chain of monadic
                     +bind operations:
+                    +
                     +```haskell
                     +x_monad >>= (\x -> returns_a_monad(x)
                     +             >>=
                     +               (\y -> return (x + y)))
                     +```
+                    +
                     +which is certainly more difficult to read than the do notation!
                     +However, because it is easy to build up a lot of context when using do
                     +notation, I find it goes a bit against the grain of composing tiny
                     +functions that do only one thing. Hopefully as I gain competence in
                     +Haskell I'll be able to overcome these hurdles.
+                    +
                     +[concurrent]: https://github.com/Gabriel439/Haskell-Pipes-Concurrency-Library/
                     +[do]: https://en.wikibooks.org/wiki/Haskell/do_Notation
+                    +
+                    +
                     +#### Haskell's import style is scary
+                    +
                     +The language I have worked in most in recent years is Python. The
                     +[zen of Python][zen] teaches us that *explicit is better than implicit*,
                     +because it makes code easier to reason about. Given this, I find Haskell's
                     +default mode when importing modules somewhat scary. In Haskell, when you
                     +say `import Foo`, this is equivalent to saying `from foo import *` in
                     +Python. This means that you get a bunch of arbitrary names injected into
                     +your namespace. This isn't quite as bad as `import *` in Python from
                     +a code-correctness perspective because Haskell is statically typed, and
                     +so any problems will (most probably) be caught at compile time. From a
                     +code readability perspective, however, I find it to be a complete nightmare;
                     +someone reading the code has no idea where an (often cryptically named)
                     +function comes from!  For example, `Pipes.Concurrent` exports a function
                     +called `spawn` that *creates a new mailbox*. Someone reading the code may
                     +naturally assume that `spawn` has something to do with creating new threads,
                     +but without knowing even what module it comes from, it's very difficult to
                     +tell. Now Haskell experts may well respond with "read the code and the
                     +meaning will be obvious" or merely "get gud", but I would posit that *the whole
                     +point* of things like clear variable names and explicit imports is that
                     +you *shouldn't have to* "get gud" to get a sense of what some code
                     +is trying to do. Maintaining mental context is hard, and as
                     +communicators we should try and reduce the burden by not requiring people
                     +to retain excess information, such as which modules export exactly which functions.
+                    +
                     +I am, of course, aware that Haskell has several variants of its import
                     +syntax, such as `import qualified` (which requires you to prepend the namespace,
                     +as you would with a regular `import` in Python) or an explicit list of
                     +the names that should be imported. However, the overwhelming majority of Haskell
                     +code that I have read so far has made use of the unqualified syntax, making it
                     +more difficult than necessary to decipher people's code.
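The Python side of this analogy can be made concrete. The sketch below uses the standard `math` module as a stand-in (it has nothing to do with the snake code) and runs the star import in a scratch namespace, an arrangement chosen here purely so that the injected names can be listed.

```python
import math

# Qualified access: the origin of `sqrt` is visible at the call site.
print(math.sqrt(16.0))

# Star import, run in a scratch namespace so that we can inspect what it
# injects without polluting this module.
scratch = {}
exec("from math import *", scratch)
injected = sorted(name for name in scratch if not name.startswith("_"))
print(len(injected), "names injected, e.g.", injected[:5])
```

A reader who meets one of those injected names further down a file has no local clue that it came from `math`, which is the readability problem described above for unqualified Haskell imports.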
+                    +
                     +[zen]: https://www.python.org/dev/peps/pep-0020/

content/posts/coding/haskell.md

                     new file mode 100644
@@ -0,0 +1,66 @@
                     +---
                     +title: Diving into Haskell
                     +date: 2017-11-08
                     +tags:
                     +  - coding
                     +  - haskell
                     +---
+                    +
                     +Haskell has been, for a number of years, a language that I have always wanted to
                     +dive into. I've heard it lauded as the language of "true hackers",
                     +and it's somewhat of a sign that you've made it as a developer if
                     +you can make sense of its terse syntax and seemingly arcane concepts.
                     +No mutation? No for-loops? What?! How do you get *anything done* in
                     +the language if it doesn't have these most basic of control flow
                     +mechanisms?
+                    +
                     +Well, the other day I saw the following snippet as a way of generating
                     +the Fibonacci sequence in Haskell:
+                    +
                     +```haskell
                     +fibonacci = 1 : 1 : zipWith (+) fibonacci (tail fibonacci)
                     +```
+                    +
                     +and I immediately knew that I needed Haskell in my life.
                     +I didn't even fully understand it on
                     +first glance; at that point all I knew about Haskell was that *spaces*
                     +were used for function application, rather than the more traditional `()`,
                     +but already I could see the outline of what the solution
                     +meant. The Fibonacci sequence is defined as `1`, `1`, then the sum of
                     +the previous number with the one before that, recursively. *But that's
                     +exactly what the above code says*. Even to the relatively untrained eye (mine)
                     +we can kind of see that the code is telling us to start with two `1`'s, then
                     +mash together the sequence we are currently building *with itself* (dropping
                     +the first element) using `+` as the "mashing operator".
+                    +
                     +Let's contrast this to a least-effort implementation in Python that
                     +generates the same sequence:
+                    +
                     +```python
                     +def fibs():
                     +    x = y = 1
                     +    yield y
                     +    while True:
                     +        yield x
                     +        x, y = x + y, x
                     +```
+                    +
                     +This is, in my opinion, much harder to read than the Haskell version.
                     +I'm not exactly sure why; maybe it's because the Haskell version is so
                     +terse that you can hold it all in your mind's eye at once, or maybe it's
                     +got something to do with the way our brains process recursion vs. mutating
                     +values. In any case this example was enough to hook me.
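As a quick sanity check that the Python generator really does produce the Fibonacci sequence, its first few values can be pulled out with `itertools.islice`. The generator is repeated here so that the snippet runs on its own; the name `first_ten` is just for illustration.

```python
from itertools import islice


def fibs():
    x = y = 1
    yield y
    while True:
        yield x
        x, y = x + y, x


# The generator is infinite, so take a finite prefix of it.
first_ten = list(islice(fibs(), 10))
print(first_ten)  # [1, 1, 2, 3, 5, 8, 13, 21, 34, 55]
```

The Haskell counterpart of that last step would be `take 10 fibonacci`.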
+                    +
                     +I devoured the sublime "[Learn You a Haskell for Great Good!](http://learnyouahaskell.com/)"
                     +in the space of about a week, although I'm sure it will take a while before I
                     +fully digest the *meaning* of, e.g., functors and applicative functors (even if the mathematical
                     +definition is trivial). I think it's a testament to the quality of the exposition
                     +of this book that I was left with the distinct impression of having "got" monads
                     +after only a few readings (although I'm probably way off the mark). I'm not going
                     +to fall into the "[monads are like burritos](https://byorgey.wordpress.com/2009/01/12/abstraction-intuition-and-the-monad-tutorial-fallacy/)"
                     +trap, though; as far as I can tell, they appear to be just a particularly useful design pattern,
                     +and I am far from the [first person](https://www.stephanboyer.com/post/9/monads-part-1-a-design-pattern)
                     +to draw this conclusion.
+                    +
                     +My next step in Haskell is going to be to tackle a small project of very limited scope,
                     +to see if I can write anything beyond tutorial code; should be fun!

content/posts/coding/isolating-docker-containers.md

                     new file mode 100644
@@ -0,0 +1,118 @@
                     +---
                     +title: Isolating a Jupyterhub deployment
                     +date: 2017-02-02
                     +tags:
                     +  - docker
                     +  - jupyter
                     +---
+                    +
                     +In the [research group][qt] that I am a part of we use [Jupyter][jupyter] and
                     +associated projects a *lot*.  In addition to the local Jupyter instances that
                     +people may run on their private machines, we also have a [Jupyterhub][jhub]
                     +deployment that spawns Jupyter servers in Docker containers that we use for
                     +research purposes, as well as other deployments that we use for guest
                     +researchers and teaching, among other things.
+                    +
                     +One really useful recent addition to the Jupyter ecosystem is an authenticator
                     +plugin for Jupyterhub by [yuvipanda][yv] that will give a user a temporary
                     +account that will expire when they log out. Along with the [idle notebook
                     +culler][cull], this effectively allows us to set up a [tmpnb][tmpnb]
                     +deployment, but using all the existing infrastructure we have for deploying and
                     +managing Jupyterhub instances. We want to use this to host an interactive
                     +tutorial for our quantum transport simulation tool, [Kwant][kwant], that anyone
                     +can try out from wherever they are!
+                    +
                     +While this would be really awesome, there is currently one problem:
                     +we run everything on our own hardware in the university, so giving random
                     +people on the internet access to Jupyter notebook servers inside the
                     +university firewall is a recipe for disaster. To get around this problem we will
                     +use the networking capabilities of Docker along with a few iptables rules
                     +to secure our deployment.
+                    +
                     +#### Docker networking
+                    +
                     +When you create a new Docker container it will, by default, be attached to
                     +the default network bridge used by Docker. All containers connected to the same bridge
                     +will be on the same IP subnet. Restricting access between containers
                     +in this configuration is possible but cumbersome (you'd need to write firewall rules
                     +targeting each container individually). It is much simpler to first create a new
                     +"docker network", to which you attach all the containers you want to have a similar network
                     +configuration.
+                    +
                     +```bash
                     +$ docker network create --driver=bridge my_new_network
                     +48d08d196dc853e58c6115a6fab96ce84028ab68d6fa5d596c91adb406efb3ac
                     +```
+                    +
                     +The above command creates a network called `my_new_network`, which we can attach
                     +newly created containers to when invoking `docker run`:
+                    +
                     +```bash
                     +$ docker run --network=my_new_network debian:latest
                     +```
+                    +
                     +In the context of Jupyterhub, this last step is actually done with the following
                     +configuration in `jupyterhub_config.py`:
+                    +
                     +```python
                     +c.DockerSpawner.network_name = 'my_new_network'
                     +```
+                    +
                     +When we execute `docker network create` the Docker daemon actually creates a virtual
                     +ethernet bridge in the kernel. We can inspect this with `brctl`.
+                    +
                     +```bash
                     +$ brctl show
                     +bridge name         bridge id           STP enabled   interfaces
                     +br-48d08d196dc8     8000.024245cf35a7   no
                     +docker0             8000.0242874f9221   no
                     +```
+                    +
                     +We can see that our new docker network actually corresponds to the bridge interface `br-48d08d196dc8`.
                     +When a new Docker container is created its virtual network interface is attached to this
                     +bridge interface, just as if a physical machine were plugged into an ethernet switch.
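The link between the network ID and the bridge name can be read off the two outputs above: the bridge is called `br-` followed by the first 12 characters of the ID. A small Python sketch of that mapping, where the helper name `default_bridge_name` is hypothetical and the rule itself is inferred from the output shown here rather than from Docker's documentation:

```python
def default_bridge_name(network_id: str) -> str:
    """Guess the bridge interface name from a Docker network ID.

    Assumption: the name is "br-" followed by the first 12 characters of
    the ID, which is the pattern visible in the `brctl show` output.
    """
    return "br-" + network_id[:12]


network_id = "48d08d196dc853e58c6115a6fab96ce84028ab68d6fa5d596c91adb406efb3ac"
print(default_bridge_name(network_id))  # br-48d08d196dc8
```

This is handy when scripting firewall rules for a network that was created without an explicit bridge name.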
+                    +
                     +If we want a more manageable name for the virtual bridge, say `my_bridge`, we can pass it as an argument to
                     +`docker network create`:
+                    +
                     +```bash
                     +$ docker network create --driver=bridge -o "com.docker.network.bridge.name"="my_bridge" my_network
                     +```
+                    +
+                    +
                     +#### Applying IPTables rules
                     +We can now use the bridge interface in IPTables rules to control access to docker containers connected
                     +to it.
                     +For example, if we want to prevent all containers on the network from accessing the internet, we
                     +could apply the following IPTables rule:
+                    +
                     +```bash
                     +$ iptables -I DOCKER-ISOLATION -i my_bridge ! -o my_bridge -m conntrack --ctstate NEW -j REJECT
                     +```
+                    +
                     +The above command says the following: reject packets that arrive on `my_bridge`, are destined
                     +for a different interface, and start a new connection (for TCP this is typically the initial
                     +`SYN` packet), and insert this rule before any others on the `DOCKER-ISOLATION` chain. The
                     +`DOCKER-ISOLATION` chain is set up by the Docker daemon, and is jumped to from the `FORWARD` chain.
+                    +
                     +One final thing to be aware of is the kernel configuration setting
                     +`net.bridge.bridge-nf-call-iptables`. The docker containers are connected to the same network
                     +bridge, which operates on the link layer. This means that packets destined for hosts attached
                     +to the same bridge don't need to go up to the IP layer of the network stack for the kernel to
                     +process them, which means that in principle IPTables does not act on packets that are exchanged
                     +between containers on the docker network. This behaviour can, however, be controlled with the above
                     +kernel configuration. This could be useful if, for example, we want to prevent any traffic
                     +between containers on `my_new_network`:
+                    +
                     +```bash
                     +$ sysctl net.bridge.bridge-nf-call-iptables=1
                     +$ iptables -I DOCKER-ISOLATION -i my_bridge -o my_bridge -j DROP
                     +```
+                    +
                     +[qt]: https://quantumtinkerer.tudelft.nl
                     +[jupyter]: https://jupyter.org
                     +[jhub]: https://jupyterhub.readthedocs.io/en/latest/
                     +[yv]: https://github.com/yuvipanda
                     +[cull]: https://github.com/jupyterhub/jupyterhub/tree/master/examples/cull-idle
                     +[tmpnb]: https://github.com/jupyter/tmpnb
                     +[kwant]: https://kwant-project.org

content/posts/coding/kwant-tutorial.md

                     new file mode 100644
@@ -0,0 +1,18 @@
                     +---
                     +title: Kwant Tutorial
                     +date: 2019-02-27
                     +tags:
                     +  - coding
                     +  - python
                     +  - kwant
                     +---
+                    +
                     +I recently gave a talk at the University of Maryland about Kwant and using it for
                     +quantum transport. The tutorial contains an introduction to the main features of Kwant,
                     +and also a relatively in-depth discussion of the internal linear algebra that Kwant uses.
+                    +
                     +I made the slides using a Jupyter notebook, and they are available
                     +[on GitHub](https://github.com/jbweston/maryland-kwant-tutorial/) and are executable
                     +[on Binder](https://mybinder.org/v2/gh/jbweston/maryland-kwant-tutorial/master?filepath=index.ipynb).
+                    +
                     +Happy Kwanting!

content/posts/coding/markov-chain-decrypter.md

                     new file mode 100644
@@ -0,0 +1,38 @@
                     +---
                     +title: Markov Chain Monte Carlo for decryption
                     +date: 2018-11-20
                     +tags:
                     +  - coding
                     +  - haskell
                     +  - markov-chain
                     +draft: true
                     +---
+                    +
                     +Each year I teach part of the Python programming course at the
                     +Casimir research school, and each year I try and think of more
                     +short projects to offer the participants during the latter half
                     +of the course. While fishing for ideas I came across an incredibly
                     +cool idea: using Markov chains to break classic cryptographic ciphers.
+                    +
                     ++ Found this paper
                     ++ Idea is:
                     +  - Analyze a reference text and obtain bigram frequencies
                     +  - Construct a score function for a decryption key by finding
                     +    the frequencies of bigrams in the decrypted text
                     +  - Use this score function with the Metropolis-Hastings algorithm
                     +    to walk around the key space
                     ++ Coded up a solution in Python in a couple of hours, also wanted
                     +  to give it a try in Haskell, to test out iHaskell and see how good
                     +  Haskell is for "exploratory" work
+                    +
                     ++ TL;DR for exploratory work Haskell seems too restrictive. Mediocre
                     +  library documentation and overly abstracted types make error messages
                     +  impossible to debug
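The three steps of the idea listed above can be sketched in Python as follows. This is not the solution referred to in the notes: the 27-character alphabet, the add-one smoothing, the "swap two letters" proposal and all function names are assumptions chosen to keep the sketch short.

```python
import math
import random
import string
from collections import Counter

ALPHABET = string.ascii_lowercase + " "


def bigram_log_freqs(reference):
    """Log-frequency of every character pair, estimated from a reference text."""
    text = [c for c in reference.lower() if c in ALPHABET]
    counts = Counter(zip(text, text[1:]))
    total = sum(counts.values())
    n_pairs = len(ALPHABET) ** 2
    # Add-one smoothing so that unseen bigrams get a finite (low) score.
    return {
        (a, b): math.log((counts[(a, b)] + 1) / (total + n_pairs))
        for a in ALPHABET
        for b in ALPHABET
    }


def decrypt(ciphertext, key):
    """Apply a substitution key (dict: cipher character -> plain character)."""
    return "".join(key.get(c, c) for c in ciphertext)


def score(ciphertext, key, log_freqs):
    """Log-likelihood of the decrypted text under the bigram model."""
    plain = decrypt(ciphertext, key)
    return sum(log_freqs[p] for p in zip(plain, plain[1:]) if p in log_freqs)


def metropolis_hastings(ciphertext, log_freqs, steps=10_000, seed=0):
    """Walk around key space, proposing a swap of two letters at each step."""
    rng = random.Random(seed)
    letters = list(ALPHABET)
    key = dict(zip(letters, letters))  # start from the identity key
    current = score(ciphertext, key, log_freqs)
    best_key, best = dict(key), current
    for _ in range(steps):
        a, b = rng.sample(letters, 2)
        proposal = dict(key)
        proposal[a], proposal[b] = key[b], key[a]
        candidate = score(ciphertext, proposal, log_freqs)
        # The proposal is symmetric, so accept with probability
        # min(1, exp(candidate - current)).
        if candidate >= current or rng.random() < math.exp(candidate - current):
            key, current = proposal, candidate
            if current > best:
                best_key, best = dict(key), current
    return best_key, best
```

Calling `metropolis_hastings(ciphertext, bigram_log_freqs(reference_text))` returns the best key seen and its score; the reference text needs to be long enough for the bigram statistics to be meaningful.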
+                    +
                     +---
+                    +
                     ++ Keys are just maps between characters, we make RVars of them
                     ++ Trying to make sense of the required pieces of RVars is intense
                     ++ We need to run the whole Markov chain before we can get the results; not cool!
                     +  Somewhere in our monad stack we are inserting some strictness; we need to find
                     +  out where!

content/posts/coding/postscript.md

                     new file mode 100644
@@ -0,0 +1,79 @@
                     +---
                     +title: Python + Postscript = Profit!
                     +date: 2016-03-12
                     +tags:
                     +  - coding
                     +---
+                    +
                     +While setting up the computing environment for the "Introduction to
                     +Computational Quantum Nanoelectronics" [tutorial][mm16] at the APS March
                     +Meeting, I came across the problem that I needed to generate 150 chits of paper
                     +with login information on them. While all the login info was available in plain
                     +text form, this didn't really lend itself well to easy printing.  Given the
                     +number of chits we would have to generate I didn't really feel like manually
                     +copy/pasting/formatting the contents of the text file using word processing
                     +software. A colleague suggested that I take a look at [Postscript][ps], which
                     +is a language for creating vector graphics. It's pretty bare-bones in
                     +terms of the features it offers (it's mainly meant as the output of
                     +sophisticated document processors such as TeX), but for getting a few lines of text
                     +laid out on a page it's perfect. The Python snippet below shows how easy it
                     +is to write a simple Postscript generator.
+                    +
                     +```python
                     +import sys
                     +from textwrap import dedent
+                    +
                     +# dedent() strips the common indentation so that "%!" starts the file
                     +# and the "%%" structuring comments start their lines.
                     +postscript_header = dedent("""\
                     +    %!PS-Adobe-2.0
+                    +
                     +    /Inconsolata findfont
                     +    50 scalefont
                     +    setfont
+                    +
                     +    %%Pages: {0}
                     +""")
+                    +
                     +postscript_page = dedent("""\
                     +    %%Page: {0} {0}
                     +    %%BeginPageSetup
                     +      90 rotate 0 -595 translate
                     +    %%EndPageSetup
+                    +
                     +    newpath
                     +    50 400 moveto
                     +    (user: {1}) show
                     +    50 300 moveto
                     +    (password: {2}) show
                     +    showpage
                     +""")
+                    +
                     +pages = [postscript_header]
                     +pagenum = 0  # stays 0 if stdin is empty
                     +for pagenum, line in enumerate(sys.stdin, 1):
                     +    user, passwd = line.split()
                     +    pages.append(postscript_page.format(pagenum, user, passwd))
+                    +
                     +# now we know the number of pages, format the page header
                     +pages[0] = pages[0].format(pagenum)
                     +print('\n'.join(pages))
                     +```
+                    +
                     +The above snippet takes username/password pairs from `stdin` and
                     +writes a postscript document to `stdout`. It displays a single
                     +username/password pair per page in 50pt Inconsolata[^1], oriented
                     +landscape. The output can be read using most standard document viewers,
                     +and when printing it can be compacted somewhat by printing
                     +several logical pages per physical page. The raw postscript can also
                     +be converted into other formats such as PDF, which is useful because the
                     +fonts are then embedded directly into the document, so it
                     +can be easily shared.
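
One thing the generator above does not handle: inside a Postscript `(...)` string literal, a backslash or an unbalanced parenthesis has to be escaped with a backslash, and generated passwords can easily contain such characters. The sketch below shows one way this might be done; the helper name `ps_escape` is made up for illustration and is not part of the original script.

```python
def ps_escape(text):
    """Escape characters that are special inside a Postscript (...) string."""
    # Handle the backslash first, so that the backslashes added for the
    # parentheses below are not themselves escaped a second time.
    return (text.replace("\\", "\\\\")
                .replace("(", "\\(")
                .replace(")", "\\)"))


# A password containing a parenthesis and a backslash
print(ps_escape("pa(ss\\word"))
```

Balanced parentheses are technically allowed unescaped, but escaping every parenthesis is always safe and keeps the helper simple. In the generator it would be applied to `user` and `passwd` before calling `postscript_page.format(...)`.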
+                    +
                     +Now that I've seen just how easy it is to generate proper documents
                     +with Python and postscript I'm sure that I'll be integrating it
                     +into my workflow more often!
+                    +
+                    +
                     +[^1]: This font is advantageous for username/password combinations
                     +      as it distinguishes zeros from O's by putting a slash through
                     +      the former.
+                    +
                     +[mm16]: http://kwant-project.org/mm16
                     +[ps]: https://en.wikipedia.org/wiki/PostScript

content/posts/coding/stop-squashing-your-commits.md

                     new file mode 100644
@@ -0,0 +1,103 @@
                     +---
                     +title: Stop squashing your commits
                     +date: 2018-11-20
                     +tags:
                     +  - coding
                     +  - git
                     +draft: true
                     +---
+                    +
                     +Actually, don't. Or do. Well, actually, it depends.
                     +The question of whether or not you should squash your git commits together is practically a holy war at this point,
                     +but I would like to give my two cents on the issue, if only to clarify my thoughts to myself.
+                    +
                     +### Background
                     ++ when starting out, people are taught to *commit early, commit often*
                     +When people start learning Git, one of the mantras that they first learn is:
+                    +
                     +> Commit early, commit often.
+                    +
                     +The idea is to get people
                     ++ gets people used to the idea that operations in git are *cheap*, and that the advantages gained by telling git about changes
                     +  greatly outweigh the disadvantages.
                     ++ in fact, because all but a few git operations are *local*, you don't need to worry about having everything fleshed
                     +  out before taking advantage of git.
                     ++ implicit in the above is the idea
+                    +
+                    +
                     +### Git workflows
                     ++ other people have discussed git workflow, but not really gone into detail about how to craft commits
                     ++ hack until a feature is fully fleshed out, then commit it
                     +  - goes against *commit early, commit often*
                     +  - don't do this
                     ++ hack until a feature is fully fleshed out, committing at random
                     +  - git delivers diminishing returns if commits don't correspond to logical changes
                     +    - hard to skip back to a "working" state
                     +    - harder to see what has actually been changed without contextual and correct commit messages
                     +  - commits can be "at random" to a greater extent, but the idea is that commits are made according
                     +    to some scheme that is divorced from the code (e.g. committing before you go to lunch, or at the end of
                     +    the day)
                     ++ split feature into smaller pieces, then tackle these pieces one at a time, making a single commit for each
                     +  - this is the "ideal", we should strive for this
                     +  - difficult to do in practice because it presupposes that you already understand the problem well enough
                     +    to split it up.
+                    +
                     +###
+                    +
+                    +
                     +The idea is to get people
                     +used to the idea that, when using git, commits are *cheap*. This is in contrast to older version control systems
                     +(VCS) where committing is often a rather hefty operation, and where the idea of making *several commits a day* is
                     +pretty crazy. The unintended consequence of this mentali
+                    +
                     +The fundamental point is that *git history is a historical record*. This might seem like a tautology, but
                     +my point is that there are several ways of interpreting git history.
+                    +
+                    +
                     +### The git history as an eye-witness account / lab book
                     +History is *what actually happened*.
                     +"From the trenches". You feel like you're there; oh the trailing whitespace; oh the humanity!
                     +Get accounts from both sides of the battle.
+                    +
                     ++ immutability is a public good
                     ++ don't get problems when collaborating
+                    +
                     +### The git history as a historical account / scientific article
                     +History is written by the victors. When writing a scientific article, you have to keep in mind
                     +that the majority of your readers don't care whether you spent several weeks on a particularly
                     +difficult calculation, trying many routes that ended up as dead ends; all they care about is
                     +the *correct* method for obtaining the results.
+                    +
                     +Of course, it may be that you publish something incorrect, and later have to write another paper
                     +to correct the previous one (rare in the academic world, but very common when writing code). This
                     +is to some extent unavoidable, but it does not diminish the importance of taking the time to distill
                     +down the experience into a digestible portion.
+                    +
                     +----
                     +*Why are you recording all of history*? Why not use Dropbox, or Dropbox + some history (say 1 week / 1 month)?
                     +Could be several reasons:
+                    +
                     ++ auditing (knowing who changed what). Might be important for businesses for knowing who to promote/fire
                     ++ improving understanding
                     ++ increased control / granularity
+                    +
                     +### The git history as a lab book
                     +An immutable record of things "as they went down". Many dead ends and mistakes.
                     +Requires extra work to see what the actual progress was.
+                    +
                     +### The git history as a journal article
                     +Curated to make the content as understandable as possible. Nobody cares that you spent 3
                     +days tracking down a particularly insidious bug.
+                    +
+                    +
                     +### Two distinct modes of operation
                     ++ A personal record
                     +  - a quick backup in case I accidentally `rm` something
                     +  - allows me to explore various paths
+                    +
                     ++ An account for others to understand the decisions that went into making the code.
                     +  If there is something that I don't understand, I will usually grep the git log for that
                     +  change.
+                    +
                     +- could use branches as the "article" and individual commits as the "lab book". The message of the merge commit
                     +  should be the same as the message of the single commit in the "article" way of working.

content/posts/coding/stop-teaching-git-pull.md

                     new file mode 100644
@@ -0,0 +1,87 @@
                     +---
                     +title: A case for rebase
                     +date: 2017-10-23
                     +tags:
                     +  - coding
                     +  - git
                     +---
+                    +
                     +There are a lot of Git tutorials on the web that teach people to use `git pull`
                     +when first introducing remote repositories and collaboration.
                     +I would like to put forward the position that this is a Bad Idea (TM), and that
                     +it is more instructive to teach people to use `git fetch` followed by an explicit
                     +`git merge`.
+                    +
                     +I understand the temptation of teaching people to just `git pull`, because it's
                     +a single command (rather than 2) and often it "just werks". On the other hand
                     +I get the impression that teaching people only `git pull`
                     +reinforces an incorrect mental model that causes a ton of confusion when
                     +there are (as there inevitably are) conflicts with the remote repository.
                     +In addition, I've noticed that often people just want to see what their collaborators have
                     +done, without necessarily incorporating those changes into their own work.
                     +Teaching the two operations separately enables this workflow; without it you have
                     +to introduce `git reset` just so that people can get themselves back to their previous
                     +state!
+                    +
                     +Because working with a remote repository is essentially (pedants, please contain
                     +yourselves) working with multiple branches I personally think that it is really useful to
                     +teach branches *before* remote repositories[^1]. Once people have the concept of
                     +branches down, it's then a pretty small leap to "by the way, you can fetch
                     +the state of *other people's branches* with `git fetch`". You then explain that
                     +the branch shows up on your local machine as `origin/whatever-branch-name`, and
                     +that you shouldn't try and make commits directly on this branch because it's
                     +"owned" by `origin`. At this point it's probably a good idea to show what happens
                     +when the remote repository is updated by somebody else, so that there is a "fork"
                     +in the history:
+                    +
                     +          ◯—◯ ← origin/master
                     +         ╱
                     +    ◯—◯—◯—◯—◯ ← master
+                    +
                     +[^1]: This is, of course, tough if you are teaching a Github-centric workflow.
                     +      One way around this may be to get people to initialize their local repositories
                     +      by cloning, and then forget about the remote entirely until the time is right.
+                    +
                     +You can then say "ok, `origin/master` and `master` now contain *different things*;
                     +we need to incorporate the changes on `origin/master` into our own".
                     +With that you introduce `git merge`, and can show the updated history after that
                     +operation:
+                    +
                     +          ◯—◯ ← origin/master
                     +         ╱   ╲
                     +    ◯—◯—◯—◯—◯—◯ ← master
+                    +
                     +Then you can `git push origin master` and show what that does locally:
+                    +
                     +          ◯—◯
                     +         ╱   ╲
                     +    ◯—◯—◯—◯—◯—◯ ← master, origin/master
+                    +
                     +Teaching this sequence of operations, it is abundantly clear that `git fetch` only
                     +updates `origin/master`; it will *never affect what you are working on right now*.
                     +It's the way that you see what other people are working on, while you also continue
                     +working on your own thing. It's also clear that `git merge` *totally affects what
                     +you're working on right now*, so you'd better get yourself into a place where
                     +you're ready to have your files modified as git magically incorporates all those
                     +sweet sweet changes that your buddy just pushed.
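
The division of labour between the two commands can be sketched with a toy model in Python. This is emphatically not how git is implemented: the `Repo` class and the `fetch`/`merge` helpers below are made up for illustration, and a "repository" here is just a table of commits plus some named refs. It only aims to show which ref each operation is allowed to move.

```python
class Repo:
    """Toy repository: commits (id -> parent ids) and refs (name -> commit id)."""

    def __init__(self):
        self.commits = {}
        self.refs = {}

    def commit(self, branch, commit_id):
        # A new commit's parent is whatever the branch currently points at.
        parent = self.refs.get(branch)
        self.commits[commit_id] = (parent,) if parent is not None else ()
        self.refs[branch] = commit_id


def fetch(local, remote, branch="master"):
    """Copy the remote's commits and move ONLY the remote-tracking ref."""
    local.commits.update(remote.commits)
    local.refs["origin/" + branch] = remote.refs[branch]


def merge(local, other, merge_id, branch="master"):
    """Create a merge commit with two parents and move the local branch."""
    local.commits[merge_id] = (local.refs[branch], local.refs[other])
    local.refs[branch] = merge_id


# Shared history, then both sides add a commit: the "fork" in the diagrams.
remote = Repo()
for c in ["a", "b", "c"]:
    remote.commit("master", c)

local = Repo()
fetch(local, remote)
local.refs["master"] = local.refs["origin/master"]   # roughly what a clone gives you

remote.commit("master", "theirs")   # a collaborator pushes
local.commit("master", "mine")      # meanwhile, we commit locally

fetch(local, remote)
print(local.refs)   # origin/master moved, master did not

merge(local, "origin/master", "merged")
print(local.refs)   # now master has moved as well
```

After the second `fetch`, `master` still points at our own commit; only the explicit `merge` changes what we are working on.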
+                    +
                     +This workflow also mitigates the common pitfall of:
+                    +
                     +    $ git push
                     +        To git-example-origin
                     +        ! [rejected]        master -> master (fetch first)
                     +        error: failed to push some refs to 'git-example-origin'
                     +    $ git pull
                     +        Auto-merging
                     +        CONFLICT (content): Merge conflict in hello-world
                     +        Recorded preimage for 'hello-world'
                     +        Automatic merge failed; fix conflicts and then commit the result.
+                    +
                     +So instead of "congratulations, your code is now full of conflict markers,
                     +have fun!" you get to inspect the changes that were introduced by the remote
                     +*before* you try to merge them in. This means you can anticipate if there
                     +will be any problems, and know what to expect when you try to merge.
+                    +
                     +You could even imagine running `git fetch` periodically to keep your `origin/*`
                     +branches up to date with any changes on the remote. This would be complete
                     +madness if you tried to do the same thing with `git pull`!

content/posts/coding/think-differently.md

                     new file mode 100644
@@ -0,0 +1,125 @@
                     +---
                     +title: On thinking differently
                     +date: 2018-02-07
                     +tags:
                     +  - coding
                     +draft: true
                     +---
+                    +
                     +This post is about an experience I had while solving a kata-style coding
                     +exercise. While the problem itself was very well defined and had a simple
                     +solution, I was very taken aback that I did not see the *most* elegant and
                     +simple solution, despite my proclaimed fluency with programmatic problem
                     +solving. This experience taught me that I still have a lot to learn about
                     +thinking outside the box, and I'm writing it down here mainly to try and
                     +articulate my thoughts to myself.
+                    +
                     +#### Let's begin
+                    +
                     +A colleague of my partner regularly posts small coding exercises to the
                     +company Slack channel. I think that this is a great idea for several reasons:
+                    +
                     ++ it gives people practice at translating problem specifications into code,
                     ++ it makes people think about problems that are different to those on which
                     +  they work day to day,
                     ++ and it provides a central point for discussions about the merits of different
                     +  ways of attacking problems, in addition to coding style.
+                    +
                     +The exercises do not even have to be very complicated (in fact I think that this
                     +is best); the most recent exercise was as follows:
+                    +
                     +> Write a function that returns the most commonly occurring alphabetic character
                     +> in a string, treating uppercase and lowercase letters as equivalent.
                     +>
                     +> If two characters occur equally often, the one that occurs earlier in
                     +> the alphabet should be returned.
+                    +
                     +#### My solution
+                    +
                     +Seems simple enough, right? I coded up the simplest solution I could think
                     +of in a few minutes:
+                    +
                     +    :::python
                     +    from collections import Counter
+                    +
                     +    def most_common(s):
                     +        s = (c for c in s.lower() if c.isalpha())
                     +        most_common, count = max(Counter(s).items(),
                     +                                 key=lambda c: (c[1], -ord(c[0])))
                     +        return most_common
+                    +
                     +I will call the above code "solution 1".
                     +I was convinced that this was the optimal solution:
+                    +
                     ++ We keep only the characters we care about, so the counting logic
                     +  does not run for characters that we will later throw away,
                     ++ We use a generator expression to avoid making a filtered copy of the
                     +  (potentially large) string in memory,
                     ++ We make a single pass over the input string.
+                    +
                     +#### The *other* solution
+                    +
                     +This was the solution that was posted to the Slack channel after everyone had
                     +submitted theirs:
+                    +
                     +    :::python
                     +    from string import ascii_lowercase
+                    +
                     +    def most_common(s):
                     +        return max(ascii_lowercase, key=s.lower().count)
+                    +
                     +I will call this code "solution 2". Just looking at it, this is *much* cleaner
                     +than solution 1 (although, embarrassingly, it actually took me a minute to
                     +understand how it handles the edge case where two characters have the same
                     +count). It also works in a fundamentally different way to solution 1:
                     +here we iterate over the characters that we are interested in (`ascii_lowercase`)
                     +and compare them based on the number of times that they occur in the input
                     +string, taking the character with the maximum count. If several characters
                     +have the same count, then `max` will choose the one that it encounters first
                     +in the iterable it was given, which here means the one earliest in the alphabet
                     +(it has the [same semantics as a stable sort][max-doc]).
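
The tie-breaking behaviour is easy to check directly. The snippet below is a small sketch that reuses solution 2 as written above and then hands `max` the same two candidates in the opposite order, to show that it is the order of the iterable (not the order of the input string) that decides the winner.

```python
from string import ascii_lowercase


def most_common(s):
    return max(ascii_lowercase, key=s.lower().count)


# 'a' and 'b' both occur twice; 'a' comes first in ascii_lowercase
print(most_common("bbaa"))

# Same counts, but now 'b' is encountered first, so 'b' is returned
print(max(["b", "a"], key="bbaa".count))
```

A side effect of the same rule: for an input with no letters at all every count is zero, so solution 2 returns `'a'`, whereas solution 1 raises a `ValueError` because `max` is given an empty sequence.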
+                    +
                     +[max-doc]: https://docs.python.org/3/library/functions.html#max
+                    +
                     +Despite its readability I was initially skeptical because we make *26 passes
                     +over the input string*, rather than just 1. It is also the case that even if the
                     +input string contains only the character 'a' (for example) we will still iterate
                     +through the damn string 25 more times, counting up the occurrences of 'b', 'c'
                     +and so on! This is even though we *know* that it doesn't contain anything but `a`s after the
                     +first iteration. My partner had actually tried to solve the problem in a
                     +similar manner to this, but I had dismissed it as suboptimal for the
                     +aforementioned reason. I said to myself "*sure, this seems cleaner, but
                     +there's **no way** that it's more efficient*".
+                    +
                     +This is why it was a huge shock to me that *solution 2 actually outperforms
                     +solution 1* in terms of run time.
+                    +
                     +What I failed to account for is that *in this case we don't care about
                     +asymptotic complexity*. Subconsciously I had been thinking: "*hm, if the problem
                     +requirements change and we now want to find the most commonly occurring unicode
                     +character then we would have to iterate over the input string [a hundred
                     +thousand times][unicode]; not cool!*". However, the problem very clearly states
                     +that *we only care about ascii lowercase characters*. In this regime
                     +solution 2 performs way better because the counting of individual characters is
                     +done by the builtin string method `str.count`, which uses a [tight C
                     +loop][fastcount]. Compare this to solution 1, where we iterate over the input
                     +string in a [python loop][counter], incurring the additional cost of a
                     +dictionary lookup, integer addition, and an `isalpha()` check from Python, phew!
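
For anyone who wants to reproduce the comparison, a minimal benchmark sketch follows. The function names `solution_1` and `solution_2`, the test string and the repeat count are all arbitrary choices made here for illustration; the absolute numbers will depend on the machine and the Python version, so no timings are quoted.

```python
import timeit
from collections import Counter
from string import ascii_lowercase


def solution_1(s):
    # Single pass in Python: filter, count in a dict, then pick the maximum
    letters = (c for c in s.lower() if c.isalpha())
    letter, _count = max(Counter(letters).items(),
                         key=lambda c: (c[1], -ord(c[0])))
    return letter


def solution_2(s):
    # 26 passes, but each one is a str.count call implemented in C
    return max(ascii_lowercase, key=s.lower().count)


text = "The quick brown fox jumps over the lazy dog. " * 2000

# Both solutions should agree before we bother timing them
assert solution_1(text) == solution_2(text)

repeats = 20
for f in (solution_1, solution_2):
    seconds = timeit.timeit(lambda: f(text), number=repeats)
    print(f"{f.__name__}: {seconds / repeats * 1e3:.2f} ms per call")
```

It is worth trying very short inputs as well, since the fixed cost of 26 `str.count` calls matters more there.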
+                    +
                     +[unicode]: https://en.wikipedia.org/wiki/List_of_Unicode_characters
                     +[fastcount]: https://github.com/python/cpython/blob/master/Objects/stringlib/fastsearch.h#L187
                     +[counter]: https://github.com/python/cpython/blob/master/Lib/collections/__init__.py#L486
+                    +
                     +#### Thoughts
                     +This blog post is mainly just me proving to myself that, in this instance, *I was
                     +wrong*. My solution was inferior in every possible metric. This was initially quite hard to swallow, as
                     +before seeing solution 2 I was fully convinced that it was impossible within the
                     +confines of Python to express a solution more cleanly and efficiently. Boy have I
                     +got a lot to learn!
+                    +
                     +It was also a good reminder for me to make sure that I actually optimise
                     +my designs for the *intended use case*. I have a natural tendency to try and
                     +write code that solves a problem more general than the one initially
                     +formulated. Although I'll try and justify this as making the code more
                     +"reusable" or "extensible", the real reason is probably just that I enjoy
                     +extracting the abstract structure of a problem. I really have to work on not
                     +[abstracting into the stratosphere](https://www.joelonsoftware.com/2001/04/21/dont-let-architecture-astronauts-scare-you/).

content/posts/coding/trolling-physicists.md

                     new file mode 100644
@@ -0,0 +1,41 @@
                     +---
                     +title: April fools!
                     +date: 2018-04-05
                     +tags:
                     +  - physics
                     +draft: true
                     +---
+                    +
                     +So on April 1st Anton and I posted to the group's [blog](https://quantumtinkerer.tudelft.nl/blog/machine-learning-articles/) about
                     +a fascinating project that we'd been working on in the preceding month. We had been using "advanced machine learning techniques" to
                     +conduct sentiment analysis on scientific articles to see if they contain irrefutable evidence for various breakthroughs such as
                     +a working quantum computer or (from our own field) Majorana zero modes. Try it for yourself below!
+                    +
                     +<iframe class="centered" src="https://ai.weston.cloud"></iframe>
+                    +
                     +To the untrained eye it even looks pretty plausible: there's
                     +a flashy animation when the "predicting" happens, and it even shows you the name of the article that you asked about. Of course
                     +digging even a little bit into the source code reveals that we're using a
                     +[somewhat simplistic model](https://gitlab.kwant-project.org/jbweston/is-it-majoranas/blob/master/backend/Main.hs#L21)
                     +(no chance that we're overfitting here!); nevertheless we manage to get 100% accurate results!
+                    +
                     +What surprised me the most was that Anton was managing to sustain conversations with colleagues about this "project"
                     +and people seemed to be treating it seriously! Of course it's entirely possible that they were just playing along,
                     +and that we were, in fact, the ones who were being trolled.
+                    +
                     +In any case it was a fun little project for me, as my main goal was to test out the
                     +[Elm language](http://elm-lang.org/) for building frontends for webapps.
                     +We'd switched to the React framework for the rewrite of our [Zesje](http://gitlab.kwant-project.org/zesje/zesje) grading software but
                     +I was eager to see what pure functional programming could bring to the game with respect to managing state. Although in principle I
                     +find the idea of modelling a webapp as a well-defined state machine appealing, in practice I found that for such a small project the
                     +hoop-jumping was more hassle than it was worth.
+                    +
                     +I was also interested in seeing how easy it would be to write a small web API in Haskell.
                     +The excellent [Haskell From First Principles](http://haskellbook.com) uses the Scotty web framework in several examples, so I thought
                     +I'd give that a go. Again, I think that the limited scope of the project really hindered any possible gains that pure functional
                     +programming could provide. Even if pure functional programming gives an asymptotic advantage (in terms of development time and
                     +confidence about a codebase), the relatively large prefactor associated with getting anything done is really significant.
                     +For example, I had to research and import 3 separate network-related libraries to be able to serve the API (`Scotty`), return
                     +HTTP 400 responses (`Network.HTTP.Types`) and send web requests to other APIs (`Network.Wreq`), in addition to another library
                     +(Lens) for accessing attributes from the responses from the `Wreq` library (really). Sometimes it feels like Haskell makes
                     +the simple things much more complicated than they need to be (even if it does make some complicated things easier).