Haskell, Hakyll and Github Actions

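This is a short write-up on my current set-up for deploying this site. It covers the manual workflow, the automatic workflow using Github Actions, some extra snippets, and a few pain points and open ends.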

The manual workflow

The site is generated using the Hakyll static site generator. Using it involves writing a small Haskell program, site, that calls out to the Hakyll libraries to build the website.

This program and the rest of the website sources live in one repository, and are deployed to Github Pages by pushing to a second repository, robx/robx.github.io. The actual source repository is private, but you can check out a snapshot at robx/site-demo.

Updating the site then takes four steps:

  1. edit some website source files
  2. build the site executable
  3. run site to generate the website content
  4. commit and push the updated website to the destination repository

Doing this manually, we would build the executable using cabal build, then build the site using cabal exec site build, and finally do something along the following lines to deploy:

$ cd /src/robx/robx.github.io
$ rm -r * && cp -r /src/robx/site/_site/* .
$ git add -A && git commit -m "update website" && git push
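
As a rough sketch, the copy step above could be wrapped in a small helper. The function name sync_site is made up for illustration, and unlike the plain rm -r * it also clears dotfiles while explicitly keeping the .git directory. It assumes a find that supports -mindepth and -maxdepth (GNU and BSD both do).

```shell
# Hypothetical helper: replace the contents of a destination checkout with a
# freshly generated site, leaving the .git directory untouched.
sync_site() {
    src=$1
    dest=$2
    # Refuse to run with missing directories rather than deleting blindly.
    [ -d "$src" ] && [ -d "$dest" ] || return 1
    # Remove every top-level entry of dest except .git.
    find "$dest" -mindepth 1 -maxdepth 1 ! -name .git -exec rm -r {} +
    # Copy the generated files (including dotfiles) into dest.
    cp -r "$src"/. "$dest"/
}
```

It would be called as sync_site /src/robx/site/_site /src/robx/robx.github.io, followed by the same git add, commit and push as before.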

The automatic workflow

Since doing this by hand is too much work and a bit error-prone, let’s automate it using Github Actions. To do this, we define a workflow using a YAML file build-deploy.yml that lives in .github/workflows/. You can view the full workflow file, or check the (messy) run history at the demo repository. Below, we’ll go through that file chunk by chunk.

Our workflow gets a name, is set up to execute on push events, and has one job with id build-deploy.

name: Build and deploy to github pages
on: push

jobs:
  build-deploy:

Now comes the body of our build-deploy job. It gets a name as well as a base virtual environment. We also set some variables, since we’ll need to refer to the tool versions twice later on.

    name: Build and deploy
    runs-on: ubuntu-latest
    env:
      GHC_VERSION: '8.6.5'
      CABAL_VERSION: '3.0'

Finally, we give a list of steps that the job should perform. First, we call out to a github-provided action that checks out the working tree corresponding to the event that triggered the workflow.

    steps:
      - uses: actions/checkout@master

Next, we call out to an action to install a Haskell toolchain. We specify versions for GHC and Cabal, referring to the variables using an ad hoc expression language.

      - uses: actions/setup-haskell@v1
        with:
          ghc-version: ${{env.GHC_VERSION}}
          cabal-version: ${{env.CABAL_VERSION}}

Building Haskell projects tends to take too much time: A fresh build of Hakyll on the Github Actions infrastructure takes around 30 minutes. So we’ll cache the compiled dependencies. We’ll be using Cabal’s Nix-style builds here, which store artifacts in $HOME/.cabal/store.

The way caching works, we need to provide a cache key that includes full dependency version information. For simplicity, we’ll assume the existence of a Cabal version locking file cabal.project.freeze; see below for another approach.

      - name: 'Run actions/cache@v1: cache cabal store'
        uses: actions/cache@v1
        with:
          path: ~/.cabal/store
          key: cabal-store-${{ runner.OS }}-${{ env.GHC_VERSION }}-${{ hashFiles('cabal.project.freeze') }}
          restore-keys: |
            cabal-store-${{ runner.OS }}-${{ env.GHC_VERSION }}-
            cabal-store-${{ runner.OS }}-

When run, this action will restore any existing archive under the given key, falling back to any of the alternate keys listed under restore-keys. In addition, the action has a “post action”, which will save an archive at the end of a successful run.
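
To make the fallback order concrete, here is a simplified model of the lookup in shell. It is not the action's real implementation: the function name pick_cache is made up, and it assumes the existing cache keys arrive on stdin with the newest first. An exact match on the primary key wins; otherwise each restore prefix is tried in order.

```shell
# Simplified model of cache lookup (assumption: existing keys on stdin,
# newest first). Usage: pick_cache KEY [RESTORE_PREFIX...]
pick_cache() {
    key=$1
    shift
    existing=$(cat)
    # 1. An exact match on the primary key wins.
    if printf '%s\n' "$existing" | grep -Fxq -- "$key"; then
        printf '%s\n' "$key"
        return 0
    fi
    # 2. Otherwise try each restore prefix in order and take the newest
    #    existing key that starts with it.
    for prefix in "$@"; do
        match=$(printf '%s\n' "$existing" |
            awk -v p="$prefix" 'index($0, p) == 1 { print; exit }')
        if [ -n "$match" ]; then
            printf '%s\n' "$match"
            return 0
        fi
    done
    # 3. Nothing matched: the build starts from an empty store.
    return 1
}
```

With the keys from the step above, a changed freeze file misses the exact key but still picks up the most recent store for the same GHC version, so only the changed dependencies need rebuilding.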

Now we’re ready to execute a couple of commands to build the Haskell project and generate the site:

      - run: cabal update
      - run: cabal build --only-dependencies
      - run: cabal build
      - run: cabal exec site build

  • cabal update fetches the package database from Hackage. (This package database might also be cached between runs, but at ~30s I haven’t bothered so far.)
  • cabal build --only-dependencies builds the dependencies only. It’s useful to split this from the project build itself below:
    1. While getting things to work, we can disable later steps in order to get the dependency cache ready, making iterating on the later steps a lot faster.
    2. We can easily distinguish between problems with the project itself and with the packaging infrastructure.
  • cabal build builds the Hakyll site executable itself.
  • cabal exec site build calls this executable, generating the website.
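
The same split into stages is handy when reproducing the build locally. The following is a minimal sketch, with the hypothetical helper names stage and build_site, that runs the four commands in order and reports which stage failed, so a dependency problem is easy to tell apart from a problem in the project itself.

```shell
# Run one labelled stage; on failure, say which stage broke and stop.
stage() {
    label=$1
    shift
    echo "==> $label"
    "$@" || { echo "stage failed: $label" >&2; return 1; }
}

# The four build commands from the workflow, one stage each.
build_site() {
    stage "update package index" cabal update &&
    stage "build dependencies" cabal build --only-dependencies &&
    stage "build site executable" cabal build &&
    stage "generate website" cabal exec site build
}
```

Running build_site from the source repository stops at the first failing stage and names it on stderr.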

Finally, we commit the updated version of the site to the github pages repository, using one of a multitude of third-party actions that deal with this task. This action in particular has the advantage of supporting ssh deploy keys, while most other actions appear to require a (far more powerful) personal access token to interact with a different repository.

      - name: 'Run peaceiris/actions-gh-pages@v2.5.0: deploy to github pages'
        uses: peaceiris/actions-gh-pages@v2.5.0
        env:
          ACTIONS_DEPLOY_KEY: ${{ secrets.ACTIONS_DEPLOY_KEY }}
          PUBLISH_BRANCH: master
          PUBLISH_DIR: _site
          EXTERNAL_REPOSITORY: robx/robx.github.io
        if: github.ref == 'refs/heads/master'

The if: condition ensures that this step is only run on pushes to master.

To configure the deploy key, generate an ssh key-pair, add the public key to the github pages repository as a deploy key, and store the private key in the source repository secrets under the key ACTIONS_DEPLOY_KEY. The README of this action has detailed instructions.
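
Generating the key pair might look like the sketch below. The helper name make_deploy_key, the file name gh-pages and the comment string are arbitrary choices for illustration; the action's README remains the authoritative reference for the exact procedure.

```shell
# Hypothetical helper: create a passphrase-less key pair for deployment.
# Writes the private key to KEYFILE and the public key to KEYFILE.pub.
make_deploy_key() {
    keyfile=$1
    ssh-keygen -t ed25519 -N "" -C "site deploy key" -f "$keyfile"
}
```

After make_deploy_key gh-pages, the contents of gh-pages.pub go into the deploy keys of the github pages repository (with write access), and the contents of gh-pages go into the ACTIONS_DEPLOY_KEY secret of the source repository.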

Some extra snippets

Debugging

The following step is useful to have in there to aid in debugging a workflow. It stores a number of contexts to env, which is enough to be able to inspect them in the web interface.

- name: Dump contexts
  env:
    CTX_GITHUB: ${{ toJson(github) }}
    CTX_STEPS: ${{ toJson(steps) }}
    CTX_ENV: ${{ toJson(env) }}
  run: true

Caching with cabal.project.freeze

To get reliable caching with cabal regardless of the existence of a freeze file, you can reorder things as follows:

- run: cabal update
- run: '[ -e cabal.project.freeze ] || cabal freeze'
- name: 'Run actions/cache@v1: cache cabal store'
  uses: actions/cache@v1
  with:
    path: ~/.cabal/store
    key: cabal-store-${{ runner.OS }}-${{ env.GHC_VERSION }}-${{ hashFiles('cabal.project.freeze') }}
    restore-keys: |
      cabal-store-${{ runner.OS }}-${{ env.GHC_VERSION }}-
      cabal-store-${{ runner.OS }}-
- run: cabal build --only-dependencies

This generates an up-to-date freeze file and uses it to compute the cache key.

It’s necessary to get the version information into the cache key rather than, say, just hashing the cabal file itself. Otherwise, the cache key stays constant across runs, so the cache is never updated, even as it becomes outdated relative to upstream.
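
The effect can be seen by computing a comparable key locally. This is only an illustration: the function name cache_key is made up, it assumes sha256sum is available, and the digest it produces is not the same value that hashFiles computes. The point is just that any change to the pinned versions changes the key.

```shell
# Illustrative only: derive a cache key from OS, GHC version and the
# contents of a freeze file. Not identical to what hashFiles() returns.
cache_key() {
    os=$1
    ghc=$2
    freeze=$3
    # Hash the freeze file; any change in pinned versions changes the digest.
    digest=$(sha256sum "$freeze" | cut -d' ' -f1)
    printf 'cabal-store-%s-%s-%s\n' "$os" "$ghc" "$digest"
}
```

For example, cache_key Linux 8.6.5 cabal.project.freeze yields a new key whenever cabal freeze pins a different version, which causes a fresh archive to be saved at the end of that run.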

Github Actions pain points, open ends

This works and I’m happy enough with it. Getting to this state was quite painful though, and I’m not thrilled with Github Actions in their current state. Some random thoughts:

There are also some open ends on the Haskell side of things.