
For this pipeline you need two things: a hosted Ghost CMS that Gatsby can pull content from at build time, and a place to host the built static website.

There are several options for how and where to host your Ghost CMS; you could even run a local install and only push the built static site to your hosting platform. I decided early on that I wanted everything on Amazon Web Services: Ghost runs on an EC2 instance, and my Gatsby project lives on GitHub so that CodeBuild can automatically deploy changes to the S3 bucket where the static site is hosted. For HTTPS/SSL I use CloudFront, which also acts as the CDN (not really important for a niche personal blog).

I recommend sticking to this tutorial for the setup as I have described it above. I just want to add a few points where I had to come up with my own solutions.

CloudFront Caching

When using CloudFront as a CDN, you will most likely want your data to be fresh after you push an update or publish a post. An extra step is required so you don't serve stale content. I tried two approaches, Lambda functions and invalidating the cache after pushing an update, and ultimately decided to go the invalidation route.

Lambda@Edge Functions

When using CloudFront to distribute your content, you can apply Lambda functions to hook into the request/response process between the user, CloudFront, and your origin. You can read more about this here.

Source: Amazon lambda@edge

The key point is that you can attach at most one Lambda function at each stage of the request/response pipeline. To prevent a stale cache, we alter the headers of the response to tell CloudFront that the data needs to be revalidated with the origin server. This is taken from the tutorial by Ximedes.

  1. Go to Lambda functions in your AWS console and create a new function. Importantly, the region MUST be us-east-1 (N. Virginia); otherwise you can't access the CloudFront triggers.
  2. Select Node 8.10 as the runtime and add the necessary security permissions to execute Lambda functions (basic Lambda@Edge permissions).
  3. From the dropdown at the top select $LATEST and add the following code in the editor. This adds the Cache-Control header so that everything in the static folder gets a max age of one year, while everything else is considered stale and must be revalidated.
'use strict';

// Runs on CloudFront's origin-response event: after the origin answers,
// before the object is cached and returned to the viewer.
exports.handler = (event, context, callback) => {
  const request = event.Records[0].cf.request;
  const response = event.Records[0].cf.response;
  const headers = response.headers;

  // Gatsby content-hashes everything under /static/, so those files
  // can be cached for a year and marked immutable.
  if (request.uri.startsWith('/static/')) {
    headers['cache-control'] = [
      {
        key: 'Cache-Control',
        value: 'public, max-age=31536000, immutable'
      }
    ];
  } else {
    // Everything else (HTML, page data) must be revalidated on each request.
    headers['cache-control'] = [
      {
        key: 'Cache-Control',
        value: 'public, max-age=0, must-revalidate'
      }
    ];
  }

  headers['vary'] = [
    {
      key: 'Vary',
      value: 'Accept-Encoding'
    }
  ];

  callback(null, response);
};
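Before wiring the function into CloudFront, you can sanity-check the header logic locally by invoking the handler with a mocked event. The snippet below re-declares the handler so it is self-contained, and the mock event is a hypothetical minimal subset of what CloudFront actually sends:

```javascript
'use strict';

// Same logic as the Lambda above, re-declared so this snippet runs standalone.
const handler = (event, context, callback) => {
  const request = event.Records[0].cf.request;
  const response = event.Records[0].cf.response;
  const headers = response.headers;

  if (request.uri.startsWith('/static/')) {
    headers['cache-control'] = [
      { key: 'Cache-Control', value: 'public, max-age=31536000, immutable' }
    ];
  } else {
    headers['cache-control'] = [
      { key: 'Cache-Control', value: 'public, max-age=0, must-revalidate' }
    ];
  }
  headers['vary'] = [{ key: 'Vary', value: 'Accept-Encoding' }];
  callback(null, response);
};

// Minimal mock of the CloudFront origin-response event shape.
const mockEvent = (uri) => ({
  Records: [{ cf: { request: { uri }, response: { headers: {} } } }]
});

handler(mockEvent('/static/app.js'), null, (err, res) => {
  console.log(res.headers['cache-control'][0].value);
  // -> public, max-age=31536000, immutable
});

handler(mockEvent('/index.html'), null, (err, res) => {
  console.log(res.headers['cache-control'][0].value);
  // -> public, max-age=0, must-revalidate
});
```

Running this with `node` confirms that static assets and pages receive the two different Cache-Control values before you deploy anything.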

The next step is to add the function to your CloudFront distribution:

  1. At the top, select Publish new version from the Actions dropdown.
  2. Select the newly created version.
  3. On the left, search for the CloudFront trigger and add it. This opens a settings prompt. Select your CloudFront distribution in the Distribution field.
    Leave * as the path, meaning all paths in your distribution will trigger the rule.
    Select origin-response as the trigger event and tick the checkbox to enable it on the distribution. This may take a few minutes.
  4. Go to your CloudFront distribution and check the default behaviour. At the bottom of the edit page you should now see your Lambda function with the associated behaviour.
  5. !Important! Check that Object Caching is set to "Use Origin Cache Headers"; otherwise CloudFront will use its default settings.

Invalidating the cache after pushing an update

The second variant I am going to introduce is to invalidate the CloudFront cache after CodeBuild copies your built website to S3. If you read the documentation on cache invalidation you will find a statement about cost which, to sum it up, says that every path you invalidate costs money. Although you get 1,000 free invalidations per month, I thought invalidating every file would cost too much. However, all files invalidated with a wildcard count as only one invalidation, which means after pushing an update we just invalidate the root folder.

  1. Update the buildspec.yml file with a line that invalidates your CloudFront distribution.
  2. Give your CodeBuild project additional permission to access CloudFront by attaching the CloudFrontFullAccess policy to your CodeBuild project's role.
  3. Go to your CloudFront distribution and edit the default behaviour. Set Object Caching to Customize and set all time-to-live (TTL) values to 31536000 (one year).
version: 0.2
phases:
  install:
    runtime-versions:
      nodejs: 10
    commands:
    - npm install --global yarn
    - npm install --global gatsby-cli
  build:
    commands:
    - yarn
    - gatsby build
  post_build:
    commands:
      - aws s3 sync public "s3://YOURS3BUCKET" --acl=public-read --delete
      - aws cloudfront create-invalidation --distribution-id "YOURDISTRIBUTIONID" --paths "/*"

cache:
    paths:
        - node_modules/**/*
        - public/**/*
        - /usr/local/lib/node_modules/**/*
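The same invalidation can also be issued from a Node script instead of the AWS CLI, for example in a deploy script that runs outside CodeBuild. This is a sketch assuming the aws-sdk v2 package; the distribution ID is a placeholder, and the helper only builds the request parameters so you can see how the single wildcard path is expressed:

```javascript
// Build the parameters for CloudFront's CreateInvalidation API call.
// The single wildcard path "/*" counts as ONE invalidation, no matter
// how many files it covers.
const buildInvalidationParams = (distributionId) => ({
  DistributionId: distributionId,
  InvalidationBatch: {
    // CallerReference must be unique per request; a timestamp suffices here.
    CallerReference: `deploy-${Date.now()}`,
    Paths: { Quantity: 1, Items: ['/*'] }
  }
});

// With aws-sdk v2 installed and credentials configured, you would send it like:
// const AWS = require('aws-sdk');
// new AWS.CloudFront().createInvalidation(
//   buildInvalidationParams('YOURDISTRIBUTIONID'),
//   (err, data) => err ? console.error(err) : console.log(data.Invalidation.Id)
// );

console.log(buildInvalidationParams('EXAMPLEID').InvalidationBatch.Paths.Items[0]);
// -> /*
```

This mirrors what the `aws cloudfront create-invalidation` line in the buildspec does; use whichever fits your deploy tooling.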

Which version to use?

I ultimately decided to use the cache-invalidation method because I tested both, and with Lambda@Edge functions altering the headers it was still unreliable when new content became available. With cache invalidation, I can be sure I see fresh content after about 10 to 20 minutes. Additionally, I think using Lambda functions is a little inefficient in my case, because I use so few static resources and my content is only as dynamic as I make it by pushing updates. If you want to read more about how caching in Gatsby works and can be optimized, check out the article in their documentation.


Markus Wallinger


I am currently finishing my master's thesis at the Vienna University of Technology. On the side I work as a freelance software developer, based in Vienna and Tyrol.