
Building a Blog Part 4 - Deploying to Amazon S3

·5 mins

Going Live #

After Building a Blog Part 3 - Continuous Integration with Gitlab CI, it’s time to automate deploying our blog to a host.

Amazon’s Simple Storage Service (S3) is a good choice for serving static assets on the cheap. S3 is a bit different from a traditional VPS like DigitalOcean or Amazon’s Elastic Compute Cloud (EC2).

Pros #

Very cheap - With AWS you pay as you go. Rather than paying $5-10 a month for a server to host our static site, we can store our site for fractions of a penny:

First 1 TB / month $0.0300 per GB

AWS S3 Pricing

and only pay for the requests to our bucket:

PUT, COPY, POST, or LIST Requests $0.005 per 1,000 requests
GET and all other Requests $0.004 per 10,000 requests
Delete Requests Free †
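
To put that in perspective, here’s a back-of-the-envelope estimate. The numbers below (a 100 MB site and 50,000 requests a month) are made up for illustration, and data transfer out is billed separately:

# Hypothetical traffic: a 100 MB _site and 50,000 GET requests per month.
storage_gb   = 0.1
get_requests = 50_000

storage_cost = storage_gb * 0.03                   # $0.03 per GB-month
request_cost = (get_requests / 10_000.0) * 0.004   # $0.004 per 10,000 GETs

puts format('~$%.4f per month', storage_cost + request_cost)
# => ~$0.0230 per month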

Cons #

S3’s API isn’t like Dropbox or Google Drive. While uploading individual files to a bucket is easy in the S3 web interface, uploading nested folders isn’t so straightforward.

Just like an absolute path on a Linux filesystem, an S3 object key includes the full path starting at the bucket root:

posts/some_post/index.html

shows up in the S3 console as two folders: posts and a sub-folder some_post. Rather than generating these folder structures manually, we can use one of the SDKs AWS publishes for many languages.
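
For example, with the aws-sdk-s3 gem a nested “folder” is nothing more than an object key containing slashes. A minimal sketch (the region here is just an example):

require 'aws-sdk-s3'

s3 = Aws::S3::Resource.new(region: 'us-east-1')

# The key carries the full path; the S3 console renders the slashes as folders.
s3.bucket('mblum.me')
  .object('posts/some_post/index.html')
  .upload_file('_site/posts/some_post/index.html')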

Setup #

Create IAM account #

It’s best practice to create separate accounts for applications accessing AWS resources. This way your root credentials stay out of your tooling, and if the application’s credentials are ever compromised, you can rotate them and keep your AWS account secure.

Log in to your AWS account and browse to Security Credentials > Users, then click Create New Users. I named my user gitlab since I’ll be using this user for all Gitlab-related integrations with AWS.

I gave my IAM user AmazonS3FullAccess powers - we can trim the permissions down later.
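
If you’d rather script this step than click through the console, something like the following should work with the aws-sdk-iam gem. This is a sketch to run once with your own admin credentials; the user name and policy match what I set up above:

require 'aws-sdk-iam'

iam = Aws::IAM::Client.new(region: 'us-east-1')

# Create the gitlab user and give it the same S3 powers as above.
iam.create_user(user_name: 'gitlab')
iam.attach_user_policy(
  user_name:  'gitlab',
  policy_arn: 'arn:aws:iam::aws:policy/AmazonS3FullAccess'
)

# Generate the access key pair we'll paste into Gitlab later.
key = iam.create_access_key(user_name: 'gitlab').access_key
puts key.access_key_id
puts key.secret_access_key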

Configure S3 bucket for web hosting #

Create an S3 bucket and give it the name of your domain. In this case my domain is mblum.me. Enable static website hosting for your bucket:

enable static web hosting for the S3 bucket

This doesn’t expose our files to the outside world yet. Let’s give visitors permission to view our site:

Click on Permissions > Edit bucket policy and add the below policy:

{
	"Version": "2012-10-17",
	"Statement": [
		{
			"Sid": "AddPerm",
			"Effect": "Allow",
			"Principal": "*",
			"Action": [
				"s3:GetObject"
			],
			"Resource": [
				"arn:aws:s3:::bucketname/*"
			]
		}
	]
}

swapping out bucketname for the name of your bucket. For this site I set it to:

"arn:aws:s3:::mblum.me/*"

Script #

ruby - a programming language and precious gem

Since our Jekyll site is Ruby-based, it makes sense to write our deployment script in the same language so that building, testing, and deploying can be chained together easily.

I’ve written a Ruby gem for uploading a Jekyll site to an S3 bucket:

deploy_jekyll_s3 gem

This script will copy the contents of the _site directory to the specified S3 bucket.
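
Under the hood, the mapping from local files to S3 keys is straightforward. A simplified sketch of the idea (not the gem’s exact code):

# Map S3 keys to local file paths, e.g.
#   "posts/some_post/index.html" => "_site/posts/some_post/index.html"
def generate_s3_paths(site_dir = '_site')
  Dir.glob(File.join(site_dir, '**', '*'))
     .select { |path| File.file?(path) }
     .each_with_object({}) do |path, map|
       key = path.sub(%r{\A#{site_dir}/}, '')
       map[key] = path
     end
end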

Integrating with CI #

Gitlab gives us a nice interface for specifying our AWS IAM credentials and S3 bucket without hard-coding them into our gem.

The deploy_jekyll_s3 gem expects the following ENV variables:

  • AWS_ACCESS_KEY
  • AWS_ACCESS_SECRET
  • AWS_REGION
  • AWS_BUCKET
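
Inside a deploy script they might be consumed roughly like this (a sketch using the aws-sdk-s3 gem, not the gem’s literal source):

require 'aws-sdk-s3'

# Pull the IAM credentials and bucket settings from the CI environment.
credentials = Aws::Credentials.new(
  ENV['AWS_ACCESS_KEY'],
  ENV['AWS_ACCESS_SECRET']
)

s3 = Aws::S3::Resource.new(
  region:      ENV['AWS_REGION'],
  credentials: credentials
)

bucket = s3.bucket(ENV['AWS_BUCKET'])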

Let’s configure them for our Gitlab project. Browse to your project > Settings > Variables.

gitlab project variables

Now let’s modify the .gitlab-ci.yml to run the deploy_jekyll_s3 gem in the deploy stage, after building and testing.

.gitlab-ci.yml #

image: ruby:2.3

stages:
  - build
  - test
  - deploy

before_script:
  - apt-get update >/dev/null
  - apt-get install -y locales >/dev/null
  - echo "en_US UTF-8" > /etc/locale.gen
  - locale-gen en_US.UTF-8
  - export LANG=en_US.UTF-8
  - export LANGUAGE=en_US:en
  - export LC_ALL=en_US.UTF-8
  - bundle install --jobs $(nproc) --path=/cache/bundler
  
build:
  stage: build
  script:
    - bundle exec jekyll build
  only:
    - master

test:
  stage: test
  script:
    - bundle exec htmlproofer _site
  only:
    - master

deploy:
  stage: deploy
  script:
    - scripts/cideploy
  only:
    - master

cideploy #

I was seeing issues with the _site directory not being available in the deploy stage (Gitlab CI doesn’t carry build output between stages unless it’s declared as an artifact), so I build the site again during the deploy step:

#!/usr/bin/env bash

set -e # halt script on error

bundle exec jekyll build
bundle exec deploy_jekyll_s3 --verbose deploy

Automagical Deployments #

Run git push gitlab master and we upload our site to S3:

CI deploy passed

and here we can see our site copied into the S3 bucket we specified in the Gitlab Environment variables:

S3 bucket with _site files

Troubleshooting #

SVGs not loading #

SVG mimetype is text/xml?

Looks like most of our images are loading, but not the images we use for logos.

SVG mimetype is text/xml?

Looks like our SVGs have the wrong mimetype. S3 stores whatever Content-Type it’s given at upload time, so our uploader needs to set the mimetype itself. The mimetype for .svg is image/svg+xml.

Let’s update the code that uploads our assets to set the correct mimetype. Using the ruby-filemagic gem we can detect the mimetype and also attach a checksum to track which version of each file is in S3.

def upload
  fm = FileMagic.new(FileMagic::MAGIC_MIME)
  fs_to_s3_map = generate_s3_paths

  fs_to_s3_map.keys.each do |key|
    fs_path = fs_to_s3_map[key]
    # Detect the mimetype from the file contents and fingerprint the file.
    mime_type = fm.file(fs_path)
    checksum = Digest::SHA2.hexdigest(File.read(fs_path))
    metadata = {
      "checksum" => checksum
    }
    if @options[:verbose]
      puts "uploading file to S3: #{key} MIME: #{mime_type} CHKSUM: #{checksum}"
    end
    unless is_dryrun
      bucket = _get_bucket
      object = bucket.object(key)
      # Set Content-Type explicitly so S3 serves the file with the right mimetype.
      object.put(:body => File.read(fs_path), :content_type => mime_type, :metadata => metadata)
    end
  end
  puts 'UPLOAD COMPLETE...'
  return fs_to_s3_map
end
correct SVG mimetype getting applied

and that solves our mimetype problem.
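
The checksum metadata also opens the door to skipping files that haven’t changed since the last deploy. A sketch of the idea (not currently part of the gem), assuming the aws-sdk-s3 Aws::S3::Object interface:

def unchanged?(bucket, key, checksum)
  object = bucket.object(key)
  # exists? issues a HEAD request; metadata holds the checksum stored on upload.
  object.exists? && object.metadata['checksum'] == checksum
rescue Aws::S3::Errors::ServiceError
  false
end

# In upload: `next if unchanged?(bucket, key, checksum)` before calling object.put.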

Note: ruby-filemagic seems to stumble on .css files. I’ve added a custom check that looks at the file extension and returns my specified mimetype, falling back to ruby-filemagic otherwise:

# FileMagic gets these wrong
def _mimetype_from_ext(filename)
  case File.extname(filename)
  when '.css'
    return 'text/css'
  end
end

def get_mimetype(path)
  mime_type = _mimetype_from_ext(path)
  if mime_type.to_s.empty?
    mime_type = @fm.file(path)
  end
  return mime_type
end