NodeJS Deploying Files to AWS S3


What Problem Am I Solving

The technical challenge comes from weekly content updates I make to a webapp I own. Every Friday afternoon my daughter comes home with a list of ten spelling words she must study for a test the next Friday morning. Of course, being the good nerd daddy I am, I built a spelling test webapp where I record her saying the words, then play them back as a practice test.

Check out the webapp. It’s simple, functional, and does wonders as my daughter uses it on the phone while riding to school, or on the tablet at home in the evening. It’s a fantastic tool and I’m glad I made it.

Weekly updates mean uploading ten new files and a subdirectory, along with an updated test definition file. Pointing and clicking through the AWS web console to upload files manually was fine the first time, the second, even the sixth, but after the tenth I figured I’d better get smart about this. I need a command-line tool to chunk them up there quick and easy. It’s less time consuming, and leaves less room for human error.

Ultimately, reducing deployment friction through automation matters for engineers: every manual step is time lost and another chance for mistakes.

Where on GitHub is the Code Example

Now you might be wondering to yourself, “Ken, aren’t you well known for your illustrative source code examples that concretely demonstrate a technical topic?” Why yes, thank you for mentioning that. In fact, I do try to be helpful by writing code between my “fluff piece” articles.

Clone my repo now:

https://github.com/KDawg/S3DeployJS

It has a simple website with the kinds of files S3 serves best, giving you something to play with. This file in particular is the tool whose code I’ll review in detail below:

S3Deploy.js

Come back to this article when you’ve had a sufficient look at the code.

What is Amazon Web Services

AWS is a phenomenal side business that sprang from Amazon’s core competency: running an online retail platform used by nearly everyone, everywhere, that’s nearly always up. It’s great stuff and I’ve been using it for about four years now. Linux servers (EC2), databases (RDS), and file storage (S3) are just a few highlights.

What is S3

“Simple Storage Service” (S3) is Amazon’s object storage service, and it rocks at serving static files across the Internet. What do I mean by “static files?” I’m specifically thinking images, audio, JavaScript, and style sheets in the context of a website.

After reading that, someone could opine, “My web server is written in Symfony/Rails/Django/Express; can’t it deliver those types of files to users?” Sure, absolutely. It’s my position, however, that my company’s project servers are best saved for the most important tasks we program into them.

Domain-specific tasks such as logging in users, updating databases, proxying requests, and delivering dynamically generated pages are what our business servers rule at. Serving static resources is a commodity task worth offloading to AWS S3. Save the precious processing power for delivering user value, and doing that quickly.

Another fantastic feature of S3 that I love is serving up a static website. No dedicated server is needed at all when I’m exclusively showing front-end work that doesn’t require server interaction. It’s one of the most killer pro-tips I can share with all of you. In fact, a while back I wrote an article on setting up S3 as a webapp server.

What is NodeJS

I must admit that I’ve always considered myself a “front-end” guy. I guess I like graphics and user experience. What exactly I do as a product engineer has changed dramatically over my career. These days I’m steeped deeply in web browsers and their open-source tech, and JavaScript is a key component of my work. If you’re wondering whether I do any back-end service work, I do indeed, but it’s always been my weak hand.

Recently everything has changed. JavaScript lives on the server in the form of NodeJS. It’s fantastic and I love me some NodeJS server action letting me connect the dots to the user’s browser. Look at me! I’m a full-stack unicorn now! Weee!

NodeJS also runs on the command line, letting me build this sort of build-step/production-deployment tool for uploading my webapp resource files to S3 once localhost dev is done. Explaining what’s under the hood of that tool is why this article exists. JavaScript is reasonably easy to learn, and many folks already have the basics down to push forward into this lesson.

For readers wondering if alternatives to NodeJS exist, I say “sure.” Lots of languages run on the command line and produce fine scripting solutions: Python, Ruby, and PHP, for example.

You Must Have AWS Access

Seems obvious to write this, but I’m going to anyway. If you haven’t signed up for AWS, please go check out resources covering that story. Come back when you’re ready and let’s crank on some code.

What NodeJS Modules Are Needed

When you clone the associated source code repo you’ll see it has a package.json file declaring “aws-sdk” as a dependency. This definition file makes setting up the project simple. I’m assuming you have enough background with NodeJS tooling to recognize:


npm install
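For reference, the dependency declaration in package.json might look roughly like this (a minimal sketch; the name and version range are assumptions, not copied from the repo):

```json
{
  "name": "s3deployjs",
  "dependencies": {
    "aws-sdk": "^2.0.0"
  }
}
```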

Config Your S3 Auth Credentials in AwsConfig.json

We, as AWS subscribers, have authorization credentials. When you clone this article’s demo repo you’ll see a file called S3Deploy.js, and beside it a file it depends upon called AwsConfig.json that you must edit. It looks like this, but of course you’ll replace the placeholders with your real values.

How do you get the AWS credentials for this file? Check out this blog article to see if it guides you.


{
  "accessKeyId": "XXXXXXXXX",
  "secretAccessKey": "YYYYYYYYYYY",
  "region": "us-east-1"
}

Why the dummy values? Obviously we ought to keep our cleartext credentials out of the public eye since they’re critical private data. I recommend adding this file to your project’s .gitignore and keeping it on localhost only. You’ll see I’ve added it to the demo repo’s .gitignore as a reminder of my warning, but I did force it up for educational purposes.
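For clarity, the relevant .gitignore entries might look like this (the node_modules line is my addition, a common companion entry, not necessarily in the repo):

```
AwsConfig.json
node_modules/
```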

How is AwsConfig.json used in S3Deploy.js?


var aws = require('aws-sdk');
aws.config.loadFromPath('./AwsConfig.json');

Using the Tool

Using the tool means dropping into a Terminal window and navigating to the root of the repo you’ve cloned to your laptop. Run it like any NodeJS app:


node S3Deploy.js createBucket

Anyone running the tool on the command line without an argument sees a hint that it’s parameter driven, thanks to this code.


function noParamsGiven() {
  showUsage();
  process.exit(-1);
}


function showUsage() {
  console.log('Use choosing one of these command line parameters:');
  console.log('  audio folderName');
  console.log('  code');
  console.log('  createBucket');
  console.log('  css');
  console.log('  index');
  console.log('  images');
  console.log('  list');
}
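The usage text above is driven by a simple check of the command-line arguments. Here’s a minimal sketch (my reconstruction, not the repo’s exact code) of how that parameter dispatch might look, keyed off process.argv:

```javascript
// A sketch of parameter-driven dispatch. In a real run, argv comes from
// process.argv: argv[2] is the command, argv[3] an optional folder name.
function dispatch(argv, handlers) {
  var command = argv[2];       // e.g. node S3Deploy.js audio week12
  var folderName = argv[3];    // only the "audio" command uses this
  if (!command || !handlers[command]) {
    return null;               // caller would show usage and exit
  }
  return handlers[command](folderName);
}
```

The real tool would map each command name from showUsage() to its upload function, e.g. `{ audio: uploadAudio, createBucket: ... }`.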

Creating a Bucket

The command-line tool can create a bucket to hold all of your files. This article won’t explain what S3 buckets are and why they’re important, so please read up on them elsewhere if you need more background.

You’ll see this function in S3Deploy.js; it uses the aws-sdk, its S3 object, and the createBucket function. Be sure to change the global BUCKET_NAME variable to a name of your own, since bucket names are globally unique across S3.


var BUCKET_NAME = 's3deploy.example';
var s3 = new aws.S3();

function createBucket(bucketName) {
  s3.createBucket({Bucket: bucketName}, function() {
    console.log('created the bucket[' + bucketName + ']');
    console.log(arguments);
  });
}

Uploading a Single File

Uploading a single file is easily done given the local filename and its remote bucket destination. NodeJS has a function to read in a file and the aws-sdk has a function to put it up.


var fs = require('fs');  // the filesystem module this snippet depends on

function uploadFile(remoteFilename, fileName) {
  var fileBuffer = fs.readFileSync(fileName);
  var metaData = getContentTypeByFile(fileName);
  
  s3.putObject({
    ACL: 'public-read',
    Bucket: BUCKET_NAME,
    Key: remoteFilename,
    Body: fileBuffer,
    ContentType: metaData
  }, function(error, response) {
    console.log('uploaded file[' + fileName + '] to [' + remoteFilename + '] as [' + metaData + ']');
    console.log(arguments);
  });
}

Notice how easily files are put up to S3 in hierarchical form without bothering to create subdirectories: a slash-separated key is all it takes. I like that. It’s so good.

Uploading Multiple Files

One of my needs is uploading a directory full of files. Given the context of my spelling test webapp, you can see I have a directory for each week containing ten audio files.

Simply enough, I use a NodeJS filesystem function to ask for all the files in a directory. Users pass the target directory as an extra command-line parameter after “audio”. The code looks like this, returning an array of files for a given directory.


function uploadAudio(folderName) {
  var CODE_PATH = 'resources/audio/';
  var fileList = getFileList('./' + CODE_PATH + folderName + '/');
  
  fileList.forEach(function(entry) {
    uploadFile(CODE_PATH + folderName + '/' + entry, 
      './' + CODE_PATH + folderName + '/' + entry);
  });
}

function getFileList(path) {
  var i, fileInfo, filesFound;
  var fileList = [];

  filesFound = fs.readdirSync(path);
  for (i = 0; i < filesFound.length; i++) {
    fileInfo = fs.lstatSync(path + filesFound[i]);
    if (fileInfo.isFile()) fileList.push(filesFound[i]);
  }
  return fileList;
}

At that point uploading multiple files is easy: for each item in the array, call the upload function, blasting it into place.

Content-Type is a Thing

The first time I uploaded a file from my laptop to S3 I held my arms up in a V and cheered. Then I noticed it didn’t work. Such a bummer. S3 didn’t serve the files correctly to my browser because I had missed the “ContentType” attribute sent into s3.putObject(). There are rules for this sort of thing.

It was easily solved once I scanned the AWS docs. Here’s a function returning a content type matching the filename targeted for upload, based on its file extension.


function getContentTypeByFile(fileName) {
  var rc = 'application/octet-stream';
  var fn = fileName.toLowerCase();

  if (fn.indexOf('.html') >= 0) rc = 'text/html';
  else if (fn.indexOf('.css') >= 0) rc = 'text/css';
  else if (fn.indexOf('.json') >= 0) rc = 'application/json';  // must come before '.js'
  else if (fn.indexOf('.js') >= 0) rc = 'application/javascript';
  else if (fn.indexOf('.png') >= 0) rc = 'image/png';
  else if (fn.indexOf('.jpg') >= 0) rc = 'image/jpeg';

  return rc;
}

The list is limited to my exact needs – please extend it for your project’s file types.

Error Reporting

Reading through S3Deploy.js you’ll see I don’t do much in the way of error handling and reporting. Nothing sophisticated in the least. Instead, the callback functions dump whatever arguments the S3 library sends them on completion. Users must stay alert, scanning for trouble and taking the smart action in response.

What does an error look like? Here’s one example from trying to push a file to a nonexistent bucket. Will you find more errors? I believe so.


{ 
  '0': {
    [NoSuchBucket: The specified bucket does not exist]
    message: 'The specified bucket does not exist',
    code: 'NoSuchBucket',
    time: Fri Jan 24 2014 23:13:42 GMT-0600 (CST),
    statusCode: 404,
    retryable: false 
  },
  '1': null 
}

I’ve got to admit the entire topic of improving error handling is “an exercise left for the reader,” and I can guess you will make it so much better!
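If you do take up that exercise, one possible starting point (entirely my sketch, not in the repo) is a shared callback factory that reports failures clearly and fails the whole run via the exit code:

```javascript
// Build a callback for an S3 call: log success briefly, report errors
// loudly, and mark the process as failed without killing in-flight work.
function makeCallback(description) {
  return function (error, response) {
    if (error) {
      console.error('FAILED ' + description + ': ' + error.code + ' - ' + error.message);
      process.exitCode = 1;  // set, don't process.exit(), so other uploads finish
      return;
    }
    console.log('ok ' + description);
  };
}
```

Each call site would then become something like `s3.putObject(params, makeCallback('upload ' + fileName))`, so a CI server can detect a failed deploy from the exit status.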

Where to Go With This

Several avenues exist for this tool:

  • Decide it’s good enough as-is and work it into your occasional production deployment schedule, running it manually
  • Have the “createBucket” option take a parameter for a custom bucket name (e.g. dev vs prod)
  • Call it from Jenkins or another continuous integration server, putting files overnight into a test environment
  • Fold it into a “Grunt” workflow
  • Replace it entirely with Puppet or Chef or another high-end tool once your needs outgrow S3Deploy’s abilities

Further Reading Online

Here are the online API docs I scanned while coding this tool. Give them a look for more details on the calls I used and the many more available.

What Have We Learned

I’m particularly proud of the stuff in this article. I feel like I’ve learned a ton over the past weeks, letting me share a lot back to the community. Here are some things readers may take away from this post:

  • Why AWS S3 is useful and how it might fit into your server architecture
  • Reinforcement of NodeJS basics like npm, command-line use, and filesystem access
  • Concrete examples using the S3 JavaScript SDK
  • Various details of JavaScript coding style
  • Thoughts on what automation you need in your tool-chain, especially regarding production

I Can’t Wait

If you start using this tech I can’t wait to hear of your success. Reach out to me on Twitter @KenTabor telling me what great stuff you’ve done with NodeJS, AWS, and web tech in general. Have a coffee and consider how all of this helps. Let’s do something awesome today!