Amazon just released a bunch of new services. My favorite is Lambda. Lambda allows me to deploy simple micro-services without having to set up any servers at all. Everything is hosted in the AWS cloud. Another cool thing about Lambda services is that the default runtime is Node.js!
To get access to AWS Lambda, you have to sign in to the [AWS Console] and select the Lambda service. You have to fill out a form to request access, which may take a while to come through. Once you have access you can edit the functions in a web form.
A lambda service is a Node module which exports an object with one function, the handler. In the AWS examples this is usually called handler and I'm going to follow their example.
Here is a simple function that can be edited and invoked in the online Lambda Edit/Test tool.
    // hello-event.js
    exports.handler = function(event, context) {
      console.log('Hello', event);
      context.done(null, 'Success');
    };
The event can be any JSON value, and since a string is valid JSON the function can be invoked with "Tapir", which results in the following output in the Lambda tool.
    Logs
    ----
    START RequestId: 3e21d80e-7e31-11e4-912c-2f870de05098
    2014-12-07T16:51:47.163Z 3e21d80e-7e31-11e4-912c-2f870de05098 Hello Tapir
    END RequestId: 3e21d80e-7e31-11e4-912c-2f870de05098
    REPORT RequestId: 3e21d80e-7e31-11e4-912c-2f870de05098 Duration: 3.89 ms Billed Duration: 100 ms Memory Size: 128 MB Max Memory Used: 9 MB

    Message
    -------
    Success
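The same handler can also be exercised outside the online tool. The following is only a sketch: `createFakeContext` is a hypothetical stand-in that I made up for the Lambda context, and it assumes nothing more than the `done(err, message)` call used above.

```javascript
// Same handler as hello-event.js, defined inline so the sketch is self-contained.
const handler = function (event, context) {
  console.log('Hello', event);
  context.done(null, 'Success');
};

// Hypothetical stand-in for the Lambda context: it only records what
// the handler passes to done().
function createFakeContext() {
  const calls = [];
  return {
    calls: calls,
    done: function (err, message) {
      calls.push({ err: err, message: message });
    }
  };
}

const fakeContext = createFakeContext();
handler('Tapir', fakeContext);
console.log(fakeContext.calls[0].message);
```

Running this with plain `node` is a quick way to check the handler logic before uploading anything.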
Working in the Lambda online tool is sufficient for simple examples but quickly gets annoying. Once you need to add extra modules you have to upload zip archives, which is both error-prone and tedious. Here is a simple script to zip the relevant files and upload them to Lambda. Make sure to update the region and the role to match your own account.
    #!/bin/bash
    #
    # Zip and upload lambda function
    #
    program=`basename $0`

    set -o errexit

    function usage() {
      echo "Usage: $program <function.js>"
    }

    if [ $# -lt 1 ]
    then
      echo 'Missing required parameters'
      usage
      exit 1
    fi

    # Accept the function name with or without the .js extension
    main=${1%.js}
    file="./${main}.js"
    zip="./${main}.zip"

    role='arn:aws:iam::638281126589:role/lambda_exec_role'
    region='eu-west-1'

    zip_package() {
      zip -r $zip $file lib node_modules
    }

    upload_package() {
      aws lambda upload-function \
        --region $region \
        --role $role \
        --function-name $main \
        --function-zip $zip \
        --mode event \
        --handler $main.handler \
        --runtime nodejs \
        --debug \
        --timeout 10 \
        --memory-size 128
    }

    # main
    zip_package
    upload_package
A Larger Example
Now that I know that Lambda works, it is time to try out something more elaborate. I have read that it is not only possible to use npm modules, but that I also have access to the operating system when writing my service.
My bigger example consists of something I often have use for, a way to serve media files so that I don't have to check them into git. The way I want to do this is to upload a tarball to S3 and then have Lambda unpack the archive, checksum the files and upload them into another bucket.
Something like this:
- React to the ObjectCreated:Put event
- Download the tarball from S3
- Extract tarball into temp directory
- Checksum the files and rename them with the checksum
- Upload the checksummed file to another S3 bucket
- Upload an index of the files with mapping from old to new filename.
React to ObjectCreated:Put
An AWS S3 ObjectCreated:Put event looks something like this in a trimmed-down format:
    {
      "Records": [
        {
          "eventVersion": "2.0",
          "eventSource": "aws:s3",
          "eventName": "ObjectCreated:Put",
          "s3": {
            "bucket": {
              "name": "anders-source"
            },
            "object": {
              "key": "tapirs.tgz",
              "size": 1024,
              "eTag": "d41d8cd98f00b204e9800998ecf8427e"
            }
          }
        }
      ]
    }
To handle this event we need a handler function. All the handler needs to do is to extract the relevant properties from the event and then call assetify, which will do the rest of the work. Breaking up the code like this allows me to use assetify locally and not only as a Lambda handler.
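    var util = require('util');

    // Lambda entry point: pull the bucket name and object key out of the
    // first S3 record and hand them to assetify.
    assetify.handler = function(event, context) {
      console.log('Received event:');
      console.log(JSON.stringify(event, null, ' '));
      var bucket = event.Records[0].s3.bucket.name;
      var key = event.Records[0].s3.object.key;
      assetify(bucket, key, function(err, result) {
        context.done(err, util.inspect(result));
      });
    };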
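To make the property extraction concrete, here is a small sketch that pulls the bucket name and key out of an event shaped like the trimmed-down sample. `parseS3Event` is a hypothetical helper, not part of the original module, and returning `null` for unexpected input is my own assumption about what would be convenient.

```javascript
// Trimmed-down event in the same shape as the sample event.
const sampleEvent = {
  Records: [{
    eventName: 'ObjectCreated:Put',
    s3: {
      bucket: { name: 'anders-source' },
      object: { key: 'tapirs.tgz', size: 1024 }
    }
  }]
};

// Hypothetical helper (not part of the original module): returns the
// bucket name and key of the first record, or null if the event does
// not have the expected shape.
function parseS3Event(event) {
  const record = event && Array.isArray(event.Records) ? event.Records[0] : undefined;
  if (!record || !record.s3 || !record.s3.bucket || !record.s3.object) return null;
  return { bucket: record.s3.bucket.name, key: record.s3.object.key };
}

console.log(parseS3Event(sampleEvent));
```

Checking the shape first means a test event such as a plain string fails with a clear result instead of a TypeError inside the handler.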
In order to use assetify as a normal module on a local machine I export the function with module.exports. This code needs to come before the handler declaration above. When exported this way, it is possible to require the function without involving Lambda.
    var async = require('async');
    var path = require('path');

    // Run the whole pipeline for one tarball in sourceBucket.
    function assetify(sourceBucket, key, callback) {
      // Only handle keys that end in .tgz
      var tgzRegex = new RegExp('\\.tgz$');
      if (!key.match(tgzRegex)) return callback('no match');
      var prefix = path.basename(key, '.tgz');

      async.waterfall([
        downloadFile.bind(null, sourceBucket, key),
        extractTarBall,
        checksumFiles,
        uploadFiles.bind(null, prefix),
        uploadIndex.bind(null, prefix)
      ], function(err, result) {
        if (err) return callback(err);
        callback(null, result);
      });
    }

    module.exports = assetify;
I'm using async.waterfall in combination with bind to get a nice flat structure of the code which clearly resembles the described flow above.
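To show why waterfall and bind fit together so well, here is a tiny hand-rolled `waterfall`. It is only an illustration of the control flow, not the async library's implementation, and `fetchName` and `addSuffix` are made-up steps. Each step receives the previous step's results followed by a callback, and bind pre-fills the leading arguments.

```javascript
// Minimal stand-in for async.waterfall, written only to illustrate the
// control flow; the real library handles more cases.
function waterfall(tasks, done) {
  let index = 0;
  function next(err, ...results) {
    if (err) return done(err);
    if (index === tasks.length) return done(null, ...results);
    const task = tasks[index++];
    // Results of the previous step become the arguments of the next one.
    task(...results, next);
  }
  next(null);
}

// Hypothetical steps: bind supplies the leading arguments up front.
function fetchName(source, key, callback) {
  callback(null, source + '/' + key);
}
function addSuffix(suffix, name, callback) {
  callback(null, name + suffix);
}

let outcome;
waterfall([
  fetchName.bind(null, 'anders-source', 'tapirs.tgz'),
  addSuffix.bind(null, '.done')
], function (err, result) {
  outcome = { err: err, result: result };
  console.log(err, result);
});
```

The first error short-circuits the chain, which is why the steps themselves can simply pass their errors on.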
Download file
The downloadFile function uses a nice feature of s3.getObject: streaming. After creating a temporary file with tmp.file, I create a request and then stream the contents from S3 directly into a write stream. Very nice! I also need to hook up some event handlers to allow me to notify the callback once the streaming is complete.
    // Download the object into a temporary .tgz file and pass the path
    // of that file to the callback.
    function downloadFile(sourceBucket, key, callback) {
      console.log('downloadFile', sourceBucket, key);
      tmp.file({postfix: '.tgz'}, function tmpCreated(err, tmpfile) {
        if (err) return callback(err);
        var awsRequest = s3.getObject({Bucket: sourceBucket, Key: key});
        awsRequest.on('success', function() {
          return callback(null, tmpfile);
        });
        // Pass whatever the request reports as the failure on to the callback
        awsRequest.on('error', function(err) {
          return callback(err);
        });
        var stream = fs.createWriteStream(tmpfile);
        awsRequest.createReadStream().pipe(stream);
      });
    }
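The streaming part can be tried without S3. The sketch below pipes an in-memory stream into a temporary file and calls back once the write stream emits 'finish', which is one way to be sure the data has been flushed. `saveStream` is a hypothetical helper and the in-memory source is just a placeholder for the S3 read stream.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');
const { Readable } = require('stream');

// Hypothetical helper: pipes any readable stream into a file and calls
// back with the file path once the write stream has flushed.
function saveStream(readable, target, callback) {
  let called = false;
  function finish(err) {
    if (called) return; // make sure the callback fires only once
    called = true;
    if (err) return callback(err);
    callback(null, target);
  }
  const out = fs.createWriteStream(target);
  readable.on('error', finish);
  out.on('error', finish);
  out.on('finish', function () { finish(); });
  readable.pipe(out);
}

// Placeholder source standing in for the S3 read stream.
const source = Readable.from([Buffer.from('tapir data')]);
const target = path.join(os.tmpdir(), 'tapirs-' + process.pid + '.tgz');

saveStream(source, target, function (err, file) {
  if (err) throw err;
  console.log('saved', file, fs.statSync(file).size, 'bytes');
});
```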
Extract tarball
In order to extract the tarball I'm using the ordinary tar command instead of relying on a Node module. This works fine as Lambda seems to include a full standard AWS distribution. Very nice to have access to all the common Unix utilities. The glob function makes it easy to traverse the full tree structure of the archive and I use this to return (or pass on via callback) a map of filenames to the temporary files.
    // Unpack the tarball into a temporary directory and list every file in it.
    function extractTarBall(tarfile, callback) {
      tmp.dir(function(err, dir) {
        if (err) return callback(err);
        var cmd = 'tar -xzf ' + tarfile + ' -C ' + dir;
        exec(cmd, function (err) {
          if (err) return callback(err);
          glob(dir + '/**/*.*', function(err, files) {
            if (err) return callback(err);
            // Keep both the temporary path and the path inside the archive
            var extracted = files.map(function(file) {
              return {
                path: file,
                originalFile: file.replace(dir, '')
              };
            });
            return callback(null, extracted);
          });
        });
      });
    }
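The mapping that extractTarBall passes on can be reproduced with Node's built-in modules only. This is a sketch of an alternative to glob, not what the original code does: `listFiles` is a hypothetical helper, it walks the directory synchronously, and unlike the `*.*` pattern it also lists files without a dot in their name.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Hypothetical stdlib-only alternative to glob: walks dir recursively and
// returns { path, originalFile } entries like extractTarBall does.
function listFiles(dir, root) {
  root = root || dir;
  return fs.readdirSync(dir, { withFileTypes: true }).reduce(function (found, entry) {
    const full = path.join(dir, entry.name);
    if (entry.isDirectory()) return found.concat(listFiles(full, root));
    // originalFile is the path relative to the root, with a leading separator
    return found.concat({ path: full, originalFile: full.slice(root.length) });
  }, []);
}

// Build a small tree to try it on.
const demoDir = fs.mkdtempSync(path.join(os.tmpdir(), 'tapirs-'));
fs.mkdirSync(path.join(demoDir, 'stickers'));
fs.writeFileSync(path.join(demoDir, 'tapir.jpg'), 'a');
fs.writeFileSync(path.join(demoDir, 'stickers', 'tapir-sticker.png'), 'b');

console.log(listFiles(demoDir).map(function (f) { return f.originalFile; }));
```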
Checksum files
The checksumFiles function uses async.map to call the singular version checksumFile for each file. This creates a checksum of the file and does some string manipulation in order to create a name with a checksum in it.
    // Checksum every file; the results are passed on as one array.
    function checksumFiles(files, callback) {
      async.map(files, checksumFile, callback);
    }

    function checksumFile(file, callback) {
      checksum.file(file.path, { algorithm: 'md5' }, function(err, sum) {
        if (err) return callback(err);
        var filename = file.originalFile;
        var ext = path.extname(filename);
        var base = filename.replace(ext, '');
        // Put the checksum between the base name and the extension
        var checksumName = base + '-' + sum + ext;
        callback(null, {
          path: file.path,
          originalFile: file.originalFile,
          checksumFile: checksumName
        });
      });
    }
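The naming scheme is easy to try in isolation. The sketch below uses Node's built-in crypto module instead of the checksum module, and `md5` and `checksumName` are hypothetical helpers that only mirror the string manipulation described above.

```javascript
const crypto = require('crypto');
const path = require('path');

// MD5 hex digest of a buffer or string, using Node's built-in crypto module.
function md5(data) {
  return crypto.createHash('md5').update(data).digest('hex');
}

// Hypothetical helper: insert the checksum between base name and extension.
function checksumName(filename, sum) {
  const ext = path.extname(filename);
  const base = filename.slice(0, filename.length - ext.length);
  return base + '-' + sum + ext;
}

console.log(checksumName('/tapir.jpg', md5('')));
// prints "/tapir-d41d8cd98f00b204e9800998ecf8427e.jpg"
```

Slicing off the extension by length avoids touching an earlier occurrence of the same characters in the path.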
Upload files to S3
When the new filenames have been created, the files can be uploaded to S3 via s3.putObject. Unfortunately, putObject does not support pipe, but I can use a ReadStream as the value of the Body parameter and this is good enough. The mime module is used to look up the content type from the filename. After a file is uploaded, an object with a mapping between the original name and the URL is returned.
    function uploadFiles(prefix, files, callback) {
      console.log('uploadFiles', prefix, files);
      async.map(files, uploadFile.bind(null, prefix), callback);
    }

    // Upload one file under its checksummed name and report its URL.
    function uploadFile(prefix, file, callback) {
      var stream = fs.createReadStream(file.path);
      var s3options = {
        Bucket: config.bucket,
        Key: prefix + file.checksumFile,
        Body: stream,
        ContentType: mime.lookup(file.path)
      };

      s3.putObject(s3options, function(err, data) {
        if (err) return callback(err);
        console.log('Object added', s3options);
        callback(null, {
          originalFile: file.originalFile,
          url: config.url + config.bucket + '/' + prefix + file.checksumFile
        });
      });
    }
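The key, content type and URL can be worked out without talking to S3 at all. In this sketch `CONTENT_TYPES` is a small made-up table standing in for the mime module, `describeUpload` is a hypothetical helper, and the config values are placeholders rather than my real settings.

```javascript
const path = require('path');

// Small hypothetical lookup table standing in for the mime module.
const CONTENT_TYPES = {
  '.jpg': 'image/jpeg',
  '.png': 'image/png',
  '.json': 'application/json'
};

function contentTypeFor(filename) {
  const ext = path.extname(filename).toLowerCase();
  return CONTENT_TYPES[ext] || 'application/octet-stream';
}

// Hypothetical helper: builds the Bucket/Key/ContentType part of the
// putObject parameters and the resulting URL, without touching S3.
function describeUpload(config, prefix, file) {
  const key = prefix + file.checksumFile;
  return {
    params: { Bucket: config.bucket, Key: key, ContentType: contentTypeFor(file.path) },
    url: config.url + config.bucket + '/' + key
  };
}

// Placeholder configuration and file entry, for illustration only.
const demoConfig = { url: 'https://s3.example.com/', bucket: 'example-target' };
const demoFile = { path: '/tmp/x/tapir.jpg', originalFile: '/tapir.jpg', checksumFile: '/tapir-abc123.jpg' };

console.log(describeUpload(demoConfig, 'tapirs', demoFile));
```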
Upload the index
The last thing to do is to upload the index with the filename-to-URL map as a JSON file. This is done in a similar way as the upload of the files.
    // Upload the filename-to-URL map as <prefix>/index.json.
    function uploadIndex(prefix, files, callback) {
      var s3options = {
        Bucket: config.bucket,
        Key: prefix + '/index.json',
        Body: JSON.stringify(files),
        ContentType: 'application/json'
      };

      s3.putObject(s3options, function(err, data) {
        if (err) return callback(err);
        console.log('Object added', s3options.Key);
        callback(null, {
          files: files,
          url: config.url + config.bucket + '/' + prefix + '/index.json'
        });
      });
    }
The final index.json file looks something like this.
    [
      { "originalFile": "/Tapir_standing_profile.jpg", "url": "" },
      { "originalFile": "/tapir-sticker.png", "url": "" },
      { "originalFile": "/tapir.jpg", "url": "" }
    ]
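On the consuming side, the index is most useful as a lookup from the original name to the uploaded URL. A minimal sketch, assuming the array shape shown above: `buildLookup` is a hypothetical helper and the URLs are made-up placeholders.

```javascript
// Hypothetical consumer of index.json: turn the array into a lookup
// from original filename to uploaded URL.
function buildLookup(index) {
  return index.reduce(function (lookup, entry) {
    lookup[entry.originalFile] = entry.url;
    return lookup;
  }, {});
}

// Entries in the same shape as index.json; the URLs are placeholders.
const demoIndex = [
  { originalFile: '/tapir-sticker.png', url: 'https://s3.example.com/example-target/tapirs/tapir-sticker-abc.png' },
  { originalFile: '/tapir.jpg', url: 'https://s3.example.com/example-target/tapirs/tapir-def.jpg' }
];

const lookup = buildLookup(demoIndex);
console.log(lookup['/tapir.jpg']);
```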
Lambda is very simple to work with and it allows me to create small services that react to events without the need to set up any servers at all.
Apart from the integration with S3, it also integrates with Kinesis and with DynamoDB, allowing for very cool applications to be built.