Serverless Architecture and Box Platform

There is an emerging trend in software design called serverless architecture. However, the term "serverless" is a misnomer. Serverless architecture is just another way to take advantage of the new offerings from cloud IaaS providers such as AWS and Azure.

So, You're Telling Me There's a Server?
Yes, there's a server. Only the server is split into pieces, utilizing an event-driven, serverless computing platform like AWS Lambda or Azure Functions. Each of these functions runs simple logic to handle one specific job.

In a traditional server-based application, your entire logic layer would run in an application in a server stack. All the logic would go through one of the nodes in a cluster of these replicated servers. By contrast, in an application built around serverless architecture, compute processing is balanced between the client and various API endpoints that connect to lambda functions.

The lambda functions serve the same purpose as a traditional server running your application's business logic layer and abstract direct contact between your client code and your application's data layer.

And the Limitations?
That's a great question and I'm glad you asked. With any architecture, there are strengths and weaknesses. Martin Fowler does an incredible job of outlining both in this article. Since someone vastly more intelligent than I am already answered this question in great depth, I will focus on the most important strength and the most important weakness.

The Good News
I'll start with the strength most important to me. Serverless architecture lends itself to a functional programming model.

One of the best developers I've ever worked with mentored me early in my career. Part of this mentorship focused on his view of the future of programming. I didn't fully appreciate it at the time, but I'm now starting to see all of his predictions come true. He was a devoted Erlang developer and believed that within the next decade we would see functional programming explode as the de facto programming model. While not as prevalent as he predicted, functional programming is gaining more adoption.

Service-oriented APIs like Box Platform, Stripe, and Twilio are leading developers toward a more functional way of thinking. For example, instead of building your own SMS messaging service, it's much easier to use a functional REST API like Twilio.

AWS Lambda is an even purer example of functional programming. AWS Lambda gives you the ability to transform your monolithic backend server into focused, function-driven pieces.

The Bad News
One of the biggest concerns I have for utilizing serverless architecture is the full dependence on third-party vendors and cloud services. In reality, moving towards third-party services can save you time and let you focus on features specific to your application and alleviate tons of maintenance work.

What's in the Box?
At this point it would be fair to ask what Box Platform has to do with serverless architecture.

First I'll explain what Box Platform is. Box Platform helps you replace infrastructure and services you would usually need to build when working with files. From collaboration tools to security and permissioning, Box Platform is a lot more than just file storage. You can explore our API endpoints here.

Box Platform sits well within many implementations of serverless architecture. Here are some ideas I've compiled for how Box Platform can extend a serverless architecture:

Box Platform’s Webhooks for reporting, analytics or workflows
Box Platform download redirect microservice
Box Platform thumbnail generator microservice
Event-driven content modification microservice
Box Platform authentication microservice

This post focuses on building an authentication microservice to meet the authentication requirements for an offering Box provides called App Users.

What is an App User?
An App User is an authentication scheme that lets you provide a Box account to your application's users transparently. That means you can enable your users with all the features and functionality that Box offers through your application's UI. In other words, when I log in to your application, I'm also "logging in" to Box.

With an App User, you can authenticate your users to Box in order to utilize services provided by the Box API. And since there is no visible interaction between your user and Box, you can't utilize a traditional OAuth login flow. Instead, App Users utilize a server-to-server authentication scheme. Once a user authenticates with your server, your server then authenticates to Box on behalf of the user. An App User can never directly authenticate to Box, so any access to their content stored in Box must happen through your application. Here's a general workflow of how an App User may interact with your server and with Box:

As shown above, the interactions between your users, your server, and Box are all dictated by different tokens. Correctly assigning and granting access based on these tokens is a process that should only be handled server side. So, I set out to architect a serverless way of juggling these tokens.

For more information on App Users, see our guide for building on Box Platform.

Serverless with Box App Users
We've already established the necessity for a server within Box's App User authentication workflow. So, handling server-side authentication with an entire VM running Linux and Apache, Nginx, etc., seems akin to hammering a nail with a sledgehammer. Instead, I just want to use a hammer. AWS Lambda functions that sit behind an AWS API Gateway give me my hammer.

So which pieces need to happen server-side, and how do I structure these Lambda functions?

Box grants your application an Enterprise token. Think of this access token like an octopus living in a studio apartment — it can touch everything. The Box Enterprise token is scoped in a way that allows you to make API calls that manage secure details about your Box Enterprise. And so, you should never send this Box Enterprise token outside of your server environment.

Among the elevated things an Enterprise token can do are creating new Box App Users and retrieving access tokens for Box App Users. Here's more detail around the Enterprise-level authentication flow outlined above:

Your back end system reads the private_key.pem that is registered for your application with Box.
With the contents of the private_key.pem, your back end system signs a JWT assertion in preparation to request an Enterprise token from Box.
Your back end system sends a POST request to Box with the signed JWT assertion. On receiving and verifying the JWT, Box returns an Enterprise token that can (among other things) create new App Users and generate App User access tokens.
If you're utilizing an official Box SDK, the SDK most likely abstracts this for you and completes this internally.
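
To make the first two steps above more concrete, here is a minimal sketch of building and signing an Enterprise-level JWT assertion using only Node's built-in crypto module. This is an illustration rather than what the Box SDK or my own sample code does internally. The function name and the config field names (clientId, enterpriseId, publicKeyId) are hypothetical, the claim names follow Box's JWT documentation as I understand it, and the 30-second expiry is an arbitrary short value chosen for the example.

```javascript
'use strict';
const crypto = require('crypto');

// Encode a Buffer or string as unpadded base64url, the encoding JWTs use.
function base64url(input) {
  return Buffer.from(input).toString('base64')
    .replace(/=/g, '')
    .replace(/\+/g, '-')
    .replace(/\//g, '_');
}

// Build and sign an RS256 JWT assertion for an Enterprise token request.
// `config` is a hypothetical object: { clientId, enterpriseId, publicKeyId }.
// `privateKeyPem` is the PEM-encoded private key registered with Box.
function buildEnterpriseAssertion(config, privateKeyPem) {
  const header = { alg: 'RS256', typ: 'JWT', kid: config.publicKeyId };
  const claims = {
    iss: config.clientId,
    sub: config.enterpriseId,
    box_sub_type: 'enterprise',
    aud: 'https://api.box.com/oauth2/token',
    // A random identifier so each assertion is unique.
    jti: crypto.randomBytes(16).toString('hex'),
    // Assumption: a short-lived assertion, expiring 30 seconds from now.
    exp: Math.floor(Date.now() / 1000) + 30
  };
  const signingInput = base64url(JSON.stringify(header)) + '.' + base64url(JSON.stringify(claims));
  const signature = crypto.createSign('RSA-SHA256').update(signingInput).sign(privateKeyPem);
  return signingInput + '.' + base64url(signature);
}
```

The resulting string is what your back end would send as the assertion in the POST request described in the third step.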

Once you've received an Enterprise token, generally your back end system is going to do one of two things: either create a new App User, or generate an App User access token for an existing App User.

Before We Go Into the Mystic
So far, we've got an action that pairs well with a Lambda function: retrieving an Enterprise access token. We'll see that creating a new App User and generating App User tokens will map well to Lambda functions as well.

You may have noticed that I haven't mentioned user authentication. That's because, with Box Platform, App Users never directly authenticate to Box. Instead, your application needs to provide an authentication mechanism. Once your user authenticates to your application, your back end system can authenticate to Box on behalf of your user by requesting a Box App User access token with your Enterprise token.

Your App User will have an ID property as its unique identifier within Box that you will need to map to your existing user model. Then, using that unique ID, your back end system can request a token scoped to that specific App User. This essentially logs your App User into Box, and you can now orchestrate actions against Box's services for this user.

Who Are You?
If you're working with your own authentication system, you should have an easy time mapping the Box App User IDs as you create your App Users. You can utilize your authentication system to lock down the AWS API Gateway endpoints.

However, if you're like me, you don't have an existing authentication system. There are a lot of options among third-party services. Okta, AWS Cognito, and Auth0 are all viable.

I utilized Auth0, which provides a very robust product and an in-depth guide for locking down an AWS API Gateway with their service. You can follow their guide available here. Following this guide will get you in place to start developing your Lambda back end service.

Lambda Calrissian
After following Auth0's guide, you should have an AWS API Gateway set up and locked down by Auth0. Now, we can start coding the actual functions that will create new Box App Users and generate Box App User tokens.

You'll need to create two endpoints similar to those you created with Auth0's guide, except specific to these two processes. I chose /createboxappuser and /generateboxappusertoken. We'll look at both functions in depth.

But before we do, here's an architecture review of the entirety of the authentication microservice we plan to build:

/createboxappuser
Assuming you've followed Auth0's guide, you can lock down this endpoint. You'll want to hook the execution of the API endpoint directly to an AWS Lambda. You can learn more in the AWS docs here.

Let's take a look at the general architecture of what this function needs to accomplish:

Each lambda we create starts by negotiating an Enterprise token from Box as described earlier, so that step isn't included in the diagram. Once the lambda has an Enterprise token, the lambda calls to Box to create a new App User. Box returns JSON data for this App User. The important property here is the ID. We have branching logic here where we call out to Auth0 to update our user model with this Box App User ID. Auth0 returns a user profile JSON object, and we ultimately surface that to our client side application.
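
The flow described above can be sketched as a small Lambda-style handler that chains the three steps. This is a simplified outline, not my actual implementation: the three injected functions (getEnterpriseToken, createAppUser, saveBoxId) and the event fields (userId, name) are hypothetical names I'm using for illustration, and each function is assumed to return a Promise.

```javascript
'use strict';

// Build a Lambda-style handler from three injected steps.
// All names on `deps` and `event` are hypothetical, chosen for this sketch.
function makeCreateAppUserHandler(deps) {
  return function handler(event, context, callback) {
    deps.getEnterpriseToken()
      // Step 1: use the Enterprise token to create the App User in Box.
      .then((enterpriseToken) => deps.createAppUser(enterpriseToken, event.name))
      // Step 2: store the new Box App User ID on the existing user profile.
      .then((boxId) => deps.saveBoxId(event.userId, boxId))
      // Step 3: surface the updated profile to the client, or the error if any step failed.
      .then(
        (profile) => callback(null, profile),
        (err) => callback(err)
      );
  };
}
```

Injecting the steps keeps the handler itself free of Box and Auth0 specifics, which also makes it easy to exercise with stubs before wiring in the real calls.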

Creating a new Box App user is easy. Here's my implementation:

'use strict';
// I pull in the Needle library to abstract making HTTP requests.
const needle = require('needle');
// BoxConstants is a module I created to provide configuration and constant variables.
const BoxConstants = require('./boxValues').BoxConstants;

// I rely on another function to request a new Enterprise token within the file generateBoxEnterpriseToken.js
module.exports = (enterpriseToken, name) => {
  // Here, I create the Authorization header utilizing the Enterprise token.
  let options = {
    headers: {
      'Authorization': 'Bearer ' + enterpriseToken
    },
    json: true
  };
  // I return a new Promise that resolves around the Box ID that is passed back when the App User is successfully created.
  // A production instance of this service would also need to account for any errors that may result during this process.
  return new Promise((resolve, reject) => {
    // The most important value here is is_platform_access_only. This boolean flag creates the user as an App User.
    needle.post(BoxConstants.APP_USERS_URL, { name: name, is_platform_access_only: true }, options, function (err, resp) {
      // I return after rejecting so the code below never reads an undefined response.
      if (err) { return reject(err); }
      // The Box ID that I will map to my existing user model and will use to retrieve access tokens for this App User later.
      resolve(resp.body.id);
    });
  });
};

Note that I am calling the API endpoint directly. You could also use the Box Node SDK and the features already built into the SDK for creating App Users.

You can view the full Node.js implementation of /createboxappuser here.

/generateboxappusertoken
Just as with /createboxappuser, this endpoint is locked down by Auth0. The key piece of data that is required for this call is the Box App User ID. With that ID, you can generate a fresh App User token specific to the user that will remain valid for about 60 minutes.

Here's the general architecture for the Lambda function triggered by /generateboxappusertoken:

You could potentially send the Box App User ID within the POST body sent to /generateboxappusertoken. However, Auth0 provides a JWT, called the ID token, that identifies the authenticated user. So, within my specific implementation, I don't trust the client side application to send the correct Box App User ID. Instead, I send Auth0's user ID token to /generateboxappusertoken. I then make an API call to Auth0's service to retrieve the user profile with the JWT ID token. The Auth0 user profile contains the user's Box App User ID because I updated the Auth0 user profile when I created the Box App User.

I then form a JWT assertion with this Box App User ID and request an access token. Box returns a token specifically scoped to this App User, so I can safely send this access token directly to the browser in which the user authenticated to my application.

An important call out here -- you should always utilize HTTPS in production.

Here is a sample of how to generate a new App User token utilizing JWT and Box's API:

'use strict';
const fs = require('fs');
const path = require('path');
const jwt = require('jsonwebtoken');
const uuid = require('uuid');
const needle = require('needle');
// BoxConstants is a module I created to provide configuration and constant variables.
const BoxConstants = require('./boxValues').BoxConstants;
// BoxConfig is a module I created to provide configuration values needed to form my JWT assertion to Box.
const BoxConfig = require('./boxValues').BoxConfig;

module.exports = (boxId) => {
  // Using path, I resolve the location of my private_key.pem file.
  let certPath = path.resolve(BoxConfig.jwtPrivateKey);
  // I read the entire private_key.pem file into memory to sign my JWT assertion.
  // This is the same private key that I registered with Box through the Box Developer Console.
  let cert = fs.readFileSync(certPath);
  BoxConfig.privateKeyFile = cert;
  console.log("Read PEM");
  // I pass the private_key.pem with the password I used to create the PEM for decryption.
  let privateKeyPackage = { key: cert, passphrase: BoxConfig.jwtPrivateKeyPassword };
  // I begin building the JWT assertion according to the values outlined at developer.box.com
  // https://docs.box.com/docs/app-auth#section-3-constructing-the-jwt-assertion
  let jwtPackage = {
    "iss": BoxConfig.clientId,
    "aud": BoxConstants.BASE_URL,
    "jti": uuid.v4(),
    "sub": boxId,
    "box_sub_type": BoxConstants.USER
  };
  // The box_sub_type is "user" because I am retrieving a user access token, not an enterprise access token.

  // Using the JWT library I loaded earlier, I sign a JWT assertion with the values and PEM declared.
  // For full coverage on this method signature, see the documentation for this JWT library
  // https://www.npmjs.com/package/jsonwebtoken
  let token = jwt.sign(
    jwtPackage,
    privateKeyPackage,
    {
      header: {
        "alg": BoxConstants.DEFAULT_SETTINGS.JWT_ALGORITHM,
        "typ": BoxConstants.DEFAULT_SETTINGS.JWT_TYPE,
        "kid": BoxConfig.jwtPublicKeyId
      },
      noTimestamp: true,
      expiresIn: BoxConstants.DEFAULT_SETTINGS.JWT_EXPIRATION
    }
  );

  let formData = {
    grant_type: BoxConstants.DEFAULT_SETTINGS.JWT_GRANT_TYPE,
    client_id: BoxConfig.clientId,
    client_secret: BoxConfig.clientSecret,
    assertion: token
  };

  console.log("Constructed JWT");
  // Return a new Promise with the response from Box
  return new Promise((resolve, reject) => {
    needle.post(BoxConstants.BASE_URL, formData, function (err, resp) {
      console.log("Inside call to Box");
      // I check for an error first, since resp is undefined when the request fails.
      if (err) {
        return reject(err);
      }
      console.log(resp.body);
      // Adding an expires_at property for testing for expiration when retrieving tokens from Redis and for using tokens client side.
      resp.body.expires_at = Date.now() + (resp.body.expires_in * 1000);
      resolve(resp.body);
    });
  });
};

Note that I am constructing the JWT assertion. You could also use the Box Node SDK and the JWT features already built into the SDK for generating App User tokens.

You can view a full Node.js implementation of this here.

Caching Is Important
For simplicity's sake, I didn't include an outline of the caching layer within the architectural diagrams. For performance, it should be considered mandatory to implement caching for all Enterprise tokens you create and for each individual App User token you generate.

The general outline for caching is as follows: I cache the Enterprise token in an instance of Redis that is accessible to both functions. I save several hundred milliseconds each time a function runs and uses a token from the cache. The same is true for the App User tokens. I store the Box App User ID as the key and the App User access token as the value within Redis.

I used AWS ElastiCache (specifically with Redis) for my implementation, though you could use any kind of cloud-based in-memory cache. There is a good deal of setup needed if you're going to call out to ElastiCache from your Lambda functions. You can follow this guide from AWS. There are also hosted cloud solutions dedicated to Redis such as Redislabs.

Utilizing Redis or Memcached is highly recommended here for speed. You start to lose the benefits of caching if you use long-term data storage solutions like Postgres or MongoDB, because their lookup times are longer than those of Redis and Memcached. Here's an example of how I cache App User tokens and attempt to retrieve one before sending a request to Box for a new token:

'use strict';

exports.handler = function (event, context, callback) {
  let redisClient = require('./redisConfig')();
  let auth0 = require('./auth0Config')();
  let generateBoxUserToken = require('./generateBoxUserToken.js');

  let response;
  let error;
  // I've created a callback for when Redis ends its connection. The callback response object should contain a Box App User token if successful.
  redisClient.on('end', () => {
    console.log("Ending connection to Redis...");
    console.log(response);
    callback(error, response);
  });

  // Verify identity from Auth0 using the Auth0 JWT token
  auth0.tokens.getInfo(event.token)
    .then(function (profile) {
      console.log("Auth0 call complete...");
      console.log(profile);
      // Check to see if this Auth0 profile contains a Box ID
      if (profile && profile.app_metadata && profile.app_metadata.boxId) {
        console.log("Attempting to retrieve box token from cache...");
        // I save each App User token to Redis with the Box ID of the user as the key.
        redisClient.get(profile.app_metadata.boxId, function (err, boxToken) {
          if (err) {
            console.log("There was an error.");
            // I record the error, signal for Redis to end this connection, and stop processing.
            error = err;
            return redisClient.quit();
          }
          console.log(boxToken);
          // Here, I coerce the token (if it exists) into a JavaScript Object
          boxToken = (boxToken) ? JSON.parse(boxToken) : null;
          console.log(boxToken);
          console.log(Date.now().toString());
          // If the token exists and the token isn't expired, I set the token as the response and signal for Redis to end its connection.
          if (boxToken && boxToken.expires_at > Date.now()) {
            response = boxToken;
            redisClient.quit();
          // Otherwise, I need to request a new token.
          } else {
            generateBoxUserToken(profile.app_metadata.boxId)
              .then((boxUserToken) => {
                console.log("Setting token in Elasticache");
                // I set the token in Redis using the Box ID of the user as the unique key.
                redisClient.set(profile.app_metadata.boxId, JSON.stringify(boxUserToken), function (err, reply) {
                  // I set the token as the response and I tell Redis to end its connection.
                  response = boxUserToken;
                  redisClient.quit();
                });
              })
              .catch((err) => {
                error = err;
                redisClient.quit();
              });
          }
        });

      } else {
        error = {
          message: "This user is not properly authenticated."
        };
        // I still end the Redis connection so the 'end' handler above invokes the Lambda callback.
        redisClient.quit();
      }
    })
    .catch(function (err) {
      console.log(err);
      error = err;
      // Without ending the connection here, the Lambda callback would never be invoked.
      redisClient.quit();
    });
};

If you went and viewed my sample code, you may have also noticed that I'm repeating code to create an Enterprise token in each Lambda function. Martin Fowler describes this code repetition as a con in serverless architectures, and this is a real life example of when that repetition is necessary due to this choice of architecture. Unfortunately, I'm also repeating code for storing and accessing tokens in Redis.
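
One way to reduce the repeated caching code would be to factor the "check the cache, otherwise fetch and store" logic into a shared helper. The sketch below is not part of my sample project. The helper name, the shape of the cache object (promise-returning get and set methods), and the in-memory stand-in for Redis are all assumptions made for illustration. It relies on the same expires_at property I add to tokens before caching them.

```javascript
'use strict';

// Return a cached token if it is still valid; otherwise fetch, cache, and return a new one.
// `cache` is any object with promise-returning get(key) and set(key, value) methods.
// `fetchToken` returns a Promise for a token object that carries an expires_at timestamp (ms).
// `now` is an optional clock function, injectable so the expiry check can be tested.
function getTokenWithCache(cache, key, fetchToken, now) {
  now = now || Date.now;
  return cache.get(key).then((raw) => {
    const cached = raw ? JSON.parse(raw) : null;
    if (cached && cached.expires_at > now()) {
      return cached;
    }
    return fetchToken().then((fresh) =>
      cache.set(key, JSON.stringify(fresh)).then(() => fresh)
    );
  });
}

// In-memory stand-in for Redis, used only to exercise the helper locally.
function makeMemoryCache() {
  const store = new Map();
  return {
    get: (key) => Promise.resolve(store.has(key) ? store.get(key) : null),
    set: (key, value) => { store.set(key, value); return Promise.resolve('OK'); }
  };
}
```

With a helper like this, both the Enterprise token and App User token paths could share one code path and differ only in the key and the fetch function they pass in. A real Redis client would need a thin wrapper to expose the promise-returning get and set methods assumed here.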

Do Not Underestimate the Powers of the Client Side
Now that I have a secured API endpoint and a Box App User authentication microservice, I can start offloading costly operations to client side applications. I extended out the Angular project that Auth0's guide provided to start offering client side actions that call Box's APIs directly.

The key here is that I can now generate a token that is scoped appropriately and safely to send to an authenticated browser or mobile device using a secure HTTPS connection. Additionally, I have a secured API endpoint for getting another App User token when the one I'm using expires, assuming that my user still has a valid Auth0 token as well.

For demonstration purposes, I've built out a CORS-enabled client side upload using the Box API, the example Angular project, and the App User authentication microservice. Within the Box Developer Console I can register specific domains for CORS, so I add the S3 domain serving my static HTML, JavaScript, and CSS files. Then, using the App User token retrieved from the microservice, I populate the $http service available through Angular with an Authorization header and the App User token.
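
As a rough illustration of what the client needs to send, here is a small helper that assembles the pieces of a direct upload request. The function name and the shape of the returned object are hypothetical, and this is not the code from my Angular project. The upload URL and the attributes form field reflect Box's upload API as I understand it, and a folder ID of '0' refers to the user's root folder.

```javascript
'use strict';

// Describe a direct-to-Box upload request for a given App User token.
// The returned object is a plain description; the caller still attaches the file itself.
function buildBoxUploadRequest(appUserToken, fileName, parentFolderId) {
  return {
    method: 'POST',
    url: 'https://upload.box.com/api/2.0/files/content',
    // The App User token retrieved from the authentication microservice.
    headers: { Authorization: 'Bearer ' + appUserToken },
    // Box expects `attributes` as a JSON string form field alongside the file part.
    attributes: JSON.stringify({ name: fileName, parent: { id: parentFolderId } })
  };
}
```

In the browser, the attributes string and the selected file would be appended to a FormData object and posted to the URL with the Authorization header shown here.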

My application no longer needs to proxy uploaded files through any kind of back end server. This means I ultimately save money on the infrastructure needed to process and send files to Box, and instead offload this expensive process to the client side.

Also, since the upload happens from the client directly to Box, my application is now faster than when I needed to send the file through my server and then to Box. As a result my application will also gain more responsiveness from the end user's perspective.

Where To Go from Here
In addition to building specific microservices to work with Box, you can start to reimagine your entire backend server in this microservice structure.

Since our thumbnail generation endpoint returns bytes, you could easily proxy the calls to Box through one of these microservices. You'd have a URL (/thumbnail/:fileId, for example) tied to a Lambda that calls to Box's API, retrieves the image bytes for the file, and returns the image to the client. You could then just point the image tag's src attribute to your microservice URL.
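
A thumbnail proxy along those lines might look like the sketch below. I haven't built this one, so treat it as an outline under assumptions: fetchThumbnailBytes is a hypothetical function that calls Box and resolves with the image bytes, the handler assumes API Gateway's Lambda proxy integration (which supplies event.pathParameters and expects this response shape), and the thumbnail is assumed to be a PNG.

```javascript
'use strict';

// Build a Lambda handler for a route such as /thumbnail/{fileId}.
// `fetchThumbnailBytes(fileId)` is a hypothetical function returning a Promise for a Buffer.
function makeThumbnailHandler(fetchThumbnailBytes) {
  return function handler(event, context, callback) {
    const fileId = event.pathParameters.fileId;
    fetchThumbnailBytes(fileId).then(
      (bytes) => callback(null, {
        statusCode: 200,
        headers: { 'Content-Type': 'image/png' },
        // Binary bodies are returned base64-encoded and flagged as such.
        body: Buffer.from(bytes).toString('base64'),
        isBase64Encoded: true
      }),
      (err) => callback(err)
    );
  };
}
```

For the image to reach the browser as binary rather than base64 text, the API Gateway would also need to be configured to treat the image content type as a binary media type.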

Another use case is around our webhooks service. When registering a webhook for an event on an object in Box, you'll need to provide an HTTPS URL to which Box can POST data about the event. Catching that POST and passing the data on to a larger back end workflow is a perfect job for a Lambda microservice like the ones described here.

You can view a running implementation of my application here.

Here's a list of the resources mentioned in this post:
Serverless Framework - https://github.com/serverless/serverless
Auth0 Guide - https://auth0.com/docs/integrations/aws-api-gateway
Martin Fowler article - http://martinfowler.com/articles/serverless.html
Github repo - https://github.com/amgrobelny-box/aws-box-project
Lightning Talk - https://www.youtube.com/watch?v=tlhSet0LSI8

Getting Started with Box Platform

Box Platform provides enterprise-grade security, a granular permissions model, and rich preview capabilities for 120 file types. If you want to test out Box Platform in your application, click here to create a free developer account.

If you have any questions about this tutorial, please feel free to ask in the Developer forum within Box Community.