AWS, Websockets

Websockets on AWS

Let’s say that you have an application that uses a Websocket to send and receive data, but it relies on a server that must run on a dedicated host; that is, it either needs to live in a container or run directly on the operating system. This could become a problem if your Websocket server gets a lot of traffic. Services like AWS Lambda can come in handy here, especially since they abstract away the need to manually scale a service up and down. What follows is a guide for moving that Websocket server into the cloud, in particular into an AWS Lambda function.

Broad Overview

In this demonstration, we’ll be taking the Publish-Subscribe server for a Small Game Engine and moving it into a Lambda. Let’s take a look at how the existing NodeJS server works:

https://github.com/mboleary/JsGameEngine/blob/krum_blog_post_2020_08/server/index.js

This version of the server makes use of the ws library (https://www.npmjs.com/package/ws).

// ... Some parts removed to make this shorter
wss.on('connection', function connection(ws, req) {
    console.log("Connection:", req.socket.remoteAddress);
    ws.isAlive = true;
    ws.on('pong', heartbeat);
    ws.on('message', (msg) => {
        console.log("Message:", msg);
        if (!msg) return;
        let json = null;
        try {
            json = JSON.parse(msg);
        } catch (e) {
            ws.send(JSON.stringify({err: "Not Valid JSON"}));
            return;
        }
        console.log("Action:", json.action, "Target:", json.target);
        if (json.action === "update") {
            if (json.target === "*") {
                wss.clients.forEach((wsr) => {
                    if (wsr !== ws) {
                        wsr.send(JSON.stringify(json));
                    }
                });
            }
        }
    });
    ws.send('TEST');
});
// Code that handles ending stale connections here...

The Code has been truncated for brevity.

The client uses this to send the server the updated state of special Game Objects. The server then forwards these updates to the other clients, or can even send a message to a specific client.
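
As a rough illustration of the message shape the server code above reads, here is a minimal sketch of how a client might build an update. Only action and target appear in the server excerpt; the data field and the buildUpdate helper name are assumptions made for illustration.

```javascript
// Hypothetical helper: builds the string a client would pass to ws.send().
// `action` and `target` match the fields the server checks; `data` is an
// assumed payload field used here for illustration.
function buildUpdate(data, target = "*") {
    // "*" asks the server to forward the update to every other client;
    // any other value is meant to identify a single recipient.
    return JSON.stringify({ action: "update", target, data });
}

const wireMessage = buildUpdate({ x: 10, y: 20 });
console.log(wireMessage);
```

The message is stringified before it is sent, which is the same requirement the Lambda version has later on.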

Now that we know how the server works, let’s build one in Serverless.

Serverless Project

This section will walk you through how I built the project. If you want to see the final code, go to https://github.com/krumIO/AWSWebsocketDemo.

Initializing the project

The first thing we’ll want to do is start a Serverless project.
Make sure that serverless is installed (run npm i -g serverless if it isn't already installed). Then create the project and move into its directory by running
serverless <PROJECT NAME>
cd <PROJECT NAME>

At this point, serverless.yml should contain the service name, a provider section, some basic iamRoleStatements, and a default handler, along with other generated defaults.

Editing serverless.yml

Now, the serverless.yml file needs to be updated. Add the new Lambda functions, and add the iamRoleStatements that will allow the Lambda function to access DynamoDB.

https://github.com/krumIO/AWSWebsocketDemo/blob/master/jsge-pubsub/serverless.yml

provider:
  # Some items removed.
  iamRoleStatements:
      - Effect: Allow
        Action:
          - dynamodb:Query
          - dynamodb:Scan
          - dynamodb:GetItem
          - dynamodb:PutItem
          - dynamodb:UpdateItem
          - dynamodb:DeleteItem
        Resource: "arn:aws:dynamodb:${opt:region, self:provider.region}:*:*"
  environment:
    CONNECTIONS_TABLE: "AWSWebsocketDemo-Connections"
functions:
  connectHandler:
    handler: handler.handler
    events:
      - websocket:
          route: $connect
      - websocket:
          route: $disconnect
      - websocket:
          route: $default

The handler function will be triggered by websocket events. We'll talk more about these events later.

Also notice how we've defined the CONNECTIONS_TABLE Environment Variable. We're going to use that in the Lambda function later.

Adding the DynamoDB Schema

Afterwards, add the DynamoDB Schema.

resources:
  Resources:
    # This table stores the connections to the API Gateway
    ConnectionsTable: 
      Type: AWS::DynamoDB::Table
      # DeletionPolicy: Retain
      Properties:
        AttributeDefinitions:
          - AttributeName: id
            AttributeType: S
        KeySchema:
          - AttributeName: id
            KeyType: HASH
        ProvisionedThroughput:
          ReadCapacityUnits: 1
          WriteCapacityUnits: 1
        TableName: ${self:provider.environment.CONNECTIONS_TABLE}

Author's Note: You may notice that there is a line commented out above (DeletionPolicy: Retain). If uncommented, that line prevents the table from being accidentally deleted when the stack is removed. The trade-off is that the retained table still exists the next time you deploy from scratch, so you would need to comment out the entire ConnectionsTable resource block before re-deploying your code.

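Since we’re only deploying code to the Lambda function, you shouldn’t need to install any additional libraries, because the AWS SDK is already included in the Lambda runtime. (The aws-sdk package required below is version 2 of the SDK, which is bundled with the Node.js 16 and earlier runtimes. Node.js 18 and later bundle version 3 instead, so there you would need to package aws-sdk yourself or port the calls.)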

Writing the Connection Event Handler

Now, write the connection handler.

Here is the full source code for the handler: https://github.com/krumIO/AWSWebsocketDemo/blob/master/jsge-pubsub/handler.js

There are 3 types of events that this handler will handle:

  • connect: The event where a client will establish a Websocket Connection
  • disconnect: When a client is disconnected
  • default: This handles messages that come through the Websocket

A handler can handle one, or all of these events, but this example will trigger the same handler on all 3 events.

Create a file called handler.js, and put in the following content.

const AWS = require('aws-sdk');

const dynamoDB = new AWS.DynamoDB.DocumentClient({ apiVersion: '2012-08-10', region: process.env.AWS_REGION});

// Handles Connections
module.exports.handler = async event => {
    const routeKey = event.requestContext.routeKey;

    if (routeKey === "$connect") {
        // Connect Route
        return {
            statusCode: 200,
            body: JSON.stringify(
                {
                    message: 'Connect',
                    input: event,
                }
            ),
        };
    }

    if (routeKey === "$disconnect") {
        // Disconnect Route
        return {
            statusCode: 200,
            body: JSON.stringify(
                {
                    message: 'Disconnect',
                    input: event,
                }
            ),
        };
    }
    
    // This will handle the default Events
    
    return {
        statusCode: 200,
        body: JSON.stringify(
            {
                message: 'Response',
                input: event,
            }
        ),
    };
};

We're going to use the event.requestContext.routeKey value to determine which event has come through the handler.
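
To experiment with this routing without deploying, one option is to feed a hand-built event into the routing logic. The sketch below includes only the fields this article reads. The names makeEvent and describeRoute are hypothetical, and a real API Gateway event carries many more fields.

```javascript
// Hypothetical helper that builds a minimal stand-in for an API Gateway
// Websocket event. Only the fields read in this article are included.
function makeEvent(routeKey, body) {
    return {
        requestContext: { routeKey },
        body: body === undefined ? undefined : JSON.stringify(body),
    };
}

// Simplified stand-in for the handler's routing logic.
function describeRoute(event) {
    const routeKey = event.requestContext.routeKey;
    if (routeKey === "$connect") return "Connect";
    if (routeKey === "$disconnect") return "Disconnect";
    return "Response"; // everything else falls through to $default
}

console.log(describeRoute(makeEvent("$connect")));
console.log(describeRoute(makeEvent("$default", { action: "update" })));
```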

Before we can start reading events, we'll need to add a little bit of boilerplate first that will set up the API Gateway Management API, which we'll use later on to send messages back to the clients.

// ...
module.exports.handler = async event => {
    const routeKey = event.requestContext.routeKey;
    const connectionId = event.requestContext.connectionId;
    const url = `${event.requestContext.domainName}/${event.requestContext.stage}`;
    
    const apiGateway = new AWS.ApiGatewayManagementApi({
        apiVersion: '2018-11-29',
        endpoint: url,
        logger: console
    });
    if (routeKey === "$connect") {
    // ...

The Connection Handler will handle when the Client establishes a Websocket through the API Gateway. Since Lambdas are stateless, we'll need to store the Connection IDs that come from the API Gateway so that we can send messages later.

if (routeKey === "$connect") {
        const putParams = {
            TableName: process.env.CONNECTIONS_TABLE,
            Item: {
                id: connectionId
            }
        };

        try {
            await dynamoDB.put(putParams).promise();
        } catch (err) {
            console.error("Error adding connection:", err);
            return {
                statusCode: 500,
                body: JSON.stringify(
                    {
                        message: 'Connection Failed',
                        error: err,
                    }
                )
            };
        }

        return {
            statusCode: 200,
            body: JSON.stringify(
                {
                    message: 'Connect',
                    input: event,
                }
            ),
        };
    }

Note that the returned value is mainly used to send a response while the Websocket connection is first being established, since a Websocket starts its life as an HTTP request that is then upgraded to a Websocket.

The Disconnect handler handles when a connection is closed or interrupted. Since the API Gateway manages the connections, all that we need to do is to remove the connection ID from the table so that we don't try to send messages through it later.

if (routeKey === "$disconnect") {
        const deleteParams = {
            TableName: process.env.CONNECTIONS_TABLE,
            Key: {
                id: connectionId
            }
        };
        try {
            await dynamoDB.delete(deleteParams).promise();
        } catch (err) {
            console.error("Error Removing Connection:", err);
            return {
                statusCode: 500,
                body: JSON.stringify(
                    {
                        message: 'Disconnect Failed',
                        error: err,
                    }
                )
            };
        }
        return {
            statusCode: 200,
            body: JSON.stringify(
                {
                    message: 'Disconnect',
                    input: event,
                }
            ),
        };
    }

Forwarding Messages to Clients

Now, we need to handle sending messages to the clients. It's VERY IMPORTANT that the messages are STRINGS, otherwise the message will not send (Giving it un-stringified JSON will make it sad). In our case, since we're making a Publish-Subscribe component, we'll need to send out the Stringified JSON updates to the connected clients. We'll do this in 2 parts.

const message = JSON.parse(event.body);
let connections = null;
if (message.action === "update") {
    if (message.target === "*") {
        try {
            connections = await dynamoDB.scan({
                TableName: process.env.CONNECTIONS_TABLE,
                ProjectionExpression: "id",
            }).promise();
        } catch (err) {
            console.error("Error getting all connections:", err);
            return {
                statusCode: 500,
                body: JSON.stringify({
                    message: "Error Getting all Connections while sending a Message",
                    error: err
                })
            }
        }
    } else {
        connections = {
            Items: [
                {id: message.target}
            ]
        }
    }
}

For the first part, we look at the message that we received to determine who to send it to, as the server can send to either a single client or all other connected clients. To get all other clients, we read the entries stored in the Connections Table. Once we have built the list of connection IDs to send the update to, we post the data through the API Gateway.

if (connections && connections.Items.length > 0) {
    const dataToPost = {};
    dataToPost.from = connectionId;
    dataToPost.action = message.action;
    dataToPost.number = message.number;
    dataToPost.data = message.data;
    let posts = connections.Items.map(async (connData) => {
        let id = connData.id;
        if (id === connectionId) return;
        try {
            await apiGateway.postToConnection({
                ConnectionId: id,
                Data: JSON.stringify(dataToPost)
            }).promise();
        } catch (err) {
            // HTTP 410 => Gone
            if (err.statusCode === 410) {
                await dynamoDB.delete({
                    TableName: process.env.CONNECTIONS_TABLE,
                    Key: { id: id } // must be a Key object matching the table's key schema
                }).promise();
            } else {
                throw err;
            }
        }
    });
    try {
        await Promise.all(posts);
    } catch (err) {
        return {
            statusCode: 500,
            body: JSON.stringify({
                message: "Error Sending Messages",
                error: err
            })
        }
    }
}

When we're sending messages back to the clients, we need to keep in mind that sometimes the connections go stale, meaning that the connection is broken before a disconnection event is received. If this happens, an exception will be thrown by the API Gateway Management API. If the error has a statusCode of 410 (https://developer.mozilla.org/en-US/docs/Web/HTTP/Status/410), then we'll need to delete that connectionID from the Connections Table.
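
The stale-connection handling described above can be sketched in isolation. This version takes the send and remove operations as parameters (send, remove, and fanOut are hypothetical names) so that it can be exercised without AWS. In the handler they would wrap apiGateway.postToConnection and dynamoDB.delete.

```javascript
// Sends `data` to every ID except the sender, removing IDs that report 410.
// `send(id, data)` and `remove(id)` are injected async functions (assumed
// names) standing in for postToConnection and the DynamoDB delete.
async function fanOut(ids, senderId, data, send, remove) {
    const removed = [];
    await Promise.all(ids.map(async (id) => {
        if (id === senderId) return; // never echo back to the sender
        try {
            await send(id, data);
        } catch (err) {
            if (err.statusCode !== 410) throw err; // unexpected failure
            await remove(id); // 410 Gone: the connection no longer exists
            removed.push(id);
        }
    }));
    return removed;
}

// Example with fakes: connection "b" is stale.
const delivered = [];
const fakeSend = async (id) => {
    if (id === "b") throw Object.assign(new Error("Gone"), { statusCode: 410 });
    delivered.push(id);
};
const fakeRemove = async () => {};
const fanOutResult = fanOut(["a", "b", "c"], "a", "{}", fakeSend, fakeRemove);
fanOutResult.then((removed) => console.log("removed:", removed, "delivered:", delivered));
```

Injecting the two operations keeps the cleanup rule (delete on 410, rethrow anything else) testable on its own.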

Deploying to AWS

If you want to try this for yourself, you'll need to deploy the code to AWS using serverless.

If Serverless is not installed, you'll need to run npm i -g serverless in a command line, which will install it.

To Deploy your code to AWS, run serverless deploy.

If you are developing this into something else, note that you can redeploy a single function without redeploying the whole stack by running serverless deploy function --function (LAMBDA_FUNCTION_NAME).

After you have deployed your changes to AWS, run sls info to get the information about your serverless endpoint. It should look something like this:

Service Information
service: jsge-pubsub
stage: dev
region: us-east-1
stack: jsge-pubsub-dev
resources: 14
api keys:
  None
endpoints:
  wss://XXXXXXXXXX.execute-api.us-east-1.amazonaws.com/dev
functions:
  connectHandler: jsge-pubsub-dev-connectHandler
layers:
  None

Make sure to save that wss:// URL for later, as we'll need to configure the frontend to work with it.

Testing with the client

To test this out with the client, you'll need to clone the krum_blog_post_2020_08 branch of the following repo.
https://github.com/mboleary/JsGameEngine/tree/krum_blog_post_2020_08

You can do this by typing the following commands into your command line:

git clone https://github.com/mboleary/JsGameEngine.git
cd JsGameEngine
git checkout krum_blog_post_2020_08

Afterwards, you'll need to open frontend/CONFIG.js in a text editor and update the value of the pubsub key with the new Websocket URL:

window.CONFIG = {
    "branch": "testing",
    "debug": true,
    // "pubsub": "ws://localhost:8001"
    "pubsub": "wss://XXXXXXXXXX.execute-api.us-east-1.amazonaws.com/dev"
}

After this, navigate to the frontend directory in the repo, and run a local web server. You can do this if you have Python installed by typing

python3 -m http.server 8000 --bind 127.0.0.1

Or if you prefer to use a NodeJS HTTP server:
https://www.npmjs.com/package/http-server

Afterwards, you'll want to open your web browser and navigate to http://localhost:8000 (or whatever port you specified). Then you can open the Developer console and look at the network tab to take a look at the messages going back and forth.

Problems Encountered

{"message": "Forbidden", "connectionId":"Q6sX6fMsoAMCLVg=", "requestId":"Q6sZYH2JIAMFzUQ="}

The error above can appear if no route handles the Websocket’s message.
If there is a handler, then make sure that your message body is a string (this means that if it’s JSON, it needs to be stringified).

Another problem that you will encounter is that the Websocket times out after a while: connections are closed after 10 minutes of inactivity, or after 2 hours regardless of activity. This is presumably an effort to reduce load on AWS's side, so you should plan on building logic into your client that checks the status of the Websocket connection and re-establishes it if necessary.
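
One possible shape for that client-side logic is to reopen the socket whenever it closes, waiting a little longer after each failed attempt. This is a minimal sketch, not the client's actual code. The names reconnectDelay and connectWithRetry and the backoff numbers are assumptions.

```javascript
// Hypothetical backoff schedule: 1s, 2s, 4s, ... capped at 30s.
function reconnectDelay(attempt, baseMs = 1000, maxMs = 30000) {
    return Math.min(maxMs, baseMs * 2 ** attempt);
}

// Opens a socket and reopens it whenever it closes.
// `WebSocketImpl` is passed in so the same code works with the browser's
// WebSocket or a compatible class; `onMessage` receives each message payload.
function connectWithRetry(url, WebSocketImpl, onMessage, attempt = 0) {
    const ws = new WebSocketImpl(url);
    ws.onopen = () => { attempt = 0; }; // a successful open resets the backoff
    ws.onmessage = (evt) => onMessage(evt.data);
    ws.onclose = () => {
        setTimeout(
            () => connectWithRetry(url, WebSocketImpl, onMessage, attempt + 1),
            reconnectDelay(attempt)
        );
    };
    return ws;
}

console.log(reconnectDelay(0), reconnectDelay(3), reconnectDelay(10));
```

In a browser this could be started with connectWithRetry(window.CONFIG.pubsub, WebSocket, console.log).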

Lastly, there is a 1MB limit on the amount of data that a single DynamoDB Scan or Query call returns. If you're just using DynamoDB to store the connection IDs, as we have done in this demo, you may not run into this issue, but if you do, you'll need to paginate: pass the LastEvaluatedKey from each response as the ExclusiveStartKey of the next request until no LastEvaluatedKey is returned.
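
A pagination loop along those lines might look like the sketch below. DynamoDB reports a LastEvaluatedKey when more data remains, and the next request passes it back as ExclusiveStartKey. To keep the example runnable without AWS, the page fetch is injected. The names scanAll and scanPage are hypothetical.

```javascript
// Collects every item from a paginated scan.
// `scanPage(startKey)` is an injected async function (hypothetical name)
// that returns one page shaped like a DynamoDB Scan result:
// { Items: [...], LastEvaluatedKey: <key or undefined> }.
async function scanAll(scanPage) {
    const items = [];
    let startKey; // undefined on the first call
    do {
        const page = await scanPage(startKey);
        items.push(...(page.Items || []));
        startKey = page.LastEvaluatedKey; // absent once the last page is read
    } while (startKey);
    return items;
}

// Example with a fake two-page table.
const fakePages = {
    first: { Items: [{ id: "a" }], LastEvaluatedKey: { id: "a" } },
    second: { Items: [{ id: "b" }] },
};
const fakeScan = async (startKey) => (startKey ? fakePages.second : fakePages.first);
const allItems = scanAll(fakeScan);
allItems.then((items) => console.log(items.map((item) => item.id)));
```

In the handler, scanPage would presumably wrap the existing call, for example (key) => dynamoDB.scan({ TableName: process.env.CONNECTIONS_TABLE, ProjectionExpression: "id", ExclusiveStartKey: key }).promise().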

Cost

Remember that there is a cost associated with using DynamoDB once you go beyond whatever free allowance applies to your account. Check the pricing page for details: https://aws.amazon.com/dynamodb/pricing/
For the API Gateway, if you go over 1 million messages or 750,000 connection minutes, you will be charged: https://aws.amazon.com/api-gateway/pricing/

Going Further

There are a few ways you can expand upon this example project. One such way is to add support for rooms. Many other pub-sub components use this system to manage which clients get a message, and which ones do not. This was not implemented here because the client does not yet support different rooms.
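
As a rough idea of what room support could involve, the sketch below assumes each stored connection item also records a room name, which the demo's table does not do today. The room attribute and the recipientsInRoom helper are hypothetical.

```javascript
// Hypothetical: each stored connection item also records a `room` name.
// Returns the IDs in the given room, excluding the sender.
function recipientsInRoom(items, room, senderId) {
    return items
        .filter((item) => item.room === room && item.id !== senderId)
        .map((item) => item.id);
}

const sampleItems = [
    { id: "a", room: "lobby" },
    { id: "b", room: "lobby" },
    { id: "c", room: "arena" },
];
console.log(recipientsInRoom(sampleItems, "lobby", "a"));
```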

How we use Websockets

One of our clients at Krumware uses Websockets to send live updates to all connected web clients. What's different between our client's application and this testing one is that these web clients make HTTP requests to AWS Lambda functions to update the data in various parts of the application. Those Lambda functions then use SNS to send a message to trigger another Lambda function that sends out a message on the Websocket to update the data in all of the connected clients. As a result of this, the clients don't need to constantly refresh or make many HTTP GET requests to update their data, thus causing less load on the Lambda functions and in turn costing them less money to keep their web clients up to date with the latest data.
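
For readers unfamiliar with the SNS step, a Lambda function triggered by SNS receives the published messages under event.Records, each with an Sns.Message string. The sketch below only shows unpacking that event. Treating each message as JSON, and the helper name extractSnsPayloads, are assumptions rather than a description of the client's actual code.

```javascript
// Extracts the message payloads from an SNS-triggered Lambda event.
// The Records[].Sns.Message shape is what SNS delivers to Lambda; treating
// each Message as JSON is an assumption made for this sketch.
function extractSnsPayloads(event) {
    return (event.Records || []).map((record) => JSON.parse(record.Sns.Message));
}

const sampleSnsEvent = {
    Records: [
        { Sns: { Message: JSON.stringify({ action: "update", data: { id: 7 } }) } },
    ],
};
console.log(extractSnsPayloads(sampleSnsEvent));
```

Each extracted payload could then be stringified and posted to the stored connection IDs in the same way as the update messages earlier in this article.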

To combat the connection timeout issue mentioned above, we implemented a ping-pong strategy, where we send a message back and forth to keep the connection open, and we re-establish the connection every 30 minutes to combat the 2-hour timeout limitation.
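
A keep-alive along those lines can be sketched as a timer that sends a small message well inside the 10-minute idle window. The { action: "ping" } payload, the 5-minute default, and the function names are assumptions, not the production implementation.

```javascript
// Sends a small keep-alive message on a fixed interval.
// The { action: "ping" } payload and the 5-minute default are assumptions;
// the idea is to send something before the 10-minute idle timeout.
const OPEN = 1; // value of WebSocket.OPEN in the browser API

function sendPing(ws) {
    if (ws.readyState !== OPEN) return false; // skip sockets that are not open
    ws.send(JSON.stringify({ action: "ping" }));
    return true;
}

function startKeepAlive(ws, intervalMs = 5 * 60 * 1000) {
    const timer = setInterval(() => sendPing(ws), intervalMs);
    return () => clearInterval(timer); // call this when the socket closes
}
```

With the handler from this article, such a ping would arrive on the $default route and be ignored, because its action is not "update".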

Conclusion

Hopefully by reading through this, you have a better understanding of what the process of converting an existing NodeJS Websocket application to run on AWS Lambda might look like. If you have a bigger project with more complex backend software that you're looking to get running on a Lambda function, Krumware might be able to help.

https://en.wikipedia.org/wiki/Publish–subscribe_pattern

https://docs.aws.amazon.com/apigateway/latest/developerguide/apigateway-websocket-api-overview.html (Websocket APIs in API Gateway)

https://www.serverless.com/framework/docs/providers/aws/events/websocket/ (Using Websockets with Serverless)

https://tsh.io/blog/implementing-websocket-with-aws-lambda-and-api-gateway/ (Pretty much what we're doing here)

https://aws.amazon.com/blogs/compute/announcing-websocket-apis-in-amazon-api-gateway/

https://github.com/aws-samples/simple-websockets-chat-app

https://www.serverless.com/framework/docs/providers/aws/guide/resources/

https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/aws-resource-dynamodb-table.html

https://docs.aws.amazon.com/AWSJavaScriptSDK/latest/AWS/ApiGatewayManagementApi.html#postToConnection-property (postToConnection)

About Brady O'Leary

I have been a Software Developer at Krumware since 2019.
  • Columbia, SC