Cost-Effective Management of Static Data in Serverless Apps

June 4, 2024

Introduction

Handling static data is nothing new; servers have needed it since the dawn of the internet. Back then it was a relatively simple engineering feat, because software was usually deployed as a monolith, meaning only a handful of physical servers needed the data.

The data would likely sit near the application servers, so calls to fetch it would be fast and cheap, never leaving the data centre network. And that was if it was even worth storing somewhere centralised: if only a handful of servers needed the data, an engineer could simply SSH onto each one and update an environment variable or a file stored on the disk.

SSHing into individual Lambda Function invocation environments is not a feasible approach. This article covers the options available in serverless (and, by extension, most cloud-based) architectures.

What Is Static Data?

Static data could be a number of things. For the purposes of this article, we are going to assume it’s relatively static, read-only data. Use cases could include:

  • Business-level application configuration: e.g. a company address, phone number or an email template. You would avoid environment variables for this sort of data because it is shared: you would need to update it everywhere when (for example) your company address changed.
  • Implementation-level application configuration: e.g. feature flags, circuit breakers
  • UI assets: e.g. images used for the landing page of a website.

Modern Day Problems Of Static Data

It is uncommon for applications today to resemble the above. There are three major differences:

  • Applications are in the cloud: We pay for egress network traffic from our VPCs and idle compute time. We also tend to lean on managed services (e.g. API Gateway, SNS, SQS or EventBridge etc) rather than hosting our own implementations of the same technology (queues, event buses, storage etc).
  • We are all building microservices: Services are isolated within their own environments, meaning a single application is now split into multiple pieces on multiple servers. Communication between these services traverses external networks rather than localhost.
  • Some of us are using ephemeral compute: The likes of Lambda, Step Functions and ECS mean our invocation environments do not persist for long. All of a sudden, we need to consider the costs of initializing execution environments because there are no long-standing servers.

These three shifts in how we develop applications have downsides that compound one another:

  1. Not only do we now pay for network traffic and managed services …
  2. But we now develop many more, smaller microservices which require a heavier use of these managed services …
  3. To go a step further, ephemeral compute environments within these microservices initialize far more often. If each initialization requires fetching static data, which is likely stored in a managed service, our cloud costs can spiral out of control.

[Figure: Distributed architectures distribute their data. This increases latency and price.]

The Solution

The answer is that we need to be frugal. We should design systems to only retrieve data when they absolutely need it. This means caching wherever possible, fetching data through the cheapest path, and doing it in a performant manner because in the cloud we pay for idle compute time.

Hardcoding Data

Paying tribute to how things used to be done, i.e. jumping onto servers to apply configuration changes ourselves, some companies approach the problem by hardcoding data into their application. It does have its benefits:

  • Data retrieval is fast: A read from disk is, of course, going to be orders of magnitude faster than any network request.
  • No additional costs to pay: Data is stored locally and there is no need to use a managed service.

It does come with its problems:

  • Configuration is not centralized: A change in cross-service configuration will need multiple code changes and deployments. These deployments may need to be coordinated.
  • May bloat Lambda package size: This configuration is now part of the Lambda package, and must be pulled down every time a new Lambda execution environment is initialized. This will increase your Lambda cold-start times and, if done repeatedly, nudge you towards the hard 250 MB service limit on unzipped package size.

Use A Cloud-Based Data Store

The alternative is to store this static data somewhere centralized - where it is accessible to all your services. The solution must be:

  • Fast: We pay for idle compute in Lambda. This retrieval must be as fast as possible.
  • Scalable: We don’t want retrieving this data to become a bottleneck in an architecture that is designed to be elastic.

On top of this, some responsibility falls on the client too. The client must cache this static data locally whenever possible, only re-retrieving it when absolutely necessary, perhaps after a fixed time interval.
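
To sketch what that client-side responsibility might look like, the snippet below (a minimal sketch, not a full implementation) caches the data at module scope, which persists across warm invocations of a Lambda execution environment, and refreshes it after a fixed interval. The fetch_static_data function is a hypothetical placeholder for whichever retrieval call you pick from the candidates below.

import time

# Module-scope state survives across invocations of a warm Lambda
# execution environment, so warm invocations skip the network call.
_cache = None
_fetched_at = 0.0
TTL_SECONDS = 300  # hypothetical choice: refresh at most every 5 minutes

def fetch_static_data():
    # Hypothetical placeholder: swap in a DynamoDB, S3, SSM Parameter
    # Store or AppConfig call from the sections below.
    raise NotImplementedError

def get_static_data():
    global _cache, _fetched_at
    # Only re-retrieve when the cache is empty or the TTL has expired.
    if _cache is None or time.time() - _fetched_at > TTL_SECONDS:
        _cache = fetch_static_data()
        _fetched_at = time.time()
    return _cache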

This leaves us with four candidates that we will compare:

  • DynamoDB
  • S3
  • SSM Parameter Store
  • AWS AppConfig

[Figure: A summary of the different static data access patterns from this article.]

DynamoDB Solution

DynamoDB is commonly used as a cache. In many serverless applications it is preferred over ElastiCache due to its tighter integration and the ability to access it via the AWS API, meaning there are no VPC requirements. One pitfall is that it is not a true cache, so it will not be as performant as one. This matters because we essentially need a tool that provides low-latency, read-heavy operations. The question we must ask ourselves is: how fast will a DynamoDB 'GetItem' operation be for retrieving this static data?
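
As a minimal sketch of that 'GetItem' call (assuming a hypothetical table named 'static-config' with partition key 'pk'), the retrieval might look like this. Note that boto3's get_item defaults to eventually consistent reads, which is exactly what we want here:

import boto3

dynamodb = boto3.client("dynamodb")

def fetch_static_data():
    # ConsistentRead defaults to False, i.e. an eventually consistent
    # read, which halves the cost for rarely-changing static data.
    response = dynamodb.get_item(
        TableName="static-config",            # hypothetical table name
        Key={"pk": {"S": "company-details"}}  # hypothetical item key
    )
    return response.get("Item")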

Pricing

Given that storage costs will be negligible, we can simply focus on the read cost of $0.00125 per 10K requests, assuming the application uses eventually consistent reads. This will, of course, double once your configuration is larger than 4kB, but more on this below!

Considerations

DynamoDB has a relatively generous soft service limit of 40K RCUs. An eventually consistent read of up to 4kB consumes half an RCU, so if your configuration is smaller than 4kB that is equivalent to 80K TPS (transactions per second), halving to 40K TPS for data between 4kB and 8kB, and so on. There is also a hard 400kB limit on the size of a DynamoDB item. If your data is near this size, then DynamoDB is not the right solution for you.

Take note that all calculations for DynamoDB in this article assume the use of eventually-consistent reads. This will be fine for static data and halves your cost.

Another downside of DynamoDB is that there is no native way to define your data in infrastructure-as-code. If this is something you would like (i.e. checking the static data into source control and promoting data changes through your environments with CI/CD), you must either implement this logic yourself or use an alternative such as SSM Parameter Store.

S3 Bucket Solution

It is indeed possible to store this static data as a blob in an S3 bucket. Your applications simply make a 'GetObject' call whenever necessary. If your static data is large, you can optionally retrieve the object in parts, which allows you to parallelize the fetch and results in better performance for larger objects (see the sketch below).
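
Here is a rough sketch of that parallel, ranged retrieval (the bucket and key names are hypothetical). Each worker fetches one byte range via the Range header of 'GetObject'; the better performance comes at the cost of more billed requests:

from concurrent.futures import ThreadPoolExecutor

import boto3

s3 = boto3.client("s3")
BUCKET = "my-static-data-bucket"  # hypothetical bucket name
KEY = "config.json"               # hypothetical object key
PART_SIZE = 8 * 1024 * 1024       # fetch in 8 MB ranges

def _fetch_part(start, end):
    # Each ranged GET is billed as a separate request.
    part = s3.get_object(Bucket=BUCKET, Key=KEY, Range=f"bytes={start}-{end}")
    return part["Body"].read()

def fetch_static_data():
    size = s3.head_object(Bucket=BUCKET, Key=KEY)["ContentLength"]
    ranges = [(start, min(start + PART_SIZE, size) - 1)
              for start in range(0, size, PART_SIZE)]
    # Fetch all ranges concurrently and stitch the object back together.
    with ThreadPoolExecutor(max_workers=8) as pool:
        parts = pool.map(lambda r: _fetch_part(*r), ranges)
    return b"".join(parts)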

Pricing

Since this is a read-heavy blob, we should use the S3 Standard storage class. This means paying $0.004 per 10K GET requests. As with all the other solutions, it would be wise to access S3 via a VPC endpoint to avoid additional data transfer costs.

Considerations

Blobs can be of any size, so your static data can be as large as you like. However, S3 can return a mysterious 503 “Slow Down” error if you make too many GET requests; the documented threshold is 5.5K TPS per partitioned prefix.

SSM Parameter Store Solution

Data can sometimes be treated as key-value pairs, which is exactly what SSM Parameter Store is designed for. You can either set your entire blob as the value of a single key, or split the configuration into individual parameters. Note that splitting allows each service to request only the static data it needs, but multiplies the number of requests required to retrieve the whole data set.
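
Both shapes are sketched below, assuming a hypothetical '/my-app/config/' parameter hierarchy: one call fetches just the value a service needs, while the other walks the whole path (paginated, so larger sets multiply the request count as noted above):

import boto3

ssm = boto3.client("ssm")

def fetch_one_parameter():
    # A service requests only the static data it needs.
    response = ssm.get_parameter(Name="/my-app/config/company-address")
    return response["Parameter"]["Value"]

def fetch_all_parameters():
    # Retrieve every parameter under the hierarchy. Results are
    # paginated, so a large set means several requests.
    values = {}
    paginator = ssm.get_paginator("get_parameters_by_path")
    for page in paginator.paginate(Path="/my-app/config/"):
        for parameter in page["Parameters"]:
            values[parameter["Name"]] = parameter["Value"]
    return values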

Pricing

Parameters come in two distinct offerings: standard and advanced.

Standard parameters are free! However, you must adhere to the service quotas of standard SSM parameters (see below). Advanced parameters cost $0.05 per parameter per month (negligible) and $0.05 per 10K requests.

Considerations

There is a clear distinction between standard and advanced parameters. If you want to use standard SSM parameters, the parameter (and therefore your static data) must be smaller than 4kB. On top of this, you are restricted to only 40 TPS (transactions per second).

Advanced parameters can store up to 8kB of data and support up to 10K TPS.

A final point to add: SSM parameters can be saved as SecureStrings, meaning they are encrypted with KMS (KMS charges will apply). If the client has IAM access to both the parameter and the KMS key, it can decrypt the value as part of the retrieval request, just in case this is a requirement.
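
For that SecureString case, decryption is a single flag on the same 'GetParameter' call. A minimal sketch, with a hypothetical parameter name, assuming the caller's IAM role can use both the parameter and the KMS key:

import boto3

ssm = boto3.client("ssm")

# WithDecryption asks SSM to decrypt the SecureString with KMS before
# returning it; KMS charges apply to the decryption.
response = ssm.get_parameter(
    Name="/my-app/config/api-key",  # hypothetical parameter name
    WithDecryption=True,
)
secret_value = response["Parameter"]["Value"]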

AppConfig Solution

AppConfig is certainly the wildcard in this comparison. However, it is the only service here that is actually designed for this use case.

AppConfig is part of the SSM (Systems Manager) family. It is an offering that allows you to define dynamic “configuration” and feature flags in a centralized place. This configuration can then be retrieved by clients via the AWS API.  It has a few notable features:

  • Supports rollout strategies: DynamoDB, S3 and SSM Parameter Store will restrict your data rollout to an “All At Once” approach. If there is an issue with the data, it will cause application-wide outages, and it is extremely difficult to implement A/B testing of different data. AppConfig gives you the freedom to roll out however you wish, with built-in monitoring and automated rollback when errors spike.
  • Configuration validation: You can provide a JSON schema or a Lambda Function that will validate your configuration change before it is pushed to your application.
  • AWS-provided Lambda extension with built-in caching: With the other three solutions, there is still complexity for the clients: they must retrieve the data during initialization and periodically re-request it in case there has been a change. AppConfig provides a Lambda Extension containing the AppConfig Agent that handles all of this for you, exposing a local HTTP endpoint that you can call on every invocation inside your Lambda Function.

Pricing

Assuming this is static data, there is just one cost: $0.002 per 10K requests that check for updates. However, every time your data changes, and on the initial fetch of each new invocation environment, the request that actually receives the updated data costs $8 per 10K requests. This can make AppConfig an expensive solution for dynamic data or for services that cold-start frequently, which is a particular concern for Lambda Functions.
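
To make that pricing dynamic concrete, here is a back-of-the-envelope model using the prices quoted above; all of the traffic figures are invented purely for illustration:

# Illustrative monthly AppConfig cost model (assumed traffic numbers).
POLL_PRICE = 0.002 / 10_000    # per request checking for updates
RECEIVE_PRICE = 8.0 / 10_000   # per request that receives new data

polls_per_month = 100 * 60 * 60 * 24 * 30  # assume 100 polls/second
cold_starts_per_month = 50_000             # each fetches the data once
deployments_per_month = 10                 # each makes live envs re-fetch
live_environments = 200

receives = cold_starts_per_month + deployments_per_month * live_environments
cost = polls_per_month * POLL_PRICE + receives * RECEIVE_PRICE
# ~$51.84 of polling + ~$41.60 of data receipts: cold starts and
# deployments can rival the steady polling cost.
print(f"~${cost:.2f} per month")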

Considerations

AppConfig configurations have a similar size restriction to standard SSM parameters: the data must be smaller than 4kB (the default limit is actually 2kB, but this is a soft limit). Their allowed throughput, however, starts at 1K TPS, which is a soft limit with no documented maximum.

You must also be aware that there is a soft limit on the number of “configuration sessions” that can be started: 500 TPS. Sessions map one-to-one to Lambda invocation environments (a session is started once, during initialization), so this effectively means a soft limit of 500 Lambda cold starts per second. A reminder: this is only a soft limit.

What are Lambda Extensions?

Lambda extensions are distributed as Lambda layers. And Lambda Layers have a bad reputation.

This is mostly attributed to the fact that they make local testing of your code very difficult. A Lambda layer essentially adds a directory to the execution environment’s file system, and if your code in production expects that directory to be there, it will fail locally when it is not.

The best solution is to build your applications using AWS SAM. The AWS SAM local invocation mechanism spins up a local Docker container and invokes your code inside it. An important detail I left out is that AWS SAM will also pull down your configured Lambda layers and apply them to the file system inside the Docker container.

Unfortunately, there is no similar behaviour when developing with the Serverless Framework, CDK, Terraform, Pulumi etc.

What Does This Lambda Extension Do?

The AWS AppConfig Agent Lambda Extension does the following:

  • It runs the AWS AppConfig Agent inside your invocation environment after the environment is initialized but before your function code is initialized.
  • The agent will make a 'StartConfigurationSession' call to the AppConfig service.
  • It will then make a periodic 'GetLatestConfiguration' call as per your configured 'AWS_APPCONFIG_EXTENSION_POLL_INTERVAL_SECONDS' environment variable.
  • This evolving configuration value is exposed on a local port (by default '2772') and can be accessed with a simple HTTP request inside your handler function on every invocation:
import urllib.request

def lambda_handler(event, context):
    # The AppConfig agent listens on localhost:2772 by default; the path
    # segments below are placeholders for your own resource names.
    url = ('http://localhost:2772/applications/application_name'
           '/environments/environment_name'
           '/configurations/configuration_name?flag=flag_name')
    # urlopen returns bytes; decode so the runtime can serialize the response.
    config = urllib.request.urlopen(url).read().decode('utf-8')
    return config

Comparison

Different engineering teams will have different priorities when choosing the right approach. It is important we compare the performance, intended data sizes, intended concurrency and of course the price.

[Table: a comparison of the four options from this article, including a benchmark made from my local machine fetching 4kB of data with each solution.]

From this table, we can make the following comparison statements:

  • If your data is extremely large (> ~16MB): S3 is your only valid option, and you would benefit from its ability to parallelize the fetch, at the cost of more requests for your data.
  • If your static data is very large (> 60kB): S3 supports more concurrency than DynamoDB (5.5K vs ~5.3K TPS) at less than a quarter of the price ($0.004 vs $0.01875 per 10K data retrievals).
  • If your data is large (> 16kB): S3 becomes cheaper than DynamoDB.
  • If your static data is larger than 4kB: You are limited to DynamoDB, S3 & Advanced SSM Parameters.
    • If you need additional KMS encryption: Choose Advanced SSM Parameters
    • For all other scenarios (4kB < data < 16kB): DynamoDB is the cheapest and fastest option.
  • If your static data is smaller than 4kB:
    • If you know you won’t hit 40 requests for this data per second (40 TPS): Use Standard SSM Parameters free of charge.
    • If you want / need a rollout of data changes: Use AWS AppConfig.
    • Otherwise: DynamoDB will be the fastest and cheapest option.

Conclusion

You should be aware of the size limits in DynamoDB (400kB), SSM Parameter Store (8kB) and AWS AppConfig (4kB). If your data grows beyond these limits, you will need to re-design your application immediately.

Aside from some of these concrete limitations, there is no silver bullet. Various factors, such as request throughput, data size and deployment strategy can impact the decision.

I hope this article makes that decision a little easier!
