Diving into Cloudfront

February 7, 2023

Cloudfront can be simply defined as a CDN (Content Delivery Network), caching your static assets in a datacenter nearer to your viewers. But Cloudfront is a lot more complex and versatile than this simple definition.

Cloudfront is a “pull” CDN, which means that you don’t push your content to the CDN. The content is pulled into the CDN Edge from the origin at the first request of any piece of content.

In addition to the traditional pull and cache usage, Cloudfront can also be used as:

  • A Networking Router
  • A Firewall
  • A Web Server
  • An Application Server

Why is using a CDN relevant?

The main reason is to speed up the delivery of static content. By caching the content on the CDN edge, you not only reduce the download time from a few seconds to a few milliseconds, but you also reduce the load and the number of requests on your backend (Network, IO, CPU, Memory, …).

Static content can be defined as content not changing between two identical requests done in the same time frame.

Identical can be as simple as the same URI, or as fine-grained as the same authentication header. The time frame can range from 1 second to 1 year.

The most common case is caching resources like Javascript or CSS and serving the same file to all users forever. But caching a JSON response tailored to a user (Authentication header) for a few seconds reduces the backend calls when the user has the well-known “frenetic browser reload syndrome”.

Edges, Mid-Tier Caches, and Origins

Cloudfront isn’t “just” some servers in datacenters around the world. The service is a layered network of Edge Locations and Regional Edge Caches (or Mid-Tier Caches).

Edge Locations are distributed around the globe with more than 400 points of presence in over 90 cities across 48 countries. Each Edge Location is connected to one of the 13 Regional Edge Caches.

Regional Edge Caches are transparent to you and your visitors: you can’t configure them or access them directly. Your visitors will interact with the nearest Edge Location, which will connect to the attached Regional Edge Cache and finally to your origin. Therefore, in this article, we will refer to Cloudfront as the combination of Edge Locations and Regional Edge Caches.

Visitors benefit twice. Content already cached on their Edge Location is served directly from it, and visitors in the same region using a different Edge Location still benefit from the content cached at the Regional Edge Cache level, so the content doesn’t have to be retrieved from the origin again.
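To make the layering more concrete, here is a minimal in-memory sketch of the lookup order described above. It is only a model, not Cloudfront’s actual implementation: the function name createTieredCache, the single shared regional cache, and the Edge Location name MAD51 are assumptions made for illustration.

```javascript
// Hypothetical model of the layered lookup: Edge Location -> Regional Edge Cache -> origin.
function createTieredCache(fetchFromOrigin) {
  var edges = new Map();     // one cache (Map) per Edge Location name
  var regional = new Map();  // a single Regional Edge Cache shared by these edges
  var originCalls = 0;

  function get(edgeName, key) {
    if (!edges.has(edgeName)) edges.set(edgeName, new Map());
    var local = edges.get(edgeName);

    // 1. Served from the Edge Location the visitor is connected to
    if (local.has(key)) return { body: local.get(key), servedBy: 'edge' };

    // 2. Served from the Regional Edge Cache and copied to the edge
    if (regional.has(key)) {
      local.set(key, regional.get(key));
      return { body: regional.get(key), servedBy: 'regional' };
    }

    // 3. Pulled from the origin and stored in both layers
    originCalls += 1;
    var body = fetchFromOrigin(key);
    regional.set(key, body);
    local.set(key, body);
    return { body: body, servedBy: 'origin' };
  }

  return { get: get, originCalls: function () { return originCalls; } };
}

var cdn = createTieredCache(function (key) { return 'content of ' + key; });
console.log(cdn.get('LIS50', '/app.js').servedBy); // origin
console.log(cdn.get('MAD51', '/app.js').servedBy); // regional
console.log(cdn.get('LIS50', '/app.js').servedBy); // edge
```

The second visitor uses a different Edge Location, but the origin is still called only once.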

Classic Pull and Cache Setup

Before diving into the deep end, let’s refresh our memories with some of the basics of Cloudfront.

Cloudfront can allow write actions (POST, PUT, DELETE) but they will never be cached. We will mainly focus on the read actions (GET, HEAD, OPTIONS).

Origin types

-HTTP/HTTPS:

          • S3 Website

          • S3 Multi-Region Access Points

          • API Gateway

          • ALB

          • Any HTTP Webserver

-S3:

          • S3 API

-Elemental Media Store Container

-Elemental Media Package Container

S3 Website or S3 API ?

Even if they look very similar, these two ways of accessing S3 are very different.

-S3 Website:

          • The bucket needs to be configured with website capabilities

          • Origin Domain: bucketName.s3-website-region.amazonaws.com

          • Configured as CustomOrigin

            Pros:

               • GET /foo/bar/ will serve /foo/bar/index.html

               • Returns 404 on file not found (if listBucket is public)

            Cons:

               • HTTP origin only

               • Files in the bucket need to be public

               • Files can be accessed by calling directly the S3-Website endpoint

-S3 API:

          • Origin Domain: bucketName.s3.amazonaws.com (or bucketName.s3.region.amazonaws.com)

          • Configured as S3Origin

            Pros:

               • HTTPS origin only

               • Files can be private and Cloudfront is granted access using OAC

            Cons:

               • GET /foo/bar/ returns 403 and not index.html

In both cases, we can overcome some of the cons:

-S3 Website:

          • Spoofing the referer header on the origin call and using an S3 policy to enforce this header minimizes the possibility of accessing the bucket directly

{
  "Version": "2012-10-17",
  "Id": "Cloudfront",
  "Statement": [
    {
      "Sid": "Cloudfront",
      "Effect": "Allow",
      "Principal": "*",
      "Action": "s3:GetObject",
      "Resource": "arn:aws:s3:::myBucket/*",
      "Condition": {
        "StringLike": {
          "aws:Referer": "AAAAAA"
        }
      }
    }
  ]
}

-S3 API:

          • Using Lambda@Edge, we can rewrite the origin URI to serve index.html (more details below).

S3 Multi-Region Access Points

Sadly, you won’t see it listed as a possible origin; you need to treat it as a custom HTTPS endpoint.

To allow access to the origin, you need to sign all the requests with AWS SigV4 yourself, using Lambda@Edge.

Custom HTTP(S) origins

For any HTTP/HTTPS origin, you can configure how Cloudfront connects to it:

  • Enforce HTTPS (or HTTP)
  • Custom ports
  • Timeouts
  • Additional Headers
  • Etc.

Multiple Origins and Behaviors

Cloudfront can handle multiple origins. The routing is based on the request path and is configured as behaviors.

Each behavior defines which origin to use and which cache policy applies.

It is not uncommon to have multiple behaviors sharing the same origin to adjust caching policies for different kinds of content.
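The routing can be pictured as an ordered list of path patterns, where the first match wins and the default behavior catches everything else. The sketch below assumes the usual wildcard semantics (* for any number of characters, ? for exactly one). The function names and the origin labels are hypothetical; the path patterns are the ones used later in this article.

```javascript
// Convert a path pattern with * and ? wildcards into an anchored RegExp.
function patternToRegExp(pattern) {
  var escaped = pattern.replace(/[.+^${}()|[\]\\]/g, '\\$&'); // escape regex metacharacters
  return new RegExp('^' + escaped.replace(/\*/g, '.*').replace(/\?/g, '.') + '$');
}

// Return the origin of the first behavior matching the path, else the default origin.
function selectOrigin(behaviors, defaultOrigin, path) {
  for (var i = 0; i < behaviors.length; i++) {
    if (patternToRegExp(behaviors[i].pathPattern).test(path)) {
      return behaviors[i].origin;
    }
  }
  return defaultOrigin;
}

var behaviors = [
  { pathPattern: '/blog/*', origin: 's3-website' },
  { pathPattern: '/assets/*', origin: 's3-api' }
];

console.log(selectOrigin(behaviors, 'api-gateway', '/blog/articles/')); // s3-website
console.log(selectOrigin(behaviors, 'api-gateway', '/assets/app.js'));  // s3-api
console.log(selectOrigin(behaviors, 'api-gateway', '/country'));        // api-gateway
```

Because the first match wins, more specific patterns should be listed before broader ones.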

Custom Domain

Cloudfront will always attach its own xxxxx.cloudfront.net domain to your distribution, but you can use your own domain by defining one or more domain aliases.

SSL Termination

Cloudfront handles SSL termination at the Edge. You can only attach a single SSL certificate. This certificate needs to list all domains used as aliases. Furthermore, your clients need to support SNI (all modern browsers do), or else you need to purchase dedicated IP addresses for your distribution.

You need to provision the certificate in us-east-1, which is an annoyance when working with Cloudformation and deploying to another region.

This allows you to serve HTTPS traffic to your client, but connect using HTTP to the origin (S3 Website) or HTTPS on another domain (ALB, API Gateway).

Cache Policies and Cache Key

Caching is controlled using cache policies. Each behavior has a cache policy.

The cache policy not only dictates what makes content vary but also sets the time an object should remain in the cache and if the response should be compressed or not. Values defined as Cache Key are automatically transferred to the origin.

Cloudfront provides a set of pre-defined cache policies:

-CachingOptimized: Ideal for S3 backends

          • Headers, Cookies, and Query String aren’t taken into consideration

          • Origin’s Max-Age is used, defaults to 1 day

          • Response is compressed

-CachingDisabled:

          • Headers, Cookies, and Query String aren’t taken into consideration

          • Origin’s Max-Age is overwritten to 0 (no cache)

          • Response is compressed

-MediaPackage: Ideal for MediaPackage origin

          • origin Header is used to vary content

          • Query String parameters aws.manifestfilter, start, end, m are used to vary content

          • Cookies aren’t taken into account

          • Origin’s Max-Age is used, defaults to 1 day

          • Response is compressed

You can create your own depending on your needs, some examples:

-Visitor’s Language:

          • Use the Browser’s Header Accept-Language

-Visitor’s Country:

          • Use the Header generated by Cloudfront based on the visitor’s IP: CloudFront-Viewer-Country

-API with pagination:

          • Use the Query String Parameter page

Cache policies aren’t linked to a specific distribution. You can re-use the same policy on different and unrelated distributions.
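As a rough mental model, a cache policy tells Cloudfront which parts of a request are concatenated into the cache key. The following sketch is a simplification and not the real key format: buildCacheKey and the request/policy shapes are assumptions for illustration only.

```javascript
// Hypothetical cache key builder: only values listed in the policy influence the key.
function buildCacheKey(request, policy) {
  var parts = [request.uri];
  policy.headers.forEach(function (name) {
    parts.push(name + '=' + (request.headers[name] || ''));
  });
  policy.queryStrings.forEach(function (name) {
    parts.push(name + '=' + (request.query[name] || ''));
  });
  return parts.join('|');
}

var policy = { headers: ['cloudfront-viewer-country'], queryStrings: ['page'] };

var a = buildCacheKey({
  uri: '/articles',
  headers: { 'cloudfront-viewer-country': 'PT', 'user-agent': 'curl/7.85.0' },
  query: { page: '1' }
}, policy);

var b = buildCacheKey({
  uri: '/articles',
  headers: { 'cloudfront-viewer-country': 'PT', 'user-agent': 'Mozilla/5.0' },
  query: { page: '1', utm_source: 'newsletter' }
}, policy);

console.log(a === b); // true: user-agent and utm_source are not part of the key
```

The fewer values you include in the key, the more requests share the same cached object and the better your hit ratio.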

Max-Age and Cache-Control

Cloudfront will use the origin’s Max-Age (Cache-Control) if it is in the bounds of min TTL and max TTL.

If the Origin’s TTL is outside the defined bounds, Cloudfront will overwrite it with min TTL or max TTL.

If the origin doesn’t set any caching rule, the Default TTL value will be used.
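The three rules above boil down to a clamp. Here is a minimal sketch, assuming the origin’s Max-Age has already been parsed into a number (or null when absent); effectiveTtl is a hypothetical helper name, and the TTL values are the ones from the Country policy used later in this article.

```javascript
// TTL (in seconds) Cloudfront would apply according to the rules described above.
function effectiveTtl(originMaxAge, policy) {
  // No caching rule from the origin: fall back to the default TTL
  if (originMaxAge === null || originMaxAge === undefined) {
    return policy.defaultTtl;
  }
  // Otherwise keep the origin value, clamped to [minTtl, maxTtl]
  return Math.min(Math.max(originMaxAge, policy.minTtl), policy.maxTtl);
}

var policy = { minTtl: 0, defaultTtl: 10, maxTtl: 3600 };

console.log(effectiveTtl(60, policy));    // 60   (within bounds)
console.log(effectiveTtl(86400, policy)); // 3600 (capped by max TTL)
console.log(effectiveTtl(null, policy));  // 10   (default TTL)
```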

Cloudfront’s behavior on TTL doesn’t affect the client's behavior. The Cache-Control header from the origin is sent unmodified to the client.

Cloudfront might evict your object before the Max-Age is reached, especially if the object isn’t requested often, but it will never keep it longer than that value.

The browser will also cache the object for the duration of Max-Age. To cache the object for a different duration in Cloudfront, you can combine Max-Age and s-maxage: Cloudfront will cache the object for the duration of s-maxage, and the browser for the duration of Max-Age. This is useful when using Cloudfront Functions to validate an access token: the browser re-validates access on each request, but the object doesn’t have to be fetched from the Origin again.
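To illustrate the split between the two directives, here is a small sketch that parses a Cache-Control header and reports which duration applies to the browser and which to a shared cache such as Cloudfront. The helper names are hypothetical and the parsing is deliberately simplified (no quoted values, no min/max TTL bounds).

```javascript
// Parse "max-age=0, s-maxage=86400" into { 'max-age': '0', 's-maxage': '86400' }.
function parseCacheControl(header) {
  var directives = {};
  header.split(',').forEach(function (part) {
    var pieces = part.trim().split('=');
    var name = pieces[0].toLowerCase();
    if (name) directives[name] = pieces.length > 1 ? pieces[1] : true;
  });
  return directives;
}

// s-maxage only applies to shared caches; the browser always uses max-age.
function ttlFor(header) {
  var d = parseCacheControl(header);
  var maxAge = d['max-age'] !== undefined ? parseInt(d['max-age'], 10) : null;
  var sMaxAge = d['s-maxage'] !== undefined ? parseInt(d['s-maxage'], 10) : null;
  return { browser: maxAge, cdn: sMaxAge !== null ? sMaxAge : maxAge };
}

console.log(ttlFor('max-age=0, s-maxage=86400')); // { browser: 0, cdn: 86400 }
console.log(ttlFor('max-age=60'));                // { browser: 60, cdn: 60 }
```

With the first header, the browser asks Cloudfront again on every request, while Cloudfront can answer from its cache for a day.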

Max-Age isn’t the only way to control caching; the Expires header can be used as well. For more in-depth details on the usage of Max-Age, refer to Cloudfront’s expiration documentation.

Extending with edge functions

Cloudfront isn’t limited to only fetching and caching statically defined resources. It gives you the ability to modify requests and responses in-flight using functions deployed and executed at the edge.

You are able to interact with the request/response in four different steps of the execution:

-viewer-request:

          • Triggered when Cloudfront receives the request from the viewer, before the cache lookup

          • Access to the first 40 KB of the body

          • A modified body can have a maximum size of 53.2 KB

          • Executed on each request

          • Generate a response and by-pass the Origin Request

          • Use cases:

               • Normalize headers

               • Normalize query string

               • Normalize cache keys

               • Authorization

-origin-request:

          • Triggered before Cloudfront calls the origin

          • Access to the first 1 MB of the body

          • A modified body can have a maximum size of 1.33 MB

          • Executed on cache miss only

          • Generate a response and by-pass the Origin Request

          • Use cases:

               • Rewriting path or host

               • Generate cached redirects

               • Generate cached static responses

-origin-response:

          • Triggered after Cloudfront receives the response from the origin, before it is cached

          • No access to the body

          • Body can be generated

          • Executed on cache miss only

          • Use cases:

               • Add headers

               • Change HTTP status

               • Replace body with static content

-viewer-response:

          • Triggered before Cloudfront sends the response to the client

          • No access to the body

          • Body can be generated

          • Executed on each response

          • Use cases:

               • Add dynamic CORS headers
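As an illustration of the last use case, here is a minimal sketch of a viewer-response handler that adds dynamic CORS headers, written in the style of a Cloudfront Function (one of the two options described next). The list of allowed origins is a made-up assumption, and the handler overwrites any existing vary header for simplicity.

```javascript
// Hypothetical allow-list; adapt it to the sites that are allowed to call you.
var allowedOrigins = ['https://example.org', 'https://www.example.org'];

function handler(event) {
  var request = event.request;
  var response = event.response;

  // The Origin header is only present on cross-origin requests
  var origin = request.headers.origin && request.headers.origin.value;

  if (origin && allowedOrigins.indexOf(origin) > -1) {
    // Echo the caller's origin instead of a static "*"
    response.headers['access-control-allow-origin'] = { value: origin };
    // Tell downstream caches that the response depends on the Origin header
    response.headers['vary'] = { value: 'Origin' };
  }
  return response;
}
```

Since viewer-response runs on every response, including cache hits, the header can differ per caller even though the cached object is the same.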

You have two solutions to interact with these requests/responses:

-Cloudfront Functions:

          • Executed on the Edge Location

          • Access to viewer-request and viewer-response only

          • Function can be deployed in any region. It’s a Cloudfront Object

          • Logs are stored in Cloudwatch Logs in us-east-1

          • No access to body

          • Geolocation Headers available

          • Runtime: javascript (ES 5.1) with some restrictions:

               • No custom ENV variables

               • No module includes

               • No timers

               • No network access

               • No File system access

               • Crypto and QueryString modules are built-in

          • Cost: $0.10 / 1M executions

          • Use cases:

               • Header normalization, manipulation

               • Query String normalization

               • Cache key generation

               • Path rewrite

               • Static Authorization

-Lambda@Edge:

          • Executed at the Regional Edge Cache

          • Access to all 4 events

          • Function needs to be deployed to us-east-1 as a Lambda Function

          • Logs are stored in Cloudwatch Logs in the region of the Regional Edge Cache

          • Limited access on request body

          • 5s timeout on viewer request/response

          • 30s timeout on origin request/response

          • Geolocation Headers available

          • Runtime: NodeJS or Python with some restrictions:

               • No custom ENV variables

               • No Lambda DLQ

               • No VPC

               • No Lambda Layers

               • No Lambda Container Images

          • Cost: $0.60 / 1M executions and additionally for the execution duration.

          • Use cases:

               • Authorization with OAuth endpoint

               • Database query

               • Body rewrite/validation

When to use which depends on the use case. If you don’t need to access network resources, Cloudfront Functions is the right choice. Even if they are executed on each request (viewer-request), you would need a very good hit ratio to get cheaper executions with Lambda@Edge (when executed only on cache miss on origin-request).
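To put a number on “a very good hit ratio”, here is a back-of-the-envelope sketch using only the per-request prices listed above. It ignores the Lambda@Edge duration charge and any free tier, so the real break-even point is higher than what this computes; the function names are hypothetical.

```javascript
var CF_FUNCTION_PER_MILLION = 0.10; // executed on every request (viewer-request)
var LAMBDA_EDGE_PER_MILLION = 0.60; // executed on cache misses only (origin-request)

// Request cost in USD for a given traffic volume and cache hit ratio (0..1).
function requestCost(requests, hitRatio) {
  var millions = requests / 1e6;
  return {
    cloudfrontFunction: millions * CF_FUNCTION_PER_MILLION,
    lambdaAtEdge: millions * (1 - hitRatio) * LAMBDA_EDGE_PER_MILLION
  };
}

// Hit ratio above which Lambda@Edge on origin-request becomes cheaper.
function breakEvenHitRatio() {
  return 1 - CF_FUNCTION_PER_MILLION / LAMBDA_EDGE_PER_MILLION;
}

console.log(breakEvenHitRatio().toFixed(3)); // 0.833
console.log(requestCost(100e6, 0.5));        // Lambda@Edge is about 3x more expensive
console.log(requestCost(100e6, 0.9));        // Lambda@Edge is cheaper (before duration cost)
```

In other words, more than 5 out of 6 requests must be served from the cache before Lambda@Edge even starts to compete on request price alone.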

Furthermore, the need to deploy Lambda@Edge to us-east-1 makes it a cumbersome task when you want to define your stack in a single Cloudformation template deployed to any other region than us-east-1.

If you need to access network resources or need more memory and time to execute your code, then Lambda@Edge is the right choice.

If neither of these solutions works for you, you still can deploy Lambda and access it via API Gateway or Lambda function URL deployed in one or multiple regions and cache the result in Cloudfront.

Event structure

Events for Lambda@Edge or Cloudfront Functions differ in some parts, mainly how headers are represented.

Viewer Request

Lambda@Edge

{
  "Records": [
    {
      "cf": {
        "config": {
          "distributionDomainName": "d111111abcdef8.cloudfront.net",
          "distributionId": "EDFDVBD6EXAMPLE",
          "eventType": "viewer-request",
          "requestId": "4TyzHTaYWb1GX1qTfsHhEqV6HUDd_BzoBZnwfnvQc_1oF26ClkoUSEQ=="
        },
        "request": {
          "clientIp": "203.0.113.178",
          "headers": {
            "host": [
              {
                "key": "Host",
                "value": "d111111abcdef8.cloudfront.net"
              }
            ],
            "user-agent": [
              {
                "key": "User-Agent",
                "value": "curl/7.66.0"
              }
            ],
            "accept": [
              {
                "key": "accept",
                "value": "*/*"
              }
            ]
          },
          "method": "GET",
          "querystring": "",
          "uri": "/"
        }
      }
    }
  ]
}

Cloudfront Function

{
    "version": "1.0",
    "context": {
        "distributionDomainName": "d111111abcdef8.cloudfront.net",
        "distributionId": "EDFDVBD6EXAMPLE",
        "eventType": "viewer-request",
        "requestId": "4TyzHTaYWb1GX1qTfsHhEqV6HUDd_BzoBZnwfnvQc_1oF26ClkoUSEQ=="
    },
    "viewer": {
        "ip": "203.0.113.178"
    },
    "request": {
        "method": "GET",
        "uri": "/",
        "querystring": {},
        "headers": {
            "host": {
                "value": "d111111abcdef8.cloudfront.net"
            },
            "user-agent": {
                "value": "curl/7.85.0"
            },
            "accept": {
                "value": "*/*"
            }
        },
        "cookies": {}
    }
}
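The two viewer-request events above show the main difference: Lambda@Edge represents each header as an array of key/value objects, while Cloudfront Functions use a single object with a value property. If you share code between both, a small accessor can hide that difference. This is only a sketch; headerValue is a hypothetical helper that returns the first value only.

```javascript
// Read a header value from either event shape.
function headerValue(headers, name) {
  var entry = headers[name.toLowerCase()]; // header names are lowercase keys in both shapes
  if (!entry) return undefined;
  if (Array.isArray(entry)) {
    // Lambda@Edge: [{ key: 'Host', value: '...' }]
    return entry.length > 0 ? entry[0].value : undefined;
  }
  // Cloudfront Functions: { value: '...' }
  return entry.value;
}

var lambdaEdgeHeaders = {
  host: [{ key: 'Host', value: 'd111111abcdef8.cloudfront.net' }]
};
var cloudfrontFunctionHeaders = {
  host: { value: 'd111111abcdef8.cloudfront.net' }
};

console.log(headerValue(lambdaEdgeHeaders, 'Host'));         // d111111abcdef8.cloudfront.net
console.log(headerValue(cloudfrontFunctionHeaders, 'Host')); // d111111abcdef8.cloudfront.net
```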

Origin Request

Lambda@Edge

{
  "Records": [
    {
      "cf": {
        "config": {
          "distributionDomainName": "d111111abcdef8.cloudfront.net",
          "distributionId": "EDFDVBD6EXAMPLE",
          "eventType": "origin-request",
          "requestId": "4TyzHTaYWb1GX1qTfsHhEqV6HUDd_BzoBZnwfnvQc_1oF26ClkoUSEQ=="
        },
        "request": {
          "clientIp": "203.0.113.178",
          "headers": {
            "x-forwarded-for": [
              {
                "key": "X-Forwarded-For",
                "value": "203.0.113.178"
              }
            ],
            "user-agent": [
              {
                "key": "User-Agent",
                "value": "Amazon CloudFront"
              }
            ],
            "via": [
              {
                "key": "Via",
                "value": "2.0 2afae0d44e2540f472c0635ab62c232b.cloudfront.net (CloudFront)"
              }
            ],
            "host": [
              {
                "key": "Host",
                "value": "example.org"
              }
            ],
            "cache-control": [
              {
                "key": "Cache-Control",
                "value": "no-cache, cf-no-cache"
              }
            ]
          },
          "method": "GET",
          "origin": {
            "custom": {
              "customHeaders": {},
              "domainName": "example.org",
              "keepaliveTimeout": 5,
              "path": "",
              "port": 443,
              "protocol": "https",
              "readTimeout": 30,
              "sslProtocols": [
                "TLSv1",
                "TLSv1.1",
                "TLSv1.2"
              ]
            }
          },
          "querystring": "",
          "uri": "/"
        }
      }
    }
  ]
}

Origin Response

Lambda@Edge

{
  "Records": [
    {
      "cf": {
        "config": {
          "distributionDomainName": "d111111abcdef8.cloudfront.net",
          "distributionId": "EDFDVBD6EXAMPLE",
          "eventType": "origin-response",
          "requestId": "4TyzHTaYWb1GX1qTfsHhEqV6HUDd_BzoBZnwfnvQc_1oF26ClkoUSEQ=="
        },
        "request": {
          "clientIp": "203.0.113.178",
          "headers": {
            "x-forwarded-for": [
              {
                "key": "X-Forwarded-For",
                "value": "203.0.113.178"
              }
            ],
            "user-agent": [
              {
                "key": "User-Agent",
                "value": "Amazon CloudFront"
              }
            ],
            "via": [
              {
                "key": "Via",
                "value": "2.0 8f22423015641505b8c857a37450d6c0.cloudfront.net (CloudFront)"
              }
            ],
            "host": [
              {
                "key": "Host",
                "value": "example.org"
              }
            ],
            "cache-control": [
              {
                "key": "Cache-Control",
                "value": "no-cache, cf-no-cache"
              }
            ]
          },
          "method": "GET",
          "origin": {
            "custom": {
              "customHeaders": {},
              "domainName": "example.org",
              "keepaliveTimeout": 5,
              "path": "",
              "port": 443,
              "protocol": "https",
              "readTimeout": 30,
              "sslProtocols": [
                "TLSv1",
                "TLSv1.1",
                "TLSv1.2"
              ]
            }
          },
          "querystring": "",
          "uri": "/"
        },
        "response": {
          "headers": {
            "access-control-allow-credentials": [
              {
                "key": "Access-Control-Allow-Credentials",
                "value": "true"
              }
            ],
            "access-control-allow-origin": [
              {
                "key": "Access-Control-Allow-Origin",
                "value": "*"
              }
            ],
            "date": [
              {
                "key": "Date",
                "value": "Mon, 13 Jan 2020 20:12:38 GMT"
              }
            ],
            "referrer-policy": [
              {
                "key": "Referrer-Policy",
                "value": "no-referrer-when-downgrade"
              }
            ],
            "server": [
              {
                "key": "Server",
                "value": "ExampleCustomOriginServer"
              }
            ],
            "x-content-type-options": [
              {
                "key": "X-Content-Type-Options",
                "value": "nosniff"
              }
            ],
            "x-frame-options": [
              {
                "key": "X-Frame-Options",
                "value": "DENY"
              }
            ],
            "x-xss-protection": [
              {
                "key": "X-XSS-Protection",
                "value": "1; mode=block"
              }
            ],
            "content-type": [
              {
                "key": "Content-Type",
                "value": "text/html; charset=utf-8"
              }
            ],
            "content-length": [
              {
                "key": "Content-Length",
                "value": "9593"
              }
            ]
          },
          "status": "200",
          "statusDescription": "OK"
        }
      }
    }
  ]
}

Viewer Response

Lambda@Edge

{
  "Records": [
    {
      "cf": {
        "config": {
          "distributionDomainName": "d111111abcdef8.cloudfront.net",
          "distributionId": "EDFDVBD6EXAMPLE",
          "eventType": "viewer-response",
          "requestId": "4TyzHTaYWb1GX1qTfsHhEqV6HUDd_BzoBZnwfnvQc_1oF26ClkoUSEQ=="
        },
        "request": {
          "clientIp": "203.0.113.178",
          "headers": {
            "host": [
              {
                "key": "Host",
                "value": "d111111abcdef8.cloudfront.net"
              }
            ],
            "user-agent": [
              {
                "key": "User-Agent",
                "value": "curl/7.66.0"
              }
            ],
            "accept": [
              {
                "key": "accept",
                "value": "*/*"
              }
            ]
          },
          "method": "GET",
          "querystring": "",
          "uri": "/"
        },
        "response": {
          "headers": {
            "access-control-allow-credentials": [
              {
                "key": "Access-Control-Allow-Credentials",
                "value": "true"
              }
            ],
            "access-control-allow-origin": [
              {
                "key": "Access-Control-Allow-Origin",
                "value": "*"
              }
            ],
            "date": [
              {
                "key": "Date",
                "value": "Mon, 13 Jan 2020 20:14:56 GMT"
              }
            ],
            "referrer-policy": [
              {
                "key": "Referrer-Policy",
                "value": "no-referrer-when-downgrade"
              }
            ],
            "server": [
              {
                "key": "Server",
                "value": "ExampleCustomOriginServer"
              }
            ],
            "x-content-type-options": [
              {
                "key": "X-Content-Type-Options",
                "value": "nosniff"
              }
            ],
            "x-frame-options": [
              {
                "key": "X-Frame-Options",
                "value": "DENY"
              }
            ],
            "x-xss-protection": [
              {
                "key": "X-XSS-Protection",
                "value": "1; mode=block"
              }
            ],
            "age": [
              {
                "key": "Age",
                "value": "2402"
              }
            ],
            "content-type": [
              {
                "key": "Content-Type",
                "value": "text/html; charset=utf-8"
              }
            ],
            "content-length": [
              {
                "key": "Content-Length",
                "value": "9593"
              }
            ]
          },
          "status": "200",
          "statusDescription": "OK"
        }
      }
    }
  ]
}

Cloudfront Function

{
    "version": "1.0",
    "context": {
        "distributionDomainName": "d111111abcdef8.cloudfront.net",
        "distributionId": "EDFDVBD6EXAMPLE",
        "eventType": "viewer-response",
        "requestId": "MmxS5hbDhc9VyOIqzmYksKesOj6n_54ycCBX4XCS5-w7OJJ5wloOAA=="
    },
    "viewer": {
        "ip": "203.0.113.178"
    },
    "request": {
        "method": "GET",
        "uri": "/",
        "querystring": {},
        "headers": {
            "host": {
                "value": "d111111abcdef8.cloudfront.net"
            },
            "user-agent": {
                "value": "curl/7.85.0"
            },
            "accept": {
                "value": "*/*"
            }
        },
        "cookies": {}
    },
    "response": {
        "statusCode": 200,
        "statusDescription": "OK",
        "headers": {
            "date": {
                "value": "Fri, 25 Nov 2022 12:33:42 GMT"
            },
            "last-modified": {
                "value": "Fri, 25 Nov 2022 12:31:12 GMT"
            },
            "etag": {
                "value": "\"b9c2e628c3ffe65db36c4d92c9aebbb3\""
            },
            "accept-ranges": {
                "value": "bytes"
            },
            "server": {
                "value": "AmazonS3"
            },
            "via": {
                "value": "1.1 dfeaaa9951aa7df30bdb3dfb8a94470a.cloudfront.net (CloudFront)"
            },
            "age": {
                "value": "82"
            },
            "content-type": {
                "value": "text/html"
            },
            "content-length": {
                "value": "109"
            }
        },
        "cookies": {}
    }
}

Less conventional usages

Networking Router

Even if you don’t need the caching functionalities of Cloudfront (POST requests or disabling cache on GET), you can still use Cloudfront to act as a Networking Router, by sending your visitor traffic to the nearest edge. From the edge to the origin, the performant AWS Backbone will be used instead of your ISP’s peering.

By using multiple behaviors, you can route your traffic to different backends that don’t need to be in the same region or even inside AWS.

As an example, S3 Transfer Acceleration uses Cloudfront as the endpoint, forcing the networking path to the nearest Edge Location and leveraging the AWS backbone to reach the bucket.

Firewall

In addition to using AWS WAF with Cloudfront to protect your Origin application, Cloudfront also provides default DDoS protection. You can also deny access to visitors from specific countries.

By adding Origin Shield, you add an additional layer of caching between Cloudfront and your origin, reducing the need for Cloudfront to retrieve content from your origin. This helps to reduce the load on your origin and improves the CDN hit ratio (and therefore download speed) for your visitors.

Application Server

By using Lambda@Edge functions, you can directly query databases like DynamoDB and apply business logic to it, without needing any origin (you still need to configure a dummy one, Cloudfront doesn’t allow you to be origin-less).

Combining this with DynamoDB Global Tables, you can always query a table near your edge, making your application performant and reliable.

Web Server

As seen through all the examples mentioned in this article, Cloudfront can be seen as “just” an HTTP server in front of your application.

Using functions (or Lambda@Edge) you can return redirections or static content without the need for any backend.

By using multiple behaviors, you can route your traffic to different types of backends.
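As an example of answering without any backend, a viewer-request function can return a response object instead of the request, in which case the origin is never contacted. Below is a minimal sketch of a redirect; the /old-blog and /blog paths are hypothetical.

```javascript
function handler(event) {
  var request = event.request;
  var oldPrefix = '/old-blog';

  // Redirect /old-blog and everything below it to /blog
  if (request.uri === oldPrefix || request.uri.indexOf(oldPrefix + '/') === 0) {
    return {
      statusCode: 301,
      statusDescription: 'Moved Permanently',
      headers: {
        location: { value: '/blog' + request.uri.substring(oldPrefix.length) }
      }
    };
  }

  // Anything else continues to the cache and, on a miss, to the origin
  return request;
}
```

Returning the request keeps the normal flow; returning an object with a statusCode short-circuits it.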

Real World Use Cases

You can find the template used for the below examples on Github.

The template provides a distribution with different backends:

-S3 Website:

          • /blog/*

          • /private/*

-S3 API:

          • /assets/*

          • /html/*

          • /airport/*

-API Gateway:

          • /*

Cache based on visitor’s country

  • Backend: API Gateway + Lambda
  • Edge function: None
  • Policy: Whitelist cloudfront-viewer-country

Type: AWS::CloudFront::CachePolicy
Properties:
  CachePolicyConfig:
    DefaultTTL: 10
    MinTTL: 0
    MaxTTL: 3600
    Name: Country
    ParametersInCacheKeyAndForwardedToOrigin:
      CookiesConfig:
        CookieBehavior: none
      EnableAcceptEncodingBrotli: true
      EnableAcceptEncodingGzip: true
      HeadersConfig:
        HeaderBehavior: whitelist
        Headers:
          - cloudfront-viewer-country
      QueryStringsConfig:
        QueryStringBehavior: none


Request:

curl -v 'https://d3h57w0cnyb350.cloudfront.net/country'

> GET /country HTTP/2
> Host: d3h57w0cnyb350.cloudfront.net
> user-agent: curl/7.85.0
> accept: */*

< HTTP/2 200
< content-type: application/json
< content-length: 2
< date: Tue, 06 Dec 2022 17:58:23 GMT
< x-amz-cf-pop: LIS50-C1

Lambda Event:

{
    "version": "2.0",
    "routeKey": "GET /country",
    "rawPath": "/country",
    "rawQueryString": "",
    "headers": {
        "accept-encoding": "br,gzip",
        "cloudfront-viewer-country": "PT",
        "content-length": "0",
        "host": "wvo7t33pz3.execute-api.eu-central-1.amazonaws.com",
        "user-agent": "Amazon CloudFront",
        "via": "2.0 592fdb72142153f4ac204b48e22d9036.cloudfront.net (CloudFront)",
        "x-amz-cf-id": "kz-vrWiS6p5lg9ZjGRCY7Xuwg2gmr5utkrJLaU62Leol8ApRIzL4nw==",
        "x-amzn-trace-id": "Root=1-638f8250-33b4ecae5ea02e382b6d5d8b",
        "x-forwarded-for": "X.X.X.X",
        "x-forwarded-port": "443",
        "x-forwarded-proto": "https"
    },
    "requestContext": {
        "accountId": "688589788262",
        "apiId": "wvo7t33pz3",
        "domainName": "wvo7t33pz3.execute-api.eu-central-1.amazonaws.com",
        "domainPrefix": "wvo7t33pz3",
        "http": {
            "method": "GET",
            "path": "/country",
            "protocol": "HTTP/1.1",
            "sourceIp": "X.X.X.X",
            "userAgent": "Amazon CloudFront"
        },
        "requestId": "cvFMihZBliAEPSw=",
        "routeKey": "GET /country",
        "stage": "$default",
        "time": "06/Dec/2022:17:56:32 +0000",
        "timeEpoch": 1670349392038
    },
    "isBase64Encoded": false
}
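The backend only has to read the cloudfront-viewer-country header from the event shown above. The actual Lambda from the template isn’t reproduced in this article, so the handler below is a hypothetical sketch of what such a backend could look like, including an assumed JSON response body.

```javascript
// Build the HTTP response from an API Gateway (payload version 2.0) event.
function buildResponse(event) {
  var headers = event.headers || {};
  // Header added by Cloudfront because it is whitelisted in the cache policy
  var country = headers['cloudfront-viewer-country'] || 'unknown';
  return {
    statusCode: 200,
    headers: { 'content-type': 'application/json' },
    body: JSON.stringify({ country: country })
  };
}

// Lambda entry point (hypothetical name and wiring)
async function handler(event) {
  return buildResponse(event);
}

console.log(buildResponse({ headers: { 'cloudfront-viewer-country': 'PT' } }).body);
// {"country":"PT"}
```

Because the header is part of the cache key, all visitors from the same country share one cached response, and the Lambda is only invoked once per country and TTL window.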

Rewrite URI to serve index.html

Nobody wants to type '/index.html' when calling a URL. Apache, Nginx, and S3-Website load 'index.html' automatically when no document is passed in the URI. When using an S3-API backend, we need to provide this functionality ourselves. Cloudfront is only able to load 'index.html' at the root, which is generally enough for a SPA, but not for a statically generated site.

S3-Website

  • Backend: S3-Website
  • Edge function: None
  • Policy: Managed-CachingOptimized

Request:

curl -v 'https://d3h57w0cnyb350.cloudfront.net/blog/articles/'

< HTTP/2 200
< content-type: text/html

'index.html' is returned because the backend is an S3-Website, which has the logic to serve index.html when no document name is provided.

S3-API without function

  • Backend: S3-API
  • Edge function: None
  • Policy: Managed-CachingOptimized

Request:

curl -v 'https://d3h57w0cnyb350.cloudfront.net/assets/articles/'

< HTTP/2 403
< content-type: application/xml

An error is returned. The document '/assets/articles/' doesn’t exist in S3.

S3-API with function

  • Backend: S3-API
  • Edge function: viewer-request Cloudfront Function
  • Policy: Managed-CachingOptimized

function:

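function handler(event) {
  var request = event.request;
  var uri = request.uri;
  // "/foo/bar/" becomes "/foo/bar/index.html"
  if (uri.endsWith('/')) {
    request.uri += 'index.html';
  }
  // "/foo/bar" (no file extension) becomes "/foo/bar/index.html"
  else if (!uri.includes('.')) {
    request.uri += '/index.html';
  }
  return request;
}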

Request:

curl -v 'https://d3h57w0cnyb350.cloudfront.net/html/articles/'

< HTTP/2 200
< content-type: text/html

'index.html' is returned because we rewrite the incoming URI to append 'index.html' to the request, so that an existing key can be fetched from S3.

Serve localized content

Using the browser’s 'accept-language' header, we return the content in the language requested by the viewer. The URL is the same regardless of the language, but the response needs to be cached per language. Since this header can take many forms (for example 'fr', 'fr-CH', or 'fr-CH, fr;q=0.9'), we normalize it to increase our hit ratio.

  • Backend: S3-API
  • Edge function: viewer-request Cloudfront Function
  • Policy: Whitelist x-locale

function:

function handler(event) {
  /*
  * Rewrite URI to serve index.html when missing
  **/
  var request = event.request

  if (request.uri.endsWith('/')) {
    request.uri += 'index.html';
  }
  else if (!request.uri.includes('.')) {
    request.uri += '/index.html';
  }

  /*
  * Set default locale
  **/
  var locale = 'en'
  var translations = ['fr', 'de']

  /*
  * Parse accept-language and extract the first value for simplicity
  * We should use all languages and use the first for which we have a translation
  **/
  if (request.headers['accept-language'] && request.headers['accept-language'].value) {
    var language = request.headers['accept-language'].value.split(',')[0].split(';')[0].substring(0,2).toLowerCase()
    if (translations.indexOf(language) > -1) {
      locale = language
    }
  }

  /*
  * Create an x-locale header to be used as part of the cache key
  **/
  request.headers['x-locale'] = {
    value: locale
  }
  /**
   * Rewrite the URI
   */
  request.uri = request.uri.replace('locale/', `locale/${locale}/`)
  return request;
}

Default (English):

curl https://d3h57w0cnyb350.cloudfront.net/html/locale/

< x-cache: Miss from cloudfront
<html>
  <head>
    <title>English</title>
  </head>
  <body>
    <h1>English content</h1>
  </body>
</html>

French:

curl -H "Accept-Language: fr" https://d3h57w0cnyb350.cloudfront.net/html/locale/

< x-cache: Miss from cloudfront
<html>
  <head>
    <title>French</title>
  </head>
  <body>
    <h1>French content</h1>
  </body>
</html>

Swiss-French:

curl -H "Accept-Language: fr-CH" https://d3h57w0cnyb350.cloudfront.net/html/locale/

< x-cache: Hit from cloudfront
<html>
  <head>
    <title>French</title>
  </head>
  <body>
    <h1>French content</h1>
  </body>
</html>

Even with a different 'Accept-Language' header, the call for fr-CH is a Hit. Both languages are normalized to fr.
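
The comment in the function notes that all requested languages should be considered, not only the first one. A possible way to do this is sketched below, assuming Node.js for local testing; `parseAcceptLanguage` and `pickLocale` are hypothetical helper names, and the q-value handling is a simplified reading of the header format rather than a complete parser.

```javascript
// Languages for which translated content exists (taken from the function above).
var translations = ['fr', 'de'];
var defaultLocale = 'en';

// Parse an Accept-Language value into primary language codes ordered by q-value.
function parseAcceptLanguage(value) {
  return value
    .split(',')
    .map(function (part, index) {
      var pieces = part.trim().split(';');
      var q = 1; // a missing q parameter means full preference
      for (var i = 1; i < pieces.length; i++) {
        var param = pieces[i].trim();
        if (param.indexOf('q=') === 0) {
          var parsed = parseFloat(param.substring(2));
          q = isNaN(parsed) ? 0 : parsed;
        }
      }
      return { language: pieces[0].trim().substring(0, 2).toLowerCase(), q: q, index: index };
    })
    // Drop wildcards, malformed entries, and languages explicitly refused with q=0.
    .filter(function (entry) { return entry.language.length === 2 && entry.q > 0; })
    // Highest q first; ties keep the order sent by the browser.
    .sort(function (a, b) { return b.q - a.q || a.index - b.index; })
    .map(function (entry) { return entry.language; });
}

// Return the first requested language we can serve, or the default locale.
function pickLocale(value) {
  if (!value) return defaultLocale;
  var languages = parseAcceptLanguage(value);
  for (var i = 0; i < languages.length; i++) {
    if (languages[i] === defaultLocale || translations.indexOf(languages[i]) > -1) {
      return languages[i];
    }
  }
  return defaultLocale;
}

console.log(pickLocale('fr-CH, fr;q=0.9, en;q=0.8'));        // fr
console.log(pickLocale('es-ES,es;q=0.9,de;q=0.8,en;q=0.7')); // de
console.log(pickLocale(undefined));                          // en
```

With the second header, the first-value approach falls back to English because Spanish is not translated, while this variant serves German, the viewer's next preference.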

Protect with password

To showcase how password protection works, we will use “Basic Authentication”. In a more realistic setup, we would validate a JWT. In both cases, the secret (the expected credentials or the signing key) is stored inside the function’s code. If this is a security concern, you would need to use a Lambda@Edge function and validate the token by calling the Authorization Service.

  • Backend: S3-Website
  • Edge function: viewer-request Cloudfront Function
  • Policy: managed-cacheOptimized

function:

function handler(event) {
  var authHeaders = event.request.headers.authorization
  /**
   * Authorization string is sent by the browser as base64Encode(username:password).
   * base64Encode('private:private')='cHJpdmF0ZTpwcml2YXRl'
   */
  var expected = "Basic cHJpdmF0ZTpwcml2YXRl"
  if (authHeaders && authHeaders.value === expected) {
    return event.request
  }

  var response = {
    statusCode: 401,
    statusDescription: "Unauthorized",
    headers: {
      "www-authenticate": {
        value: "Basic realm='Enter your credentials'"
      }
    }
  }
  return response
}

With a valid authentication, the request is returned, instructing Cloudfront to continue by sending the request to the origin.

Without any valid authentication, a response is returned, instructing Cloudfront to not call the origin and directly return the response to the client.

curl -v https://d3h57w0cnyb350.cloudfront.net/private/

< HTTP/2 401
< www-authenticate: Basic realm='Enter your credentials'
< x-cache: FunctionGeneratedResponse from cloudfront

curl -v https://private:private@d3h57w0cnyb350.cloudfront.net/private/

> authorization: Basic cHJpdmF0ZTpwcml2YXRl

< HTTP/2 200
< content-type: text/html
<html>
  <head>
    <title>Private</title>
  </head>
  <body>
    <h1>Private content</h1>
  </body>
</html>
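
The JWT validation mentioned at the start of this section could look roughly like the sketch below. It is written for Node.js so that it can be run locally; the Cloudfront Functions runtime differs (for instance, it has no `Buffer`), so the code would need adapting there. The names `signToken` and `verifyToken`, the HS256 algorithm, and the `demo-secret` key are assumptions made for illustration.

```javascript
const crypto = require('crypto');

// Base64url helpers: JWTs use base64 without padding and with - and _ characters.
function toBase64Url(buffer) {
  return buffer.toString('base64').replace(/\+/g, '-').replace(/\//g, '_').replace(/=+$/, '');
}
function fromBase64Url(text) {
  return Buffer.from(text.replace(/-/g, '+').replace(/_/g, '/'), 'base64');
}

function hmac(data, secret) {
  return toBase64Url(crypto.createHmac('sha256', secret).update(data).digest());
}

// Hypothetical helper, used only to create a token for the demo below.
function signToken(payload, secret) {
  const header = toBase64Url(Buffer.from(JSON.stringify({ alg: 'HS256', typ: 'JWT' })));
  const body = toBase64Url(Buffer.from(JSON.stringify(payload)));
  return `${header}.${body}.${hmac(`${header}.${body}`, secret)}`;
}

// Returns the payload when the token is valid and not expired, otherwise null.
function verifyToken(token, secret, nowSeconds) {
  const parts = String(token).split('.');
  if (parts.length !== 3) return null;

  const expected = Buffer.from(hmac(`${parts[0]}.${parts[1]}`, secret));
  const received = Buffer.from(parts[2]);
  // timingSafeEqual throws on different lengths, so compare lengths first.
  if (expected.length !== received.length || !crypto.timingSafeEqual(expected, received)) {
    return null;
  }

  try {
    const header = JSON.parse(fromBase64Url(parts[0]).toString('utf8'));
    const payload = JSON.parse(fromBase64Url(parts[1]).toString('utf8'));
    if (header.alg !== 'HS256') return null;
    if (typeof payload.exp !== 'number' || payload.exp <= nowSeconds) return null;
    return payload;
  } catch (e) {
    return null;
  }
}

const secret = 'demo-secret'; // assumption: a shared secret embedded in the function
const now = Math.floor(Date.now() / 1000);
const token = signToken({ sub: 'private', exp: now + 60 }, secret);

console.log(verifyToken(token, secret, now) !== null);         // true
console.log(verifyToken(token, 'wrong-secret', now) !== null); // false
console.log(verifyToken(token, secret, now + 120) !== null);   // false (expired)
```

As with Basic Authentication, a viewer-request function would return the request when `verifyToken` succeeds and a 401 response otherwise.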

Fetch content from DynamoDB

We use a Lambda@Edge function on the origin-request event. We fetch data from a DynamoDB Table and return the result, bypassing the request to the origin.

By using origin-request, we can cache the response in Cloudfront and trigger the function only when the result isn’t already cached in Cloudfront.

  • Backend: Any (not used but must be provided)
  • Edge function: origin-request Lambda@Edge
  • Policy: managed-cacheOptimized

function:

import type { CloudFrontRequestEvent, CloudFrontResponseResult } from 'aws-lambda'
import aws from 'aws-sdk'
import https from 'https'

const documentClient = new aws.DynamoDB.DocumentClient({
  region: 'eu-central-1',
  httpOptions: {
    agent: new https.Agent({
      keepAlive: true,
    }),
  },
})

export const handler = async (event: CloudFrontRequestEvent): Promise<CloudFrontResponseResult> => {
  const request = event.Records[0].cf.request

  try {
    const code = request.uri.split('/').slice(-1)[0]

    const data = await documentClient
      .get({
        TableName: 'demoContent',
        Key: {
          code,
        },
      })
      .promise()
    if (!(data && data.Item && data.Item.name)) {
      return notFound
    }
    return {
      status: '200',
      statusDescription: 'OK',
      body: JSON.stringify(data.Item),
      headers: {
        'content-type': [
          {
            value: 'application/json',
          },
        ],
        'cache-control': [
          {
            value: 'max-age=120',
          },
        ],
      },
    }
  } catch {
    return notFound
  }
}

const notFound = {
  status: '404',
  statusDescription: 'Not Found',
  headers: {
    'content-type': [
      {
        value: 'application/json',
      },
    ],
    'cache-control': [
      {
        value: 'max-age=10',
      },
    ],
  },
  body: JSON.stringify({ message: 'Object Not Found' }),
}

Request:

curl -v "https://d3h57w0cnyb350.cloudfront.net/airport/gva"

< x-cache: Miss from cloudfront

curl -v "https://d3h57w0cnyb350.cloudfront.net/airport/gva"

< x-cache: Hit from cloudfront
{
  "city":"Geneva",
  "code":"gva",
  "name":"Geneva International Airport",
  "country":"Switzerland"
}
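
To experiment with this flow without AWS access, the lookup can be stubbed. The sketch below is a plain JavaScript approximation of the handler above, assuming Node.js; the in-memory `table`, the `jsonResponse` helper, and `fakeEvent` are illustrative stand-ins and not part of the original function.

```javascript
// Stand-in for the DynamoDB table: an in-memory map (assumption for local testing).
const table = new Map([
  ['gva', { city: 'Geneva', code: 'gva', name: 'Geneva International Airport', country: 'Switzerland' }],
]);

// Builds a generated response in the shape used by the Lambda@Edge function above.
function jsonResponse(status, statusDescription, body, maxAge) {
  return {
    status: String(status),
    statusDescription,
    headers: {
      'content-type': [{ value: 'application/json' }],
      'cache-control': [{ value: `max-age=${maxAge}` }],
    },
    body: JSON.stringify(body),
  };
}

async function handler(event) {
  const request = event.Records[0].cf.request;
  // The last path segment is the lookup key, e.g. /airport/gva -> gva
  const code = request.uri.split('/').slice(-1)[0];
  const item = table.get(code);
  if (!item || !item.name) {
    // Cache misses only briefly so that new items become visible quickly.
    return jsonResponse(404, 'Not Found', { message: 'Object Not Found' }, 10);
  }
  return jsonResponse(200, 'OK', item, 120);
}

// Minimal fake origin-request event containing only the field the handler reads.
const fakeEvent = (uri) => ({ Records: [{ cf: { request: { uri } } }] });

handler(fakeEvent('/airport/gva')).then((res) => console.log(res.status, res.body));
handler(fakeEvent('/airport/xyz')).then((res) => console.log(res.status, res.body));
```

The different 'max-age' values mirror the original function: found items are cached for 120 seconds, missing ones for 10 seconds.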

Serve alternate content in case of error

When the origin is unreachable, overloaded, or simply unable to handle the request, we redirect the client to an alternate site instead of showing the error.

  • Backend: Any
  • Edge function: origin-response Lambda@Edge
  • Policy: managed-cacheOptimized

function:

import type { CloudFrontResponseEvent, CloudFrontResponseResult } from 'aws-lambda'

export const handler = async (
  event: CloudFrontResponseEvent,
): Promise<CloudFrontResponseResult> => {
  const response = event.Records[0].cf.response
  const request = event.Records[0].cf.request

  try {
    const status = parseInt(response.status, 10)
    if (status >= 400 && status <= 599) {
      return redirect(request.uri)
    } else {
      return response
    }
  } catch (e) {
    // eslint-disable-next-line no-console
    console.error(e as Error)

    return redirect(request.uri)
  }
}

const redirect = (uri: string): CloudFrontResponseResult => {
  return {
    status: '307',
    statusDescription: 'Temporary Redirect',
    headers: {
      location: [
        {
          value: `https://alt.example.com${uri}`,
        },
      ],
    },
  }
}
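
The behaviour of this function can be checked locally with fake events. The following is a minimal sketch in plain JavaScript, assuming Node.js; `fakeEvent` is a hypothetical helper that builds only the fields the handler reads.

```javascript
// Plain JavaScript version of the origin-response logic above, runnable locally.
function redirect(uri) {
  return {
    status: '307',
    statusDescription: 'Temporary Redirect',
    headers: { location: [{ value: `https://alt.example.com${uri}` }] },
  };
}

async function handler(event) {
  const { request, response } = event.Records[0].cf;
  const status = parseInt(response.status, 10);
  // Any 4xx or 5xx from the origin is replaced by a redirect to the alternate site.
  return status >= 400 && status <= 599 ? redirect(request.uri) : response;
}

// Hypothetical helper: a minimal origin-response event.
const fakeEvent = (uri, status) => ({
  Records: [{ cf: { request: { uri }, response: { status: String(status), headers: {} } } }],
});

handler(fakeEvent('/blog/', 503)).then((res) =>
  console.log(res.status, res.headers.location[0].value)); // 307 https://alt.example.com/blog/
handler(fakeEvent('/blog/', 200)).then((res) =>
  console.log(res.status)); // 200
```

The redirect keeps only the path. If the query string matters, `request.querystring` would need to be appended to the location as well.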

What Have We Learned?

  • Cloudfront is more than just a simple “pull-cache-serve” service
  • You improve delivery speed to your visitors
  • You can increase resilience by always using a healthy backend
  • You improve overall speed to your backend by leveraging AWS’s backbone
  • You can modify any request to tailor the response to your visitor’s device or region
  • You don’t always need a backend
  • You protect your backend by reducing the number of calls reaching it

About the Author

Daniel Muller is a Staff Serverless Developer at ServerlessGuru and has developed multiple large-scale serverless applications for OTT, BigData, and MarTech.

Pushing Cloudfront to its limits was his main concern while working on OTT applications, an area where low latency for everybody matters. If you have further questions, you can get in touch via LinkedIn.
