Improve Laravel Vapor response time with prewarming

laravel-tips Jack Ellis · Mar 22, 2020

For those of you who aren't aware, prewarming is used in Vapor to reduce cold starts. When you deploy your application, there won't be any Lambda containers that are "warm" and ready to respond to requests by default. This leads to guaranteed cold starts, which can add up to 2 seconds to response times. That's not ideal. Prewarming means that Vapor sends requests to a specified number of containers on deployment to warm them up, and then continues to send requests every 5 minutes to keep them warm. So prewarming is a great thing and you should always do it.

For those who already prewarm, you may be thinking "Am I prewarming enough?" or "Am I prewarming too much?". In addition to those concerns, you also might have heard the argument of "Why don't you just use a standard server since you won't get any cold starts?" etc. Well, I have news for you, and it's good news.

Let's consider a $5 / month DigitalOcean droplet. It is always online, with 1GB of memory ready to go. Let's say that each Laravel request uses 25MB. That means, in theory, if all memory is assigned to PHP requests (heh), the droplet can handle 40 concurrent requests.
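If you want to play with that estimate, here is a minimal sketch in Python. The function name is made up for illustration, and it assumes 1GB is treated as 1000MB and that every request uses the same amount of memory, which a real server won't manage.

```python
# Rough capacity estimate for a fixed-memory server.
# Assumptions: 1 GB is treated as 1000 MB, and every request
# uses the same amount of memory (25 MB in the example above).
def max_concurrent_requests(total_memory_mb: int, memory_per_request_mb: int) -> int:
    """How many requests fit in memory at once, rounded down."""
    return total_memory_mb // memory_per_request_mb


droplet_concurrency = max_concurrent_requests(1000, 25)
print(f"Droplet can hold {droplet_concurrency} concurrent requests")
```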

In Vapor, if you want to match that concurrency you would need to have 40 Lambda containers warmed up. And guess what, it's not expensive to do.

Vapor pings your containers every 5 minutes when you use prewarming. There are ~730 hours in a month. 730 x 60 = 43,800 minutes. 43,800 minutes / 5 = 8,760. So there are 8,760 requests per month sent by Vapor per container for prewarming.
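The same count as a small Python sketch, in case you want to try a different interval. The constant and function names are my own, and 730 hours is the approximate month length used above.

```python
# Number of warming pings one container receives in a month.
# Assumption: a month is ~730 hours, as in the text above.
HOURS_PER_MONTH = 730


def warming_pings_per_month(interval_minutes: int = 5) -> int:
    """Pings per container per month at the given warming interval."""
    minutes_per_month = HOURS_PER_MONTH * 60
    return minutes_per_month // interval_minutes


print(warming_pings_per_month())
```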

Let's use a 512MB Lambda container (far more than the 25MB each DigitalOcean request has) and consider warming 40 containers.

Post-Deployment Warming

Let's be pessimistic and assume that the post-deployment warm takes 2 seconds and that we are responsible for the cost of the start-up. AWS doesn't charge us for this but I'm not sure who incurs the cost: Vapor or us. When you don't know the answer to something like this, and can't find it documented, you should always assume the worst-case pricing scenario. So let's pretend we will be billed for 2 seconds per cold start:

512MB for 2 seconds = $0.00001666
Request = $0.0000002
Data Transfer costs are $0.01 per GB, so they're negligible
Total cost per deployment for 1 container to be warmed: $0.00001686
Total cost per deployment for 40 containers to be warmed: $0.0006744

So each deployment will cost us $0.0006744. Let's say we deploy 500 times in a month (Freek Van der Herten productivity). That'd cost us about 34 cents in a month.
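To rerun the figures in this section with your own memory size or deploy count, here is a rough sketch. The helper name is hypothetical, and the two rates are the truncated figures used in this post, not values read from AWS, so check the AWS Lambda Pricing page before trusting them. Decimal is used so the tiny amounts don't pick up floating-point noise.

```python
from decimal import Decimal

# Assumed rates: the truncated figures used in this post.
# Check the AWS Lambda Pricing page for current values.
PRICE_PER_GB_SECOND = Decimal("0.00001666")
PRICE_PER_REQUEST = Decimal("0.0000002")


def invocation_cost(memory_mb: int, duration_seconds: Decimal) -> Decimal:
    """Cost of one invocation: billed compute time plus the request fee."""
    gb_seconds = Decimal(memory_mb) / Decimal(1024) * duration_seconds
    return gb_seconds * PRICE_PER_GB_SECOND + PRICE_PER_REQUEST


# Pessimistic case from the text: a 2 second warm on a 512MB container.
per_container = invocation_cost(512, Decimal("2"))
per_deployment = per_container * 40   # 40 containers warmed per deploy
per_month = per_deployment * 500      # 500 deploys in a month

print(f"1 container:     ${per_container:.8f}")
print(f"40 containers:   ${per_deployment:.7f}")
print(f"500 deployments: ${per_month:.4f}")
```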

Warming every 5 minutes

These warms are keeping your already-warm containers warm, so you won't have that 2 second cold start.

512MB for 200ms (estimate) = $0.000001666
Request = $0.0000002
Data Transfer costs of $0.01 per GB are negligible
Total cost per 5 minute "ping" = $0.000001866
Total cost per 5 minute "ping" for 40 containers = $0.00007464
Total cost for 40 containers to be warmed every 5 minutes for a month = $0.6538464
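Here is the same monthly calculation as a sketch you can adapt. The function name and defaults are illustrative, the 200ms ping duration is the estimate from above, and the rates are again the truncated figures from this post rather than anything fetched from AWS.

```python
from decimal import Decimal

# Assumed rates: the truncated figures used in this post.
PRICE_PER_GB_SECOND = Decimal("0.00001666")
PRICE_PER_REQUEST = Decimal("0.0000002")
PINGS_PER_MONTH = 730 * 60 // 5  # one ping every 5 minutes, ~730 hours


def monthly_warming_cost(
    containers: int,
    memory_mb: int = 512,
    ping_seconds: Decimal = Decimal("0.2"),  # estimated ping duration
) -> Decimal:
    """Monthly cost of keeping `containers` warm with periodic pings."""
    gb_seconds = Decimal(memory_mb) / Decimal(1024) * ping_seconds
    cost_per_ping = gb_seconds * PRICE_PER_GB_SECOND + PRICE_PER_REQUEST
    return cost_per_ping * containers * PINGS_PER_MONTH


print(f"40 containers: ${monthly_warming_cost(40):.7f} / month")
```

Changing `memory_mb` or `ping_seconds` shows how sensitive the total is to your own container size and how long the warming request really takes.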

Isn't that incredible? So what does that mean in real talk? Let's assume an API with 200ms response times. Each warm container can then serve 5 requests per second, so 40 containers can serve 200.

$0.6538464 / month = ready for 200 req/sec concurrency
$1.3076928 / month = ready for 400 req/sec concurrency
$2.6153856 / month = ready for 800 req/sec concurrency
$5.2307712 / month = ready for 1.6k req/sec concurrency
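The list above just doubles the container count each time. A small sketch that rebuilds it, assuming each container handles one request at a time and that warming cost scales linearly from the 40-container monthly figure in this post (the names here are my own):

```python
from decimal import Decimal

# Monthly warming cost for 40 containers, taken from this post.
COST_PER_40_CONTAINERS = Decimal("0.6538464")


def warm_capacity(containers: int, response_ms: int = 200):
    """Return (requests per second, monthly warming cost) for a container count.

    Assumes one request at a time per container and linear cost scaling.
    """
    requests_per_second = containers * 1000 // response_ms
    monthly_cost = COST_PER_40_CONTAINERS * containers / 40
    return requests_per_second, monthly_cost


for containers in (40, 80, 160, 320):
    rps, cost = warm_capacity(containers)
    print(f"{containers:>3} containers: {rps:>4} req/sec for ${cost:.7f} / month")
```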

And sure, you will still have to pay for the actual request processing, but the point is that you are fully warmed up and ready to handle incredible capacity for a ridiculously low price. Do you really think that someone who is handling 1.6k requests per second (about 5.76M req per hour) can't afford $5.23 to warm their containers up? That's the price of a coffee.

I'm confident that my math won't be exact but it's a solid estimate based on AWS' published pricing. I would advise running over the numbers for your own specific use case. All you need to do is use the AWS Lambda Pricing page to get the figures you need. Decide what kind of warming you want and go from there.

My main point here? Do not be afraid of setting your warming setting nice and high... because you're worth it :)

The Vapor team are actively working on improvements to their cold-start handling. It's not perfect: you'll still have cases where a few requests encounter a delay (if they hit at the exact moment prewarming is occurring), but prewarming helps. They introduced provisioned concurrency (1 week after Lambda announced it), which is great, but it's way too expensive. They keep coming up with new ideas to improve cold starts, and they move very fast.

BIO
Jack Ellis, CTO + teacher
