AWS Lambda: Cold Starts

At work, I’ve been working with Lambdas. They’re interesting and quite exciting to work with: simple and very effective. They’re not designed for long-running, data-intensive processing, because a single invocation can run for at most 15 minutes (and an idle Lambda typically only stays warm for somewhere between 5 and 15 minutes), but for a lot of tasks they’re very well suited. They scale out very quickly by running more concurrent instances, up to the account’s concurrency limit, beyond which requests are throttled. Each instance also gets some very nifty scratch space in the /tmp directory: 512 MB by default (the 50 MB figure is the limit for a zipped deployment package uploaded directly).

The major drawback of Lambdas is cold starts, particularly when working with a JVM-based language. It’s one of the major headaches a software engineer has to deal with; basically, ‘how does one reduce the time it takes to execute X while the Linux-based VM starts up?’. It’s something you have to think about if you want to improve throughput.

So when a Lambda hasn’t been warmed up, it’s in a cold state. Suppose a message is sent to a queue (SQS) or an item is inserted into a DynamoDB table: an event is sent to the Lambda service and, because no warm instance is available, a new virtualised execution environment (the VM and its OS) has to start up before your code can run. AWS Lambda is based on Firecracker, a virtual machine monitor that runs MicroVMs and is written in Rust (it’s similar to C++, but designed for better memory safety and thread safety). Firecracker is built on KVM, the Linux Kernel-based Virtual Machine, with a stripped-down device model. By design, it has a very low memory overhead: less than 5 MiB per MicroVM. That potentially allows hundreds or thousands of MicroVMs to run on a single machine, which is what lets Lambda scale out so quickly.
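To make the cold/warm distinction concrete, here is a minimal sketch in plain Java. It relies only on the fact that static state is initialised once per JVM, so the first call in a fresh JVM is the "cold" one. The class and method names (ColdStartProbe, handle, millisSinceInit) are hypothetical and not part of any AWS library; a real function would implement the AWS handler interface instead.

```java
import java.util.concurrent.atomic.AtomicBoolean;

/** Hypothetical probe: reports whether a call is the first one in this JVM. */
final class ColdStartProbe {
    // Static fields are initialised once, when the class is first loaded.
    private static final long INIT_NANOS = System.nanoTime();
    private static final AtomicBoolean COLD = new AtomicBoolean(true);

    private ColdStartProbe() {}

    /** Prefixes the message with "cold" on the first call and "warm" afterwards. */
    static String handle(String message) {
        boolean wasCold = COLD.getAndSet(false);
        return (wasCold ? "cold" : "warm") + ":" + message;
    }

    /** Milliseconds elapsed since this class was initialised. */
    static long millisSinceInit() {
        return (System.nanoTime() - INIT_NANOS) / 1_000_000;
    }
}
```

Logging the "cold" or "warm" prefix alongside the invocation duration is one simple way to see how often cold starts actually happen and how much they cost.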

Its optimised design helps the VM start up quickly. Unfortunately, the start time of the MicroVM is just the first stage and, if it isn’t set up properly, the JVM can add more milliseconds than you’re willing to settle for. Using a custom JVM (Java 11) and one of the JIT (Just-in-Time) compilation arguments can alleviate this, but it obviously has its performance drawbacks: less aggressive JIT compilation typically starts faster but lowers peak throughput.
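The post doesn’t say which JIT argument was used. One HotSpot option often suggested for short-lived functions is -XX:TieredStopAtLevel=1, which stops tiered compilation at the C1 compiler; treat that choice as my assumption rather than a recommendation from this post. The sketch below (the JitFlagCheck class is hypothetical) simply checks whether the running JVM was started with that flag, which is handy for confirming the setting actually reached the function.

```java
import java.lang.management.ManagementFactory;
import java.util.List;

/** Hypothetical helper: checks which JIT-related flag the JVM was started with. */
final class JitFlagCheck {
    // Assumed flag of interest: stop tiered compilation at the C1 compiler.
    static final String C1_ONLY = "-XX:TieredStopAtLevel=1";

    private JitFlagCheck() {}

    /** True if the given JVM argument list contains the C1-only flag. */
    static boolean stopsAtC1(List<String> jvmArgs) {
        return jvmArgs.contains(C1_ONLY);
    }

    /** Arguments the current JVM reports it was started with. */
    static List<String> currentJvmArgs() {
        return ManagementFactory.getRuntimeMXBean().getInputArguments();
    }
}
```

Calling JitFlagCheck.stopsAtC1(JitFlagCheck.currentJvmArgs()) once during initialisation and logging the result would show whether the flag is in effect.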

So how can we optimise the JVM to reduce cold starts? Well, a good place to start is to go back to the basics of how the JVM works. We all know that classes are compiled into bytecode at compile time, targeting a bytecode version no higher than the maximum supported by the JRE (Java Runtime Environment) they will run on, and that objects are then created from those classes at runtime.

When that bytecode is executed, the JVM stores objects on the heap and keeps references to them on the stack. Since the JVM can be an expensive beast in terms of CPU/RAM requirements, understanding the relationship between CPU clock cycles and reads/writes to each block of memory is important.

A good strategy is to limit which objects are allocated on the JVM’s heap and to call methods statically where you can. This helps the garbage collector and reduces the creation and destruction of objects. Obviously, if you need an array of objects in your code, each one still has to be instantiated, with a reference on the stack to its place in the heap. However, for methods that don’t depend on an object’s state and aren’t needed that often, static methods can be very useful.
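As a rough illustration of that strategy, the sketch below keeps an expensive object (a compiled regular expression) in a static field, so it is built once per JVM, and exposes the validation logic as a static method, so no helper object is allocated on each invocation. The OrderHandler class and the order ID format are made up for the example.

```java
import java.util.Locale;
import java.util.regex.Pattern;

/** Hypothetical handler that avoids allocating helper objects per invocation. */
final class OrderHandler {
    // Compiled once when the class loads, then reused by every invocation.
    private static final Pattern ORDER_ID = Pattern.compile("^ORD-\\d{6}$");

    private OrderHandler() {}

    /** Stateless check, so it can be static and needs no instance. */
    static boolean isValidOrderId(String candidate) {
        return candidate != null && ORDER_ID.matcher(candidate).matches();
    }

    /** Normalises the raw ID and reports whether it was accepted. */
    static String handle(String rawId) {
        String id = rawId.trim().toUpperCase(Locale.ROOT);
        return isValidOrderId(id) ? "accepted:" + id : "rejected:" + id;
    }
}
```

The same idea applies to anything costly to construct, such as clients or parsers: building them once in static state means warm invocations skip that work entirely.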

Another strategy is to design your Lambda function with simplicity and versatility in mind. Good object-oriented design can be very helpful, and it will help you understand the level of interaction between CPU clock cycles and reads/writes to each block of memory. Design it in steps and use the JDK as appropriately as possible when implementing your code! It will thank you. You can gauge your progress by writing your function’s code using Test-Driven Development (TDD) and, through refactoring, reducing the time it takes to execute each unit and integration test.

I hope this post has been useful and insightful. I’m still learning myself, so if I learn anything more or new I will update it!

