Reverse engineering AWS Lambda

So I have been spending some time jamming my hands into AWS Lambda's greasy internals, and I'd like to share all the wonderful details I've discovered.

why though?

I've use AWS Lambda quite extensively at work. And I wanted to get a better understanding of its inner working. What prompted this, you might ask?

Unofficial Native Go Runtime for Google Cloud Functions

There was an off handed comment by the author about the "Lambda API being a bit more complex."

Well I aim to find out just how complex it is, with the end goal of writing a custom runtime, similar to the one above.

Probably in Python, just because it's quick to prototype with.

Lets get started, shall we?

TL;DR

For the impatient of you, if you just want to see the results, feel free to look at the code here

Initial Spelunking

The tools

In order to better understand what AWS Lambda is doing. I wrote a tool I call lambda-command (lcmd for short). It executes a shell command in Lambda and then print the results. For example.

$ lcmd ls
lambda_function.py

Simple. Lets start poking around the Lambda environment and see what we can find.

I'll start with the python3.6 environment because its what I know.

Cave entrance

$ lcmd ps x
PID TTY      STAT   TIME COMMAND
  1 ?        Ss     0:00 /var/lang/bin/python3.6 /var/runtime/awslambda/bootstrap.py
 16 ?        R      0:00 ps x

Interesting. So there is only a single process running (other than the lcmd subprocess).

So that is a big flag that we are running in some sort of container. (If you didn't realise that already about Lambda). Since containers hide the other processes visible via a thing called namespaces.

But is it docker?

Well the easiest way to tell is to look for the /.dockerenv in the filesystem root.

$ lcmd ls -la /
dr-xr-xr-x   2 root         root 4096 Feb 13 21:07 bin
dr-xr-xr-x   2 root         root 4096 Feb 13 21:06 boot
drwx------   4 root         root 4096 Feb 13 21:07 builddir
drwxr-xr-x   2 root         root 4096 Feb 21 10:24 dev
drwxr-xr-x  60 root         root 4096 Feb 13 21:10 etc
drwxr-xr-x   2 root         root 4096 Jan  6  2012 home
-rw-rw-r--   1 root         root    0 Feb 13 21:06 .initialized
dr-xr-xr-x   7 root         root 4096 Feb 13 21:06 lib
dr-xr-xr-x   6 root         root 4096 Feb 13 21:06 lib64
drwxr-xr-x   2 root         root 4096 Jan  6  2012 media
drwxr-xr-x   2 root         root 4096 Jan  6  2012 mnt
drwxr-xr-x   2 root         root 4096 Jan  6  2012 opt
dr-xr-xr-x 100 root         root    0 Feb 21 10:25 proc
dr-xr-x---   2 root         root 4096 Jan  6  2012 root
dr-xr-xr-x   2 root         root 4096 Feb 13 21:07 sbin
drwxr-xr-x   2 root         root 4096 Jan  6  2012 selinux
drwxr-xr-x   2 root         root 4096 Jan  6  2012 srv
drwx------   2 sbx_user1059  487 4096 Feb 21 10:25 tmp
drwxr-xr-x  13 root         root 4096 Feb 13 21:05 usr
drwxr-xr-x  22 root         root 4096 Feb 21 10:24 var

Hmmm. No .dockerenv, well it doesn't seem to be docker. And that is the sbx_user1059. Maybe sbx is sandbox?

Lets see if there are any cgroups set up.

$ lcmd cat /proc/1/cgroup
9:perf_event:/
8:memory:/sandbox-b647c1
7:hugetlb:/
6:freezer:/sandbox-8808d5
5:devices:/
4:cpuset:/
3:cpuacct:/sandbox-496bbd
2:cpu:/sandbox-root-5TrpgM/sandbox-6b041b
1:blkio:/

No docker there either. Maybe there are running a heavily modified docker version.

Wait... furious wikipedia searching

Docker version 1.0 release 06/2014 dockerpedia

lambdapedia

Well knowing AWS they would have had Lambda in preview long before public release (6 months? a year?). And they wouldn't have started with a pre 1.0 software release.

Therefore it is fairly safe to assume they wrote their own container runtime. Or maybe I'm wrong and they just worked very quickly. Either way for the moment we'll move forward assuming we are in a custom runtime.

What else can we work out about this container image. Well it's based on Amazon Linux (But we already knew that)

$ lcmd cat /etc/os-release
NAME="Amazon Linux AMI"
VERSION="2017.03"
ID="amzn"
ID_LIKE="rhel fedora"
VERSION_ID="2017.03"
PRETTY_NAME="Amazon Linux AMI 2017.03"
ANSI_COLOR="0;33"
CPE_NAME="cpe:/o:amazon:linux:2017.03:ga"
HOME_URL="http://aws.amazon.com/amazon-linux-ami/"

And unsuprisingly it's running on EC2, at least if it's hostname is to be trusted.

$ lcmd hostname -f
ip-10-12-108-193.ap-southeast-2.compute.internal

Next lets take a look at that bootstrap.py file from our ps command from earlier.

$ lcmd cat /var/runtime/awslambda/bootstrap.py
# -*- coding: utf-8 -*-
"""
aws_lambda.bootstrap.py
Amazon Lambda

Copyright (c) 2013 Amazon. All rights reserved.

Lambda runtime implemention
"""
from __future__ import print_function

import decimal
... snipped for readability
import traceback

import runtime as lambda_runtime

import wsgi


def _get_handlers(handler, mode):
    lambda_runtime.report_user_init_start()
    init_handler = lambda: None

    """
    This is the old way we were loading modules.
    It was causing intermittent build failures for unknown reasons.
    Using the imp module seems to remove these failures.
    """

AND ANOTHER 100 OR S LINES.
... snipped for readability

Wowzer. Ok, well lets put that to the side for the moment.

You can find a copy of the file here. If you want to peruse it right now, but we will be coming back to it later.

So Lambda has to be getting our code into the container somehow. Lets take a look at the mount points.

Mounts

$ lcmd mount

No dice. We saw before that we could look in /proc. Let's look some more. (I am using pid 1, since that is what bootstrap and our code is running under)

$ lcmd cat /proc/1/mounts
none /proc proc rw,nosuid,nodev,noexec,relatime 0 0
/dev/xvda1 / ext4 ro,nosuid,nodev,noatime,data=ordered 0 0
/dev/xvda1 /dev ext4 rw,nosuid,noexec,noatime,data=ordered 0 0
/dev/xvda1 /var/task ext4 ro,nosuid,noatime,data=ordered 0 0
/dev/xvda1 /var/runtime ext4 ro,nosuid,nodev,noatime,data=ordered 0 0
/dev/xvda1 /var/lang ext4 ro,nosuid,nodev,noatime,data=ordered 0 0
/dev/xvda1 /proc/sys/kernel/random/boot_id ext4 ro,nosuid,nodev,noatime,data=ordered 0 0
/dev/loop0 /tmp ext4 rw,relatime,data=ordered 0 0

Now, we're getting somewhere.

So there are the standard mounts of /proc (which we are using at the moment) as well as / and /dev
There is also /var/task, /var/runtime, and /var/lang. You may remember that bootstrap.py was in /var/runtime.
Next we have a boot_id, which after reading up on them here. I guess it is mounted separately so AWS can ensure a unique id for each container.
Finally there is a /tmp that is mounted to a loopback device. That makes sense since AWS guarantee 512MB of disk space in /tmp and that is probably an easy way to manage it.

It should be worth noting as well that every mount is read only except /tmp and most have have the nosuid and nodev flags in place, presumable as a security precaution, which prevent executing userid binaries and special files respectively.

While were looking at mount, let also look at mountinfo.

$ lcmd cat /proc/1/mountinfo
# reformatted for readability

234 132 0:30 / /proc rw,nosuid,nodev,noexec,relatime - proc none rw

132 94 202:1 /opt/amazon/asc/worker/chroot
/ ro,nosuid,nodev,noatime master:39 - ext4 /dev/xvda1 rw,data=ordered

137 132 202:1 /opt/amazon/asc/worker/sandbox-tmp/sbtasks/sandbox-dev
/dev rw,nosuid,noexec,noatime master:22 - ext4 /dev/xvda1 rw,data=ordered

235 234 202:1 /opt/amazon/asc/worker/sandbox-tmp/sbtasks/boot_id-kL11Uq
/proc/sys/kernel/random/boot_id ro,nosuid,nodev,noatime master:22 - ext4 /dev/xvda1 rw,data=ordered

135 132 202:1 /opt/amazon/asc/worker/tasks/[ACCOUNT_ID]/lcmd/2bee02fb-b088-4b55-99df-b9314db32575
/var/task ro,nosuid,noatime shared:40 master:22 - ext4 /dev/xvda1 rw,data=ordered

236 132 202:1 /opt/amazon/asc/worker/runtime/python-3.6
/var/runtime ro,nosuid,nodev,noatime master:22 - ext4 /dev/xvda1 rw,data=ordered

237 132 202:1 /opt/amazon/asc/worker/lang/python-3.6
/var/lang ro,nosuid,nodev,noatime master:22 - ext4 /dev/xvda1 rw,data=ordered

138 132 7:0 / /tmp rw,relatime - ext4 /dev/loop0 rw,data=ordered

That provides some more juicy information. It's nice when people name things correctly. Let's organise it so it's easier to read.

Host path	container path
/	/proc
/	/tmp
/opt/amazon/asc/worker/chroot	/
/opt/amazon/asc/worker/sandbox-tmp/sbtasks/sandbox-dev	/dev
/opt/amazon/asc/worker/sandbox-tmp/sbtasks/boot_id-kL11Uq	/proc/sys/kernel/random/boot_id
/opt/amazon/asc/worker/tasks/[ACCOUNT_ID]/lcmd/2bee02fb-b088-4b55-99df-b9314db32575	/var/task
/opt/amazon/asc/worker/runtime/python-3.6	/var/runtime
/opt/amazon/asc/worker/lang/python-3.6	/var/lang

So /proc and /tmp are name spaced so nothing new there. Similarly, / is a chroot which we expected since we are in a container.

Next it looks like AWS have created a sandbox for the boot_id and /dev file systems. Again probably to ensure that each container behaves as independent machines.

Next we have /var/task, which contains the code for our Lambda function, so it looks like AWS unzip the lambda bundle into this tasks directory on the host and mount it.

(Some might have noticed I had to redact the AWS account id from the path, while technically not private, it makes information gathering of someone else's AWS account much easier.)

$ lcmd pwd
/var/task

Just to confirm. Yes, task is where our code lives.

And then the last 2 mounts, /var/lang and /var/runtime are related to the language and runtime. It makes sense to mount them also since it means updating the language or runtime for all Lambda functions is just pushing out a new code bundle to be mounted.

Next up, lets have a look at /dev.

$ lcmd ls -la /dev
total 8
drwxr-xr-x  2 root root 4096 Feb 21 10:24 .
drwxr-xr-x 21 root root 4096 Feb 21 10:24 ..
crw-rw-rw-  1 root root 1, 7 Feb 21 10:24 full
crw-rw-rw-  1 root root 1, 3 Feb 21 10:24 null
crw-rw-rw-  1 root root 1, 8 Feb 21 10:24 random
lrwxrwxrwx  1 root root   15 Feb 21 10:24 stderr -> /proc/self/fd/2
lrwxrwxrwx  1 root root   15 Feb 21 10:24 stdin -> /proc/self/fd/0
lrwxrwxrwx  1 root root   15 Feb 21 10:24 stdout -> /proc/self/fd/1
crw-rw-rw-  1 root root 1, 9 Feb 21 10:24 urandom
crw-rw-rw-  1 root root 1, 5 Feb 21 10:24 zero

Hmm, a little sparse, but I guess it is the minimum required for most software. I can see why they wouldn't want people being able to write to the snd or console.

etc. etc.

$ lcmd ls /etc

adjtime
aliases
alternatives
asound.conf
bash_completion.d
bashrc
blkid
chkconfig.d
cloud
csh.cshrc
csh.login
dbus-1
default
...
AND lots more.

You can find the full list of /etc here.

Almost exactly what you would expect from a standard Amazon linux distribution. There are a few interesting things I'll call out. Though for the most part it's not that interesting.

passwd: It contains a bunch of numbers users like so:

...
sbx_user1051:x:496:495::/home/sbx_user1051:/sbin/nologin
sbx_user1052:x:495:494::/home/sbx_user1052:/sbin/nologin
sbx_user1053:x:494:493::/home/sbx_user1053:/sbin/nologin
sbx_user1054:x:493:492::/home/sbx_user1054:/sbin/nologin
...

125 of them in total. Which would point to them being the users that the containers namespaces are remapped into. I've tried listing the number of users with different amounts of memory assigned to my lambda. I hypothesised that the number of users would correlate to the number of concurrent containers. But it didn't seem to have any effect on the number of users in the passwd file. It is probably worth more experimentation, like observing if the numbers change after a cold start.

The only other things of note are some some of the configuration files. These seem to be the default of the packages. For example Java and Ghostscript. I suspect there is more information to be gleaned from deeper investigation. But I'll leave that as an exercise for the reader. Definitely not because I'm lazy.

There is more to be learned from exploring the rest of the system and I did do that. For many hours. With a very low success rate. But luckily I can just tell you the important bits. That is to say the bits that I realised were important after the fact and after staring at pages of assembler.

Reversing

👢strap.py

From ps and /proc/1/exe we know that the entrypoint for the container is python3.6 /var/runtime/awslambda/bootstrap.py.

Lets take a tour of the file. There are a lot of interesting functions, as well as a companion wsgi.py file, that we'll look briefly at later. But this is not everything, and I encourage the curious to look for themselves. And while all the runtimes for the various different languages supported by AWS Lambda perform the same function, they all differ slightly in how they achieve this.

Imports

import decimal
import imp
import json
import logging
import os
import site
import socket
import sys
import time
import traceback

import runtime as lambda_runtime

import wsgi

Looks like some standard library packages as well as that wsgi file I mentioned before.

But where does that runtime library come from you may ask? Well that is easy, looking in the same directory we can see a c extension python library. Hurray, I hope you like disassembly. But it does mean that whenever we see lambda_runtime.function_call(...) we know the definition won't be in the bootstrap file.

$ lcmd ls /var/runtime/awslambda/
bootstrap.py
__init__.py
runtime.cpython-36m-x86_64-linux-gnu.so
wsgi.py

I find the easiest way to work out what code is doing is to just follower its execution path from start to finish. So let's jump to the __main__ entrypoint.

if __name__ == '__main__':
    log_info("main started at epoch {0}".format(int(round(time.time() * 1000))))
    main()

It looks fairly standard, but lets take a quick look at that log_info function before continuing.

def log_info(msg):
    lambda_runtime.log_sb("[INFO] ({}) {}".format(__file__, msg))

Our first call to lambda_runtime. We'll take note of it and continue. Though it is worth pointing out that the main started at epoch... message is not in the logs we get from our execution of a lambda. Which means that the log_sb function must be sending logs to an internal AWS system.

def main():
    if sys.version_info[0] < 3:
        reload(sys)
        sys.setdefaultencoding('utf-8')

    sys.stdout = CustomFile(sys.stdout)
    sys.stderr = CustomFile(sys.stderr)

    logging.Formatter.converter = time.gmtime
    logger = logging.getLogger()
    logger_handler = LambdaLoggerHandler()
    logger_handler.setFormatter(logging.Formatter(
        '[%(levelname)s]\t%(asctime)s.%(msecs)dZ\t%(aws_request_id)s\t%(message)s\n',
        '%Y-%m-%dT%H:%M:%S'
    ))
    logger_handler.addFilter(LambdaLoggerFilter())
    logger.addHandler(logger_handler)

So main is quite a big function, so we'll walk through it in stages. First it looks like there is some setup and python2/3 boilerplate (the setting encoding). Then it sets stdout and stderr to be custom files.

class CustomFile(object):
    def __init__(self, fd):
        self._fd = fd

    def __getattr__(self, attr):
        return getattr(self._fd, attr)

    def write(self, msg):
        lambda_runtime.log_bytes(msg, self._fd.fileno())
        self._fd.flush()

    def writelines(self, msgs):
        for msg in msgs:
            lambda_runtime.log_bytes(msg, self._fd.fileno())
            self._fd.flush()

This is a relatively simple wrapper around normal stdout and stderr objects. But instead of just writing to the files, we pass in the message to a new runtime function log_bytes.

What is most interesting about this function is that is takes the file description of the parent file (i. e. stdout/stderr) and python still calls flush on the file descriptor object before continuing. This implies that in addition to what ever this log_bytes function does, it also appears to write to stdout and stderr, because otherwise there would be no need to call flush.

It is also worth noting that messages directly to log_bytes do appear in our lambda logs (in cloudwatch). This can be tested by importing lambda_runtime from bootstrap. py and calling log_bytes directly.

class LambdaLoggerFilter(logging.Filter):
    def filter(self, record):
        record.aws_request_id = _GLOBAL_AWS_REQUEST_ID or ""
        return True

Back to main, the next steps are just standard logging configuration. It sets up a default handler as well as a filter which adds a GLOBAL_AWS_REQUEST_ID from a global variable if it's available.

    # Remove lambda internal environment variables
    for env in [
        "_LAMBDA_CONTROL_SOCKET",
        "_LAMBDA_SHARED_MEM_FD",
        "_LAMBDA_LOG_FD",
        "_LAMBDA_SB_ID",
        "_LAMBDA_CONSOLE_SOCKET",
        "_LAMBDA_RUNTIME_LOAD_TIME"
    ]:
        del os.environ[env]

Now it's deleting some environment variables. Presumably for internal use... unless of course the comments are lies [illuminati.png] But I'd like to see what they are, luckily we can see some before they are deleted using our handy /proc.

$ lcmd cat /proc/1/environ
AWS_DEFAULT_REGION=us-east-1
AWS_EXECUTION_ENV=AWS_Lambda_python3.6
AWS_LAMBDA_FUNCTION_MEMORY_SIZE=3008
AWS_LAMBDA_FUNCTION_NAME=lcmd
AWS_LAMBDA_FUNCTION_VERSION=$LATEST
AWS_LAMBDA_LOG_GROUP_NAME=/aws/lambda/lcmd
AWS_LAMBDA_LOG_STREAM_NAME=2018/01/01/[$LATEST]81d524bb74ad45b4b540047039a84e1e
AWS_REGION=us-east-1
AWS_XRAY_CONTEXT_MISSING=LOG_ERROR
AWS_XRAY_DAEMON_ADDRESS=169.254.79.2:2000
LAMBDA_RUNTIME_DIR=/var/runtime
LAMBDA_TASK_ROOT=/var/task
LANG=en_US.UTF-8
LD_LIBRARY_PATH=/var/lang/lib:/lib64:/usr/lib64:/var/runtime:/var/runtime/lib:/var/task:/var/task/lib
PATH=/var/lang/bin:/usr/local/bin:/usr/bin/:/bin
PYTHONPATH=/tmp/:/var/task/:/var/runtime/
TZ=:UTC
_AWS_XRAY_DAEMON_ADDRESS=169.254.79.2
_AWS_XRAY_DAEMON_PORT=2000
_HANDLER=lcmd_lambda.lambda_handler
_LAMBDA_CONSOLE_SOCKET=28
_LAMBDA_CONTROL_SOCKET=21
_LAMBDA_LOG_FD=38
_LAMBDA_RUNTIME_LOAD_TIME=10746081534797
_LAMBDA_SB_ID=10
_LAMBDA_SHARED_MEM_FD=12
_X_AMZN_TRACE_ID=Parent=19ce256b20b7fec2

Again a lot of noise we're going to ignore for the moment, lets focus on the ones from in the 'env' array in 'bootstrap.py'.

Most seem to be file descriptor numbers, presumably so that the python runtime can communicate with the host environment to recieve invokes and send responses. Lets see if any of them are open still.

$ lcmd ls -l /proc/1/fd
lrwx------ 1 sbx_user1061 485 64 Apr  9 12:58 0 -> socket:[47433]
l-wx------ 1 sbx_user1061 485 64 Apr  9 12:58 1 -> pipe:[48314]
l-wx------ 1 sbx_user1061 485 64 Apr  9 12:58 2 -> pipe:[48315]
lr-x------ 1 sbx_user1061 485 64 Apr  9 12:58 3 -> /dev/urandom
lr-x------ 1 sbx_user1061 485 64 Apr  9 12:58 4 -> pipe:[50082]
lr-x------ 1 sbx_user1061 485 64 Apr  9 12:58 6 -> pipe:[50083]
l-wx------ 1 sbx_user1061 485 64 Apr  9 12:58 10 -> /opt/amazon/asc/worker/sb_log/sb10.log
lrwx------ 1 sbx_user1061 485 64 Apr  9 12:58 21 -> socket:[48311]
lrwx------ 1 sbx_user1061 485 64 Apr  9 12:58 28 -> socket:[48313]
l-wx------ 1 sbx_user1061 485 64 Apr  9 12:58 38 -> pipe:[10969]

So matching them up we get the following table.

Environment	FD	File/Socket/Pipe	Comment
	0	socket:[47433]	STDIN
	1	pipe:[48314]	STDOUT
	2	pipe:[48315]	STDERR
	3	/dev/urandom	urandom
	4	pipe:[50082]	???
	6	pipe:[50083]	???
LAMBDA_SB_ID	10	/opt/amazon/asc/worker/sb_log/sb10.log	log file, strangely there is nothing at that location, it must be already open before the `fork` call or is passed in via a socket
LAMBDA_SHARED_MEM_FD	12		No open fd? maybe something is read at startup and closed before it hands off control to our handler
LAMBDA_CONTROL_SOCKET	21	socket:[48311]	r/w socket, presumable for control signals
LAMBDA_CONSOLE_SOCKET	28	socket:[48313]	r/w socket, possibly for sending signals to the AWS console?
LAMBDA_LOG_FD	38	pipe:[10969]	w only pipe, not sure why we have both the sb_log and this.

So a bit of new information, but we'll move on from here for the moment, but will refer back to this take once we start looking at the disassembled shared libraries.

Next there is a call to wait_for_start.

def wait_for_start():
    (invokeid, mode, handler, suppress_init, credentials) = lambda_runtime.receive_start()
    force_path_importer_cache_update()
    set_environ(credentials)
    lambda_runtime.report_running(invokeid)

    return (invokeid, mode, handler, suppress_init, credentials)

It makes a receive_start call to the shared library, and gets back some interesting values before calling force_path_importer_cache_update, set_environ and then a second shared library call report_running.

def force_path_importer_cache_update():
    for path in os.environ.get("PYTHONPATH", "").split(":"):
        if path == os.environ["LAMBDA_RUNTIME_DIR"]:
            continue
        importer = sys.path_importer_cache.get(path, None)
        if not importer or isinstance(importer, imp.NullImporter):
            sys.path_importer_cache.pop(path, None)

This function ensures that the path importer cache only paths in the PYTHONPATH or the LAMBDA_RUNTIME_DIR (which looking back to our env vars, we can see is /var/runtime). I'm not sure why there would be other values polluting the path hooks, possibly a python2/3 compatibility issue.

Next we have the set environ function,

def set_environ(credentials):
    key, secret, session = credentials.get('key'), credentials.get('secret'), credentials.get('session')
    # TODO delete from environ if params not found
    if credentials.get('key'):
        os.environ['AWS_ACCESS_KEY_ID'] = key
    if credentials.get('secret'):
        os.environ['AWS_SECRET_ACCESS_KEY'] = secret
    if credentials.get('session'):
        os.environ['AWS_SESSION_TOKEN'] = session
        os.environ['AWS_SECURITY_TOKEN'] = session

So this tells us that the credentials object returned from wait_for_start is dict like and contains the AWS credentials for our function to use.

Credentials digression

If you are familiar with how AWS handles authentication and authorization feel free to skip to the next section, but otherwise consider this a crash coarse.

All requests to AWS's control plane require a signature, the current correct method is called SigV4. This signature tells AWS the identity of the request, and links it to an IAM User or IAM Role, the two AWS IAM (Identity and Access Management) primitives. These can optionally have policies of various types which grant the User or Role the permissions to perform actions within the AWS account/environment.

Why this matters to Lambda is that without a set of credentials it can't really do much. It can accept a request and process the data before returning, but it can't talk to any other AWS services (even basic functions such as writing out logs to cloudwatch, require these credentials). Which is why AWS requires that you attach an IAM role to your function.

To perform the actual signing process requires and Access Key Id, a Secret Access Key and optionally a session token. All of the official AWS SDKs handle this process for you, but not only that they also know how to fetch the credentials from as set of known location.

If you are on a dev machine, you are most likely using a set of IAM User access keys. These are a fixed Key Id an Secret key stored in the ~/.aws/credentials directory.
If you are on an EC2 instance, then the best way to get credentials is to assign a role to the instance. A role being very much like a user, except that instead of using a fixed set of credentials you retrieve temporary credentials through various means. With EC2 these are retrieved via a meta-data IP address, 169.254.169.254, and the AWS SDKs are aware of this and will retrieve those credentials automatically.
A third location is via environment variables, which again the SDKs are aware of and will retrieve automatically, and as we can see from the set_environ code this is the method being used by lambda. Instead of having a meta data endpoint, the credentials are passed into the container before being set in the environment, giving your function the privileges of the role attached to it.
There are a few other exotic methods (which can be read seen here), such as running a subprocess to get the credentials or hitting another meta data endpoint specific to AWS container services like ECS or Fargate.

sys.path.insert(0, os.environ['LAMBDA_TASK_ROOT'])

# Set /var/task as site directory so we are able to load all customer pth files
site.addsitedir(os.environ["LAMBDA_TASK_ROOT"])

if suppress_init:
    init_handler, request_handler = lambda: None, None
else:
    init_handler, request_handler = _get_handlers(handler, mode)
    
lambda_runtime.report_done(invokeid, None, None)
log_info("init complete at epoch {0}".format(int(round(time.time() * 1000))))

The next section of bootstrap first inserts the location of our code (look back to the environment variables section, /var/task/ is the location of our code and is set to LAMBDA_TASK_ROOT). This ensures that our code is able to be loaded by python. Next we also add the task directory as a custom site directory which ensures that any custom libraries we've included in our lambda code are also loadable. And finally we optionally call _get_handler depending on the content of suppress_init, which is a flag (bool) given to us from the receive_start function.

I will add that I've never been able to get suppress_init to be True. I've tried different invocation types (sync vs async), calling via straight API or via event source (i.e. Kinesis or API gateway). So if someone is able to work out when/if this is set, please let me know.

The _get_handler function is long and complex to handle a variety of use cases, but I'll quickly go over important parts:

It calls two shared library functions report_user_init_start and report_user_init_end at it's start and directly before returning.
In addition to loading the handler function you've specified for you lambda, it also loads a function called init from you handler module if it exists. It is commented in the code that this is maintained for backwards compatibility (and it not mentioned in AWS docs are far as I can find).
Depending on the mode (either http or event) flag, also passed in via receive_start, your handler is either treated as a normal python function or it is treated as a WSGI handler. Which is what the wsgi.py module we saw along side bootstrap.py is used for. Interestingly again I've never been able to get my lambda to actually have a http mode (used similar tests to the suppress_init flag above), I still haven't been able to work out if this is a yet to be unreleased feature or is legacy from when lambda was purely a service for just web requests.

After doing the initialisation steps the runtime calls report_done to signify that we've finished the setup and are ready to start the actual event handler loop, and well as logging some debug information. And again the log_info command does not output to anywhere us as a user can see, so it is purely for AWS's use.

while True:
    (invokeid, x_amzn_trace_id, sockfd, credentials, event_body,
      context_objs, invoked_function_arn) = wait_for_invoke()
    _GLOBAL_AWS_REQUEST_ID = invokeid

    if x_amzn_trace_id != None:
        os.environ['_X_AMZN_TRACE_ID'] = x_amzn_trace_id
    elif '_X_AMZN_TRACE_ID' in os.environ:
        del os.environ['_X_AMZN_TRACE_ID']

    # If the handler hasn't been loaded yet, due to init suppression, load it now.
    if request_handler is None:
        init_handler, request_handler = _get_handlers(handler, mode)
        run_init_handler(init_handler, invokeid)

    if mode == "http":
        handle_http_request(request_handler, invokeid, sockfd)
    elif mode == "event":
        handle_event_request(request_handler, invokeid, event_body,
                              context_objs, invoked_function_arn)

Now at first glance it looks quite complicated. However most of the lines are just either adding Xray trace information or dealing with either the handlers missing or whether to use the event or http event type (again, I've never been able to get lambda to actually use the http).

So below is what it looks like with the bare minimum.

while True:
    (invokeid, x_amzn_trace_id, sockfd, credentials, event_body, context_objs, invoked_function_arn) = wait_for_invoke()
    
    _GLOBAL_AWS_REQUEST_ID = invokeid
    
     handle_event_request(request_handler, invokeid, event_body, context_objs, invoked_function_arn)

Much for straight forward. We get an event. Set our request id and then handle the event.

def wait_for_invoke():
    (invokeid, data_sock, credentials, event_body, context_objs,
     invoked_function_arn, x_amzn_trace_id) = lambda_runtime.receive_invoke()

    set_environ(credentials)

    return (invokeid, x_amzn_trace_id, data_sock, credentials, event_body,
            context_objs, invoked_function_arn)

Wait for invoke is very simple, it just makes a call into the shared library to get the invocation, which includes the event as well as a set of AWS credentials. It then sets the appropriate environment variables so that our function handler has the correct IAM role assigned to our function via the set_environ call.

def handle_event_request(request_handler, invokeid, event_body, context_objs,
                         invoked_function_arn):
    lambda_runtime.report_user_invoke_start()
    errortype = None
    try:
        client_context = context_objs.get('client_context')
        if client_context:
            client_context = try_or_raise(lambda: json.loads(client_context),
                                          "Unable to parse client context")
        context = LambdaContext(invokeid, context_objs, client_context,
                                invoked_function_arn)
        json_input = try_or_raise(lambda: json.loads(event_body),
                                  "Unable to parse input as json")
        result = request_handler(json_input, context)
        result = try_or_raise(
            lambda: to_json(result),
            "An error occurred during JSON serialization of response")
    except wsgi.FaultException as e:
        lambda_runtime.report_fault(invokeid, e.msg, e.except_value, None)
        report_xray_fault_helper("LambdaValidationError", e.msg, [])
        result = make_error(e.msg, None, None)
        result = to_json(result)
        errortype = "unhandled"
    except JsonError as e:
        result = report_fault_helper(invokeid, e.exc_info, e.msg)
        result = to_json(result)
        errortype = "unhandled"
    except Exception as e:
        result = report_fault_helper(invokeid, sys.exc_info(), None)
        result = to_json(result)
        errortype = "unhandled"
    lambda_runtime.report_user_invoke_end()
    lambda_runtime.report_done(invokeid, errortype, result)

Again another fairly long function. But most of that is dealing with different error scenarios. Lets go step by step.

First tell the runtime that the user handler invocation has started via report_user_invoke_start.
Next we parse the event body from json into native python types as well as assembling the context object.
We then call the user handler, which is the code we provide to our lambda function.
There is then a few different error branches to deal with various issues that could encounter, from the basics of the handler function throwing an exception to more specific issues like event or response json validation errors.
If any of these errors happens we build a nicely formatted stacktrace via report_fault_helper.
Finally we tell the runtime that we've finished the user code as well as reporting we are done with either the error or the result.

So it has taken a while but we've made our way through the AWS Lambda python runtime. But we're not quite done. In face we've only finished the easy bit.

The next stage, I will admit dear reader, took me ~10x the amount of time to work out than everything thus far. Which is mostly my not being familiar with analysing a binary/shared library. I suspect that someone more well versed with such analysis would have found it trivial. But what is the point of doing something, if you don't learn a new skill along the way.

This brings us handily to the analysis of the shared libraries. Now while I think this was the most interesting part of the process, it is very tedious to trace every execution path as well reconstructing what I think the c structs look like. So instead I'll walk through a couple of examples, some interesting observations and finally my reconstruction (in python) of the shared library code.

Let's not get head of ourself and refresh what we already know about the shared libraries and we'll start with the one loaded by boostrap.py.

$ lcmd ls /var/runtime/awslambda
bootstrap.py
__init__.py
runtime.cpython-36m-x86_64-linux-gnu.so
wsgi.py

runtime.cpython-36m-x86_64-linux-gnu.so is loaded by import runtime in bootstrap.py. The cpython-36m-x86_64-linux-gnu suffix tells us that it a python c extension, which is why python know how to load it and why we are given back native python objects from calls to runtime in boostrap.py.

$ lcmd ldd /var/runtime/awslambda/runtime.cpython-36m-x86_64-linux-gnu.so
	linux-vdso.so.1 =>  (0x00007fff66bbb000)
	libpthread.so.0 => /lib64/libpthread.so.0 (0x00007efd7c36c000)
	libpython3.6m.so.1.0 => /var/lang/lib/libpython3.6m.so.1.0 (0x00007efd7c02d000)
	liblambdaio.so => /var/runtime/liblambdaio.so (0x00007efd7c79f000)
	liblambdaruntime.so => /var/runtime/liblambdaruntime.so (0x00007efd7c795000)
	liblambdaipc.so => /var/runtime/liblambdaipc.so (0x00007efd7c78e000)
	liblambdalog.so => /var/runtime/liblambdalog.so (0x00007efd7c78b000)
	libc.so.6 => /lib64/libc.so.6 (0x00007efd7bc69000)
	/lib64/ld-linux-x86-64.so.2 (0x00007efd7c588000)
	libdl.so.2 => /lib64/libdl.so.2 (0x00007efd7ba65000)
	libutil.so.1 => /lib64/libutil.so.1 (0x00007efd7b862000)
	librt.so.1 => /lib64/librt.so.1 (0x00007efd7b65a000)
	libm.so.6 => /lib64/libm.so.6 (0x00007efd7b358000)

If we next look at what the runtime extension is linked against we can confirm that it is linked against libpython3. But it is also linked against some more shared libraries one directory up. liblambdaruntime, liblambdaipc, liblambdaio and liblambdalog. If we look at how those are linked we get the following dependency graph.

dependencies

So after copying those shared libraries to our local machine. It is time to break out Binary Ninja. This choice will probably irritate some people, but its price for personal use is hard to argue with. That being said IDA or Hopper would also be fine choices.

Reverse engineering for dummies (me)

So I am going to summarise what the shared libraries do, but how did I work this out you ask? Well the answer is basically trial and error and banging my head against a wall for months learning assembly and generally being incompetent. So I will show you a few examples of how I reverse engineered the control flow and structure of these shared libraries, but I must warn you that this is almost certainly either wrong or slow or more likely both.

binja1

Here is an example of the control flow graph generated by binary ninja. In particular this function called PyInit_runtime, tells us that the function is run when the module is imported. This also solves the mystery of how initialisation is performed if all we can see in bootstrap.py is the code before. If you take a look back at bootstrap.py, we can see that one of the first things it does is import the runtime C extension, therefore this initialisation function is run before almost anything else.

And looking at the control flow it is fairly easy to work out what is going on. It is the same as the following pseudocode:

if global PY_RUNTIME_RUNNING != 0 {
  error = runtime_init()
  if error != 0 {
    exit()
  } else {
    global PY_RUNTIME_RUNNING = 1
  }
  
  global RUNTIME = runtime
  
  PyModule_Create2()
}

Fairly straight forward, and in essence the entire process follows a similar structure, look at the control flow, work out what it is doing and write it out as pseudocode. So next we would follow the runtime_init function, however it is not in the runtime.cpython module and so we have to jump to liblambdaruntime.

Now this one is a bit more complex...

binja2

I would like to say I am an amazing hacker that this comes naturally too, however that is entirely not the case. The AWS engineers were nice enough to compile the code with debug symbols, which is why most of the control flow graphs have reasonable names for things. But more importantly they are using a very useful error reporting framework that does almost all the reversing for me. What do I mean by this? It is easier to just show you. For example let's take a look at this particularly gnarly function.

binja3

I won't cover what it actually does, but the eagle eyed of you will have spotted the error handling branches. And if we follow them we find the following strings.

binja4

Which in full are:

((__runtime->xray_sock = socket(2, SOCK_DGRAM | SOCK_NONBLOCK | SOCK_CLOEXEC, 0))) >= 0

inet_aton(xray_address, &address.sin_addr)

(connect(__runtime->xray_sock, (struct sockaddr *) &address, sizeof(address))) == 0

"src/lambda/runtime.c"

Which gives us a much clearer idea of what is actually going on, we can piece together that it is opening and connecting to a datagram socket at the xray_address and then storing the result in the __runtime struct.

Now the __runtime struct is interesting as it pops up all over the place, and again I'd like to say I worked out it's definition just from reading the assembler, but the answer is far far more tedious. You see the RUNTIME global we saw before is actually set to a pointer to this __runtime struct, which means it is accessible from out python code using the cffi module.

def get_native_runtime_struct():
  rt_pointer = ctypes.POINTER(ctypes.c_char * 0x880)
  dll = ctypes.cdll.LoadLibrary(
      '/var/runtime/awslambda/runtime.cpython-36m-x86_64-linux-gnu.so')
  rt = rt_pointer.in_dll(dll, '__runtime')
  try:
      return rt.contents
  except ValueError:
      return 'NullPointer'

All we are doing is dumping out the 2176 bytes (0x880) at the __runtime location from the shared library. Why 2176? Because of this part of the runtime_init function.

binja5

And if we look at the result of the above code, we get a bunch of nonsense like this.

bytearray(
    b'#\x00\x00\x00*\x00\x00\x00\x00\x00\x00\x00\x00\x00arn:aws:lambda:ap-southeast-2:12345678910:function:pyinject
    \x00\x7f\x00\x00\xd0\xa0/\xd5\xf9\x7f\x00\x00\xa5\xd9Z\xc1k\x98\xe0l\xb0\x96/\xd5\xf9\x7f\x00\x00\xc0\x7f\xe7\xdc\xf9
    \x7f\x00\x00\x88\x9by7\xff\xda(\xa0XX/\xd5\xf9\x7f\x00\x00\xe0\x7f\xe7\xdc\xf9\x7f\x00\x00\xac\x1a9\xf6\x08E\xa7\xd4
    \xa0X/\xd5\xf9\x7f\x00\x00\x00\x80\xe7\xdc\xf9\x7f\x00\x00\x81-\x95\x06\x06\xde\xa9\xbd\xe8X/\xd5\xf9\x7f\x00\x00
    \x80\xe7\xdc\xf9\x7f\x00\x00\xb0~6\xe1\x9d\xee\x8e\x0f\xf8f/\xd5\xf9\x7f\x00\x00\xe0\x7f\xe7\xdc\xf9\x7f\x00\x00\xcc
    ... and so on)

Which has some recognisable strings in it. So with a combination of the patience and frustration I was able to work out the runtime struct looks like this:

class Runtime(PStruct):
    _fields_ = [
        ("ctrl_sock", ctypes.c_int),
        ("console_sock", ctypes.c_int),
        ("xray_sock", ctypes.c_int),
        ("needs_debug_logs", ctypes.c_int),
        ("function_arn", ctypes.c_char * 512),
        ("deadline_ns", ctypes.c_uint64),
        ("shared_mem", ctypes.POINTER(SharedMem)),
        ("pre_load_time_ns", ctypes.c_uint64),
        ("post_load_time_ns", ctypes.c_uint64),
        ("wait_start_time_ns", ctypes.c_uint64),
        ("wait_end_time_ns", ctypes.c_uint64),
        ("max_stall_time_ms", ctypes.c_size_t),
        ("is_initialized", ctypes.c_bool),
        ("init_start_time", timeval),
        ("init_end_time", timeval),
        ("invoke_start_time", timeval),
        ("is_traced", ctypes.c_bool),
        ("reported_xray_exception", ctypes.c_bool),
        ("init_xray_context", XrayContext),
        ("xray_context", XrayContext)
        ]

And how do I know the names of the fields you ask? Well again if you look carefully you will see the xray_sock field from the debug messages we found before. So all of theme were pulled from the debug messages, and the types were identified via trial and error and the help of the code in PStruct

Which has a nice printing function so you can see if our guessed types are actually matching the types in the real struct. For example it would print the out the following for the above defined Runtime struct, with the byte offsets helpfully annotated.

So once the size of the struct matched the offsets, I knew I had the correct definition. (As well as just some common sense things i.e. file descriptors being ints)

{
    0x0 ctrl_sock: 21,
    0x4 console_sock: 28,
    0x8 xray_sock: 0,
    0xc needs_debug_logs: 0,
    0x10 function_arn: b'arn:aws:lambda:ap-southeast-2:12345678910:function:pyinject',
    0x210 deadline_ns: 2660013665248,
    0x218 shared_mem: {
        0x0 event_body_len: 2,
        0x4 debug_log_len: 0,
        0x8 event_body: b'{}',
        0x60006c debug_logs: b'',
        0x6192a4 response_body_len: 0},
    0x220 pre_load_time_ns: 2656844548537,
    0x228 post_load_time_ns: 2656946702375,
    0x230 wait_start_time_ns: 2657008317187,
    0x238 wait_end_time_ns: 2657008350855,
    0x240 max_stall_time_ms: 0,
    0x248 is_initialized: True,
    0x249 init_start_time: {
        0x0 tv_sec: 1527519386,
        0x8 tv_usec: 491078},
    0x259 init_end_time: {
        0x0 tv_sec: 1527519386,
        0x8 tv_usec: 495631},
    0x269 invoke_start_time: {
        0x0 tv_sec: 140114522192048,
        0x8 tv_usec: 140114651686080},
    0x279 is_traced: False,
    0x27a reported_xray_exception: False,
    0x27b init_xray_context: {
        0x0 trace_id: b'',
        0xff is_sampled: False,
        0x100 parent_id: b'446d6e0975296e18',
        0x1ff lambda_id: b''},
    0x579 xray_context: {
        0x0 trace_id: b'1-5b0c1899-3bfaf18e732c0be2465110a4',
        0xff is_sampled: False,
        0x100 parent_id: b'2625bdd20f4e3600',
        0x1ff lambda_id: b''}
}

And that is all I am going to cover in how the actual reverse engineering was done, as it consisted entirely of those steps over and over again.

In terms of what each of the shared library files does, I'll cover it on broad strokes now.

runtime.cpython...

This shared library does very little other than to delegate functionality to the other shared libraries.

This makes sense as it allows for code sharing between the different language runtimes if you keep the shim for your individual language as small as possible. Though one interesting thing is that it has some sort of spinlock on the length of the result, which I think is how they are coordinating writes to the shared memory struct.

binja6

Oh right the shared memory struct, the one from the pointer above in the __runtime struct.

So this one took an embarrassingly long time and many re-reading of linux documentation to work out what the heck was going on. I'd reccomend you just jump straight to the pyruntime.py file to see what it is doing.

But the gist is that a file descriptor is passed into the lambda function, which is them memory mapped into the process's address space, before being closed.

The trick being that is an anonymous memory map, it has no backing file, so once it is closed you have to remember the address returned from the mmap call. This block of mapped memory is how AWS send events to the lambda function and how events are returned. Why they couldn't just use Unix socket? I don't know. But that is the way they chose to do it.

Anyway back to the spinlock. I think they are using the response_body_len as a multiprocess lock to coordinate writes to the shared memory. What is more interesting is that this takes place in runtime.cpython. Which means that all the other runtimes have to manually implement this lock in their shims. Which all of them do as far as I can tell, all except golang that is. There might be better guarantees around memory writes in golang that I am not aware of, but just something interesting.

Oh by the way I did disassembly, decompile, dump etc all the other language runtimes for cross referencing and general interest, however that is a whole other blog post in itself. But I will cover the more interesting pieces in the future efforts section.

liblambdaruntime

This is where the meat of the work happens. It holds the initialisation logic as well as all the logic around send and receiving administration commands via the various Unix sockets we found earlier. It essentially has analogs of all the runtime functions we observed in the bootstrap.py walkthrough.

binja7

liblambdaipc

This shared library contains utility functions for IPC communication. The runtime uses Unix sockets to send and receive commands to the AWS host system. These commands need to be serialized/de-serialized symmetrically on both ends, so I expected this library is also on the host system.

binja8

liblambdalog

This library has some helper function for logging to stdout and the Lambda shared log buffer. Included in the shared memory that is used to send the event and result to and from the function is a region for storing debug logs. These logs are also written to standard out simultaneously by the functions below.

binja9

liblambdaio

A very simple shared library, it just has some helper functions for synchronised writing to file descriptors.

binja10

PyRuntime

Once all of that was worked out, reversed and translated into python we are left with the code in my repo here.

It doesn't match 1-1 with all the shared libraries and how their functionality is separated, however it follows the majority of code paths. (with some important caveats covered in the future efforts section.) But more importantly it works. How do I know this? Let's find out.

Wrestling control

So how do you test your new shiny runtime if bootstrap.py is run before our code is invoked. It's all in the environment. The environment variables that is.

AWS are kind enough to let us set the environment variables for our Lambda functions. More importantly these environment variables are set before the AWS runtime is handed control. I.e. they are set when the container is started, not by the runtime.

The key insight is to realise this gives us a way to control how bootstrap.py is executed. The cpython interpreter has many flags that can be controlled by environment variables, in particular we are interested in PYTHONPATH.

This controls where, and in what order, python looks, when it's loading modules.

So if we for example, have two directories with a python module each called 'foo.py'.

And we add those directories to our PYTHONPATH.

And we then try to import foo.

The directory that is first in the path will have it's module loaded first.

Let's take another look at the bootstrap.py imports.

from __future__ import print_function

import decimal
import imp
import json
import logging
import os
import site
import socket
import sys
import time
import traceback

import runtime as lambda_runtime

import wsgi

So the first thing it loads after the __future__ compatibility line is decimal. If we were to put a decimal.py in the PYTHONPATH it will take precedence of the built in decimal module. And so this is example what we would do is.

set the PYTHONPATH environment variable to /var/task/
place a decimal.py file in the root of our Lambda function code.

And now we have our code executing, before the lambda runtime.cpython is loaded.

The decimal.py for my custom runtime can be found here. https://github.com/Posnet/pylambda

This file then calls execve with our own slightly modified bootstrap.py thus replacing the process with our own code. And so we have native control in our Lambda container.

As is seen by running the lambda function. current_proc is the result of ps 1 which if you remember used to be /var/runtime/awslambda/boostrap.py, yet now it is our custom runtime.

$ ./run.sh
{
  "working": true,
  "current_proc": "/var/lang/bin/python3.6 /var/task/inject/bootstrap.py"
}

Now there is a bit of a problem with me telling this. You see AWS could very easily block this avenue by restricting what environment variables you can set. And so can remove my access at any time. I hope they don't, but I also would understand if they wanted to nip this in the bud before anyone sees it, since it would effectively make an internal API, a public once if people start using it (even if it is unsupported.).

So what?

So if you looked at the above repo, you will have noticed that my pyruntime.py is in fact written in python.

Why go to all this trouble to replace fast c code with slow python. Especially since Lambda already supports python?!

Well this was mainly for my own interest, so that was enough for me. However for the rest of you, it does mean you are now free to use the pyruntime as template implementation.

There is no reason you couldn't rewrite it in C++, or Rust, or even Golang (yes I know it is natively supported but you could write your own runtime to get rid of that pesky RPC, that the AWS version uses.).

So that concludes a multi-month long investigation of how AWS have implemented their Lambda runtime.

Future efforts (Or I am lazy and tired of writing)

Now there are a bunch of open questions I'd like to answer however this blog is already too long so I'll leave them for a future time. (Even though I have answers to a few of them.)

Containers and completeness.

As demonstrated by the copious TODOs in the pyruntime, I am not yet done replicating 100% of the functionality. The basics work, however there are still many code paths yet unexplored. For example there are whole swaths of code for handling AWS X-Ray reporting from the runtime that I haven't touched. Similarly those helpful error messages which I used as cheat sheets in my reversing efforts also provide important debug info to the AWS Lambda team which no longer are being logged.

However my next task, before even finishing the rest of the runtime is to implement the other side of the runtime. And by other side I mean the AWS host side of the lambda function. This is a whole project in of itself, and which I will undoubtedly procrastinate doing.

Luckily I found the amazing rubber-docker project. Which I have worked out is the perfect base for my host implementation. As well as being a fantastic tutorial for anyone interested in containers and how they actually work.

Once that is done, it will be much easier to work out if my runtime is behaving correctly since I will be able to run the AWS lambda shared libraries locally and make sure my version is sending identical command and logging identical logs as the official one. This will let me verify that my implementation is correct.

Please be a good citizen

So since this reference implementation is not complete, it is likely to fail in unexpected and opaque ways, which will be both confusing to you as well as the AWS Lambda team, so I ask you to be careful with your experimentation and please don't try to break things. And for the love of god do not use this in production for any reason, only pain and miserly lies ahead for those that do.

Polyglottal

The other big task is to go over the other runtimes. No I didn't forget about them.The other runtimes use the same shared libraries as the python ones, though how they talk to it varies.

The Node.js uses gyp and a shim like python to proxy function calls to liblambdaruntime.
Java uses native methods to talk to liblambdaruntime directly.
Golang is a weird one. Because golang doesn't support shared libraries, in order to allow cgo to link against the runtime without revealing the source code, AWS have opted to write a daemon that links against the shared libs and then have that talk to your handler over golang's native RPC interface.
C#, I haven't looked at this one, but I assume it just uses the C# runtime's native calling like java

I am sure there is useful information that can be used to make the reversing of the runtime easier and more complete by looking at these other language runtimes. However that will have to be a topic for another time.

Elephant in the env var

So I mentioned before that AWS can remove our entrypoint by restricting the environment variables we can set. If they do so, the shared library runtime will run, before our code is execution in the main Lambda event loop.

We could take control again here if it wasn't for the fact that after memory mapping the shared buffer, the runtime then closes those file descriptors.

This is a problem since memory maps do not survice exec and we cannot re-open the fd since it was passed to us, already open, on fork and we don't have access to any actual file. Which once closed cannot be re-opened. This means we can't then exec ourselves without loosing the the memory map.

We can still wrestle control of the runtime and never return to AWS's event loop. So while we will then almost have full control, we are still in the address space of the original process. Which could reduce the overhead of Lambda invocations (though it won't save you money since it is already very easy to get below the 100ms minimum charge) but it isn't going to help with cold starts for example.

Discussion

Hacker News