Lambda, EFS, and the Serverless Framework


If you’ve been developing serverless applications for a while, you have probably found yourself facing a few challenges apart from the old cold-start problem – which has been solved to a great extent by the Provisioned Concurrency feature.

For instance, you may need to load large rule files consumed by a Lambda function that implements a rules engine, or keep data files produced dynamically by the function between invocations. Lambda provides some local space – 512 MB in /tmp – but it is small and ephemeral, so it is not helpful for those kinds of scenarios.

Other solutions come to mind – storing the data in RDS, DynamoDB, or S3 – but they come at a high price in development effort, performance, and cost. What would happen if we had peaks of several hundred – or thousand – requests per second, each loading big files at startup and writing files to a data store concurrently?

Well, at the very least we could take a significant performance hit, depending on the size of the files: the latency of retrieving the files at startup, plus the cold start of the Lambdas – enter Provisioned Concurrency – plus the latency of storing the intermediate files in the data store – and storing and retrieving from S3 is not the same as from DynamoDB.

So, is there no alternative? Well, we are in luck, as AWS released EFS support for Lambda in June 2020!

Image property of AWS

Amazon EFS is widely known, so I’m not going to delve into the service other than to mention that Amazon Elastic File System provides an NFS file system that scales on demand, offering high throughput and low latency. It is instrumental when shared storage and parallel access from several services are needed.

Configuration & Considerations

“With power comes responsibility”, or in our case, with powerful features come some configuration constraints. EFS runs in subnets within a VPC, which means that our Lambda functions have to run within a VPC too. That comes at a price: IP address consumption, a possible performance hit, and loss of direct connectivity to AWS global services; therefore, a NAT Gateway or VPC endpoints (PrivateLink / gateway endpoints) might be needed, depending on the use case.

That constraint was vastly improved last year when Hyperplane ENIs for Lambda were released, so that just a few ENIs – and therefore a few IP addresses – are enough to handle a large number of Lambda invocations, decoupling function scaling from ENI provisioning.

Configuration – Serverless Framework

The configuration of a Lambda function running within a VPC can be pretty simple – if it only needs to access VPC resources – as shown in the image below, under the vpc key:

Serverless framework YAML – Image MNube.org
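
As a reference, a minimal sketch of what that vpc block might look like in serverless.yml – the security group and subnet IDs are placeholders:

    provider:
      name: aws
      runtime: python3.8
      vpc:
        securityGroupIds:
          - sg-0123456789abcdef0       # security group attached to the Lambda ENIs (placeholder)
        subnetIds:
          - subnet-0123456789abcdef0   # private subnet A (placeholder)
          - subnet-0fedcba9876543210   # private subnet B (placeholder)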

A security group is needed for the Lambda function, together with the IDs of the subnet(s) where the ENI(s) will be placed, and permissions to create, delete, and describe network interfaces.

VPC Lambda – Image MNube.org
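
Those network-interface permissions are covered by the managed policy AWSLambdaVPCAccessExecutionRole; an equivalent inline statement in serverless.yml could be sketched as follows:

    provider:
      iamRoleStatements:
        - Effect: Allow
          Action:
            - ec2:CreateNetworkInterface
            - ec2:DescribeNetworkInterfaces
            - ec2:DeleteNetworkInterface
          Resource: "*"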

The Lambda function is now running within our VPC, with an ENI placed in each selected subnet, but to access the EFS instance a few permissions will need to be granted:

Role permissions for EFS and Lambda – Image MNube.org
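
A sketch of what those EFS permissions might look like in serverless.yml – the exact set depends on whether the function only reads or also writes, and the resource should ideally be scoped to the file system or access point ARN:

    provider:
      iamRoleStatements:
        - Effect: Allow
          Action:
            - elasticfilesystem:ClientMount
            - elasticfilesystem:ClientWrite
            - elasticfilesystem:DescribeMountTargets
          Resource: "*"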

Now the EFS file system can be created within the VPC. To do that, the console, CloudFormation, the Serverless Framework, the AWS CLI, the AWS SDK, etc. can be used.

EFS instance – Image MNube.org
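
If the file system is defined alongside the function, a sketch of the CloudFormation resources in the resources section of serverless.yml could look like this – subnet and security group IDs are placeholders, and one mount target is needed per subnet:

    resources:
      Resources:
        FileSystem:
          Type: AWS::EFS::FileSystem
          Properties:
            PerformanceMode: generalPurpose
        MountTargetA:
          Type: AWS::EFS::MountTarget
          Properties:
            FileSystemId:
              Ref: FileSystem
            SubnetId: subnet-0123456789abcdef0   # placeholder
            SecurityGroups:
              - sg-0123456789abcdef0             # placeholder, must allow NFS (TCP 2049)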

After creating the instance, an access point needs to be created to give applications access. This is a new resource type, “AWS::EFS::AccessPoint”. It can be created from the console or through a CloudFormation template – we will need to supply the EFS ID, referenced in the screenshot with a ${self:…} Serverless variable.

Serverless framework YAML – Image MNube.org
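
A sketch of that access point resource – the POSIX user and root directory are placeholder values that depend on how the file system should be exposed:

    resources:
      Resources:
        AccessPoint:
          Type: AWS::EFS::AccessPoint
          Properties:
            FileSystemId:
              Ref: FileSystem        # or the ID of an existing EFS instance
            PosixUser:
              Uid: "1000"
              Gid: "1000"
            RootDirectory:
              Path: /efs
              CreationInfo:
                OwnerUid: "1000"
                OwnerGid: "1000"
                Permissions: "0755"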

Finally, we link the file system to the Lambda function, providing the ARN of the EFS access point and the local mount path – as shown in the image below:

Image MNube.org
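
In serverless.yml this is the fileSystemConfig block of the function; the access point ARN below is a placeholder, and note that Lambda requires the local mount path to live under /mnt:

    functions:
      rulesEngine:                   # hypothetical function name
        handler: handler.list_files
        fileSystemConfig:
          arn: arn:aws:elasticfilesystem:eu-west-1:123456789012:access-point/fsap-0123456789abcdef0
          localMountPath: /mnt/efs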

The EFS instance is ready to be accessed by the Lambda function 🙂

Solution

I have used the Serverless Framework to build the solution – but AWS SAM with Cloud9, the official alternative, could have been used instead.

Architecture – MNube.org

Let’s create – or transfer – a rules file that can be accessed from the Lambda function 🙂

Different services could be used to transfer the files, such as AWS DataSync or an EC2 instance, or the files could even be created from code. Files transferred from an EC2 instance are accessible from the Lambda functions, so we’ll use this method.

After the EC2 instance has been created – a t2.micro is enough – in one of the subnets of the VPC with access to the EFS mount targets, a directory will be needed – /efs. That directory is not linked to the EFS instance yet, so we’ll need to mount it.

One way to do it is by using the EFS tools:

                     sudo yum install -y amazon-efs-utils

An access point was created previously that we can use to mount the directory. It’s easy to get the required command from the web console: go to Amazon EFS > Access points > select the access point, and press the Attach button:

EFS Mount – Image MNube.org
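
The command the console generates uses the amazon-efs-utils mount helper; with placeholder file system and access point IDs it looks roughly like this:

    sudo mkdir -p /efs
    sudo mount -t efs -o tls,accesspoint=fsap-0123456789abcdef0 fs-0123456789abcdef0:/ /efs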

After mounting the directory – in green – the files can be transferred to the /efs directory:

Mounting and creating files – Image MNube.org
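
As an example, the two files that will show up later in the Lambda’s listing could be created like this once the mount is in place – the contents are purely illustrative:

    echo "sample rule" | sudo tee /efs/rules.txt
    echo "hello from EC2" | sudo tee /efs/test.txt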

At this point, access to the directory from the Lambda function should be fully possible. I have coded a minimal Lambda function that lists the files contained in the directory:

Lambda function – Image MNube.org
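
The original function is shown as a screenshot; a minimal Python sketch with the same behaviour, assuming the /mnt/efs mount path configured above, could be:

    import json
    import os

    MOUNT_PATH = "/mnt/efs"  # must match the function's localMountPath

    def list_files(event, context):
        # List whatever is currently stored on the shared EFS volume
        files = os.listdir(MOUNT_PATH)
        return {
            "statusCode": 200,
            "body": json.dumps({"files": files}),
        }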

The solution is now ready to be deployed. Remember that I have only shown parts of the serverless.yml, equivalent to the CloudFormation template you might use to provision the infrastructure – I will leave that to you as an exercise.

serverless deploy --stage dev --region eu-west-1
Serverless Stack – Image MNube.org

The framework outputs a URL, as I created an API Gateway endpoint that invokes the Lambda function:

CloudWatch Logs – Image MNube.org

I have captured the request trace from the CloudWatch Logs, where we can see the files in /efs – test.txt and rules.txt – and the low latency of the request.

Other Use Cases

  • Loading large libraries that are too big for Lambda layers.
  • Files that are updated regularly.
  • Files that need locking for concurrent access.
  • Access to big files – zip / unzip.
  • Using different compute services – EC2 and ECS – to process the same files.