An AWS Summer: Lambda, EFS, and the Serverless Framework

The autumn equinox has just passed, which is a perfect moment to look back and review some of the features AWS released this summer – in no particular order, just because I think they are cool and useful 🙂

Serverless challenges

If you’ve been developing serverless applications for a while, you have surely faced a few challenges, beyond the old cold start issue – which has been solved to a great extent by the Provisioned Concurrency feature.

For instance, let’s say you need to load large rule files consumed by a Lambda function that implements a rules engine, or you need to keep data files produced dynamically by the function between invocations. Lambda provides some local space – 512 MB – that you may use, but it’s small and ephemeral, so it’s not useful for those kinds of scenarios.

Other solutions come to mind – storing in databases or object stores: RDS, DynamoDB, S3 … – but each comes at a high price in development effort, performance and cost. What would happen if we had peaks of several hundreds – or thousands – of requests per second, loading big files at startup and writing files to a data store concurrently?

Well, at the very least, we could take a big performance hit, depending on the size of the files: the latency of retrieving the files at startup, plus the cold start of the Lambdas – enter Provisioned Concurrency – plus the latency of storing the intermediate files to the data stores – storing and retrieving from S3 is not the same as from DynamoDB.

So, is there no alternative? Well, we are in luck, as AWS released EFS support for Lambda in June!

Image property of AWS

Amazon EFS is widely known, so I’m not going to delve deeply into the service; suffice it to say that Amazon Elastic File System provides an NFS file system that scales on demand, with high throughput and low latency. It’s very useful when shared storage and parallel access from several services are needed.

Configuration & Considerations

“With power comes responsibility”, or in our case, with powerful features come some configuration constraints. EFS runs in different subnets within a VPC, which means that our Lambda functions have to run within a VPC as well. That comes at a price: IP address consumption, a possible performance hit, and loss of connectivity to AWS global services, so a NAT Gateway or PrivateLink / gateway endpoints might be needed, depending on the use case.

That constraint was vastly improved last year when Hyperplane ENIs for Lambda were released: just a few ENIs – and therefore a few IPs – are enough to handle a large number of Lambda invocations, decoupling function scaling from ENI provisioning.

Configuration – Serverless Framework

The configuration of a Lambda function running within a VPC can be fairly simple – if it only needs to access VPC resources – as shown in the image below, under the vpc label:

Serverless framework YAML – Image MNube.org
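As an illustration – a minimal sketch of what such a vpc section in serverless.yml might look like; the security group and subnet IDs are placeholders:

```yaml
provider:
  name: aws
  runtime: python3.8
  vpc:
    securityGroupIds:
      - sg-0123456789abcdef0         # security group attached to the function
    subnetIds:
      - subnet-0aaaaaaaaaaaaaaaa     # subnets where the ENIs will be placed
      - subnet-0bbbbbbbbbbbbbbbb
```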

A security group is needed for the Lambda function, along with the IDs of the subnet(s) where the ENI(s) will be placed, and permissions to create, delete and describe network interfaces.

VPC Lambda – Image MNube.org

The Lambda function is now running within our VPC, with an ENI placed in each selected subnet, but in order to access the EFS instance a few permissions need to be granted:

Role permissions for EFS and Lambda – Image MNube.org
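A sketch of how those permissions could be declared in serverless.yml with iamRoleStatements – the ENI actions cover VPC networking, and the EFS client actions cover mounting and writing:

```yaml
provider:
  iamRoleStatements:
    # ENI management, required for any Lambda running inside a VPC
    - Effect: Allow
      Action:
        - ec2:CreateNetworkInterface
        - ec2:DescribeNetworkInterfaces
        - ec2:DeleteNetworkInterface
      Resource: "*"
    # EFS client access, required to mount and write through the access point
    - Effect: Allow
      Action:
        - elasticfilesystem:ClientMount
        - elasticfilesystem:ClientWrite
      Resource: "*"
```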

Now the EFS file system can be created within the VPC. To do that, the console, CloudFormation, Serverless, the AWS CLI, the AWS SDK, etc. can be used.

EFS instance – Image MNube.org

After creating the file system, an access point needs to be provided to allow applications access. This is a new resource type: “AWS::EFS::AccessPoint”. It can be created from the console or through a CloudFormation template – we will need to supply the EFS ID: ${self.provider}.

Serverless framework YAML – Image MNube.org
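As a hedged sketch – not the exact file from the post – the access point could be declared in the resources section of serverless.yml; the file system ID, POSIX user and root directory below are illustrative placeholders:

```yaml
resources:
  Resources:
    EfsAccessPoint:
      Type: AWS::EFS::AccessPoint
      Properties:
        FileSystemId: fs-0123456789abcdef0   # placeholder EFS ID
        PosixUser:
          Uid: "1000"
          Gid: "1000"
        RootDirectory:
          Path: /lambda
          CreationInfo:
            OwnerUid: "1000"
            OwnerGid: "1000"
            Permissions: "755"
```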

Finally, we link the file system to the Lambda function, providing the ARN of the EFS file system, the ARN of the access point, and the local mount path – as shown in the image below:

Image MNube.org
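A minimal sketch of that function section, assuming the fileSystemConfig support the Serverless Framework added for Lambda/EFS – the access point ARN is a placeholder, and note that Lambda requires the local mount path to start with /mnt:

```yaml
functions:
  listFiles:
    handler: handler.handler
    events:
      - http:
          path: files
          method: get
    fileSystemConfig:
      # ARN of the access point created above (placeholder account and IDs)
      arn: arn:aws:elasticfilesystem:eu-west-1:123456789012:access-point/fsap-0123456789abcdef0
      # Lambda mounts EFS under /mnt/...
      localMountPath: /mnt/efs
```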

The EFS instance is ready to be accessed by the Lambda function 🙂

Solution

I have used the Serverless Framework to produce the solution – though AWS SAM with Cloud9, as the official alternative, could have been used instead. I have quite a lot of experience with Serverless, having introduced it to a few companies – including Everis – with great success.

Architecture – MNube.org

Let’s create – or transfer – a rules file that can be accessed from the Lambda function 🙂

Different services could be used to transfer the files, like AWS DataSync, an EC2 instance, or even creating the files from code. Files transferred from EC2 are accessible from the Lambda functions, so we’ll use this method.

After the EC2 instance has been created – a t2.micro is enough – in one of the subnets of the VPC with access to the EFS ENIs, a directory will be needed – /efs. That directory doesn’t have any link to the EFS instance yet, so we’ll need to mount it.

One way to do it is using the Amazon EFS utilities:

                     sudo yum install -y amazon-efs-utils

An access point was created previously that we can use to mount the directory. It’s easy to get the command needed from the web console: just go to Amazon EFS > Access Points > id link, and press the Attach button:

EFS Mount – Image MNube.org
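The commands follow this general shape – the fs- and fsap- IDs below are placeholders; the Attach dialog shows the real ones for your file system:

```shell
# Create the local directory and mount the file system on it
# through the access point, with TLS enabled
sudo mkdir -p /efs
sudo mount -t efs -o tls,accesspoint=fsap-0123456789abcdef0 fs-0123456789abcdef0 /efs
```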

After mounting the directory – in green – the files can be transferred to the /efs directory:

Mounting and creating files – Image MNube.org

At this point, the Lambda function should have full access to the directory. I have coded a minimal Lambda function that lists the files contained in the directory:

Lambda function – Image MNube.org
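A minimal sketch of such a function – the EFS_PATH environment variable and its default are assumptions matching the mount path configured for the function:

```python
import json
import os

# Path where the EFS file system is mounted inside the Lambda runtime.
# Assumption: it matches the localMountPath configured for the function.
EFS_PATH = os.environ.get("EFS_PATH", "/mnt/efs")


def list_files(path):
    """Return the sorted names of the entries found under the directory."""
    return sorted(os.listdir(path))


def handler(event, context):
    """Minimal Lambda handler: list the files stored on the EFS mount."""
    return {
        "statusCode": 200,
        "body": json.dumps({"files": list_files(EFS_PATH)}),
    }
```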

The solution is now ready to be deployed. Keep in mind that I have only shown parts of the serverless.yml, equivalent to the CloudFormation template you might use to provision the infrastructure – I will leave that to you as an exercise.

                serverless deploy --stage dev --region eu-west-1
Serverless Stack – Image MNube.org

A URL is provided by the framework, as I created an API Gateway endpoint that invokes the Lambda function:

Cloudwatch Logs – Image from MNUBE.org

I have captured the request trace from the CloudWatch Logs, where we can see the files in /efs – test.txt and rules.txt – and the low latency of the request.

Other Use Cases

  • Loading big libraries that Lambda layers can’t handle.
  • Files that are updated regularly.
  • Files that need locks for concurrent access.
  • Access to big files – zip / unzip.
  • Using different computing architectures – EC2, ECS – to process the same files.

AWS Certified Developer Reloaded

I’m going to share my recent experience with the re-certification – June 2020 – of the AWS Certified Developer, one of my favorites without a doubt. An experience very different from the previous one since, if my memory serves me well, I didn’t find a single repeated question.

The structure of the exam is the usual one for the associate level: 2 hours and 65 questions, with a format that has evolved even further towards scenario-type questions. I don’t recall any direct questions, and certainly no extremely easy ones. That said, it seems to me a much more balanced exam than the previous version, where some services had much more weight than others – API Gateway, I’m looking at you.

Virtually all the Core / Serverless services – the important ones – are represented in the exam:

  • S3
  • In-memory databases: ElastiCache, Memcached, Redis
  • Databases: RDS, DynamoDB …
  • Security: KMS, policies …
  • CI / CD, IaC: Elastic Beanstalk, CodePipeline, CloudFormation …
  • Serverless: Lambda functions, API Gateway, Cognito …
  • Microservices: SQS, SNS, Kinesis, Containers, Step Functions …
  • Monitoring: CloudWatch, CloudWatch Logs, CloudTrail, X-Ray …
  • Optimization: cost control, Auto Scaling, Spot Fleets …

Developer is the Serverless certification par excellence, although some services, such as Step Functions or Fargate containers, are poorly represented – just one or two questions, and of high difficulty.

Serverless is a great option for IoT systems

Prerequisites and recommendations

I will not repeat the information that is already available on the AWS website; instead, I will give my recommendations and personal observations.

Professionals with experience in serverless development – especially on AWS – in microservices, or with React-style applications will be the most comfortable preparing for and facing this certification.

  • AWS Experience. This certification is suitable even for professionals with little or no experience on AWS. I’d recommend getting the AWS Certified Cloud Practitioner first, though.
  • Dev Experience. It’s essential to possess a certain level, since many of the questions are eminently practical and draw on real development experience. Knowledge of programming languages like Python, JavaScript or Java is very desirable. The exam poses programming problems indirectly, through concepts, debugging and optimization. The lack of this knowledge or experience gives many professionals the impression that this certification is very difficult, when in my opinion it is not.
  • Architecture Experience. The exam is largely focused on the development of Cloud applications, especially Serverless – Microservices. However, some questions may require knowledge of Cloud / Serverless / Containers architecture patterns.
  • DevOps Experience. Concepts such as CI / CD and infrastructure or configuration as code are of great importance today, and this is reflected in the exam. Obviously, the questions focus – for the most part – on AWS products, but knowledge of other products like Docker, Jenkins, Spinnaker or Git, and of general principles, can go a long way. Let’s not forget that this certification, together with SysOps, is part of the recommended path to the AWS DevOps Pro certification, and obtaining that one automatically re-certifies the two previously mentioned.

“Neo, knowing the path is not the same as walking it” – Morpheus. The Matrix, 1999


Image aws.amazon.com

AWS Technical Essentials: introductory, low-level course. Live remote or in person.

Developing on AWS: course focused on developing AWS applications using the SDK. Intermediate level, with an agenda quite relevant to the certification. Live remote or in person. Not free.

Advanced Developing on AWS: interesting course, but focused on AWS architecture: migrations, re-architecting, microservices … Live remote or in person. Not free.

Exam Readiness Developer: essential. Free and digital.

AWS Certified Cloud Practitioner: official certification, especially aimed at professionals with little knowledge of the Cloud in general, and AWS in particular.

Exam

As I have previously mentioned, the exam format is similar to most certifications, associate or not: “scenario based”, and in this case of medium to medium-high difficulty. You are not going to find “direct” or excessively simple questions. As it is an associate-level exam, each question focuses on a single topic; that is, if the question is about DynamoDB, it will not contain cross-cutting concerns such as security.

Let’s examine a question taken from the certification sample questionnaire:

A very representative question of the medium-to-high difficulty of the exam. We are talking about a development-oriented certification, so you will find questions about development, APIs, configuration, optimization and debugging. In this case, we are presented with a real example of configuring and designing indexes for a DynamoDB table.

DynamoDB is an integral part of the AWS Serverless offering and its flagship database – with permission from Aurora Serverless. It is a low-latency NoSQL database, ideal for IoT, events, time series, etc. Its purely serverless nature allows its use without the need to provision and manage servers, or to place it within a VPC. This is a great advantage when accessing it directly from Lambda functions, since they don’t need to “live” within a VPC, with the added expense of resource management and possible performance problems – “enter Hyperplane”.

DynamoDB hardly appears in the new AWS Databases certification, so I’d recommend studying it in depth for this certification, due to the number of questions that may appear.

Services to study in detail

The following services are of great importance – not just for passing the certification – so I highly recommend studying them in depth.

Image aws.amazon.com
  • AWS S3 – Core service. It appears consistently across all certifications. Use cases, security, encryption, API, development and debugging.
  • Security – It appears consistently in all certifications: KMS encryption, Certificate Manager, AWS CloudHSM, federation, Active Directory, IAM, policies, roles, etc.
  • AWS Lambda – Use cases, creation, configuration-sizing, deployment, optimization, debugging and monitoring (X-RAY).
  • AWS DynamoDB – Use cases, table creation, configuration, optimization, indexes, API, DAX, DynamoDB Streams.
  • AWS API Gateway – Use cases, configuration, API, deployment, security and integration with S3, Cognito and Lambda. Optimization and debugging.
  • AWS ElastiCache – Use cases, configuration-sizing, API, deployment, security, optimization and debugging. It weighs heavily on the exam – at least in my question set.
  • AWS Cognito – Use cases, configuration and integration with other Serverless and federation services. Concepts like SAML, OAuth, Active Directory, etc. are important for the exam.
  • AWS CloudFormation – Use cases, configuration, creation of scripts, knowledge of the nomenclature / CLI commands.
  • AWS SQS – Use cases, architecture, configuration, API, security, optimization and debugging. Questions of different difficulty levels may appear.

Very important services to consider

  • AWS SNS – Knowledge of use cases at architecture level, configuration, endpoints, integration with other Serverless services.
  • AWS CLI – Average knowledge of the different commands and their nomenclature. Not many appeared in my question set, but in any case, it is very positive to be comfortable at the command-line level.
  • AWS Kinesis – Some questions in this version of the exam are more complex than in the previous one. Use cases, configuration, sizing, KPL, KCL, API, debugging and monitoring.
  • AWS CloudWatch, Events, Logs – It appears consistently across all certifications. Knowledge of architecture, configuration, metrics, alarms, integration, use cases.
  • AWS X-Ray – Use cases, configuration, instrumentation and installation in different environments.
  • AWS CodePipeline, CodeBuild, CodeDeploy, CodeCommit, CodeStar – High-level operation, architecture, integration and use cases. I’d recommend an in-depth study of CodePipeline and CodeBuild.
  • AWS ELB / Certificates – Use cases, ELB types, integration, debugging, monitoring, security – certificate installation.
  • AWS EC2, Autoscaling – Use cases, integration with ELB.
  • AWS Elastic Beanstalk – Architecture, use cases, configuration, debugging and deployment types – very important for the exam: All at Once, Rolling, etc.
  • AWS RDS – One of the star services of AWS and of the Databases certification. Here it makes a limited appearance: use cases, configuration, integration – caches – debugging and monitoring.

Other Services

  • AWS Networking – Architecture and basic network knowledge: VPC, security groups, Regions, Zones, VPN … These appear in a general and limited way compared to the rest of the certifications. It is one of the reasons why this certification is ideal for beginners: network architecture on AWS can be a very complex and dry topic.
  • AWS Step Functions – A service widely used in the business environment, but which appears only occasionally in certifications. I recommend studying its architecture, use cases and nomenclature – the questions are not easy.
  • AWS SAM – Use cases, configuration and deployment. SAM CLI commands.
  • AWS ECS / Fargate – Its presence in the certifications is quite disappointing – even more so when compared to Google Cloud’s certifications, where Kubernetes – GKE – plays a main role (logical, since it’s Google’s native technology). I’d recommend studying the architecture, use cases – microservices – configuration, integration and monitoring (X-Ray).
  • AWS CloudFront – General operation and use cases. Integration with S3.
  • AWS Glue – General operation and use cases.
  • AWS EMR – General operation and use cases.
  • AWS Data Pipeline – General operation and use cases.
  • AWS CloudTrail – General operation and use cases.
  • AWS GuardDuty – General operation and use cases.
  • AWS Secrets Manager – General operation and use cases.

Essential Resources

  • AWS Certification Website.
  • Sample questions
  • Readiness course – recommended, with additional practice questions.
  • AWS Whitepapers – “Storage Services Overview”, “Hosting Static Websites on AWS”, “In-Memory Processing in the Cloud with Amazon ElastiCache”, “Serverless Architectures with AWS Lambda”, “Microservices”.
  • FAQs – especially for Lambda, API Gateway, DynamoDB, Cognito, SQS and ElastiCache.
  • AWS Compute Blog
  • Practice Exam – highly recommended, level of difficulty representative of the exam.

Laboratories

I’d like to propose an incremental practical exercise, put together by me, that can be useful when preparing for the exam.

Serverless Web App

Image aws.amazon.com
  • Create a static website and host it on S3. Use the AWS CLI and APIs to create a bucket and copy the contents.
  • Create a repository with CodeCommit and upload the Web files to it.
  • Integrate S3 and CloudFront – creating a Web distribution.
  • Create a serverless backend with API Gateway, Lambda and DynamoDB – or alternatively Aurora Serverless – using CloudFormation and the AWS SAM model.
  • Code the Lambda functions with one of the supported runtimes – Python, JavaScript, Java … – and use Boto3 to insert into and read from DynamoDB. Each Lambda will correspond to an API Gateway method, accessible from the Web.
  • Integrate X-Ray to trace Lambdas.
  • Create the Stack from the console.
  • Upload the generated YAML files to CodeCommit.
  • Optional: create a pipeline using CodePipeline and CodeCommit.
  • Optional: integrate Cognito with API Gateway to authenticate, manage, and restrict API usage.
  • Optional: replace DynamoDB with RDS and integrate ElastiCache.
  • Optional: add an SQS queue, fed from a Lambda. Create another Lambda that consumes the queue periodically.
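The DynamoDB part of the backend above could be sketched like this – the table name, handler names and event shape are hypothetical; boto3 is imported lazily inside the handlers so the pure helper can be unit-tested without AWS access:

```python
import json

# Hypothetical table name for this lab; replace it with the table
# your CloudFormation / SAM stack actually creates.
TABLE_NAME = "webapp-items"


def make_item(event):
    """Pure helper: build the DynamoDB item from the incoming event."""
    return {"id": str(event["id"]), "payload": event.get("payload", "")}


def put_handler(event, context):
    """Lambda handler backing the API Gateway write method."""
    import boto3  # bundled in the Lambda runtime
    table = boto3.resource("dynamodb").Table(TABLE_NAME)
    table.put_item(Item=make_item(event))
    return {"statusCode": 200, "body": json.dumps({"stored": str(event["id"])})}


def get_handler(event, context):
    """Lambda handler backing the API Gateway read method."""
    import boto3
    table = boto3.resource("dynamodb").Table(TABLE_NAME)
    resp = table.get_item(Key={"id": str(event["id"])})
    return {"statusCode": 200, "body": json.dumps(resp.get("Item"))}
```

Tracing these handlers with X-Ray, as suggested above, is then a matter of instrumenting the boto3 calls.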

Is it worth it?

Certifications are a good way not only to validate knowledge externally, but also to gather updated information, validate good practices, and consolidate knowledge with real (or almost real) practical cases.

Obtaining the AWS Certified Developer seems to me a “no brainer” in most cases, as I explained previously in another post, and in this one.

Good luck to everyone!