Thursday, March 4, 2021

The Reference Architecture Disappointment

There is a “phenomenon” that I have experienced throughout my career that I like to call the “Reference Architecture Disappointment”.

Some people experience a similar effect when they go to the doctor’s office with several symptoms, only to find out that they just have a common cold. No frenzy at the hospital, no crazy consultations, no House MD TV scenes. Just paracetamol, water and rest!

So many years of medical school just to prescribe that?

Well, yes. The MD recognised a common cold among dozens of illnesses with the same set of symptoms and prescribed the simplest and best treatment. The question is, would you be able to do it?

It’s the same thing when a Solutions Architect deals with a set of requirements. The “Architect” will select the best architecture that solves a business problem in the simplest and most efficient way possible. Sometimes, that means using the “Reference Architecture” for that particular problem, with the necessary changes.

Those architectures emerge from practical experience and encompass patterns and best practices. Usually, reinventing the wheel is not a good idea.

Keep it simple and Rock On!

GCP – Introducing GKE Autopilot

It’s no secret that Kubernetes has ignited a huge revolution in the industry since Google introduced it back in 2014, so I guess we don’t need to go there.

The GCP offering for Kubernetes is GKE, which provides a fully managed environment for orchestrating and deploying containers in the cloud.

Image property of gcp.com

GKE now offers two modes of operation:

In Standard mode the cluster is managed, but the infrastructure is configured and handled by the customer: scaling and node provisioning need to be configured, which provides a lot of flexibility.

Image captured from the console by the author

Autopilot mode has been introduced to provide a full and streamlined NoOps experience: GKE fully manages the infrastructure. Node provisioning and scaling are handled automatically for you – no more worries about the master and worker nodes. You lose some flexibility, though, but that’s the usual compromise.

Autopilot Cluster – Image property gcp.com

Creating a cluster with Autopilot

Image captured from the console by the author

I initially created a cluster in europe-north-1, but I had some problems deploying the pods to the cluster, so I changed it to usa-north, and it worked – I guess there were some region limitations at the moment, or some transient problems.

Image captured from the console by the author

The cluster has been created, now we need to deploy some pods – Image captured from the console by the author

After creating the cluster, I deployed a basic web container with a web service but no node configuration, which speeds up and simplifies the provisioning.

Per the documentation, Autopilot applies the following values for the pod’s resources:

Finally, I created a service to expose the endpoint to the world. I selected the LoadBalancer type because Autopilot doesn’t allow externalIPs; alternatively, you could use an Ingress.
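As a rough illustration of the kind of Service described above, the following Python sketch builds a LoadBalancer manifest as a plain dictionary and prints it as JSON. The service name, the `app: web` selector and the ports are assumptions made for the example; they are not the values used in the original deployment.

```python
import json


def load_balancer_service(name, app_label, port=80, target_port=8080):
    """Build a Kubernetes Service manifest of type LoadBalancer.

    All argument values are illustrative; use the labels and ports
    of your own Deployment.
    """
    return {
        "apiVersion": "v1",
        "kind": "Service",
        "metadata": {"name": name},
        "spec": {
            "type": "LoadBalancer",
            # Pods carrying this label receive the traffic.
            "selector": {"app": app_label},
            "ports": [
                {"protocol": "TCP", "port": port, "targetPort": target_port}
            ],
        },
    }


if __name__ == "__main__":
    # Hypothetical names, chosen only for this example.
    print(json.dumps(load_balancer_service("web-service", "web"), indent=2))
```

The printed JSON can be saved to a file and applied with `kubectl apply -f`, since kubectl accepts JSON manifests as well as YAML.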

Working cluster, 3 nodes + balancer – Image captured from the console by the author

And that’s all; our web app is ready. Autopilot automatically provisioned three nodes using e2-medium machines. After invoking the service a few times, the allocated resources were low, as shown in the image above.

I still need to do load testing – and cost calculation – with a complex application and see how the autoscaling behaves. But my initial impression is excellent: it couldn’t be easier to provision a Kubernetes cluster. Read more about it on the GCP blog.

Building Serverless Applications with Google Cloud Run

Cloud Run is GCP’s serverless container platform, launched back in 2019. I’m sure you get the idea: deploying apps, packaged in containers, very quickly in a fully managed environment.

After reading this book by Wietse Venema, and going through most of the hands-on examples, I can recommend it without any reservations – well, except for the price, but it’s a niche book after all 🙂

Right off the bat, I’ll tell you that it’s not an 800-page bible. It’s a relatively short book, very well written, concise and clear: around 160 pages, packed with concepts, real-life advice, and hands-on examples, catering to different audiences and proficiency levels. So you get short explanations about Docker as well as more advanced discussions about transaction concurrency and resource contention.

Notebook series by O’Reilly; other titles in my collection include “JBoss at Work” and “Spring Framework.”

I love the experience of moving through the book, creating my own reading path, and coming back to it many times. It’s the sort of experience that you can’t have with other kinds of media, and it actually anchors the information in your mind. The structure of the book makes that very easy, letting you adjust it to your experience level.

It reminds me a lot of a series of books released by O’Reilly back in the mid-2000s, the Notebook series. I own four of them; I could only find two around, and the other two are packed in containers – seriously 🙂

Other resources you may find useful

Official Docs: https://cloud.google.com/run/docs/?hl=es-AR

Samples: https://github.com/GoogleCloudPlatform/cloud-run-samples

Lab: https://www.qwiklabs.com/focuses/5162?parent=catalog

Quest: https://www.qwiklabs.com/quests/98

Verizon Media’s Data Warehouse migration to GCP

You probably remember how big a player Yahoo was in the big tech scene from the late ’90s through the 2000s, and I certainly still remember how their CEO rejected Microsoft’s $44.6 billion buyout offer in 2008 – ouch!

Now Yahoo is part of Verizon Media. They have just finished a massive migration of Hadoop and Enterprise Data Warehouse (EDW) workloads to Google Cloud’s BigQuery and Looker, which have become a big part of their MAW – Media Analytics Warehouse.

Looker – image from google.cloud.com

I don’t need to vouch for the power and flexibility of BigQuery as a tool; it is well known: real-time or batch analytics, data warehousing, or even machine learning, without having to move the data out for processing and just using SQL.

I’ve been using it lately in that capacity – BigQuery ML – and it’s really easy, even from Jupyter Notebooks:

%load_ext google.cloud.bigquery

%%bigquery
SELECT
    source_year AS year,
    COUNT(is_male) AS birth_count
FROM `bigquery-public-data.samples.natality`
GROUP BY year
ORDER BY year DESC
LIMIT 15
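Outside a notebook, the same query can be run with the BigQuery Python client. This is only a minimal sketch, assuming the `google-cloud-bigquery` package is installed and default credentials are configured; the helper names `natality_query` and `run_natality_query` are made up for this example.

```python
def natality_query(limit=15):
    """Return the SQL used in the notebook cell, with a configurable LIMIT."""
    if not isinstance(limit, int) or limit <= 0:
        raise ValueError("limit must be a positive integer")
    return (
        "SELECT source_year AS year, COUNT(is_male) AS birth_count\n"
        "FROM `bigquery-public-data.samples.natality`\n"
        "GROUP BY year\n"
        "ORDER BY year DESC\n"
        f"LIMIT {limit}"
    )


def run_natality_query(limit=15):
    """Run the query and return a list of (year, birth_count) tuples."""
    # Imported lazily so the query builder works without the library installed.
    from google.cloud import bigquery

    client = bigquery.Client()  # uses the default project and credentials
    rows = client.query(natality_query(limit)).result()
    return [(row["year"], row["birth_count"]) for row in rows]
```

Calling `run_natality_query()` from a script should return the same rows the `%%bigquery` cell displays, provided the environment is authenticated against a GCP project.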

Read more about Verizon’s migration in the following article:

https://cloud.google.com/blog/products/data-analytics/benchmarking-cloud-data-warehouse-bigquery-to-scale-fast

re:Invent 2020 playlist released on YouTube

AWS has released a playlist containing 35 videos from the last re:Invent on their YouTube channel.

Some favorites – just because 🙂

Machine Learning Keynote

Strong consistency for Amazon S3

NFL on using AWS to transform player safety

GCP Professional Cloud Architect BETA

Registration for the Professional Cloud Architect beta certification is now open, until some point in March 2021. The format of the exam is the same, but two new case studies have been added, two have been updated and one has been removed:

Good luck!

Google Certified Associate Cloud Engineer All In One Guide Review

A new book for preparing for Google’s Associate Cloud Engineer certification exam has been released. It belongs to the All-In-One series from McGraw Hill, which is one of my favourites; the books are usually excellent.

This one is no exception; I think it covers all the topics in the official exam guide. There is even a section that maps the objectives in the official guide to the chapters of the book, which is really useful. At the time of writing this post – December 2020 – both are almost identical:

  • Managing users in Cloud Identity (manually and automated) has replaced Linking users to G Suite Identities.
  • Deploying an application that receives Google Cloud events (e.g., Cloud Pub/Sub events, Cloud Storage object change notification events) has replaced Deploying a Cloud Function

Make sure that you check out the official guide for changes because Google updates it from time to time.

Regarding the changes, Cloud Identity is a big topic in the Security certification, and an exciting one, I have to say – so don’t miss it.

Objectives

The book covers the full official guide in eleven chapters. All the topics are quite up to date – including topics like Anthos and Cloud Run. But keep in mind this is just a guide to prepare for the exam. That means that you have to expand on each section, depending on your knowledge and experience of the subject, to be successful. You are not going to become an expert just by studying the guide and passing the exam.

It’s a nice introduction, though.

For instance, the chapter on Kubernetes Engine is quite nice and covers a lot of topics. But you are not going to learn Kubernetes just by reading the chapter, so I’d recommend reading the docs, books, getting real experience, or taking a course because if you are a beginner, you will be confused about the subject. And the chances are that you will find advanced questions on the subject – the exam is heavy on Kubernetes, as a matter of fact.

Or have a look at the chapter on App Engine. It’s a topic quite well covered and probably enough to answer many of the questions you may find on the exam. But you need to go deeper, create apps, and get some real experience if you don’t have it already.

Review and practice questions

Every chapter has a handful of review questions, from ten to fifteen, in a test format similar to the ones you would find in the exam – but on the easier side.

Don’t forget to check the sample questions from Google. Again, most of them are a bit easier than the ones you may find on the exam, but they are a good way to check your knowledge and gauge your readiness to take the exam.

Representative sample question, property of cloud.google.com

The book also provides, for free, online content that consists of one hundred practice questions.

Image taken by the author of the post, property of Total Seminars Training Hub

Once you have registered on the Total Seminars Training Hub site, you will have access to the Custom Test screen, where you can customize the testing experience: duration, number of questions, exam objectives and assistance.

Works well enough, simple but functional. The assistance and the explanations given about the topics are short but enough.

The questions are similar to those you’d encounter on the exam, topics and format, so it’s good practice. I’d say the actual questions are lengthier and a bit more difficult.

Conclusion

This is a good guide that can help you prepare for the exam. But you need to expand on every topic, depending on your experience and knowledge.

My advice is to use the guide – there is another guide from Google, but it’s a bit outdated – as a starting point, and use the documentation, labs, videos and real-life experience, not only to pass the exam but to round out and update your knowledge, so you can validate your professional experience.

After all, this is not a college exam, but a professional one!

The Learning Journey, Part II: The Dopamine Effect

Video killed the radio star – The Buggles, 1980.

Do you remember books? Yeah, those objects that you used to carry in your bag and that have been pushed aside by the video course frenzy – and the Internet. And I get it; video courses can be a fast and cheap way to gather information, and some of them are really good.

It seems that video killed the book too.

But there is more than meets the eye, so I can’t stress enough the value of books as a source of learning; in fact, I have been sharing online many of the books I use daily.

Let’s go deeper and find out what’s going on 🙂

The Book Way

Learning through books takes a much bigger effort than watching a video, much like reading a book versus watching the movie adaptation. But in exchange, you get a richer, non-linear, interactive experience, powered by your mind in a big way and keeping you much more focused on one task. The complex ideas that you can handle while reading a book are, in many ways, astonishing.

When you are watching a video, a very passive activity, the chances of being distracted by other things and multitasking grow exponentially: opening other tabs, reading emails, watching other videos, the notifications … the list has no end. After all, we are getting used to multitasking; that’s the Internet way.

Image taken by the author

That’s not a bad thing in itself, but it comes with a hefty price: loss of focus because of the constant distraction, which obviously leads to a loss of productivity in any endeavour that you are pursuing.

That could become a serious problem because, over time, our brain gets used to it as the normal way to function, rewarding the multitasking way and penalizing the single-task way of working. We become naturally distracted.

Ask yourself: when was the last time you watched a movie from start to end without checking your email or notifications? Do you get bored reading a book? Would you rather go back to watching videos or checking social media?

Well, that’s the dopamine talking.

Shots of dopamine

Dopamine is a neurotransmitter that motivates us to do things through instant gratification because it affects our reward and pleasure centre. When the brain anticipates that we will do something that gives us pleasure, it releases a certain amount, which depends on the task in question. Eating chocolate, watching a video, or playing a video game releases a huge amount of it in exchange for a small amount of energy. For instance, in the case of video games, the brain could generate a constant supply of dopamine, as chances are we will find new and exciting patterns in the game; novelty generates a lot of dopamine.

Dopamine is good, and we need it for survival. Still, the problem comes with the artificial release of it through low-value activities, such as dozens of hours spent on the Internet – nothing wrong, if done in moderation. As you have probably guessed by now, dopamine generates addiction, which can lead to loss of focus, concentration, time and productivity.

Dopamine: https://commons.wikimedia.org/wiki/User_talk:Jynto

That’s the reason you are not reading that many books anymore. Reading still generates dopamine, but just a small amount over time, naturally and healthily. It also requires a lot of energy and attention, leading you to deeper thinking. Your brain improves as a result.

It’s similar to eating an apple versus eating a doughnut: a release of natural sugar over time or a quick shot of refined sugar. Guess which one releases more dopamine.

It’s easy to get addicted to checking social media, playing video games or streaming, but not that easy to become a book addict. That can be a problem because it grows over time as our brain adapts to the new way of learning through dopamine. Your brain demands more and more, and you give it through automatic behaviour – like constantly checking social media, looking for new and exciting interactions.

I’m not surprised to see so many people with such a low frustration threshold these days. They want all the answers right away. No delay. No frustration. No struggle.

They want their dopamine shot.

The Learning Journey: dopamine fast

If you feel your brain is hungry for dopamine, and you can’t concentrate as you used to, or you haven’t read a book for ages, you should consider doing a dopamine fast. One way to start is spending one day a week without distractions: no social media, no mobile phone, no movies, no video games … you get the idea. Instead, read a book, go for a walk, do exercise, write, paint, etc … just analogue stuff. You will feel better over time; like any addiction, it takes time to get rid of.

My suggestion is to use video courses – videos in general – as a complement to your main learning, which should be a mix of books, documentation, posts, exercises, tests and practical projects. Watch only a few videos per session – short ones are preferred – take written notes and put them into practice.

Public Domain Picture

Transform a passive activity into an active one, and keep the healthy dopamine flowing 🙂

The Learning Journey, Part I

Temet Nosce in Greek

Temet Nosce,

Visitors would read that in awe when entering Apollo’s temple in Greece to visit the Oracle of Delphi. Neo would face the same phrase – over the door – when visiting the Oracle in The Matrix, the 1999 Warner Bros. movie. They were completely different epochs and visitors, but all of them had something in common: they were looking for answers.

Funny thing: that quote is all you were going to get from the so-called “Oracle”.

Let’s face it, going to someone else to find answers about oneself can be pretty deceiving. They could only give you a reflected image of yourself, and very likely a distorted one – most of the time based upon our social masks. Now it’s truer than ever because, in the age of Instagram, we are just showing our good and happy side in an endless stream of funny pictures – before COVID-19 anyway – and mindfulness quotes.

MGM, Public Domain Image

It’s US who should have all the answers, not some Oracle in some temple or in a look-alike New York kitchen 😉

Is the Oracle a scam?

Ask Dorothy and her friends in “The Wizard of Oz”, the 1939 movie by Victor Fleming.

After a long journey travelling through the magical Land of Oz and meeting the Wizard – the Oracle – they learned he had no answers for them because he was just “the man behind the curtain”, making the magical world go round. He was just a tool, a device, to create the discovery journey – or to create the Matrix, if you will. He is a mirror character to the Architect in the Matrix movie, even though the latter has many more gnostic overtones than the Wizard.

The Oracle is just an archetype, which is there to reflect the self and help visitors with introspection, not to give out the actual answers. The process of knowing oneself is, obviously, personal and unique.

The Cloud Journey

That’s a concept I read about all the time on LinkedIn, and probably some of you are familiar with it as well. I’m often asked about my “journey” to the Cloud, how to get a job in the industry, or, similarly, how I prepared for one specific certification. That’s a new phenomenon brought by social media, something that we could effectively call collaborative learning or mentoring.

It’s accurate to describe the learning process as “a journey”, though, because learning should be an adventure, exploring unknown territory that could eventually take you out of your comfort zone. It will allow us to discover things about the self, or Temet Nosce – the journey of self-discovery.

Maybe after all that time working in Development, you are now discovering that you enjoy working on Machine Learning or Analytics. And that’s something that one usually wouldn’t learn in the day job – that kind of opportunity rarely comes up. In fact, you need to make those opportunities happen, and a good way to do this is embarking on the “learning journey”.

Taking the first steps into the unknown

In the “Cloud Journey”, many people are taking their first steps using professional certifications. That’s a new phenomenon as well. Certifications are a good way to validate, update or enhance professional experience. But I’ve never seen newcomers take them as the first steps to access an industry at this scale.

Another way, related to the former, is taking one of the hundreds of pre-packaged video courses available on the different online learning platforms. They have been around for a long time now, but it’s only in the last few years that they have become so ubiquitous as a cheap way of self-learning and, in many cases, as shortcuts to achieve certifications. That defeats, in part, the journey of self-learning or discovery, as you are taking a fast highway instead of a secondary road. Certainly, driving on a highway is faster and safer, but you will learn much more by travelling on a secondary road. It’s tiresome, sure, but your driving skills will skyrocket.

Ideally, those courses should not be used as your main source of truth and learning, but as a complement to your own learning path.

Lambda, EFS, and the Serverless Framework

If you’ve been developing serverless applications for a while, I’m pretty sure you have found yourself facing a few challenges, apart from the old cold start thing – which has been solved to a great extent with the Provisioned Concurrency feature.

For instance, let’s say you need to load large rules files consumed by a Lambda function that implements a rules engine, or keep data files produced dynamically by the function between invocations. Lambda provides some local space – 512MB – that you may use, but it’s small and ephemeral, so it is not useful for those kinds of scenarios.

Other solutions come to mind, such as storing the files in a data store – RDS, DynamoDB, S3 … – but they come at a high price in development, performance and cost. What would happen if we had peaks of several hundred – or thousands – of requests per second, loading big files at startup and writing files to a data store concurrently?

Well, at the very least, we could have a big performance hit, depending on the size of the files, the latency of retrieving the files at startup plus the cold start of the Lambdas – enter provisioned concurrency – plus the latency of storing the intermediate files in the data stores – storing and retrieving from S3 is not the same as from DynamoDB.

So no alternative? Well, we are in luck, as AWS released EFS support for Lambda in June!

Image property of AWS

Amazon EFS is widely known, so I’m not going to delve deep into the service, other than to mention that Amazon Elastic File System provides an NFS file system that scales on demand, providing high throughput and low latency. It’s instrumental when shared storage and parallel access from several services are needed.

Configuration & Considerations

“With power comes responsibility”, or in our case, with powerful features come some configuration constraints. EFS runs in different subnets within a VPC, which means that our Lambda functions have to run within a VPC. That comes with a price: IP address consumption, a possible performance hit, and loss of connectivity to AWS global services; therefore, a NAT Gateway or VPC endpoints (PrivateLink / Gateway) might need to be used, depending on the use case.

That constraint was vastly improved last year when Hyperplane ENIs for Lambda were released, so that just a few ENIs – and therefore a few IPs – are enough to handle a large number of Lambda invocations, decoupling function scaling from ENI provisioning.

Configuration – Serverless Framework

The configuration of a Lambda function running within a VPC can be fairly simple – if it only needs to access the VPC resources – as shown in the image below, under the VPC label:

Serverless framework YAML – Image MNube.org

The Lambda function needs a security group, the IDs of the subnet(s) where the ENI(s) will be placed, and permissions to create, delete, and describe network interfaces.

VPC Lambda – Image MNube.org
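The network-interface permissions mentioned above can be sketched as an IAM policy document. The snippet below is an illustration built as a Python dictionary, not the policy from the original serverless.yml; the wildcard resource is an assumption kept for brevity.

```python
import json

# Actions Lambda needs to manage its elastic network interfaces in a VPC.
VPC_ENI_ACTIONS = [
    "ec2:CreateNetworkInterface",
    "ec2:DescribeNetworkInterfaces",
    "ec2:DeleteNetworkInterface",
]


def vpc_access_policy():
    """Return an IAM policy document granting the ENI permissions."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            # "*" is used here only to keep the example short.
            {"Effect": "Allow", "Action": VPC_ENI_ACTIONS, "Resource": "*"}
        ],
    }


if __name__ == "__main__":
    print(json.dumps(vpc_access_policy(), indent=2))
```

AWS also ships a managed policy, AWSLambdaVPCAccessExecutionRole, that covers these actions, which may be simpler than maintaining a custom statement.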

The Lambda function is now running within our VPC, with an ENI placed in each selected subnet, but to access the EFS instance, a few permissions will need to be provided:

Role permissions EFS, Lambda – Image MNube.org
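For readers who cannot see the image, the EFS client permissions typically look like the statement below. It is a hedged sketch: `elasticfilesystem:ClientMount` and `elasticfilesystem:ClientWrite` are the standard IAM actions for mounting and writing, but the helper name and the ARN in the usage note are placeholders, not the author's configuration.

```python
def efs_client_statement(file_system_arn, read_only=False):
    """Build an IAM statement that lets a function mount an EFS file system.

    With read_only=True only the mount action is granted; otherwise
    write access is added as well.
    """
    actions = ["elasticfilesystem:ClientMount"]
    if not read_only:
        actions.append("elasticfilesystem:ClientWrite")
    return {"Effect": "Allow", "Action": actions, "Resource": file_system_arn}
```

For example, `efs_client_statement("arn:aws:elasticfilesystem:eu-west-1:123456789012:file-system/fs-12345678")` (a made-up ARN) yields a statement that can be appended to the function role's policy.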

Now the EFS can be created within the VPC. To do that, the console, CloudFormation, Serverless, the AWS CLI, the AWS SDK, etc … could be used.

EFS instance – Image MNube.org

After creating the instance, an access point needs to be provided to give applications access. This is a new resource: “AWS::EFS::AccessPoint”. It can be created from the console or through a CloudFormation file – we will need to supply the EFS ID: ${self.provider}.

Serverless framework YAML – Image MNube.org
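As a sketch of what such an access point resource contains, the function below assembles an `AWS::EFS::AccessPoint` definition as a Python dictionary. The root path, the POSIX user IDs and the permissions are assumptions for illustration; the values in the original file are only visible in the image.

```python
def access_point_resource(file_system_id, path="/lambda", uid=1000, gid=1000):
    """Build a CloudFormation resource for an EFS access point.

    file_system_id may be a literal ID or an intrinsic such as
    {"Ref": "FileSystem"}; path, uid and gid are illustrative defaults.
    """
    owner = {"OwnerUid": str(uid), "OwnerGid": str(gid), "Permissions": "0755"}
    return {
        "Type": "AWS::EFS::AccessPoint",
        "Properties": {
            "FileSystemId": file_system_id,
            # Identity enforced for every request made through the access point.
            "PosixUser": {"Uid": str(uid), "Gid": str(gid)},
            # Directory exposed as the root; created with CreationInfo if missing.
            "RootDirectory": {"Path": path, "CreationInfo": owner},
        },
    }
```

The resulting dictionary maps one-to-one onto the YAML that would go under the `resources` section of a serverless.yml or a CloudFormation template.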

Finally, we link the file system to the Lambda function, providing the ARN of the EFS, the ARN of the access point, and the local mount path – as shown in the image below:

Image MNube.org
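The link between function and file system boils down to a small configuration object. In the Lambda API it carries the access point ARN and a local mount path, which has to live under /mnt/. The helper below is a sketch under those assumptions; its name and the default `/mnt/efs` path are illustrative.

```python
import re

# Lambda only accepts mount paths under /mnt/.
_MOUNT_PATH = re.compile(r"^/mnt/[a-zA-Z0-9\-_.]+$")


def file_system_config(access_point_arn, local_mount_path="/mnt/efs"):
    """Build one entry for the function's FileSystemConfigs setting."""
    if not _MOUNT_PATH.match(local_mount_path):
        raise ValueError("local mount path must look like /mnt/<name>")
    return {"Arn": access_point_arn, "LocalMountPath": local_mount_path}
```

With boto3, a list containing this dictionary could be passed as the `FileSystemConfigs` argument of the Lambda client's `update_function_configuration` call; in a serverless.yml the equivalent keys are set declaratively, as in the image above.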

The EFS instance is ready to be accessed by the Lambda function 🙂

Solution

I have used the Serverless framework to produce the solution – but AWS SAM with Cloud9, the official alternative, could have been used instead. I have quite a lot of experience with Serverless, having introduced it to a few companies – including Everis – with great success.

Architecture – MNube.org

Let’s create – or transfer – a rules file that can be accessed from the Lambda function 🙂

Different services could be used to transfer the files, like AWS DataSync, an EC2 instance, or even creating files from code. Files transferred from an EC2 instance are accessible from the Lambda functions, so we’ll use this method.

After the EC2 instance has been created – a t2.micro is enough – in one of the subnets of the VPC that has access to the EFS ENIs, a directory will be needed – /efs. That directory isn’t linked to the EFS instance yet, so we’ll need to mount the file system on it.

One way to do it is using the EFS tools:

sudo yum install -y amazon-efs-utils

An access point was created previously that we can use to mount the directory. It’s easy to get the command line needed from the web console. Just go to the Amazon EFS > Access Point > id link and press the Attach button:

EFS Mount – Image MNube.org

After mounting the directory – in green – the files can be transferred to the /efs directory:

Mounting and creating files – Image MNube.org

At this point, the directory should be fully accessible from the Lambda function. I have coded a minimal Lambda function that lists the files contained in the directory:

Lambda function – Image MNube.org
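Since the function itself is only shown as an image, here is a minimal Python sketch of a handler that lists the files on the mounted file system. It is not the author's code: the runtime, the `EFS_MOUNT_PATH` environment variable and the `/mnt/efs` default are assumptions for this example.

```python
import json
import os

# Mount path configured for the function; /mnt/efs is an assumed value.
MOUNT_PATH = os.environ.get("EFS_MOUNT_PATH", "/mnt/efs")


def list_files(directory):
    """Return the sorted names of the regular files in a directory."""
    return sorted(
        entry.name for entry in os.scandir(directory) if entry.is_file()
    )


def handler(event, context):
    """API Gateway proxy handler that returns the files found on EFS."""
    try:
        body = {"path": MOUNT_PATH, "files": list_files(MOUNT_PATH)}
        status = 200
    except OSError as error:
        # For example, the file system is not mounted at the expected path.
        body = {"error": str(error)}
        status = 500
    return {"statusCode": status, "body": json.dumps(body)}
```

Keeping `list_files` separate from the handler makes it easy to exercise the listing logic locally against any directory before deploying.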

The solution is now ready to be deployed. Keep in mind that I have only shown parts of the serverless.yml, equivalent to the CloudFormation file you might use to provision the infrastructure – I will leave that to you as an exercise.

serverless deploy --stage dev --region eu-west-1

Serverless Stack – Image MNube.org

The framework provides a URL, as I created an API Gateway that invokes the Lambda function:

CloudWatch Logs – Image from MNUBE.org

I have captured the request trace from the CloudWatch Logs, where we can see the files in /efs, test.txt and rules.txt, and the low latency of the request.

Other Use Cases

  • Loading big libraries that Lambda layers can’t handle.
  • Files that are updated regularly.
  • Files that need locks for concurrent access.
  • Access to big files – zip / unzip.
  • Using different computing architectures – EC2, ECS – to process the same files.
