python | FLRNKS

Identity & Access Management

Mon, 03 Feb 2020 11:11:00 +0000

INTRODUCTION

In this post I show how the Identity and Access Management service in the AWS Public Cloud works to secure resources and workloads. It is a very important topic, because it underpins all of the security that is needed for hosting one’s resources in the public cloud.

At the end of the day, the cloud is just a concept that offers a convenient illusion of dedicated resources, but in reality it’s just some process that runs on someone else’s hardware, so one has to be absolutely sure about security before trusting it and running their business-critical workloads on it.

It is enough to do a quick Google search for "unsecured S3 bucket" to see plenty of examples of administrators failing to properly harden and configure their AWS resources, and falling victim to accidental disclosure of often business-critical information.

IAM exists in the realm of AWS Cloud as a standalone service, providing various ways in which access to resources and workloads can be restricted. For example, if someone has an S3 bucket for storing arbitrary data, one can use IAM policies to restrict access to data stored in the bucket based on various criteria such as user identity, connection source IP, VPC environment and so on. S3 is a convenient service to demonstrate IAM capabilities, because it is very easy to grasp the result of restrictions: access to files in an S3 bucket is either granted or denied.

HOW IT WORKS

To illustrate how IAM works, I created a Lambda function in Python (Lambda being the AWS service for running server-less functions) and implemented a routine that tries to access some data stored in a particular S3 bucket. By default the Lambda runs with an IAM role that has only read-only permission to the bucket. This is verified by making an API call with the boto3 package, which returns without any error. Next the Lambda tries to write some new data to the bucket, but this fails because the IAM role does not have write permission to the S3 bucket.

To get around this, I use boto3 to make an AWS Security Token Service (STS) call and assume a new role that has the necessary read-write access. Using this new role the program demonstrates that it can write to the bucket as expected. Below is a sample output of the Lambda function in action:

=== Checking IAM Identity ===
ARN: arn:aws:sts::ACCOUNT_ID:assumed-role/Base-Lambda-Custom-Role/lambda

=== Testing Read access to S3 file in bucket ===
{
 "field1": true,
 "field2": 1.4107917E7
}

=== Testing Write access to S3 bucket ===
Error: AccessDenied!

=== Assumed New IAM Identity ===
ARN: arn:aws:sts::ACCOUNT_ID:assumed-role/S3-RW-Role/lambda

=== Testing Write access to S3 bucket (using new role) ===
... file was written successfully!
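
The role switch shown in the output above can be sketched roughly as follows. This is a minimal illustration, not the code from the repository: the helper names, the default session name and the way the role ARN is passed in are assumptions. Only the STS `assume_role` call and the credential keyword arguments of `boto3.client` are standard boto3.

```python
def assume_role_credentials(sts_client, role_arn, session_name="lambda"):
    """Ask STS for temporary credentials of `role_arn` and return them
    as keyword arguments accepted by boto3.client() / boto3.Session()."""
    response = sts_client.assume_role(
        RoleArn=role_arn,
        RoleSessionName=session_name,
    )
    creds = response["Credentials"]
    return {
        "aws_access_key_id": creds["AccessKeyId"],
        "aws_secret_access_key": creds["SecretAccessKey"],
        "aws_session_token": creds["SessionToken"],
    }


def write_with_assumed_role(role_arn, bucket, key, body):
    """Write an object using the assumed role (needs boto3 and real AWS access)."""
    import boto3  # imported here so the helper above can be tried without boto3

    sts = boto3.client("sts")
    s3 = boto3.client("s3", **assume_role_credentials(sts, role_arn))
    s3.put_object(Bucket=bucket, Key=key, Body=body)
```

The temporary credentials expire after a while, so a long-running program would have to repeat the `assume_role` call from time to time.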

To get a better understanding of how this all works in code, feel free to check out the source code repository on GitHub (link). Because I am a big fan of Terraform, I defined all resources (S3, IAM, Lambda) in code, which makes it simple and straightforward to deploy and test if you feel like it!

ADVANCED IAM

Besides providing the basic functionality to restrict access to resources based on user identity, AWS IAM has some cool and more advanced features that I wanted to touch upon. First, to show how simple it is to give an IAM role read-only permissions to a bucket:

data "aws_iam_policy_document" "s3_ro_access_policy_document" {
statement {
effect = "Allow"
actions = [
"s3:GetObject",
"s3:ListBucket",
]
resources = [
"arn:aws:s3:::my-bucket",
"arn:aws:s3:::my-bucket/*"
]
}
}
resource "aws_iam_policy" "s3_ro_access_policy" {
name = "S3-ReadOnly-Access"
policy = data.aws_iam_policy_document.s3_ro_access_policy_document.json
}
resource "aws_iam_role_policy_attachment" "Allow_S3_ReadOnly_Access" {
role = aws_iam_role.aws_custom_role_for_lambda.name
policy_arn = aws_iam_policy.s3_ro_access_policy.arn
}
resource "aws_iam_role" "aws_s3_readwrite_role" {
name = "S3-RW-Role"
description = "Role to allow full RW to bucket"
}

Full source code on GitHub.

With this short Terraform code, I attached an IAM policy to a role, giving it RO access to the my-bucket resource in S3. To spice this up a bit, it is possible to add extra restrictions based on various elements of the request context, for example to restrict access based on source IP:

data "aws_iam_policy_document" "s3_ro_access_policy_document" {
statement {
effect = "Deny"
actions = [
"s3:*"
]
resources = [ "*"]
condition {
test = "NotIpAddress"
variable = "aws:SourceIp"
values = [ "192.168.2.0/24" ]
}
}
}

Now, even if the user who makes the request to S3 has correct credentials, the request will be denied if it comes from outside the subnet specified above! This can be very useful, for example, when access to resources should only be possible from within a corporate network with a specific CIDR range.

One drawback of this source IP restriction is that it can break certain AWS services that run on behalf of a principal/user. When using the AWS Athena service for example, triggering a query on data stored in S3 means Athena will make S3 API requests on behalf of the user who initiated the Athena query, but those requests will have a source IP address from some AWS CIDR range and will therefore be denied. To remedy this, an extra condition can be added:

data "aws_iam_policy_document" "s3_ro_access_policy_document" {
statement {
effect = "Deny"
actions = [
"s3:*"
]
resources = [ "*"]
condition {
test = "NotIpAddress"
variable = "aws:SourceIp"
values = [ "192.168.2.0/24" ]
}
condition {
test = "Bool"
variable = "aws:ViaAWSService"
values = [ "false" ]
}
}
}

The aws:ViaAWSService = false condition ensures that this Deny only takes effect when the request is made directly by the principal, and not by an AWS service acting on the principal’s behalf. For additional info on other condition keys that can be used to grant or deny access, please consult the AWS documentation.
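
To make the interplay of the two conditions easier to follow, here is a small Python model of the intended behaviour. It is only a simplified sketch and not how AWS evaluates policies internally: the function name `is_denied` is made up, and the only rule it models is that all conditions inside one statement have to match for the statement to apply.

```python
import ipaddress


def is_denied(source_ip, via_aws_service, allowed_cidr="192.168.2.0/24"):
    """Simplified model of a Deny statement with two conditions.

    Conditions within one statement are ANDed, so the Deny applies only
    when the caller is outside the allowed range AND the request was not
    made by an AWS service on the caller's behalf.
    """
    outside_range = (
        ipaddress.ip_address(source_ip) not in ipaddress.ip_network(allowed_cidr)
    )
    return outside_range and not via_aws_service
```

A direct request from outside the range is denied, while the same request relayed by a service such as Athena is left to the remaining Allow statements.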

CONCLUSION

In this post I demonstrated how to use the boto3 python package to make AWS IAM and STS calls to access resources in the AWS cloud protected by IAM policies. I also discussed some advanced features of AWS IAM that can help you implement more granular IAM policies and access rights. The linked repository also contains an example which may be run locally and does not need the Lambda function to be created (it still, however, requires the Terraform resources to be deployed).

Cloud Service Testing

Fri, 17 Jan 2020 11:11:00 +0000

In this blog post I discuss a recent project I worked on to practice my skills related to AWS, Python and Datadog. It includes topics such as integration testing using pytest and localstack; running Continuous Integration via Travis-CI and infrastructure as code using Terraform.

Intro

For the sake of this blog post, let’s assume that a periodic job runs somewhere in the Cloud, outside the context of this application, which generates a file with some meta-data about the job itself. This data includes mostly numerical values, such as the number of images used to train an ML model, or the number of files processed, etc. This part is depicted on the below diagram as a dummy Lambda function that periodically uploads this metadata file to an S3 bucket with random numerical values.

When this file is uploaded, an event notification is sent to the message queue. The goal of the Python application is to periodically drain these messages from the queue. When the application runs, it fetches the S3 file referenced in each SQS message, parses the file’s contents and submits the numerical metrics to DataDog for the purpose of visualisation and alerting.
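
As a rough sketch of the parsing step, the function below pulls the bucket and key out of an SQS message body. It assumes the queue receives the notifications directly from S3 (not wrapped by another service) and uses a hypothetical function name; the `Records` / `s3` / `bucket` / `object` layout is the standard S3 event notification structure.

```python
import json
from urllib.parse import unquote_plus


def extract_s3_objects(message_body):
    """Return (bucket, key) pairs referenced by an S3 event notification."""
    event = json.loads(message_body)
    objects = []
    # Messages without "Records" (e.g. the initial test event) yield nothing.
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        # Object keys arrive URL-encoded in S3 event notifications.
        key = unquote_plus(record["s3"]["object"]["key"])
        objects.append((bucket, key))
    return objects
```

Each returned pair can then be handed to an S3 `get_object` call to fetch the metadata file itself.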

Testing

Since the application interacts with two different APIs (AWS & Datadog), I figured it was a good idea to create integration tests that can be run easily via some free CI service (e.g.: Travis-CI.org). When writing the integration tests, I opted to create a simple mock class for testing the interaction with the Datadog API, and chose to rely on localstack for testing the interaction with the AWS API.

Thanks to localstack I could skip creating real resources in AWS and instead use free fake resources in a docker container, that mimic the real AWS API close to 100%. The AWS SDK called boto3 is very easy to reconfigure to connect to the fake resources in localstack with the endpoint_url= parameter.

In the following sections I go through different phases of the project:

coding the python app
mocking Datadog statsd client
setting up AWS resources in localstack
creating integration tests
Travis-CI integration
running the datadog-agent locally
setting up real AWS resources
live testing

~ Coding the python app ~

The code is mainly composed of two Python classes with methods to interact with AWS and DataDog. The CloudResourceHandler class has methods to interact with S3 and SQS, which can be replaced in integration-tests with preconfigured boto3 clients for localstack.

The MetricSubmitter class uses the CloudResourceHandler internally and offers some additional methods for sending metrics to DataDog. Internally it uses statsd from the datadog python package, which can be replaced via dependency injection in integration tests with a mock statsd class that I created to test its interaction with the Datadog API.

To connect to the real AWS & Datadog APIs (via a preconfigured local datadog-agent), two environment variables need to be set at run-time:

STATSD_HOST set to localhost
SQS_QUEUE_URL set to the URL of the Queue

import os
import boto3

# datadog_statsd and MetricSubmitter are defined elsewhere in the project
os.environ['STATSD_HOST'] = 'localhost'
os.environ['SQS_QUEUE_URL'] = 'https://sqs.eu-central-1.amazonaws.com/????????????/cloud-job-results-queue'
session = boto3.Session(profile_name='profile-name')
MetricSubmitter(statsd=datadog_statsd,
                sqs_client=session.client('sqs'),
                s3_client=session.client('s3')).run()

In addition, it also requires a preconfigured AWS profile in ~/.aws/credentials which is necessary for boto3 to authenticate to AWS:

[profile-name]
aws_access_key_id = XXXXXXXXXXXXXXX
aws_secret_access_key = XXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
region = eu-central-1

But before running it, let’s set up some integration tests!

~ Mocking Datadog statsd client ~

In truth, the application does not interact directly with the Datadog API, but rather it uses statsd from the datadog python package, which interacts with the local datadog-agent, which in turn forwards metrics and events to the Datadog API.

To test this flow that relies on statsd, I created a class called DataDogStatsDHelper. This class has 2 functions (gauge/event) with identical signatures to the real functions from the official datadog-statsd package. However, the mock functions do not send anything to the datadog-agent. Instead, they accumulate the values they were passed in local class variables:

class DataDogStatsDHelper:
    event_title = None
    event_text = None
    event_alert_type = None
    event_tags = None
    event_counter = 0
    gauge_metric_name = None
    gauge_metric_value = None
    gauge_tags = None
    gauge_counter = 0

    def event(self, title, text, alert_type=None, aggregation_key=None, source_type_name=None,
              date_happened=None, priority=None, tags=None, hostname=None):
        ...

    def gauge(self, metric, value, tags=None, sample_rate=None):
        ...

When the MetricSubmitter class is tested, this mock class is injected instead of the real statsd class, which makes it possible to write assertions that compare expectations with reality.
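
The injection idea can be shown with a stripped-down example. Everything in it is hypothetical: `FakeStatsd` is a simpler stand-in that keeps its records in instance lists rather than class variables, and `submit_metrics` with its `cloud_job` prefix only imitates what a submitter might do with a parsed metadata file.

```python
class FakeStatsd:
    """Hypothetical stand-in for a statsd client that only records calls."""

    def __init__(self):
        self.gauges = []
        self.events = []

    def gauge(self, metric, value, tags=None, sample_rate=None):
        self.gauges.append((metric, value, tags))

    def event(self, title, text, alert_type=None, tags=None, **kwargs):
        self.events.append((title, text, alert_type, tags))


def submit_metrics(statsd, payload, prefix="cloud_job"):
    """Send every numeric field of `payload` as a gauge; return how many were sent."""
    sent = 0
    for name, value in payload.items():
        # bool is a subclass of int in Python, so exclude it explicitly.
        if isinstance(value, (int, float)) and not isinstance(value, bool):
            statsd.gauge(f"{prefix}.{name}", value)
            sent += 1
    return sent
```

Because the client is passed in as an argument, a test can hand over the fake and afterwards inspect `gauges` and `events`, while production code passes the real statsd client.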

~ AWS resources in localstack ~

To test how the python app integrates with S3 and SQS, I decided to use localstack, running in a Docker container. To make it simple and repeatable, I created a docker-compose.yaml file that allows the configuration parameters to be defined in YAML:

version: '3.2'
services:
  localstack:
    image: localstack/localstack:latest
    container_name: localstack
    ports:
      - '4563-4599:4563-4599'
      - '8080:8080'
    environment:
      - SERVICES=s3,sqs
      - AWS_ACCESS_KEY_ID=foo
      - AWS_SECRET_ACCESS_KEY=bar

The resulting fake AWS resources are accessible via different ports on localhost. In this case, S3 runs on port 4572 and SQS on port 4576. Refer to the docs on GitHub for more details on ports used by other AWS services in localstack.
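
Pointing boto3 at these ports could look roughly like the sketch below. The helper names and the default region are assumptions; the ports and dummy credentials are the ones quoted in this post, and `endpoint_url` is the regular boto3 client parameter.

```python
# Ports as quoted above for this localstack setup.
LOCALSTACK_PORTS = {"s3": 4572, "sqs": 4576}


def localstack_client_kwargs(service, host="localhost", region="eu-central-1"):
    """Build keyword arguments for boto3.client() that target localstack."""
    return {
        "endpoint_url": f"http://{host}:{LOCALSTACK_PORTS[service]}",
        "region_name": region,
        # Dummy credentials, matching the values in docker-compose.yaml.
        "aws_access_key_id": "foo",
        "aws_secret_access_key": "bar",
    }


def make_localstack_client(service, host="localhost"):
    """Create a boto3 client for the fake service (requires boto3)."""
    import boto3

    return boto3.client(service, **localstack_client_kwargs(service, host))
```

The `host` argument is there because the fake services are not always reachable as localhost, for instance when the caller runs in another container.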

It is important to note that when localstack starts up, it is completely empty. Thus, before the integration tests can run, it is necessary to provision the S3 bucket and SQS queue in localstack, just as one would normally do it when using real AWS resources.

For this purpose, it’s possible to write a simple bash script that can be called from the localstack container as part of an automatic init script:

aws --endpoint-url=http://localhost:4572 s3api create-bucket --bucket "bucket-name" --region "eu-central-1"
aws --endpoint-url=http://localhost:4576 sqs create-queue --queue-name "queue-name" --region "eu-central-1" --attributes "MaximumMessageSize=4096,MessageRetentionPeriod=345600,VisibilityTimeout=30"

However, for the sake of making the integration-tests self-contained, I opted to integrate this into the tests as part of a class setup phase that runs before any tests and sets up the required S3 bucket and SQS queue:

    @classmethod
    def setUpClass(cls):
        cls.ls = LocalStackHelper()
        cls.ls.get_s3_client().create_bucket(Bucket=cls.s3_bucket_name)
        cls.ls.get_sqs_client().create_queue(QueueName=cls.sqs_queue_name)

~ Creating integration tests ~

As a next step I created the integration tests which use the fake AWS resources in localstack, as well as the mock statsd class for DataDog. I used two popular python packages to create these:

unittest which is a built-in package
pytest which is a 3rd party package

Actually, the test cases only use unittest, while pytest is used for the simple collection and execution of those tests. To get started with the unittest framework, I created a python class and implemented the test cases within this class:

import unittest
from app.utils.datadog_fake_statsd import DataDogStatsDHelper
from app.utils.localstack_helper import LocalStackHelper
from app.submitter import MetricSubmitter


class ProjectIntegrationTesting(unittest.TestCase):
    @classmethod
    def setUpClass(cls):
        ...

    def setUp(self):
        ...

    def test_ddg_submitter_valid_payload(self):
        ...

    def test_ddg_submitter_invalid_payload(self):
        ...

    def test_aws_handler_invalid_s3key(self):
        ...

    def test_aws_handler_valid_s3key(self):
        ...

In the setUpClass method, a few things are taken care of before tests can be executed:

define class variables for the bucket & the queue
create SQS & S3 clients using localstack endpoint url
provision needed resources (Queue/Bucket) in localstack

To test the interaction with DataDog via the statsd client, the submitter app is executed, which stores some values in the mock statsd class’s internal variables, which are then used in assertions to compare values with expectations.

The other tests inspect the behaviour of the CloudResourceHandler class. For example, one of the assertions tests whether the .has_available_messages() function returns false when there are no more messages in the queue.

A nice feature of unittest is that it’s easy to define tasks that need to be executed before each test, to ensure a clean slate for each test. For example, the code in the setUp method ensures two things:

the fake SQS queue is emptied before each test
class variables of the mock DataDog class are reset before each test
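
One possible way to empty the queue in setUp is to receive and delete messages until none are left, as sketched below. The function name is hypothetical and the repository may well do this differently; `receive_message` and `delete_message` are the regular boto3 SQS client calls.

```python
def drain_queue(sqs_client, queue_url):
    """Delete all currently visible messages; return how many were removed."""
    removed = 0
    while True:
        response = sqs_client.receive_message(
            QueueUrl=queue_url,
            MaxNumberOfMessages=10,  # SQS returns at most 10 messages per call
            WaitTimeSeconds=0,
        )
        # The "Messages" key is absent when the queue has nothing to offer.
        messages = response.get("Messages", [])
        if not messages:
            return removed
        for message in messages:
            sqs_client.delete_message(
                QueueUrl=queue_url,
                ReceiptHandle=message["ReceiptHandle"],
            )
            removed += 1
```

An alternative is the `purge_queue` call, but real SQS only allows one purge per queue every 60 seconds, which may not suit a setUp that runs before every single test.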

Theoretically, it would be possible to run the test by running pytest -s -v in the python project’s root directory, however the tests rely on localstack, so they would fail…

~ Travis-CI integration ~

So now that the integration tests are created, I thought it would be really nice to have them automatically run in a CI service, whenever someone pushes changes to the Git repo. To this end, I created a free account on travis-ci.org and integrated it with my GitHub repo by creating a .travis.yml file with the below initial content:

os: linux
language: python
python:
 - "3.8"
services:
 - docker
script:
 - {...}

However, I still needed a way to run localstack and then execute the integration tests within the CI environment. Luckily I found docker-compose to be a perfect fit for this purpose. I had already created a yaml file to describe how to run localstack, so now I could just simply add an extra container that would run my tests. Here is how I created a docker image to run the tests via docker-compose:

FROM python:3.8-alpine
WORKDIR /app
COPY ./requirements-test.txt ./
RUN apk add --no-cache --virtual .pynacl_deps build-base gcc make python3 python3-dev libffi-dev \
 && pip3 install --upgrade setuptools pip \
 && pip3 install --no-cache-dir -r requirements-test.txt \
 && rm requirements-test.txt
COPY ./utils/*.py ./utils/
COPY ./*.py ./
ENV LOCALSTACK_HOST localstack
ENTRYPOINT ["pytest", "-s", "-v"]

It installs the necessary dependencies to an alpine based python 3.8 image; adds the necessary source code, and finally executes pytest to collect & run the tests. Here are the updates I had to make to the docker-compose.yaml file:

version: '3.2'
services:
  localstack:
    {...}
  integration-tests:
    container_name: cloud-job-it
    build:
      context: .
      dockerfile: Dockerfile-tests
    depends_on:
      - "localstack"

Docker Compose auto-magically creates a shared network to enable connectivity between the defined services, which can call one-another by name. So when the tests are running in the cloud-job-it container, they can use the hostname localstack to create the boto3 session via the endpoint url to reach the fake AWS resources.

For easier creation of AWS clients via localstack, I used a package called localstack-python-client, so I don’t have to deal with port numbers and low level details. However, this client by default tries to use localhost as the hostname, which wouldn’t work in my setup using docker-compose. After digging through the source-code of this python package, I found a way to change this by setting an environment variable named LOCALSTACK_HOST.

As a final step, I just had to add two lines to complete the .travis.yml file:

script:
 - docker-compose up --build --abort-on-container-exit
 - docker-compose down -v --rmi all --remove-orphans

Thanks to the --abort-on-container-exit flag, docker-compose will return the same exit code which is returned from the container that first exits, which fits this use-case perfectly, as the cloud-job-it container only runs until the tests finish. This way the whole setup will gracefully shut down, while preserving the exit code from the container, allowing the CI system to generate an alert if it’s not 0 (meaning some test failed).

~ Running the datadog-agent locally ~

Note: while Datadog is a paid service, it’s possible to create a trial account that’s free for 2 weeks, without the need to enter credit card details. This is pretty amazing!

Now that the integration tests are automated and passing, I wanted to run the datadog-agent locally, so that I can test the python application with some real data that was to be submitted to Datadog via the agent. Here is an article that was particularly useful to me, with instructions on how the agent should be set up.

While the option of running it in docker-compose was initially appealing, I eventually decided to just start it manually as a long-lived detached container. Here is how I went about doing that:

DOCKER_CONTENT_TRUST=1 docker run -d \
 --name dd-agent \
 -v /var/run/docker.sock:/var/run/docker.sock:ro \
 -v /proc/:/host/proc/:ro \
 -v /sys/fs/cgroup/:/host/sys/fs/cgroup:ro \
 -e DD_API_KEY=XXXXXXXXXXXXXXXXXXXXXXXXXXXXXX \
 -e DD_SITE="datadoghq.eu" \
 -e DD_DOGSTATSD_NON_LOCAL_TRAFFIC=true \
 -p 8125:8125/udp \
 datadog/agent:7

Most notable of these lines is the DD_API_KEY environment variable which ensures that whatever data I send to the agent is associated with my own account. In addition, since I am closest to the EU region, I had to specify the endpoint via the DD_SITE variable. Also, because I want the agent to accept metrics from the python app, I need to turn on a feature via the environment variable DD_DOGSTATSD_NON_LOCAL_TRAFFIC, as well as expose port 8125 from the docker container to the host machine:

▶ docker ps
CONTAINER ID   IMAGE           COMMAND   CREATED      STATUS                PORTS                              NAMES
477cb2ea74b2   datadog/agent   "/init"   3 days ago   Up 3 days (healthy)   0.0.0.0:8125->8125/udp, 8126/tcp   dd-agent

All seems to be well!
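
To get a feeling for what travels over that UDP port, here is a small sketch that formats a gauge as a DogStatsD datagram and sends it to the agent. The function names and the metric name used in the usage note are made up for illustration; in the project itself this job is done by statsd from the datadog package.

```python
import socket


def format_gauge(metric, value, tags=None):
    """Render a gauge in the DogStatsD datagram format: name:value|g|#tags."""
    datagram = f"{metric}:{value}|g"
    if tags:
        datagram += "|#" + ",".join(tags)
    return datagram


def send_gauge(metric, value, tags=None, host="localhost", port=8125):
    """Fire-and-forget a gauge to the agent over UDP."""
    payload = format_gauge(metric, value, tags).encode("utf-8")
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(payload, (host, port))
```

Calling `send_gauge("cloud_job.files_processed", 42, ["env:dev"])` sends a single datagram. UDP gives no delivery feedback, so the call succeeds even when no agent is listening.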

~ Deploying real AWS resources ~

Here I briefly discuss how I deployed some real resources in AWS to see my application running live. In a nutshell, I set the infra up as code in Terraform, which greatly simplified the whole process. All the necessary files are collected in a directory of my repository:

variables.tf defines some variables used in multiple places
init.tf initialisation of the AWS provider and definition of AWS resources
outputs.tf defines some values that are reported when deployment finishes

The first and last files are not very interesting. Most of the interesting stuff happens in the init.tf, which defines the necessary resources and permissions. One extra resource not mentioned before, is an AWS Lambda function, which gets executed every minute and is used to upload a JSON file to the S3 bucket. This acts as a random source of data, so that the python app has some work to do without manual intervention.

~ Live testing ~

Now that all parts seem to be ready, it’s time to run the main python app using the real S3 bucket and SQS queue, as well as the local datadog-agent. The console output provides some hints as to whether it’s able to pump the metrics from AWS to DataDog:

▶ python3 submitter.py
Initializing new Cloud Resource Handler with SQS URL - https://.../cloud-job-results-queue
Processing available messages in SQS queue:
- sending data to DataDog via statsd/datadog-agent.
- removing message from SQS (AQEBO37smPPHg6OIqbh3HMu3g...)
- ...
- sending data to DataDog via statsd/datadog-agent.
- removing message from SQS (AQEBV0/JzMVEP6k5kBmx2kvGn...)
No more messages visible in the queue, shutting down ...
Process finished with exit code 0

Next, I checked my DataDog account to see whether the metric data arrived. For this I created a custom Notebook with graphs to display them:

All seems to be well! The deployed AWS Lambda function has already run a few times, providing input data for the python app, which were successfully processed and forwarded to Datadog. As seen on the Notebook above, it is really easy to display metric data over time about any recurring workload, which can provide pretty useful insights into those jobs.

Furthermore, since DataDog also supports the submission of events, it becomes possible to design dashboards and create alerts which trigger based on more complex criteria, such as the presence or lack of events over certain periods of time. One such example can be seen below:

This is a so-called screen-board which I created to display the status of a Monitor that I set up previously. This Monitor tracks incoming events with the tag cloud_job_metric and generates an alert, if there is not at least one such event of type success in the last 30 minutes. The screen-board can be exported via a public URL if needed, or just simply displayed on a big screen somewhere in the office.

Conclusions

In this post I discussed a relatively complex project with lots of exciting technology working together in the realm of Cloud Computing. In the end, I was able to create DashBoards and Monitors in DataDog, which can ingest and display telemetry about AWS workloads, in a way that makes it useful to track and monitor the workloads themselves.

RunCode.ninja Challenges

Sat, 11 Jan 2020 11:11:00 +0000

This post was born on a misty Saturday morning, while slowly sipping some good quality coffee in a Prague café. I spent the last several evenings after work solving programming challenges on runcode.ninja and I thought it would be nice to share my experience and spread the word about it.

RunCode.ninja

I can’t really recall how I discovered this website in the first place… All I remember is that I was really into the simplistic idea of it all. The basic idea for most of the challenges goes something like this:

check problem description
inspect any sample input (if any)
write your program locally
test on sample input (if any)
submit source code to the evaluation platform

If all went well, you will get feedback within a few seconds whether the submitted code worked correctly for the given task at hand. If it didn’t, then you can turn to their FAQ for some advice. It definitely has some useful info, however if all else fails, you can also contact the team behind the platform on their slack channel. They are really friendly people so be sure to respond to their effort in kind!

Another nice thing about their platform is that they categorized all their challenges (119 in total as of now) into nice categories such as binary, encoding, encryption, forensics, etc. which allows you to select what you are interested in. When I started out, I was first aiming to complete the challenges in Easy which offers a combination of relatively easy challenges from math, text-parsing, encoding and other categories.

As it currently stands, I rank 155 out of around 2400 registered users, which seems quite impressive at first, but I suspect there may be quite a few inactive accounts in their database. Also, there are some hardcore people who have already completed all the challenges, which is quite impressive. If only I could spend a few rainy and cold weekends working on these, I would probably catch up soon!

Last but not least, their platform is set up to interpret several different programming languages, so you can choose to solve the challenges in the language you are most comfortable with. Once you solve a challenge, you can access its write-ups which provide some very useful inspiration on how others have solved the same problem. This can provide some very valuable lessons, like that one time when I wrote a Go program that was 20 lines long to solve a challenge that took only 1 line to solve in Bash…

If you are interested in checking out my solutions for some of the challenges, you can find them in my GitHub repository. For some of them I even created two different solutions, one in Python and another in Go, just to compare and practice working with both languages.

Oh and I almost forgot to mention, they have some really cool stickers that they are not shy to send half-way across the world by post, so that’s another big plus for sticker fans :)

That’s all for now, thank you for tuning in! :)

Infrastructure as Code

Tue, 12 Nov 2019 11:11:00 +0000

Introduction

In this post I will briefly introduce different AWS services and show how to use Terraform to orchestrate and manage them. While the concept of the whole service is rather simple, its main use is enabling me to learn about this new emerging technology called Infrastructure-as-Code or IaC for short.

Project overview

The main goal of this task is to deploy a server-less function and periodically query the Github API to get a list of public repositories for a given organisation (e.g.: Google). The retrieved information should then be stored in a compressed CSV file in a specific S3 bucket, while notifications should be created for new files saved to the bucket.

The main AWS components of the solution are:

Lambda function written in Python
CW Event Rule to schedule the Lambda periodically
S3 for storing data in a bucket
SQS for queueing notifications from S3

Possibilities

Various methods exist for the creation and configuration of these necessary resources. The simplest one is logging in to the AWS Management Console and setting up each component one by one via the GUI. This method, however, is slow, cumbersome and quite prone to errors.

A better option can be to use the AWS SDK for your favourite programming language. Several options exist, such as Java, Python, Go, Node.js, etc. This option is less error-prone, but still quite cumbersome and slow.

Perhaps one of the best options is to use Terraform, which is a popular Infrastructure as Code or IaC tool these days. It lets you define your infrastructure in a configuration language and has its own internal engine that talks to the AWS SDK to create the necessary infrastructure you defined.

Setup procedure

Before we can make use of Terraform to deploy our project on AWS, we need to set up credentials. This can be done by logging in to the AWS Management Console and going to the Identity and Access Management section, which can provide the necessary Access Key and Secret value that you need to put into a file on disk. These credentials should be saved to ~/.aws/credentials as follows:

[default]
aws_access_key_id = XXXXXXXXXXXX
aws_secret_access_key = XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX

This enables Terraform to make changes to your AWS infrastructure through API calls made to AWS to provision resources according to your definition in the .tf file. Once you create the desired configuration a complete infrastructure can be deployed as simply as below:

$ ▶ ls -la
-rw-r--r-- 1 user group 4.9K Nov 21 22:58 main.tf
$ ▶ terraform init
...
Terraform has been successfully initialized!
$ ▶ terraform apply
...
Plan: 13 to add, 0 to change, 2 to destroy.
Do you want to perform these actions?
Enter a value: yes

Project building blocks

In this section I will go over each major component and explain what it is, what it does and how it is set up. First up is the storage layer, followed by the main component: the core logic implemented in Python.

AWS Simple Storage Service

This is a basic building block which we use to store data generated by the Lambda function. Since Lambdas are by nature server-less, they do not have persistent storage attached which can be used to save data between two invocations of the function. If we need persistent storage we need to use S3. The necessary Terraform code is below:

resource "aws_s3_bucket" "tf_aws_bucket" {
bucket = "tf-aws-bucket"
tags = {
Name = "Bucket for Terraform project"
Environment = "Dev"
}
force_destroy = "true"
}

This will create a bucket named tf-aws-bucket which we can then use to store the results of our Lambda function. As an extra feature, we also configured notifications for this bucket: whenever a compressed file with the .gz extension is created in the bucket, a notification is generated and sent to the SQS queue that is defined in the same Terraform file.
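
In this project the notification is configured in Terraform. Purely for illustration, the same setting expressed as a boto3 request could look like the sketch below. The helper names are hypothetical and the `s3:ObjectCreated:*` event type is an assumption about which events are subscribed to; the dictionary layout follows the S3 `put_bucket_notification_configuration` API.

```python
def gz_notification_config(queue_arn):
    """Notification configuration: newly created *.gz objects go to an SQS queue."""
    return {
        "QueueConfigurations": [
            {
                "QueueArn": queue_arn,
                "Events": ["s3:ObjectCreated:*"],
                "Filter": {
                    "Key": {"FilterRules": [{"Name": "suffix", "Value": ".gz"}]}
                },
            }
        ]
    }


def apply_notification(bucket, queue_arn):
    """Apply the configuration to a bucket (requires boto3 and real AWS access)."""
    import boto3

    boto3.client("s3").put_bucket_notification_configuration(
        Bucket=bucket,
        NotificationConfiguration=gz_notification_config(queue_arn),
    )
```

Whichever tool is used, the queue’s access policy also has to allow S3 to send messages to it, otherwise the configuration is rejected.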

AWS Lambda

AWS Lambda is a server-less technology which lets you create a bare function in the cloud and call it from various other services, without having to worry about setting up an environment where it will run. Different programming languages are supported, such as Python, Java, Go and Node.js. Once you deploy your code, your function receives its input as arguments, just like a normal function, and you can give it permission to access and modify other resources in AWS, such as files stored in S3.

This is exactly the use-case that was implemented in this project: a Lambda function that makes an API call to GitHub to download information, then stores it as a compressed CSV file in an S3 bucket. To define the target organisation and the bucket where information is saved, the Lambda function expects two arguments in the function call:

{
"org_name" : "twitter",
"target_bucket" : "repos_folder"
}

This JSON input passed to the function is converted to a dict in Python, which can be tested for the presence of the keys needed for the correct functioning of the code:

def handler(event, context):
    # verify that both required arguments were passed in the event
    if 'org_name' not in event or 'target_bucket' not in event:
        print("Missing 'org_name' or 'target_bucket' from request body (JSON)!")
        return

The rest of the function’s code downloads the list of public repositories of the passed organisation from the GitHub API and stores it in a temporary file that can be uploaded to S3, provided that the necessary permissions have been granted to this Lambda function:

import boto3
s3 = boto3.client("s3")
s3.upload_file(path_to_local_file, target_bucket_name, key_name)
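
The step that produces the compressed CSV file before the upload could be sketched as follows. This is not the code from the project: the function name and the choice of columns are assumptions (the three fields do exist in the repository objects returned by the GitHub API), and only the standard library is used.

```python
import csv
import gzip


def write_repos_csv_gz(repos, path, fields=("name", "html_url", "stargazers_count")):
    """Write selected fields of each repository dict to a gzip-compressed CSV."""
    # Text mode ("wt") lets csv.writer work directly on top of the gzip stream.
    with gzip.open(path, "wt", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        writer.writerow(fields)
        for repo in repos:
            writer.writerow([repo.get(field, "") for field in fields])
    return path
```

Inside a Lambda function the file would have to be written somewhere under /tmp, which is the only writable location there, before handing the path to `s3.upload_file`.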

In order to enable access to S3 from Lambda, we have to define some IAM policies and roles. First we have to define a policy which says that the role this policy is attached to can access the S3 bucket:

data "aws_iam_policy_document" "s3_lambda_access" {
statement {
effect = "Allow"
resources = ["arn:aws:s3:::tf-aws-bucket", "arn:aws:s3:::tf-aws-bucket/*"]
actions = [
"s3:GetObject",
"s3:PutObject",
"s3:ListBucket",
]
}
}

resource "aws_iam_policy" "s3_lambda_access" {
name = "s3_lambda_access"
policy = data.aws_iam_policy_document.s3_lambda_access.json
}

This policy is then attached to an IAM role which is allowed to be assumed by AWS Lambda:

resource "aws_iam_role_policy_attachment" "s3_lambda_access" {
role = aws_iam_role.tf_aws_exercise_role.name
policy_arn = aws_iam_policy.s3_lambda_access.arn
}

resource "aws_iam_role" "tf_aws_exercise_role" {
name = "tfExerciseRole"
description = "Role that allowed to be assumed by AWS Lambda, which will be taking all actions."
tags = {
owner = "tfExerciseBoss"
}
assume_role_policy = <<EOF
{
 "Version": "2012-10-17",
 "Statement": [
 {
 "Action": "sts:AssumeRole",
 "Principal": {
 "Service": "lambda.amazonaws.com"
 },
 "Effect": "Allow"
 }
 ]
}
EOF
}

AWS CloudWatch Events

This component is responsible for periodically making a call to our Lambda function, with the required arguments passed in JSON format. This component was also configured via Terraform, but for the sake of simplicity, below is a screenshot taken from the AWS Management Console where the created CW event shows up as configured:

The screen-shot shows that it is configured to periodically execute a Target Lambda function every 2 minutes.
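
For reference, the schedule and the JSON arguments could be assembled in Python as sketched below. The helper names and the target Id are made up, and the project itself defines the rule in Terraform; the `rate(...)` syntax and the `Id` / `Arn` / `Input` target fields belong to the CloudWatch Events API.

```python
import json


def rate_expression(minutes):
    """Build a CloudWatch Events rate expression, e.g. 'rate(2 minutes)'."""
    if minutes < 1:
        raise ValueError("rate must be at least 1 minute")
    # The unit has to be singular for a value of 1 and plural otherwise.
    unit = "minute" if minutes == 1 else "minutes"
    return f"rate({minutes} {unit})"


def schedule_target(lambda_arn, org_name, target_bucket):
    """Target entry that passes the JSON arguments to the Lambda function."""
    return {
        "Id": "lambda-target",  # hypothetical identifier
        "Arn": lambda_arn,
        "Input": json.dumps({"org_name": org_name, "target_bucket": target_bucket}),
    }
```

With boto3 these values would go into `put_rule(Name=..., ScheduleExpression=rate_expression(2))` and `put_targets(Rule=..., Targets=[schedule_target(...)])` on the events client.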

Results

In summary, it took me a while to get the hang of the Infrastructure as Code concept and apply it while working with Terraform on AWS, but I can definitely see how it can benefit a bigger organisation which wants its Cloud infrastructure to be stable and maintainable. IaC tools such as Terraform let developers define their infrastructure as code and check it in to version control for repeatable and more predictable deployment procedures. Now that I have this working project, I can do a simple terraform apply to bring my service to life with all required components and permissions correctly set up in seconds, while also being able to quickly destroy it if I choose to do so. This gives flexibility and greater ease of development that can speed up projects in the cloud.