<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>python | FLRNKS</title><link>https://flrnks.netlify.app/tag/python/</link><atom:link href="https://flrnks.netlify.app/tag/python/index.xml" rel="self" type="application/rss+xml"/><description>python</description><generator>Source Themes Academic (https://sourcethemes.com/academic/)</generator><language>en-us</language><copyright>© 2024</copyright><lastBuildDate>Mon, 03 Feb 2020 11:11:00 +0000</lastBuildDate><image><url>https://flrnks.netlify.app/images/icon_hu0b7a4cb9992c9ac0e91bd28ffd38dd00_9727_512x512_fill_lanczos_center_2.png</url><title>python</title><link>https://flrnks.netlify.app/tag/python/</link></image><item><title>Identity &amp; Access Management</title><link>https://flrnks.netlify.app/post/aws-iam/</link><pubDate>Mon, 03 Feb 2020 11:11:00 +0000</pubDate><guid>https://flrnks.netlify.app/post/aws-iam/</guid><description>&lt;h2 id="introduction">INTRODUCTION&lt;/h2>
&lt;p>In this post I show how the Identity and Access Management service in the AWS Public Cloud works to secure resources and workloads. It is a very important topic, because it underpins all of the security that is needed for hosting one&amp;rsquo;s resources in the public cloud.&lt;/p>
&lt;p>At the end of the day, the cloud is just a concept that offers a convenient illusion of dedicated resources, but in reality it&amp;rsquo;s just some process that runs on someone else&amp;rsquo;s hardware, so one has to be absolutely sure about security before trusting it and running their business-critical workloads on it.&lt;/p>
&lt;p>It is enough to do a quick Google search for
&lt;a href="https://www.google.com/search?q=unsecured%20s3%20bucket" target="_blank" rel="noopener">unsecured s3 bucket&lt;/a> to see plenty of examples of administrators failing to properly harden and configure their AWS resources, and falling victim to accidental disclosure of often business-critical information.&lt;/p>
&lt;p>
&lt;a href="https://docs.aws.amazon.com/iam/?id=docs_gateway" target="_blank" rel="noopener">IAM&lt;/a> exists in the realm of AWS Cloud as a standalone service, providing various ways in which access to resources and workloads can be restricted. For example, if someone has an S3 bucket for storing arbitrary data, one can use IAM policies to restrict access to data stored in the bucket based on various criteria such as user identity, connection source IP, VPC environment and so on. S3 is a convenient service to demonstrate IAM capabilities, because it is very easy to grasp the result of restrictions: access to files in an S3 bucket is either granted or denied.&lt;/p>
&lt;h2 id="how-it-works">HOW IT WORKS&lt;/h2>
&lt;p>To illustrate how IAM works, I created a Python function on AWS Lambda, the AWS service for running serverless functions, and implemented a routine that tries to access some data stored in a particular S3 bucket. By default the Lambda starts running with an
&lt;a href="https://docs.aws.amazon.com/IAM/latest/UserGuide/id_roles.html" target="_blank" rel="noopener">IAM role&lt;/a> that has only read-only permission to the bucket. This is verified by making an API call with the
&lt;a href="https://boto3.amazonaws.com/v1/documentation/api/latest/index.html" target="_blank" rel="noopener">boto3&lt;/a> package, which returns without any error. Next the Lambda tries to write some new data to the bucket, but this fails because the IAM role is not equipped with Write permission to the S3 bucket.&lt;/p>
&lt;p>To work around this problem, I use boto3 to make an AWS Security Token Service (
&lt;a href="https://docs.aws.amazon.com/STS/latest/APIReference/Welcome.html" target="_blank" rel="noopener">STS&lt;/a>) call and assume a new role which is equipped with the necessary read-write access. Using this new role the program demonstrates that it can write to the bucket as expected. Below is a sample output of the Lambda Function in action:&lt;/p>
&lt;div class="highlight">&lt;pre class="chroma">&lt;code class="language-yml" data-lang="yml">===&lt;span class="w"> &lt;/span>Checking&lt;span class="w"> &lt;/span>IAM&lt;span class="w"> &lt;/span>Identity&lt;span class="w"> &lt;/span>===&lt;span class="w">
&lt;/span>&lt;span class="w">&lt;/span>&lt;span class="k">ARN&lt;/span>&lt;span class="p">:&lt;/span>&lt;span class="w"> &lt;/span>arn&lt;span class="p">:&lt;/span>aws&lt;span class="p">:&lt;/span>sts&lt;span class="p">::&lt;/span>ACCOUNT_ID&lt;span class="p">:&lt;/span>assumed-role/Base-Lambda-Custom-Role/lambda&lt;span class="w">
&lt;/span>&lt;span class="w">
&lt;/span>&lt;span class="w">&lt;/span>===&lt;span class="w"> &lt;/span>Testing&lt;span class="w"> &lt;/span>Read&lt;span class="w"> &lt;/span>access&lt;span class="w"> &lt;/span>to&lt;span class="w"> &lt;/span>S3&lt;span class="w"> &lt;/span>file&lt;span class="w"> &lt;/span>in&lt;span class="w"> &lt;/span>bucket&lt;span class="w"> &lt;/span>===&lt;span class="w">
&lt;/span>&lt;span class="w">&lt;/span>{&lt;span class="w">
&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">&amp;#34;field1&amp;#34;: &lt;/span>&lt;span class="kc">true&lt;/span>&lt;span class="p">,&lt;/span>&lt;span class="w">
&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">&amp;#34;field2&amp;#34;: &lt;/span>&lt;span class="m">1.&lt;/span>4107917E7&lt;span class="w">
&lt;/span>&lt;span class="w">&lt;/span>}&lt;span class="w">
&lt;/span>&lt;span class="w">
&lt;/span>&lt;span class="w">&lt;/span>===&lt;span class="w"> &lt;/span>Testing&lt;span class="w"> &lt;/span>Write&lt;span class="w"> &lt;/span>access&lt;span class="w"> &lt;/span>to&lt;span class="w"> &lt;/span>S3&lt;span class="w"> &lt;/span>bucket&lt;span class="w"> &lt;/span>===&lt;span class="w">
&lt;/span>&lt;span class="w">&lt;/span>&lt;span class="k">Error&lt;/span>&lt;span class="p">:&lt;/span>&lt;span class="w"> &lt;/span>AccessDenied!&lt;span class="w">
&lt;/span>&lt;span class="w">
&lt;/span>&lt;span class="w">&lt;/span>===&lt;span class="w"> &lt;/span>Assumed&lt;span class="w"> &lt;/span>New&lt;span class="w"> &lt;/span>IAM&lt;span class="w"> &lt;/span>Identity&lt;span class="w"> &lt;/span>===&lt;span class="w">
&lt;/span>&lt;span class="w">&lt;/span>&lt;span class="k">ARN&lt;/span>&lt;span class="p">:&lt;/span>&lt;span class="w"> &lt;/span>arn&lt;span class="p">:&lt;/span>aws&lt;span class="p">:&lt;/span>sts&lt;span class="p">::&lt;/span>ACCOUNT_ID&lt;span class="p">:&lt;/span>assumed-role/S3-RW-Role/lambda&lt;span class="w">
&lt;/span>&lt;span class="w">
&lt;/span>&lt;span class="w">&lt;/span>===&lt;span class="w"> &lt;/span>Testing&lt;span class="w"> &lt;/span>Write&lt;span class="w"> &lt;/span>access&lt;span class="w"> &lt;/span>to&lt;span class="w"> &lt;/span>S3&lt;span class="w"> &lt;/span>bucket&lt;span class="w"> &lt;/span>(using&lt;span class="w"> &lt;/span>new&lt;span class="w"> &lt;/span>role)&lt;span class="w"> &lt;/span>===&lt;span class="w">
&lt;/span>&lt;span class="w">&lt;/span>...&lt;span class="w"> &lt;/span>file&lt;span class="w"> &lt;/span>was&lt;span class="w"> &lt;/span>written&lt;span class="w"> &lt;/span>successfully!&lt;span class="w">
&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>To get a better understanding of how this all works in code, feel free to check out the source code repository on GitHub (
&lt;a href="https://github.com/florianakos/aws-iam-exercise" target="_blank" rel="noopener">link&lt;/a>). Because I am a big fan of Terraform, I defined all resources (S3, IAM, Lambda) in code, which makes it very simple and straightforward to deploy and test the code if you feel like it!&lt;/p>
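&lt;p>As a rough illustration of the role switch described above, the sketch below wraps the STS &lt;code>assume_role&lt;/code> call and a write check. It is not the code from the linked repository: the helper names, the default session name and the object key are assumptions made for this example, and the commented usage presumes boto3 plus valid AWS credentials.&lt;/p>

```python
def assume_role_credentials(sts_client, role_arn, session_name="iam-demo"):
    """Call STS AssumeRole and return kwargs that boto3.client() accepts."""
    response = sts_client.assume_role(RoleArn=role_arn, RoleSessionName=session_name)
    creds = response["Credentials"]
    return {
        "aws_access_key_id": creds["AccessKeyId"],
        "aws_secret_access_key": creds["SecretAccessKey"],
        "aws_session_token": creds["SessionToken"],
    }


def can_write(s3_client, bucket, key="iam-demo.txt"):
    """Return True if put_object succeeds, False if S3 answers AccessDenied."""
    try:
        s3_client.put_object(Bucket=bucket, Key=key, Body=b"hello")
        return True
    except Exception as error:
        # botocore's ClientError exposes the service error code in .response
        code = getattr(error, "response", {}).get("Error", {}).get("Code")
        if code == "AccessDenied":
            return False
        raise


# Usage against real AWS (hypothetical values, requires boto3 and credentials):
#   import boto3
#   role_arn = "arn:aws:iam::ACCOUNT_ID:role/S3-RW-Role"
#   kwargs = assume_role_credentials(boto3.client("sts"), role_arn)
#   print(can_write(boto3.client("s3", **kwargs), "my-bucket"))
```

&lt;p>Passing the clients in as arguments keeps the two helpers easy to exercise with stand-in objects, without touching a real AWS account.&lt;/p>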
&lt;h2 id="advanced-iam">ADVANCED IAM&lt;/h2>
&lt;p>Besides providing the basic functionality to restrict access to resources based on user identity, there are some cool and more advanced features of AWS IAM that I wanted to touch upon. As a starting point, here is how simple it is to give read-only permissions to a bucket for an IAM role:&lt;/p>
&lt;div class="highlight">&lt;pre class="chroma">&lt;code class="language-shell" data-lang="shell">data &lt;span class="s2">&amp;#34;aws_iam_policy_document&amp;#34;&lt;/span> &lt;span class="s2">&amp;#34;s3_ro_access_policy_document&amp;#34;&lt;/span> &lt;span class="o">{&lt;/span>
statement &lt;span class="o">{&lt;/span>
&lt;span class="nv">effect&lt;/span> &lt;span class="o">=&lt;/span> &lt;span class="s2">&amp;#34;Allow&amp;#34;&lt;/span>
&lt;span class="nv">actions&lt;/span> &lt;span class="o">=&lt;/span> &lt;span class="o">[&lt;/span>
&lt;span class="s2">&amp;#34;s3:GetObject&amp;#34;&lt;/span>,
&lt;span class="s2">&amp;#34;s3:ListBucket&amp;#34;&lt;/span>,
&lt;span class="o">]&lt;/span>
&lt;span class="nv">resources&lt;/span> &lt;span class="o">=&lt;/span> &lt;span class="o">[&lt;/span>
&lt;span class="s2">&amp;#34;arn:aws:s3:::my-bucket&amp;#34;&lt;/span>,
&lt;span class="s2">&amp;#34;arn:aws:s3:::my-bucket/*&amp;#34;&lt;/span>
&lt;span class="o">]&lt;/span>
&lt;span class="o">}&lt;/span>
&lt;span class="o">}&lt;/span>
resource &lt;span class="s2">&amp;#34;aws_iam_policy&amp;#34;&lt;/span> &lt;span class="s2">&amp;#34;s3_ro_access_policy&amp;#34;&lt;/span> &lt;span class="o">{&lt;/span>
&lt;span class="nv">name&lt;/span> &lt;span class="o">=&lt;/span> &lt;span class="s2">&amp;#34;S3-ReadOnly-Access&amp;#34;&lt;/span>
&lt;span class="nv">policy&lt;/span> &lt;span class="o">=&lt;/span> data.aws_iam_policy_document.s3_ro_access_policy_document.json
&lt;span class="o">}&lt;/span>
resource &lt;span class="s2">&amp;#34;aws_iam_role_policy_attachment&amp;#34;&lt;/span> &lt;span class="s2">&amp;#34;Allow_S3_ReadOnly_Access&amp;#34;&lt;/span> &lt;span class="o">{&lt;/span>
&lt;span class="nv">role&lt;/span> &lt;span class="o">=&lt;/span> aws_iam_role.aws_custom_role_for_lambda.name
&lt;span class="nv">policy_arn&lt;/span> &lt;span class="o">=&lt;/span> aws_iam_policy.s3_ro_access_policy.arn
&lt;span class="o">}&lt;/span>
resource &lt;span class="s2">&amp;#34;aws_iam_role&amp;#34;&lt;/span> &lt;span class="s2">&amp;#34;aws_s3_readwrite_role&amp;#34;&lt;/span> &lt;span class="o">{&lt;/span>
&lt;span class="nv">name&lt;/span> &lt;span class="o">=&lt;/span> &lt;span class="s2">&amp;#34;S3-RW-Role&amp;#34;&lt;/span>
&lt;span class="nv">description&lt;/span> &lt;span class="o">=&lt;/span> &lt;span class="s2">&amp;#34;Role to allow full RW to bucket&amp;#34;&lt;/span>
&lt;span class="o">}&lt;/span>
&lt;/code>&lt;/pre>&lt;/div>&lt;p>Full source code on
&lt;a href="https://github.com/florianakos/aws-iam-exercise/blob/master/terraform/s3.tf" target="_blank" rel="noopener">GitHub&lt;/a>.&lt;/p>
&lt;p>With this short Terraform code, I created an IAM policy that grants read-only access to the &lt;code>my-bucket&lt;/code> bucket in S3 and attached it to a role. To spice this up a bit, it is possible to add extra restrictions based on various elements of the request context, for example to restrict access based on source IP:&lt;/p>
&lt;div class="highlight">&lt;pre class="chroma">&lt;code class="language-shell" data-lang="shell">data &lt;span class="s2">&amp;#34;aws_iam_policy_document&amp;#34;&lt;/span> &lt;span class="s2">&amp;#34;s3_ro_access_policy_document&amp;#34;&lt;/span> &lt;span class="o">{&lt;/span>
statement &lt;span class="o">{&lt;/span>
&lt;span class="nv">effect&lt;/span> &lt;span class="o">=&lt;/span> &lt;span class="s2">&amp;#34;Deny&amp;#34;&lt;/span>
&lt;span class="nv">actions&lt;/span> &lt;span class="o">=&lt;/span> &lt;span class="o">[&lt;/span>
&lt;span class="s2">&amp;#34;s3:*&amp;#34;&lt;/span>
&lt;span class="o">]&lt;/span>
&lt;span class="nv">resources&lt;/span> &lt;span class="o">=&lt;/span> &lt;span class="o">[&lt;/span> &lt;span class="s2">&amp;#34;*&amp;#34;&lt;/span>&lt;span class="o">]&lt;/span>
condition &lt;span class="o">{&lt;/span>
&lt;span class="nb">test&lt;/span> &lt;span class="o">=&lt;/span> &lt;span class="s2">&amp;#34;NotIpAddress&amp;#34;&lt;/span>
&lt;span class="nv">variable&lt;/span> &lt;span class="o">=&lt;/span> &lt;span class="s2">&amp;#34;aws:SourceIp&amp;#34;&lt;/span>
&lt;span class="nv">values&lt;/span> &lt;span class="o">=&lt;/span> &lt;span class="o">[&lt;/span> &lt;span class="s2">&amp;#34;192.168.2.0/24&amp;#34;&lt;/span> &lt;span class="o">]&lt;/span>
&lt;span class="o">}&lt;/span>
&lt;span class="o">}&lt;/span>
&lt;span class="o">}&lt;/span>
&lt;/code>&lt;/pre>&lt;/div>&lt;p>With this policy in place, even if the user who makes the request to S3 has correct credentials, the request will be &lt;strong>denied&lt;/strong> if it comes from outside the subnet specified above! This can be very useful, for example, when restricting access to resources so that it is only possible from within a corporate network with a specific CIDR range.&lt;/p>
&lt;p>One caveat of this source IP restriction is that it can break certain AWS services that act on behalf of a principal/user. When using the AWS Athena service for example, triggering a query on data stored in S3 means Athena will make S3 API requests on behalf of the user who initiated the Athena query. Those requests originate from an Amazon-owned IP range rather than the allowed subnet, so they will be denied. To remedy this, an extra condition can be added:&lt;/p>
&lt;div class="highlight">&lt;pre class="chroma">&lt;code class="language-shell" data-lang="shell">data &lt;span class="s2">&amp;#34;aws_iam_policy_document&amp;#34;&lt;/span> &lt;span class="s2">&amp;#34;s3_ro_access_policy_document&amp;#34;&lt;/span> &lt;span class="o">{&lt;/span>
statement &lt;span class="o">{&lt;/span>
&lt;span class="nv">effect&lt;/span> &lt;span class="o">=&lt;/span> &lt;span class="s2">&amp;#34;Deny&amp;#34;&lt;/span>
&lt;span class="nv">actions&lt;/span> &lt;span class="o">=&lt;/span> &lt;span class="o">[&lt;/span>
&lt;span class="s2">&amp;#34;s3:*&amp;#34;&lt;/span>
&lt;span class="o">]&lt;/span>
&lt;span class="nv">resources&lt;/span> &lt;span class="o">=&lt;/span> &lt;span class="o">[&lt;/span> &lt;span class="s2">&amp;#34;*&amp;#34;&lt;/span>&lt;span class="o">]&lt;/span>
condition &lt;span class="o">{&lt;/span>
&lt;span class="nb">test&lt;/span> &lt;span class="o">=&lt;/span> &lt;span class="s2">&amp;#34;NotIpAddress&amp;#34;&lt;/span>
&lt;span class="nv">variable&lt;/span> &lt;span class="o">=&lt;/span> &lt;span class="s2">&amp;#34;aws:SourceIp&amp;#34;&lt;/span>
&lt;span class="nv">values&lt;/span> &lt;span class="o">=&lt;/span> &lt;span class="o">[&lt;/span> &lt;span class="s2">&amp;#34;192.168.2.0/24&amp;#34;&lt;/span> &lt;span class="o">]&lt;/span>
&lt;span class="o">}&lt;/span>
condition &lt;span class="o">{&lt;/span>
&lt;span class="nb">test&lt;/span> &lt;span class="o">=&lt;/span> &lt;span class="s2">&amp;#34;Bool&amp;#34;&lt;/span>
&lt;span class="nv">variable&lt;/span> &lt;span class="o">=&lt;/span> &lt;span class="s2">&amp;#34;aws:ViaAWSService&amp;#34;&lt;/span>
&lt;span class="nv">values&lt;/span> &lt;span class="o">=&lt;/span> &lt;span class="o">[&lt;/span> &lt;span class="s2">&amp;#34;false&amp;#34;&lt;/span> &lt;span class="o">]&lt;/span>
&lt;span class="o">}&lt;/span>
&lt;span class="o">}&lt;/span>
&lt;span class="o">}&lt;/span>
&lt;/code>&lt;/pre>&lt;/div>&lt;p>The &lt;code>aws:ViaAWSService = false&lt;/code> condition ensures that this Deny only takes effect when the request is not made by an AWS service on behalf of the principal. For additional info on other condition keys that can be used to grant or deny access, please consult the AWS
&lt;a href="https://docs.aws.amazon.com/IAM/latest/UserGuide/reference_policies_condition-keys.html" target="_blank" rel="noopener">documentation&lt;/a>.&lt;/p>
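&lt;p>Multiple condition blocks in one statement are combined with a logical AND, which is what makes the extra condition work. The toy model below only mimics the intended effect of the Deny statement (deny direct requests that come from outside the allowed range); it is not how IAM evaluates policies internally, and the function and constant names are made up for this example.&lt;/p>

```python
import ipaddress

# CIDR range taken from the policy example above.
ALLOWED_NETWORK = ipaddress.ip_network("192.168.2.0/24")


def deny_applies(source_ip, via_aws_service):
    """Toy model: the Deny matches only if every condition block matches."""
    outside_allowed_range = ipaddress.ip_address(source_ip) not in ALLOWED_NETWORK
    called_directly = not via_aws_service
    return outside_allowed_range and called_directly


# A direct request from outside the subnet is denied...
print(deny_applies("203.0.113.10", via_aws_service=False))
# ...while the same source address is let through when an AWS service
# (such as Athena) makes the call on the user's behalf.
print(deny_applies("203.0.113.10", via_aws_service=True))
```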
&lt;h2 id="conclusion">CONCLUSION&lt;/h2>
&lt;p>In this post I demonstrated how to use the boto3 python package to make AWS IAM and STS calls to access resources in the AWS cloud protected by IAM policies. I also discussed some advanced features of AWS IAM that can help you implement more granular IAM policies and access rights. The linked repository also contains an example which may be run locally and does not need the Lambda function to be created (it still, however, requires the Terraform resources to be deployed).&lt;/p></description></item><item><title>Cloud Service Testing</title><link>https://flrnks.netlify.app/post/python-aws-datadog-testing/</link><pubDate>Fri, 17 Jan 2020 11:11:00 +0000</pubDate><guid>https://flrnks.netlify.app/post/python-aws-datadog-testing/</guid><description>&lt;p>In this blog post I discuss a recent project I worked on to practice my skills related to AWS, Python and Datadog. It includes topics such as integration testing using &lt;strong>pytest&lt;/strong> and &lt;strong>localstack&lt;/strong>; running Continuous Integration via &lt;strong>Travis-CI&lt;/strong> and infrastructure as code using &lt;strong>Terraform&lt;/strong>.&lt;/p>
&lt;h2 id="intro">Intro&lt;/h2>
&lt;p>For the sake of this blog post, let&amp;rsquo;s assume that a periodic job runs somewhere in the Cloud, outside the context of this application, and generates a file with some metadata about the job itself. This data consists mostly of numerical values, such as the number of images used to train an ML model, or the number of files processed. In the diagram below, this part is depicted as a dummy Lambda function that periodically uploads this metadata file, filled with random numerical values, to an S3 bucket.&lt;/p>
&lt;p>&lt;img src="img/arch.png" alt="Architecture">&lt;/p>
&lt;p>When this file is uploaded, an event notification is sent to the message queue. The goal of the Python application is to periodically drain these messages from the queue. When the application runs, it fetches the S3 file referenced in each SQS message, parses the file&amp;rsquo;s contents and submits the numerical metrics to DataDog for the purpose of visualisation and alerting.&lt;/p>
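&lt;p>To make this flow more concrete, here is a minimal sketch of the parsing step. It relies on the standard S3 event notification layout (&lt;code>Records[].s3.bucket.name&lt;/code> and &lt;code>Records[].s3.object.key&lt;/code>) and assumes that the metadata file is a flat JSON document; the function names are hypothetical and do not come from the project&amp;rsquo;s code.&lt;/p>

```python
import json
from urllib.parse import unquote_plus


def s3_objects_from_sqs_body(body):
    """Yield (bucket, key) pairs from one SQS message body holding an S3 event."""
    event = json.loads(body)
    # Messages without "Records" (for example the S3 test event) yield nothing.
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        # Object keys are URL-encoded in S3 event notifications.
        key = unquote_plus(record["s3"]["object"]["key"])
        yield bucket, key


def numeric_metrics(file_content):
    """Keep only the numerical fields (assumes a flat JSON metadata file)."""
    data = json.loads(file_content)
    return {
        name: value
        for name, value in data.items()
        if isinstance(value, (int, float)) and not isinstance(value, bool)
    }
```

&lt;p>Each resulting name/value pair could then be passed on to a statsd gauge call.&lt;/p>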
&lt;h2 id="testing">Testing&lt;/h2>
&lt;p>Since the application interacts with two different APIs (AWS &amp;amp; Datadog), I figured it was a good idea to create integration tests that can be run easily via some free CI service (e.g.: Travis-CI.org). When writing the integration tests, I opted to create a simple mock class for testing the interaction with the Datadog API, and chose to rely on &lt;strong>localstack&lt;/strong> for testing the interaction with the AWS API.&lt;/p>
&lt;p>Thanks to &lt;strong>localstack&lt;/strong> I could skip creating real resources in AWS and instead use free fake resources in a Docker container that closely mimic the real AWS API. The AWS SDK for Python, &lt;code>boto3&lt;/code>, is very easy to reconfigure to connect to the fake resources in &lt;strong>localstack&lt;/strong> with the &lt;code>endpoint_url=&lt;/code> parameter.&lt;/p>
&lt;p>In the following sections I go through different phases of the project:&lt;/p>
&lt;ol>
&lt;li>coding the python app&lt;/li>
&lt;li>mocking Datadog statsd client&lt;/li>
&lt;li>setting up AWS resources in localstack&lt;/li>
&lt;li>creating integration tests&lt;/li>
&lt;li>Travis-CI integration&lt;/li>
&lt;li>running the datadog-agent locally&lt;/li>
&lt;li>setting up real AWS resources&lt;/li>
&lt;li>live testing&lt;/li>
&lt;/ol>
&lt;h3 id="-coding-the-python-app-">~ Coding the python app ~&lt;/h3>
&lt;p>The
&lt;a href="https://github.com/florianakos/python-testing/blob/master/app/submitter.py" target="_blank" rel="noopener">code&lt;/a> is mainly composed of two Python classes with methods to interact with AWS and DataDog. The &lt;strong>CloudResourceHandler&lt;/strong> class has methods to interact with S3 and SQS, which can be replaced in integration-tests with preconfigured &lt;code>boto3&lt;/code> clients for &lt;strong>localstack&lt;/strong>.&lt;/p>
&lt;p>The &lt;strong>MetricSubmitter&lt;/strong> class uses the &lt;strong>CloudResourceHandler&lt;/strong> internally and offers some additional methods for sending metrics to DataDog. Internally it uses statsd from the &lt;code>datadog&lt;/code> python
&lt;a href="https://pypi.org/project/datadog/" target="_blank" rel="noopener">package&lt;/a>, which can be replaced via dependency injection in integration tests with a mock statsd class that I created to test its interaction with the Datadog API.&lt;/p>
&lt;p>To connect to the real AWS &amp;amp; Datadog APIs (via a preconfigured local datadog-agent), two environment variables need to be specified at run-time:&lt;/p>
&lt;ul>
&lt;li>&lt;strong>STATSD_HOST&lt;/strong> set to &lt;code>localhost&lt;/code>&lt;/li>
&lt;li>&lt;strong>SQS_QUEUE_URL&lt;/strong> set to the URL of the Queue&lt;/li>
&lt;/ul>
&lt;div class="highlight">&lt;pre class="chroma">&lt;code class="language-python" data-lang="python">&lt;span class="n">os&lt;/span>&lt;span class="o">.&lt;/span>&lt;span class="n">environ&lt;/span>&lt;span class="p">[&lt;/span>&lt;span class="s1">&amp;#39;STATSD_HOST&amp;#39;&lt;/span>&lt;span class="p">]&lt;/span> &lt;span class="o">=&lt;/span> &lt;span class="s1">&amp;#39;localhost&amp;#39;&lt;/span>
&lt;span class="n">os&lt;/span>&lt;span class="o">.&lt;/span>&lt;span class="n">environ&lt;/span>&lt;span class="p">[&lt;/span>&lt;span class="s1">&amp;#39;SQS_QUEUE_URL&amp;#39;&lt;/span>&lt;span class="p">]&lt;/span> &lt;span class="o">=&lt;/span> &lt;span class="s1">&amp;#39;https://sqs.eu-central-1.amazonaws.com/????????????/cloud-job-results-queue&amp;#39;&lt;/span>
&lt;span class="n">session&lt;/span> &lt;span class="o">=&lt;/span> &lt;span class="n">boto3&lt;/span>&lt;span class="o">.&lt;/span>&lt;span class="n">Session&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">profile_name&lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="s1">&amp;#39;profile-name&amp;#39;&lt;/span>&lt;span class="p">)&lt;/span>
&lt;span class="n">MetricSubmitter&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">statsd&lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="n">datadog_statsd&lt;/span>&lt;span class="p">,&lt;/span>
&lt;span class="n">sqs_client&lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="n">session&lt;/span>&lt;span class="o">.&lt;/span>&lt;span class="n">client&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="s1">&amp;#39;sqs&amp;#39;&lt;/span>&lt;span class="p">),&lt;/span>
&lt;span class="n">s3_client&lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="n">session&lt;/span>&lt;span class="o">.&lt;/span>&lt;span class="n">client&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="s1">&amp;#39;s3&amp;#39;&lt;/span>&lt;span class="p">))&lt;/span>&lt;span class="o">.&lt;/span>&lt;span class="n">run&lt;/span>&lt;span class="p">()&lt;/span>
&lt;/code>&lt;/pre>&lt;/div>&lt;p>It also requires a preconfigured AWS profile in &lt;code>~/.aws/credentials&lt;/code>, which &lt;strong>boto3&lt;/strong> needs in order to authenticate to AWS:&lt;/p>
&lt;pre>&lt;code class="language-console" data-lang="console">[profile-name]
aws_access_key_id = XXXXXXXXXXXXXXX
aws_secret_access_key = XXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
region = eu-central-1
&lt;/code>&lt;/pre>&lt;p>But before running it, let&amp;rsquo;s set up some integration tests!&lt;/p>
&lt;h3 id="-mocking-datadog-statsd-client-">~ Mocking Datadog statsd client ~&lt;/h3>
&lt;p>In truth, the application does not interact directly with the Datadog API, but rather it uses &lt;strong>statsd&lt;/strong> from the &lt;code>datadog&lt;/code> python package, which interacts with the local &lt;code>datadog-agent&lt;/code>, which in turn forwards metrics and events to the Datadog API.&lt;/p>
&lt;p>To test this flow that relies on &lt;code>statsd&lt;/code>, I created a class called &lt;strong>DataDogStatsDHelper&lt;/strong>. This class has two methods (&lt;strong>gauge/event&lt;/strong>) with signatures identical to the real ones from the official &lt;code>datadog&lt;/code> package. However, the mock methods do not send anything to the &lt;code>datadog-agent&lt;/code>. Instead, they store the values they were passed in class attributes:&lt;/p>
&lt;div class="highlight">&lt;pre class="chroma">&lt;code class="language-python" data-lang="python">&lt;span class="k">class&lt;/span> &lt;span class="nc">DataDogStatsDHelper&lt;/span>&lt;span class="p">:&lt;/span>
&lt;span class="n">event_title&lt;/span> &lt;span class="o">=&lt;/span> &lt;span class="bp">None&lt;/span>
&lt;span class="n">event_text&lt;/span> &lt;span class="o">=&lt;/span> &lt;span class="bp">None&lt;/span>
&lt;span class="n">event_alert_type&lt;/span> &lt;span class="o">=&lt;/span> &lt;span class="bp">None&lt;/span>
&lt;span class="n">event_tags&lt;/span> &lt;span class="o">=&lt;/span> &lt;span class="bp">None&lt;/span>
&lt;span class="n">event_counter&lt;/span> &lt;span class="o">=&lt;/span> &lt;span class="mi">0&lt;/span>
&lt;span class="n">gauge_metric_name&lt;/span> &lt;span class="o">=&lt;/span> &lt;span class="bp">None&lt;/span>
&lt;span class="n">gauge_metric_value&lt;/span> &lt;span class="o">=&lt;/span> &lt;span class="bp">None&lt;/span>
&lt;span class="n">gauge_tags&lt;/span> &lt;span class="o">=&lt;/span> &lt;span class="bp">None&lt;/span>
&lt;span class="n">gauge_counter&lt;/span> &lt;span class="o">=&lt;/span> &lt;span class="mi">0&lt;/span>
&lt;span class="k">def&lt;/span> &lt;span class="nf">event&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="bp">self&lt;/span>&lt;span class="p">,&lt;/span> &lt;span class="n">title&lt;/span>&lt;span class="p">,&lt;/span> &lt;span class="n">text&lt;/span>&lt;span class="p">,&lt;/span> &lt;span class="n">alert_type&lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="bp">None&lt;/span>&lt;span class="p">,&lt;/span> &lt;span class="n">aggregation_key&lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="bp">None&lt;/span>&lt;span class="p">,&lt;/span> &lt;span class="n">source_type_name&lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="bp">None&lt;/span>&lt;span class="p">,&lt;/span>
&lt;span class="n">date_happened&lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="bp">None&lt;/span>&lt;span class="p">,&lt;/span> &lt;span class="n">priority&lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="bp">None&lt;/span>&lt;span class="p">,&lt;/span> &lt;span class="n">tags&lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="bp">None&lt;/span>&lt;span class="p">,&lt;/span> &lt;span class="n">hostname&lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="bp">None&lt;/span>&lt;span class="p">):&lt;/span>
&lt;span class="o">...&lt;/span>
&lt;span class="k">def&lt;/span> &lt;span class="nf">gauge&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="bp">self&lt;/span>&lt;span class="p">,&lt;/span> &lt;span class="n">metric&lt;/span>&lt;span class="p">,&lt;/span> &lt;span class="n">value&lt;/span>&lt;span class="p">,&lt;/span> &lt;span class="n">tags&lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="bp">None&lt;/span>&lt;span class="p">,&lt;/span> &lt;span class="n">sample_rate&lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="bp">None&lt;/span>&lt;span class="p">):&lt;/span>
&lt;span class="o">...&lt;/span>
&lt;/code>&lt;/pre>&lt;/div>&lt;p>When the MetricSubmitter class is tested, this mock class is injected instead of the real &lt;strong>statsd&lt;/strong> class, which makes it possible to assert that the recorded values match expectations.&lt;/p>
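&lt;p>The injection idea can be sketched in a few lines. The example below is a simplified stand-in, not the project&amp;rsquo;s actual mock or submitter: &lt;code>RecordingStatsd&lt;/code> and &lt;code>submit_metrics&lt;/code> are hypothetical names, and the fake stores calls in instance attributes (rather than class attributes) so that state cannot leak between tests.&lt;/p>

```python
class RecordingStatsd:
    """Hypothetical stand-in for statsd: records calls instead of sending them."""

    def __init__(self):
        self.gauges = []
        self.events = []

    def gauge(self, metric, value, tags=None, sample_rate=None):
        self.gauges.append((metric, value, tags))

    def event(self, title, text, alert_type=None, tags=None, **kwargs):
        self.events.append((title, text, alert_type, tags))


def submit_metrics(statsd, metrics, tags=None):
    """Toy submitter for this example: one gauge per metric, then one event."""
    for name, value in metrics.items():
        statsd.gauge(name, value, tags=tags)
    statsd.event("Job metrics submitted", f"{len(metrics)} metrics",
                 alert_type="info", tags=tags)


# The fake is injected where the real statsd client would normally go.
fake = RecordingStatsd()
submit_metrics(fake, {"job.images": 120}, tags=["env:test"])
assert fake.gauges == [("job.images", 120, ["env:test"])]
assert len(fake.events) == 1
```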
&lt;h3 id="-aws-resources-in-localstack-">~ AWS resources in localstack ~&lt;/h3>
&lt;p>To test how the python app integrates with S3 and SQS, I decided to use &lt;strong>localstack&lt;/strong>, running in a Docker container. To make it simple and repeatable, I created a &lt;code>docker-compose.yaml&lt;/code> file that allows the configuration parameters to be defined in YAML:&lt;/p>
&lt;div class="highlight">&lt;pre class="chroma">&lt;code class="language-yml" data-lang="yml">&lt;span class="k">version&lt;/span>&lt;span class="p">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="s1">&amp;#39;3.2&amp;#39;&lt;/span>&lt;span class="w">
&lt;/span>&lt;span class="w">&lt;/span>&lt;span class="k">services&lt;/span>&lt;span class="p">:&lt;/span>&lt;span class="w">
&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">localstack&lt;/span>&lt;span class="p">:&lt;/span>&lt;span class="w">
&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">image&lt;/span>&lt;span class="p">:&lt;/span>&lt;span class="w"> &lt;/span>localstack/localstack&lt;span class="p">:&lt;/span>latest&lt;span class="w">
&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">container_name&lt;/span>&lt;span class="p">:&lt;/span>&lt;span class="w"> &lt;/span>localstack&lt;span class="w">
&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">ports&lt;/span>&lt;span class="p">:&lt;/span>&lt;span class="w">
&lt;/span>&lt;span class="w"> &lt;/span>- &lt;span class="s1">&amp;#39;4563-4599:4563-4599&amp;#39;&lt;/span>&lt;span class="w">
&lt;/span>&lt;span class="w"> &lt;/span>- &lt;span class="s1">&amp;#39;8080:8080&amp;#39;&lt;/span>&lt;span class="w">
&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">environment&lt;/span>&lt;span class="p">:&lt;/span>&lt;span class="w">
&lt;/span>&lt;span class="w"> &lt;/span>- SERVICES=s3&lt;span class="p">,&lt;/span>sqs&lt;span class="w">
&lt;/span>&lt;span class="w"> &lt;/span>- AWS_ACCESS_KEY_ID=foo&lt;span class="w">
&lt;/span>&lt;span class="w"> &lt;/span>- AWS_SECRET_ACCESS_KEY=bar&lt;span class="w">
&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>The resulting fake AWS resources are accessible via different ports on localhost. In this case, S3 runs on port &lt;strong>4572&lt;/strong> and SQS on port &lt;strong>4576&lt;/strong>. Refer to the
&lt;a href="https://github.com/localstack/localstack#overview" target="_blank" rel="noopener">docs&lt;/a> on GitHub for more details on ports used by other AWS services in localstack.&lt;/p>
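&lt;p>Pointing &lt;code>boto3&lt;/code> at these ports only takes a few keyword arguments. The helper below is a small sketch under the port layout quoted above; its name and structure are assumptions for illustration and are not taken from the project&amp;rsquo;s &lt;code>LocalStackHelper&lt;/code> class.&lt;/p>

```python
# Per-service LocalStack ports quoted in this post.
LOCALSTACK_PORTS = {"s3": 4572, "sqs": 4576}


def localstack_client_kwargs(service, host="localhost", region="eu-central-1"):
    """Build keyword arguments for boto3.client() that target LocalStack."""
    return {
        "service_name": service,
        "endpoint_url": f"http://{host}:{LOCALSTACK_PORTS[service]}",
        "region_name": region,
        # Dummy credentials, matching the docker-compose file above.
        "aws_access_key_id": "foo",
        "aws_secret_access_key": "bar",
    }


# Usage (requires boto3 and a running LocalStack container):
#   import boto3
#   s3 = boto3.client(**localstack_client_kwargs("s3"))
#   print(s3.list_buckets())
```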
&lt;p>It is important to note that when localstack starts up, it is completely empty. Thus, before the integration tests can run, it is necessary to provision the S3 bucket and SQS queue in localstack, just as one would normally do it when using real AWS resources.&lt;/p>
&lt;p>For this purpose, it&amp;rsquo;s possible to write a simple bash script that can be called from the localstack container as part of an automatic init script:&lt;/p>
&lt;div class="highlight">&lt;pre class="chroma">&lt;code class="language-shell" data-lang="shell">aws --endpoint-url&lt;span class="o">=&lt;/span>http://localhost:4572 s3api create-bucket --bucket &lt;span class="s2">&amp;#34;bucket-name&amp;#34;&lt;/span> --region &lt;span class="s2">&amp;#34;eu-central-1&amp;#34;&lt;/span>
aws --endpoint-url&lt;span class="o">=&lt;/span>http://localhost:4576 sqs create-queue --queue-name &lt;span class="s2">&amp;#34;queue-name&amp;#34;&lt;/span> --region &lt;span class="s2">&amp;#34;eu-central-1&amp;#34;&lt;/span> --attributes &lt;span class="s2">&amp;#34;MaximumMessageSize=4096,MessageRetentionPeriod=345600,VisibilityTimeout=30&amp;#34;&lt;/span>
&lt;/code>&lt;/pre>&lt;/div>&lt;p>However, for the sake of making the integration-tests self-contained, I opted to integrate this into the tests as part of a class setup phase that runs before any tests and sets up the required S3 bucket and SQS queue:&lt;/p>
&lt;div class="highlight">&lt;pre class="chroma">&lt;code class="language-python" data-lang="python">&lt;span class="nd">@classmethod&lt;/span>
&lt;span class="k">def&lt;/span> &lt;span class="nf">setUpClass&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="bp">cls&lt;/span>&lt;span class="p">):&lt;/span>
    &lt;span class="bp">cls&lt;/span>&lt;span class="o">.&lt;/span>&lt;span class="n">ls&lt;/span> &lt;span class="o">=&lt;/span> &lt;span class="n">LocalStackHelper&lt;/span>&lt;span class="p">()&lt;/span>
    &lt;span class="bp">cls&lt;/span>&lt;span class="o">.&lt;/span>&lt;span class="n">ls&lt;/span>&lt;span class="o">.&lt;/span>&lt;span class="n">get_s3_client&lt;/span>&lt;span class="p">()&lt;/span>&lt;span class="o">.&lt;/span>&lt;span class="n">create_bucket&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">Bucket&lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="bp">cls&lt;/span>&lt;span class="o">.&lt;/span>&lt;span class="n">s3_bucket_name&lt;/span>&lt;span class="p">)&lt;/span>
    &lt;span class="bp">cls&lt;/span>&lt;span class="o">.&lt;/span>&lt;span class="n">ls&lt;/span>&lt;span class="o">.&lt;/span>&lt;span class="n">get_sqs_client&lt;/span>&lt;span class="p">()&lt;/span>&lt;span class="o">.&lt;/span>&lt;span class="n">create_queue&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">QueueName&lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="bp">cls&lt;/span>&lt;span class="o">.&lt;/span>&lt;span class="n">sqs_queue_name&lt;/span>&lt;span class="p">)&lt;/span>
&lt;/code>&lt;/pre>&lt;/div>&lt;h3 id="-creating-integration-tests-">~ Creating integration tests ~&lt;/h3>
&lt;p>As a next step I created the integration
&lt;a href="https://github.com/florianakos/python-testing/blob/master/app/test_submitter.py" target="_blank" rel="noopener">tests&lt;/a> which use the fake AWS resources in localstack, as well as the mock &lt;strong>statsd&lt;/strong> class for DataDog. I used two popular python packages to create these:&lt;/p>
&lt;ul>
&lt;li>&lt;strong>unittest&lt;/strong> which is a built-in package&lt;/li>
&lt;li>&lt;strong>pytest&lt;/strong> which is a 3rd party package&lt;/li>
&lt;/ul>
&lt;p>Actually, the test cases only use &lt;strong>unittest&lt;/strong>, while &lt;strong>pytest&lt;/strong> is used for the simple collection and execution of those tests. To get started with the &lt;strong>unittest&lt;/strong> framework, I created a python class and implemented the test cases within this class:&lt;/p>
&lt;div class="highlight">&lt;pre class="chroma">&lt;code class="language-python" data-lang="python">&lt;span class="kn">import&lt;/span> &lt;span class="nn">unittest&lt;/span>
&lt;span class="kn">from&lt;/span> &lt;span class="nn">app.utils.datadog_fake_statsd&lt;/span> &lt;span class="kn">import&lt;/span> &lt;span class="n">DataDogStatsDHelper&lt;/span>
&lt;span class="kn">from&lt;/span> &lt;span class="nn">app.utils.localstack_helper&lt;/span> &lt;span class="kn">import&lt;/span> &lt;span class="n">LocalStackHelper&lt;/span>
&lt;span class="kn">from&lt;/span> &lt;span class="nn">app.submitter&lt;/span> &lt;span class="kn">import&lt;/span> &lt;span class="n">MetricSubmitter&lt;/span>
&lt;span class="k">class&lt;/span> &lt;span class="nc">ProjectIntegrationTesting&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">unittest&lt;/span>&lt;span class="o">.&lt;/span>&lt;span class="n">TestCase&lt;/span>&lt;span class="p">):&lt;/span>
    &lt;span class="nd">@classmethod&lt;/span>
    &lt;span class="k">def&lt;/span> &lt;span class="nf">setUpClass&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="bp">cls&lt;/span>&lt;span class="p">):&lt;/span>
        &lt;span class="o">...&lt;/span>
    &lt;span class="k">def&lt;/span> &lt;span class="nf">setUp&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="bp">self&lt;/span>&lt;span class="p">):&lt;/span>
        &lt;span class="o">...&lt;/span>
    &lt;span class="k">def&lt;/span> &lt;span class="nf">test_ddg_submitter_valid_payload&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="bp">self&lt;/span>&lt;span class="p">):&lt;/span>
        &lt;span class="o">...&lt;/span>
    &lt;span class="k">def&lt;/span> &lt;span class="nf">test_ddg_submitter_invalid_payload&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="bp">self&lt;/span>&lt;span class="p">):&lt;/span>
        &lt;span class="o">...&lt;/span>
    &lt;span class="k">def&lt;/span> &lt;span class="nf">test_aws_handler_invalid_s3key&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="bp">self&lt;/span>&lt;span class="p">):&lt;/span>
        &lt;span class="o">...&lt;/span>
    &lt;span class="k">def&lt;/span> &lt;span class="nf">test_aws_handler_valid_s3key&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="bp">self&lt;/span>&lt;span class="p">):&lt;/span>
        &lt;span class="o">...&lt;/span>
&lt;/code>&lt;/pre>&lt;/div>&lt;p>In the &lt;strong>setUpClass&lt;/strong> method, a few things are taken care of before tests can be executed:&lt;/p>
&lt;ul>
&lt;li>define class variables for the bucket &amp;amp; the queue&lt;/li>
&lt;li>create SQS &amp;amp; S3 clients using localstack endpoint url&lt;/li>
&lt;li>provision needed resources (Queue/Bucket) in localstack&lt;/li>
&lt;/ul>
&lt;p>To test the interaction with DataDog via the statsd client, the submitter app is executed, which stores some values in the mock &lt;strong>statsd&lt;/strong> class&amp;rsquo;s internal variables, which are then used in assertions to compare values with expectations.&lt;/p>
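&lt;p>A minimal sketch of how such a mock might look, assuming a class that records every call in class-level variables instead of sending anything over the network. The names &lt;code>FakeStatsd&lt;/code>, &lt;code>gauges&lt;/code>, &lt;code>events&lt;/code> and the metric name are hypothetical and not taken from the repository:&lt;/p>

```python
class FakeStatsd:
    """Stand-in for a statsd client that records calls instead of sending them."""

    # Class-level storage, so tests can inspect it without holding an instance.
    gauges = []
    events = []

    @classmethod
    def gauge(cls, metric, value, tags=None):
        cls.gauges.append((metric, value, tuple(tags or ())))

    @classmethod
    def event(cls, title, text, tags=None):
        cls.events.append((title, text, tuple(tags or ())))

    @classmethod
    def reset(cls):
        # Called between tests so that recorded values do not leak across them.
        cls.gauges.clear()
        cls.events.clear()


# Code under test would call the fake exactly like a real client.
FakeStatsd.gauge("cloud_job.duration", 12.5, tags=["env:test"])
assert FakeStatsd.gauges == [("cloud_job.duration", 12.5, ("env:test",))]
```

&lt;p>Because the recorded values live on the class itself, a test can assert on them directly after the application code has run.&lt;/p>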
&lt;p>The other tests inspect the behaviour of the &lt;strong>CloudResourceHandler&lt;/strong> class. For example, one of the assertions tests whether the &lt;code>.has_available_messages()&lt;/code> function returns false when there are no more messages in the queue.&lt;/p>
&lt;p>A nice feature of &lt;strong>unittest&lt;/strong> is that it&amp;rsquo;s easy to define tasks that need to be executed before each test, to ensure a clean slate for each test. For example, the code in the &lt;strong>setUp&lt;/strong> method ensures two things:&lt;/p>
&lt;ul>
&lt;li>the fake SQS queue is emptied before each test&lt;/li>
&lt;li>class variables of the mock DataDog class are reset before each test&lt;/li>
&lt;/ul>
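&lt;p>The difference between the two hooks can be sketched with a small self-contained example. It assumes a hypothetical in-memory &lt;code>FakeQueue&lt;/code> in place of the real SQS queue; only the hook names &lt;code>setUpClass&lt;/code> and &lt;code>setUp&lt;/code> come from &lt;strong>unittest&lt;/strong> itself:&lt;/p>

```python
import unittest


class FakeQueue:
    """Hypothetical in-memory stand-in for the SQS queue."""

    def __init__(self):
        self.messages = []

    def has_available_messages(self):
        return len(self.messages) != 0

    def purge(self):
        self.messages.clear()


class CleanSlateExample(unittest.TestCase):
    @classmethod
    def setUpClass(cls):
        # Runs once: shared, relatively expensive resources are created here.
        cls.queue = FakeQueue()

    def setUp(self):
        # Runs before every test method: wipe state left by earlier tests.
        self.queue.purge()

    def test_a_leaves_a_message_behind(self):
        self.queue.messages.append("payload")
        self.assertTrue(self.queue.has_available_messages())

    def test_b_starts_with_empty_queue(self):
        self.assertFalse(self.queue.has_available_messages())
```

&lt;p>Without the &lt;code>purge()&lt;/code> call in &lt;code>setUp&lt;/code>, the second test would see the message left behind by the first one and fail.&lt;/p>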
&lt;p>Theoretically, it would be possible to run the tests by executing &lt;code>pytest -s -v&lt;/code> in the python project&amp;rsquo;s root directory; however, the tests rely on localstack, so they would fail unless it is already running&amp;hellip;&lt;/p>
&lt;h3 id="-travis-ci-integration-">~ Travis-CI integration ~&lt;/h3>
&lt;p>So now that the integration tests are created, I thought it would be really nice to have them automatically run in a CI service whenever someone pushes changes to the Git repo. To this end, I created a free account on &lt;code>travis-ci.org&lt;/code> and integrated it with my GitHub repo by creating a &lt;strong>.travis.yaml&lt;/strong> file with the below initial content:&lt;/p>
&lt;div class="highlight">&lt;pre class="chroma">&lt;code class="language-yaml" data-lang="yaml">&lt;span class="k">os&lt;/span>&lt;span class="p">:&lt;/span>&lt;span class="w"> &lt;/span>linux&lt;span class="w">
&lt;/span>&lt;span class="w">&lt;/span>&lt;span class="k">language&lt;/span>&lt;span class="p">:&lt;/span>&lt;span class="w"> &lt;/span>python&lt;span class="w">
&lt;/span>&lt;span class="w">&lt;/span>&lt;span class="k">python&lt;/span>&lt;span class="p">:&lt;/span>&lt;span class="w">
&lt;/span>&lt;span class="w"> &lt;/span>- &lt;span class="s2">&amp;#34;3.8&amp;#34;&lt;/span>&lt;span class="w">
&lt;/span>&lt;span class="w">&lt;/span>&lt;span class="k">services&lt;/span>&lt;span class="p">:&lt;/span>&lt;span class="w">
&lt;/span>&lt;span class="w"> &lt;/span>- docker&lt;span class="w">
&lt;/span>&lt;span class="w">&lt;/span>&lt;span class="k">script&lt;/span>&lt;span class="p">:&lt;/span>&lt;span class="w">
&lt;/span>&lt;span class="w"> &lt;/span>- {...}&lt;span class="w">
&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>However, I still needed a way to run &lt;code>localstack&lt;/code> and then execute the integration tests within the CI environment. Luckily I found &lt;strong>docker-compose&lt;/strong> to be a perfect fit for this purpose. I had already created a yaml file to describe how to run &lt;code>localstack&lt;/code>, so now I could just simply add an extra container that would run my tests. Here is how I created a docker image to run the tests via docker-compose:&lt;/p>
&lt;div class="highlight">&lt;pre class="chroma">&lt;code class="language-dockerfile" data-lang="dockerfile">&lt;span class="k">FROM&lt;/span>&lt;span class="s"> python:3.8-alpine&lt;/span>&lt;span class="err">
&lt;/span>&lt;span class="err">&lt;/span>&lt;span class="k">WORKDIR&lt;/span>&lt;span class="s"> /app&lt;/span>&lt;span class="err">
&lt;/span>&lt;span class="err">&lt;/span>&lt;span class="k">COPY&lt;/span> ./requirements-test.txt ./&lt;span class="err">
&lt;/span>&lt;span class="err">&lt;/span>&lt;span class="k">RUN&lt;/span> apk add --no-cache --virtual .pynacl_deps build-base gcc make python3 python3-dev libffi-dev &lt;span class="se">\
&lt;/span>&lt;span class="se">&lt;/span> &lt;span class="o">&amp;amp;&amp;amp;&lt;/span> pip3 install --upgrade setuptools pip &lt;span class="se">\
&lt;/span>&lt;span class="se">&lt;/span> &lt;span class="o">&amp;amp;&amp;amp;&lt;/span> pip3 install --no-cache-dir -r requirements-test.txt &lt;span class="se">\
&lt;/span>&lt;span class="se">&lt;/span> &lt;span class="o">&amp;amp;&amp;amp;&lt;/span> rm requirements-test.txt&lt;span class="err">
&lt;/span>&lt;span class="err">&lt;/span>&lt;span class="k">COPY&lt;/span> ./utils/*.py ./utils/&lt;span class="err">
&lt;/span>&lt;span class="err">&lt;/span>&lt;span class="k">COPY&lt;/span> ./*.py ./&lt;span class="err">
&lt;/span>&lt;span class="err">&lt;/span>&lt;span class="k">ENV&lt;/span> LOCALSTACK_HOST localstack&lt;span class="err">
&lt;/span>&lt;span class="err">&lt;/span>&lt;span class="k">ENTRYPOINT&lt;/span> &lt;span class="p">[&lt;/span>&lt;span class="s2">&amp;#34;pytest&amp;#34;&lt;/span>&lt;span class="p">,&lt;/span> &lt;span class="s2">&amp;#34;-s&amp;#34;&lt;/span>&lt;span class="p">,&lt;/span> &lt;span class="s2">&amp;#34;-v&amp;#34;&lt;/span>&lt;span class="p">]&lt;/span>&lt;span class="err">
&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>It installs the necessary dependencies to an alpine based python 3.8 image; adds the necessary source code, and finally executes &lt;strong>pytest&lt;/strong> to collect &amp;amp; run the tests. Here are the updates I had to make to the &lt;strong>docker-compose.yaml&lt;/strong> file:&lt;/p>
&lt;div class="highlight">&lt;pre class="chroma">&lt;code class="language-yaml" data-lang="yaml">&lt;span class="k">version&lt;/span>&lt;span class="p">:&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="s1">&amp;#39;3.2&amp;#39;&lt;/span>&lt;span class="w">
&lt;/span>&lt;span class="w">&lt;/span>&lt;span class="k">services&lt;/span>&lt;span class="p">:&lt;/span>&lt;span class="w">
&lt;/span>&lt;span class="w">  &lt;/span>&lt;span class="k">localstack&lt;/span>&lt;span class="p">:&lt;/span>&lt;span class="w">
&lt;/span>&lt;span class="w">    &lt;/span>{...}&lt;span class="w">
&lt;/span>&lt;span class="w">  &lt;/span>&lt;span class="k">integration-tests&lt;/span>&lt;span class="p">:&lt;/span>&lt;span class="w">
&lt;/span>&lt;span class="w">    &lt;/span>&lt;span class="k">container_name&lt;/span>&lt;span class="p">:&lt;/span>&lt;span class="w"> &lt;/span>cloud-job-it&lt;span class="w">
&lt;/span>&lt;span class="w">    &lt;/span>&lt;span class="k">build&lt;/span>&lt;span class="p">:&lt;/span>&lt;span class="w">
&lt;/span>&lt;span class="w">      &lt;/span>&lt;span class="k">context&lt;/span>&lt;span class="p">:&lt;/span>&lt;span class="w"> &lt;/span>.&lt;span class="w">
&lt;/span>&lt;span class="w">      &lt;/span>&lt;span class="k">dockerfile&lt;/span>&lt;span class="p">:&lt;/span>&lt;span class="w"> &lt;/span>Dockerfile-tests&lt;span class="w">
&lt;/span>&lt;span class="w">    &lt;/span>&lt;span class="k">depends_on&lt;/span>&lt;span class="p">:&lt;/span>&lt;span class="w">
&lt;/span>&lt;span class="w">      &lt;/span>- &lt;span class="s2">&amp;#34;localstack&amp;#34;&lt;/span>&lt;span class="w">
&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>Docker Compose auto-magically creates a shared network to enable connectivity between the defined services, which can call one-another by name. So when the tests are running in the &lt;strong>cloud-job-it&lt;/strong> container, they can use the hostname &lt;strong>localstack&lt;/strong> to create the &lt;strong>boto3&lt;/strong> session via the endpoint url to reach the fake AWS resources.&lt;/p>
&lt;p>For easier creation of AWS clients via localstack, I used a package called
&lt;a href="https://github.com/localstack/localstack-python-client" target="_blank" rel="noopener">localstack-python-client&lt;/a>, so I don&amp;rsquo;t have to deal with port numbers and low level details. However, this client by default tries to use &lt;strong>localhost&lt;/strong> as the hostname, which wouldn&amp;rsquo;t work in my setup using docker-compose. After digging through the source-code of this python package, I found a way to change this by setting an environment variable named &lt;strong>LOCALSTACK_HOST&lt;/strong>.&lt;/p>
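&lt;p>To illustrate the idea (this is not the package&amp;rsquo;s actual implementation), here is a rough sketch of how a hostname taken from &lt;strong>LOCALSTACK_HOST&lt;/strong> could be turned into an endpoint URL. The helper &lt;code>endpoint_url&lt;/code> and the &lt;code>SERVICE_PORTS&lt;/code> table are hypothetical; the port numbers are the ones used in the shell example earlier:&lt;/p>

```python
import os

# Per-service localstack ports, as used in the shell example earlier.
SERVICE_PORTS = {"s3": 4572, "sqs": 4576}


def endpoint_url(service, environ=os.environ):
    """Build the endpoint URL for a fake AWS service (hypothetical helper)."""
    # Fall back to localhost when the variable is not set, e.g. outside Docker.
    host = environ.get("LOCALSTACK_HOST", "localhost")
    return "http://{}:{}".format(host, SERVICE_PORTS[service])


print(endpoint_url("s3", {}))
print(endpoint_url("sqs", {"LOCALSTACK_HOST": "localstack"}))
```

&lt;p>The resulting string is what one would pass as the &lt;code>endpoint_url&lt;/code> argument when creating a &lt;strong>boto3&lt;/strong> client by hand.&lt;/p>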
&lt;p>As a final step, I just had to add two lines to complete the &lt;strong>.travis.yaml&lt;/strong> file:&lt;/p>
&lt;div class="highlight">&lt;pre class="chroma">&lt;code class="language-yaml" data-lang="yaml">&lt;span class="k">script&lt;/span>&lt;span class="p">:&lt;/span>&lt;span class="w">
&lt;/span>&lt;span class="w"> &lt;/span>- docker-compose&lt;span class="w"> &lt;/span>up&lt;span class="w"> &lt;/span>--build&lt;span class="w"> &lt;/span>--abort-on-container-exit&lt;span class="w">
&lt;/span>&lt;span class="w"> &lt;/span>- docker-compose&lt;span class="w"> &lt;/span>down&lt;span class="w"> &lt;/span>-v&lt;span class="w"> &lt;/span>--rmi&lt;span class="w"> &lt;/span>all&lt;span class="w"> &lt;/span>--remove-orphans&lt;span class="w">
&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>Thanks to the &lt;code>--abort-on-container-exit&lt;/code> flag, docker-compose will return the same exit code which is returned from the container that first exits, which fits this use-case perfectly, as the &lt;strong>cloud-job-it&lt;/strong> container only runs until the tests finish. This way the whole setup will gracefully shut down, while preserving the exit code from the container, allowing the CI system to generate an alert if it&amp;rsquo;s not 0 (meaning some test failed).&lt;/p>
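&lt;p>The exit-code convention that the CI system relies on can be sketched in a few lines. This is only an illustration with tiny child processes standing in for the test container; &lt;code>run_and_report&lt;/code> is a hypothetical helper:&lt;/p>

```python
import subprocess
import sys


def run_and_report(command):
    """Run a command and return its exit code, as a CI runner would see it."""
    completed = subprocess.run(command)
    return completed.returncode


# Simulate a passing and a failing test container with tiny child processes.
passing = run_and_report([sys.executable, "-c", "raise SystemExit(0)"])
failing = run_and_report([sys.executable, "-c", "raise SystemExit(1)"])
print("passing run:", passing, "failing run:", failing)
```

&lt;p>A CI service marks a build step as failed whenever the command it ran returns anything other than 0, which is why preserving the exit code of the test container matters.&lt;/p>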
&lt;h3 id="-running-the-datadog-agent-locally-">~ Running the datadog-agent locally ~&lt;/h3>
&lt;p>&lt;strong>Note&lt;/strong>: while Datadog is a paid service, it&amp;rsquo;s possible to create a trial account that&amp;rsquo;s free for 2 weeks, without the need to enter credit card details. This is pretty amazing!&lt;/p>
&lt;p>Now that the integration tests are automated and passing, I wanted to run the &lt;code>datadog-agent&lt;/code> locally, so that I can test the python application with some real data that was to be submitted to Datadog via the agent. Here is an
&lt;a href="https://docs.datadoghq.com/getting_started/agent/?tab=datadogeusite" target="_blank" rel="noopener">article&lt;/a> that was particularly useful to me, with instructions on how the agent should be set up.&lt;/p>
&lt;p>While the option of running it in docker-compose was initially appealing, I eventually decided to just start it manually as a long-lived detached container. Here is how I went about doing that:&lt;/p>
&lt;div class="highlight">&lt;pre class="chroma">&lt;code class="language-shell" data-lang="shell">&lt;span class="nv">DOCKER_CONTENT_TRUST&lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="m">1&lt;/span> docker run -d &lt;span class="se">\
&lt;/span>&lt;span class="se">&lt;/span> --name dd-agent &lt;span class="se">\
&lt;/span>&lt;span class="se">&lt;/span> -v /var/run/docker.sock:/var/run/docker.sock:ro &lt;span class="se">\
&lt;/span>&lt;span class="se">&lt;/span> -v /proc/:/host/proc/:ro &lt;span class="se">\
&lt;/span>&lt;span class="se">&lt;/span> -v /sys/fs/cgroup/:/host/sys/fs/cgroup:ro &lt;span class="se">\
&lt;/span>&lt;span class="se">&lt;/span> -e &lt;span class="nv">DD_API_KEY&lt;/span>&lt;span class="o">=&lt;/span>XXXXXXXXXXXXXXXXXXXXXXXXXXXXXX &lt;span class="se">\
&lt;/span>&lt;span class="se">&lt;/span> -e &lt;span class="nv">DD_SITE&lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="s2">&amp;#34;datadoghq.eu&amp;#34;&lt;/span> &lt;span class="se">\
&lt;/span>&lt;span class="se">&lt;/span> -e &lt;span class="nv">DD_DOGSTATSD_NON_LOCAL_TRAFFIC&lt;/span>&lt;span class="o">=&lt;/span>&lt;span class="nb">true&lt;/span> &lt;span class="se">\
&lt;/span>&lt;span class="se">&lt;/span> -p 8125:8125/udp &lt;span class="se">\
&lt;/span>&lt;span class="se">&lt;/span> datadog/agent:7
&lt;/code>&lt;/pre>&lt;/div>&lt;p>Most notable of these lines is the &lt;strong>DD_API_KEY&lt;/strong> environment variable which ensures that whatever data I send to the agent is associated with my own account. In addition, since I am closest to the EU region, I had to specify the endpoint via the &lt;strong>DD_SITE&lt;/strong> variable. Also, because I want the agent to accept metrics from the python app, I need to turn on a feature via the environment variable &lt;strong>DD_DOGSTATSD_NON_LOCAL_TRAFFIC&lt;/strong>, as well as expose port 8125 from the docker container to the host machine:&lt;/p>
&lt;div class="highlight">&lt;pre class="chroma">&lt;code class="language-bash" data-lang="bash"> ▶ docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
477cb2ea74b2 datadog/agent &lt;span class="s2">&amp;#34;/init&amp;#34;&lt;/span> &lt;span class="m">3&lt;/span> days ago Up &lt;span class="m">3&lt;/span> days &lt;span class="o">(&lt;/span>healthy&lt;span class="o">)&lt;/span> 0.0.0.0:8125-&amp;gt;8125/udp, 8126/tcp dd-agent
&lt;/code>&lt;/pre>&lt;/div>&lt;p>All seems to be well!&lt;/p>
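&lt;p>With port 8125 exposed, anything on the host can hand metrics to the agent over UDP. As a rough sketch of what travels over that port, assuming the plain-text DogStatsD gauge format, the datagram could be built and sent as follows. The helpers &lt;code>format_gauge&lt;/code> and &lt;code>send_gauge&lt;/code> and the metric name are hypothetical; a real application would normally use a statsd client library instead:&lt;/p>

```python
import socket


def format_gauge(metric, value, tags=()):
    """Render a gauge in the DogStatsD text format: name:value|g|#tag1,tag2"""
    datagram = "{}:{}|g".format(metric, value)
    if tags:
        datagram += "|#" + ",".join(tags)
    return datagram


def send_gauge(metric, value, tags=(), host="127.0.0.1", port=8125):
    """Fire-and-forget: UDP gives no confirmation that the agent received it."""
    payload = format_gauge(metric, value, tags).encode("utf-8")
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(payload, (host, port))


print(format_gauge("cloud_job.duration", 42, ["env:dev"]))
```

&lt;p>Since UDP is connectionless, a successful send says nothing about whether the agent is listening, so checking the Datadog UI remains the real verification.&lt;/p>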
&lt;h3 id="-deploying-real-aws-resources-">~ Deploying real AWS resources ~&lt;/h3>
&lt;p>Here I briefly discuss how I deployed some real resources in AWS to see my application running live. In a nutshell, I set the infra up as code in Terraform, which greatly simplified the whole process. All the necessary files are collected in a
&lt;a href="https://github.com/florianakos/python-testing/tree/master/terraform" target="_blank" rel="noopener">directory&lt;/a> of my repository:&lt;/p>
&lt;ul>
&lt;li>&lt;code>variables.tf&lt;/code> defines some variables used in multiple places&lt;/li>
&lt;li>&lt;code>init.tf&lt;/code> initialisation of the AWS provider and definition of AWS resources&lt;/li>
&lt;li>&lt;code>outputs.tf&lt;/code> defines some values that are reported when deployment finishes&lt;/li>
&lt;/ul>
&lt;p>The first and last files are not very interesting. Most of the interesting stuff happens in the &lt;strong>init.tf&lt;/strong>, which defines the necessary resources and permissions. One extra resource not mentioned before, is an AWS Lambda function, which gets executed every minute and is used to upload a JSON file to the S3 bucket. This acts as a random source of data, so that the python app has some work to do without manual intervention.&lt;/p>
&lt;h3 id="-live-testing-">~ Live testing ~&lt;/h3>
&lt;p>Now that all parts seem to be ready, it&amp;rsquo;s time to run the main python app using the real S3 bucket and SQS queue, as well as the local datadog-agent. The console output provides some hints about whether it&amp;rsquo;s able to pump the metrics from AWS to DataDog:&lt;/p>
&lt;div class="highlight">&lt;pre class="chroma">&lt;code class="language-bash" data-lang="bash">▶ python3 submitter.py
Initializing new Cloud Resource Handler with SQS URL - https://.../cloud-job-results-queue
Processing available messages in SQS queue:
- sending data to DataDog via statsd/datadog-agent.
- removing message from SQS &lt;span class="o">(&lt;/span>AQEBO37smPPHg6OIqbh3HMu3g...&lt;span class="o">)&lt;/span>
- ...
- sending data to DataDog via statsd/datadog-agent.
- removing message from SQS &lt;span class="o">(&lt;/span>AQEBV0/JzMVEP6k5kBmx2kvGn...&lt;span class="o">)&lt;/span>
No more messages visible in the queue, shutting down ...
Process finished with &lt;span class="nb">exit&lt;/span> code &lt;span class="m">0&lt;/span>
&lt;/code>&lt;/pre>&lt;/div>&lt;p>Next, I checked my DataDog account to see whether the metric data arrived. For this I created a custom
&lt;a href="https://app.datadoghq.eu/notebook/list" target="_blank" rel="noopener">Notebook&lt;/a> with graphs to display them:&lt;/p>
&lt;p>&lt;img src="img/datadog-metrics.png" alt="DataDog Metrics">&lt;/p>
&lt;p>All seems to be well! The deployed AWS Lambda function has already run a few times, providing input data for the python app, which were successfully processed and forwarded to Datadog. As seen on the &lt;code>Notebook&lt;/code> above, it is really easy to display metric data over time about any recurring workload, which can provide pretty useful insights into those jobs.&lt;/p>
&lt;p>Furthermore, since DataDog also supports submission of
&lt;a href="https://docs.datadoghq.com/events/" target="_blank" rel="noopener">events&lt;/a>, it becomes possible to design dashboards and create alerts which trigger based on more complex criteria, such as the presence or lack of events over certain periods of time. One such example can be seen below:&lt;/p>
&lt;p>&lt;img src="img/ok-vs-fail.png" alt="DataDog Dashboard OK">&lt;/p>
&lt;p>This is a so-called
&lt;a href="https://docs.datadoghq.com/dashboards/screenboards/" target="_blank" rel="noopener">screen-board&lt;/a> which I created to display the status of a Monitor that I set up previously. This Monitor tracks incoming events with the tag &lt;strong>cloud_job_metric&lt;/strong> and generates an alert, if there is not at least one such event of type &lt;strong>success&lt;/strong> in the last 30 minutes. The screen-board can be exported via a public URL if needed, or just simply displayed on a big screen somewhere in the office.&lt;/p>
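&lt;p>The rule behind this Monitor is simple enough to sketch in code. The snippet below is only an illustration of the logic, not of how DataDog evaluates monitors internally; &lt;code>should_alert&lt;/code> and the sample events are hypothetical:&lt;/p>

```python
from datetime import datetime, timedelta


def should_alert(events, now, window=timedelta(minutes=30)):
    """Return True when no 'success' event falls inside the trailing window.

    `events` is an iterable of (timestamp, event_type) tuples.
    """
    cutoff = now - window
    recent_successes = [
        timestamp
        for timestamp, event_type in events
        if event_type == "success" and now >= timestamp >= cutoff
    ]
    return not recent_successes


now = datetime(2020, 2, 3, 12, 0)
events = [
    (now - timedelta(minutes=45), "success"),  # too old to count
    (now - timedelta(minutes=10), "failure"),  # recent, but not a success
]
print(should_alert(events, now))
```

&lt;p>Alerting on the absence of a success event, rather than on the presence of a failure, also catches the case where the job silently stops running.&lt;/p>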
&lt;h2 id="conclusions">Conclusions&lt;/h2>
&lt;p>In this post I discussed a relatively complex project with lots of exciting technology working together in the realm of Cloud Computing. In the end, I was able to create DashBoards and Monitors in DataDog, which can ingest and display telemetry about AWS workloads, in a way that makes it useful to track and monitor the workloads themselves.&lt;/p></description></item><item><title>RunCode.ninja Challenges</title><link>https://flrnks.netlify.app/post/runcode/</link><pubDate>Sat, 11 Jan 2020 11:11:00 +0000</pubDate><guid>https://flrnks.netlify.app/post/runcode/</guid><description>&lt;p>This post was born on a misty Saturday morning, while slowly sipping some good quality coffee in a Prague café. I spent the last several days, once work was over, solving programming challenges on
&lt;a href="https://runcode.ninja/" target="_blank" rel="noopener">runcode.ninja&lt;/a> and I thought it would be nice to share my experience and spread the word about it.&lt;/p>
&lt;h3 id="runcodeninja">RunCode.ninja&lt;/h3>
&lt;p>I can&amp;rsquo;t really recall how I discovered this website in the first place&amp;hellip; All I remember is that I was really into the simplistic idea of it all. The basic idea for most of the challenges goes something like this:&lt;/p>
&lt;ul>
&lt;li>check problem description&lt;/li>
&lt;li>inspect any sample input (if any)&lt;/li>
&lt;li>write your program locally&lt;/li>
&lt;li>test on sample input (if any)&lt;/li>
&lt;li>submit source code to the evaluation platform&lt;/li>
&lt;/ul>
&lt;p>If all went well, you will get feedback within a few seconds whether the submitted code worked correctly for the given task at hand. If it didn&amp;rsquo;t, then you can turn to their
&lt;a href="https://runcode.ninja/faq" target="_blank" rel="noopener">FAQ&lt;/a> for some advice. It definitely has some useful info, however if all else fails, you can also contact the team behind the platform on their slack
&lt;a href="runcodeslack.slack.com">channel&lt;/a>. They are really friendly people so be sure to respond to their effort in kind!&lt;/p>
&lt;p>&lt;img src="runcode.png" alt="easy-category">&lt;/p>
&lt;p>Another nice thing about their platform is that they categorized all their challenges (119 in total as of now) into nice categories such as &lt;code>binary, encoding, encryption, forensics, etc.&lt;/code> which allows you to select what you are interested in. When I started out, I was first aiming to complete the challenges in &lt;code>Easy&lt;/code> which offers a combination of relatively easy challenges from &lt;code>math, text-parsing, encoding&lt;/code> and other categories.&lt;/p>
&lt;p>As it currently stands, I rank 155 out of around 2400 registered users, which seems quite impressive at first, but I suspect there may be quite a few inactive accounts in their database. Also, there are some hardcore people who have already completed all their challenges, which seems quite impressive. If I could spend a few rainy and cold weekends working on these, I would probably catch up soon!&lt;/p>
&lt;p>Last but not least, their platform is set up to interpret several different programming languages, so you can choose to solve the challenges in the language you are most comfortable with. Once you solve a challenge, you can access its &lt;code>write-ups&lt;/code> which provide some very useful inspiration on how others have solved the same problem. This can provide some very valuable lessons, like that one time when I wrote a Go program that was 20 lines long to solve a challenge that took only 1 line to solve in Bash&amp;hellip;&lt;/p>
&lt;p>If you are interested to check out my solutions for some of the challenges, you can find them in my GitHub
&lt;a href="https://github.com/florianakos/codewarz" target="_blank" rel="noopener">repository&lt;/a>. For some of them I even created two different solutions, one in Python and another in Go, just to compare and practice working with both languages.&lt;/p>
&lt;p>Oh and I almost forgot to mention, they have some really cool stickers that they are not shy to send half-way across the world by post, so that&amp;rsquo;s another big plus for sticker fans :)&lt;/p>
&lt;p>&lt;img src="sticker.png" alt="sticker">&lt;/p>
&lt;p>That&amp;rsquo;s all for now, thank you for tuning in! :)&lt;/p></description></item><item><title>Infrastructure as Code</title><link>https://flrnks.netlify.app/post/infra-as-code/</link><pubDate>Tue, 12 Nov 2019 11:11:00 +0000</pubDate><guid>https://flrnks.netlify.app/post/infra-as-code/</guid><description>&lt;h2 id="introduction">Introduction&lt;/h2>
&lt;p>In this post I will briefly introduce different AWS services and show how to use Terraform to orchestrate and manage them. While the concept of the whole service is rather simple, its main use is enabling me to learn about this new emerging technology called Infrastructure-as-Code or IaC for short.&lt;/p>
&lt;h2 id="project-overview">Project overview&lt;/h2>
&lt;p>The main goal of this task is to deploy a server-less function and periodically query the Github API to get a list of public repositories for a given organisation (e.g.: Google). The retrieved information should then be stored in a compressed CSV file in a specific S3 bucket, while notifications should be created for new files saved to the bucket.&lt;/p>
&lt;p>&lt;img src="arch.png" alt="Go concurrency implemented">&lt;/p>
&lt;p>The main AWS components of the solution are:&lt;/p>
&lt;ul>
&lt;li>Lambda function written in Python&lt;/li>
&lt;li>CW Event Rule to schedule the Lambda periodically&lt;/li>
&lt;li>S3 for storing data in a bucket&lt;/li>
&lt;li>SQS for queueing notifications from S3&lt;/li>
&lt;/ul>
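&lt;p>The data-handling step of the Lambda described above can be sketched without any AWS dependency. This is a minimal illustration that assumes each repository is represented by a dict with &lt;code>name&lt;/code> and &lt;code>url&lt;/code> fields; both the field names and the helper &lt;code>repos_to_gzipped_csv&lt;/code> are hypothetical:&lt;/p>

```python
import csv
import gzip
import io


def repos_to_gzipped_csv(repos):
    """Serialise a list of repository dicts into gzip-compressed CSV bytes."""
    text = io.StringIO()
    writer = csv.DictWriter(text, fieldnames=["name", "url"])
    writer.writeheader()
    writer.writerows(repos)
    return gzip.compress(text.getvalue().encode("utf-8"))


sample = [{"name": "example-repo", "url": "https://example.com/example-repo"}]
blob = repos_to_gzipped_csv(sample)

# Round-trip the bytes to confirm the CSV survives compression intact.
print(gzip.decompress(blob).decode("utf-8"))
```

&lt;p>Building the file in memory keeps the sketch simple; the resulting bytes are what would be uploaded to the S3 bucket.&lt;/p>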
&lt;h2 id="possibilities">Possibilities&lt;/h2>
&lt;p>Various methods exist for the creation and configuration of these necessary resources. The simplest one is logging in to the AWS Management Console and setting up each component one by one via the GUI. This method, however, is slow, cumbersome and quite prone to errors.&lt;/p>
&lt;p>A better option can be to use the
&lt;a href="https://aws.amazon.com/tools/" target="_blank" rel="noopener">AWS SDK&lt;/a> for your favourite programming language. Several options exist, such as Java, Python, Go, Node.js, etc&amp;hellip; This option is less error-prone, but still quite cumbersome and slow.&lt;/p>
&lt;p>Perhaps one of the best options is to use Terraform, which is a popular Infrastructure as Code or IaC tool these days. It lets you define your infrastructure in a configuration language and has its own internal engine that talks to the AWS SDK to create the necessary infrastructure you defined.&lt;/p>
&lt;h2 id="setup-procedure">Setup procedure&lt;/h2>
&lt;p>Before we can make use of Terraform to deploy our project on AWS, we need to set up credentials. This can be done by logging in to the AWS Management Console and going to the Identity and Access Management section, which can provide the necessary Access Key and Secret value that you need to put into a file on disk. These credentials should be saved to &lt;code>~/.aws/credentials&lt;/code> as follows:&lt;/p>
&lt;div class="highlight">&lt;pre class="chroma">&lt;code class="language-bash" data-lang="bash">&lt;span class="o">[&lt;/span>default&lt;span class="o">]&lt;/span>
&lt;span class="nv">aws_access_key_id&lt;/span> &lt;span class="o">=&lt;/span> XXXXXXXXXXXX
&lt;span class="nv">aws_secret_access_key&lt;/span> &lt;span class="o">=&lt;/span> XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
&lt;/code>&lt;/pre>&lt;/div>&lt;p>This enables Terraform to make changes to your AWS infrastructure through API calls made to AWS to provision resources according to your definition in the .tf file. Once you create the desired configuration, a complete infrastructure can be deployed as simply as below:&lt;/p>
&lt;div class="highlight">&lt;pre class="chroma">&lt;code class="language-bash" data-lang="bash">$ ▶ ls -la
-rw-r--r-- &lt;span class="m">1&lt;/span> user group 4.9K Nov &lt;span class="m">21&lt;/span> 22:58 main.tf
$ ▶ terraform init
...
Terraform has been successfully initialized!
$ ▶ terraform apply
...
Plan: &lt;span class="m">13&lt;/span> to add, &lt;span class="m">0&lt;/span> to change, &lt;span class="m">2&lt;/span> to destroy.
Do you want to perform these actions?
Enter a value: yes
&lt;/code>&lt;/pre>&lt;/div>&lt;h2 id="project-building-blocks">Project building blocks&lt;/h2>
&lt;p>In this section I will go over each major component and explain what it is, what it does and how it is set up. First up is the main component: the core logic implemented in Python.&lt;/p>
&lt;h3 id="aws-simple-storage-service">AWS Simple Storage Service&lt;/h3>
&lt;p>This is a basic building block which we use to store data generated by the Lambda function. Since Lambdas are by nature server-less, they do not have persistent storage attached which can be used to save data between two invocations of the function. If we need persistent storage we need to use S3. The necessary Terraform code is below:&lt;/p>
&lt;div class="highlight">&lt;pre class="chroma">&lt;code class="language-tf" data-lang="tf">&lt;span class="kr">resource&lt;/span> &lt;span class="s2">&amp;#34;aws_s3_bucket&amp;#34;&lt;/span> &lt;span class="s2">&amp;#34;tf_aws_bucket&amp;#34;&lt;/span> &lt;span class="p">{&lt;/span>
&lt;span class="na">bucket&lt;/span> = &lt;span class="s2">&amp;#34;tf-aws-bucket&amp;#34;&lt;/span>
&lt;span class="na">tags&lt;/span> = &lt;span class="p">{&lt;/span>
&lt;span class="na">Name&lt;/span> = &lt;span class="s2">&amp;#34;Bucket for Terraform project&amp;#34;&lt;/span>
&lt;span class="na">Environment&lt;/span> = &lt;span class="s2">&amp;#34;Dev&amp;#34;&lt;/span>
&lt;span class="p">}&lt;/span>
&lt;span class="na">force_destroy&lt;/span> = &lt;span class="s2">&amp;#34;true&amp;#34;&lt;/span>
&lt;span class="p">}&lt;/span>
&lt;/code>&lt;/pre>&lt;/div>&lt;p>This will create a bucket named &lt;code>tf-aws-bucket&lt;/code> which we can then use to store the results of our Lambda function. As an extra feature, we also configured notifications for this bucket, which will be created when a compressed file with &lt;code>.gz&lt;/code> file type is created in the bucket. When this happens a notification will be generated and sent to the SQS queue that is also defined in the same Terraform file.&lt;/p>
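&lt;p>On the consuming side, each message in the SQS queue carries the S3 event as a JSON document. Here is a rough sketch of how a consumer might pull the object keys out of such a message, assuming the standard S3 event notification layout; the helper &lt;code>gz_keys_from_notification&lt;/code> and the sample key are hypothetical:&lt;/p>

```python
import json


def gz_keys_from_notification(message_body):
    """Extract keys of newly created .gz objects from an S3 event notification."""
    notification = json.loads(message_body)
    keys = []
    # Messages without a Records list are simply skipped.
    for record in notification.get("Records", []):
        key = record["s3"]["object"]["key"]
        if record["eventName"].startswith("ObjectCreated:") and key.endswith(".gz"):
            keys.append(key)
    return keys


body = json.dumps({
    "Records": [
        {
            "eventName": "ObjectCreated:Put",
            "s3": {
                "bucket": {"name": "tf-aws-bucket"},
                "object": {"key": "repos/twitter.csv.gz"},
            },
        }
    ]
})
print(gz_keys_from_notification(body))
```

&lt;p>The suffix check here only mirrors the filter configured on the bucket; in practice S3 already applies that filter before a notification is sent.&lt;/p>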
&lt;h3 id="aws-lambda">AWS Lambda&lt;/h3>
&lt;p>AWS Lambda is a server-less technology which lets you create a bare function in the cloud and call it from various other services, without having to worry about setting up an environment where it will run. Different programming languages are supported, such as Python, Java, Go and NodeJS. Once you deploy your code, your function receives its input as ordinary arguments, just like any function you would write locally, and you can give it permission to access and modify other resources in AWS, such as files stored in S3.&lt;/p>
&lt;p>This is exactly the use-case that was implemented in this project: a Lambda function that makes an API call to Github to download information, then stores it as a compressed CSV file in an S3 bucket. To define the target organisation and the bucket where information is saved, the Lambda function expects two arguments in the function call:&lt;/p>
&lt;div class="highlight">&lt;pre class="chroma">&lt;code class="language-json" data-lang="json">&lt;span class="p">{&lt;/span>
&lt;span class="nt">&amp;#34;org_name&amp;#34;&lt;/span> &lt;span class="p">:&lt;/span> &lt;span class="s2">&amp;#34;twitter&amp;#34;&lt;/span>&lt;span class="p">,&lt;/span>
&lt;span class="nt">&amp;#34;target_bucket&amp;#34;&lt;/span> &lt;span class="p">:&lt;/span> &lt;span class="s2">&amp;#34;repos_folder&amp;#34;&lt;/span>
&lt;span class="p">}&lt;/span>
&lt;/code>&lt;/pre>&lt;/div>&lt;p>This JSON input passed to the function is converted to a dictionary in Python, which can be checked for the keys that the code needs in order to work correctly:&lt;/p>
&lt;div class="highlight">&lt;pre class="chroma">&lt;code class="language-python" data-lang="python">&lt;span class="k">def&lt;/span> &lt;span class="nf">handler&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">event&lt;/span>&lt;span class="p">,&lt;/span> &lt;span class="n">context&lt;/span>&lt;span class="p">):&lt;/span>
    &lt;span class="c1"># verify that both required keys are present in the event payload&lt;/span>
    &lt;span class="k">if&lt;/span> &lt;span class="s1">&amp;#39;org_name&amp;#39;&lt;/span> &lt;span class="ow">not&lt;/span> &lt;span class="ow">in&lt;/span> &lt;span class="n">event&lt;/span>&lt;span class="o">.&lt;/span>&lt;span class="n">keys&lt;/span>&lt;span class="p">()&lt;/span> &lt;span class="ow">or&lt;/span> &lt;span class="s1">&amp;#39;target_bucket&amp;#39;&lt;/span> &lt;span class="ow">not&lt;/span> &lt;span class="ow">in&lt;/span> &lt;span class="n">event&lt;/span>&lt;span class="o">.&lt;/span>&lt;span class="n">keys&lt;/span>&lt;span class="p">():&lt;/span>
        &lt;span class="k">print&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="s2">&amp;#34;Missing &amp;#39;org_name&amp;#39; or &amp;#39;target_bucket&amp;#39; from request body (JSON)!&amp;#34;&lt;/span>&lt;span class="p">)&lt;/span>
&lt;/code>&lt;/pre>&lt;/div>&lt;p>The rest of the function&amp;rsquo;s code downloads the list of public repositories of the given organisation from the GitHub API and stores it in a temporary file that can be uploaded to S3, provided that the necessary permissions have been granted to this Lambda function:&lt;/p>
&lt;div class="highlight">&lt;pre class="chroma">&lt;code class="language-python" data-lang="python">&lt;span class="kn">import&lt;/span> &lt;span class="nn">boto3&lt;/span>
&lt;span class="n">s3&lt;/span> &lt;span class="o">=&lt;/span> &lt;span class="n">boto3&lt;/span>&lt;span class="o">.&lt;/span>&lt;span class="n">client&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="s2">&amp;#34;s3&amp;#34;&lt;/span>&lt;span class="p">)&lt;/span>
&lt;span class="n">s3&lt;/span>&lt;span class="o">.&lt;/span>&lt;span class="n">upload_file&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="n">path_to_local_file&lt;/span>&lt;span class="p">,&lt;/span> &lt;span class="n">target_bucket_name&lt;/span>&lt;span class="p">,&lt;/span> &lt;span class="n">key_name&lt;/span>&lt;span class="p">)&lt;/span>
&lt;/code>&lt;/pre>&lt;/div>&lt;p>In order to enable access to S3 from Lambda, we have to define some IAM policies and roles. First we define a policy stating that any role it is attached to can access the S3 bucket:&lt;/p>
&lt;div class="highlight">&lt;pre class="chroma">&lt;code class="language-tf" data-lang="tf">&lt;span class="kr">data&lt;/span> &lt;span class="s2">&amp;#34;aws_iam_policy_document&amp;#34;&lt;/span> &lt;span class="s2">&amp;#34;s3_lambda_access&amp;#34;&lt;/span> &lt;span class="p">{&lt;/span>
&lt;span class="nx">statement&lt;/span> &lt;span class="p">{&lt;/span>
&lt;span class="na">effect&lt;/span> = &lt;span class="s2">&amp;#34;Allow&amp;#34;&lt;/span>
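&lt;span class="na">resources&lt;/span> = &lt;span class="p">[&lt;/span>&lt;span class="s2">&amp;#34;arn:aws:s3:::tf-aws-bucket&amp;#34;&lt;/span>&lt;span class="p">,&lt;/span> &lt;span class="s2">&amp;#34;arn:aws:s3:::tf-aws-bucket/*&amp;#34;&lt;/span>&lt;span class="p">]&lt;/span>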
&lt;span class="na">actions&lt;/span> = &lt;span class="p">[&lt;/span>
&lt;span class="s2">&amp;#34;s3:GetObject&amp;#34;&lt;/span>&lt;span class="p">,&lt;/span>
&lt;span class="s2">&amp;#34;s3:PutObject&amp;#34;&lt;/span>&lt;span class="p">,&lt;/span>
&lt;span class="s2">&amp;#34;s3:ListBucket&amp;#34;&lt;/span>&lt;span class="p">,&lt;/span>
&lt;span class="p">]&lt;/span>
&lt;span class="p">}&lt;/span>
&lt;span class="p">}&lt;/span>
&lt;span class="kr">
&lt;/span>&lt;span class="kr">resource&lt;/span> &lt;span class="s2">&amp;#34;aws_iam_policy&amp;#34;&lt;/span> &lt;span class="s2">&amp;#34;s3_lambda_access&amp;#34;&lt;/span> &lt;span class="p">{&lt;/span>
&lt;span class="na">name&lt;/span> = &lt;span class="s2">&amp;#34;s3_lambda_access&amp;#34;&lt;/span>
&lt;span class="na">policy&lt;/span> = &lt;span class="nb">data&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="nx">aws_iam_policy_document&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="nx">s3_lambda_access&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="nx">json&lt;/span>
&lt;span class="p">}&lt;/span>
&lt;/code>&lt;/pre>&lt;/div>&lt;p>This policy is then attached to an IAM role which is allowed to be assumed by AWS Lambda:&lt;/p>
&lt;div class="highlight">&lt;pre class="chroma">&lt;code class="language-tf" data-lang="tf">&lt;span class="kr">resource&lt;/span> &lt;span class="s2">&amp;#34;aws_iam_role_policy_attachment&amp;#34;&lt;/span> &lt;span class="s2">&amp;#34;s3_lambda_access&amp;#34;&lt;/span> &lt;span class="p">{&lt;/span>
&lt;span class="na">role&lt;/span> = &lt;span class="nx">aws_iam_role&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="nx">tf_aws_exercise_role&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="nx">name&lt;/span>
&lt;span class="na">policy_arn&lt;/span> = &lt;span class="nx">aws_iam_policy&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="nx">s3_lambda_access&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="nx">arn&lt;/span>
&lt;span class="p">}&lt;/span>
&lt;span class="kr">
&lt;/span>&lt;span class="kr">resource&lt;/span> &lt;span class="s2">&amp;#34;aws_iam_role&amp;#34;&lt;/span> &lt;span class="s2">&amp;#34;tf_aws_exercise_role&amp;#34;&lt;/span> &lt;span class="p">{&lt;/span>
&lt;span class="na">name&lt;/span> = &lt;span class="s2">&amp;#34;tfExerciseRole&amp;#34;&lt;/span>
&lt;span class="na">description&lt;/span> = &lt;span class="s2">&amp;#34;Role that is allowed to be assumed by AWS Lambda, which will be taking all actions.&amp;#34;&lt;/span>
&lt;span class="na">tags&lt;/span> = &lt;span class="p">{&lt;/span>
&lt;span class="na">owner&lt;/span> = &lt;span class="s2">&amp;#34;tfExerciseBoss&amp;#34;&lt;/span>
&lt;span class="p">}&lt;/span>
&lt;span class="na">assume_role_policy&lt;/span> = &lt;span class="o">&amp;lt;&amp;lt;EOF&lt;/span>&lt;span class="s">
&lt;/span>&lt;span class="s">{
&lt;/span>&lt;span class="s"> &amp;#34;Version&amp;#34;: &amp;#34;2012-10-17&amp;#34;,
&lt;/span>&lt;span class="s"> &amp;#34;Statement&amp;#34;: [
&lt;/span>&lt;span class="s"> {
&lt;/span>&lt;span class="s"> &amp;#34;Action&amp;#34;: &amp;#34;sts:AssumeRole&amp;#34;,
&lt;/span>&lt;span class="s"> &amp;#34;Principal&amp;#34;: {
&lt;/span>&lt;span class="s"> &amp;#34;Service&amp;#34;: &amp;#34;lambda.amazonaws.com&amp;#34;
&lt;/span>&lt;span class="s"> },
&lt;/span>&lt;span class="s"> &amp;#34;Effect&amp;#34;: &amp;#34;Allow&amp;#34;
&lt;/span>&lt;span class="s"> }
&lt;/span>&lt;span class="s"> ]
&lt;/span>&lt;span class="s">}
&lt;/span>&lt;span class="s">&lt;/span>&lt;span class="o">EOF&lt;/span>
&lt;span class="p">}&lt;/span>
&lt;/code>&lt;/pre>&lt;/div>&lt;h3 id="aws-cloudwatch-events">AWS CloudWatch Events&lt;/h3>
&lt;p>This component is responsible for periodically calling our Lambda function, with the required arguments passed in JSON format. It was also configured via Terraform, but for the sake of simplicity, below is a screenshot taken from the AWS Management Console showing the CloudWatch Events rule as configured:&lt;/p>
&lt;p>&lt;img src="cwe.png" alt="Cloudwatch Events Rule">&lt;/p>
&lt;p>The screenshot shows that the rule is configured to execute the target Lambda function every 2 minutes.&lt;/p>
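&lt;p>For readers who prefer code to a screenshot, the same schedule can be sketched in Python. This is an illustration only, not the Terraform configuration used in the project: the helper &lt;code>build_schedule&lt;/code>, the rule name and the function ARN are all made up, and the sketch only builds the arguments that boto3&amp;rsquo;s &lt;code>put_rule&lt;/code> and &lt;code>put_targets&lt;/code> calls on the &lt;code>events&lt;/code> client would take, so it runs without AWS credentials.&lt;/p>

```python
import json


def build_schedule(rule_name, lambda_arn, payload, minutes):
    """Build keyword arguments for the CloudWatch Events put_rule and put_targets calls."""
    # Rate expressions use the singular unit for a value of 1, plural otherwise.
    unit = "minute" if minutes == 1 else "minutes"
    rule_kwargs = {
        "Name": rule_name,
        "ScheduleExpression": "rate({} {})".format(minutes, unit),
        "State": "ENABLED",
    }
    target_kwargs = {
        "Rule": rule_name,
        "Targets": [
            {
                "Id": rule_name + "-target",
                "Arn": lambda_arn,
                # Constant JSON text delivered to the handler as its event.
                "Input": json.dumps(payload),
            }
        ],
    }
    return rule_kwargs, target_kwargs


rule_kwargs, target_kwargs = build_schedule(
    "tf-aws-exercise-schedule",  # hypothetical rule name
    "arn:aws:lambda:eu-west-1:123456789012:function:tf-aws-exercise",  # placeholder ARN
    {"org_name": "twitter", "target_bucket": "tf-aws-bucket"},
    2,
)
print(rule_kwargs["ScheduleExpression"])

# With boto3 installed and credentials configured, these would be applied as:
#   events = boto3.client("events")
#   events.put_rule(**rule_kwargs)
#   events.put_targets(**target_kwargs)
```

&lt;p>Note that the rule alone is not enough: the Lambda function also needs a resource-based permission that allows &lt;code>events.amazonaws.com&lt;/code> to invoke it, otherwise the scheduled calls are rejected.&lt;/p>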
&lt;h3 id="results">Results&lt;/h3>
&lt;p>In summary, it took me a while to get the hang of the Infrastructure as Code concept and apply it while working with Terraform on AWS, but I can definitely see how it can benefit a bigger organisation that wants its cloud infrastructure to be stable and maintainable. IaC tools such as Terraform let developers define their infrastructure as code and check it in to version control for repeatable and more predictable deployment procedures. Now that I have this working project, I can run a simple &lt;code>terraform apply&lt;/code> to bring my service to life in seconds, with all required components and permissions correctly set up, while also being able to quickly destroy it with &lt;code>terraform destroy&lt;/code> if I choose to do so. This gives flexibility and greater ease of development that can speed up projects in the cloud.&lt;/p></description></item></channel></rss>