<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>go | FLRNKS</title><link>https://flrnks.netlify.app/tag/go/</link><atom:link href="https://flrnks.netlify.app/tag/go/index.xml" rel="self" type="application/rss+xml"/><description>go</description><generator>Source Themes Academic (https://sourcethemes.com/academic/)</generator><language>en-us</language><copyright>© 2024</copyright><lastBuildDate>Sun, 12 Jul 2020 11:11:00 +0000</lastBuildDate><image><url>https://flrnks.netlify.app/images/icon_hu0b7a4cb9992c9ac0e91bd28ffd38dd00_9727_512x512_fill_lanczos_center_2.png</url><title>go</title><link>https://flrnks.netlify.app/tag/go/</link></image><item><title>Testing Terraform Modules</title><link>https://flrnks.netlify.app/post/terraform-testing/</link><pubDate>Sun, 12 Jul 2020 11:11:00 +0000</pubDate><guid>https://flrnks.netlify.app/post/terraform-testing/</guid><description>&lt;h2 id="intro">Intro&lt;/h2>
&lt;p>I first heard of Terraform about a year ago while working on an assignment for a job interview. The learning curve was steep, and I still remember how confused I was by the syntax of HCL, which resembled JSON but was not exactly the same. I also remember hearing about the concept of Terraform Modules, but since the assignment did not require them, I skipped the topic for the time being.&lt;/p>
&lt;p>Fast forward to the present day, I&amp;rsquo;ve had a good amount of exposure to Terraform Modules at work, where we use them to provision resources on AWS in a standardized and rapid fashion. To broaden my knowledge of Terraform Modules, I created an exercise consisting of two TF Modules written with version 0.12 of Terraform. In this post I describe these two Terraform Modules and how I went about testing them to ensure they do what they are meant to.&lt;/p>
&lt;h2 id="what-is-a-terraform-module">What is a Terraform Module&lt;/h2>
&lt;p>According to the official
&lt;a href="https://www.terraform.io/docs/configuration/modules.html" target="_blank" rel="noopener">documentation&lt;/a> a Terraform Module is simply a container for multiple resources that are defined and used together. Terraform Modules can be embedded in each other to create a hierarchical structure of dependent resources. To define a Terraform Module, one creates one or more Terraform files that define input variables, resources and outputs. The input variables control properties of the resources, while the outputs reveal information about the created resources. These are often organized into a structure such as the following:&lt;/p>
&lt;ul>
&lt;li>&lt;code>variables.tf&lt;/code> defining the Terraform variables&lt;/li>
&lt;li>&lt;code>main.tf&lt;/code> creating the Terraform resources&lt;/li>
&lt;li>&lt;code>output.tf&lt;/code> listing the Terraform outputs&lt;/li>
&lt;/ul>
&lt;p>Note that the above is just an unenforced convention; it simply makes it easier to get a quick understanding of a Terraform Module. As an example, if an organization needs its AWS S3 buckets secured with the same policies to protect its data, it can embed these security policies in a TF Module and prescribe its use within the organization to enable those policies automatically. Next up is an example of just that.&lt;/p>
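&lt;p>A minimal module following this layout could look something like the sketch below (the file split matches the convention above; the variable, resource and output names are purely illustrative):&lt;/p>

```hcl
# variables.tf -- declares the module's inputs
variable "bucket_name" {
  type        = string
  description = "Name of the S3 bucket to create"
}

# main.tf -- creates the resources
resource "aws_s3_bucket" "this" {
  bucket = var.bucket_name
}

# output.tf -- reveals information about the created resources
output "bucket_arn" {
  value = aws_s3_bucket.this.arn
}
```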
&lt;h2 id="the-secure-bucket-tf-module">The Secure-Bucket TF Module&lt;/h2>
&lt;p>The first of the two Terraform Modules is &lt;code>tf-module-s3-bucket&lt;/code>, which can be used to create an S3 bucket in AWS that is secured to a higher degree, making it suitable for storing highly sensitive data. The security features of the bucket consist of:&lt;/p>
&lt;ul>
&lt;li>filtering on Source IPs that can access its contents&lt;/li>
&lt;li>enforcing encryption at rest (KMS) and in transit&lt;/li>
&lt;li>object-level and server access logging enabled&lt;/li>
&lt;li>filtering on IAM principals based on official
&lt;a href="https://aws.amazon.com/blogs/security/how-to-restrict-amazon-s3-bucket-access-to-a-specific-iam-role/" target="_blank" rel="noopener">docs&lt;/a>&lt;/li>
&lt;/ul>
&lt;p>When using this module, one can define a list of IPs and a list of IAM Principals to control who can access the contents of the bucket and from which networks. These restrictions are written into the Bucket Policy, which is a &lt;code>resource-based policy&lt;/code>: an explicit Deny there takes precedence over identity-based policies, so it does not matter if an IAM Role has been granted specific permission to access the bucket when the bucket&amp;rsquo;s own Bucket Policy denies that same access. Below is a good overview of the whole evaluation logic of AWS IAM:&lt;/p>
&lt;p>&lt;img src="static/aws-iam.png" alt="AWS IAM Evaluation Logic">&lt;/p>
&lt;p>In addition, server-access and object-level logging can be enabled as well to improve the bucket&amp;rsquo;s level of auditability. Altogether, these settings can greatly elevate the security of data in the S3 bucket that was created by this module.&lt;/p>
&lt;h2 id="the-s3-authz-tf-module">The S3-AuthZ TF Module&lt;/h2>
&lt;p>This second Terraform Module is called &lt;code>tf-module-s3-auth&lt;/code> and was written in part to complement the one used to create the S3 bucket. Its aim is to help create a single IAM policy that covers the S3 and KMS permissions needed by a given IAM Principal. The motivation comes from difficulties I&amp;rsquo;ve faced at work, where some of the IAM Roles we used ended up with too many policies attached. For further reference see the AWS
&lt;a href="https://docs.aws.amazon.com/IAM/latest/UserGuide/reference_iam-quotas.html" target="_blank" rel="noopener">docs&lt;/a> on IAM quotas.&lt;/p>
&lt;p>The Bucket Policy crafted by the first TF Module allows defining a list of IAM Principals that may interact with the bucket. With this TF Module one can define the particular S3 actions those IAM Principals CAN carry out on the data in the bucket. Additionally, this TF Module can also be used to allow KMS actions on the KMS keys that protect the data at rest in the bucket.&lt;/p>
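&lt;p>As a rough sketch of the kind of combined policy document such a module might produce (the action lists, bucket name and key ARN below are illustrative assumptions, not the module&amp;rsquo;s actual output):&lt;/p>

```hcl
data "aws_iam_policy_document" "s3_and_kms_access" {
  # S3 actions the IAM Principal is allowed to perform on the bucket data
  statement {
    sid       = "AllowS3DataAccess"
    effect    = "Allow"
    actions   = ["s3:GetObject", "s3:PutObject", "s3:ListBucket"]
    resources = ["arn:aws:s3:::my-secure-bucket", "arn:aws:s3:::my-secure-bucket/*"]
  }

  # KMS actions needed to use the key that encrypts the bucket at rest
  statement {
    sid       = "AllowKmsUsage"
    effect    = "Allow"
    actions   = ["kms:Decrypt", "kms:GenerateDataKey"]
    resources = ["arn:aws:kms:eu-west-1:123456789012:key/11111111-2222-3333-4444-555555555555"]
  }
}
```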
&lt;h2 id="untested-code-is-broken-code">Untested code is broken code&lt;/h2>
&lt;p>With infrastructure-as-code, just as with normal code, testing is often an afterthought. However, it seems to be catching on more and more nowadays. Nothing shows this better than the number of Google search results for &lt;code>Infrastructure as Code testing&lt;/code>: &lt;strong>235.000.000&lt;/strong> as of today (15.8.2020). While Infrastructure as Code is a much broader topic with many other interesting projects, this post will focus solely on Terraform. With Terraform, a good first step in the right direction is as simple as running &lt;code>terraform validate&lt;/code>, which can catch silly mistakes and syntax errors and provide feedback such as the one below:&lt;/p>
&lt;div class="highlight">&lt;pre class="chroma">&lt;code class="language-shell" data-lang="shell">Error: Missing required argument
on main.tf line 107, in output &lt;span class="s2">&amp;#34;s3_bucket_name&amp;#34;&lt;/span>:
107: output &lt;span class="s2">&amp;#34;s3_bucket_name&amp;#34;&lt;/span> &lt;span class="o">{&lt;/span>
The argument &lt;span class="s2">&amp;#34;value&amp;#34;&lt;/span> is required, but no definition was found.
&lt;/code>&lt;/pre>&lt;/div>&lt;p>In addition to the &lt;code>terraform validate&lt;/code> option, many IDEs such as IntelliJ, already have plugins that can alert to such issues, so I find myself not using it so often. However, it&amp;rsquo;s still nice to have this feature built into the &lt;code>terraform&lt;/code> executable!&lt;/p>
&lt;p>Once all syntax errors are fixed, testing can continue with the &lt;code>terraform plan&lt;/code> command. This command uses &lt;strong>terraform state&lt;/strong> information (local or remote) to figure out what changes are needed if the configuration is applied, which is truly useful for showing in advance what will be created or destroyed. However, a successful &lt;code>terraform plan&lt;/code> can still result in a failed deployment, because some constraints cannot be verified without making actual API calls to the Cloud Service Provider: &lt;code>terraform plan&lt;/code> makes no such API calls, it only computes the differences between the Terraform code and the Terraform state (local or remote). The resulting failures are usually very provider-specific.&lt;/p>
&lt;div class="highlight">&lt;pre class="chroma">&lt;code class="language-shell" data-lang="shell">data &lt;span class="s2">&amp;#34;aws_iam_policy_document&amp;#34;&lt;/span> &lt;span class="s2">&amp;#34;Deny-Non-CiscoCidr-S3-Access&amp;#34;&lt;/span> &lt;span class="o">{&lt;/span>
statement &lt;span class="o">{&lt;/span>
&lt;span class="nv">sid&lt;/span> &lt;span class="o">=&lt;/span> &lt;span class="s2">&amp;#34;Deny-All-S3-Actions-If-Not-In-IP-PrefixList&amp;#34;&lt;/span>
&lt;span class="nv">effect&lt;/span> &lt;span class="o">=&lt;/span> &lt;span class="s2">&amp;#34;Deny&amp;#34;&lt;/span>
&lt;span class="nv">actions&lt;/span> &lt;span class="o">=&lt;/span> &lt;span class="o">[&lt;/span> &lt;span class="s2">&amp;#34;s3:*&amp;#34;&lt;/span> &lt;span class="o">]&lt;/span>
&lt;span class="nv">resources&lt;/span> &lt;span class="o">=&lt;/span> &lt;span class="o">[&lt;/span> &lt;span class="s2">&amp;#34;*&amp;#34;&lt;/span> &lt;span class="o">]&lt;/span>
condition &lt;span class="o">{&lt;/span>
&lt;span class="nb">test&lt;/span> &lt;span class="o">=&lt;/span> &lt;span class="s2">&amp;#34;NotIpAddress&amp;#34;&lt;/span>
&lt;span class="nv">variable&lt;/span> &lt;span class="o">=&lt;/span> &lt;span class="s2">&amp;#34;aws:SourceIp&amp;#34;&lt;/span>
&lt;span class="nv">values&lt;/span> &lt;span class="o">=&lt;/span> local.ip_prefix_list
&lt;span class="o">}&lt;/span>
&lt;span class="o">}&lt;/span>
&lt;span class="o">}&lt;/span>
&lt;/code>&lt;/pre>&lt;/div>&lt;p>This Terraform code is syntactically correct and passes &lt;code>terraform validate&lt;/code>, and &lt;code>terraform plan&lt;/code> produces a valid plan. However, it still fails at the &lt;code>terraform apply&lt;/code> stage, because AWS has a restriction on the &lt;code>sid&lt;/code>: &lt;strong>For IAM policies, basic alphanumeric characters (A-Z,a-z,0-9) are the only allowed characters in the Sid value&lt;/strong>. This constraint is never checked before &lt;code>terraform apply&lt;/code> is called, at which point it fails the whole action with the error below:&lt;/p>
&lt;pre>&lt;code>An error occurred: Statement IDs (SID) must be alpha-numeric. Check that your input satisfies the regular expression [0-9A-Za-z]*
&lt;/code>&lt;/pre>&lt;p>Such types of errors can only be caught when making real API calls to the Cloud Service Provider (or to a truly identical mock of the real API) which will validate the calls and return errors if any are found. Next I will go into some details on how I went about testing the 2 Terraform Modules I wrote.&lt;/p>
&lt;h3 id="manual-testing-via-aws">Manual Testing via AWS&lt;/h3>
&lt;p>The most rudimentary form of testing can be done by setting up a real project that imports and uses the two Terraform Modules. This test can be found in my repository&amp;rsquo;s &lt;code>test/terraform/aws/&lt;/code> directory. For this to work, the AWS provider has to be set up with real credentials, which is beyond the scope of this post. I also opted to use S3 as the TF state backend, but this is optional; the state can just as well be stored locally in a &lt;code>.tfstate&lt;/code> file.&lt;/p>
&lt;p>First, Terraform has to be initialized via &lt;code>terraform init&lt;/code>, which triggers the download of the AWS Terraform Provider. Next, the changes can be planned and applied via &lt;code>terraform plan&lt;/code> and &lt;code>terraform apply&lt;/code> respectively. It&amp;rsquo;s interesting to note that a complete &lt;code>terraform apply&lt;/code> takes close to a minute to complete:&lt;/p>
&lt;div class="highlight">&lt;pre class="chroma">&lt;code class="language-shell" data-lang="shell">Apply complete! Resources: &lt;span class="m">7&lt;/span> added, &lt;span class="m">0&lt;/span> changed, &lt;span class="m">0&lt;/span> destroyed.
Outputs: &lt;span class="o">[&lt;/span>...&lt;span class="o">]&lt;/span>
real 0m49.090s
user 0m3.532s
sys 0m1.929s
&lt;/code>&lt;/pre>&lt;/div>&lt;p>Once the &lt;code>terraform apply&lt;/code> is complete, one can manually assert whether it went as expected, based on the outputs (if any) and by inspecting the resources that were created. While this can be good enough for new setups, it may not be sufficient when an already-deployed project has to be modified and one needs to make sure the changes will not have undesired side effects.&lt;/p>
&lt;h3 id="manual-testing-via-localstack">Manual Testing via localstack&lt;/h3>
&lt;p>In order to save time (and some costs), one may also consider using &lt;strong>localstack&lt;/strong>, which replicates most of the AWS API and its features to enable faster and easier development and testing. Naturally, this only helps if one targets AWS in the first place. In an earlier
&lt;a href="https://flrnks.netlify.app/post/python-aws-datadog-testing/" target="_blank" rel="noopener">post&lt;/a> I&amp;rsquo;ve already written about how to set it up, so I will not repeat it here. The most important thing is to enable the S3, IAM and KMS services in the
&lt;a href="https://github.com/florianakos/terraform-testing/blob/master/test/terraform/localstack/docker-compose.yml" target="_blank" rel="noopener">docker-compose.yaml&lt;/a> by setting the environment variable &lt;code>SERVICES=s3,kms,iam&lt;/code>, so the corresponding API endpoints are turned on.&lt;/p>
&lt;p>The Terraform files I wrote for testing on real AWS can be re-used for testing with localstack with some tweaks; for more detail see the &lt;code>test/terraform/localstack/&lt;/code> folder in my repository. Then it&amp;rsquo;s just a matter of running &lt;code>terraform init&lt;/code> followed by &lt;code>terraform plan&lt;/code> and &lt;code>apply&lt;/code> to create the fake resources in localstack.&lt;/p>
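&lt;p>The main tweak is pointing the AWS provider at localstack instead of the real API. A sketch of what this can look like (the port, region and dummy credentials are assumptions based on a default localstack setup):&lt;/p>

```hcl
provider "aws" {
  region                      = "eu-west-1"
  access_key                  = "test"
  secret_key                  = "test"
  skip_credentials_validation = true
  skip_requesting_account_id  = true
  s3_force_path_style         = true

  # Redirect service calls to the local endpoints instead of AWS
  endpoints {
    s3  = "http://localhost:4566"
    iam = "http://localhost:4566"
    kms = "http://localhost:4566"
  }
}
```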
&lt;div class="highlight">&lt;pre class="chroma">&lt;code class="language-shell" data-lang="shell">Apply complete! Resources: &lt;span class="m">7&lt;/span> added, &lt;span class="m">0&lt;/span> changed, &lt;span class="m">0&lt;/span> destroyed.
Outputs: &lt;span class="o">[&lt;/span> ... &lt;span class="o">]&lt;/span>
real 0m11.649s
user 0m3.589s
sys 0m1.580s
&lt;/code>&lt;/pre>&lt;/div>&lt;p>Notice that this time the &lt;code>terraform apply&lt;/code> took only about 10 seconds, which is considerably faster than using the real AWS API.&lt;/p>
&lt;h3 id="automating-tests-via-terratest">Automating tests via Terratest&lt;/h3>
&lt;p>As I&amp;rsquo;ve shown, running tests via Localstack can be much faster on average, but sometimes a project may require the use of some AWS services that are not supported by Localstack. In this case it becomes necessary to run tests against the real AWS API. For such situations I recommend &lt;code>terratest&lt;/code> from
&lt;a href="https://terratest.gruntwork.io/" target="_blank" rel="noopener">Gruntwork.io&lt;/a>, which is a Go library that provides capabilities to automate tests.&lt;/p>
&lt;p>It still requires a Terraform project to be set up, as described in &lt;code>Manual Testing via AWS&lt;/code>; however, being able to formally define and verify tests greatly increases confidence that the code under test functions the way it&amp;rsquo;s supposed to. In my test I implemented assertions on the output values of the &lt;code>terraform apply&lt;/code> as well as on the existence of the S3 bucket that was just created. In addition, the Go library provides ways to verify the AWS infrastructure setup by making HTTP calls or SSH connections, which makes it a pretty powerful tool.&lt;/p>
&lt;p>This &lt;code>terratest&lt;/code> setup can be found in my repo under
&lt;a href="https://github.com/florianakos/terraform-testing/blob/master/test/go/terraform_test.go" target="_blank" rel="noopener">test/go/terraform_test.go&lt;/a>.&lt;/p>
&lt;p>Running this test takes considerably longer than either of the two previous ones, but the advantage is that this can be easily automated and integrated into a CI/CD build where it can verify on-demand that the TF code still works as intended, even if there were some changes.&lt;/p>
&lt;div class="highlight">&lt;pre class="chroma">&lt;code class="language-shell" data-lang="shell">▶ go &lt;span class="nb">test&lt;/span>
TestTerraform 2020-08-09T21:46:22+02:00 logger.go:66: Terraform has been successfully initialized!
...
TestTerraform 2020-08-09T21:47:30+02:00 logger.go:66: Apply complete! Resources: &lt;span class="m">7&lt;/span> added, &lt;span class="m">0&lt;/span> changed, &lt;span class="m">0&lt;/span> destroyed.
...
TestTerraform 2020-08-09T21:48:08+02:00 logger.go:66: Destroy complete! Resources: &lt;span class="m">7&lt;/span> destroyed.
...
PASS
ok github.com/florianakos/terraform-testing/tests 116.670s
&lt;/code>&lt;/pre>&lt;/div>&lt;p>The basic idea of &lt;code>terratest&lt;/code> is to automate the process of creating and cleaning up resources for the purposes of tests. To avoid name clashes with existing AWS resources, it&amp;rsquo;s good practice to append some random string to resource names as part of the test, so tests do not fail due to unique-name constraints.&lt;/p>
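&lt;p>Terratest ships a helper for this in its &lt;code>random&lt;/code> package (&lt;code>random.UniqueId&lt;/code>), but the idea is simple enough to sketch with the standard library alone (the function name and charset below are my own illustration, not terratest&amp;rsquo;s):&lt;/p>

```go
package main

import (
	"fmt"
	"math/rand"
)

// randomSuffix returns an n-character lowercase alphanumeric string that
// can be appended to resource names in tests to avoid unique-name clashes.
func randomSuffix(n int) string {
	const charset = "abcdefghijklmnopqrstuvwxyz0123456789"
	b := make([]byte, n)
	for i := range b {
		b[i] = charset[rand.Intn(len(charset))]
	}
	return string(b)
}

func main() {
	// e.g. pass this name into the module under test via Terraform variables
	fmt.Println("secure-bucket-" + randomSuffix(8))
}
```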
&lt;h2 id="conclusion">Conclusion&lt;/h2>
&lt;p>In this post I have shown what options are available for testing a Terraform Module in local or remote settings. If one only works with AWS services, then localstack can be a great tool for quick local tests during development, while &lt;strong>terratest&lt;/strong> from Gruntwork can be a great help with codifying and automating such tests so they run against the real AWS Cloud from your favourite CI/CD setup.&lt;/p></description></item><item><title>RunCode.ninja Challenges</title><link>https://flrnks.netlify.app/post/runcode/</link><pubDate>Sat, 11 Jan 2020 11:11:00 +0000</pubDate><guid>https://flrnks.netlify.app/post/runcode/</guid><description>&lt;p>This post was born on a misty Saturday morning, while slowly sipping some good quality coffee in a Prague café. Over the last several days, once work was over, I solved programming challenges on
&lt;a href="https://runcode.ninja/" target="_blank" rel="noopener">runcode.ninja&lt;/a> and I thought it would be nice to share my experience and spread the word about it.&lt;/p>
&lt;h3 id="runcodeninja">RunCode.ninja&lt;/h3>
&lt;p>I can&amp;rsquo;t really recall how I discovered this website in the first place&amp;hellip; All I remember is that I was really into the simplistic idea of it all. The basic idea for most of the challenges goes something like this:&lt;/p>
&lt;ul>
&lt;li>check problem description&lt;/li>
&lt;li>inspect any sample input (if any)&lt;/li>
&lt;li>write your program locally&lt;/li>
&lt;li>test on sample input (if any)&lt;/li>
&lt;li>submit source code to the evaluation platform&lt;/li>
&lt;/ul>
&lt;p>If all went well, you will get feedback within a few seconds on whether the submitted code solved the task at hand. If it didn&amp;rsquo;t, you can turn to their
&lt;a href="https://runcode.ninja/faq" target="_blank" rel="noopener">FAQ&lt;/a> for some advice. It definitely has some useful info; if all else fails, you can also contact the team behind the platform on their Slack
&lt;a href="https://runcodeslack.slack.com">channel&lt;/a>. They are really friendly people, so be sure to respond to their effort in kind!&lt;/p>
&lt;p>&lt;img src="runcode.png" alt="easy-category">&lt;/p>
&lt;p>Another nice thing about the platform is that all the challenges (119 in total as of now) are sorted into nice categories such as &lt;code>binary, encoding, encryption, forensics, etc.&lt;/code>, which lets you select what you are interested in. When I started out, I first aimed to complete the challenges in &lt;code>Easy&lt;/code>, which offers a combination of relatively easy challenges from &lt;code>math, text-parsing, encoding&lt;/code> and other categories.&lt;/p>
&lt;p>As it currently stands, I rank 155th out of around 2,400 registered users, which seems quite impressive at first, but I suspect there may be quite a few inactive accounts in their database. There are also some hardcore users who have already completed every challenge, which is genuinely impressive. If I could spend just a few cold and rainy weekends working on these, I would probably catch up soon!&lt;/p>
&lt;p>Last but not least, their platform is set up to interpret several different programming languages, so you can solve challenges in the language you are most comfortable with. Once you solve a challenge, you gain access to its &lt;code>write-ups&lt;/code>, which provide very useful inspiration on how others solved the same problem. This can teach some valuable lessons, like the time I wrote a 20-line Go program for a challenge that took only one line to solve in Bash&amp;hellip;&lt;/p>
&lt;p>If you are interested in checking out my solutions to some of the challenges, you can find them in my GitHub
&lt;a href="https://github.com/florianakos/codewarz" target="_blank" rel="noopener">repository&lt;/a>. For some of them I even created two different solutions, one in Python and another in Go, just to compare and practice working with both languages.&lt;/p>
&lt;p>Oh and I almost forgot to mention, they have some really cool stickers that they are not shy to send half-way across the world by post, so that&amp;rsquo;s another big plus for sticker fans :)&lt;/p>
&lt;p>&lt;img src="sticker.png" alt="sticket">&lt;/p>
&lt;p>That&amp;rsquo;s all for now, thank you for tuning in! :)&lt;/p></description></item><item><title>Docker with Ansible</title><link>https://flrnks.netlify.app/post/ansible-docker/</link><pubDate>Fri, 13 Dec 2019 11:11:00 +0000</pubDate><guid>https://flrnks.netlify.app/post/ansible-docker/</guid><description>&lt;h2 id="introduction">Introduction&lt;/h2>
&lt;p>This post was written as a kind of learning diary for my most recent venture into the world of automation with &lt;code>Ansible&lt;/code>. The project uses Docker to package two services into a micro-services architecture and Ansible to build and deploy those services on remote hosts (with the help of Docker Compose).&lt;/p>
&lt;h3 id="the-idea">The Idea&lt;/h3>
&lt;p>The service implements a file-processing utility which monitors the file system (a particular folder), grabs any newly created files and stores them, compressed, in another folder. Interacting with the service is possible through a web interface which offers file uploads, simple statistics and the possibility to request email summaries.&lt;/p>
&lt;h3 id="the-approach">The Approach&lt;/h3>
&lt;p>The first idea was to write it all in Go, because I am quite comfortable with the language. However, after a few searches on the web, I discovered that a handy UNIX utility already exists for my exact use-case: &lt;code>inotify&lt;/code>. While Go has packages that wrap this utility, I eventually decided to just write a bash script around the &lt;code>inotify&lt;/code> tool instead of relying on Go for all parts of the service. This also gave me a convenient excuse to split the service into a two-piece set, both parts of which can be deployed and scaled independently, in the spirit of micro-service architecture. Next, I set out to learn enough Ansible to deploy the service packaged in Docker containers.&lt;/p>
&lt;h2 id="ansible-101">ANSIBLE 101&lt;/h2>
&lt;p>Before this project I never had the chance to use Ansible, but I had wanted to learn about it for quite a while, so here I will describe it briefly for those who are also at the start of their journey with Ansible.&lt;/p>
&lt;p>At the basic level, it is a tool for provisioning and configuring applications on remote systems in an automated fashion. To achieve the automation it uses so-called &lt;code>playbooks&lt;/code>, which define what steps are necessary to reach a desired state for remote systems. It runs mainly on UNIX systems, but is able to provision and configure both UNIX and Windows based systems.&lt;/p>
&lt;p>It is an &lt;code>agentless&lt;/code> tool, which means it does not require any special software to be installed on the remote hosts. Instead it relies on a remote connection (typically SSH), through which bash or PowerShell utilities carry out the necessary steps.&lt;/p>
&lt;p>Ansible uses an &lt;code>inventory&lt;/code> that describes the remote systems that can be provisioned through playbooks. Inventories can be defined statically in the local filesystem of the Ansible master node, or pulled dynamically from remote systems as well.&lt;/p>
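&lt;p>A static inventory can be as simple as an INI-style file. A hypothetical example matching the master/slave VM setup described later in this post (host names, IPs and the user are illustrative):&lt;/p>

```ini
[slaves]
slave1 ansible_host=192.168.56.101
slave2 ansible_host=192.168.56.102

[slaves:vars]
ansible_user=ubuntu
```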
&lt;h2 id="ansible-meets-docker">ANSIBLE MEETS DOCKER&lt;/h2>
&lt;p>For the purpose of this project, the main use of Ansible lies in its ability to build and run Docker containers. While Docker is not strictly needed to deploy this service on multiple remote hosts, deployment becomes much easier when all the necessary dependencies and the source code are packaged neatly in a container that can be easily shipped. Within the Docker container, all dependencies are set up and the service is configured in a reliable and consistent manner, while Ansible takes care of deploying and running the service.&lt;/p>
&lt;p>It is worth mentioning that other tools exist, such as Kubernetes and Docker Swarm, which focus more on shipping containerised applications. This blog post, however, will not deal with those, focusing entirely on Ansible and Docker instead. Future posts may discuss those alternatives in more detail.&lt;/p>
&lt;p>Below is a brief summary of the proposed architecture that depicts how Ansible and Docker are used together to achieve the desired state of deploying the containerised service on each Ansible host.&lt;/p>
&lt;p>&lt;img src="ansib-meets-dock.png" alt="Ansible meets Docker">&lt;/p>
&lt;p>Detailed instructions are out of scope for this post as well, but briefly: the above shows a snapshot of my local environment using virtual machines in VirtualBox. First, I created a master VM with Ubuntu Desktop and then two slave VMs with Ubuntu server (no GUI necessary). Ansible was installed on the Master node and proper SSH access was configured for both slave VMs from the master VM. In the Ansible playbook used to deploy the service on remote systems, the first few tasks were about installing necessary dependencies and setting up a local docker environment, which can later build and run containerised applications.&lt;/p>
&lt;h2 id="monolithic-vs-microservice">MONOLITHIC VS MICROSERVICE&lt;/h2>
&lt;p>Before discussing how Ansible was used to deploy the service on remote machines using Docker, it is worth going through the building blocks of the service itself. The set of features needed for the service:&lt;/p>
&lt;ul>
&lt;li>file monitoring service that grabs and compresses files&lt;/li>
&lt;li>web interface for file uploads, email sending and service stats&lt;/li>
&lt;/ul>
&lt;p>These features could be implemented in one application that runs all the necessary functions in parallel. In fact, on my first iteration, I opted to solve it this way, packaging all features into a single container. The below figure shows how it worked.&lt;/p>
&lt;p>&lt;img src="monolithic.png" alt="Monolithic Docker">&lt;/p>
&lt;p>However, for the sake of learning, it is worth considering a &lt;code>microservice&lt;/code> approach. This essentially means breaking up big &lt;code>monolithic&lt;/code> applications into smaller sub-components, and Docker is a perfect tool for this. For our purposes, such an architecture means deploying two separate containers: one for the Web UI backend (uploads, statistics and email) and another implementing the monitoring and compression service. Below is an updated figure showing the breakup of the previously monolithic approach.&lt;/p>
&lt;p>&lt;img src="microservice.png" alt="Microservice Docker">&lt;/p>
&lt;p>Breaking up the one container from the first iteration into two separate containers lets us reap some benefits of microservice architecture. Our application components can fail independently: for example, a bug in the email-sending service will not bring down the monitoring service. Such an architecture also means we can scale better with demand in the future; if there were a huge surge in requests to the web frontend, we could simply deploy more instances of that container and use a load balancer to distribute requests among them.&lt;/p>
&lt;h2 id="implementation">IMPLEMENTATION&lt;/h2>
&lt;p>To implement the web component, I used simple static HTML served from a &lt;code>GO&lt;/code> backend that also handled file uploads, sent email notifications and extracted statistical data from a shared SQLite3 database. To implement the file-monitoring service, I used the &lt;code>inotify-tools&lt;/code> available on UNIX systems, wrapped in a bash script that took care of the zipping and of generating logs and statistics into the SQLite3 database.&lt;/p>
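&lt;p>A rough sketch of what such a wrapper script can look like (the directory paths, the &lt;code>RUN_MONITOR&lt;/code> guard and the function name are my own illustration, not the exact &lt;code>monitor_service.sh&lt;/code> from the project):&lt;/p>

```shell
#!/usr/bin/env bash
# Watch WATCH_DIR for newly written files and gzip them into ARCHIVE_DIR.
WATCH_DIR="${WATCH_DIR:-/data/incoming}"
ARCHIVE_DIR="${ARCHIVE_DIR:-/data/archive}"

compress_file() {
  local src="$1"
  gzip -c "$src" > "$ARCHIVE_DIR/$(basename "$src").gz"
  rm -f "$src"
}

# The watch loop needs inotify-tools installed; it is guarded here so the
# function above can be sourced and tested on its own.
if [ "${RUN_MONITOR:-0}" = "1" ]; then
  inotifywait -m -e close_write --format '%w%f' "$WATCH_DIR" |
  while read -r file; do
    compress_file "$file"
  done
fi
```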
&lt;h3 id="docker-compose">Docker-Compose&lt;/h3>
&lt;p>Docker Compose was used to enable easier testing and deployment. The definitions in the &lt;code>docker-compose.yml&lt;/code> describe what docker containers should be started with what parameters. The two services defined in the docker-compose correspond to the two containers defined above using the micro-service architecture.&lt;/p>
&lt;p>The &lt;code>webserver&lt;/code> running the GO backend uses a few mounted folders plus an exposed port to let inbound communication reach the server. The &lt;code>monitor&lt;/code> uses 4 folders mounted from the host FS, which enable its core functionality (listening for files and zipping them to a different folder).&lt;/p>
&lt;h3 id="ansible">Ansible&lt;/h3>
&lt;p>Thanks to Docker Compose, it was relatively simple to deploy and run the service with Ansible, once the necessary packages and dependencies are installed on Ansible hosts. All it took was a simple Ansible Task using the docker_compose module:&lt;/p>
&lt;div class="highlight">&lt;pre class="chroma">&lt;code class="language-yml" data-lang="yml">- &lt;span class="k">name&lt;/span>&lt;span class="p">:&lt;/span>&lt;span class="w"> &lt;/span>Docker-Compose&lt;span class="w"> &lt;/span>UP&lt;span class="w">
&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">docker_compose&lt;/span>&lt;span class="p">:&lt;/span>&lt;span class="w">
&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">project_src&lt;/span>&lt;span class="p">:&lt;/span>&lt;span class="w"> &lt;/span>path_to_docker_compose_yml&lt;span class="w">
&lt;/span>&lt;span class="w"> &lt;/span>&lt;span class="k">build&lt;/span>&lt;span class="p">:&lt;/span>&lt;span class="w"> &lt;/span>yes&lt;span class="w">
&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>While testing the service, a few issues were discovered that could be considered bugs, but let&amp;rsquo;s call them features instead!&lt;/p>
&lt;h3 id="feature-1">Feature #1&lt;/h3>
&lt;p>Since the service lets users upload files, if a file is large enough, the processing may kick in before the upload completes. In that case the file may be corrupted and impossible to recover after unzipping. To mitigate this to a certain extent, a 5-second processing delay has been added to the &lt;code>monitor_service.sh&lt;/code> script, delaying processing in the hope that the upload finishes within those 5 seconds.&lt;/p>
&lt;h3 id="feature-2">Feature #2&lt;/h3>
&lt;p>While creating the two Dockerfiles describing each component of the service, I wanted to take an extra step and created a non-root user, so that the main process of the service does not run with full root access to everything. This worked well while developing and testing on a local system using manual &lt;code>docker-compose up/down&lt;/code> commands. However, once Ansible was updated to use Docker Compose via the &lt;code>docker_compose&lt;/code> module, certain functionality broke due to file/folder permission issues: the mounted folders belonged to root, while the running process was non-root, so it could not, for example, save uploaded files. Further investigation is needed to solve this; until then, the Dockerfiles have been reverted to start the main processes as root.&lt;/p>
&lt;h2 id="conclusion">CONCLUSION&lt;/h2>
&lt;p>All in all, working on this project has been a great opportunity to practice tools such as Docker, Docker Compose and Ansible. While I had used Docker briefly before, I had never used Ansible, and I learnt a great deal about it during this project. I can definitely see how it enables large organisations to streamline their processes when it comes to deploying and configuring various systems and services in their infrastructure. While this project is rather rudimentary, it gave me a good entry into this realm of IT.&lt;/p></description></item><item><title>Performance tuning GO</title><link>https://flrnks.netlify.app/post/go-performance/</link><pubDate>Mon, 11 Nov 2019 11:11:00 +0000</pubDate><guid>https://flrnks.netlify.app/post/go-performance/</guid><description>&lt;h3 id="introduction">Introduction&lt;/h3>
&lt;p>This post is going to contain a short story on how I managed to optimize the execution of a simple program, written for a coding challenge on the site &lt;code>runcode.ninja&lt;/code>.&lt;/p>
&lt;p>Short description of the task:&lt;/p>
&lt;div class="highlight">&lt;pre class="chroma">&lt;code class="language-bash" data-lang="bash">There is a text file which is given as argument to your program.This text
file contains lines, each of which is an encoded englishword. Recover them
and print them out to the standard output lineby line. Hint: the UNIX
built-in dictionary may come in handy at &lt;span class="s2">&amp;#34;/usr/share/dict/american-english&amp;#34;&lt;/span>.
&lt;/code>&lt;/pre>&lt;/div>&lt;p>To attack the problem, I used the GO language to write a program which used the built-in &lt;code>encoding&lt;/code> and &lt;code>os/exec&lt;/code> packages to decode the lines and to call grep to search the file-based dictionary. It was not very difficult to figure out that the encoding in use was base64.&lt;/p>
&lt;p>However, to make each line valid base64, either a single &lt;code>=&lt;/code> or a double &lt;code>==&lt;/code> padding had to be appended. The code below takes care of adding these extra characters at the end of each line.&lt;/p>
&lt;div class="highlight">&lt;pre class="chroma">&lt;code class="language-go" data-lang="go">&lt;span class="kd">func&lt;/span> &lt;span class="nf">decode&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="nx">encodedStr&lt;/span> &lt;span class="kt">string&lt;/span>&lt;span class="p">)&lt;/span> &lt;span class="kt">string&lt;/span> &lt;span class="p">{&lt;/span>
&lt;span class="nx">decoded&lt;/span>&lt;span class="p">,&lt;/span> &lt;span class="nx">err&lt;/span> &lt;span class="o">:=&lt;/span> &lt;span class="nx">base64&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="nx">StdEncoding&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="nf">DecodeString&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="nx">encodedStr&lt;/span>&lt;span class="p">)&lt;/span>
&lt;span class="k">for&lt;/span> &lt;span class="nx">err&lt;/span> &lt;span class="o">!=&lt;/span> &lt;span class="kc">nil&lt;/span> &lt;span class="p">{&lt;/span>
&lt;span class="nx">encodedStr&lt;/span> &lt;span class="o">+=&lt;/span> &lt;span class="s">&amp;#34;=&amp;#34;&lt;/span>
&lt;span class="nx">decoded&lt;/span>&lt;span class="p">,&lt;/span> &lt;span class="nx">err&lt;/span> &lt;span class="p">=&lt;/span> &lt;span class="nx">base64&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="nx">StdEncoding&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="nf">DecodeString&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="nx">encodedStr&lt;/span>&lt;span class="p">)&lt;/span>
&lt;span class="p">}&lt;/span>
&lt;span class="k">return&lt;/span> &lt;span class="nb">string&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="nx">decoded&lt;/span>&lt;span class="p">)&lt;/span>
&lt;span class="p">}&lt;/span>
&lt;/code>&lt;/pre>&lt;/div>&lt;p>In order to test whether the result of a decode operation is a valid word, a helper function was written, which takes a string as an argument and performs the call to grep via &lt;code>os/exec&lt;/code>.&lt;/p>
&lt;div class="highlight">&lt;pre class="chroma">&lt;code class="language-go" data-lang="go">&lt;span class="kd">func&lt;/span> &lt;span class="nf">dictLookup&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="nx">word&lt;/span> &lt;span class="kt">string&lt;/span>&lt;span class="p">)&lt;/span> &lt;span class="kt">bool&lt;/span> &lt;span class="p">{&lt;/span>
&lt;span class="nx">dictLocation&lt;/span> &lt;span class="o">:=&lt;/span> &lt;span class="s">&amp;#34;/usr/share/dict/american-english&amp;#34;&lt;/span>
&lt;span class="nx">_&lt;/span>&lt;span class="p">,&lt;/span> &lt;span class="nx">err&lt;/span> &lt;span class="o">:=&lt;/span> &lt;span class="nx">exec&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="nf">Command&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="s">&amp;#34;grep&amp;#34;&lt;/span>&lt;span class="p">,&lt;/span> &lt;span class="s">&amp;#34;-w&amp;#34;&lt;/span>&lt;span class="p">,&lt;/span> &lt;span class="nx">word&lt;/span>&lt;span class="p">,&lt;/span> &lt;span class="nx">dictLocation&lt;/span>&lt;span class="p">).&lt;/span>&lt;span class="nf">Output&lt;/span>&lt;span class="p">()&lt;/span>
&lt;span class="k">if&lt;/span> &lt;span class="nx">err&lt;/span> &lt;span class="o">!=&lt;/span> &lt;span class="kc">nil&lt;/span> &lt;span class="p">{&lt;/span>
&lt;span class="k">return&lt;/span> &lt;span class="kc">false&lt;/span>
&lt;span class="p">}&lt;/span>
&lt;span class="k">return&lt;/span> &lt;span class="kc">true&lt;/span>
&lt;span class="p">}&lt;/span>
&lt;/code>&lt;/pre>&lt;/div>&lt;p>Finally, putting these pieces together, there is a function which reads in the txt file, iterates over the lines and calls decode and dict lookup until a valid word comes out, then prints it to standard output. Below is the sample code.&lt;/p>
&lt;div class="highlight">&lt;pre class="chroma">&lt;code class="language-go" data-lang="go">&lt;span class="nx">scanner&lt;/span> &lt;span class="o">:=&lt;/span> &lt;span class="nx">bufio&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="nf">NewScanner&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="nx">file&lt;/span>&lt;span class="p">)&lt;/span>
&lt;span class="kd">var&lt;/span> &lt;span class="nx">line&lt;/span> &lt;span class="kt">string&lt;/span>
&lt;span class="k">for&lt;/span> &lt;span class="nx">scanner&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="nf">Scan&lt;/span>&lt;span class="p">()&lt;/span> &lt;span class="p">{&lt;/span>
&lt;span class="nx">line&lt;/span> &lt;span class="p">=&lt;/span> &lt;span class="nf">decode&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="nx">scanner&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="nf">Text&lt;/span>&lt;span class="p">())&lt;/span>
&lt;span class="k">for&lt;/span> &lt;span class="p">!(&lt;/span>&lt;span class="nf">dictLookup&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="nx">line&lt;/span>&lt;span class="p">))&lt;/span> &lt;span class="p">{&lt;/span>
&lt;span class="nx">line&lt;/span> &lt;span class="p">=&lt;/span> &lt;span class="nf">decode&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="nx">line&lt;/span>&lt;span class="p">)&lt;/span>
&lt;span class="p">}&lt;/span>
&lt;span class="nx">fmt&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="nf">Println&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="nx">line&lt;/span>&lt;span class="p">)&lt;/span>
&lt;span class="p">}&lt;/span>
&lt;/code>&lt;/pre>&lt;/div>&lt;h3 id="initial-results">Initial results&lt;/h3>
&lt;p>The sample code worked well enough and running it on the test / sample data provided yielded correct output, so all seemed to be fine!&lt;/p>
&lt;div class="highlight">&lt;pre class="chroma">&lt;code class="language-bash" data-lang="bash">flrnks@t460:~/drop_the_bass &lt;span class="o">(&lt;/span>master&lt;span class="o">)&lt;/span> ▶ go run main.go input.txt
interpretation
sanctioned
lawn
electives
unifying
&lt;/code>&lt;/pre>&lt;/div>&lt;p>Then came the idea to test this code on both of my laptops, because it did not seem to run very quickly, even though it only had to decode 5 lines. One of the machines is a ThinkPad T460 with an i5 and 16GB of RAM, while the other is a 15&amp;rdquo; MacBook Pro with an i9 CPU and 32GB of RAM. I initially developed the code on the ThinkPad, and was quite surprised how much slower it executed on the MacBook. I would have expected the opposite, since the ThinkPad is already around 3-4 years old with a less powerful CPU. Initial test results from both machines:&lt;/p>
&lt;div class="highlight">&lt;pre class="chroma">&lt;code class="language-bash" data-lang="bash"> &lt;span class="o">[&lt;/span>MacBook&lt;span class="o">]&lt;/span> &lt;span class="o">[&lt;/span>ThinkPad&lt;span class="o">]&lt;/span>
interpretation 285.76ms 32.61ms
lawn 425.63ms 59.31ms
unifying 1.10s 93.60ms
electives 1.20s 91.10ms
sanctioned 6.18s 141.28ms
&lt;/code>&lt;/pre>&lt;/div>&lt;p>Overall the MacBook took on average 9 seconds to finish, while the ThinkPad took around 0.5 to 1 second. This was not normal, so I had to investigate! 👀 😄&lt;/p>
&lt;h3 id="performance-tuning-10">Performance Tuning 1.0&lt;/h3>
&lt;p>Seeing the results and the difference in performance, I was quite interested in what could cause such a performance drop on the MacBook. My first idea was to introduce concurrency into the processing: instead of reading lines sequentially, each line is assigned via a channel to one of a pool of workers, which returns the decoded word to the main routine waiting for the results.&lt;/p>
&lt;p>&lt;img src="concurrent-go.png" alt="Go concurrency implemented">&lt;/p>
&lt;p>The above figure contains the basic idea for this concurrent processing model and the below code snippet shows some parts of the code that are most important:&lt;/p>
&lt;div class="highlight">&lt;pre class="chroma">&lt;code class="language-go" data-lang="go">&lt;span class="c1">// define the channels for distributing work and collecting the results
&lt;/span>&lt;span class="c1">&lt;/span>&lt;span class="nx">jobs&lt;/span> &lt;span class="o">:=&lt;/span> &lt;span class="nb">make&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="kd">chan&lt;/span> &lt;span class="kt">string&lt;/span>&lt;span class="p">)&lt;/span>
&lt;span class="nx">results&lt;/span> &lt;span class="o">:=&lt;/span> &lt;span class="nb">make&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="kd">chan&lt;/span> &lt;span class="kt">string&lt;/span>&lt;span class="p">)&lt;/span>
&lt;span class="c1">// use the waitgroup for syncing up between the workers
&lt;/span>&lt;span class="c1">&lt;/span>&lt;span class="nx">wg&lt;/span> &lt;span class="o">:=&lt;/span> &lt;span class="nb">new&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="nx">sync&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="nx">WaitGroup&lt;/span>&lt;span class="p">)&lt;/span>
&lt;span class="c1">// start up some workers that will block and wait
&lt;/span>&lt;span class="c1">&lt;/span>&lt;span class="k">for&lt;/span> &lt;span class="nx">w&lt;/span> &lt;span class="o">:=&lt;/span> &lt;span class="mi">1&lt;/span>&lt;span class="p">;&lt;/span> &lt;span class="nx">w&lt;/span> &lt;span class="o">&amp;lt;=&lt;/span> &lt;span class="mi">5&lt;/span>&lt;span class="p">;&lt;/span> &lt;span class="nx">w&lt;/span>&lt;span class="o">++&lt;/span> &lt;span class="p">{&lt;/span>
&lt;span class="nx">wg&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="nf">Add&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="mi">1&lt;/span>&lt;span class="p">)&lt;/span>
&lt;span class="k">go&lt;/span> &lt;span class="nf">workerFunc&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="nx">jobs&lt;/span>&lt;span class="p">,&lt;/span> &lt;span class="nx">results&lt;/span>&lt;span class="p">,&lt;/span> &lt;span class="nx">wg&lt;/span>&lt;span class="p">)&lt;/span>
&lt;span class="p">}&lt;/span>
&lt;span class="c1">// interate over the file line by line and queue them up in the jobs channel
&lt;/span>&lt;span class="c1">&lt;/span>&lt;span class="k">go&lt;/span> &lt;span class="kd">func&lt;/span>&lt;span class="p">()&lt;/span> &lt;span class="p">{&lt;/span>
&lt;span class="nx">scanner&lt;/span> &lt;span class="o">:=&lt;/span> &lt;span class="nx">bufio&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="nf">NewScanner&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="nx">file&lt;/span>&lt;span class="p">)&lt;/span>
&lt;span class="k">for&lt;/span> &lt;span class="nx">scanner&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="nf">Scan&lt;/span>&lt;span class="p">()&lt;/span> &lt;span class="p">{&lt;/span>
&lt;span class="nx">jobs&lt;/span> &lt;span class="o">&amp;lt;-&lt;/span> &lt;span class="nx">scanner&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="nf">Text&lt;/span>&lt;span class="p">()&lt;/span>
&lt;span class="p">}&lt;/span>
&lt;span class="nb">close&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="nx">jobs&lt;/span>&lt;span class="p">)&lt;/span>
&lt;span class="p">}()&lt;/span>
&lt;span class="c1">// In parallel routine wait for WG to finish and close channel for results
&lt;/span>&lt;span class="c1">&lt;/span>&lt;span class="k">go&lt;/span> &lt;span class="kd">func&lt;/span>&lt;span class="p">()&lt;/span> &lt;span class="p">{&lt;/span>
&lt;span class="nx">wg&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="nf">Wait&lt;/span>&lt;span class="p">()&lt;/span>
&lt;span class="nb">close&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="nx">results&lt;/span>&lt;span class="p">)&lt;/span>
&lt;span class="p">}()&lt;/span>
&lt;span class="c1">// Print out the results from the results channel.
&lt;/span>&lt;span class="c1">&lt;/span>&lt;span class="k">for&lt;/span> &lt;span class="nx">v&lt;/span> &lt;span class="o">:=&lt;/span> &lt;span class="k">range&lt;/span> &lt;span class="nx">results&lt;/span> &lt;span class="p">{&lt;/span>
&lt;span class="nx">fmt&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="nf">Println&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="nx">v&lt;/span>&lt;span class="p">)&lt;/span>
&lt;span class="p">}&lt;/span>
&lt;/code>&lt;/pre>&lt;/div>&lt;p>This parallel processing noticeably improved the performance, but still did not eliminate the substantial difference between the two platforms.&lt;/p>
&lt;p>&lt;em>Note&lt;/em>: implementing the concurrent model means the words on the standard output will appear in a random order, and so the submission to the grading system might fail.&lt;/p>
&lt;h3 id="performance-tuning-20">Performance Tuning 2.0&lt;/h3>
&lt;p>Next, I was looking around on the internet (Stack Overflow in particular), where I got the idea to stop calling grep via the &lt;code>os/exec&lt;/code> package and instead read the contents of the dictionary into memory and perform the lookups there, essentially trading memory footprint for speed. So I created a global dictionary &lt;code>map[string]bool&lt;/code>, loaded once at the start of the program and used as often as needed by the various goroutines. This was perfectly fine because the worker routines only performed read-only operations on the map, so there was no issue with concurrent access to the global variable.&lt;/p>
&lt;div class="highlight">&lt;pre class="chroma">&lt;code class="language-go" data-lang="go">&lt;span class="kd">var&lt;/span> &lt;span class="nx">wordDict&lt;/span> &lt;span class="p">=&lt;/span> &lt;span class="nb">make&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="kd">map&lt;/span>&lt;span class="p">[&lt;/span>&lt;span class="kt">string&lt;/span>&lt;span class="p">]&lt;/span>&lt;span class="kt">bool&lt;/span>&lt;span class="p">)&lt;/span>
&lt;span class="kd">func&lt;/span> &lt;span class="nf">loadDictionary&lt;/span>&lt;span class="p">()&lt;/span> &lt;span class="p">{&lt;/span>
&lt;span class="nx">dict&lt;/span>&lt;span class="p">,&lt;/span> &lt;span class="nx">_&lt;/span> &lt;span class="o">:=&lt;/span> &lt;span class="nx">os&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="nf">Open&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="s">&amp;#34;/usr/share/dict/american-english&amp;#34;&lt;/span>&lt;span class="p">)&lt;/span>
&lt;span class="k">defer&lt;/span> &lt;span class="nx">dict&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="nf">Close&lt;/span>&lt;span class="p">()&lt;/span>
&lt;span class="nx">ds&lt;/span> &lt;span class="o">:=&lt;/span> &lt;span class="nx">bufio&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="nf">NewScanner&lt;/span>&lt;span class="p">(&lt;/span>&lt;span class="nx">dict&lt;/span>&lt;span class="p">)&lt;/span>
&lt;span class="k">for&lt;/span> &lt;span class="nx">ds&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="nf">Scan&lt;/span>&lt;span class="p">()&lt;/span> &lt;span class="p">{&lt;/span>
&lt;span class="nx">wordDict&lt;/span>&lt;span class="p">[&lt;/span>&lt;span class="nx">ds&lt;/span>&lt;span class="p">.&lt;/span>&lt;span class="nf">Text&lt;/span>&lt;span class="p">()]&lt;/span> &lt;span class="p">=&lt;/span> &lt;span class="kc">true&lt;/span>
&lt;span class="p">}&lt;/span>
&lt;span class="p">}&lt;/span>
&lt;/code>&lt;/pre>&lt;/div>&lt;p>This way the dictionary lookups can no longer be bottlenecked by the I/O system of the particular OS the program is running on. Executing the same timing test now yielded much improved results. It became clear that the issue on the MacBook was the slow execution of the external &lt;code>grep&lt;/code> call from the GO program. Why that is, I am not sure, but the results speak for themselves:&lt;/p>
&lt;div class="highlight">&lt;pre class="chroma">&lt;code class="language-bash" data-lang="bash"> &lt;span class="o">[&lt;/span>MacBook&lt;span class="o">]&lt;/span> &lt;span class="o">[&lt;/span>ThinkPad&lt;span class="o">]&lt;/span>
interpretation 54.691µs 24.17µs
lawn 65.922µs 9.176µs
unifying 155.726µs 71.785µs
electives 113.074µs 47.478µs
sanctioned 286.94µs 464.20µs
&lt;/code>&lt;/pre>&lt;/div>&lt;p>Somehow the older and less powerful ThinkPad still seems considerably faster, but at least the difference is not so substantial anymore&amp;hellip; 😌&lt;/p>
&lt;h3 id="results">Results&lt;/h3>
&lt;p>The picture below summarizes the observed performance results, measured by execution time. To mitigate transient effects, 10 measurements were taken for each variant.&lt;/p>
&lt;p>&lt;img src="perf.png" alt="Performance measurements">&lt;/p>
&lt;p>Explanation of the different variants (Seq vs. Con and Grep vs. Map):&lt;/p>
&lt;ul>
&lt;li>&lt;code>Seq&lt;/code>: each line is decoded one after the other in sequence.&lt;/li>
&lt;li>&lt;code>Con&lt;/code>: each line is processed concurrently on a pool of workers.&lt;/li>
&lt;li>&lt;code>Grep&lt;/code>: dictionary lookup done via exec call to GREP.&lt;/li>
&lt;li>&lt;code>Map&lt;/code>: dictionary is loaded into a string map in memory.&lt;/li>
&lt;/ul>
&lt;p>Quite frankly, the results are clear. The most notable finding is that, compared to the most basic version (Seq-Grep), the biggest improvement is achieved not by using concurrency, but by eliminating the repeated calls to grep.&lt;/p>
&lt;p>This is not to say that enabling concurrency had no impact on the execution time: on average it decreased from 9 to 6 seconds, which is quite good already!&lt;/p>
&lt;p>However, I/O latency seems to cost more in performance than the lack of parallel processing, at least at the input scale of this example. The difference was less pronounced when the tests were run on a file with 500 lines of encoded words (instead of just 5).&lt;/p>
&lt;h3 id="conclusion">Conclusion&lt;/h3>
&lt;p>Never underestimate the power of I/O delay and the effect it can have on your program. Even if you have a very powerful machine, it can bog your performance down considerably! Also, implementing proper concurrent processing wherever possible may improve your program&amp;rsquo;s performance further.&lt;/p></description></item></channel></rss>