Navigating AWS for DevOps: A Beginner's Journey
Chapter 1: Introduction to AWS DevOps
In the initial segment of this series, I successfully Dockerized my project, enabling it to be hosted on any platform that supports Docker containers. Looking ahead to this part, I anticipated tackling the myriad configurations required on AWS for my scripts to function seamlessly.
While I may have been a tad dramatic about the number of steps involved, it genuinely feels like there’s a lack of comprehensive resources detailing how to identify the necessary AWS components and their purposes. The most insightful materials I've encountered for transforming a React project into a DevOps-ready application are YouTube tutorials. However, these videos often prioritize a step-by-step approach, neglecting to explain the reasoning behind each action and potential alternatives.
I don’t claim to be an expert; rather, I’m someone eager to learn through experimentation. If any seasoned professionals notice inaccuracies in my approach, I welcome corrections—it's crucial to steer clear of misleading guidance.
To provide some structure, I’ll draw from a particularly useful tutorial, while infusing my own insights and likely deviating from the script for the sake of exploration.
Overview of the Process
Essentially, we need to establish one or more users with sufficient permissions to execute our scripts. Subsequently, we'll utilize one of these users to store our container image in an accessible location. After that, we’ll create scripts to retrieve the container and deploy it to a place where it can be accessed on the internet. The ambiguity arises from the multitude of methods available for achieving this task.
As my goal is to familiarize myself with AWS, I’ll select free options that genuinely require AWS while steering clear of those that promise effortless solutions. However, I might explore those magical methods later, depending on what captures my interest.
Users and Permissions
The AWS users we create must be accessible via GitHub Actions for pushing containers post-commit, as well as for managing resources within AWS and setting up our final environment.
Storing the Container Image
You have several options for storing Docker images for deployment on AWS—DockerHub and GitHub Container Registry are popular choices. However, within AWS, you can utilize S3 for static assets or Elastic Container Registry (ECR). I’ve opted for ECR, as I suspect it will provide a more enriching learning experience.
Exposing the Site
Once our container is up and running, there are various methods to expose it, including Amplify, ECS on EC2, and ECS on Fargate. Amplify handles things automatically, reportedly utilizing Fargate behind the scenes. ECS on EC2 requires manual configuration, while Fargate abstracts away many of the complexities. Given my current skill level, I believe Fargate will be the most suitable option.
For additional context on EC2 versus Fargate, I found a helpful article that outlines the differences.
To expose our site, we can either SSH into an EC2 virtual machine and configure it by hand, or create an ECS task. Since SSH isn’t very DevOps-friendly, I plan to pursue the task route while incorporating a load balancer. Although a load balancer isn’t strictly necessary for a simple React site, the tutorial I referenced includes one, so I’ll likely follow suit, unless I choose otherwise.
Now that we have a vague roadmap and some understanding of our goals, let’s dive in.
GitHub Authentication
While the tutorial I found emphasizes using Docker push, I opted for GitHub Actions due to their broader applicability and the added security of not exposing my AWS credentials in a file. Let’s explore how that looks 🤔. (Note: this process veered off course several times—I didn't even end up using the IAM user for GitHub. For a quicker overview, skip ahead to the video.)
According to Amazon's best practices, access keys for the root user should be avoided. Thus, our first task is to create an IAM user. The process is somewhat convoluted, and more knowledgeable folks have outlined it here:
Unfortunately, I encountered issues early on. After completing the initial steps, I was prompted to configure an IAM user, which I didn’t have yet. So, I decided to diverge and perform the following:
- In the IAM Identity Center, look for the link labeled "IAM" at the bottom under "Related consoles." Click on that instead of the "Users" link, which is more visible.
- Click "Users," then "Create User," and select "I want to create an IAM user."
- I checked "AmazonEC2ContainerRegistryFullAccess" and "AmazonEC2FullAccess." I wasn’t sure if these permissions would suffice, but they seemed like a decent starting point (the CLI sketch after this list shows roughly the equivalent commands).
- Sign in as the IAM user to prompt the required password change.
- While logged in, create your secret keys using these instructions. Oops! I encountered several errors about permissions I should have set up for using the AWS management console. Perhaps I should have stuck with the admin permissions after all. Now, I'm tasked with adding the specific permissions causing errors, but they don't appear in the permissions list, so I guess I’ll grant my IAM user admin access eventually. It’s just a test account—what could possibly go wrong? (At least the errors cleared, so that’s a plus!)
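For reference, here is roughly what those console clicks amount to in the AWS CLI. This is a sketch, not exactly what I ran: the user name is a placeholder, and the two managed policies are just the ones I picked above, not necessarily the minimal set.

```bash
# Create the IAM user (the name is a placeholder)
aws iam create-user --user-name deploy-test-user

# Attach the managed policies selected in the console
aws iam attach-user-policy \
  --user-name deploy-test-user \
  --policy-arn arn:aws:iam::aws:policy/AmazonEC2ContainerRegistryFullAccess
aws iam attach-user-policy \
  --user-name deploy-test-user \
  --policy-arn arn:aws:iam::aws:policy/AmazonEC2FullAccess

# Create an access key pair (only needed if you stick with secret keys
# instead of the OIDC role I ended up using later)
aws iam create-access-key --user-name deploy-test-user
```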
We’ll now follow the steps for configuring AWS credentials in GitHub Actions. This should be straightforward, right? Hm, it suggests that I should have GitHub assume a role instead of utilizing secret keys. Frustratingly, each source I consult leads me to another with multiple methods, complicating matters further. This is just as daunting as my previous experience deploying a static site.
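For orientation, here is roughly where we’re heading in the workflow file once that role exists. This is a minimal sketch: the role ARN, region, and job layout are placeholders, and creating the role itself is what the next section deals with.

```yaml
# Sketch of a workflow job that assumes an AWS role via OIDC
# (role ARN and region are placeholders)
permissions:
  id-token: write   # lets the workflow request an OIDC token
  contents: read

jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Configure AWS credentials
        uses: aws-actions/configure-aws-credentials@v4
        with:
          role-to-assume: arn:aws:iam::123456789012:role/github-actions-deploy
          aws-region: us-east-1
```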
Next, we’ll address Configuring OpenID Connect in GitHub. Before we can finalize our GitHub Actions setup, we must tackle this first. The instructions are mostly clear until we reach the trust policy setup, where Amazon tries to steer you toward Cognito. Even if you look past that detour, there are no explicit instructions on how to use any of it: just example JSON snippets without context. Time to turn to YouTube for clearer guidance (starting at the timestamp where we create the role).
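Since the documentation is short on context, here is a sketch of what such a trust policy typically looks like. The account ID, organization, and repository are placeholders; the idea is that GitHub’s OIDC provider may assume the role, but only for workflows running from the named repository.

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Federated": "arn:aws:iam::123456789012:oidc-provider/token.actions.githubusercontent.com"
      },
      "Action": "sts:AssumeRoleWithWebIdentity",
      "Condition": {
        "StringEquals": {
          "token.actions.githubusercontent.com:aud": "sts.amazonaws.com"
        },
        "StringLike": {
          "token.actions.githubusercontent.com:sub": "repo:my-org/my-react-app:*"
        }
      }
    }
  ]
}
```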
Creating an ECS Cluster
While a nap seems inviting, we must persevere. I’ll attempt the steps from the video linked at the top of this post, starting around the 9:30 mark. Let’s see how this unfolds.
First, I didn’t find the "Networking Only" option, so I clicked "Create." I also skipped creating a VPC, as the default option may set one up for me. Surely, nothing bad will happen.
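If the redesigned console layout is confusing, the CLI equivalent is mercifully short; the cluster name here is a placeholder, and this only creates an empty cluster (Fargate comes into play when the service is created).

```bash
# Create an empty ECS cluster (the name is a placeholder)
aws ecs create-cluster --cluster-name react-site-cluster

# Confirm it exists
aws ecs describe-clusters --clusters react-site-cluster
```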
Creating a Registry in ECR
I followed the video instructions closely, and miraculously, nothing went awry. I’m unsure if I’ll utilize the push commands shown since I’m relying on GitHub Actions.
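For the GitHub Actions route, the equivalent of those console push commands is roughly the following pair of steps, continuing the workflow sketch from earlier. The repository name and tag are placeholders, and the repository name must match what was created in ECR.

```yaml
      - name: Log in to Amazon ECR
        id: ecr-login
        uses: aws-actions/amazon-ecr-login@v2

      - name: Build, tag, and push the image
        env:
          REGISTRY: ${{ steps.ecr-login.outputs.registry }}
          REPOSITORY: react-site          # must match the ECR repository name
          IMAGE_TAG: ${{ github.sha }}
        run: |
          docker build -t $REGISTRY/$REPOSITORY:$IMAGE_TAG .
          docker push $REGISTRY/$REPOSITORY:$IMAGE_TAG
```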
Creating the ECS Task
The only confusing part was when the presenter mentioned pointing to the container at the end—this is actually the first screen of the wizard. I copied the URL from my ECR Registry, but there’s no container present yet because I haven’t configured my GitHub Actions to push it.
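To illustrate where that image URI actually goes, here is a stripped-down Fargate task definition. The family, container name, sizes, and roles are placeholders; the execution role is what allows Fargate to pull the image from ECR.

```json
{
  "family": "react-site-task",
  "requiresCompatibilities": ["FARGATE"],
  "networkMode": "awsvpc",
  "cpu": "256",
  "memory": "512",
  "executionRoleArn": "arn:aws:iam::123456789012:role/ecsTaskExecutionRole",
  "containerDefinitions": [
    {
      "name": "react-site",
      "image": "123456789012.dkr.ecr.us-east-1.amazonaws.com/react-site:latest",
      "essential": true,
      "portMappings": [{ "containerPort": 80, "protocol": "tcp" }]
    }
  ]
}
```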
Creating a Load Balancer
Again, I relied on the video instructions. I faced some challenges locating the VPC ID created by ECS, but I eventually found "Your VPCs." It was the sole option in the dropdown while creating the load balancer, so I opted to use the default security group instead of creating a new one.
I struggled with the IP configuration, so I removed the default IP, planning to add one later after establishing a service as indicated in the video. The creation appeared successful.
Creating a Service
I attempted to create the service by following the provided steps, but it failed during task execution, resulting in automatic deletion. This was likely due to the absence of a container in ECR, so I’ll circle back after setting up my GitHub Actions in Part 3.
Update: The reason my service kept vanishing was due to selecting "Task" instead of "Service" in "Application Type." By opting for "Service," I could name the service, allowing it to appear in the "Services" list, even if it failed. I also discovered that my Task Definition contained an error because I needed to specify the Container name matching the repository name for the GitHub action to locate the container upon pushing. Once corrected, I copied the URI from the Task Definition to the Service and updated it to use the latest Task Definition revision.
Unfortunately, this wasn’t the end of my troubles. My service, which had been operational, disappeared again 🤦♀️. This was likely due to inconsistent references to the region across the workflow and task definition files, which didn’t align with my actual account region.
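For anyone wiring this up through GitHub Actions, the deployment steps that tripped me up look roughly like this. The container-name input must match the container name inside the task definition, and keeping the region in one place (the credentials step shown earlier) helps avoid the mismatch that made my service vanish. Service, cluster, and file names are placeholders.

```yaml
      - name: Fill the new image URI into the task definition
        id: render-task-def
        uses: aws-actions/amazon-ecs-render-task-definition@v1
        with:
          task-definition: task-definition.json
          container-name: react-site     # must match the name in the task definition
          image: ${{ steps.ecr-login.outputs.registry }}/react-site:${{ github.sha }}

      - name: Deploy the updated task definition to the service
        uses: aws-actions/amazon-ecs-deploy-task-definition@v1
        with:
          task-definition: ${{ steps.render-task-def.outputs.task-definition }}
          service: react-site-service
          cluster: react-site-cluster
          wait-for-service-stability: true
```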
Update 2: Troubleshooting AWS Setup
Ultimately, I faced numerous minor issues stemming from following a tutorial that utilized an outdated AWS Console version, skipping critical steps. I discovered a new tutorial that allowed me to access the container directly without routing through the load balancer, confirming that my container was functioning as expected.
At this point, I deleted the service I created primarily by adhering to the previous tutorial. Many essential service settings, such as their endpoints, are non-editable, necessitating complete deletion and recreation. I then followed a new tutorial, incorporating its ideas to get my service operational with a load balancer directing traffic appropriately.
However, I encountered a series of errors: a 502 Bad Gateway, followed by a 503 Service Temporarily Unavailable, and finally a 504 Gateway Timeout. I suspect the 502 and 503 errors are standard during deployments, yet the cause of the 504 error eluded me. Ultimately, I resolved it by creating a security group that allowed unrestricted traffic between the load balancer and the service—though this is likely poor practice. If this were a production scenario, I would need to determine the correct settings.
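For the record, the tidier version of that fix is to allow traffic into the service’s security group only from the load balancer’s security group, rather than from everywhere. Something like this, where both group IDs and the port are placeholders:

```bash
# Allow the container port on the service's security group,
# but only for traffic coming from the load balancer's security group
aws ec2 authorize-security-group-ingress \
  --group-id <service-sg-id> \
  --protocol tcp \
  --port 80 \
  --source-group <alb-sg-id>
```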
In the meantime, I discovered several valuable resources that guided me toward a solution:
- When creating a new service, a notification appears with a "View on CloudFormation" button. I clicked this every time to track service creation.
- The Services page provides insights into the current status.
- I found it beneficial to create the load balancer and target group separately before linking them to the service (the commands sketched after this list are roughly what that amounts to). This streamlined the process and reduced potential errors during service spin-up, especially since I recreated that service numerous times.
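Roughly, the separate creation I mean looks like this with the CLI; the names, VPC, subnets, and security group are placeholders, and --target-type ip is what Fargate’s awsvpc networking expects.

```bash
# Create the target group and load balancer first, then wire them together
aws elbv2 create-target-group \
  --name react-site-tg --protocol HTTP --port 80 \
  --vpc-id <vpc-id> --target-type ip

aws elbv2 create-load-balancer \
  --name react-site-alb \
  --subnets <subnet-id-1> <subnet-id-2> \
  --security-groups <alb-sg-id>

# Forward HTTP traffic from the load balancer to the target group
aws elbv2 create-listener \
  --load-balancer-arn <alb-arn> \
  --protocol HTTP --port 80 \
  --default-actions Type=forward,TargetGroupArn=<target-group-arn>
```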
Additionally, you can check the status of your targets through the load balancer, as alluded to in AWS troubleshooting resources, though it assumes prior knowledge of how to locate this information.
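Concretely, the check is target health on the target group; something like this (the ARN is a placeholder) tells you whether the tasks behind the load balancer are passing their health checks:

```bash
# Show whether the registered targets (the Fargate tasks) are healthy
aws elbv2 describe-target-health \
  --target-group-arn <target-group-arn>
```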
Next, we’ll work on automating deployments through GitHub Actions.