[DevOps] Designing our Initial CICD Process — 2023's Path to ECR, ECS and More!
Hello! Welcome once again to another write-up of a challenge I took great pride in resolving at our current organization this past year. Read on to find out how I reduced our developers' average build times from 1 hour to under 25 minutes!
This last year was a big step forward for us on the development team at rennie in terms of modernizing our CICD processes, improving our cloud infrastructure, and laying the foundation for a fully containerized application setup. But this article highlights something I took great pride in designing, setting up, and now running as our day-to-day CICD solution, and that is
Our brand new scalable Jenkins Distributed CICD Pipelines!
When I first joined rennie just over a year ago, one of the big challenges I was asked to tackle was improving our developers' SDLC for many of our internal applications. This was no easy task, and I was admittedly a bit overwhelmed by everything we had to build before we could run something entirely in-house.
The Challenge
A core issue with the previous CICD process was that it wasn't scalable and we did not control the resources assigned to it. We had opted into a paid CICD provider that capped our concurrent builds behind a subscription-based tier paywall, so wanting more builds meant increasing our monthly payments to the provider. That was feasible for us, but it shouldn't be the default answer to everything: you simply should not just throw $$$ at a problem.
With no granular control over the compute resources, builds took up to an hour on average, which meant significant wait times before our developers' code changes were integrated and deployed.
This also resulted in a lack of visibility into the current build and deployment status, leading to confusion and frustration among team members.
So that laid out one of my first big obstacles: implementing our own CICD pipelines.
The Setup
The tools we used:
- Jenkins CICD Pipelines with a distributed worker setup
- AWS EC2 Fleet for Templating and Auto-scaling our Jenkins Worker Nodes
- AWS Lambda Functions to trigger on-demand Jenkins Worker Nodes
- AWS CloudWatch for monitoring Jenkins Worker Nodes and shutting off idling nodes that are no longer actively running a build or deploy
- Docker Containers loaded with pipeline tools to reduce our reliance on Jenkins plugin dependencies
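To make the EC2 Fleet piece a bit more concrete: the worker nodes launch from a launch template so every node comes up identically configured. Below is a minimal sketch of the fleet request parameters; the template ID, capacity, and capacity type are illustrative placeholders, not our real configuration.

```python
def fleet_request(launch_template_id, capacity):
    """Build the arguments for an EC2 create_fleet call that launches
    Jenkins worker nodes from a launch template. Values here are
    placeholders, not our production settings."""
    return {
        "Type": "instant",  # launch right away rather than maintain a target
        "LaunchTemplateConfigs": [{
            "LaunchTemplateSpecification": {
                "LaunchTemplateId": launch_template_id,
                "Version": "$Latest",
            }
        }],
        "TargetCapacitySpecification": {
            "TotalTargetCapacity": capacity,
            "DefaultTargetCapacityType": "on-demand",
        },
    }
```

The resulting dict would be passed to a boto3 EC2 client as `ec2.create_fleet(**fleet_request("lt-PLACEHOLDER", 2))`.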
Jenkins CICD Pipelines
There are several reasons why we decided to use Jenkins instead of other modern CICD tools such as GitHub Actions or GitLab CI/CD.
- Jenkins has a large and active community of developers who contribute to the platform and provide support for users. This means that there is a vast collection of plugins, tutorials, and documentation available to help users with their CICD pipeline.
- Jenkins has been around for a long time and is considered a mature and reliable platform. It has been used in many organizations and has proven to be a robust, stable tool that can handle a wide variety of CICD pipeline use cases.
- Jenkins has a wide range of built-in features that make it easy to create and manage CICD pipelines. It has a web-based interface that allows users to easily create and configure their pipelines, and it also has a wide range of tools that can be used to automate various parts of the pipeline.
- Jenkins is highly customizable: users can create custom plugins and scripts to extend the functionality of the pipeline. This allows organizations to tailor their CICD pipeline to their specific needs, and it also makes it easier to integrate with other tools and systems.
- Jenkins is an open-source tool: it is free to use and easily integrates with a wide range of other open-source tools, such as Git, Maven, and Ansible. This makes it a cost-effective solution for organizations looking to implement a CICD pipeline.
Distributed Worker Setup
To split the Jenkins build process into separate worker nodes, we first identified the different stages of the build and deploy process. These stages included tasks such as compiling the code, running tests, creating the build artifact, and deploying the code to various environments.
Once we identified the different stages, we then created separate worker nodes for each stage. This allowed us to distribute the workload across multiple machines, which increased the overall efficiency of the pipeline.
For example, we created a worker node specifically for compiling the code and running tests. This node was equipped with the necessary tools and dependencies to perform these tasks, such as NPM, Ruby on Rails, and our testing frameworks.
Another worker node was set up specifically for creating the build artifact and packaging it for deployment. This node had the necessary build automation tools, such as Bundler and npm build scripts, along with the dependencies needed to create the artifact.
Finally, we set up separate worker nodes for handling the deployment process. These nodes were responsible for deploying the build artifact to various environments such as development, staging, and production, and they were equipped with the necessary tools and scripts to automate deployments, such as Ansible, Chef, or Terraform.
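To make the stage-to-node split concrete, here is a small Python sketch of the kind of mapping we keep between pipeline stages and worker-node labels. The stage and label names are illustrative, not our exact configuration.

```python
# Hypothetical mapping of pipeline stages to the Jenkins node labels
# that handle them; names are examples only.
STAGE_LABELS = {
    "compile": "worker-build",
    "test": "worker-build",
    "package": "worker-package",
    "deploy-dev": "worker-deploy",
    "deploy-staging": "worker-deploy",
    "deploy-prod": "worker-deploy",
}

def node_for_stage(stage):
    """Return the worker-node label responsible for a pipeline stage."""
    try:
        return STAGE_LABELS[stage]
    except KeyError:
        raise ValueError(f"unknown pipeline stage: {stage}")
```

In Jenkins itself this corresponds to pinning each pipeline stage to an agent label, so compile/test work never competes for resources with packaging or deployments.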
AWS Lambda Functions
To trigger the worker nodes on demand, we used AWS Lambda functions written in Python. The Lambda functions were used to invoke the worker nodes when a specific event occurred, such as a code commit or a manual trigger.
import boto3

# Placeholders: fill in your own AWS region and worker-node instance IDs
region = 'YOUR REGION'
instances = ['INSTANCE_ID']

ec2 = boto3.client('ec2', region_name=region)

def lambda_handler(event, context):
    # Start the stopped EC2 instance that hosts the Jenkins worker node
    print('Starting Worker Node 1')
    ec2.start_instances(InstanceIds=instances)
We used Python as the programming language for the Lambda functions, as it is a popular language that is well suited to AWS development. It allowed us to take advantage of the AWS SDK for Python (Boto3), which made it easy to interact with other AWS services such as EC2, SNS, and SQS.
By using AWS Lambda functions to trigger the worker nodes on demand, we automated the CICD pipeline and ensured it only ran when necessary. This reduced the overall cost of the pipeline and improved its efficiency, since workers were only running when there was actual work to be done. It also improved the pipeline's reliability by reducing the potential for failures caused by unnecessary runs.
AWS Cloudwatch
Once we had our Jenkins worker nodes set up and running in Docker containers on EC2 instances, we used AWS CloudWatch to monitor the worker nodes and shut off any that were idling. CloudWatch is a monitoring service provided by AWS that allows us to collect and track metrics, collect and monitor log files, and set alarms.
First, we configured CloudWatch to collect metrics from the worker nodes, such as CPU utilization, memory usage, and network traffic. This allowed us to see how the worker nodes were performing and identify any that were idling.
Next, we set up CloudWatch alarms to trigger when the metrics for a worker node indicated that it was idling. For example, if the CPU utilization for a worker node was below a certain threshold for a certain period of time, the alarm would trigger.
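As a sketch of what such an alarm looks like, the helper below builds the arguments for CloudWatch's `put_metric_alarm` call. The 5% CPU threshold and the 30-minute window (six 5-minute periods) are illustrative values, not our production settings.

```python
IDLE_CPU_THRESHOLD = 5.0  # percent; illustrative idle threshold
IDLE_PERIODS = 6          # six 5-minute periods = 30 minutes of idling

def idle_alarm_params(instance_id, action_arn):
    """Build put_metric_alarm arguments for an idle-CPU alarm on one
    worker node. instance_id and action_arn are caller-supplied."""
    return {
        "AlarmName": f"jenkins-worker-idle-{instance_id}",
        "Namespace": "AWS/EC2",
        "MetricName": "CPUUtilization",
        "Dimensions": [{"Name": "InstanceId", "Value": instance_id}],
        "Statistic": "Average",
        "Period": 300,  # seconds per evaluation period
        "EvaluationPeriods": IDLE_PERIODS,
        "Threshold": IDLE_CPU_THRESHOLD,
        "ComparisonOperator": "LessThanThreshold",
        "AlarmActions": [action_arn],
    }
```

The resulting dict would be passed to a boto3 CloudWatch client as `cloudwatch.put_metric_alarm(**idle_alarm_params(...))`, with the action ARN pointing at whatever handles the shutdown.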
Finally, we used CloudWatch events to trigger an AWS Lambda function that would shut down the idling worker nodes. The Lambda function would then use the EC2 API to stop the idling instances, which would shut down the worker node.
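The shutdown function mirrors the start-up function shown earlier. This is a sketch that assumes the instance IDs are configured in the function itself rather than parsed out of the alarm event payload.

```python
# Placeholders, in the same style as the start-up function
REGION = 'YOUR REGION'
INSTANCES = ['INSTANCE_ID']

def stop_request(instance_ids):
    """Build the stop_instances arguments; split out for easy testing."""
    return {'InstanceIds': list(instance_ids)}

def lambda_handler(event, context):
    # boto3 is imported inside the handler so this module can be loaded
    # and unit-tested without AWS credentials or the SDK installed
    import boto3
    ec2 = boto3.client('ec2', region_name=REGION)
    print('Stopping idle worker nodes')
    # stop (not terminate), so the nodes can be started again on demand
    ec2.stop_instances(**stop_request(INSTANCES))
```

Stopping rather than terminating is what lets the start-up Lambda bring the same pre-configured instance back for the next build.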
By using CloudWatch, we were able to automatically shut off worker nodes that were idling, which helped to reduce the overall cost of the pipeline. CloudWatch also improved the pipeline's reliability by ensuring that only the necessary worker nodes were running, reducing the potential for failures caused by unnecessary runs.
2023 Plans
If you got this far, thank you for taking a moment to read about something that, honestly, I still take great pride in seeing used in our day-to-day work environment. I want to use this next part to answer some lingering questions, especially for those deeper in the DevOps space who asked why we went with EC2 instead of going straight to ECS, given that everything already runs in Docker Containers.
We initially decided to use EC2 instances to deploy our Jenkins worker nodes instead of Amazon Elastic Container Service (ECS) for the following reasons:
- At the time of the initial foundation, we had more experience with EC2 and it was easier for us to set up and configure the worker nodes on EC2 instances. We were familiar with the EC2 interface, and it was straightforward for us to set up the necessary dependencies, tools, and configurations on the instances.
- EC2 instances provided more flexibility and control over the underlying infrastructure. We were able to customize the worker nodes to fit our specific needs, and we had more control over the resources, such as CPU and memory, allocated to each worker node.
- ECS was a relatively new service to our team and not yet as proven in our stack as EC2, and it may not have had all the necessary features and integrations that we needed for our pipeline.
- EC2 instances allowed us to start quickly without having to invest a lot of time in learning a new service. This allowed us to quickly set up and start using our Jenkins worker nodes, and it also allowed us to gradually learn and evaluate ECS and other services before making a decision to move to them.
But I wouldn't be a good DevOps engineer if I didn't say that we plan on moving away from this setup within the next year, migrating fully off of EC2 and over to ECS.
If you're interested in that, please consider giving this article a few claps and/or following me! I will be sure to report back on that technical challenge as the year progresses.
Here's to another great year of growth!