
How Cloud Cost Optimization Saved Us 7% on Our Cloud Bill

Seven years ago, in 2018, adjoe started as a small monetization SDK working with just 20 publishers. Today, we’ve grown to over 500 publishers, managing a daily active user base that has expanded exponentially.

Now, we handle half a billion requests daily while offering various in-app monetization solutions and our marketplace, adjoe Exchange. This scale demands a robust, cost-efficient infrastructure, making cloud cost optimization a top priority for 2024. Initially, our focus was on rapid deployment, leading us to build everything on AWS without prioritizing cost efficiency. However, as our operations scaled, so did our AWS costs—prompting us to take a closer look at optimization strategies.

Taking Stock of Our AWS Cloud Costs

Our first step toward cloud optimization was conducting a thorough analysis of our infrastructure. Understanding where our budget was going allowed us to identify inefficiencies and optimization opportunities. We held multiple discussions with tech teams and leveraged managed cost-analysis solutions to uncover areas where we were over-provisioning resources.

This analysis led us to focus our optimization strategy on four main areas:

  1. Observability – Implementing and utilizing monitoring tools like Kubecost and AWS Cost Explorer to track spending and usage patterns.
  2. Cleanup – Auditing and removing underutilized or unused resources.
  3. Shifting to Cost-Effective Solutions – Moving from managed services to open-source alternatives.
  4. Cost Awareness – Educating teams on the financial impact of their choices.

Our Key Cloud Cost Optimization Strategies

It might sound obvious, but you’d be surprised by how quickly unnecessary expenses can pile up! During our investigation, we uncovered several hidden costs—unused resources and legacy features that were silently inflating our budget. Together with our backend developers, we pinpointed the outdated resources and unused data and decided to get rid of them.

Here’s how we approached it:

Storage & Data Management

  • Optimizing S3 Bucket Usage: We noticed some S3 buckets were keeping unnecessary revision histories, leading to inflated costs. By implementing lifecycle policies to delete files older than 30 days (based on retention needs), we streamlined our data management and significantly reduced storage costs. We also moved bucket definitions into our IaC tool (Terraform) to ensure that manually created buckets can’t appear and then be forgotten in the future. (A sketch of the lifecycle setup follows after this list.)
  • Implementing Retention Policies for CloudWatch Logs:
    Previously, our CloudWatch log groups had no retention policy, which resulted in large storage volumes. We added retention policies to remove older logs, optimizing storage costs while ensuring key logs were still retained in Openobserve and Elasticsearch (also covered in the sketch after this list).
  • Moved RDS Logs from CloudWatch to S3:
    First, we pulled a breakdown of which API calls contributed most to our CloudWatch costs. In our case it was PutLogEvents, which is the call that ingests logs into CloudWatch log groups. From there, it is useful to break IncomingBytes down per log group over a long period.
    In CloudWatch Metrics, we added a math expression:

SLICE(SORT(REMOVE_EMPTY(SEARCH('{AWS/Logs,LogGroupName} MetricName="IncomingBytes"', 'Sum', 2592000)), SUM, DESC), 0, 10)

We set the period to 30 days and the time range from the 1st of July to the 30th of September. The following log groups showed the largest increase over those three months:

  1. RDS audit logs
  2. EKS cluster events
  3. DynamoDB to Firehose

Here we could optimize by, for example, disabling the auto-uploading of RDS logs to CloudWatch and using a cron job to transfer the logs to S3 instead (a sketch of both steps follows below).
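As an illustration, here is a minimal boto3 sketch of both steps: ranking log groups by IncomingBytes with the same math expression, and exporting a day’s worth of a noisy log group to S3 from a scheduled job. This is not our exact production job; the log group name, bucket, and time window are hypothetical placeholders.

from datetime import datetime, timedelta, timezone
import boto3

cloudwatch = boto3.client("cloudwatch")
logs = boto3.client("logs")

end = datetime.now(timezone.utc)
start = end - timedelta(days=90)

# Same math expression as above: IncomingBytes summed per log group,
# sorted by total descending (the SLICE is done client-side here).
expression = (
    "SORT(REMOVE_EMPTY(SEARCH('{AWS/Logs,LogGroupName} "
    "MetricName=\"IncomingBytes\"', 'Sum', 2592000)), SUM, DESC)"
)
resp = cloudwatch.get_metric_data(
    MetricDataQueries=[{"Id": "ingested", "Expression": expression}],
    StartTime=start,
    EndTime=end,
)
for series in resp["MetricDataResults"][:10]:  # top 10 log groups
    print(series["Label"], sum(series["Values"]))

# Run daily (e.g. from cron) to move yesterday's logs to S3.
now_ms = int(end.timestamp() * 1000)
logs.create_export_task(
    taskName="rds-audit-to-s3",
    logGroupName="/aws/rds/instance/mydb/audit",  # hypothetical log group
    fromTime=now_ms - 24 * 60 * 60 * 1000,
    to=now_ms,
    destination="my-log-archive-bucket",  # bucket must allow CloudWatch Logs exports
    destinationPrefix="rds-audit",
)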
  • Athena Query Optimization: Reduced query runtime and scan costs by partitioning data, limiting the data scanned per query, and optimizing the queries themselves.
  • DynamoDB TTL: To add TTL to our tables, we had two options: update the existing items to include a TTL attribute, or create an entirely new table and migrate to it. We went with the latter because in our case it was more cost-effective; with the first option we would have incurred much higher write costs, as some of our tables held terabytes of data. We created a new table, had the backend write to the new table and read from both, and once we passed the TTL period, we dropped the old one. The graph below shows the reduction in storage cost: each bar shows the decrease when a table was deleted. Table 1 was removed in November, table 2 in August, and table 3 in mid-May, which is why it shows half the cost compared to the previous month. The interesting point is how big a table can get if it has no TTL.
    This saved us approximately 70% of the cost on the tables that initially had no TTL set on them (the sketch after this list shows how TTL is enabled).
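To make the storage items above concrete, here is a minimal boto3 sketch of the three retention knobs we have been describing: an S3 lifecycle rule, a CloudWatch Logs retention policy, and DynamoDB TTL on the new table. The bucket, log group, and table names are hypothetical placeholders, and the windows (30 days, 14 days) stand in for whatever each dataset’s retention needs actually are.

import boto3

s3 = boto3.client("s3")
logs = boto3.client("logs")
dynamodb = boto3.client("dynamodb")

# 1) S3 lifecycle: expire objects (and old versions) after 30 days.
s3.put_bucket_lifecycle_configuration(
    Bucket="my-bucket",  # placeholder
    LifecycleConfiguration={
        "Rules": [{
            "ID": "expire-after-30-days",
            "Status": "Enabled",
            "Filter": {"Prefix": ""},  # applies to the whole bucket
            "Expiration": {"Days": 30},
            "NoncurrentVersionExpiration": {"NoncurrentDays": 30},
        }]
    },
)

# 2) CloudWatch Logs: keep only the last 14 days in a log group.
logs.put_retention_policy(logGroupName="/my/app/logs", retentionInDays=14)

# 3) DynamoDB: enable TTL on the *new* table; items whose "expires_at"
#    (epoch seconds) has passed are deleted at no write cost.
dynamodb.update_time_to_live(
    TableName="events-v2",  # placeholder
    TimeToLiveSpecification={"Enabled": True, "AttributeName": "expires_at"},
)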

Infrastructure Consolidation

  • Streamlined Test Environments: Our test environments had multiple load balancers that weren’t necessary. We consolidated them into a single load balancer, resulting in a cleaner, more cost-efficient infrastructure.
  • Switched to Karpenter for Spot Instances: Replaced a managed Kubernetes spot-instance provisioning solution with Karpenter, saving around 40% of what we had been paying the provider.
  • Migrated Metrics from CloudWatch to Prometheus: Lowered monitoring expenses by leveraging Prometheus for metrics tracking.
  • Migrated from Elasticsearch to Openobserve: Reduced Elasticsearch usage by also pushing logs to Openobserve, which stores them as Parquet files in S3, allowing them to be queried via Athena (see the sketch below).
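Both the Athena optimization above and querying Openobserve’s Parquet output come down to the same principle: scan as little data as possible. Here is a hedged sketch of what such a partition-pruned query might look like, assuming a hypothetical logs_db.app_logs table partitioned by a dt column:

import boto3

athena = boto3.client("athena")

# Restricting the partition column (dt) and selecting only needed columns
# keeps the scanned bytes, and therefore the Athena bill, small.
query = """
SELECT level, message, ts
FROM logs_db.app_logs                               -- hypothetical Parquet table
WHERE dt BETWEEN '2024-09-01' AND '2024-09-07'      -- partition pruning
  AND level = 'ERROR'
LIMIT 100
"""

athena.start_query_execution(
    QueryString=query,
    QueryExecutionContext={"Database": "logs_db"},
    ResultConfiguration={"OutputLocation": "s3://my-athena-results/"},  # placeholder
)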

Reducing Unnecessary Expenses

In cloud infrastructures, inefficiencies often hide in plain sight, whether it’s underutilized resources or outdated configurations. These small issues can snowball, impacting performance and driving up costs. 

  • Eliminated Costly Services with Open Source Solutions: Replaced AWS EMR with self-hosted JupyterHub and Spark on Kubernetes.
  • Contributing to Openobserve for Database Optimization:
    Through our contributions to Openobserve, we optimized database connection usage, allowing us to scale down to smaller, more cost-effective database instances without compromising performance.
  • Optimized CloudFront Usage: We optimized the files in the S3 origin bucket so that less download traffic flows through the CloudFront distribution, which in turn saves cost.
  • Switched to Kafka for Messaging: Migrated from AWS SQS, SNS, Kinesis, and Firehose to Kafka, lowering messaging infrastructure expenses.
  • Automated Test Environment Shutdowns: Implemented scheduled shutdowns for ECS/EKS clusters and CI/CD runners outside working hours (a sketch follows below).
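As a sketch of the scheduled-shutdown idea (not our exact setup): a small Lambda, triggered by two EventBridge cron rules, that scales every service in a test ECS cluster down in the evening and back up in the morning. The cluster name and task counts are placeholders.

import boto3

ecs = boto3.client("ecs")

TEST_CLUSTER = "test-cluster"  # placeholder
WORKDAY_DESIRED = 2            # daytime task count per service

def handler(event, _context):
    # Each EventBridge rule passes {"action": "stop"} or {"action": "start"},
    # e.g. "stop" at 20:00 and "start" at 07:00 on weekdays.
    desired = 0 if event.get("action") == "stop" else WORKDAY_DESIRED
    paginator = ecs.get_paginator("list_services")
    for page in paginator.paginate(cluster=TEST_CLUSTER):
        for service_arn in page["serviceArns"]:
            ecs.update_service(
                cluster=TEST_CLUSTER,
                service=service_arn,
                desiredCount=desired,
            )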

Creating a Culture of Cost Awareness

Cost optimization isn’t just about cutting expenses—it requires an ongoing commitment to efficiency. We worked to ensure that all stakeholders understood the financial implications of their decisions:

  • Collaboration with Teams: We collaborated with our developers and data scientists to raise awareness about cost optimization strategies. Our discussions focused on identifying factors that drive up costs and guiding them on how to optimize workloads efficiently—ensuring cost-effectiveness without compromising performance.
  • Enhanced Cost Observability: Using Kubecost and AWS Cost Explorer to improve tracking.
  • Proactive Cost Alerts: Set up alerts to notify teams of unexpected cost spikes (one possible wiring is sketched below).
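One lightweight way to wire up such alerts, sketched here with AWS Budgets via boto3; the budget amount, threshold, and email address are placeholders, not our actual configuration.

import boto3

budgets = boto3.client("budgets")
account_id = boto3.client("sts").get_caller_identity()["Account"]

budgets.create_budget(
    AccountId=account_id,
    Budget={
        "BudgetName": "monthly-cost-guardrail",  # placeholder name
        "BudgetLimit": {"Amount": "10000", "Unit": "USD"},
        "TimeUnit": "MONTHLY",
        "BudgetType": "COST",
    },
    NotificationsWithSubscribers=[{
        "Notification": {
            "NotificationType": "ACTUAL",
            "ComparisonOperator": "GREATER_THAN",
            "Threshold": 80.0,  # alert at 80% of the budget
            "ThresholdType": "PERCENTAGE",
        },
        "Subscribers": [
            {"SubscriptionType": "EMAIL", "Address": "team@example.com"},  # placeholder
        ],
    }],
)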

The Impact on Cloud Spend and Beyond

After a few months of optimization, we saw tangible results:

  • Reduction in AWS expenses by 7% while maintaining performance and scalability.  
  • Stronger cross-team collaboration, improving overall infrastructure efficiency.
  • A sustainable cost-conscious culture, ensuring long-term savings.

Final Thoughts

When it came down to cloud spend optimization, these improvements didn’t happen overnight.
They were made bit by bit, over time, and we are still at it: cost optimization is an ongoing journey that keeps resources used efficiently and costs in check. By identifying inefficiencies in the system and implementing the right strategies, such as lifecycle policies, infrastructure consolidation, and workload-specific tooling, we successfully reduced cloud costs without compromising performance.

Our cloud infrastructure remains just as reliable and efficient as before—only now, it operates with a leaner, more cost-effective architecture. Optimization isn’t just about cutting costs. It’s about achieving the perfect balance between efficiency and performance. While cost savings are important, they should never come at the expense of system reliability or performance.

We are currently migrating from ECS to EKS, which, along with further optimizations, will reduce our cloud expenditure even more; this journey never ends.

And if you want to learn more about how we manage backend costs, read our previous article detailing our migration to a self-hosted Kafka setup — a new pub/sub mechanism that helped us address increasing AWS costs.
