DevOps Engineer and SRE with 8+ years in cloud infrastructure and
distributed systems. Reduced system downtime by 75%, cut cloud costs by
58%, and managed mission-critical platforms serving 70M+ users. Hands-on
with Kubernetes (EKS), Terraform, Ansible, and Python across hybrid
cloud/on-prem environments. Use AI-assisted engineering workflows across
debugging, release support, RCA/postmortems, and technical documentation
while retaining final technical ownership.
Core Competencies
AI-Assisted Engineering: OpenAI Codex, custom
skills, and runbook-driven workflows for debugging, PR review, release
support, RCA/postmortems, and technical documentation
DevOps Engineer, Core Infra Team
Jan 2025 - Present
Support 100 developers by managing hybrid infrastructure (150
servers, 19 MySQL clusters, OpenSearch, Elasticsearch) with Ansible,
Puppet, and Terraform
Resolved 2 P1 incidents within 30-35 min (beating 1-hour SLA) by
owning 24/7 on-call rotation and following established post-mortem, RCA,
and runbook processes
Resolved a critical race condition in 1 day, after it had remained
unresolved for 6 months, by using OpenAI Codex to accelerate codebase
exploration, hypothesis testing, and fix validation while retaining
final technical ownership
Reduced debugging time for known alerts by 60% by creating 23
internal Codex skills from operational runbooks, standardizing
investigation paths for recurring incidents
Site Reliability Engineer, Error Detection Team
Sep 2022 - Dec 2024
Reduced compute costs by 58% by building a custom Golang Kubernetes
operator using CloudWatch metrics and HPA, later migrating to KEDA, and
right-sizing instances (t3.2xlarge to t3a.medium); documented via
RFC/ADR
Reduced memory usage by 97% (32GB to 1GB) by adding file system
caching and Bloom filters to Python report job
Reduced job latency by 99% (1s to 5ms) by eliminating N+1 queries
(1,000 to 1) and tuning log verbosity
Increased pipeline throughput by 100x (50 to 5,000 events/min) by
implementing Python-based message group bucketing for SQS FIFO
Reduced provisioning time from 2 days to 5 seconds by automating
cross-team account onboarding using Python, AWS SDK, and AWS CLI
Saved 55 engineering hours/week by revamping CI/CD with tag-based
deployments for parallel test environments
Achieved ISO 27001 compliance by introducing Bottlerocket nodes,
configuring AWS WAF, and contributing to security documentation and
audit evidence
Delivered real-time threat monitoring by building VPC Flow Logs
pipeline (Kinesis to S3) and deploying Wazuh for machine-level security
logs
Reduced new engineer ramp-up time from 5 to 3 months by authoring
technical documentation (RFCs, ADRs, C4 diagrams)
X-Team International Pty Ltd
Australia (Remote)
DevOps Engineer, Internal Platform Team
Jul 2021 - Jul 2022
Achieved 99.9% uptime by optimizing AWS infrastructure (EKS, RDS,
CodePipeline) for customer-facing applications
Reduced deployment risks by developing automated infrastructure
testing framework using Golang and Terragrunt
Prevented service disruptions by establishing proactive alerting in
Prometheus and Grafana with custom thresholds
Reduced mean time to resolution by streamlining incident response
using Kanban methodology
bKash Limited
Dhaka, Bangladesh
Senior Cloud Engineer
Aug 2020 - May 2021
Served 70M+ customers by managing 2 mission-critical EKS clusters
for Bangladesh's largest mobile financial service
Reduced system downtime by 75% by implementing CloudWatch monitoring
with 56+ custom metric alarms
Reduced provisioning time from days to minutes by building Terraform
module library with semantic versioning for 5 teams
Achieved zero-downtime updates and 30% cost savings by introducing
blue-green deployments and auto-scaling policies
Achieved RPO under 5 minutes by implementing automated disaster
recovery protocols for financial data systems
Brain Station 23 Limited
Dhaka, Bangladesh
Senior DevOps Engineer, Contracted to Grameenphone
Feb 2020 - Aug 2020
Modernized telecom applications by deploying Kubernetes with Istio
service mesh in Grameenphone's private data center
Achieved 99% uptime by deploying custom load balancer with HAProxy
and Keepalived for automatic failover
Enabled rapid incident diagnosis across on-premises Kubernetes
cluster by centralizing logs using Fluentd, ELK Stack, and Zabbix
ShareTrip
Dhaka, Bangladesh
Senior Software Engineer
Sep 2018 - Feb 2020
Integrated 4 payment providers (EBL, CBL, Nagad, bKash) in a
fast-paced Series A startup by leading 5 engineers to build distributed
gateway using Django, Golang, and RabbitMQ
Achieved reliable message delivery at scale by building event-driven
email microservice using Python, Golang, Redis, and RabbitMQ
Reduced deployment time by 67% (15 min to 5 min) by implementing
Jenkins CI/CD pipelines for 5 microservices on AWS EKS
Enabled third-party integrations by developing public APIs for Hotel
search/booking and Package/Tour services
Infosapex Ltd. (Expo Group)
Dhaka, Bangladesh
Senior Software Engineer
Mar 2016 - Aug 2018
Delivered B2B banking solutions for 4 enterprise clients by leading
12 engineers in an early-stage fintech startup and conducting
client-facing requirement workshops
Protected financial systems from fraud by developing blacklist
manager for banks
Delivered B2B workflow automation platform using BPMN.js, Python,
and MySQL, enabling enterprise clients to digitize approval
processes
Delivered high-availability on-premises infrastructure for 4
enterprise clients (NCC Bank, Eastern Bank, Unilever, Expo Group) using
Nginx load balancing and disaster recovery
Open Source Contributions
Contribute to Zed Editor using Cursor to accelerate codebase
exploration, issue investigation, pull request preparation, and
documentation in a public open-source codebase
Education
Bachelor of Science in Computer Science and Engineering (BSc CSE)
Expected 2026