Accepting Applications
Full-time
On-site
LinkedIn
Posted 3 days, 23 hours ago
Job Description
🚀 We’re Hiring | Senior SRE / DevOps Lead | Avrioc | UAE 🇦🇪
We’re looking for a seasoned DevOps & Site Reliability Engineering (SRE) Lead to design, scale, and elevate our cloud infrastructure and observability ecosystem.
If you’re passionate about automation, system resilience, and building highly reliable platforms, this role is for you.
🔧 Key Responsibilities:
- Architect and deploy scalable, highly available cloud infrastructure
- Lead SRE best practices to ensure reliability, performance, and scalability
- Optimize CI/CD pipelines (Jenkins, Argo CD, or similar) for seamless deployments
- Define and track SLOs & SLIs to maintain uptime and service health
- Build robust observability frameworks (Elastic Stack, Prometheus, Grafana, Dynatrace, New Relic)
- Manage Kubernetes clusters and Helm charts for efficient orchestration
- Implement auto-healing systems and proactive monitoring
- Drive chaos engineering and resilience testing (Chaos Mesh, Litmus, AWS FIS)
- Collaborate with engineering and product teams to embed reliability into development
- Maintain clear infrastructure and incident documentation
🧩 What We’re Looking For:
- 8+ years of experience in DevOps/SRE, including leadership in enterprise environments
- Hands-on experience with AWS, GCP, or Azure
- Strong expertise in Infrastructure as Code (Terraform, CloudFormation, Ansible)
- Proven experience in CI/CD, monitoring, and incident response
- Deep knowledge of observability tools and practices
- Strong Kubernetes and Helm experience at scale
- Experience with databases such as MySQL and Cassandra
- Proficiency in Python, Bash, or Go
- Experience in BCP/DR planning and capacity management
- Strong communication, troubleshooting, and documentation skills
🛡️ Information Security:
- Adhere to information security and service management policies
- Ensure confidentiality and integrity of data
- Participate in security training and report incidents as required