Site Reliability Engineer

Avrioc Technologies

United Arab Emirates

Accepting Applications | Full-time | On-site
Posted 1 week, 3 days ago
Job Description
⚙️ **HIRING: Senior SRE / DevOps Lead | Avrioc | UAE** 🇦🇪

We’re looking for a **Seasoned DevOps & Site Reliability Engineer (SRE) Lead** to design, scale, and enhance our cloud infrastructure and observability ecosystem. If you’re passionate about automation, resilience, and reliability, this role is for you!

🔧 **Key Responsibilities**

* Architect and deploy **scalable, highly available cloud infrastructure** for production workloads.
* Lead and implement **SRE best practices**, ensuring system reliability, performance, and scalability.
* Oversee and optimize **CI/CD pipelines** (Jenkins, Argo CD or similar) for seamless deployments.
* Define and monitor **SLOs & SLIs** to ensure service reliability and uptime.
* Design and manage **observability frameworks**: monitoring, logging, and alerting (Elastic Stack, Prometheus, Grafana, Dynatrace, New Relic).
* Manage and optimize **Kubernetes clusters** and **Helm charts** for efficient orchestration and streamlined releases.
* Implement **auto-healing and proactive monitoring** systems to prevent outages.
* Drive **fault injection testing & chaos engineering** (Chaos Mesh, Litmus, AWS FIS) for resilience validation.
* Collaborate with engineering and product teams to embed reliability into every phase of development.
* Maintain clear documentation on infrastructure, incidents, and operational processes.

🧩 **Requirements**

* 8+ years of experience as a **DevOps/SRE professional**, leading enterprise SRE implementations.
* Hands-on with **AWS, GCP, or Azure** (EC2, S3, RDS, Lambda, etc.).
* Strong with **IaC tools** (Terraform, CloudFormation, Ansible).
* Proven experience in **CI/CD automation**, **monitoring**, and **incident response**.
* Skilled in **observability tools**: Elastic Stack, Grafana, Prometheus, Dynatrace, New Relic.
* Strong **Kubernetes & Helm** expertise for large-scale deployments.
* Experience with **AWS managed & self-managed databases** (MySQL, Cassandra, etc.).
* Skilled in **Python, Bash, or Go** scripting.
* Experience designing and testing **BCP/DR strategies**.
* Proactive in **capacity planning**, ensuring scalability and resilience across cloud environments.
* Excellent communication, documentation, and troubleshooting skills.

🛡️ **Information Security Responsibilities**

* Comply with **Avrioc’s Information Security & Service Management** policies.
* Maintain the **confidentiality and integrity** of all information assets.
* Attend mandatory information security trainings.
* Report any security incidents through official channels.