AWS SRE

Marks Sattin

United Kingdom

Accepting Applications, Full-time, On-site
Job Description
Overview

We’re hiring an experienced AWS SRE Engineer to lead observability for a cloud platform. The role focuses on building and maintaining actionable Grafana dashboards, defining and measuring reliability (SLIs/SLOs/SLAs), owning alerting strategy, and driving improvements to platform resilience. This is an opportunity to shape operational excellence and influence engineering decisions across the stack.

What you’ll do (key responsibilities)

* Design, build and maintain Grafana dashboards that deliver actionable insights into performance, availability and capacity.
* Implement and improve observability for AWS-hosted applications and infrastructure (metrics, logs, traces).
* Define and track SLIs, SLOs and SLAs; manage error budgets and translate reliability targets into engineering priorities.
* Monitor using golden signals and operate an effective, noise-aware alerting strategy.
* Support incident response, run RCA processes and drive continuous reliability improvements.
* Embed observability into CI/CD and cloud operations; collaborate with platform, engineering and ops teams to improve operational efficiency.

Must-have skills and experience

* 6+ years in SRE, Cloud Reliability or Cloud Operations roles.
* Strong, hands-on AWS experience.
* Proven expertise building Grafana dashboards and working in observability/monitoring stacks.
* Solid understanding of SRE fundamentals (SLA, SLO, SLI, error budgets, golden signals).
* Track record troubleshooting production systems and improving platform reliability.
* Strong communicator and team collaborator.

Nice-to-have

* Experience with Snowflake or Databricks.
* Familiarity with IaC, automation and cloud-native operational tooling.