Job Description
**Senior Site Reliability Engineer (SRE) – Healthcare Infrastructure**
* **Location:** SMCHS, Block B, Karachi
* **Job Type:** Full\-Time (Onsite)
* **Timings:** 8:00 PM – 5:00 AM (US Shift)
* **Company:** SMB Services *(Hiring for a US\-based Healthcare Technology Client)*
**Role Overview**
We are looking for an experienced Senior Site Reliability Engineer (SRE) to take full ownership of cloud infrastructure for a US\-based healthcare platform.
The system processes real\-time pharmacy claims for patients requiring critical and life\-saving medications. As a result, system reliability, performance, and security are mission\-critical.
This is a high\-impact, ownership\-driven role focused on building scalable, secure, and highly reliable infrastructure while improving deployment speed and operational efficiency.
**The Environment**
**You will be working on a production system that includes:**
* Rails 8 backend with React 18 frontends (deployed on AWS \& Vercel)
* Real\-time claims processing with zero\-downtime expectations
* HIPAA\-compliant systems requiring strict security, auditing, and access control
* Increasing transaction volumes requiring scalable infrastructure
* CI/CD pipelines (GitHub Actions) with room for optimization
* Monitoring stack including New Relic, Sentry, and Datadog
**Key Responsibilities**
**Infrastructure Ownership**
* Design, manage, and scale AWS infrastructure (EC2, RDS, S3, VPC, IAM, networking)
* Own system reliability, availability, and performance
**Infrastructure as Code (IaC)**
* Build and maintain infrastructure using Terraform, CloudFormation, or similar tools
* Ensure infrastructure is version\-controlled, reproducible, and review\-driven
**CI/CD Optimization**
* Improve and redesign CI/CD pipelines (GitHub Actions)
* Reduce deployment time while ensuring safe and reliable releases
**Observability \& Monitoring**
* Implement robust logging, monitoring, and alerting systems
* Improve instrumentation to proactively detect and resolve issues
**Production Support \& Debugging**
* Troubleshoot production issues across infrastructure and application layers
* Optimize database and system performance where required
**Security \& Compliance**
* Ensure infrastructure meets HIPAA compliance standards
* Implement encryption, access controls, audit logging, and disaster recovery
**Success Metrics**
**Within 6 Months**
* Deployment time significantly reduced
* Issues identified proactively through monitoring
* Infrastructure fully managed via Infrastructure as Code
* Improved staging validation to prevent production issues
* Runbooks created for key operational processes
**Within 12 Months**
* Auto\-scaling infrastructure handling traffic spikes efficiently
* Disaster recovery processes tested and validated
* Faster and more reliable CI/CD pipelines
* Optimized infrastructure costs without compromising performance
* Scalable foundation built to support future growth
**Required Experience \& Skills**
**Core Experience**
* 5\+ years in SRE, DevOps, or Infrastructure Engineering
* Strong expertise in AWS (EC2, RDS, S3, VPC, IAM, CloudWatch)
* Experience with Infrastructure as Code (Terraform / CloudFormation)
* Proven experience designing and optimizing CI/CD pipelines
* Strong understanding of observability (metrics, logs, traces)
**Technical Skills**
* Linux system administration and debugging
* Docker and containerization (ECS/EKS preferred)
* MySQL or PostgreSQL performance tuning
* Ability to read and write code (Ruby, Python, or similar)
* Experience with monitoring tools (Datadog, New Relic, Prometheus, Grafana)
**Security \& Compliance**
* Strong understanding of IAM, encryption, and network security
* Experience with HIPAA, SOC 2, or similar compliance frameworks is a plus
**Preferred (Nice to Have)**
* Experience in healthcare, fintech, or regulated environments
* High\-throughput or real\-time systems experience
* Event\-driven or streaming architectures
* Elasticsearch operations
* Background job systems (e.g., Sidekiq)
* Incident management and post\-mortem analysis
* Infrastructure migration or modernization projects
**Why Join This Role**
* Full ownership of infrastructure and technical decisions
* Opportunity to redesign and improve critical systems
* Direct impact on healthcare technology and patient experience
* Exposure to a broad and modern tech stack
* Focus on reliability, security, and engineering excellence
**Tech Stack**
**Application**
* Rails 8, React 18
* MySQL 8\.0, Elasticsearch
* Solid Queue, EventMachine
**Infrastructure**
* AWS (EC2, RDS, S3, VPC)
* Docker
* Vercel
* GitHub Actions
* New Relic, Sentry, Datadog