DevOps Engineer

ThoughtStorm

Canada

Accepting Applications Full-time On-site
Posted 5 hours, 34 minutes ago
Job Description
**Must Haves:**

**3+ years' experience:**

* Design, build, and maintain CI/CD pipelines to enable fast, reliable, and repeatable software delivery across development, staging, and production environments.
* Develop and maintain cloud infrastructure using IaC tools (e.g., Terraform, Bicep, or ARM templates) to ensure consistent, repeatable provisioning across environments.
* Provision, configure, and manage Databricks, Azure AI, and ML infrastructure, including Azure Machine Learning workspaces, Azure OpenAI Service, Cognitive Services, and AI Foundry resources.
* Implement and maintain monitoring, alerting, and logging solutions to ensure full observability across applications and infrastructure (e.g., Elastic).

Province/State: Ontario
City: Toronto
Country: Canada
Office Location: 20 Bay St
Assignment Type: Onsite
Job Title: DevOps/Cloud Engineer - Intermediate

Description

SECTION 1: ACCOUNTABILITY STATEMENT

We are looking for an Intermediate DevOps Engineer to play a key role in designing, building, and maintaining our cloud infrastructure and delivery pipelines on Azure. This role supports architecture and data engineering teams in automating deployments, managing cloud resources, deploying data platforms and AI/ML workloads, and ensuring reliability, security, and scalability across all environments. The Intermediate DevOps Engineer works closely with Architecture, Data Engineering, AI, Security, and Technology & Infrastructure teams to deliver efficient, repeatable, and well-governed infrastructure practices.

SECTION 2: KEY CONTRIBUTIONS

**CI/CD Pipeline Development & Automation**

* Design, build, and maintain CI/CD pipelines to enable fast, reliable, and repeatable software delivery across development, staging, and production environments.
* Collaborate with development teams to integrate automated testing, code quality gates, and deployment approvals into pipeline workflows.
* Identify bottlenecks in the delivery process and implement automation solutions to reduce manual effort and increase deployment frequency.
* Maintain pipeline-as-code standards and ensure version-controlled, auditable pipeline configurations across the organization.

**Infrastructure as Code (IaC)**

* Develop and maintain cloud infrastructure using IaC tools (e.g., Terraform, Bicep, or ARM templates) to ensure consistent, repeatable provisioning across environments.
* Enforce infrastructure standards and best practices through code reviews, modular templating, and reusable component libraries.
* Collaborate with architecture and data engineering teams to translate infrastructure requirements into well-structured, scalable IaC implementations.
* Lead remediation of infrastructure drift and maintain alignment between declared and actual cloud state.

**AI Infrastructure & Azure AI Resource Deployment**

* Provision, configure, and manage Databricks, Azure AI, and ML infrastructure, including Azure Machine Learning workspaces, Azure OpenAI Service, Cognitive Services, and AI Foundry resources.
* Collaborate with data science and engineering teams to build and maintain CI/CD pipelines for model training, evaluation, and deployment workflows.
* Implement governance and access controls for AI resource consumption, including quota management, endpoint security, and cost tagging specific to AI workloads.
* Ensure AI infrastructure is deployed using IaC principles and integrated with broader platform standards for observability, networking, and compliance.
* Stay current with Azure AI platform updates and evaluate new services to support evolving business requirements around AI/ML delivery.

**Monitoring & Observability**

* Implement and maintain monitoring, alerting, and logging solutions to ensure full observability across applications and infrastructure (e.g., Elastic).
* Define SLIs, SLOs, and alerting thresholds in collaboration with development and operations teams.
* Lead incident response efforts by leveraging observability tooling to triage, diagnose, and resolve issues efficiently.
* Continuously improve dashboards, runbooks, and on-call processes to reduce Mean Time to Resolution (MTTR).

SECTION 3: REQUIREMENTS

**Education:** Completion of a diploma in Computer Science, Information Technology, or a related discipline, or a combination of education, training, and experience deemed equivalent.

**Experience:** Minimum 3 years' experience in a DevOps, Infrastructure, or Platform Engineering role. Hands-on experience with Azure cloud services, CI/CD tooling, containerization, IaC practices, and deploying AI/ML resources on Azure. Demonstrated ability to work in Agile delivery environments and collaborate across cross-functional teams.

**Certifications or Designations:** Microsoft Certified: Azure Administrator Associate, Azure DevOps Engineer Expert, or equivalent certification preferred. Familiarity with FinOps or cloud cost practices is an asset.

SECTION 4: COMPETENCIES & SPECIALIZED SKILLS

Customer Service · Teamwork · Initiative · Quality & Safety · Engaged in the Business

**Technical Competencies**

* Strong working knowledge of Azure services including Databricks, Azure DevOps, Azure Monitor, Azure Machine Learning, Azure OpenAI Service, Key Vault, and networking fundamentals.
* Hands-on experience with CI/CD platforms such as Azure DevOps Pipelines or GitHub Actions, with a focus on automation, gating, and deployment integrity.
* Proficiency in IaC tools such as Terraform, Bicep, or ARM templates; strong version control discipline using Git-based workflows.
* Demonstrated ability to provision and operate Azure AI/ML resources including workspaces, compute clusters, endpoints, and Cognitive Services APIs.
* Experience with monitoring and observability platforms such as Azure Monitor, Log Analytics, Prometheus, or Grafana; ability to define meaningful SLOs and alerts.