Machine Learning Engineer

Rivago Infotech Inc

Canada

Accepting Applications Full-time Hybrid
Job Description
**MLOps Engineer**

**Location: Toronto (Hybrid)**

**Duration: Long term Project**

**Role Overview:**

We are seeking a Machine Learning Developer to design, build, and deploy ML solutions that turn data into measurable business impact. This is a hands-on engineering role focused on developing end-to-end ML pipelines (data preparation, feature engineering, model training, evaluation, and production deployment) using Python and an open-source AI/ML stack. You will collaborate with data engineering and platform teams and work in environments that may include Databricks and Spark for scalable data processing and model operations.

**Key Objectives:**

* Deliver production-grade ML models and data products from discovery through deployment.
* Build repeatable, maintainable ML engineering patterns for training, evaluation, and inference.
* Improve model quality, reliability, and performance through robust testing, monitoring, and iteration.
* Partner with data and platform teams to leverage scalable compute and data platforms (including Databricks/Spark) while meeting security and governance requirements.

**Primary Responsibilities:**

* Design, develop, and iterate on machine learning models for classification, regression, clustering, recommendation, forecasting, and/or NLP use cases as needed.
* Build end-to-end ML pipelines in Python: data ingestion and preparation, feature engineering, training, evaluation, and batch/real-time inference.
* Apply sound experimentation practices: baselines, ablation studies, cross-validation (as applicable), and clear success metrics aligned to business outcomes.
* Develop and maintain reusable ML code (packages, utilities, pipelines) with strong software engineering practices (tests, code review, documentation, CI/CD).
* Implement model evaluation and testing: offline benchmarks, data/label quality checks, reproducible training runs, and regression tests to prevent performance degradation.
* Operationalize MLOps: model versioning, experiment tracking, model registry, automated deployments, and monitoring for drift, bias, latency, and cost.
* Integrate ML services with product systems via APIs and event-driven patterns; collaborate on feature stores, data contracts, and production SLAs.
* Leverage open-source AI/ML components (e.g., scikit-learn, PyTorch/TensorFlow, XGBoost/LightGBM, Hugging Face ecosystem) and choose the right tool for accuracy, latency, and maintainability.
* Collaborate with data engineering and platform teams to use Databricks/Spark for large-scale ETL, feature computation, distributed training (where relevant), and scheduled jobs.
* Ensure solutions follow security, privacy, and responsible AI practices, including safe handling of sensitive data and auditability of model decisions.

**Required Skills & Experience:**

* Strong software engineering experience in Python (clean architecture, API design, testing, packaging, performance tuning).
* Hands-on experience building and deploying machine learning models in production environments.
* Proficiency with common ML libraries and frameworks (e.g., scikit-learn, PyTorch or TensorFlow; XGBoost/LightGBM as applicable).
* Experience with data processing in Python (e.g., pandas, NumPy) and strong SQL fundamentals.
* Understanding of ML concepts (bias/variance, regularization, feature leakage, evaluation metrics, calibration) and ability to select appropriate metrics for the use case.
* Experience with MLOps practices and tooling (e.g., MLflow or equivalent), including experiment tracking, model versioning, and reproducible training.
* Experience deploying services (Docker, CI/CD) and operating them with monitoring/observability practices.
* Ability to communicate tradeoffs clearly, balancing accuracy, latency, cost, reliability, and risk.

**Preferred / Nice to Have:**

* Awareness of Databricks concepts (workspaces, notebooks, jobs, clusters) and practical experience with Spark for large-scale data processing.
* Experience with Databricks MLflow Model Registry and/or Unity Catalog (or similar governance) for managing models, features, and controlled data access.
* Experience with feature stores, data versioning, and data quality frameworks.
* Experience with model serving and optimization (e.g., FastAPI, TorchServe, ONNX, quantization, batching, caching).
* Familiarity with modern open-source LLM and embeddings ecosystem (e.g., Hugging Face Transformers, sentence-transformers) and applying them to NLP tasks when relevant.
* Experience with cloud ML services and distributed training patterns (Ray, Spark ML, Horovod, or similar).
* Experience implementing responsible AI practices (privacy, explainability, robustness, and security controls).