Accepting Applications
Full-time
On-site
Job Description
**About Us**
Gramian Consultancy is a boutique consultancy specializing in IT professional services and engineering talent solutions. With a strong background in software engineering and leadership, we help companies build high\-performing teams by matching them with professionals who truly fit their needs.
**Role Overview**
We are looking for an **AI Evaluation Engineer specialized in data analysis** to design benchmark tasks that simulate real\-world analytical workflows. You will create scenarios where AI systems must analyze **large, messy, multi\-source datasets**, decompose tasks across multiple agents, and produce clear, verifiable conclusions.
**Commitment required: 8 hours per day, with 4 hours of overlap with PST.**
**Employment type: Contractor assignment (no medical/paid leave)**
**Duration of contract: 4 weeks\+**
**Location: Bangladesh, Brazil, Colombia, Egypt, Ghana, India, Indonesia, Kenya, Nigeria, Turkey, Vietnam**
**Interview: take\-home assessment (60 min)**
**Responsibilities**
* Design and develop multi\-agent benchmark tasks focused on complex data analysis workflows
* Create or curate realistic datasets (CSV, JSON, logs, reports, financial or operational data)
* Build tasks requiring:
  + Cross\-referencing across multiple data sources
  + Anomaly detection and contradiction identification
  + Statistical analysis and interpretation
* Define task decomposition strategies across specialized sub\-agents (e.g., financial, technical, operational analysis)
* Develop verification logic to validate precise analytical outputs (not generic summaries)
* Implement evaluation pipelines using Python and SQL
* Create reproducible environments using Docker
* Analyze task performance and refine tasks for clarity, difficulty, and scoring accuracy
**Requirements**
* 5\+ years of experience in data analysis or analytics\-heavy roles
* Strong proficiency in Python (pandas, NumPy) and SQL
* Experience working with real\-world, messy datasets (CSV, JSON, logs, reports)
* Ability to design analytical problems with clear, verifiable answers
* Solid understanding of statistics (distributions, correlations, outliers)
* Familiarity with AI benchmarks or evaluation environments (e.g., SWE\-bench or similar)
* Hands\-on experience with Docker (Dockerfiles, image builds, debugging)
**Nice to Have**
* Experience in financial analysis, operations analytics, or risk analysis
* Exposure to data pipelines or ETL workflows
* Experience with data quality validation or anomaly detection systems
* Familiarity with AI/ML data workflows or evaluation frameworks