Agent Evaluation Senior Analyst / Project Manager

Hybrid
Location: Worldwide

Responsibilities: 

- Fully own the QA pipeline for agent evaluation tasks;

- Review and validate tasks and golden paths created by scenario writers and experts;

- Spot logical inconsistencies, vague requirements, hidden risks, and unrealistic assumptions;

- Provide structured feedback and ensure quality alignment across contributors;

- Train, onboard, and mentor new QA team members;

- Collaborate with domain experts, delivery managers, and engineers to improve test clarity and coverage;

- Maintain and improve QA checklists, SOPs, and review guidelines;

- Contribute to test planning, prioritization, and quality benchmarks;

- Take initiative to suggest new approaches, tools, and processes that help scale validation and analysis.

Requirements:

- Top-3 or tier-2 consulting experience is a plus;

- Strong analytical and critical thinking skills;

- Attention to detail and reliability: your work can be trusted without double-checking;

- Experience in manual QA, scenario validation, or similar analytical work;

- Comfortable working with structured formats (JSON/YAML);

- Clear written communication and documentation skills;

- Able to work with a wide range of stakeholders, from engineers to directors and VPs.

Nice to have:

- Background in scenario-based testing, test design, or annotation workflows;

- Experience with AI/LLM evaluation, prompt validation, or agent behavior testing;

- Some technical independence (e.g., Python skills);

- Familiarity with MCP / tool-based task execution.

Function
IT
Product Management
Industries
FI, Banking & FinTech