Agent Evaluation Senior Analyst / Project Manager

Hybrid

Location:

Worldwide

Responsibilities:

- Fully own the QA pipeline for agent evaluation tasks;

- Review and validate tasks and golden paths created by scenario writers and experts;

- Spot logical inconsistencies, vague requirements, hidden risks, and unrealistic assumptions;

- Provide structured feedback and ensure quality alignment across contributors;

- Train, onboard, and mentor new QA team members;

- Collaborate with domain experts, delivery managers, and engineers to improve test clarity and coverage;

- Maintain and improve QA checklists, SOPs, and review guidelines;

- Contribute to test planning, prioritization, and quality benchmarks;

- Take initiative to suggest new approaches, tools, and processes that help scale validation and analysis.

Requirements:

- Experience at a top-3 or tier-2 consulting firm is a plus;

- Strong analytical and critical thinking skills;

- Attention to detail and reliability: your work can be trusted without double-checking;

- Experience in manual QA, scenario validation, or similar analytical work;

- Comfortable working with structured formats (JSON/YAML);

- Clear written communication and documentation skills;

- Able to work with a wide range of stakeholders, from engineers to directors and VPs.

Nice to have:

- Background in scenario-based testing, test design, or annotation workflows;

- Experience with AI/LLM evaluation, prompt validation, or agent behavior testing;

- Some technical independence (e.g., Python skills);

- Familiarity with MCP / tool-based task execution.

Candidate referral: 5% of the invoice amount, payable after the candidate has completed the warranty period.

Contact:

Telegram:

Function

IT

Product Management

Industries

FI, Banking & FinTech
