Quality Engineer | £750/day Inside IR35 | 6-month initial contract | London
Industry: Technology
Location: London
Job Type: Contract – 6-month initial
This programme delivers a production-grade enterprise agentic AI platform, with the Model Context Protocol (MCP) acting as the extensibility layer.
Responsibilities
- Design and implement evaluation frameworks for MCP tools and Skills, covering correctness, safety, reliability, and regression performance.
- Build automated testing pipelines for agentic behaviours, including multi-step workflows and tool-calling interactions.
- Identify and mitigate AI failure modes such as hallucination, incorrect tool usage, invalid inputs, and latency amplification.
- Produce and maintain formal testing evidence required for internal release approvals (build and operate gates).
- Collaborate with ML Engineers to integrate test hooks and observability directly into MCP servers and Skills.
- Contribute to defining long-term quality assurance standards, testing practices, and governance models for a federated MCP ecosystem.
Skills & Experience Required
- Strong Python experience for building test frameworks, automation tooling, and validation pipelines.
- Experience with LLM evaluation approaches, RAG evaluation frameworks, or custom model assessment systems.
- Hands-on experience in test automation across unit, integration, and regression testing.
- Understanding of agentic AI systems and their failure modes, particularly in non-deterministic environments.
- Ability to assess systems against governance, security, and audit/compliance requirements.
- Experience working within regulated or highly controlled engineering environments.

