QA Manager

Scaled Cognition
Full-time
On-site
Test Manager
Scaled Cognition is the world’s only model lab dedicated exclusively to customer experience and pioneering agentic models purpose-built for reliable action-taking enterprise applications. Backed by Khosla Ventures, the company’s flagship Agentic Pretrained Transformer (APT) eliminates hallucinations, enforces enterprise policies and increases reliability in real-world CX workflows. Founded by serial AI entrepreneurs, former Microsoft Corporate Vice President of Conversational AI Dan Roth, and UC Berkeley AI Professor Dan Klein, and built by a team of world-class PhD researchers and engineers, Scaled Cognition advances the science of agentic AI to deliver safe, policy-aligned automation that enterprises can trust.

As a QA Manager at Scaled Cognition you will:

  • Develop and implement scalable QA plans for evaluating AI agents, defining key performance metrics to measure progress over time.
  • Collaborate with product and engineering teams to document findings, test fixes, and recommend improvements to the underlying models and conversational flows.
  • Lead and mentor a team of QA engineers, establishing best practices and processes for testing conversational AI agents.

Example projects could include:

  • Building test sets for regression tracking, agent robustness, and end-to-end testing.
  • Reviewing and analyzing voice and chat transcripts to quickly identify conversational gaps and provide data for faster iteration on customer deployments.
  • Designing and automating testing pipelines to scale QA capacity across a diverse portfolio of customers and to continuously evaluate the performance of our AI agents.

Preferred Qualifications:

  • Intermediate-level proficiency in Python and experience building and testing conversational AI/LLM systems.
  • Background in implementing evaluation benchmarks and production monitoring metrics.
  • Experience working with libraries and tooling common in the AI/LLM ecosystem.
  • Demonstrated precision in documenting test plans, test cases, and bug reports, ensuring data is accurate and easily understandable by cross-functional teams.
  • Experience using AI-powered assistants and tooling to enable rapid iteration, prototyping, and accelerated delivery.