[Remote] Senior AI QA Engineer with Python (Automation & Manual)
Note: This is a remote position open to candidates in the USA. EPAM Systems, a leading company in the tech industry, is seeking a skilled Senior AI QA Engineer with strong experience in both manual and automated testing. The ideal candidate will test a variety of AI-based applications, ensuring their reliability and accuracy while contributing to the development of automation capabilities.
Responsibilities
- Research and evolve automation frameworks in line with Gen AI tooling and best practices
- Design and automate the evaluation of Gen AI features — grounding, answer accuracy, determinism/reproducibility, precision, recall, and criteria recall
- Build automated LLM test harnesses that scale evaluation beyond human-in-the-loop
- Select and apply Gen AI evaluation frameworks, measuring answer quality and pipeline efficiency
- Perform manual testing as needed to validate new features, integrations, and user stories
- Build and maintain test cases from requirements and user stories
- Test applications that may include AI agents, APIs, databases, and other integrations
- Collaborate with product, engineering, and operations teams to understand requirements and deployment environments
- Track and report test results, defects, and quality metrics
- Assist with troubleshooting production issues; escalate risks as needed
- Guide and support team members, including onshore and offshore consultants
Skills
- 3+ years of experience in software QA, with at least 1 year focused on testing AI agents, agentic solutions, or LLM-based systems
- Hands-on experience with both manual and automated testing of AI agents, including prompt/instruction testing and evaluation of agentic workflows
- Strong programming skills in Python for test automation — pytest or equivalent, scripting, and AI/ML library integration
- Experience with AI agent frameworks, prompt engineering, and evaluation metrics for LLM-based systems
- Demonstrated experience in testing and evaluating Gen AI / LLM applications — grounding, answer accuracy, and hallucination/determinism checks
- Applied knowledge of Gen AI / LLM evaluation frameworks and metrics — precision, recall, criteria recall, and efficiency
- Experience with issue and test management tools (e.g., Jira, QMetry, TestRail)
- Experience with version control systems and integrating tests into CI/CD pipelines
- Experience using AI-powered tools for QA (e.g., GitHub Copilot, LLM-based test generation)
- Understanding of cloud environments, particularly AWS
- Excellent communication, collaboration, and leadership skills
- Strong English communication skills (B2 level or higher)
- Experience with agentic AI platforms (e.g., LangChain, OpenAI Function Calling, or similar)
- Skills in AI safety, bias, and reliability testing
- Background in test data generation for AI/ML systems
Benefits
- International projects with top brands
- Work with global teams of highly skilled, diverse peers
- Healthcare benefits
- Employee financial programs
- Paid time off and sick leave
- Upskilling, reskilling and certification courses
- Unlimited access to the LinkedIn Learning library and 22,000+ courses
- Global career opportunities
- Volunteer and community involvement opportunities
- EPAM Employee Groups
- Award-winning culture recognized by Glassdoor, Newsweek and LinkedIn