Senior Software QA Engineer (AI Model Testing)
Autonomize
Software Engineering, Data Science, Quality Assurance
Bengaluru, Karnataka, India
Posted on Apr 28, 2026
Autonomize AI is building the intelligence layer for healthcare. Our Genesis platform replaces brittle, manual knowledge workflows with AI agents that reason, retrieve, and act - reducing administrative burden so clinicians can focus on patients.
We're looking for engineers who don't just integrate AI into software - they think in agents, design for inference, and treat LLMs as first-class infrastructure.
What This Role Is
You'll architect and ship across the full Genesis stack: agentic pipelines, backend APIs, data infrastructure, and clinical-facing UI. You'll work directly with founders and customers. You'll own things end-to-end.
This is not a role where you bolt AI onto existing CRUD. You'll be making foundational decisions about how intelligent systems are designed, evaluated, and operated at scale in a regulated industry.
You'll Thrive Here If
AI-native engineering is your default mode
- You've built production systems where LLMs are doing real work - not demos, not PoCs
- You've designed and shipped RAG pipelines, multi-agent workflows, or tool-using agents in production
- You understand prompt engineering as an engineering discipline: versioning, evaluation, regression testing
- You've instrumented AI systems for observability - latency, token usage, hallucination rate, drift
- You can reason about model tradeoffs (context length, cost, latency, accuracy) and make architectural calls accordingly
- You've worked with LLM SDKs (OpenAI, Anthropic, Bedrock, etc.) and agentic orchestration frameworks (LangChain, LlamaIndex, CrewAI, or similar)
- 4+ years building production web applications from scratch
- Deep Python proficiency; comfortable with FastAPI, Django, or Flask in production
- Experience designing APIs that serve both humans and AI agents (tool schemas, structured outputs, streaming)
- Async-first thinking: asyncio, task queues, event-driven architectures
- Kafka, Redis, or ActiveMQ for real-time data movement
- Postgres, Elasticsearch, MongoDB, or graph databases (Neo4j, TigerGraph) in production
- Docker and Kubernetes in production - this is a hard requirement
- At least one public cloud (AWS, Azure, GCP) with real operational experience
- Microservices and cloud-native design patterns
- You've been on-call. You know what a bad deploy feels like at 2am.
- React, TypeScript, or modern JS frameworks
- Enough frontend fluency to build clinical interfaces without a dedicated frontend handoff
- Led a small engineering team - mentored, reviewed, unblocked
- CKAD or equivalent Kubernetes certification
- ML/DL model deployment experience (PyTorch, scikit-learn)
- Built evaluation harnesses or used MLflow, LangSmith, or similar for AI observability
- Healthcare domain experience (FHIR, HL7, clinical workflows)
- Bias toward action - you ship, then iterate. You learn by doing, not by planning.
- Owner mentality - you don't wait for permission. You identify what needs to exist and build it.
- Intellectual honesty - you'd rather say "I don't know, let me find out" than guess. When unsure, you seek the right information from your peers or leaders.
- Async-first communication - you write clearly, document decisions, and work well across time zones.
- VC-backed healthcare AI company growing fast
- Full-stack ownership - no handoffs, no silos, no "that's not my team"
- Direct access to founders and customers - your technical decisions will be seen and felt
- Professional development budget - conferences, courses, certifications, books
- Flexible -friendly culture built around output, not hours