Python/Prompt Engineer – LLM‑Based AI Agents

BMW TechWorks India
11 days ago
Full-time
On-site
Pune, Maharashtra, India
Generative AI & LLM

Role

Python Engineer – LLM‑Based AI Agents

 

Visit our website bmwtechworks.in to learn more.

Follow us on LinkedIn, Instagram, Facebook, and X for updates.

 

About the Unit / Unit Overview

This role supports DECO in designing, developing, and operationalizing LLM‑based intelligent agents and multi‑agent systems.
You will build autonomous, tool‑using agents that combine reasoning, planning, memory, and retrieval to automate enterprise processes end‑to‑end across quality, logistics, production, and business domains.

Location


Experience: 


Number of openings 

 

 

What awaits you / Job Profile

 

AI Agent Engineering & LLM Development

 

Job Description

Key Responsibilities

You will build next‑generation AI agents using advanced agentic concepts:

  • Develop LLM‑based agents with capabilities such as reasoning, planning, reflection, tool use, and memory.
  • Implement ReAct (Reasoning + Acting) patterns and Chain‑of‑Thought (CoT) prompting for reliable multi‑step reasoning.
  • Build agent architectures including Planner/Controller, Executor/Worker, Scratchpads, and Short‑/Long‑Term Memory.
  • Develop tool wrappers that allow agents to interact with APIs, databases, ERP systems, file systems, and search tools.
  • Build and optimize dynamic prompting pipelines, contextual injection, prompt chaining, and dynamic few‑shot selection.
  • Create self‑healing agents with reflection loops, retry logic, and error‑correction strategies.
  • Implement guardrails to ensure safety, policy compliance, and output validation.
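
For illustration only, the responsibilities above (ReAct-style reasoning and acting, scratchpads, tool wrappers, and retry logic) might fit together roughly as in the sketch below. Everything in it is an assumption made for the example: the `Agent` class, the `scripted_policy` function that stands in for an LLM call, and the `stock_lookup` tool are hypothetical and do not describe an existing BMW TechWorks India system or any specific framework.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional, Tuple

# A tool wrapper here is simply a function from a string argument to a string observation.
Tool = Callable[[str], str]


@dataclass
class Agent:
    # `policy` stands in for the LLM (hypothetical): it reads the scratchpad and
    # returns either (tool_name, argument) or ("finish", answer).
    policy: Callable[[List[str]], Tuple[str, str]]
    tools: Dict[str, Tool]
    max_steps: int = 5
    max_retries: int = 2
    scratchpad: List[str] = field(default_factory=list)

    def call_tool(self, name: str, arg: str) -> str:
        """Run a tool with simple retry logic; return the error as an observation if every attempt fails."""
        last_error: Optional[Exception] = None
        for _ in range(self.max_retries + 1):
            try:
                return self.tools[name](arg)
            except Exception as exc:  # includes unknown tool names (KeyError)
                last_error = exc
        return f"ERROR: {last_error!r}"

    def run(self, task: str) -> str:
        """Alternate between choosing an action and recording its observation."""
        self.scratchpad.append(f"Task: {task}")
        for _ in range(self.max_steps):
            action, arg = self.policy(self.scratchpad)
            if action == "finish":
                return arg
            observation = self.call_tool(action, arg)
            self.scratchpad.append(f"Action: {action}({arg})")
            self.scratchpad.append(f"Observation: {observation}")
        return "Stopped: step limit reached"


# --- Usage with a scripted policy and a tool that fails once (both hypothetical) ---
calls = {"n": 0}


def stock_lookup(part_id: str) -> str:
    calls["n"] += 1
    if calls["n"] == 1:
        raise TimeoutError("inventory service timed out")
    return f"{part_id}: 17 units"


def scripted_policy(scratchpad: List[str]) -> Tuple[str, str]:
    # First step: look up stock. Afterwards: finish with the last observation.
    if not any(line.startswith("Observation:") for line in scratchpad):
        return ("stock_lookup", "part-42")
    return ("finish", scratchpad[-1].split(": ", 1)[1])


agent = Agent(policy=scripted_policy, tools={"stock_lookup": stock_lookup})
result = agent.run("How many units of part-42 are in stock?")
print(result)
```

In a real agent the policy would be an LLM prompt that receives the scratchpad as context, and guardrails would validate both the chosen action and the final answer before either is acted on.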

Python Engineering

  • Design and implement scalable backend services (FastAPI/Flask) powering agent pipelines.
  • Build modular, reusable agent components in Python for planning, memory, tool execution, and reasoning control.
  • Integrate agent services into enterprise IT systems and microservice ecosystems.

RAG, Data & Integration

  • Build Retrieval‑Augmented Generation (RAG) systems using embeddings and semantic search.
  • Work with vector stores such as FAISS, Chroma, Milvus, or Pinecone.
  • Implement semantic memory, context compression, and document indexing.
  • Integrate structured and unstructured data (SQL, JSON, XML, logs, domain models, Kafka).
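
As a rough sketch of the retrieval step in a RAG system, the example below ranks documents by cosine similarity and places the best matches into a prompt. It makes strong simplifying assumptions: the `embed` function is a toy word-count vector rather than a real embedding model, the in-memory list replaces a vector store such as those named above, and the sample documents and the function names `retrieve` and `build_prompt` are invented for illustration.

```python
import math
from collections import Counter
from typing import List, Tuple


def embed(text: str) -> Counter:
    """Toy embedding (assumption): lowercase word counts instead of a learned vector."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def retrieve(query: str, docs: List[str], k: int = 2) -> List[Tuple[float, str]]:
    """Return the k documents most similar to the query, best first."""
    q = embed(query)
    scored = [(cosine(q, embed(d)), d) for d in docs]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)[:k]


def build_prompt(query: str, docs: List[str], k: int = 2) -> str:
    """Inject the retrieved passages into the prompt as grounding context."""
    context = "\n".join(f"- {d}" for _, d in retrieve(query, docs, k))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"


# Invented sample documents for the example.
docs = [
    "Paint shop line 3 reported a defect rate of 2 percent in March",
    "Logistics hub delays were caused by a customs backlog",
    "The cafeteria menu changes every Monday",
]
query = "What was the defect rate in the paint shop"
top = retrieve(query, docs, k=1)
prompt = build_prompt(query, docs, k=1)
print(prompt)
```

A production pipeline would replace `embed` with an embedding model and the list scan with an index lookup, but the flow of embed, rank, and inject stays the same.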

Multi‑Agent System Development

  • Build multi‑agent systems (MAS) where planners delegate tasks to worker agents.
  • Define clear agent roles (Supervisor, Planner, Executor…).
  • Implement agent communication protocols such as A2A (Agent2Agent) and MCP (Model Context Protocol).
  • Enable coordination, negotiation, and joint problem‑solving between agents.

MLOps & Deployment

  • Deploy agent services in Kubernetes and manage containerized workloads.
  • Build CI/CD pipelines using GitHub/Jenkins for agent services, prompts, and toolchains.
  • Monitor performance metrics such as latency, token efficiency, retry rate, error rate, and hallucination rate.

Agent Evaluation & Quality Assurance

Apply modern evaluation frameworks and metrics:

  • Use DeepEval, LangChain Testing Utilities, TruLens, Ragas, Promptfoo, and Guardrails for automated testing.
  • Evaluate correctness, relevance, groundedness, consistency, SQL validity (for SQL agents), and semantic accuracy.
  • Build automated regression test suites for prompts, models, and agent orchestration.
  • Measure SQL metrics (Query Validity, Semantic Correctness, Filter Accuracy, Overquerying, Aggregation Quality).
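
One possible way to compute a Query Validity score for a SQL agent is sketched below. It assumes the SQLite dialect and a made-up `defects` table; the function names `is_valid_sql` and `query_validity` are hypothetical. The sketch checks only that each generated query compiles against the schema, not that it is semantically correct, so the other metrics listed above would need separate checks.

```python
import sqlite3
from typing import List

# Invented schema for the example; a real suite would load the target database schema.
SCHEMA = "CREATE TABLE defects (id INTEGER PRIMARY KEY, plant TEXT, severity INTEGER);"


def is_valid_sql(query: str, schema: str = SCHEMA) -> bool:
    """Return True if the query compiles against the schema in an in-memory SQLite database."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.executescript(schema)
        # EXPLAIN compiles the statement without running it against any data.
        conn.execute(f"EXPLAIN {query}")
        return True
    except sqlite3.Error:
        return False
    finally:
        conn.close()


def query_validity(queries: List[str]) -> float:
    """Fraction of generated queries that are valid for the schema."""
    if not queries:
        return 0.0
    return sum(is_valid_sql(q) for q in queries) / len(queries)


generated = [
    "SELECT plant, COUNT(*) FROM defects GROUP BY plant",
    "SELECT plant FROM defects WHERE severity > 3",
    "SELECT colour FROM defects",  # unknown column
    "SELEC plant FROM defects",    # syntax error
]
score = query_validity(generated)
print(f"Query validity: {score:.2f}")
```

A check like this can run in an automated regression suite so that a prompt or model change that lowers the score is caught before deployment.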

Delivery & Support model:

  • Standard business hours, 5x9 (Mon - Fri, 08:00 – 17:00 CET/CEST, DST included).
  • Ops activities take place during standard business hours; during Indian public holidays, at least one colleague must be on call.

What should you bring along

 

Expected skill sets and experience that candidates should bring along.

Key Qualifications and Skills:

  • Expert-level Python skills (async, OOP, microservices).
  • Strong experience with LLMs and agent frameworks: LangChain, LlamaIndex, Haystack, CrewAI, AutoGen, DSPy.
  • Deep knowledge of agent concepts: Planner, Executor, Memory, Reflection, ReAct, Scratchpads, Prompt Chaining.
  • Experience with RAG, vector databases, embeddings, semantic search.
  • Experience with FastAPI, Docker, Kubernetes, GitHub CI/CD.
  • Understanding of prompting techniques: Zero‑Shot / Few‑Shot / One‑Shot, CoT, Dynamic Prompting, Self‑Ask.
  • Knowledge of guardrails, safety, evaluation, and reliability engineering.

Additional Skills:

  • Experience in automotive processes or enterprise-grade data systems.
  • Knowledge of monitoring tools (Prometheus, Grafana, ELK).
  • Familiarity with model registries (MLflow) or agent evaluation stacks.
  • Understanding of world models, semantic memory, and context compression.
  • Experience tuning or optimizing open-source models (LLaMA, Qwen, Mistral).

Must-have technical skills

  • Python (advanced)
  • LLMs & Prompting
  • ReAct, CoT, Scratchpads, Planner/Executor patterns
  • LangChain / LlamaIndex / Haystack
  • RAG, Embeddings, Vector stores
  • FastAPI
  • Docker
  • Kubernetes
  • GitHub / CI/CD

Good-to-have technical skills

  • CrewAI, AutoGen, DSPy
  • Azure AI services
  • Logging & Observability
  • Testing frameworks: DeepEval, Ragas, Promptfoo, Guardrails
  • Multi‑agent orchestration patterns