Senior Member of Technical Staff: ML Systems and Infrastructure

DevRev

Location

Bangalore, India

Job Type

Full-time

Posted

April 16, 2026

Job Description

About DevRev

At DevRev, we're building the future of work with Computer – your AI teammate. Unlike traditional tools, Computer unifies all your data sources, tools, and workflows into a single AI-ready platform, giving employees real-time insights, proactive suggestions, and powerful agentic actions. It extends your existing software with AI-native apps and agents that work alongside your teams and customers – updating workflows, coordinating across teams, and eliminating repetitive work. We call this Team Intelligence: human-AI collaboration that breaks down silos, brings people back together, and frees you to solve bigger problems. Backed by Khosla Ventures and Mayfield with $150M+ raised, DevRev is trusted by global companies across industries.

What You’ll Do:

Architect the Future of AI Infrastructure: You will design, build, and own the end-to-end platform that supports the entire lifecycle of our ML models, from massive-scale distributed training to ultra-low-latency, highly available inference.
Optimize and Serve Cutting-Edge Models: You'll implement and scale sophisticated inference stacks for LLMs using frameworks like vLLM, TensorRT-LLM, or SGLang. You’ll solve complex challenges in throughput, latency, token streaming, and automated scaling to deliver a seamless user experience.
Empower AI Innovation: You will act as a strategic partner to our AI Research and Data Science teams. You’ll create a seamless developer experience that accelerates their ability to experiment, fine-tune, and deploy groundbreaking models with velocity and confidence.
Automate Everything: You'll develop robust CI/CD/CT (Continuous Training) pipelines using tools like Argo Workflows, ArgoCD, and GitHub Actions to automate model validation, deployment, and lifecycle management, ensuring our systems are both agile and rock-solid.

What we are looking for:

Experience: 5+ years in infrastructure or software engineering, with at least 2 years focused on MLOps or ML infrastructure for large-scale distributed systems.
Education: A Bachelor’s or Master’s degree in Computer Science, Engineering, or a related field.
Kubernetes & Cloud Native Expertise: Deep, hands-on expertise with Kubernetes in production. You are fluent in the cloud-native ecosystem, including Helm, ArgoCD, and Argo Workflows.
GPU & Cloud Mastery: Experience optimizing platform performance and scalability across GPU resource utilization, data ingestion, model training, and deployment.
Modern LLM Serving Experience: Hands-on experience with modern LLM inference serving frameworks (e.g., vLLM, SGLang, Triton Inference Server, Ray Serve). You understand the unique challenges of serving generative models.
Strong Coder: Strong programming proficiency in Python or Go, with experience using ML frameworks like PyTorch, JAX, or TensorFlow.
Observability Mindset: A passion for building observable and resilient systems using modern monitoring tools (e.g., Prometheus, Grafana, OpenTelemetry).

We would love to see:

Deep performance optimization skills, including writing custom inference kernels in CUDA or Triton to accelerate model performance beyond what off-the-shelf frameworks provide.
Experience with model optimization techniques like quantization, distillation, and speculative decoding.
Exposure to training and serving multi-modal models (e.g., text-to-image, vision-language).
Knowledge of AI safety and evaluation frameworks for monitoring model performance for things like bias, toxicity, and hallucinations.

As part of our hiring process, shortlisted candidates will undergo a Background Verification (BGV). By applying, you consent to sharing personal information required for this process. Any offer made will be subject to successful completion of the BGV.

DevRev is an equal opportunity employer and does not discriminate on the basis of race, gender, sexual orientation, gender identity/expression, national origin, disability, age, genetic information, veteran status, marital status, pregnancy or related condition, or any other basis protected by law.

Job Information

Source: greenhouse

Remote Type: onsite

Allowed Locations: Bangalore, India

Skills & Tags:

Engineering
