AI Inference Engineer

Remote · USA · Full-time

Be part of the team creating the software foundation for next-generation AI compute platforms. In this role, you’ll work across the full stack — from low-level kernels and hardware-optimized operators to large-scale ML deployment frameworks — in close collaboration with compiler developers, ML scientists, and hardware specialists. This position offers the chance to contribute to state-of-the-art AI infrastructure, fine-tune software for custom hardware, and deepen your expertise in system software and machine learning.

Responsibilities (may include some of the following)

  • Design, develop, and maintain components of the deployment stack and software kernels for AI compute platforms
  • Optimize and implement core ML operators (e.g., GEMMs, convolutions, BLAS routines, SIMD kernels)
  • Translate computational graphs from ML frameworks onto the underlying hardware
  • Contribute to compiler infrastructure together with compiler and hardware teams
  • Investigate and resolve issues through system-level debugging and performance analysis
  • Deliver scalable software solutions under ambitious development schedules
  • Define and apply practices for testing, deployment, and scaling AI systems

Minimum qualifications

  • Bachelor’s degree in Computer Science, Engineering, Mathematics, or related discipline, with 3+ years of professional software development experience
  • Solid knowledge of computer architecture, system software, and data structures
  • Strong programming skills in C/C++ or Python in Linux environments using common development tools
  • Hands-on experience implementing algorithms in high-level languages (C/C++/Python)
  • Exposure to specialized hardware (GPUs, FPGAs, DSPs, AI accelerators) and frameworks such as OpenCL or CUDA
  • Experience designing or working with high-performance software systems
  • Solid knowledge of ML fundamentals
  • Motivated team player with a strong sense of responsibility

You are a great fit if you have experience in at least one of the following areas:

  • Model serving frameworks (e.g., Triton Inference Server, DeepSpeed Inference, vLLM)
  • Deep learning frameworks (e.g., PyTorch, TensorFlow)
  • ML runtimes (e.g., ONNX Runtime, TVM, IREE, XLA)
  • Distributed collectives (e.g., Gloo, MPI)
  • Software testing and validation methodologies
  • Deploying ML workloads (LLMs, VLMs, NLP, etc.) across distributed systems
  • Implementation of ML operators and kernels (e.g., SIMD routines, activation functions, pooling layers, quantization layers)
  • Hardware-aware optimizations and performance tuning
  • 2+ years of experience developing software targeting AI hardware

Contributions to open-source projects (e.g., LLVM, PyTorch, TensorFlow, ONNX Runtime, xDSL, IREE) are a big plus.
