SPi Labs
Engineering · Remote · Full-time

ML Engineer, On-device Inference

Shrink frontier models to fit on edge hardware without sacrificing the accuracy our customers depend on.

What you'll do

  • Quantize, prune, and distill large perception models for edge deployment
  • Benchmark inference across TensorRT, ONNX Runtime, and custom runtimes
  • Build automated model optimization and profiling pipelines
  • Collaborate with research to design architecture-aware efficient models

What we look for

  • 3+ years in ML model optimization or deployment
  • Hands-on experience with TensorRT, ONNX, or TFLite
  • Strong Python and C++ skills
  • Understanding of neural network quantization and hardware accelerators

Interested in this role?

Leave your details and we'll get back to you.