Engineering · Remote · Full-time
ML Engineer, On-device Inference
Shrink frontier models to fit on edge hardware without sacrificing the accuracy our customers depend on.
What you'll do
- Quantize, prune, and distill large perception models for edge deployment
- Benchmark inference across TensorRT, ONNX Runtime, and custom runtimes
- Build automated model optimization and profiling pipelines
- Collaborate with the research team to design efficient, hardware-aware model architectures
What we look for
- 3+ years of experience in ML model optimization or deployment
- Hands-on experience with TensorRT, ONNX, or TFLite
- Strong Python and C++ skills
- Understanding of neural network quantization and hardware accelerators
Interested in this role?
Leave your details and we'll get back to you.