Senior Software Engineer, Machine Learning Infrastructure
Agtonomy
Other Engineering, Software Engineering
South San Francisco, CA, USA
Posted on Sep 11, 2025
About Us
At Agtonomy, we’re not just building tech—we’re transforming how vital industries get work done. Our Physical AI and fleet services turn heavy machinery into intelligent, autonomous systems that tackle the toughest challenges in agriculture, turf, and beyond. Partnering with industry-leading equipment manufacturers, we’re creating a future where labor shortages, environmental strain, and inefficiencies are relics of the past. Our team is a tight-knit group of bold thinkers—engineers, innovators, and industry experts—who thrive on turning audacious ideas into reality. If you want to shape the future of industries that matter, this is your shot.
About the Role
We are seeking an experienced software engineer to join our team and help scale our ML platform. In this role, you will be responsible for designing, implementing, and maintaining large-scale ML infrastructure to accelerate model iteration and improve training performance across an expanding GPU ecosystem. You’ll work hand-in-hand with our ML and cloud infrastructure teams to solve novel problems, from curating sensor datasets to deploying models that make tractors think like seasoned farmers.
What you'll be doing
- Architect and build distributed training pipelines that scale to handle petabytes of real-world data from farms, fields, and other rugged environments.
- Own the ML lifecycle: curate, label, and visualize massive datasets from cameras, LiDAR, and radar to train world-class models.
- Implement metrics and tags to provide a holistic understanding of model performance and enable the discovery of interesting scenarios for training and evaluation.
- Create tools to visualize predictions and identify failure cases.
- Partner with autonomy, platform, and cloud engineers to shape models that run flawlessly on real machines in harsh environments.
What you'll bring
- A Bachelor’s or Master’s degree in Computer Science, Machine Learning, or a related field, plus at least 3 years of experience building systems that matter.
- Experience with Python, Docker, Kubernetes, and infrastructure as code (e.g., Terraform).
- Hands-on experience with data pipelines, ETL processes, and distributed computing in cloud environments (AWS, GCP, or similar).
- A knack for thriving in a fast-paced, collaborative startup where you’ll own big problems and deliver bigger solutions.
What makes you a strong fit
- You’ve wrangled massive datasets and built systems to organize, label, and evaluate them at scale; come with examples!
- Experience working with data from multiple sensors like cameras, LiDAR, and radar.
- You’ve benchmarked complex systems or large-scale ML models, finding failure modes and turning them into wins.
- Familiarity with NVIDIA TensorRT or similar tools for optimizing ML inference.
Benefits
• 100% covered medical, dental, and vision for the employee (coverage for a partner, children, or family is additional)
• Commuter Benefits
• Flexible Spending Account (FSA)
• Life Insurance
• Short- and Long-Term Disability
• 401k Plan
• Stock Options
• Collaborative work environment alongside a passionate, mission-driven team!
Our interview process is generally conducted in five (5) phases:
1. Phone Screen with Hiring Manager (30 minutes)
2. Technical Evaluation in Domain (1 hour)
3. Software Engineering Evaluation (1 hour)
4. Panel Interview (video interviews scheduled with key stakeholders; each interview will be 30 to 60 minutes)