About Sesame
Sesame believes in a future where computers are lifelike – with the ability to see, hear, and collaborate with us in ways that feel natural and human. With this vision, we're designing a new kind of computer, focused on making voice companions part of our daily lives. Our team brings together founders from Oculus and Ubiquity6, alongside proven leaders from Meta, Google, and Apple, with deep expertise spanning hardware and software. Join us in shaping a future where computers truly come alive.
About the Role
As a Software Engineer, ML Infrastructure, you will build and scale the foundational infrastructure that powers Sesame’s AI-driven computing experiences. You will work closely with ML researchers, software engineers, and hardware teams to design robust, high-performance infrastructure that enables cutting-edge AI models to run efficiently in real-time environments.
Responsibilities
- Design, build, and optimize scalable ML infrastructure to support training, evaluation, and deployment of AI models.
- Develop and maintain data pipelines for large-scale machine learning workflows.
- Implement efficient model-serving architectures for real-time inference.
- Collaborate with ML engineers and researchers to improve model performance and deployment efficiency.
- Build monitoring and observability tools to ensure system reliability and performance.
Required Qualifications
- Bachelor's or Master's degree in Computer Science, Engineering, or a related field.
- 5+ years of experience in software engineering, focusing on ML infrastructure, distributed systems, or backend engineering.
- Proficiency in Python and at least one systems programming language such as C++.
- Experience with ML frameworks such as TensorFlow, PyTorch, or JAX.
- Hands-on experience with cloud platforms (AWS, GCP, or Azure) and containerization (Docker, Kubernetes).
- Knowledge of hardware acceleration (TPUs, GPUs) and efficient model deployment strategies.
- Strong understanding of distributed computing, data pipelines, and model-serving architectures.
Preferred Qualifications
- Experience optimizing ML model inference for real-time applications and edge devices.
- Experience writing and optimizing ML kernels in CUDA or Triton.
- Familiarity with large-scale training pipelines and model orchestration tools.
- Experience with profiling tools such as NVIDIA Nsight or PyTorch Profiler.
- Prior work in AI-driven consumer applications or human-computer interaction.
Benefits
- 401k matching
- 100% employer-paid health, vision, and dental benefits
- Unlimited PTO and sick time
- Flexible spending account matching (medical FSA)