Together AI

About Together AI

Empowering corporate mentorship for effective learning

👥 21-100 employees · 📍 CityPlace, Toronto, ON · 💰 $1.7m
B2B · HR · Learning · SaaS · Community

Key Highlights

  • Founded in 2018, headquartered in Toronto, ON
  • Raised $1.7 million in seed funding
  • Partnerships with Heineken, Reddit, and 7-Eleven
  • 4 weeks paid vacation and competitive equity packages

Together is a corporate mentorship management platform founded in 2018, headquartered in CityPlace, Toronto, ON. The platform streamlines the mentorship lifecycle, facilitating connections among employees at companies like Heineken, Reddit, and 7-Eleven. With $1.7 million in seed funding, Together a...

🎁 Benefits

Together offers competitive salaries and equity packages, 4 weeks of paid vacation, and a comprehensive health, dental, and vision plan through Honeyb...

🌟 Culture

Together fosters a culture of autonomy and impact, allowing employees to take on significant responsibilities without bureaucratic constraints. The fo...

Together AI

Machine Learning Engineer · Senior

Together AI📍 San Francisco - On-Site

Overview

Together AI is hiring a Senior Machine Learning Platform Engineer to build and optimize a container platform for custom models and inference. You'll work with technologies like CUDA, PyTorch, and Kubernetes in San Francisco.

Job Description

Who you are

You have over 5 years of experience in building large-scale, fault-tolerant distributed systems — you've tackled challenges in optimizing performance and ensuring robustness in complex environments. Your expertise includes working with serverless inference platforms and you are familiar with the intricacies of model bring-up and cloud operations.

You possess a strong understanding of container orchestration, particularly with Kubernetes — you know how to manage multi-cluster scheduling and can identify and resolve machine learning bottlenecks effectively. Your background in profiling and optimization allows you to enhance system performance and developer experience.

You are skilled in writing clear, maintainable software and infrastructure as code (IaC) — you understand the importance of documentation and testing strategies to ensure robustness and fault tolerance in your solutions. You thrive in collaborative environments, partnering with product teams to translate functional requirements into technical solutions.

Desirable

Experience with video or audio generation technologies is a plus — you have a keen interest in the latest advancements in machine learning and are eager to apply them in practical scenarios. Familiarity with queueing theory and inference engines will further enhance your contributions to the team.

What you'll do

In this role, you will focus on enabling custom models and dedicated inference on Together's platform. Your responsibilities will include building a container platform that optimizes autoscaling and minimizes cold starts. You will analyze and improve end-to-end model performance, ensuring a best-in-class developer experience with great tooling.

You will work on multi-cluster orchestration and predictive autoscaling. Your insights will help in the development of control planes and model optimization strategies. You will also be involved in writing APIs for managing deployments and developing inference worker SDKs and CLI tools.

Your role will require you to conduct design and code reviews — you will create developer documentation and develop testing strategies that enhance the robustness and scalability of existing distributed systems, APIs, databases, and infrastructure. You will collaborate closely with product teams to understand their needs and deliver solutions that meet business objectives.

What we offer

Together AI provides a dynamic work environment where innovation thrives — you will be part of a team that is dedicated to pushing the boundaries of machine learning technology. We encourage you to apply even if your experience doesn't match every requirement, as we value diverse perspectives and backgrounds.

You will have opportunities for professional growth and development — we believe in fostering talent and providing the resources needed to succeed in your career. Join us in shaping the future of AI and making a significant impact in the industry.

Similar Jobs You Might Like

Whatnot

Machine Learning Engineer

Whatnot📍 San Francisco - Hybrid

Whatnot is hiring a Machine Learning Platform Engineer to design and scale core infrastructure for machine learning applications. You'll work with technologies like Python, TensorFlow, and Kubernetes in San Francisco. This role requires experience in building dependable ML systems at scale.

🏢 Hybrid
2w ago

Strava

Machine Learning Engineer

Strava📍 San Francisco - Hybrid

Strava is seeking a Machine Learning Platform Engineer to develop sophisticated machine learning models and systems. You'll work with technologies like Python and TensorFlow in a hybrid role based in San Francisco.

🏢 Hybrid · Mid-Level
2 months ago

Whatnot

Machine Learning Engineer

Whatnot📍 San Francisco - Hybrid

Whatnot is seeking a Machine Learning Platform Engineer to design and scale core infrastructure for machine learning applications. You'll work with technologies like Python, TensorFlow, and AWS to bring cutting-edge models into production. This role requires a strong background in machine learning and cloud infrastructure.

🏢 Hybrid · Mid-Level
1 month ago

Together AI

Software Engineering

Together AI📍 Amsterdam

Together AI is seeking a Senior Software Engineer for their ML Platform team to optimize model performance and developer experience. You'll work with distributed systems and APIs, requiring 5+ years of experience in building scalable solutions.

Senior
2w ago

Together AI

Machine Learning Engineer

Together AI📍 San Francisco

Together AI is seeking a Senior Machine Learning Engineer to develop systems and APIs for LLM inference and fine-tuning. You'll work with Python, Go, and Rust to build scalable, high-performance solutions. This role requires 5+ years of experience in production-quality code.

Senior
2w ago