Together AI

About Together AI

Empowering corporate mentorship for effective learning

👥 21-100 employees · 📍 CityPlace, Toronto, ON · 💰 $1.7m
B2B · HR · Learning · SaaS · Community

Key Highlights

  • Founded in 2018, headquartered in Toronto, ON
  • Raised $1.7 million in seed funding
  • Partnerships with Heineken, Reddit, and 7-Eleven
  • 4 weeks paid vacation and competitive equity packages

Together is a corporate mentorship management platform founded in 2018, headquartered in CityPlace, Toronto, ON. The platform streamlines the mentorship lifecycle, facilitating connections among employees at companies like Heineken, Reddit, and 7-Eleven. With $1.7 million in seed funding, Together a...

🎁 Benefits

Together offers competitive salaries and equity packages, 4 weeks of paid vacation, and a comprehensive health, dental, and vision plan through Honeyb...

🌟 Culture

Together fosters a culture of autonomy and impact, allowing employees to take on significant responsibilities without bureaucratic constraints. The fo...

Together AI

AI Engineer, Mid-Level

Together AI · San Francisco

Posted 2 months ago · Mid-Level · AI Engineer · 📍 San Francisco · 📍 Singapore · 📍 Amsterdam · 💰 $160,000 - $230,000 / year

Overview

Together AI is seeking an LLM Inference Frameworks and Optimization Engineer to design and optimize distributed inference engines for large language models. You'll work with technologies like CUDA, TensorRT, and PyTorch to enhance performance and scalability. This role requires expertise in distributed systems and machine learning.

Job Description

Who you are

You have a strong background in AI engineering with a focus on inference frameworks and optimization. Your experience includes designing and developing distributed systems that support high-performance AI applications. You are proficient in CUDA and have worked with TensorRT and PyTorch to optimize model performance. You understand the intricacies of GPU and accelerator optimizations, and you are familiar with algorithms that enhance inference efficiency. You thrive in collaborative environments, working closely with hardware and software teams to ensure seamless integration and performance. You are passionate about pushing the boundaries of AI inference and are eager to contribute to innovative projects.

Desirable

Experience with multimodal models and techniques such as Mixture of Experts (MoE) parallelism is a plus. Familiarity with software-hardware co-design principles will set you apart. You have a keen interest in the latest advancements in AI and are always looking to learn and apply new technologies.

What you'll do

In this role, you will design and develop fault-tolerant, high-concurrency distributed inference engines for text, image, and multimodal generation models. You will implement and optimize distributed inference strategies, including tensor parallelism and pipeline parallelism, to ensure high-performance serving. Your work will involve applying CUDA graph optimizations and TensorRT/TRT-LLM graph optimizations to enhance the efficiency and scalability of large language models. You will collaborate with hardware teams to ensure that the software and hardware components work seamlessly together, contributing to the overall success of the AI infrastructure. You will also engage in research and development to explore new algorithms and techniques that can further improve inference performance.

What we offer

Together AI provides a dynamic work environment where innovation is encouraged. You will have the opportunity to work on cutting-edge AI technologies and contribute to projects that have a significant impact on the industry. We offer competitive compensation and benefits, along with opportunities for professional growth and development. Join us in shaping the future of AI inference infrastructure and be part of a team that values creativity and collaboration.

Similar Jobs You Might Like

Based on your interests and this role

Coin Market Cap Ltd

AI Engineer

Coin Market Cap Ltd · 📍 Global - Remote

Coin Market Cap Ltd is hiring an LLM Algorithm Engineer to develop and optimize large language models. You'll work with advanced techniques like SFT and RLHF, utilizing frameworks such as PyTorch and TensorFlow. This position requires 3+ years of experience in the field.

🏠 Remote · Mid-Level
9 months ago

MongoDB

AI Engineer

MongoDB · 📍 Austin - Remote

MongoDB is seeking an LLM Optimization Lead to drive growth via Large Language Models and AI platforms. You'll work on optimizing brand visibility and customer acquisition strategies. This role requires expertise in AI and SEO.

🏠 Remote · Lead
9 hours ago

Together AI

Backend Engineer

Together AI · 📍 San Francisco - On-Site

Together AI is seeking a Senior Backend Engineer to build and optimize their Inference Platform for advanced generative AI models. You'll work with technologies like Python, Docker, and AWS to enhance performance and scalability. This role requires strong experience in backend engineering and machine learning.

🏛️ On-Site · Senior
1 month ago

SonarSource

AI Engineer

SonarSource · 📍 Singapore

SonarSource is hiring an LLM Engineer to work on pioneering AI and ML projects within software engineering. You'll develop novel algorithms and enhance system performance. This role requires expertise in machine learning and Python.

11 months ago

OpenAI

Machine Learning Engineer

OpenAI · 📍 San Francisco - On-Site

OpenAI is hiring a Machine Learning Engineer to improve the training throughput of their internal training framework. You'll work with Python, TensorFlow, and PyTorch to enable researchers to experiment with new ideas. This position requires strong engineering skills and knowledge of supercomputer performance.

🏛️ On-Site · Mid-Level
3 months ago