
About Together AI
Empowering corporate mentorship for effective learning
Key Highlights
- Founded in 2018, headquartered in Toronto, ON
- Raised $1.7 million in seed funding
- Partnerships with Heineken, Reddit, and 7-Eleven
- 4 weeks paid vacation and competitive equity packages
Together is a corporate mentorship management platform founded in 2018, headquartered in CityPlace, Toronto, ON. The platform streamlines the mentorship lifecycle, facilitating connections among employees at companies like Heineken, Reddit, and 7-Eleven. With $1.7 million in seed funding, Together a...
🎁 Benefits
Together offers competitive salaries and equity packages, 4 weeks of paid vacation, and a comprehensive health, dental, and vision plan through Honeyb...
🌟 Culture
Together fosters a culture of autonomy and impact, allowing employees to take on significant responsibilities without bureaucratic constraints. The fo...
Overview
Together AI is seeking an LLM Inference Frameworks and Optimization Engineer to design and optimize distributed inference engines for large language models. You'll work with technologies like CUDA, TensorRT, and PyTorch to enhance performance and scalability. This role requires expertise in distributed systems and machine learning.
Job Description
Who you are
You have a strong background in AI engineering with a focus on inference frameworks and optimization. Your experience includes designing and developing distributed systems that support high-performance AI applications. You are proficient in CUDA and have worked with TensorRT and PyTorch to optimize model performance. You understand the intricacies of GPU and accelerator optimizations, and you are familiar with algorithms that enhance inference efficiency. You thrive in collaborative environments, working closely with hardware and software teams to ensure seamless integration and performance. You are passionate about pushing the boundaries of AI inference and are eager to contribute to innovative projects.
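As a hedged illustration of the kind of inference-efficiency algorithm referenced above, the sketch below reuses a key/value cache during greedy decoding with PyTorch and Hugging Face Transformers. The gpt2 checkpoint and the generation loop are illustrative placeholders only, not Together AI's actual stack.

```python
# Minimal sketch of KV-cache reuse during greedy decoding: after the first
# forward pass, only the newest token is fed on each step and the attention
# keys/values for earlier tokens are reused from `past`.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")          # illustrative checkpoint
model = AutoModelForCausalLM.from_pretrained("gpt2").eval()

ids = tok("Distributed inference engines", return_tensors="pt").input_ids
past = None
with torch.no_grad():
    for _ in range(20):
        out = model(ids if past is None else ids[:, -1:],
                    past_key_values=past, use_cache=True)
        past = out.past_key_values
        next_id = out.logits[:, -1].argmax(dim=-1, keepdim=True)
        ids = torch.cat([ids, next_id], dim=-1)

print(tok.decode(ids[0]))
```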
Desirable
Experience with multimodal models and techniques such as Mixture of Experts (MoE) parallelism is a plus. Familiarity with software-hardware co-design principles will set you apart. You have a keen interest in the latest advancements in AI and are always looking to learn and apply new technologies.
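For readers unfamiliar with Mixture of Experts, here is a minimal single-device sketch of top-1 expert routing in PyTorch. In real MoE parallelism the experts would be sharded across GPUs and the token dispatch would become an all-to-all; the expert count and dimensions below are illustrative assumptions, not details from this posting.

```python
# Minimal sketch of top-1 Mixture-of-Experts routing on a single device.
import torch
import torch.nn as nn
import torch.nn.functional as F

class Top1MoE(nn.Module):
    def __init__(self, d_model: int = 512, n_experts: int = 4):
        super().__init__()
        self.gate = nn.Linear(d_model, n_experts)            # router
        self.experts = nn.ModuleList(
            nn.Linear(d_model, d_model) for _ in range(n_experts)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (tokens, d_model). Route each token to its highest-scoring expert.
        scores = F.softmax(self.gate(x), dim=-1)              # (tokens, n_experts)
        weight, idx = scores.max(dim=-1)                      # top-1 per token
        out = torch.zeros_like(x)
        for e, expert in enumerate(self.experts):
            mask = idx == e
            if mask.any():
                out[mask] = expert(x[mask]) * weight[mask].unsqueeze(-1)
        return out

tokens = torch.randn(8, 512)
print(Top1MoE()(tokens).shape)  # torch.Size([8, 512])
```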
What you'll do
In this role, you will design and develop fault-tolerant, high-concurrency distributed inference engines for text, image, and multimodal generation models. You will implement and optimize distributed inference strategies, including tensor parallelism and pipeline parallelism, to ensure high-performance serving. Your work will involve applying CUDA graph and TensorRT/TRT-LLM graph optimizations to improve the efficiency and scalability of large language model serving. You will collaborate with hardware teams to ensure that software and hardware components work seamlessly together, contributing to the overall success of the AI infrastructure. You will also engage in research and development, exploring new algorithms and techniques that further improve inference performance.
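To make "CUDA graph optimizations" concrete, the following sketch captures a single decode step in a CUDA graph with PyTorch's torch.cuda.CUDAGraph API so repeated steps replay with minimal kernel-launch overhead. The tiny linear "model", the static shapes, and the decode_step helper are placeholders assumed for illustration, not the actual serving engine described in the role.

```python
# Minimal sketch: capture one decode step in a CUDA graph and replay it.
import torch

device = torch.device("cuda")
model = torch.nn.Linear(4096, 4096).half().to(device).eval()

# CUDA graphs require static input/output buffers that are reused on replay.
static_input = torch.zeros(1, 4096, dtype=torch.half, device=device)
static_output = torch.zeros(1, 4096, dtype=torch.half, device=device)

# Warm up on a side stream before capture (initializes cuBLAS/cuDNN state).
s = torch.cuda.Stream()
s.wait_stream(torch.cuda.current_stream())
with torch.cuda.stream(s), torch.no_grad():
    for _ in range(3):
        static_output.copy_(model(static_input))
torch.cuda.current_stream().wait_stream(s)

# Capture a single decode step into the graph.
graph = torch.cuda.CUDAGraph()
with torch.cuda.graph(graph), torch.no_grad():
    static_output.copy_(model(static_input))

def decode_step(new_hidden: torch.Tensor) -> torch.Tensor:
    # Overwrite the static buffer in place, then replay the captured kernels.
    static_input.copy_(new_hidden)
    graph.replay()
    return static_output.clone()
```

The key design constraint is that every tensor touched during capture must keep a fixed address and shape, which is why serving engines pair graph capture with pre-allocated, static batch and cache buffers.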
What we offer
Together AI provides a dynamic work environment where innovation is encouraged. You will have the opportunity to work on cutting-edge AI technologies and contribute to projects that have a significant impact on the industry. We offer competitive compensation and benefits, along with opportunities for professional growth and development. Join us in shaping the future of AI inference infrastructure and be part of a team that values creativity and collaboration.
Similar Jobs You Might Like

AI Engineer
Coin Market Cap Ltd is hiring an LLM Algorithm Engineer to develop and optimize large language models. You'll work with advanced techniques like SFT and RLHF, utilizing frameworks such as PyTorch and TensorFlow. This position requires 3+ years of experience in the field.

AI Engineer
MongoDB is seeking an LLM Optimization Lead to drive growth via Large Language Models and AI platforms. You'll work on optimizing brand visibility and customer acquisition strategies. This role requires expertise in AI and SEO.

Backend Engineer
Together AI is seeking a Senior Backend Engineer to build and optimize their Inference Platform for advanced generative AI models. You'll work with technologies like Python, Docker, and AWS to enhance performance and scalability. This role requires strong experience in backend engineering and machine learning.

AI Engineer
SonarSource is hiring an LLM Engineer to work on pioneering AI and ML projects within software engineering. You'll develop novel algorithms and enhance system performance. This role requires expertise in machine learning and Python.

Machine Learning Engineer
OpenAI is hiring a Machine Learning Engineer to improve the training throughput of their internal training framework. You'll work with Python, TensorFlow, and PyTorch to enable researchers to experiment with new ideas. This position requires strong engineering skills and knowledge of supercomputer performance.