Baseten

About Baseten

Simplifying machine learning for every organization

🏢 Tech 👥 21-100 employees 📍 Union Square, San Francisco, CA 💰 $285m
B2B, Analytics, Business Intelligence, Machine Learning

Key Highlights

  • Headquartered in Union Square, San Francisco, CA
  • $285 million raised in Series C funding
  • Team growth of 3x over the last five years
  • Unlimited PTO with a company-wide holiday break

Baseten is a machine learning application builder headquartered in Union Square, San Francisco, CA. With $285 million in funding from investors like Coatue Management and Founders Fund, Baseten simplifies AI integration for businesses, enabling data scientists to deploy ML models without needing spe...

🎁 Benefits

Baseten offers a remote-first work environment with a $1,000 stipend for home office setup, unlimited PTO with a company-wide break during the holiday...

🌟 Culture

Baseten's culture emphasizes simplifying complex AI technologies for businesses, fostering a collaborative environment where team members can connect ...

Overview

Baseten is hiring a GPU Kernel Engineer to optimize performance for cutting-edge AI workloads. You'll work with C, C++, and CUDA in San Francisco. This position requires experience in low-level optimization and machine learning.

Job Description

Who you are

You have a strong background in low-level programming, particularly in C and C++, with a focus on optimizing performance for GPU workloads. Your experience with Linux environments allows you to navigate and troubleshoot complex systems effectively. You understand the intricacies of machine learning models and how to enhance their performance through kernel optimization. You thrive in collaborative settings, working alongside engineers who are equally passionate about pushing the boundaries of AI technology. Your problem-solving skills are top-notch, enabling you to tackle challenges that arise in high-impact systems work. You are eager to contribute to a team that values technical excellence and innovation.

Desirable

Experience with CUDA programming is a plus, as it will enhance your ability to optimize GPU kernels for machine learning applications. Familiarity with AI frameworks and libraries will also be beneficial in understanding the context of your work and its impact on model performance.

What you'll do

As a GPU Kernel Engineer at Baseten, you will be responsible for designing and implementing high-performance GPU kernels that power state-of-the-art machine learning models. You will collaborate with the Model Performance team to drive optimizations that enhance the efficiency of AI workloads. Your work will directly influence the performance of systems serving millions of users, making your contributions critical to the success of our platform. You will engage in code reviews and provide mentorship to junior engineers, fostering a culture of learning and growth within the team. Additionally, you will participate in cross-functional discussions to align engineering efforts with product goals, ensuring that the solutions you develop meet the needs of our users.

What we offer

At Baseten, we provide a dynamic work environment where innovation is encouraged and rewarded. You will have the opportunity to work on groundbreaking projects that shape the future of AI technology. We offer competitive compensation and benefits, along with a commitment to professional development and career growth. Our team is diverse and inclusive, and we believe that varied perspectives lead to better solutions. Join us in our mission to empower AI companies with the tools they need to succeed.

Similar Jobs You Might Like

Based on your interests and this role

Together AI

AI Research Engineer

Together AI 📍 San Francisco - On-Site

Together AI is hiring a Systems Research Engineer specialized in GPU Programming to develop and optimize GPU-accelerated kernels for ML/AI applications. You'll collaborate with cross-functional teams and leverage your expertise in GPU programming and parallel computing. This role requires a strong background in GPU programming techniques.

🏛️ On-Site, Mid-Level
2 weeks ago

OpenAI

Technical Lead

OpenAI 📍 San Francisco

OpenAI is hiring a Technical Lead for the Sora team to optimize model serving efficiency and enhance inference performance. You'll work closely with research and product teams, leveraging your expertise in GPU and kernel-level systems.

Lead
10 months ago

Genmo

GPU Performance Engineer

Genmo 📍 San Francisco - On-Site

Genmo is seeking a GPU Performance Engineer to optimize their H100 infrastructure for video generation. You'll leverage advanced profiling tools and write high-performance CUDA kernels to achieve significant speedups. This role requires 5+ years of systems programming experience.

🏛️ On-Site, Senior
7 months ago

OpenAI

Software Engineering

OpenAI 📍 San Francisco - On-Site

OpenAI is hiring a Software Engineer for their Inference team to optimize and scale inference infrastructure on AMD GPU platforms. You'll work with technologies like Python, CUDA, and Triton. This position requires experience in distributed systems and performance optimization.

🏛️ On-Site, Mid-Level
4 months ago

OpenAI

Software Engineering

OpenAI 📍 San Francisco - On-Site

OpenAI is hiring a Software Engineer for their GPU Infrastructure team to ensure the reliability and uptime of their compute fleet. You'll work with cutting-edge technologies in a high-performance computing environment. This position requires experience in system-level investigations and automation.

🏛️ On-Site, Mid-Level
2 weeks ago