
About Baseten
Simplifying machine learning for every organization
Key Highlights
- Headquartered in Union Square, San Francisco, CA
- $285 million raised in Series C funding
- Team growth of 3x over the last five years
- Unlimited PTO with a company-wide holiday break
Baseten is a machine learning application builder headquartered in Union Square, San Francisco, CA. With $285 million in funding from investors like Coatue Management and Founders Fund, Baseten simplifies AI integration for businesses, enabling data scientists to deploy ML models without needing spe...
🎁 Benefits
Baseten offers a remote-first work environment with a $1,000 stipend for home office setup, unlimited PTO with a company-wide break during the holiday...
🌟 Culture
Baseten's culture emphasizes simplifying complex AI technologies for businesses, fostering a collaborative environment where team members can connect ...
Overview
Baseten is hiring a GPU Kernel Engineer to optimize performance for cutting-edge AI workloads. You'll work with C, C++, and CUDA in San Francisco. This position requires experience in low-level optimization and machine learning.
Job Description
Who you are
You have a strong background in low-level programming, particularly in C and C++, with a focus on optimizing performance for GPU workloads. Your experience with Linux environments allows you to navigate and troubleshoot complex systems effectively. You understand the intricacies of machine learning models and how to enhance their performance through kernel optimization. You thrive in collaborative settings, working alongside engineers who are equally passionate about pushing the boundaries of AI technology. Your problem-solving skills are top-notch, enabling you to tackle challenges that arise in high-impact systems work. You are eager to contribute to a team that values technical excellence and innovation.
Desirable
Experience with CUDA programming is a plus, as it will enhance your ability to optimize GPU kernels for machine learning applications. Familiarity with AI frameworks and libraries will also be beneficial in understanding the context of your work and its impact on model performance.
What you'll do
As a GPU Kernel Engineer at Baseten, you will be responsible for designing and implementing high-performance GPU kernels that power state-of-the-art machine learning models. You will collaborate with the Model Performance team to drive optimizations that enhance the efficiency of AI workloads. Your work will directly influence the performance of systems serving millions of users, making your contributions critical to the success of our platform. You will engage in code reviews and provide mentorship to junior engineers, fostering a culture of learning and growth within the team. Additionally, you will participate in cross-functional discussions to align engineering efforts with product goals, ensuring that the solutions you develop meet the needs of our users.
What we offer
At Baseten, we provide a dynamic work environment where innovation is encouraged and rewarded. You will have the opportunity to work on groundbreaking projects that shape the future of AI technology. We offer competitive compensation and benefits, along with a commitment to professional development and career growth. Our team is diverse and inclusive, and we believe that varied perspectives lead to better solutions. Join us in our mission to empower AI companies with the tools they need to succeed.
Similar Jobs You Might Like
Based on your interests and this role

AI Research Engineer
Together AI is hiring a Systems Research Engineer specializing in GPU programming to develop and optimize GPU-accelerated kernels for ML/AI applications. You'll collaborate with cross-functional teams and apply your expertise in GPU programming and parallel computing. This role requires a strong background in GPU programming techniques.

Technical Lead
OpenAI is hiring a Technical Lead for the Sora team to optimize model serving efficiency and enhance inference performance. You'll work closely with research and product teams, leveraging your expertise in GPU and kernel-level systems.

GPU Performance Engineer
Genmo is seeking a GPU Performance Engineer to optimize their H100 infrastructure for video generation. You'll leverage advanced profiling tools and write high-performance CUDA kernels to achieve significant speedups. This role requires 5+ years of systems programming experience.

Software Engineering
OpenAI is hiring a Software Engineer for their Inference team to optimize and scale inference infrastructure on AMD GPU platforms. You'll work with technologies like Python, CUDA, and Triton. This position requires experience in distributed systems and performance optimization.

Software Engineering
OpenAI is hiring a Software Engineer for their GPU Infrastructure team to ensure the reliability and uptime of their compute fleet. You'll work with cutting-edge technologies in a high-performance computing environment. This position requires experience in system-level investigations and automation.