About Genmo

Your AI partner for creative content generation

🏢 Tech 📅 Founded 2023

Key Highlights

Raised $15 million in funding from leading investors
Headquartered in San Francisco, CA
Focus on generative AI for creative industries
Supports creators in entertainment and marketing sectors

Genmo is a pioneering AI company that provides a creative copilot tool enabling users to generate images, videos, and 3D models through advanced generative models. With a focus on Creative General Intelligence, Genmo empowers creators across various industries, including entertainment and marketing,...

🎁 Benefits

Genmo offers competitive salaries, equity options, flexible remote work policies, and generous PTO to support work-life balance....

🌟 Culture

Genmo fosters a culture of innovation and creativity, encouraging employees to experiment with AI technologies while promoting a collaborative environ...


GPU Performance Engineer • Senior

Genmo • San Francisco - On-Site

Posted 7 months ago


Skills & Technologies

CUDA, Nsight Systems, nvprof, Triton

Overview

Genmo is seeking a GPU Performance Engineer to optimize their H100 infrastructure for video generation. You'll leverage advanced profiling tools and write high-performance CUDA kernels to achieve significant speedups. This role requires 5+ years of systems programming experience.

Job Description

Who you are

You have a Bachelor's or Master's degree in Computer Science, Electrical Engineering, or a related field, and you bring over 5 years of systems programming experience. Your expertise lies in performance optimization, and you thrive on squeezing every last FLOP from GPU infrastructure. You are passionate about microsecond optimizations and enjoy pushing hardware to its theoretical limits.

You are proficient in using advanced profiling tools such as Nsight Systems and nvprof, and you have experience writing high-performance CUDA and Triton kernels for critical model operations. Your understanding of GPU workloads allows you to optimize cold start latency and tune memory access patterns effectively.

You have a collaborative spirit, working closely with ML engineers to optimize model implementations and debug performance issues across the full stack from application to hardware. Your ability to implement custom memory pooling and allocation strategies showcases your innovative approach to performance challenges.
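
As a rough illustration of what "custom memory pooling and allocation strategies" can mean, here is a minimal sketch in Python. It is not Genmo's implementation: the `BufferPool` class, its method names, and the power-of-two size buckets are all assumptions made for this example, and a `bytearray` stands in for a GPU allocation.

```python
from collections import defaultdict


class BufferPool:
    """Toy size-bucketed pool (hypothetical): freed buffers are cached and reused."""

    def __init__(self):
        self._free = defaultdict(list)  # rounded size -> list of idle buffers
        self.fresh_allocations = 0      # how often a new buffer had to be allocated
        self.reuses = 0                 # how often a cached buffer was handed back out

    @staticmethod
    def _round_up(nbytes):
        # Round up to the next power of two so nearby sizes share a bucket.
        size = 1
        while size < nbytes:
            size *= 2
        return size

    def acquire(self, nbytes):
        size = self._round_up(nbytes)
        bucket = self._free[size]
        if bucket:
            self.reuses += 1
            return bucket.pop()
        self.fresh_allocations += 1
        return bytearray(size)  # stand-in for a real device allocation

    def release(self, buf):
        # Return the buffer to the bucket matching its (already rounded) size.
        self._free[len(buf)].append(buf)


pool = BufferPool()
a = pool.acquire(1000)  # no cached buffer yet, so a fresh 1024-byte buffer is created
pool.release(a)
b = pool.acquire(900)   # also rounds to 1024, so the cached buffer is reused
```

The idea is the same one behind caching allocators in GPU frameworks: once a buffer has been allocated, later requests of a similar size are served from the cache instead of going back to the underlying allocator.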

What you'll do

In this role, you will be the performance optimization expert at Genmo, focusing on maximizing the efficiency of our H100 infrastructure. You will profile and optimize GPU workloads, ensuring that our model serving stack operates at peak performance. Your responsibilities will include writing custom CUDA and Triton kernels, as well as reducing cold start latency from seconds to milliseconds.

You will collaborate with ML engineers to enhance model implementations, debug performance issues, and implement custom memory pooling strategies. Your work will directly contribute to achieving 5-10x speedups in our infrastructure, making a significant impact on our video generation capabilities.

You will also share optimization techniques and foster a performance culture across teams, ensuring that best practices are adopted throughout the organization. Your role will be pivotal in shaping the future of AI at Genmo, as you push the boundaries of what's possible in video generation.

What we offer

At Genmo, you will be part of a cutting-edge research lab dedicated to advancing AI technology. We offer a collaborative work environment where innovation is encouraged, and your contributions will be valued. You will have the opportunity to work with state-of-the-art models and infrastructure, making a real impact in the field of AI.

We believe in supporting our employees' growth and development, providing opportunities for continuous learning and professional advancement. Join us in shaping the future of AI and unlocking the potential of video generation.

