
About OpenAI
Empowering humanity through safe AI innovation
Key Highlights
- Headquartered in San Francisco, CA with 1,001+ employees
- $68.9 billion raised in funding from top investors
- Launched ChatGPT, gaining 1 million users in 5 days
- 20-week paid parental leave and unlimited PTO policy
OpenAI is a leading AI research and development company headquartered in the Mission District of San Francisco, CA. With more than 1,000 employees, OpenAI has raised $68.9 billion in funding and is known for groundbreaking products such as ChatGPT, which gained over 1 million users within just five days.
🎁 Benefits
OpenAI offers flexible work hours and unlimited paid time off, encouraging at least 4 weeks of vacation per year. Employees also enjoy comprehensive benefits.
🌟 Culture
OpenAI's culture is centered on its mission to ensure that AGI benefits all of humanity. The company values transparency and ethical considerations.
Overview
OpenAI is hiring a Software Engineer for their Model Inference team to optimize AI models for high-volume production environments. You'll work with Azure and Python to enhance model performance and efficiency. This position requires 5+ years of experience in software engineering.
Job Description
Who you are
You have at least 5 years of experience in software engineering, particularly in optimizing machine learning models for production environments. You possess a strong understanding of modern ML architectures and are adept at enhancing their performance, especially for inference tasks. Your problem-solving skills are top-notch, and you are willing to learn and adapt to tackle challenges effectively. You thrive in collaborative environments, working alongside researchers and engineers to bring cutting-edge technologies into production. You have experience with cloud platforms, particularly Azure, and understand how to leverage them for optimal performance.
Desirable
Experience with high-volume, low-latency systems is a plus, as is familiarity with tools that provide visibility into system performance and bottlenecks. You are comfortable working in a fast-paced environment and are eager to contribute to innovative solutions that push the boundaries of AI technology.
What you'll do
In this role, you will collaborate with machine learning researchers and product managers to optimize OpenAI's advanced AI models for production use. You will introduce new techniques and tools to improve the performance, latency, and efficiency of the model inference stack. Your responsibilities will include building tools to identify bottlenecks and sources of instability, then designing and implementing solutions to address these issues. You will optimize code and manage a fleet of Azure VMs to maximize hardware utilization, ensuring that every FLOP and GB of GPU RAM is effectively used. You will also contribute to the overall research progression by enabling advanced research through robust engineering practices.
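As a rough illustration of the kind of visibility tooling this role describes (a sketch, not OpenAI's actual stack), the first step in finding inference bottlenecks is usually a latency harness: time each request and report tail percentiles. The `fake_inference` function below is a hypothetical stand-in for a model forward pass.

```python
import statistics
import time


def measure_latency(fn, n_requests=200):
    """Time n_requests calls to fn and return p50/p99 latency in milliseconds."""
    samples = []
    for _ in range(n_requests):
        start = time.perf_counter()
        fn()
        samples.append((time.perf_counter() - start) * 1000.0)
    samples.sort()
    return {
        "p50_ms": statistics.median(samples),
        # Nearest-rank p99; real serving tools track this per endpoint over time.
        "p99_ms": samples[int(0.99 * (len(samples) - 1))],
    }


def fake_inference():
    # Hypothetical stand-in for a model forward pass; production tooling
    # would wrap the actual serving path instead.
    sum(i * i for i in range(1000))


stats = measure_latency(fake_inference)
print(f"p50={stats['p50_ms']:.3f}ms p99={stats['p99_ms']:.3f}ms")
```

A large gap between p50 and p99 is the usual signal of instability (GC pauses, batching stalls, contention), which the team would then trace and fix.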
What we offer
At OpenAI, you will be part of a mission-driven team that believes in the transformative potential of artificial intelligence. We offer a collaborative work environment where your contributions will directly impact the future of technology. You will have opportunities for professional growth and development, working with some of the brightest minds in the field. We are committed to providing reasonable accommodations to applicants with disabilities, ensuring an inclusive workplace for all. Join us in shaping the future of AI and making a difference in the world.