
About OpenAI
Empowering humanity through safe AI innovation
Key Highlights
- Headquartered in San Francisco, CA with 1,001+ employees
- $68.9 billion raised in funding from top investors
- Launched ChatGPT, gaining 1 million users in 5 days
- 20-week paid parental leave and unlimited PTO policy
OpenAI is a leading AI research and development platform headquartered in the Mission District of San Francisco, CA. With over 1,001 employees, OpenAI has raised $68.9 billion in funding and is known for its groundbreaking products like ChatGPT, which gained over 1 million users within just five day...
🎁 Benefits
OpenAI offers flexible work hours and encourages unlimited paid time off, promoting at least 4 weeks of vacation per year. Employees enjoy comprehensi...
🌟 Culture
OpenAI's culture is centered around its mission to ensure that AGI benefits all of humanity. The company values transparency and ethical consideration...
Overview
OpenAI is hiring a Senior Software Engineer to design and build a load balancer for its research inference stack. You'll work with technologies like Java and Python, focusing on distributed systems and performance optimization. This role requires strong experience in building reliable and efficient systems.
Job Description
Who you are
You have 5+ years of experience in software engineering, particularly in designing and building distributed systems. Your expertise includes working with load balancers and ensuring high availability and performance in complex environments. You are skilled in programming languages such as Java and Python, and you understand the intricacies of network protocols and traffic management.
You possess a strong background in Kubernetes and Docker, enabling you to manage containerized applications effectively. Your experience with observability tools allows you to instrument and debug systems, ensuring they operate smoothly under load. You thrive in collaborative environments, working closely with researchers and machine learning engineers to optimize model performance.
What you'll do
In this role, you will architect and build the gateway and network load balancer that supports OpenAI's research jobs. You will design traffic stickiness and routing strategies to optimize for reliability and throughput, ensuring that requests are handled with millisecond precision. Your responsibilities will include instrumenting and debugging complex distributed systems, focusing on building world-class observability and debuggability tools.
You will own the end-to-end system lifecycle, from design and code to deployment, operation, and scaling. Collaborating closely with cross-functional teams, you will ensure that infrastructure decisions positively impact model performance and training dynamics. Your contributions will be critical in maintaining the reliability and efficiency of OpenAI's AI models.
What we offer
At OpenAI, you will be part of a mission-driven team that believes in the potential of artificial intelligence to solve global challenges. We offer a competitive salary and benefits package, along with opportunities for professional growth and development. Join us in shaping the future of technology and making a meaningful impact in the world.
Similar Jobs You Might Like
Based on your interests and this role

Data Engineer
Exa is seeking a Data Engineer to architect and build the data infrastructure for its search engine. You'll work with technologies like Rust, Kafka, and Flink to develop large-scale data processing systems. This role requires a deep understanding of lakehouse architectures and distributed data systems.

Distributed Systems Engineer
Krea is hiring a Distributed Systems Engineer to design and maintain large-scale distributed infrastructure for AI research and real-time model serving. You'll work with technologies like Kubernetes and Python, and collaborate closely with ML engineers. This position requires experience in distributed systems and cloud deployments.

Software Engineering
OpenAI is hiring a Software Engineer for the Sora team to design and scale infrastructure for multimodal training and evaluation. You'll work with distributed data systems and collaborate closely with researchers. This position requires strong experience in building reliable infrastructure.

Software Engineering
Replit is hiring a Software Engineer for its Compute Platform team to enhance cloud infrastructure and optimize performance. You'll work with distributed systems and cloud technologies. This position requires a strong foundation in software development and experience with cloud technologies.

Software Engineering
Hightouch is seeking a Software Engineer specializing in Distributed Systems to enhance its syncing engine. You'll work on performance optimization and troubleshooting in a multi-cloud infrastructure. This role requires expertise in distributed systems.