OpenAI

About OpenAI

Empowering humanity through safe AI innovation

🏢 Tech | 👥 1001+ employees | 📅 Founded 2015 | 📍 Mission District, San Francisco, CA | 💰 $68.9b | 4.2
B2C, B2B, Artificial Intelligence, Enterprise, SaaS, API, DevOps

Key Highlights

  • Headquartered in San Francisco, CA with 1,001+ employees
  • $68.9 billion raised in funding from top investors
  • Launched ChatGPT, gaining 1 million users in 5 days
  • 20-week paid parental leave and unlimited PTO policy

OpenAI is a leading AI research and development platform headquartered in the Mission District of San Francisco, CA. With over 1,001 employees, OpenAI has raised $68.9 billion in funding and is known for its groundbreaking products like ChatGPT, which gained over 1 million users within just five day...

🎁 Benefits

OpenAI offers flexible work hours and encourages unlimited paid time off, promoting at least 4 weeks of vacation per year. Employees enjoy comprehensi...

🌟 Culture

OpenAI's culture is centered around its mission to ensure that AGI benefits all of humanity. The company values transparency and ethical consideration...

Overview

OpenAI is hiring a Software Engineer for their Data Infrastructure team to design and implement dataset infrastructure for next-generation training stacks. You'll work with technologies like Python, Docker, and AWS in San Francisco.

Job Description

Who you are

You have a strong background in software engineering, particularly in designing and implementing scalable data infrastructure. With experience in Python and familiarity with containerization technologies like Docker and orchestration tools such as Kubernetes, you are well-equipped to handle the complexities of large-scale data systems. You understand the importance of performance and efficiency in data pipelines and have a proactive approach to identifying and resolving bottlenecks.

Your collaborative spirit shines through as you work closely with researchers and other infrastructure teams to ensure seamless integration of datasets into training and inference pipelines. You are detail-oriented, ensuring that dataset interfaces are standardized, discoverable, and easy for other teams to adopt. Your ability to document processes and maintain clear communication is key to fostering a productive team environment.

Desirable

Experience with cloud platforms like AWS is a plus, as is familiarity with data storage solutions such as Elasticsearch. You may also have exposure to machine learning frameworks, which will enhance your contributions to the team.

What you'll do

In this role, you will design and maintain standardized dataset APIs for multimodal data that is too large to fit in memory. You will build proactive testing and scale-validation pipelines for dataset loading at GPU scale, so that the infrastructure can handle the demands of OpenAI's next-generation models. Working with teammates, you will integrate datasets into training and inference pipelines so that they are straightforward for users to work with.

Your responsibilities will include documenting and maintaining dataset interfaces, establishing safeguards and validation systems to ensure datasets remain reproducible, and proactively testing for performance bottlenecks. You will play a crucial role in enabling researchers to focus on advancing model capabilities while you handle the scale, efficiency, and reliability required to bring those models to life.

What we offer

At OpenAI, you will be part of a mission-driven team that believes in the potential of artificial intelligence to solve global challenges. We offer a collaborative work environment where your contributions will directly impact the future of technology. Join us in shaping the future of AI and enjoy the opportunity to work with cutting-edge technologies in a supportive and innovative atmosphere.
