
About Reflection
Unlocking knowledge with AI for smarter organizations
Key Highlights
- Headquartered in Brooklyn, New York
- AI-powered platform using natural language processing
- Focused on eliminating information silos
- Team size of 11-50 employees
ReflectionAI, headquartered in Brooklyn, New York, provides an AI-driven knowledge management platform that leverages natural language processing to transform unstructured information from meetings, documents, and conversations into a searchable knowledge base. With a focus on enhancing productivity...
Benefits
Employees at ReflectionAI enjoy competitive salaries, equity options, flexible remote work policies, and generous PTO to maintain a healthy work-life balance.
Culture
ReflectionAI fosters a culture of innovation and collaboration, encouraging employees to contribute ideas and solutions while prioritizing work-life balance.

AI Engineer • Mid-Level
Reflection • San Francisco - On-Site
Overview
Reflection is hiring a Member of Technical Staff - GPU Infrastructure to design and operate large-scale GPU infrastructure. You'll work with technologies like CUDA, PyTorch, and Kubernetes to optimize performance and reliability. This position requires deep systems engineering experience in high-performance computing environments.
Job Description
Who you are
You have deep systems or infrastructure engineering experience in high-performance or distributed computing environments: you've designed and built reliable systems that operate at scale, ensuring optimal performance and efficiency. Your strong understanding of GPUs, CUDA, and NCCL allows you to push the limits of hardware and software in large-scale training and inference.
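To ground the NCCL reference, here is a minimal, hedged sketch (not Reflection's actual code) of initializing torch.distributed with the NCCL backend and running an all-reduce across GPUs. It assumes one process per GPU launched by a tool like torchrun, which sets RANK, WORLD_SIZE, and LOCAL_RANK in the environment.

```python
# Minimal sketch: NCCL-backed all-reduce with torch.distributed.
# Assumes one process per GPU, launched with e.g.
#   torchrun --nproc_per_node=8 allreduce_sketch.py
# Illustrative only, not Reflection's infrastructure code.
import os
import torch
import torch.distributed as dist

def main():
    # torchrun populates RANK, WORLD_SIZE, and LOCAL_RANK.
    dist.init_process_group(backend="nccl")
    local_rank = int(os.environ["LOCAL_RANK"])
    torch.cuda.set_device(local_rank)

    # Each rank contributes a tensor; NCCL sums them across all GPUs.
    x = torch.ones(4, device="cuda") * dist.get_rank()
    dist.all_reduce(x, op=dist.ReduceOp.SUM)

    if dist.get_rank() == 0:
        # Each element is the sum 0 + 1 + ... + (world_size - 1).
        print(f"after all_reduce: {x.tolist()}")
    dist.destroy_process_group()

if __name__ == "__main__":
    main()
```

At scale, this same collective primitive underlies gradient synchronization in data-parallel training.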
Hands-on experience with containerization and orchestration tools like Kubernetes or Slurm is essential: you've managed clusters and optimized resource utilization so that thousands of GPUs are used effectively. You thrive in a fast-paced, high-ownership startup environment, acting with high agency and collaborating closely with cross-functional teams.
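As a concrete (and much simplified) illustration of tracking GPU utilization on Kubernetes, the sketch below uses the official kubernetes Python client to compare requested versus allocatable nvidia.com/gpu resources per node. The resource name and kubeconfig access are assumptions, and production schedulers track far more than this.

```python
# Sketch: tally requested vs. allocatable GPUs per node.
# Assumes kubeconfig access and NVIDIA's "nvidia.com/gpu" resource name.
from collections import defaultdict
from kubernetes import client, config

def gpu_utilization_by_node():
    config.load_kube_config()  # or load_incluster_config() inside a pod
    v1 = client.CoreV1Api()

    # Allocatable GPU capacity reported by each node.
    allocatable = {
        node.metadata.name: int(node.status.allocatable.get("nvidia.com/gpu", "0"))
        for node in v1.list_node().items
    }

    # Sum GPU requests of running pods per node.
    requested = defaultdict(int)
    pods = v1.list_pod_for_all_namespaces(field_selector="status.phase=Running")
    for pod in pods.items:
        for container in pod.spec.containers:
            requests = container.resources.requests or {}
            requested[pod.spec.node_name] += int(requests.get("nvidia.com/gpu", "0"))

    for node, total in allocatable.items():
        if total:
            print(f"{node}: {requested[node]}/{total} GPUs requested")

if __name__ == "__main__":
    gpu_utilization_by_node()
```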
Your familiarity with modern observability stacks and performance profiling tools enables you to monitor and improve system performance; you understand how central observability is to keeping complex systems reliable and efficient. You are passionate about building tools and automation for distributed training and inference, accelerating AI research and development.
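For a flavor of the profiling workflow, here is a generic torch.profiler sketch (not the team's tooling) that captures CPU and CUDA activity over a few steps of a toy model and exports a trace viewable in TensorBoard:

```python
# Sketch: profile a few training steps with torch.profiler; illustrative only.
import torch
from torch.profiler import profile, schedule, ProfilerActivity

model = torch.nn.Linear(1024, 1024).cuda()
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

with profile(
    activities=[ProfilerActivity.CPU, ProfilerActivity.CUDA],
    schedule=schedule(wait=1, warmup=1, active=3),
    on_trace_ready=torch.profiler.tensorboard_trace_handler("./profiler_logs"),
) as prof:
    for _ in range(5):
        x = torch.randn(512, 1024, device="cuda")
        loss = model(x).sum()
        loss.backward()
        optimizer.step()
        optimizer.zero_grad()
        prof.step()  # advances the wait/warmup/active schedule

# Summarize the hottest CUDA ops from the profiled steps.
print(prof.key_averages().table(sort_by="cuda_time_total", row_limit=10))
```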
Desirable
Experience with large-scale training frameworks and libraries such as PyTorch, DeepSpeed, JAX, and Megatron-LM is a plus; you are eager to learn and adapt to new technologies that can extend the system's capabilities. A background in AI research, or experience collaborating with research teams, will further strengthen your contributions.
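As a small example of hands-on work with these frameworks, here is a minimal PyTorch DistributedDataParallel training loop. The toy model and launch command are assumptions standing in for real workloads:

```python
# Sketch: wrapping a model in DistributedDataParallel; illustrative only.
# Launch with e.g. `torchrun --nproc_per_node=8 ddp_sketch.py`.
import os
import torch
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel as DDP

dist.init_process_group(backend="nccl")
local_rank = int(os.environ["LOCAL_RANK"])
torch.cuda.set_device(local_rank)

# DDP replicates the model per rank and all-reduces gradients via NCCL.
model = DDP(torch.nn.Linear(1024, 1024).cuda(), device_ids=[local_rank])
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)

for step in range(10):
    x = torch.randn(32, 1024, device="cuda")
    loss = model(x).sum()
    loss.backward()       # gradient synchronization happens here
    optimizer.step()
    optimizer.zero_grad()

dist.destroy_process_group()
```

Frameworks like DeepSpeed and Megatron-LM build on the same foundations, adding sharded optimizer states, pipeline stages, and tensor parallelism.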
What you'll do
In this role, you will design, build, and operate Reflection's large-scale GPU infrastructure; your work will power pre-training, post-training, and inference for cutting-edge AI models. You will develop reliable, high-performance systems for scheduling, orchestration, and observability across thousands of GPUs, ensuring that the infrastructure meets the demands of AI research and deployment.
Your responsibilities will include optimizing cluster utilization, throughput, and cost efficiency while maintaining reliability at scale; you will push the limits of hardware, networking, and software to accelerate the path from idea to model. Collaborating closely with research, training, and platform teams, you will enable large-scale training and inference, contributing to the advancement of open superintelligence.
You will build tools and automation for distributed training, inference, monitoring, and experiment management; your contributions will directly impact the efficiency and effectiveness of AI research at Reflection. You will also engage in performance profiling and optimization, ensuring that the infrastructure remains robust and responsive to the needs of the team.
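To illustrate the monitoring-automation theme, here is a hedged sketch using NVIDIA's pynvml bindings (not Reflection's tooling) that samples per-GPU utilization and memory, the kind of signal a real monitor would export to an observability stack:

```python
# Sketch: sample per-GPU utilization and memory with pynvml; illustrative only.
import time
import pynvml

pynvml.nvmlInit()
try:
    count = pynvml.nvmlDeviceGetCount()
    for _ in range(3):  # a real monitor would run continuously and export metrics
        for i in range(count):
            handle = pynvml.nvmlDeviceGetHandleByIndex(i)
            util = pynvml.nvmlDeviceGetUtilizationRates(handle)
            mem = pynvml.nvmlDeviceGetMemoryInfo(handle)
            print(
                f"gpu{i}: util={util.gpu}% "
                f"mem={mem.used / 2**30:.1f}/{mem.total / 2**30:.1f} GiB"
            )
        time.sleep(5)
finally:
    pynvml.nvmlShutdown()
```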
What we offer
At Reflection, we provide a supportive and inclusive work environment where you can thrive. We offer fully paid parental leave for all new parents, including adoptive and surrogate journeys, and financial support for family planning. Our benefits include paid time off when you need it, relocation support, and perks that optimize your time.
You will have opportunities to connect with teammates through daily lunches and dinners, as well as regular off-sites and team celebrations; we believe in fostering a strong team culture that encourages collaboration and innovation. Join us in our mission to build open superintelligence and make it accessible to all.
Similar Jobs You Might Like

Staff Software Engineer
Cohere is hiring a Staff Software Engineer for their GPU Infrastructure team to build and operate superclusters for AI model training. You'll work with technologies like Python, Kubernetes, and AWS. This role requires expertise in high-performance computing (HPC) and cloud infrastructure.

Software Engineering
Reflection is hiring a Member of Technical Staff - Software Engineer to build core software systems and tools for AI research and production. You'll work with technologies like Python, Java, and C++ in San Francisco.

Software Engineering
OpenAI is hiring a Software Engineer for their GPU Infrastructure team to ensure the reliability and uptime of their compute fleet. You'll work with cutting-edge technologies in a high-performance computing environment. This position requires experience in system-level investigations and automation.

GPU Kernel Engineer
Baseten is hiring a GPU Kernel Engineer to optimize performance for cutting-edge AI workloads. You'll work with C, C++, and CUDA in San Francisco. This position requires experience in low-level optimization and machine learning.

GPU Cloud Platform Engineer
Yotta Labs is hiring a GPU Cloud Platform Engineer to design and operate large-scale GPU infrastructure for AI workloads. You'll work with technologies like Kubernetes and Docker to ensure high availability and performance. This role requires experience in high-performance computing and cloud environments.