
About Meta (Facebook)
Connecting people through innovative technology
Key Highlights
- Over 2.9 billion monthly active users across platforms
- Headquartered in Menlo Park, California
- Valued at over $800 billion
- Significant investments in Oculus and AR/VR technology
Meta (formerly Facebook) is a leading technology company focused on building the metaverse, with over 2.9 billion monthly active users across its platforms, including Facebook, Instagram, and WhatsApp. Headquartered in Menlo Park, California, Meta has invested heavily in virtual reality and augmente...
Benefits
Meta offers competitive salaries, equity compensation, generous PTO policies, comprehensive health benefits, and a robust parental leave program. Empl...
Culture
Meta fosters a culture of innovation and experimentation, encouraging employees to take risks and explore new ideas. The company emphasizes a mission-...

AI Engineer • Mid-Level
Meta (Facebook) • Menlo Park - On-Site
Overview
Meta is hiring an AI/HPC Systems Performance Engineer to enhance its AI Training and Inference Infrastructure. You'll work with technologies like Linux, Python, and TensorFlow to ensure optimal performance of network systems. This role requires experience in high-performance computing and networking.
Job Description
Who you are
You have a strong background in high-performance computing and networking, with experience in building and optimizing systems that support AI workloads. Your expertise in Linux and Python allows you to troubleshoot and enhance system performance effectively. You understand the intricacies of RDMA workloads and are familiar with lossless fabric interconnects, ensuring that network infrastructure meets stringent performance and availability requirements.
You are skilled in using TensorFlow and Kubernetes, which enables you to manage and deploy AI models efficiently. Your experience with Docker helps you create and manage containerized applications, ensuring smooth integration and deployment across various environments. You thrive in collaborative settings, working closely with cross-functional teams to address scaling challenges and improve system performance.
What you'll do
In this role, you will be responsible for building and evolving Meta's network infrastructure that connects various training accelerators like GPUs. You will ensure that the network operates smoothly and meets the performance requirements for AI workloads. Your daily tasks will involve identifying opportunities for performance improvements across the stack, including network fabric, host networking, and scheduling infrastructure.
You will collaborate with engineers to tackle scaling challenges and implement solutions that enhance the overall efficiency of the AI Training and Inference Infrastructure. Your role will also involve monitoring system performance and availability, troubleshooting issues, and optimizing configurations to meet the demands of rapidly growing AI use cases.
What we offer
Meta provides a dynamic work environment where innovation is encouraged. You will have the opportunity to work on cutting-edge technologies and contribute to projects that have a significant impact on the future of AI. The company offers competitive compensation and benefits, fostering a culture of collaboration and continuous learning. We encourage you to apply even if your experience doesn't match every requirement, as we value diverse perspectives and backgrounds.
Similar Jobs You Might Like
Based on your interests and this role

AI Engineer
Meta is hiring an AI Engineer to tackle scaling challenges in AI Training and Inference Infrastructure. You'll work with technologies like Python and TensorFlow to enhance network performance. This role requires a PhD and expertise in high-performance computing.

AI Engineer
Apple is hiring a Senior AI Infra Performance Engineer to tackle performance challenges in machine learning workloads. You'll work with technologies like C++, Python, and frameworks such as PyTorch and JAX. This position requires 7+ years of experience in large-scale distributed systems.

AI Engineer
Meta is hiring an AI Production Engineer to build and scale production-grade AI systems that enhance executive productivity. You'll work with Python and automation tools to design resilient systems. This role requires strong systems engineering skills.

Product Designer
Meta is hiring a Product Design Engineer to build cutting-edge AI developer tools. You'll work on tooling and platforms that power Meta's AI efforts, contributing to product strategy and user experience design. This role requires a strong background in design and engineering.

AI Engineer
Meta is hiring an AI Capacity Planning Engineer to focus on AI strategy and planning projects. You'll work cross-functionally to optimize AI computing resources. This role requires experience in performance and capacity engineering.