About Nebius AI

Empowering AI with robust infrastructure solutions

🏢 Tech • 👥 51-250 • 📅 Founded 2022 • 📍 Amsterdam, North Holland, Netherlands

Key Highlights

Publicly traded on Nasdaq and expanding in the AI infrastructure market
Headquartered in Amsterdam with hubs in the US, Europe, and Israel
Team of around 400 skilled engineers focused on AI/ML
Specializes in large-scale GPU clusters and cloud platforms

Nebius is a Nasdaq-listed company headquartered in Amsterdam, specializing in AI infrastructure solutions. With a team of around 400 engineers, Nebius provides large-scale GPU clusters and cloud platforms designed to support the rapid growth of the AI industry. The company has established R&D and co...

🎁 Benefits

Nebius offers competitive equity packages, a flexible PTO policy, and opportunities for remote work. Employees also benefit from a learning budget to ...

🌟 Culture

Nebius fosters a culture centered around engineering excellence and innovation in AI infrastructure. The company values collaboration across its globa...


HPC Cluster Engineer • Senior

Nebius AI • Prague - Remote

Posted 22h ago • 🏠 Remote • Senior HPC Cluster Engineer • 📍 Prague


Skills & Technologies

Linux, KVM, QEMU, GPU, InfiniBand

Overview

Nebius AI is seeking a Senior HPC Cluster Engineer to enhance and optimize their cloud platform focused on GPU computing and InfiniBand networks. You'll work with technologies like KVM and QEMU to ensure high performance in multi-GPU environments. This role requires expertise in Linux and virtualization technologies.

Job Description

Who you are

You have extensive experience in high-performance computing (HPC) environments, particularly with GPU clusters and InfiniBand networks. Your background includes a strong understanding of Linux systems and virtualization technologies such as KVM and QEMU. You are skilled in performance tuning and have a knack for troubleshooting complex infrastructure issues. You thrive in collaborative settings, working closely with hardware and software teams to optimize system performance. You are proactive in automating fault detection and resolution processes, ensuring the reliability of HPC systems. You are passionate about AI and cloud computing, eager to contribute to innovative solutions that drive the AI economy forward.

Desirable

Experience with cloud infrastructure and a solid understanding of AI/ML workloads would be a plus. Familiarity with automation tools and scripting languages can enhance your effectiveness in this role. A background in working with large-scale systems and a keen interest in emerging technologies will set you apart.

What you'll do

As a Senior HPC Cluster Engineer at Nebius AI, you will play a pivotal role in the development of our cutting-edge hyperscaler platform. You will be responsible for tuning the performance of GPU clusters and InfiniBand networks, ensuring optimal operation in multi-GPU environments. Your role will involve analyzing and improving infrastructure to support new hardware, as well as fine-tuning system performance to meet the demands of our AI-driven applications. You will collaborate with cross-functional teams to implement enhancements and optimizations that elevate our cloud platform's capabilities. Additionally, you will automate fault detection and resolution processes, contributing to the overall reliability and efficiency of our systems. Your expertise will help shape the future of AI cloud infrastructure, making a significant impact on how industries leverage AI technologies.

What we offer

At Nebius AI, we provide a competitive salary and a comprehensive benefits package designed to support your professional growth. You will have opportunities for career advancement within our organization, as we are committed to fostering talent and innovation. We offer flexible working arrangements to accommodate your lifestyle, along with a collaborative work environment that values initiative and creativity. Join us in our mission to transform industries through AI and cloud computing, and be part of a team that is at the forefront of technological advancement.

