Nebius AI

About Nebius AI

Empowering AI with robust infrastructure solutions

🏢 Tech · 👥 51-250 · 📅 Founded 2022 · 📍 Amsterdam, North Holland, Netherlands

Key Highlights

  • Publicly traded on Nasdaq and expanding in the AI infrastructure market
  • Headquartered in Amsterdam with hubs in the US, Europe, and Israel
  • Team of around 400 skilled engineers focused on AI/ML
  • Specializes in large-scale GPU clusters and cloud platforms

Nebius is a Nasdaq-listed company headquartered in Amsterdam, specializing in AI infrastructure solutions. With a team of around 400 engineers, Nebius provides large-scale GPU clusters and cloud platforms designed to support the rapid growth of the AI industry. The company has established R&D and co...

🎁 Benefits

Nebius offers competitive equity packages, a flexible PTO policy, and opportunities for remote work. Employees also benefit from a learning budget to ...

🌟 Culture

Nebius fosters a culture centered around engineering excellence and innovation in AI infrastructure. The company values collaboration across its globa...

Senior Site Reliability Engineer

Nebius AI 📍 Berlin - Remote

Overview

Nebius AI is seeking a Senior Site Reliability Engineer to ensure fault tolerance and scale for its cloud services. You'll work with technologies like Go, Python, and Kubernetes to solve infrastructure challenges. This role requires solid experience in programming and Unix systems.

Job Description

Who you are

You have solid experience with programming languages like Go, Python, or C++, and you have tackled complex problems and built reliable systems that scale. Your understanding of classic algorithms and data structures lets you optimize solutions effectively. You have commercial experience with Unix systems and network technology, so you can manage the intricacies of modern infrastructure. Your familiarity with containerization and configuration management tools such as Ansible, Terraform, Docker, Kubernetes, and Helm means you can implement and improve CI/CD processes. You are eager to be involved in backend development and have experience designing, developing, and running high-load distributed systems. Commercial experience with various cloud platforms strengthens your ability to contribute to our projects.

What you'll do

In this role, you will ensure fault tolerance, scale, and uninterrupted operations for Nebius AI's services, and your expertise will be crucial in maintaining the reliability of our cloud infrastructure. You will use cloud technology to solve a variety of infrastructure problems, implementing solutions that improve performance and efficiency. Your responsibilities will include implementing and improving CI/CD processes so that our deployment pipelines are robust and efficient. You will collaborate with a team of experienced engineers, sharing knowledge and best practices to foster a culture of continuous improvement. As you tackle infrastructure challenges, you will have the opportunity to contribute to the design and development of high-load distributed systems. You will also engage in coding interviews as part of the hiring process, helping to shape the future of our engineering team.

What we offer

At Nebius AI, we provide a competitive salary and a comprehensive benefits package that supports your professional growth. You will have opportunities for career advancement within our organization as we continue to expand our products and services. We offer flexible working arrangements, allowing you to balance your professional and personal life effectively. Our dynamic and collaborative work environment values initiative and innovation, encouraging you to bring your ideas to the table. As we grow, you will be part of a team that is at the forefront of AI cloud infrastructure, working on projects that transform industries and solve real-world challenges.
