About Scaleway

Your cloud ecosystem for sustainable growth

🏢 Tech • 👥 501-1000 employees • 📅 Founded 1999 • 📍 Madeleine, Paris, France • ⭐ 2.8

B2B • API • Cloud Computing

Key Highlights

Over 38,000 customers in 160 countries
Headquartered in Paris with 501-1000 employees
Part of the Iliad Group with profit-sharing options
Offers a Startup Program with cloud credits and expertise

Scaleway, headquartered in Paris, France, is a leading cloud computing provider that empowers over 38,000 businesses across 160 countries. With a focus on sustainability and flexibility, Scaleway offers a comprehensive cloud ecosystem, including bare metal, containerization, and serverless architect...

🎁 Benefits

Scaleway offers a range of benefits including 75% reimbursement on public transportation, a €200 annual discount on Scaleway Elements, and profit-shar...

🌟 Culture

Scaleway fosters a culture centered around sustainability and flexibility, making it an attractive choice for startups. The company emphasizes a smoot...

Site Reliability Engineer

Scaleway • Paris - On-Site

Posted 5 months ago • 🏛️ On-Site • Site Reliability Engineer • 📍 Paris • 💰 $3 - $3 / daily

Skills & Technologies

kubernetes docker aws linux prometheus grafana

Overview

Scaleway is hiring a Site Reliability Engineer in Paris to build and maintain reliable infrastructure for AI GPU clusters. You'll work with technologies like Kubernetes, Docker, and AWS. This position requires experience managing production environments.

Job Description

Who you are

You have a strong background in site reliability engineering, with experience in building and maintaining reliable, observable, and secure infrastructure. Your expertise in managing production environments ensures optimal service availability for customers around the world. You are familiar with cloud computing technologies and have a passion for supporting AI initiatives. Your collaborative spirit allows you to thrive in diverse teams, and you are committed to technical excellence.

You possess a deep understanding of containerization and orchestration tools, particularly Kubernetes and Docker. Your experience with monitoring and alerting tools like Prometheus and Grafana enables you to proactively manage system performance and reliability. You are comfortable working in a Linux environment and have experience with cloud platforms such as AWS. Your problem-solving skills and attention to detail help you identify and resolve issues efficiently.

What you'll do

In this role, you will be responsible for building and maintaining the infrastructure that supports Scaleway's AI GPU clusters. You will work closely with engineering teams to ensure that the systems are reliable and scalable. Your mission will involve implementing best practices for infrastructure management, including CI/CD pipelines and automation. You will monitor system performance and respond to incidents to minimize downtime and ensure service availability.

You will collaborate with cross-functional teams to design and implement solutions that meet the needs of our customers. Your role will also involve capacity planning and optimizing resource usage to support Scaleway's growth. You will contribute to the development of documentation and training materials to help onboard new team members and improve team efficiency.

What we offer

Scaleway offers a dynamic work environment where you can contribute to shaping the future of cloud computing. You will have the opportunity to work on cutting-edge technologies and be part of a team that values collaboration and innovation. We encourage you to apply even if your experience doesn't match every requirement. Join us in building a sovereign cloud alternative that supports ambitious companies across Europe.

Similar Jobs You Might Like

Based on your interests and this role

Site Reliability Engineer

Scaleway • 📍 Paris

Scaleway is hiring a Site Reliability Engineer to build and maintain reliable, observable, and secure infrastructure. You'll work with technologies like Docker, Kubernetes, and AWS to ensure optimal service availability. This role requires experience in cloud computing and infrastructure management.

1 year ago

Site Reliability Engineer

Scaleway • 📍 Paris - On-Site

Scaleway is hiring a Site Reliability Engineer to enhance the reliability and performance of their network products. You'll work with technologies like Linux, Docker, and Kubernetes to automate and monitor infrastructure. This role requires expertise in SRE practices and tools.

🏛️ On-Site

9 months ago

Site Reliability Engineer

amo • 📍 Paris

amo is hiring a Lead Site Reliability Engineer (SRE) to ensure their systems handle high traffic and maintain performance and reliability. You'll work with technologies like ScyllaDB and focus on automation and system design. This role requires strong leadership and experience in distributed systems.

Lead

1 year ago

Site Reliability Engineer

Scaleway • 📍 Paris - On-Site

Scaleway is hiring a Site Reliability Engineer to ensure the robustness and performance of their cloud services. You'll work with technologies like Linux, Docker, and Kubernetes in a collaborative environment based in Paris.

🏛️ On-Site • Mid-Level

11 months ago

Site Reliability Engineer

Algolia • 📍 Paris

Algolia is seeking a Senior Site Reliability Engineer to ensure the availability of their search products. You'll work with technologies like AWS, Docker, and Kubernetes to optimize performance at scale. This role requires experience in building and operating scalable architectures.

Senior

2d ago
