Crusoe

About Crusoe

Sustainable AI cloud solutions for a greener future

🏢 Tech · 👥 501-1000 · 📅 Founded 2018 · 📍 Denver, Colorado, United States

Key Highlights

  • Headquartered in Denver, Colorado
  • 501-1000 employees focused on AI and renewable energy
  • First vertically integrated AI cloud platform
  • Committed to sustainable computing practices

Crusoe is a pioneering AI cloud platform headquartered in Denver, Colorado, that utilizes clean, renewable energy to power its operations. The company focuses on providing scalable computing resources for AI and machine learning applications, serving a diverse range of clients across various industr...

🎁 Benefits

Crusoe offers competitive salaries, equity options, generous PTO, and a flexible remote work policy to support work-life balance....

🌟 Culture

Crusoe fosters a culture centered on sustainability and innovation, encouraging employees to contribute to environmentally friendly computing solution...

Site Reliability Engineer (Mid-Level)

Crusoe 📍 Dublin - On-Site

Overview

Crusoe is hiring a Site Reliability Engineer to ensure the reliability and performance of its cloud infrastructure. You'll work with Linux, networking, and automation to maintain high service levels. The role requires experience in SRE practices and distributed systems.

Job Description

Who you are

You have a strong background in Site Reliability Engineering (SRE) practices, with a focus on maintaining high service levels through effective monitoring and automation. Your experience with distributed systems allows you to understand the complexities involved in ensuring reliability and performance. You are proficient in Linux and have a solid understanding of networking principles, which are crucial for troubleshooting and optimizing infrastructure. Your passion for automation drives you to seek out opportunities to improve processes and reduce manual intervention, ensuring that systems run smoothly and efficiently.

You thrive in a collaborative environment, working closely with engineering teams to advise on building resilient code. Your problem-solving skills enable you to anticipate potential issues and implement proactive measures to prevent them from impacting customers. You are committed to continuous improvement and conduct thorough post-mortems to learn from incidents, sharing insights with your team to enhance overall performance. You understand the importance of a customer-centric approach and strive to ensure that clients have reliable access to the virtual machines they depend on.

Desirable

Experience with cloud infrastructure and familiarity with various cloud service providers would be a plus. Knowledge of monitoring tools and practices, as well as experience with incident management, will further enhance your ability to contribute to the team's success. A background in software development can also be beneficial, as it allows for better collaboration with engineering teams.

What you'll do

In this role, you will be responsible for ensuring the reliability and performance of Crusoe's AI platform. You will work on automation and tool development to streamline routine processes, allowing for more efficient operations. Your expertise in SRE practices will guide you in detecting, analyzing, and preventing issues that could affect service levels. You will collaborate with various engineering teams to advise them on best practices for building resilient code, ensuring that systems are designed with reliability in mind.

You will also conduct thorough post-mortems following incidents, identifying root causes and implementing solutions to prevent recurrence. Your proactive approach will help anticipate issues before they impact customers, maintaining the high standards of service that Crusoe is known for. You will play a key role in driving continuous improvement initiatives, working to enhance the overall performance of the infrastructure.

What we offer

At Crusoe, you will be part of a mission-driven team that is dedicated to accelerating the abundance of energy and intelligence through sustainable technology. We offer a collaborative work environment where innovation is encouraged, and your contributions will have a tangible impact on the future of AI and cloud infrastructure. You will have opportunities for professional growth and development, as well as the chance to work on cutting-edge projects that are shaping the industry. Join us in our commitment to responsible and transformative technology solutions.
