Crusoe

About Crusoe

Sustainable AI cloud solutions for a greener future

🏢 Tech | 👥 501-1000 | 📅 Founded 2018 | 📍 Denver, Colorado, United States

Key Highlights

  • Headquartered in Denver, Colorado
  • 501-1000 employees focused on AI and renewable energy
  • First vertically integrated AI cloud platform
  • Committed to sustainable computing practices

Crusoe is a pioneering AI cloud platform headquartered in Denver, Colorado, that utilizes clean, renewable energy to power its operations. The company focuses on providing scalable computing resources for AI and machine learning applications, serving a diverse range of clients across various industr...

🎁 Benefits

Crusoe offers competitive salaries, equity options, generous PTO, and a flexible remote work policy to support work-life balance.

🌟 Culture

Crusoe fosters a culture centered on sustainability and innovation, encouraging employees to contribute to environmentally friendly computing solution...

Overview

Crusoe is seeking a Director of Engineering & Reliability to lead engineering design standards and reliability strategies for its AI and HPC data centers. You'll work with AWS and Azure technologies to ensure world-class uptime and performance. This role requires significant experience in engineering management.

Job Description

Who you are

You have a strong background in engineering management, with at least 10 years of experience leading teams in mechanical, electrical, or critical infrastructure systems. Your expertise in reliability engineering is complemented by a deep understanding of FMEA, RCM, and uptime strategies. You thrive in collaborative environments, working closely with construction, facility operations, and executive leadership to drive engineering excellence. Your ability to govern enterprise engineering design standards ensures that your teams deliver high-performance solutions that meet the demands of hyperscale AI workloads. You are passionate about sustainability and innovation, and you understand the importance of creating systems that are both efficient and reliable. You have a proven track record of managing large-scale projects and are comfortable navigating complex technical challenges.

Desirable

Experience in high-performance computing (HPC) environments is a plus, as is familiarity with cloud infrastructure technologies such as AWS and Azure. You may also have experience with data center operations and a strong understanding of asset lifecycle management. Your leadership style encourages team growth and fosters a culture of accountability and excellence.

What you'll do

In this role, you will lead the development of engineering design standards for Crusoe's mechanical, electrical, and critical infrastructure systems. You will oversee reliability engineering programs, ensuring that all systems meet the highest standards of performance and uptime. Collaborating with cross-functional teams, you will drive initiatives that enhance operational readiness and efficiency across Crusoe's data centers. You will be responsible for system performance modeling and asset lifecycle programs, ensuring that all engineering practices align with the company's mission of sustainability and innovation. Your leadership will be crucial in establishing a culture of continuous improvement, where engineering excellence is prioritized and celebrated. You will also engage with external stakeholders to promote best practices in engineering and reliability, positioning Crusoe as a leader in the industry.

What we offer

Crusoe offers a competitive salary and benefits package, along with the opportunity to work on cutting-edge technology that is shaping the future of AI and cloud infrastructure. You will be part of a mission-driven team that values innovation and sustainability, and you will have the chance to make a tangible impact in the industry. We encourage you to apply even if your experience doesn't match every requirement — we value diverse perspectives and are committed to building a team that reflects the communities we serve.
