About Waymo

Revolutionizing transportation with autonomous driving

🏢 Tech • 👥 1001+ employees • 📅 Founded 2009 • 📍 Mountain View, CA • 💰 $11.1b • ⭐ 3.7

B2C • Transport • Automation

Key Highlights

Operates in cities like Phoenix, San Francisco, and LA
Completed over 10 million fully driverless rides
Raised $11.1 billion in funding
Aiming for one million trips per week by 2026

Waymo, a subsidiary of Alphabet Inc., is at the forefront of autonomous driving technology, operating robotaxis in cities like Phoenix, San Francisco, and Los Angeles. With over 10 million fully driverless rides and more than 100 million miles driven, Waymo is transforming transportation. The compan...

🎁 Benefits

Waymo offers comprehensive medical, dental, and vision insurance for employees and their dependents, along with commuter benefits and onsite wellness ...

🌟 Culture

Waymo fosters a culture of innovation and safety, focusing on the real-world application of autonomous technology. The company values diversity and in...

Site Reliability Engineer • Lead

Waymo • Mountain View

Posted 1w ago • Lead Site Reliability Engineer • 📍 Mountain View • 💰 $332,000 - $421,000 / year

Skills & Technologies

AWS, Docker, Kubernetes, Prometheus, Grafana

Overview

Waymo is hiring a Director of Site Reliability Engineering to lead the strategy and execution for observability, incident management, and continuous improvement. You'll work with technologies like AWS, Docker, and Kubernetes. This position requires significant experience in site reliability engineering.

Job Description

Who you are

You have extensive experience in site reliability engineering, with a strong focus on ensuring system reliability, performance, and resilience. Your background includes leading teams and developing strategies for observability and incident management, and you understand the importance of maintaining high availability in complex systems. You are proficient in cloud technologies, particularly AWS, and have hands-on experience with containerization and orchestration tools like Docker and Kubernetes. Your analytical skills allow you to identify and resolve performance bottlenecks effectively, and you are comfortable working with monitoring tools such as Prometheus and Grafana.

You thrive in collaborative environments and enjoy working closely with software engineers, product managers, and other stakeholders to improve system reliability. Your leadership style emphasizes mentorship and fostering a culture of continuous improvement within your team. You are passionate about leveraging technology to enhance user experiences and are committed to driving operational excellence.

What you'll do

As the Director of Site Reliability Engineering at Waymo, you will lead a team responsible for ensuring the reliability and performance of our autonomous driving systems. You will develop and implement strategies for observability and incident management, ensuring that our systems operate smoothly and efficiently. Your role will involve capacity planning and continuous improvement initiatives to enhance system resilience. You will collaborate with engineering teams to design and implement robust monitoring solutions, leveraging tools like Prometheus and Grafana to gain insights into system performance.

You will also be responsible for incident response and management, leading post-mortem analyses to identify root causes and implement preventive measures. Your leadership will guide the team in adopting best practices for site reliability, fostering a culture of accountability and excellence. You will work closely with cross-functional teams to align reliability goals with business objectives, ensuring that our systems meet the high standards expected by our customers.

What we offer

Waymo offers a competitive salary range of $332,000 to $421,000 USD, along with participation in our discretionary annual bonus program and equity incentive plan. We provide generous company benefits, subject to eligibility requirements. Join us in our mission to improve access to mobility and save lives through innovative autonomous driving technology.
