Roblox

About Roblox

Empowering creators in a vibrant gaming universe

🏢 Tech, Gaming · 👥 1001+ employees · 📅 Founded 2006 · 📍 South San Mateo, San Mateo, CA · 💰 $922.8m · 3.8
B2C · Gaming · Entertainment · Community

Key Highlights

  • Over 200 million monthly active users globally
  • More than $500 million paid to developers in 2022
  • Headquartered in South San Mateo, CA
  • $922.8 million raised in Series G funding

Roblox is an online gaming and entertainment platform headquartered in South San Mateo, CA, that connects over 200 million monthly active users. The platform empowers its community to create and monetize their own games, with over $500 million paid out to developers in 2022 alone. As a leader in the...

🎁 Benefits

Roblox offers competitive salaries, equity options, generous PTO policies, and a flexible remote work policy to support work-life balance. Employees a...

🌟 Culture

Roblox fosters a creator-centric culture, encouraging employees to innovate and collaborate while prioritizing user safety. The company values communi...

Senior Site Reliability Engineer

Roblox · San Mateo

Posted 1d ago · Senior · Site Reliability Engineer · 📍 San Mateo · 💰 $195,780 - $242,100 / year
Overview

Roblox is hiring a Senior Site Reliability Engineer to manage and optimize its infrastructure systems. You'll work with Kubernetes, Python, and AWS to ensure reliability and performance at scale. This role requires strong programming skills and experience in site reliability engineering.

Job Description

Who you are

You have 5+ years of experience in site reliability engineering, with a strong background in managing large-scale infrastructure systems. Your expertise in Kubernetes and cloud technologies allows you to design and implement resilient systems that can handle millions of users. You are proficient in Python and Linux, enabling you to automate processes and troubleshoot complex issues effectively. You understand the importance of observability and have experience implementing monitoring solutions to ensure system reliability. You are a collaborative team player who enjoys working with cross-functional teams to drive best practices in reliability and performance.

Desirable

Experience with AWS services is a plus, as it complements your skills in managing cloud infrastructure. Familiarity with Docker and container orchestration will help you in your role, as you work to productionize Kubernetes-based infrastructure. You are passionate about improving system reliability and have a proactive approach to identifying and resolving potential issues before they impact users.

What you'll do

In this role, you will be responsible for designing and developing systems that promote fault tolerance and resilience across Roblox's infrastructure. You will automate the management and lifecycle of clusters, ensuring that systems are observable and maintain high availability. Your work will involve collaborating with the Infra Compute group to institute reliability best practices and drive common reliability initiatives. You will also participate in incident management and post-mortem analysis to continuously improve system performance and reliability.

You will have the opportunity to shape the future of Roblox's infrastructure by contributing to the development of tools and processes that enhance operational efficiency. Your insights will help guide the team in making informed decisions about infrastructure investments and improvements. You will work closely with developers and other engineers to ensure that the systems you build meet the needs of the community and support the growth of the platform.

What we offer

Roblox offers a dynamic work environment where you can make a significant impact on the future of human interaction through technology. You will be part of a team that values collaboration, innovation, and continuous improvement. We provide competitive compensation and benefits, along with opportunities for professional growth and development. Join us in our mission to connect a billion people with optimism and civility, and help us create safer, more civil shared experiences for everyone.
