
About Udemy
Empowering learners through accessible online education
Key Highlights
- Over 130,000 courses available across various subjects
- 35 million students worldwide, including corporate clients
- Headquartered in San Francisco, California
- $223 million raised from top investors like Insight Partners
Udemy is a leading online learning platform headquartered in San Francisco, California, offering over 130,000 courses to a global community of 35 million students. The platform provides a diverse range of subjects including programming, marketing, and data science, catering to both individual learne...
🎁 Benefits
Udemy offers competitive salaries, equity options, generous PTO policies, and remote work flexibility that allows employees to balance their work an...
🌟 Culture
Udemy fosters a culture of continuous learning and innovation, encouraging employees to enhance their skills through access to Udemy's own courses and a...
Overview
Udemy is hiring a Staff Site Reliability Engineer to manage and evolve its infrastructure. You'll work with AWS, Kubernetes, and programming languages such as Python and Golang. This role requires extensive knowledge of cloud technologies and infrastructure-as-code tools.
Job Description
Who you are
You have extensive knowledge of cloud technologies, with AWS experience being highly advantageous. Your proven expertise in managing containerized workloads using Kubernetes in production environments sets you apart. You are proficient in programming languages such as Python, Golang, or Kotlin, and have a strong familiarity with infrastructure-as-code (IaC) tools like Terraform and Helm. You thrive in collaborative environments and are eager to enhance reliability standards across the organization.
Desirable
Experience with incident response and driving best practices in reliability is a plus. Familiarity with CI/CD pipelines and monitoring tools will help you excel in this role. You are a proactive problem solver who enjoys optimizing infrastructure and tooling to empower engineering teams.
What you'll do
As a Staff Site Reliability Engineer at Udemy, you will play a critical role in managing and evolving our infrastructure, from our CDN to our databases. You will oversee and improve tools like Helm and Terraform, building development environments that empower our engineering teams. Collaborating closely with development teams, you will design internal tools in Python and Golang while responding to incidents and driving best practices in reliability. You will lead projects to enhance and optimize our infrastructure and tooling, ensuring that our systems are robust and scalable. Your work will directly impact the learning experience of millions of users worldwide, making it essential to maintain high reliability standards.
What we offer
At Udemy, we are committed to transforming lives through learning. You will be part of a mission-driven team that values innovation and collaboration. We offer competitive compensation and benefits, along with opportunities for professional growth and development. You will work in a supportive environment that encourages you to share your unique experiences and perspectives. Join us in shaping the future of learning and making a real impact on people's lives around the world.