About Wikimedia

Empowering the world with free knowledge access

Key Highlights

Operates Wikipedia, serving over 1.5 billion unique devices monthly
Headquartered in San Francisco, California
Funded by millions of individual donations averaging $15
501(c)(3) tax-exempt nonprofit organization

The Wikimedia Foundation is the nonprofit organization behind Wikipedia, the world's largest online encyclopedia, and other free knowledge projects. Headquartered in San Francisco, California, Wikimedia operates with a mission to provide free access to knowledge for everyone. The foundation relies o...

🎁 Benefits

Wikimedia offers a flexible remote work policy, generous PTO, and a commitment to employee well-being. Employees also have access to professional deve...

🌟 Culture

Wikimedia fosters a culture of openness and collaboration, emphasizing the importance of free knowledge. The organization values community contributio...

Site Reliability Engineer • Senior

Wikimedia • Remote

Posted 3w ago • 🏠 Remote • Senior Site Reliability Engineer

Skills & Technologies

Kubernetes, Hadoop, OpenSearch, Airflow, Kafka

Overview

Wikimedia is hiring a Senior Site Reliability Engineer to operate and enhance systems for data-oriented teams. You'll work with technologies like Kubernetes, Hadoop, and Kafka to ensure system reliability and scalability. This role requires significant experience in SRE practices.

Job Description

Who you are

You have 5+ years of experience in Site Reliability Engineering, with a strong focus on operating and optimizing large-scale distributed systems. Your expertise with Kubernetes, Hadoop, and other data platforms lets you manage and scale complex infrastructure. You are skilled at identifying and resolving performance bottlenecks and at keeping services highly available and reliable.

You possess a deep understanding of monitoring and alerting systems, enabling you to proactively address potential issues before they impact users. Your experience with automation tools and practices allows you to streamline operations and improve efficiency across teams. You are comfortable collaborating with diverse teams, providing support and guidance to enhance their productivity.

You have a knack for simplifying complex processes and standardizing deployment practices, which helps in reducing operational overhead. Your analytical mindset enables you to investigate incidents thoroughly, learning from failures to enhance system resilience. You are also passionate about mentoring peers, sharing your knowledge, and fostering a culture of continuous improvement within your team.

What you'll do

As a Senior Site Reliability Engineer at Wikimedia, you will be responsible for operating the systems that support our data-oriented teams. You will work closely with engineering teams to design and implement new systems and solutions, ensuring they scale effectively to meet demand. Your role will involve simplifying operations by standardizing deployment practices and leveraging containerization technologies.

You will monitor systems and services, optimizing performance and resource utilization to maintain high reliability. You will identify sources of instability in distributed systems and analyze and address complex failure scenarios. You will also automate and streamline tasks, closing process gaps to improve operational efficiency.

Collaboration is key in this role, as you will work with a global team, communicating asynchronously to support various projects. You may also be expected to travel domestically or internationally for team gatherings and conferences, fostering connections with your colleagues. Your contributions will directly impact the reliability and performance of Wikimedia's data platforms, making a significant difference in how we serve our users.

What we offer

Wikimedia offers a supportive and inclusive work environment where you can thrive as a Senior Site Reliability Engineer. You will have the opportunity to work with cutting-edge technologies and contribute to meaningful projects that impact millions of users worldwide. We encourage you to apply even if your experience doesn't match every requirement, as we value diverse perspectives and backgrounds.

You will receive competitive compensation and benefits, including opportunities for professional development and growth within the organization. Our team is dedicated to fostering a culture of collaboration and innovation, ensuring that you have the resources and support needed to succeed in your role.
