Wikimedia

About Wikimedia

Empowering the world with free access to knowledge

Key Highlights

  • Operates Wikipedia, serving over 1.5 billion unique devices monthly
  • Headquartered in San Francisco, California
  • Funded by millions of individual donations averaging $15
  • 501(c)(3) tax-exempt nonprofit organization

The Wikimedia Foundation is the nonprofit organization behind Wikipedia, the world's largest online encyclopedia, and other free knowledge projects. Headquartered in San Francisco, California, Wikimedia operates with a mission to provide free access to knowledge for everyone. The foundation relies o...

🎁 Benefits

Wikimedia offers a flexible remote work policy, generous PTO, and a commitment to employee well-being. Employees also have access to professional deve...

🌟 Culture

Wikimedia fosters a culture of openness and collaboration, emphasizing the importance of free knowledge. The organization values community contributio...


Senior Site Reliability Engineer

Wikimedia, Remote


Overview

Wikimedia is hiring a Senior Site Reliability Engineer to operate and enhance systems for data-oriented teams. You'll work with technologies like Kubernetes, Hadoop, and Kafka to ensure system reliability and scalability. This role requires strong experience in managing distributed systems.

Job Description

Who you are

You have 5+ years of experience in site reliability engineering, particularly with data platforms and distributed systems. Your expertise includes operating Kubernetes clusters and managing large-scale data processing frameworks like Hadoop. You understand the complexities of system reliability and have a proactive approach to identifying and resolving issues.

You are skilled in monitoring and optimizing system performance, ensuring that resources are utilized efficiently. Your experience with tools such as OpenSearch and Kafka allows you to design and implement robust solutions that meet the demands of data-oriented teams. You thrive in collaborative environments and enjoy mentoring peers in technical and operational best practices.

You are comfortable working in a remote setting and can effectively communicate with a global team. Your ability to simplify operations through standardization and automation is a key strength. You are passionate about supporting users and removing roadblocks to enhance productivity.

Desirable

Experience with Airflow and Superset is a plus, as is familiarity with incident management and response strategies. A strong understanding of cloud infrastructure and skill with automation tools for streamlining processes are also desirable.

What you'll do

As a Senior Site Reliability Engineer at Wikimedia, you will be responsible for operating and enhancing the systems that support our data-oriented teams. You will design and implement new systems and solutions, ensuring they scale effectively to meet demand. Your role involves collaborating with client teams to support their initiatives and investigating incidents to maintain system reliability.

You will focus on simplifying operations by standardizing deployment processes and leveraging virtualization and containerization techniques. Monitoring systems and services will be a key part of your responsibilities, as will optimizing performance and resource utilization. You will proactively identify sources of instability in distributed systems and analyze how complex systems fail, applying your insights to improve resilience.

Your work will include automating and streamlining tasks, identifying process gaps, and mentoring peers in your areas of expertise. You may also have opportunities to travel domestically or internationally for team gatherings and conferences, fostering collaboration and knowledge sharing.

What we offer

The Wikimedia Foundation provides a supportive and inclusive work environment where you can grow your skills and make a meaningful impact. We encourage you to apply even if your experience doesn't match every requirement. Join us in our mission to provide free knowledge to the world and be part of a team that values collaboration, innovation, and continuous improvement.

