
About Wikimedia
Empowering the world with free knowledge access
Key Highlights
- Operates Wikipedia, serving over 1.5 billion unique devices monthly
- Headquartered in San Francisco, California
- Funded by millions of individual donations averaging $15
- 501(c)(3) tax-exempt nonprofit organization
The Wikimedia Foundation is the nonprofit organization behind Wikipedia, the world's largest online encyclopedia, and other free knowledge projects. Headquartered in San Francisco, California, Wikimedia operates with a mission to provide free access to knowledge for everyone. The foundation relies o...
🎁 Benefits
Wikimedia offers a flexible remote work policy, generous PTO, and a commitment to employee well-being. Employees also have access to professional deve...
🌟 Culture
Wikimedia fosters a culture of openness and collaboration, emphasizing the importance of free knowledge. The organization values community contributio...
Overview
Wikimedia is hiring a Senior Site Reliability Engineer to operate and enhance systems for data-oriented teams. You'll work with technologies like Kubernetes and Hadoop to ensure system reliability and scalability. This role requires strong experience in distributed systems and automation.
Job Description
Who you are
You have a strong background in site reliability engineering with a focus on data platforms. Your experience includes operating large-scale systems and ensuring their reliability and performance. You are proficient in Kubernetes and have hands-on experience with Hadoop, OpenSearch, and Kafka, which lets you manage and optimize data workflows effectively.
You are skilled in monitoring systems and services, identifying performance bottlenecks, and proactively addressing sources of instability in distributed systems. Your analytical mindset helps you understand how complex systems fail, and you are adept at implementing solutions that enhance resilience and reliability.
You enjoy collaborating with diverse teams and are comfortable working in a remote environment. Your communication skills enable you to support users effectively and remove roadblocks that hinder productivity. You are also passionate about mentoring peers, sharing your technical knowledge, and helping others grow in their roles.
Desirable
Experience with automation tools and streamlining operational tasks is a plus, as is familiarity with incident management processes. You may have experience with cloud platforms or other data processing frameworks, which would further enhance your contributions to the team.
What you'll do
As a Senior Site Reliability Engineer at Wikimedia, you will be responsible for operating and optimizing systems that support our data-oriented teams. You will design and implement new systems and solutions, ensuring they scale to meet demand while simplifying operations through standardization and automation. Your role will involve monitoring system performance, optimizing resource utilization, and collaborating with client teams to support their initiatives.
You will investigate incidents and lead efforts to migrate services to Kubernetes, enhancing our infrastructure's reliability. Your proactive approach will help identify and address potential issues before they impact users, ensuring a seamless experience for all stakeholders. You will also have the opportunity to travel for team gatherings and conferences, fostering connections with colleagues across the globe.
What we offer
Wikimedia provides a supportive and inclusive work environment where you can thrive as a Senior Site Reliability Engineer. We value your contributions and encourage you to apply even if your experience doesn't match every requirement. Join us in our mission to provide free knowledge to the world and make a meaningful impact through your work.