
About SolarWinds
Empowering IT professionals with powerful management tools
Key Highlights
- Over 300,000 customers including NASA and the U.S. Department of Defense
- Publicly traded since 2018 with a strong market presence
- Headquartered in Austin, Texas with a global workforce
- Raised over $1.5 billion in funding to date
SolarWinds Inc. is a leading provider of IT management software, headquartered in Austin, Texas. The company offers a range of products including network performance monitoring, systems management, and IT security solutions, serving over 300,000 customers worldwide, including major organizations lik...
🎁 Benefits
Employees enjoy competitive salaries, stock options, generous PTO policies, remote work flexibility, and comprehensive health benefits....
🌟 Culture
SolarWinds fosters a culture focused on customer success and product excellence, with a strong emphasis on engineering and innovation in IT management...
Overview
SolarWinds is seeking a Senior Staff Site Reliability Engineer to lead reliability strategy and architecture for its Observability Platform. You'll work with ClickHouse, Kubernetes, and cloud services such as AWS and Azure. This role requires deep expertise in large-scale SaaS infrastructure.
Job Description
Who you are
You have extensive experience in site reliability engineering, with a strong focus on maintaining and optimizing large-scale SaaS infrastructure. Your expertise in ClickHouse and Kubernetes allows you to manage production clusters effectively, ensuring high availability and performance. You thrive in collaborative environments, where you can lead teams in implementing reliability strategies and architectural decisions that enhance system performance. Your background in cloud services, particularly AWS and Azure, equips you to design and manage scalable solutions that meet the demands of modern applications. You are detail-oriented and have a strong understanding of distributed systems, which enables you to troubleshoot complex issues efficiently. You believe in the power of automation and continuously seek ways to improve operational processes.
Desirable
Experience with GitOps practices is a plus, as it aligns with your commitment to modern DevOps methodologies. Familiarity with high-throughput data pipelines and observability tools will further enhance your ability to contribute to our team. You are open to learning new technologies and adapting to evolving industry standards, which will help you stay at the forefront of site reliability engineering.
What you'll do
In this role, you will take ownership of the reliability strategy for SolarWinds' Observability Platform, focusing on the SaaS Logs and data pipelines powered by ClickHouse. You will lead the design and implementation of performance-optimized schemas, ensuring that our systems can handle massive datasets efficiently. Your responsibilities will include managing ClickHouse production clusters, driving automation around data platform operations, and collaborating with cross-functional teams to enhance system reliability. You will also be tasked with shaping how we ingest, store, and query observability datasets, making critical decisions that impact the overall performance of our services. As a senior member of the team, you will mentor junior engineers and share your knowledge to foster a culture of continuous improvement and learning.
What we offer
At SolarWinds, we prioritize a people-first culture that values collaboration and innovation. You will have the opportunity to work with a talented team dedicated to delivering world-class solutions. We offer competitive compensation and benefits, along with opportunities for professional growth and development. Join us in our mission to empower customers and drive business transformation through our powerful and secure solutions. We encourage you to apply even if your experience doesn't match every requirement, as we value diverse perspectives and backgrounds.