
About Okta
Secure identity management from cloud to ground
Key Highlights
- Public company (NASDAQ: OKTA) - strong equity potential
- Over 1000 employees with significant growth trajectory
- $229.3 million raised in Series E funding
- Provides identity management solutions for cloud applications
Okta (NASDAQ: OKTA) is a leading identity management platform headquartered in The East Cut, San Francisco, CA. With over 1000 employees, Okta provides secure single sign-on solutions for organizations of all sizes, enabling seamless access to cloud applications. The company has raised $229.3 millio...
🎁 Benefits
Okta offers flexible work-from-home opportunities, comprehensive health and wellness benefits, and financial incentives including stock options. Emplo...
🌟 Culture
Okta fosters a culture focused on security and innovation, emphasizing the importance of identity management in today's digital landscape. The company...

Site Reliability Engineer • Senior
Okta • Washington, DC - On-Site
Overview
Okta is hiring a Senior Site Reliability Engineer to lead the evolution of large-scale production systems. You'll work with AWS, Docker, and Kubernetes to ensure infrastructure reliability and performance. This position requires expertise in managing complex systems and a commitment to automation.
Job Description
Who you are
You have 5+ years of experience in site reliability engineering or a related field, with a strong focus on building and maintaining large-scale production systems. Your technical expertise allows you to thrive on solving complex problems at scale, and you believe in the principle: 'If you have to do it twice, automate it.' You are comfortable working in secure environments and understand the importance of reliability and performance in supporting critical national security missions.
You possess deep knowledge of cloud infrastructure, particularly AWS, and have hands-on experience with containerization technologies like Docker and orchestration tools such as Kubernetes. Your proficiency in Linux systems enables you to manage and optimize server performance effectively. You are also familiar with monitoring and alerting tools that help maintain system health and uptime.
Your strong problem-solving skills are complemented by excellent communication abilities, allowing you to collaborate effectively with cross-functional teams. You are a lifelong learner, eager to stay updated with the latest technologies and best practices in site reliability engineering. You understand the importance of security and compliance in your work and are prepared to obtain and maintain a U.S. security clearance as required.
Desirable
Experience with infrastructure-as-code tools such as Terraform or CloudFormation is a plus. Familiarity with CI/CD pipelines and automation frameworks will help you streamline operations and improve deployment processes. Experience in incident management and post-mortem analysis, contributing to a culture of continuous improvement, is also welcome.
What you'll do
In this role, you will lead the evolution of Okta's large-scale production systems, ensuring they maintain uncompromising reliability and performance. You will collaborate with engineering teams to design and implement robust infrastructure solutions that support critical applications. Your responsibilities will include automating operational tasks, monitoring system performance, and responding to incidents to minimize downtime.
You will work closely with the Technical Operations team to develop and maintain CI/CD pipelines, enabling faster and more reliable software delivery. Your expertise will guide the implementation of best practices for system architecture, capacity planning, and disaster recovery. You will also mentor junior engineers, sharing your knowledge and fostering a culture of learning within the team.
Your role will involve regular communication with stakeholders to understand their needs and ensure that the infrastructure aligns with business objectives. You will participate in on-call rotations, providing support during incidents and contributing to post-incident reviews to identify areas for improvement. Your contributions will directly impact the reliability and performance of Okta's services, empowering organizations to securely connect users to the technologies they need.
What we offer
At Okta, we offer a collaborative and inclusive work environment where diverse perspectives are valued. You will have the opportunity to work on cutting-edge technologies and contribute to meaningful projects that impact the security and growth of businesses worldwide. We provide competitive compensation and benefits, including opportunities for professional development and career advancement. Join us in building a world where identity belongs to everyone.