
About Celonis
Transforming inefficiencies into operational excellence
Key Highlights
- Headquartered at the World Trade Center, New York, NY
- Raised $1.8 billion in Series D funding
- Over 1,000 employees dedicated to process mining and execution management
- Significant uptake across industries like supply chain and manufacturing
Celonis, headquartered at the World Trade Center in New York, NY, specializes in execution management and process mining solutions. With over 1,000 employees, the company has raised $1.8 billion in funding and serves a diverse range of industries including supply chain, manufacturing, and business i...
🎁 Benefits
Celonis offers competitive health and wellbeing benefits for employees and their families, along with Restricted Stock Units (RSUs) that allow emplo...
🌟 Culture
Celonis fosters a unique culture rooted in process mining and execution management, emphasizing data-driven decision-making. The company values innova...
Overview
Celonis is hiring a Staff Software Engineer - Site Reliability to ensure the health and performance of their platform. You'll work with Kubernetes, AWS, and Docker to drive system reliability and operational excellence. This role requires expertise in SRE principles and strong technical skills.
Job Description
Who you are
You have extensive experience in Site Reliability Engineering, with a strong background in software engineering principles. You understand the importance of system reliability and have a proven track record of implementing SRE practices to enhance operational excellence. Your technical skills include proficiency in Kubernetes, AWS, and Docker, and you are comfortable working in a Linux environment. You are adept at incident management and have experience with automation and observability tools. You thrive in collaborative environments and enjoy working with cross-functional teams to solve complex problems.
Desirable
Experience with programming languages such as Python or Go is a plus, as is familiarity with CI/CD pipelines and monitoring tools. You have a passion for continuous improvement and are always looking for ways to optimize processes and enhance system performance. You are a proactive communicator and can effectively convey technical concepts to non-technical stakeholders.
What you'll do
As a Staff Software Engineer - Site Reliability at Celonis, you will lead reliability efforts for a fleet of over 80 FedRAMP-compliant microservices. You will apply SRE principles to drive observability, automation, and incident prevention, ensuring the health and performance of our platform. You will own high-priority application incident escalations, performing deep technical analysis and restoration within defined SLOs. Your role will involve engineering solutions to enhance the reliability and scalability of our systems, collaborating closely with development teams to implement best practices in system design and architecture.
You will also be responsible for developing and maintaining monitoring and alerting systems to proactively identify and address potential issues before they impact users. You will work on capacity planning and performance tuning, ensuring that our systems can handle increasing loads while maintaining high availability. Your contributions will directly impact the user experience and the overall success of our platform.
What we offer
At Celonis, we offer a dynamic work environment where innovation is encouraged. You will have the opportunity to work with cutting-edge technologies and be part of a team that is shaping the future of Process Intelligence. We provide competitive compensation and benefits, along with opportunities for professional growth and development. Join us in our mission to unlock unprecedented productivity by embedding data and intelligence at the core of every business process.
Similar Jobs You Might Like

Site Reliability Engineer
Stellar Development Foundation is hiring a Senior Site Reliability Engineer to ensure the reliability and scalability of their systems. You'll work with AWS, GCP, and Kubernetes to support the Stellar blockchain network. This position requires experience in maintaining high-availability environments.

Site Reliability Engineer
Ro is hiring a Staff Site Reliability Engineer to ensure the reliability of their production systems. You'll work with AWS, Docker, and Kubernetes to build resilient infrastructure. This position requires significant experience in site reliability engineering.

Site Reliability Engineer
Braze is hiring a Senior Site Reliability Engineer to ensure the uptime of internal-facing services and platforms. You'll work with technologies like Linux, Docker, and Kubernetes in New York City. This position requires a strong background in systems and operational discipline.

Site Reliability Engineer
MongoDB is seeking a Senior or Staff Site Reliability Engineer for their Observability team to build and maintain the observability stack used by all engineering teams. You'll work with technologies like Splunk, Prometheus, and Docker to ensure service reliability. This role requires strong experience in SRE practices and observability tools.

Site Reliability Engineer
Talkspace is seeking a Site Reliability Engineer to ensure the reliability and performance of their behavioral health platform. You'll leverage your technical skills in AWS, Docker, and Linux to maintain live services and implement monitoring strategies. This role requires strong collaboration and communication skills.