About PandaDoc

Streamlining document workflows for growing organizations

🏢 Retail, Tech • 👥 251-1K • 📅 Founded 2013 • 📍 San Francisco, California, United States

Key Highlights

Over 35,000 customers including Cisco and HubSpot
Headquartered in San Francisco, California
Raised $50M+ from investors like Rembrandt Venture Partners
Offers unlimited PTO and flexible remote work options

PandaDoc is a document workflow automation platform headquartered in San Francisco, California, that serves over 35,000 organizations, including notable clients like Cisco and HubSpot. The platform streamlines the creation, management, and signing of digital documents such as proposals, quotes, and ...

🎁 Benefits

PandaDoc offers competitive salaries, equity options, unlimited PTO, and a flexible remote work policy, allowing employees to maintain a healthy work-...

🌟 Culture

PandaDoc fosters a remote-friendly culture that emphasizes collaboration and innovation, encouraging employees to contribute ideas and take ownership ...


Site Reliability Engineer • Senior

PandaDoc • Remote (Europe)

Posted 1d ago • 🏠 Remote • Senior Site Reliability Engineer


Skills & Technologies

Python, Java, Spring Boot, AWS, Kubernetes, PostgreSQL, Grafana, RabbitMQ, NATS, Kafka

Overview

PandaDoc is hiring a Senior Site Reliability Engineer to ensure reliable service with minimal downtime. You'll manage incident processes, observability tools, and contribute to service codebases using Python and Java. This role requires solid experience in AWS and Kubernetes.

Job Description

Who you are

You have solid programming experience, particularly with Python (Django and AsyncIO) and/or Java (Spring Boot). Your background includes maintaining an observability stack, specifically LGTM (Loki, Grafana, Tempo, and Mimir). You have strong experience developing and maintaining Python services in production, and you are an experienced on-call SRE who understands the importance of reliability in production systems.

You possess a deep understanding of AWS and Kubernetes, which are essential for managing cloud infrastructure and ensuring service availability. Your proficiency in working with relational databases, particularly PostgreSQL, and messaging systems such as RabbitMQ, NATS, and Kafka, allows you to effectively handle data flow and communication between services. You are passionate about driving efforts in observability, incident management, and capacity planning, ensuring that production applications run smoothly.

Desirable

Experience with additional programming languages or frameworks is a plus, as is familiarity with other observability tools or cloud platforms. You enjoy mentoring others and fostering SRE principles within the R&D organization, contributing to a culture of reliability and performance.

What you'll do

In this role, you will own and influence the incident management process end-to-end, ensuring that incidents are handled efficiently and effectively. You will maintain and evolve the on-prem observability stack and participate in the on-call rotation to keep production applications running smoothly. You will also develop automations and tools that support platform reliability, and collaborate with product engineers to integrate SRE principles into the development process.

You will actively contribute to service codebases, focusing on performance and resiliency to prevent incidents and resolve performance bottlenecks. You will mentor the SRE team or product engineers, sharing your knowledge and experience to help others grow in their roles. You will also be involved in capacity planning, ensuring that the infrastructure can handle growing demand as the company scales.

What we offer

PandaDoc offers a collaborative and innovative work environment where you can make a significant impact on the reliability of our services. We encourage you to apply even if your experience doesn't match every requirement, as we value diverse perspectives and backgrounds. Join us in our mission to provide a reliable service to our customers and help shape the future of document automation.
