PandaDoc

About PandaDoc

Streamlining document workflows for growing organizations

🏢 Retail, Tech | 👥 251-1K | 📅 Founded 2013 | 📍 San Francisco, California, United States

Key Highlights

  • Over 35,000 customers including Cisco and HubSpot
  • Headquartered in San Francisco, California
  • Raised $50M+ from investors like Rembrandt Venture Partners
  • Offers unlimited PTO and flexible remote work options

PandaDoc is a document workflow automation platform headquartered in San Francisco, California, that serves over 35,000 organizations, including notable clients like Cisco and HubSpot. The platform streamlines the creation, management, and signing of digital documents such as proposals, quotes, and ...

🎁 Benefits

PandaDoc offers competitive salaries, equity options, unlimited PTO, and a flexible remote work policy, allowing employees to maintain a healthy work-...

🌟 Culture

PandaDoc fosters a remote-friendly culture that emphasizes collaboration and innovation, encouraging employees to contribute ideas and take ownership ...

Overview

PandaDoc is hiring a Senior Site Reliability Engineer to ensure reliable service with minimal downtime. You'll manage incident processes and observability tools, working with Python, Java, AWS, and Kubernetes. This role requires solid programming experience and expertise in maintaining production services.

Job Description

Who you are

You have solid programming experience, particularly with Python (Django and AsyncIO) and/or Java (Spring Boot), which allows you to contribute effectively to service codebases. Your experience maintaining an observability tool suite, specifically the LGTM stack (Loki, Grafana, Tempo, and Mimir), equips you to manage the observability stack efficiently. You have strong experience with AWS and Kubernetes, ensuring that production applications run smoothly. Your proficiency with relational databases like PostgreSQL and messaging systems such as RabbitMQ, NATS, or Kafka is essential for maintaining reliable operations. As an experienced on-call SRE, you understand the importance of incident management processes and tools, and you are committed to driving efforts in observability and capacity planning. You enjoy mentoring others, fostering SRE principles within the R&D organization, and contributing to a culture of reliability.

What you'll do

In this role, you will own and influence the incident management process end-to-end, ensuring that incidents are handled efficiently and effectively. You will maintain and evolve the on-prem observability stack, which is crucial for timely investigation and mitigation of issues. Your contributions to production services will focus on performance and resiliency, as you actively participate in the on-call rotation to keep applications running smoothly. You will develop automations and tools that support platform reliability, collaborating closely with product engineers to integrate SRE principles into their workflows. Your role will also involve mentoring the SRE team and product engineers, sharing your knowledge and experience to help them grow in their roles. You will be a key player in driving the efforts that ensure customers receive a reliable service with minimal downtime, making you an essential part of PandaDoc's success.

What we offer

PandaDoc offers a collaborative and innovative work environment where you can make a significant impact on the reliability of our services. We value your expertise and provide opportunities for professional growth and development. As part of our team, you will have the chance to work with cutting-edge technologies and contribute to the evolution of our observability stack. We encourage you to apply even if your experience doesn't match every requirement, as we believe in the potential of diverse backgrounds and perspectives. Join us in our mission to provide exceptional service to our customers while fostering a culture of reliability and excellence.

