
About Affirm
Transparent financing for modern consumers
Key Highlights
- 21M+ consumers and 337,000+ merchants using Affirm
- Raised $1.1B in funding, currently in Series F
- Flexible payback options from 3 to 36 months
- Headquartered in Chinatown, San Francisco, CA
Affirm, headquartered in Chinatown, San Francisco, CA, is a leading fintech company specializing in point-of-sale installment loans. With over 21 million consumers and 337,000+ merchants including Shopify, KAYAK, and Walmart, Affirm offers flexible payback options ranging from 3 to 36 months. The co...
🎁 Benefits
Affirm offers a remote-first workforce policy, allowing employees to work from anywhere in their home country. Benefits include 18 weeks of paid paren...
🌟 Culture
Affirm's culture is centered around transparency and consumer empowerment, with a focus on delivering honest financial products. The company actively ...
Overview
Affirm is seeking a Staff Site Reliability Engineer to enhance application reliability and performance. You'll work with AWS, Docker, and Kubernetes to implement best practices in reliability engineering. This role requires extensive experience in SRE and distributed systems.
Job Description
Who you are
You have 5+ years of experience in software engineering with a strong focus on Site Reliability Engineering. You've successfully implemented reliability practices in production environments and understand the importance of operational excellence. Your background includes working with distributed systems, and you have a solid grasp of capacity management and performance optimization.
You possess deep expertise in cloud technologies, particularly AWS. You've designed and managed scalable infrastructure and have experience with containerization and orchestration using Docker and Kubernetes. Your familiarity with monitoring tools like Prometheus allows you to provide visibility into application performance and reliability.
You are skilled in incident management and have led post-mortem analyses to improve system resilience. You understand the significance of defining Service Level Objectives (SLOs) and have experience guiding teams in their development. Your ability to engage in architectural discussions helps shape the reliability of applications across the organization.
You are a proactive communicator who enjoys collaborating with engineering teams, and you thrive in environments where you can provide training and consulting on best practices for operating applications. Your experience with automation and configuration management tools enhances your ability to streamline processes and improve efficiency.
Desirable
Experience with chaos engineering and load testing is a plus, particularly if you understand how to simulate failures to test system resilience. Familiarity with observability frameworks and alerting configurations will help you recommend improvements that enhance operational visibility.
What you'll do
In this role, you will drive the implementation of reliability engineering practices across Affirm's engineering teams, defining frameworks and best practices that ensure applications operate smoothly and efficiently. Your work will involve providing data and insights to leadership on application performance, guiding teams in the development of SLOs, and steering the incident management process.
You will engage in service and architectural conversations, recommending observability and alerting configurations that enhance system reliability. Your expertise will help shape the way teams approach reliability and incident response. You will also be involved in change management and deployment practices, ensuring that changes are made with minimal disruption to services.
You will build tooling that supports the SRE team's objectives, working closely with engineers to develop solutions that improve incident lifecycle management and resilience practices. You will also provide training and consulting to teams, helping them adopt best practices in reliability engineering.
What we offer
At Affirm, we believe in creating a supportive and inclusive work environment, and we encourage you to apply even if your experience doesn't match every requirement. You will have the opportunity to work with a talented team of engineers who are passionate about building reliable systems that enhance customer experiences.
We offer competitive compensation and benefits, including flexible work arrangements. You will have the freedom to work remotely while contributing to meaningful projects that impact the way consumers interact with credit. Join us in our mission to reinvent credit and make it more honest and friendly.
Similar Jobs You Might Like
Based on your interests and this role

Site Reliability Engineer
Oscilar is hiring a Senior/Staff Site Reliability Engineer to shape the future of trust in AI. You'll architect and operate resilient cloud infrastructure while leading initiatives to improve system reliability and performance. This role requires experience with AWS and Kubernetes.

Site Reliability Engineer
Cutover is hiring a Site Reliability Engineer to ensure the reliability and performance of their AI-powered SaaS solution. You'll work with technologies like AWS, Ruby on Rails, and React. This role involves on-call shifts and collaboration with support and engineering teams.

Site Reliability Engineer
Affirm is seeking a Senior Site Reliability Engineer to enhance platform reliability and incident management. You'll work with AWS, Docker, and Kubernetes to ensure application performance and resilience. This role requires strong experience in SRE practices and distributed systems.

Site Reliability Engineer
Affirm is seeking a Staff Site Reliability Engineer to enhance platform reliability and incident management. You'll work with AWS, Docker, and Kubernetes to ensure application performance and resilience. This role requires extensive experience in SRE practices.
