
About Mercor
Connecting Indian engineers with US startup opportunities
Key Highlights
- Headquartered in Weston, Connecticut, USA
- Team size of 201-500 employees
- Focus on connecting Indian engineers with US startups
- Flexible remote work policies for employees
Mercor is a tech talent platform based in Weston, Connecticut, that connects skilled Indian software engineers with innovative US startups. By leveraging a vast network of tech professionals, Mercor facilitates the hiring process for companies looking to enhance their development capabilities. With ...
🎁 Benefits
Mercor offers competitive salaries, equity options, flexible remote work arrangements, and generous PTO policies to support work-life balance....
🌟 Culture
Mercor fosters a culture of collaboration and innovation, focusing on creating meaningful connections between engineers and startups while prioritizin...
Overview
Mercor is seeking a Site Reliability Engineer to own production reliability across critical systems. Based in San Francisco, you'll work with AWS, Kubernetes, and Terraform to build and improve high-availability systems.
Job Description
Who you are
You have a strong background in site reliability engineering, with experience in maintaining and improving production systems. You understand the importance of uptime and production quality, and you thrive in high-intensity environments where reliability is paramount. You are comfortable working hands-on with systems, ensuring they are stable, resource-efficient, and well-observed. Your experience includes collaborating with infrastructure leadership to define priorities and reliability standards, and you are eager to help shape the SRE function from the ground up.
You have a solid understanding of the AWS ecosystem and have hands-on experience with Kubernetes and modern infrastructure as code (IaC) tooling such as Terraform. You are familiar with best practices in incident management and have a proactive approach to identifying and resolving potential issues before they impact production. You are a team player who enjoys working alongside researchers and operators to redefine how systems operate in the context of AI development.
Desirable
Experience as a founding SRE or an early SRE hire is a plus, as is a background in establishing SRE practices and organizations from scratch. You are adaptable and open to learning new technologies and methodologies that can enhance the reliability and performance of systems.
What you'll do
As a Site Reliability Engineer at Mercor, you will own reliability and production safety for core shared services and customer-facing systems. You will partner directly with infrastructure leadership to define SRE priorities and establish a production safety roadmap. Your role will involve repairing and improving the structure of production systems so that they are stable and efficient. You will introduce and implement monitoring solutions that improve observability and shorten response times.
You will collaborate with cross-functional teams to ensure that reliability standards are met and maintained. This includes participating in incident response and post-mortem analyses to continuously improve processes and systems. You will also contribute to the development of best practices for system reliability and performance, keeping the team aligned with industry standards.
Your hands-on approach will allow you to build and fix systems while mentoring junior engineers and sharing your knowledge with the team. You will play a key role in shaping the SRE culture at Mercor, fostering an environment of collaboration and continuous improvement.
What we offer
Mercor offers a competitive salary and benefits package, along with the opportunity to work in a dynamic and innovative environment. You will be part of a team that is at the forefront of AI development, working alongside experts and researchers who are passionate about their work. Our San Francisco headquarters provides a collaborative workspace where you can thrive and contribute to meaningful projects that impact the future of AI.
We encourage you to apply even if your experience doesn't match every requirement. At Mercor, we value diverse perspectives and are committed to building a team that reflects the communities we serve. Join us in redefining the intersection of labor markets and AI research.
Similar Jobs You Might Like
Based on your interests and this role

Site Reliability Engineer
Together AI is hiring a Site Reliability Engineer to ensure the reliability and performance of user-facing services and production systems. You'll work with Ansible, Terraform, and Kubernetes to build and manage infrastructure. This role requires 2+ years of experience in SRE or a related field.

Site Reliability Engineer
WorkOS is hiring a Site Reliability Engineer to ensure the platform remains fast, reliable, and resilient at scale. You'll work with AWS, Docker, and Kubernetes to build systems that handle hundreds of millions of requests. This role requires a strong understanding of complex systems and incident response.

Site Reliability Engineer
Apple is seeking a Site Reliability Engineer to join their Services Engineering team. You'll be responsible for building secure, end-to-end solutions and managing the full infrastructure stack. This role requires expertise in solving complex problems at scale.

Site Reliability Engineer
Braze is hiring a Senior Site Reliability Engineer to ensure the uptime of internal-facing services and platforms. You'll work with Linux, distributed systems, and automation to maintain high service availability. This position requires a strong background in system administration and software engineering.

Site Reliability Engineer
Stellar Development Foundation is hiring a Senior Site Reliability Engineer to enhance the reliability and scalability of their systems. You'll work with AWS, GCP, and Kubernetes to support the Stellar blockchain ecosystem. This role requires strong experience in infrastructure management and automation.