About Coupang

Delivering convenience to millions in South Korea

🏢 Tech, Retail • 👥 1K-5K • 📅 Founded 2010 • 📍 Seoul, South Korea

Key Highlights

Publicly traded on the NYSE under the ticker CPNG
Valuation of approximately $60 billion post-IPO
Over 17 million active customers using the platform
Employs 1,000-5,000 people across various roles

Coupang is a leading e-commerce platform in South Korea, founded in 2010 and headquartered in Seoul. The company serves millions of customers with its fast delivery service, Rocket Delivery, which promises same-day or next-day delivery on a wide range of products. Coupang went public in March 2021, ...

🎁 Benefits

Coupang offers competitive salaries, stock options, generous paid time off, and a flexible remote work policy to support work-life balance....

🌟 Culture

Coupang fosters a customer-centric culture, emphasizing speed and efficiency in its operations. The company values innovation and encourages employees...

Site Reliability Engineer • Senior

Coupang • Seattle - On-Site

Posted 1 month ago • 🏛️ On-Site • Senior Site Reliability Engineer • 📍 Seattle • 💰 $18 - $21 / daily

Skills & Technologies

AWS, Docker, Kubernetes, Linux, Prometheus, incident management, disaster recovery, load testing, capacity engineering

Overview

Coupang is hiring a Senior Site Reliability Engineer to ensure the reliability and performance of its customer-facing services. You'll work with AWS, Docker, and Kubernetes to build and maintain scalable infrastructure. This role requires a strong background in SRE principles and large-scale distributed systems.

Job Description

Who you are

You have 5+ years of experience in Site Reliability Engineering or a related field, with a strong focus on building and maintaining large-scale distributed systems. You take pride in your ability to ensure system reliability and performance, and you have a deep understanding of SRE principles and practices. Your background includes extensive experience with automation and infrastructure as code, allowing you to tackle complex technical challenges effectively.

You are proficient in cloud technologies, particularly AWS, and have hands-on experience with containerization tools like Docker and orchestration platforms such as Kubernetes. Your expertise in monitoring and observability tools, including Prometheus, enables you to define and track key performance indicators (KPIs) and service-level objectives (SLOs) related to system availability and reliability.

You possess strong problem-solving skills and a passion for automation, which drives you to continuously improve processes and systems. You thrive in collaborative environments and enjoy working closely with product development teams to influence design decisions and resolve production incidents. Your communication skills allow you to effectively convey technical concepts to both technical and non-technical stakeholders.

Desirable

Experience with incident management and disaster recovery processes is a plus, as is familiarity with load testing and capacity engineering. You are always eager to learn and adapt to new technologies and methodologies, ensuring that you stay at the forefront of the SRE field.

What you'll do

In this role, you will be the primary owner of the reliability, health, and performance of all Coupang customer-facing services. You will gain deep knowledge of Coupang's application workflows and dependencies so that you can monitor and manage system performance effectively. Your responsibilities will include defining and tracking KPIs and SLOs and ensuring that all services meet the required standards for availability and performance.
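As a rough illustration of what defining and tracking an SLO can involve, the sketch below computes an availability SLI and the remaining error budget from request counts. The 99.9% target, the sample counts, and the names `SloWindow`, `availability`, and `error_budget_remaining` are all illustrative assumptions; they do not describe Coupang's actual targets or tooling.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SloWindow:
    """Request counts observed for one service over an SLO window (hypothetical)."""
    total_requests: int
    failed_requests: int


def availability(window: SloWindow) -> float:
    """Availability SLI: the fraction of requests that succeeded."""
    if window.total_requests == 0:
        return 1.0  # assumption: no traffic counts as fully available
    return 1.0 - window.failed_requests / window.total_requests


def error_budget_remaining(window: SloWindow, slo_target: float) -> float:
    """Fraction of the error budget still unspent (negative when overspent)."""
    allowed_failures = (1.0 - slo_target) * window.total_requests
    if allowed_failures == 0:
        # No failures are permitted (100% target or no traffic).
        return 1.0 if window.failed_requests == 0 else float("-inf")
    return 1.0 - window.failed_requests / allowed_failures


# Example with made-up numbers: 400 failures in one million requests.
window = SloWindow(total_requests=1_000_000, failed_requests=400)
print(f"availability: {availability(window):.4%}")
print(f"budget left:  {error_budget_remaining(window, slo_target=0.999):.1%}")
```

In practice the counts would typically come from a monitoring system such as Prometheus rather than hard-coded values, and the remaining budget would inform decisions such as whether to slow down risky releases.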

You will work closely with product development teams from the early stages of design through to production, helping to resolve incidents as they arise and upholding the SLIs and SLAs set for production services. Your expertise in SRE principles will guide your work with these teams and help them adopt best practices in reliability and performance.

You will also be responsible for building world-class infrastructure automation, focusing on areas such as observability, incident management, disaster recovery, load testing, and capacity engineering. Your work will directly affect the reliability and scalability of Coupang's e-commerce systems, ensuring that they can handle the demands of a growing customer base.

What we offer

Coupang offers a dynamic work environment where you can make a significant impact on the reliability of our services. You will have the opportunity to work with cutting-edge technologies and collaborate with a diverse team of professionals who are passionate about delivering exceptional customer experiences. We encourage you to apply even if your experience doesn't match every requirement, as we value diverse perspectives and backgrounds.

We provide competitive compensation and benefits, along with opportunities for professional growth and development. Join us at Coupang and be part of a team that is dedicated to building the future of e-commerce.

Similar Jobs You Might Like

Based on your interests and this role

Site Reliability Engineer

Apple•📍 Seattle - On-Site

Apple is hiring a Senior Site Reliability Engineer to support and scale cloud services for millions of users. You'll work with technologies like Kubernetes, Cassandra, and Kafka to build critical infrastructural systems. This position requires strong expertise in cloud service infrastructure.

🏛️ On-Site • Senior

1 month ago

Site Reliability Engineer

Google•📍 Seattle

Google is seeking a Senior Site Reliability Engineer to design, build, and maintain large-scale distributed systems. You'll work with technologies like Java, Python, and AWS to ensure reliability and performance. This role requires 5+ years of experience in software development and systems engineering.

Senior

1 month ago

Site Reliability Engineer

Axon•📍 Seattle - Hybrid

Axon is hiring a Site Reliability Engineer II to enhance the reliability and performance of their cloud-native global Kubernetes platform. You'll focus on building infrastructure and tools that support engineering operations. This role requires experience in system stability and cloud technologies.

🏢 Hybrid • Mid-Level

2d ago

Site Reliability Engineer

Axon•📍 Seattle - Hybrid

Axon is hiring a Site Reliability Engineer II to enhance the reliability of their mission-critical cloud native services. You'll work with AWS, Docker, and Kubernetes to build robust platforms. This position requires experience in SRE practices and cloud technologies.

🏢 Hybrid • Mid-Level

2d ago

Site Reliability Engineer

SolarWinds•📍 Bangalore

SolarWinds is seeking a Senior Staff Site Reliability Engineer to lead reliability strategy and architecture for their Observability Platform. You'll work with ClickHouse, Kubernetes, and cloud services like AWS and Azure. This role requires deep expertise in large-scale SaaS infrastructure.

Senior

12h ago
