About Arcesium

Empowering asset managers with advanced fintech solutions

🏢 Tech • 👥 1K-5K • 📅 Founded 2015 • 📍 New York, New York, United States

Key Highlights

Spin-out from the D. E. Shaw group
Headquartered in New York City with a strong fintech focus
Supports over 100 clients in the asset management sector
Employs between 1,000 and 5,000 professionals

Arcesium, a spin-out of the D. E. Shaw group, specializes in software and services for asset managers, focusing on post-trade activities. Headquartered in New York City, Arcesium supports over 100 clients, including hedge funds and investment firms, with its comprehensive suite of financial technolo...

🎁 Benefits

Arcesium offers competitive salaries, equity options, generous PTO policies, and a flexible remote work environment to support work-life balance....

🌟 Culture

Arcesium fosters a culture centered on innovation and collaboration, emphasizing a strong engineering focus and a commitment to delivering high-qualit...

Site Reliability Engineer • Senior

Arcesium • Lisbon

Posted 3 months ago • Senior Site Reliability Engineer • 📍 Lisbon

Skills & Technologies

AWS, Docker, Kubernetes, Linux, Prometheus, Grafana

Overview

Arcesium is hiring a Senior Site Reliability Engineer to ensure the stability and reliability of mission-critical production applications. You'll work with technologies like AWS, Docker, and Kubernetes in Lisbon.

Job Description

Who you are

You are an experienced Site Reliability Engineer with a strong background in maintaining and improving the reliability of complex systems. With 5+ years of experience in a similar role, you have a deep understanding of observability, monitoring, and incident management, and you have successfully implemented tools and processes that enhance system stability and resilience. Your expertise in cloud platforms, particularly AWS, allows you to design and manage scalable infrastructure that meets the demands of high-traffic applications.

You possess strong troubleshooting skills and can quickly diagnose and resolve live production issues. Your analytical mindset helps you proactively detect potential problems before they escalate. You are comfortable working in a collaborative environment, where you can share knowledge and mentor junior engineers, fostering a culture of continuous improvement and learning.

Your technical skills include proficiency in containerization technologies like Docker and orchestration tools such as Kubernetes, and you understand how to use these tools to streamline deployment processes and improve system performance. You are also well-versed in using monitoring and alerting tools like Prometheus and Grafana to ensure that systems are operating optimally.

Desirable

Experience with infrastructure-as-code tools such as Terraform or CloudFormation would be a plus, as would familiarity with CI/CD pipelines and automation frameworks. You are always eager to learn new technologies and methodologies that can enhance your team's capabilities and improve operational efficiency.

What you'll do

In this role, you will be a key member of the Platform Site Reliability Engineering (PSRE) team, responsible for ensuring the stability, reliability, and availability of mission-critical production applications on the Arcesium platform. You will implement observability practices, including monitoring, logging, and tracing, to proactively detect and prevent issues that could impact system performance.

You will build and maintain tools and infrastructure that enhance system stability and resilience, working closely with development teams to ensure that applications are designed with reliability in mind. Your responsibilities will include troubleshooting live production issues, focusing on rapid incident resolution, and conducting post-mortem analyses to identify root causes and prevent future occurrences.

You will collaborate with cross-functional teams to improve operational processes and contribute to the development of best practices for incident management and response. Your role will also involve mentoring junior engineers, sharing your expertise, and helping to cultivate a culture of reliability within the organization.

What we offer

Arcesium offers a dynamic work environment where you can make a meaningful impact from day one. We value intellectual curiosity and proactive ownership, providing opportunities for professional development and growth. You will be part of a collaborative team that is committed to innovation and excellence in the financial technology sector. We encourage you to apply even if your experience doesn't match every requirement. We believe diverse teams build better products.
