
About Arcesium
Empowering asset managers with advanced fintech solutions
Key Highlights
- Spin-out of the D. E. Shaw group, building on its expertise
- Headquartered in New York City with a strong fintech focus
- Supports over 100 clients in the asset management sector
- Employs between 1,000 and 5,000 professionals
Arcesium, a spin-out of the D. E. Shaw group, specializes in software and services for asset managers, focusing on post-trade activities. Headquartered in New York City, Arcesium supports over 100 clients, including hedge funds and investment firms, with its comprehensive suite of financial technolo...
Benefits
Arcesium offers competitive salaries, equity options, generous PTO policies, and a flexible remote work environment to support work-life balance....
Culture
Arcesium fosters a culture centered on innovation and collaboration, emphasizing a strong engineering focus and a commitment to delivering high-qualit...
Overview
Arcesium is hiring a Lead Site Reliability Engineer to ensure the stability and reliability of mission-critical production applications. The role is based in Lisbon and involves technologies such as AWS, Docker, and Kubernetes.
Job Description
Who you are
You are an intelligent and resourceful engineer with a strong background in Site Reliability Engineering. You have experience in ensuring the stability, reliability, and availability of production applications, and you thrive in environments where you can proactively detect and prevent issues. Your expertise in observability, monitoring, logging, and tracing allows you to troubleshoot live production issues effectively, focusing on rapid incident resolution. You are comfortable building tools and infrastructure that enhance system stability and resilience, and you have a deep understanding of cloud technologies, particularly AWS.
You have a collaborative mindset and enjoy working with cross-functional teams to drive improvements in system performance. Your ability to communicate complex technical concepts clearly makes you a valuable team member. You are committed to continuous learning and professional development, eager to contribute meaningfully from day one. You understand the importance of governance in SRE practices and are dedicated to maintaining high standards in your work.
Desirable
Experience with container orchestration tools like Kubernetes is a plus. Familiarity with CI/CD pipelines and infrastructure as code practices will help you excel in this role. You may also have experience with incident management tools and practices, which will be beneficial in ensuring the reliability of our systems.
What you'll do
As a Lead Site Reliability Engineer at Arcesium, you will play a critical role in the Platform Site Reliability Engineering (PSRE) team. Your primary responsibility will be to ensure the stability and reliability of our mission-critical production applications. You will implement observability and monitoring solutions to proactively detect and prevent issues, ensuring that our systems are resilient and performant.
You will build and maintain tools and infrastructure that enhance system stability, working closely with development teams to integrate these solutions into our workflows. Troubleshooting live production issues will be a key part of your role, and you will focus on rapid incident resolution to minimize downtime and impact on our clients.
You will also be responsible for governing SRE practices within the team, ensuring that we adhere to best practices and continuously improve our processes. Collaborating with other teams, you will help drive initiatives that enhance the overall reliability of our systems and contribute to the strategic goals of the organization.
What we offer
At Arcesium, we offer a dynamic work environment where innovation is at the forefront of our operations. You will have the opportunity to work with some of the most sophisticated financial institutions in the world, tackling complex data-driven challenges. We value intellectual curiosity and proactive ownership, empowering you to contribute meaningfully from day one. Our commitment to your professional development means you will have access to resources and opportunities to grow your skills and advance your career.
We believe in fostering a collaborative culture where diverse perspectives are valued. You will be part of a team that encourages open communication and teamwork, allowing you to thrive in your role. Join us at this exciting time in our growth as we expand our operations and pursue strategic new business opportunities.