About Alembic

Transforming marketing insights with AI-driven analytics

🏢 Media • 👥 21-100 employees • 📅 Founded 2018 • 📍 SoMa, San Francisco, CA • 💰 $169.1m

B2B • Big data • Marketing • Analytics • Business Intelligence • SaaS

Key Highlights

Raised $169.1 million in Series A funding
Headquartered in SoMa, San Francisco, CA
Employs advanced mathematics for marketing attribution
Analyzes marketing campaigns across multiple channels

Alembic, headquartered in SoMa, San Francisco, CA, specializes in AI-enabled predictive analytics for marketing attribution. The company leverages advanced mathematics and AI techniques, initially developed for pandemic-related research, to analyze marketing effectiveness across channels like TV, ra...

🎁 Benefits

Alembic offers competitive equity options, generous PTO, and a flexible remote work policy to support work-life balance. Employees also benefit from a...

🌟 Culture

Alembic fosters a data-driven culture focused on innovation in marketing analytics. The company values transparency and collaboration, empowering team...

Site Reliability Engineer • Senior

Alembic • San Francisco - On-Site

Posted 2 months ago • 🏛️ On-Site • Senior Site Reliability Engineer • 📍 San Francisco

Skills & Technologies

Docker • Kubernetes • Linux • CI/CD • Monitoring • Incident management

Overview

Alembic is seeking a Senior Site Reliability Engineer to improve the reliability and performance of its platform. You'll work with technologies like Docker and Kubernetes to build and maintain scalable infrastructure. This role requires 8+ years of experience in SRE or DevOps.

Job Description

Who you are

You have over 8 years of experience in Site Reliability Engineering, DevOps, or infrastructure engineering roles, demonstrating a strong ability to design and maintain scalable systems. Your background includes at least 5 years of experience with datacenter operations and system administration, giving you a solid foundation in managing complex infrastructures.

You possess deep knowledge of Linux systems and networking, allowing you to optimize system performance effectively. Your expertise extends to containerization technologies like Docker and orchestration tools such as Kubernetes, which you have used to streamline deployment processes and enhance operational efficiency.

You are well-versed in CI/CD pipelines and deployment automation, having owned and evolved these processes in your previous roles. Your experience includes implementing monitoring and alerting systems, as well as incident response processes, ensuring that you can maintain high availability and reliability across services.

You thrive in collaborative environments, working closely with engineering and data science teams to foster a culture of performance and reliability. Your proactive approach to capacity planning and system improvements has consistently driven operational excellence in your past positions.

You understand the importance of security and compliance, especially in high-compliance environments, and you have experience ensuring operational readiness across cloud infrastructures. Your ability to conduct post-incident analyses and drive continuous improvement initiatives sets you apart as a leader in the SRE space.

Desirable

Experience in high-compliance or SOC 2 environments would be a plus, as it aligns with the operational standards we uphold at Alembic. Your curiosity and accountability will be key in influencing how our platform scales and operates.

What you'll do

In this hands-on role, you will design, build, and maintain scalable infrastructure to support real-time analytics and machine learning workloads. Your primary focus will be on improving system reliability and performance through automation and observability, ensuring that our platform operates at peak efficiency.

You will own and evolve our CI/CD pipelines, enhancing deployment automation and rollback mechanisms to minimize downtime and streamline updates. Your expertise in monitoring and incident management will be crucial as you implement processes for alerting and response, including the development of SLOs and runbooks.

Collaboration is key in this role, as you will work closely with engineers and data scientists to drive a culture of performance and reliability. You will ensure that security and compliance measures are integrated into our cloud infrastructure, maintaining operational readiness at all times.

Your contributions will include driving post-incident analysis and leading continuous improvement initiatives, helping to shape the future of our infrastructure and operations. You will have the opportunity to influence how our platform scales, from deployment strategies to incident management practices.

What we offer

At Alembic, you will have ownership of mission-critical infrastructure, playing a vital role in solving real-world enterprise problems. You will be part of a high-performance engineering culture that values curiosity, accountability, and impact. Your insights and expertise will directly influence the scalability and reliability of our platform, providing you with a front-row seat to our engineering efforts.

We encourage you to apply even if your experience doesn't match every requirement. Join us in shaping the future of our infrastructure and operations, and be part of a team that is dedicated to excellence.

Similar Jobs You Might Like

Based on your interests and this role

Site Reliability Engineer

Braze • 📍 San Francisco - On-Site

Braze is hiring a Senior Site Reliability Engineer to ensure the uptime of internal-facing services and platforms. You'll work with Linux, distributed systems, and automation to maintain high service availability. This position requires a strong background in system administration and software engineering.

🏛️ On-Site • Senior

1w ago

Site Reliability Engineer

Stellar Development Foundation • 📍 San Francisco - On-Site

Stellar Development Foundation is hiring a Senior Site Reliability Engineer to enhance the reliability and scalability of their systems. You'll work with AWS, GCP, and Kubernetes to support the Stellar blockchain ecosystem. This role requires strong experience in infrastructure management and automation.

🏛️ On-Site • Senior

3w ago

Site Reliability Engineer

Baseten • 📍 San Francisco - On-Site

Baseten is hiring a Site Reliability Engineer to build and maintain scalable infrastructure for deploying machine learning models. You'll work with technologies like AWS, Docker, and Kubernetes. This position requires experience in managing CI/CD pipelines and optimizing performance.

🏛️ On-Site

4 months ago

Site Reliability Engineer

Crusoe • 📍 San Francisco - On-Site

Crusoe is seeking a Senior Site Reliability Engineer to enhance the stability and performance of their GPU cloud platform. You'll collaborate with cross-functional teams and utilize skills in AWS, Docker, and Kubernetes. This role requires a strong background in operational excellence and incident management.

🏛️ On-Site • Senior

2 months ago

Site Reliability Engineer

ConductorOne • 📍 Portland - On-Site

ConductorOne is hiring a Site Reliability Engineer to design and operate highly reliable infrastructure across cloud environments. You'll work with AWS, GCP, and Azure while building automation and tooling to enhance system reliability. This position requires 3+ years of experience in SRE or DevOps.

🏛️ On-Site • Mid-Level

4 months ago