About Clarifai

Empowering developers with advanced AI tools

🏢 Tech • 👥 101-200 employees • 📅 Founded 2013 • 📍 Wilmington, DE • 💰 $101.2m • ⭐ 3.7

B2B • Artificial Intelligence • Internal tools • Deep Tech • Computer Vision • Machine Learning • SaaS • API

Key Highlights

Headquartered in Wilmington, DE with 101-200 employees
$101.2 million raised in Series C funding
API tools for image recognition and metadata tagging
Cooperative research agreement with the U.S. Army

Clarifai is a leading deep learning AI platform headquartered in Wilmington, DE, specializing in computer vision and artificial intelligence. With over $101.2 million raised in Series C funding, Clarifai empowers developers by providing API tools that automate image recognition and metadata tagging,...

🎁 Benefits

Clarifai offers a work-from-home stipend, cell phone reimbursement, and comprehensive insurance including medical, dental, and vision. Employees enjoy...

🌟 Culture

Clarifai fosters a culture of innovation and collaboration, focusing on empowering developers with advanced AI tools. The company encourages continuou...

Site Reliability Engineer • Senior

Clarifai • United States - Remote

Posted 1 month ago • 🏠 Remote • Senior Site Reliability Engineer • 📍 United States

Skills & Technologies

Kubernetes • Python • Go • Microservice architecture • Cloud infrastructure • Relational databases • Message queues

Overview

Clarifai is seeking a Senior Site Reliability Engineer to ensure the smooth operation and high availability of its AI platform. You'll work with Kubernetes, Python, and Go to tackle infrastructure challenges. This role requires expertise in cloud infrastructure and microservice architecture.

Job Description

Who you are

You have 5+ years of experience in site reliability engineering, focusing on ensuring the availability and performance of distributed systems. Your background includes working with Kubernetes and cloud infrastructure, allowing you to effectively manage and orchestrate complex environments. You are proficient in programming languages such as Python and Golang, enabling you to develop tools and scripts that enhance system reliability. Your understanding of microservice architecture principles helps you design resilient systems that can scale efficiently. You are familiar with security best practices for cloud-based systems, ensuring that the infrastructure remains secure and compliant. Additionally, you have experience with relational databases and message queues, which are critical for maintaining data integrity and communication between services.

Desirable

Knowledge of developing and building custom Kubernetes operators is a plus, as it allows for greater automation and efficiency in managing Kubernetes clusters. Familiarity with various RPC frameworks can enhance your ability to implement efficient communication between microservices. You are always eager to learn and adapt to new technologies, contributing to a culture of continuous improvement within the team.

What you'll do

In this role, you will be responsible for ensuring the smooth operation and high availability of Clarifai's core services. You will monitor system performance, identify bottlenecks, and implement solutions to enhance system reliability. Collaborating with engineering teams, you will address infrastructure challenges related to serving and training large neural networks in both cloud and on-premise environments. Your expertise will guide the development of best practices for incident management and response, ensuring that the team can quickly address any issues that arise. You will also play a key role in capacity planning, helping to forecast resource needs and optimize costs associated with cloud infrastructure.

As part of your responsibilities, you will develop and maintain CI/CD pipelines to streamline deployment processes and improve the overall efficiency of the development lifecycle. You will work closely with cross-functional teams to ensure that infrastructure changes align with product goals and user needs. Your contributions will directly impact the performance and reliability of Clarifai's AI platform, enabling organizations to leverage AI technology effectively.

What we offer

Clarifai offers a collaborative and inclusive work environment where you can thrive as a Senior Site Reliability Engineer. You will have the opportunity to work on cutting-edge AI technology and contribute to projects that have a meaningful impact on various industries. We provide competitive compensation and benefits, along with opportunities for professional growth and development. Join us in our mission to empower organizations with AI-driven insights and solutions.

Similar Jobs You Might Like

Based on your interests and this role

Site Reliability Engineer

Clarifai • 📍 Canada - Remote

Clarifai is seeking a Senior Site Reliability Engineer to ensure the smooth operation and high availability of its AI platform. You'll work with Kubernetes, Python, and Go to address infrastructure challenges. This role requires expertise in cloud systems and microservice architecture.

🏠 Remote • Senior

1 month ago

Site Reliability Engineer

Curology • 📍 United States - Remote

Curology is hiring a Senior Site Reliability Engineer to architect and lead the delivery of reliable solutions. You'll work with AWS, Docker, and Kubernetes to automate operations and enhance reliability. This position requires strong technical expertise and problem-solving skills.

🏠 Remote • Senior

1 year ago

Site Reliability Engineer

PandaDoc • 📍 Germany - Remote

PandaDoc is hiring a Senior Site Reliability Engineer to ensure reliable service with minimal downtime. You'll work with Python, Java, AWS, and Kubernetes to manage incident processes and observability stacks. This role requires solid programming experience and expertise in maintaining production services.

🏠 Remote • Senior

1 day ago

Site Reliability Engineer

Wikimedia • 📍 Remote - Remote

Wikimedia is hiring a Senior Site Reliability Engineer to operate and enhance systems for data-oriented teams. You'll work with technologies like Kubernetes and Hadoop to ensure system reliability and scalability. This role requires strong experience in distributed systems and automation.

🏠 Remote • Senior

3 weeks ago

Site Reliability Engineer

Wikimedia • 📍 Remote - Remote

Wikimedia is hiring a Senior Site Reliability Engineer to operate and enhance systems for data-oriented teams. You'll work with technologies like Kubernetes, Hadoop, and Kafka to ensure system reliability and scalability. This role requires strong experience in distributed systems and automation.

🏠 Remote • Senior

3 weeks ago
