About Clarifai

Empowering developers with advanced AI tools

🏢 Tech • 👥 101-200 employees • 📅 Founded 2013 • 📍 Wilmington, DE • 💰 $101.2m • ⭐ 3.7

B2B • Artificial Intelligence • Internal tools • Deep Tech • Computer Vision • Machine Learning • SaaS • API

Key Highlights

Headquartered in Wilmington, DE with 101-200 employees
$101.2 million raised in Series C funding
API tools for image recognition and metadata tagging
Cooperative research agreement with the U.S. Army

Clarifai is a leading deep learning AI platform headquartered in Wilmington, DE, specializing in computer vision and artificial intelligence. With over $101.2 million raised in Series C funding, Clarifai empowers developers by providing API tools that automate image recognition and metadata tagging,...

🎁 Benefits

Clarifai offers a work-from-home stipend, cell phone reimbursement, and comprehensive insurance including medical, dental, and vision. Employees enjoy...

🌟 Culture

Clarifai fosters a culture of innovation and collaboration, focusing on empowering developers with advanced AI tools. The company encourages continuou...

Site Reliability Engineer • Senior

Clarifai • Canada - Remote

Posted 1 month ago • 🏠 Remote • Senior • Site Reliability Engineer • 📍 Canada

Skills & Technologies

kubernetes • python • golang • microservice architecture • cloud infrastructure

Overview

Clarifai is seeking a Senior Site Reliability Engineer to ensure the smooth operation and high availability of its AI platform. You'll work with Kubernetes, Python, and Golang to address infrastructure challenges. This role requires expertise in cloud infrastructure and microservice architecture.

Job Description

Who you are

You have 5+ years of experience in site reliability engineering, focusing on maintaining high availability and performance of distributed systems. Your background includes working with Kubernetes and cloud infrastructure, allowing you to tackle complex challenges in a cloud-native environment. You are proficient in programming languages such as Python and Golang, which you use to develop and maintain tools that enhance system reliability. Your understanding of microservice architecture principles enables you to design scalable and efficient systems that meet the demands of modern applications. You are familiar with security best practices for cloud-based systems, ensuring that the infrastructure remains secure and compliant. You thrive in collaborative environments, working closely with development teams to implement best practices in system monitoring and incident response.

Desirable

Experience with relational databases, message queues, and key-value stores is a plus, as it allows you to optimize data flow and storage solutions. Familiarity with RPC frameworks can enhance your ability to integrate various services within the architecture. You have a keen interest in developing custom Kubernetes operators, which showcases your initiative to improve operational efficiency and automation within the infrastructure.

What you'll do

In this role, you will be responsible for ensuring the smooth operation of Clarifai's core services, which involves monitoring system performance and identifying bottlenecks. You will implement solutions to enhance system reliability and performance, addressing issues proactively before they impact users. Your work will involve collaborating with engineering teams to design and deploy scalable infrastructure that supports the training and serving of large neural networks. You will also be tasked with maintaining and improving the cloud infrastructure, ensuring that it meets the evolving needs of the organization. As part of your responsibilities, you will develop and maintain tools that facilitate the deployment and management of applications in a Kubernetes environment. You will participate in incident management processes, helping to resolve issues quickly and effectively while documenting lessons learned to prevent future occurrences. Your expertise will be critical in shaping the operational practices of the team, driving improvements in efficiency and reliability.

What we offer

At Clarifai, we offer a dynamic work environment where innovation is at the forefront of our mission. You will be part of a diverse and inclusive team that values collaboration and creativity. We provide opportunities for professional growth and development, encouraging you to expand your skills and knowledge in the field of site reliability engineering. Our commitment to work-life balance means you can thrive both personally and professionally while contributing to cutting-edge AI solutions. Join us in transforming how organizations leverage AI to gain insights from their data, and be part of a company that is shaping the future of technology.
