
About Cohere
AI solutions built for enterprise trust and security
Key Highlights
- Headquartered in Grange Park, Toronto, ON
- $1.5 billion in funding from top investors
- Clients include Royal Bank of Canada, Fujitsu, and Oracle
- Focus on AI solutions for regulated industries
Cohere, headquartered in Grange Park, Toronto, ON, specializes in enterprise-grade AI solutions tailored for regulated industries such as banking and telecom. With $1.5 billion in funding, Cohere has secured contracts with major clients including Royal Bank of Canada, Fujitsu, and Oracle, providing ...
🎁 Benefits
Cohere offers comprehensive benefits including 100% coverage for health, dental, and vision insurance premiums, a $2,000 annual education benefit, six...
🌟 Culture
Cohere's culture emphasizes security and trust in AI adoption, focusing on enterprise needs rather than consumer trends. The company prioritizes a sup...
Overview
Cohere is hiring a Site Reliability Engineer to develop and operate AI platforms for advanced NLP applications. You'll work with technologies like AWS, Docker, and Kubernetes to ensure high-performance and reliable machine learning systems. This role requires experience in deploying scalable systems and a strong understanding of API management.
Job Description
Who you are
You have a strong background in site reliability engineering, with experience in building and maintaining high-performance, scalable systems. You understand the intricacies of deploying machine learning models and have a solid grasp of cloud infrastructure, particularly AWS. Your expertise in containerization technologies like Docker and orchestration tools such as Kubernetes allows you to manage complex deployments effectively. You are proficient in programming languages like Python, enabling you to automate processes and enhance system reliability. You thrive in collaborative environments, working closely with cross-functional teams to deliver optimized solutions that meet customer needs. You are passionate about AI and its potential to transform industries, and you are eager to contribute to innovative projects that push the boundaries of technology.
Desirable
Experience with monitoring and alerting tools, as well as familiarity with REST APIs, will be beneficial in this role. A background in natural language processing (NLP) or machine learning will set you apart as you work on cutting-edge AI applications.
What you'll do
As a Site Reliability Engineer at Cohere, you will be responsible for developing, deploying, and operating the AI platform that delivers large language models through user-friendly API endpoints. You will collaborate with various teams to ensure that NLP models are deployed in low-latency, high-throughput environments, maintaining high availability and performance standards. Your role will involve optimizing system performance, troubleshooting issues, and implementing best practices for reliability and scalability. You will also engage with customers to understand their needs and provide tailored solutions that enhance their experience with our AI products. Your contributions will directly impact the efficiency and effectiveness of our AI systems, helping to drive the widespread adoption of AI technologies.
What we offer
Cohere provides a supportive work environment that values mental health and well-being, offering benefits such as a separate budget for mental health care and a 100% parental leave top-up for up to six months. We encourage personal enrichment through benefits towards arts and culture, fitness, and workspace improvement. Our flexible remote work policy allows you to choose between working from our offices in Toronto, New York, San Francisco, London, or Paris, or from the comfort of your home. You will enjoy a generous vacation policy, with six weeks of vacation (30 working days) to recharge and pursue personal interests. Join us at Cohere and be part of a team that is shaping the future of AI.