
About Cohere
AI solutions built for enterprise trust and security
Key Highlights
- Headquartered in Grange Park, Toronto, ON
- $1.5 billion in funding from top investors
- Clients include Royal Bank of Canada, Fujitsu, and Oracle
- Focus on AI solutions for regulated industries
Cohere, headquartered in Grange Park, Toronto, ON, specializes in enterprise-grade AI solutions tailored for regulated industries such as banking and telecom. With $1.5 billion in funding, Cohere has secured contracts with major clients including Royal Bank of Canada, Fujitsu, and Oracle, providing ...
🎁 Benefits
Cohere offers comprehensive benefits including 100% coverage for health, dental, and vision insurance premiums, a $2,000 annual education benefit, six...
🌟 Culture
Cohere's culture emphasizes security and trust in AI adoption, focusing on enterprise needs rather than consumer trends. The company prioritizes a sup...
Overview
Cohere is hiring a Staff Research Engineer to enhance model efficiency for AI systems. You'll work on optimizing large language models and improving inference efficiency. This position requires expertise in machine learning and performance optimization.
Job Description
Who you are
You have a strong background in machine learning and AI, with a focus on optimizing model performance and efficiency. Your experience includes working with large language models (LLMs) and understanding the intricacies of model architecture and inference processes. You are passionate about pushing the boundaries of AI technology and are eager to contribute to impactful projects that serve humanity.
You possess a deep understanding of performance optimization techniques and have experience in software/hardware co-design for GPU acceleration. Your ability to analyze and improve model execution stacks is complemented by your collaborative spirit, as you thrive in diverse teams that value different perspectives.
What you'll do
As a Staff Research Engineer at Cohere, you will drive breakthroughs in model efficiency across our foundation models. You will explore and implement optimizations in model architecture, focusing on mixture-of-experts (MoE) routing and inference-time algorithm improvements. Your work will directly impact the performance of AI systems used by developers and enterprises.
You will collaborate closely with researchers and engineers to identify bottlenecks in LLM inference and develop innovative solutions to enhance efficiency without compromising model quality. Your contributions will help shape the future of AI applications, enabling more widespread adoption and creating magical experiences for users.
What we offer
Cohere provides a remote-friendly work environment with offices in major cities including New York, Toronto, and San Francisco. We prioritize mental health and well-being, offering a separate budget for personal enrichment and 100% parental leave top-up for up to 6 months. Our team enjoys a generous vacation policy of 6 weeks, allowing for a healthy work-life balance. Join us in our mission to scale intelligence and make a meaningful impact in the world of AI.
Similar Jobs You Might Like

AI Engineer
Cohere is hiring a Member of Technical Staff, Model Efficiency to enhance the performance of AI models. You'll work with technologies like Python and TensorFlow to optimize model execution. This position requires experience in machine learning and model optimization.

AI Engineer
Cohere is hiring an Audio Inference Engineer to optimize audio model serving efficiency. You'll work on advancing core audio metrics and collaborate with infrastructure teams. This role requires expertise in machine learning and audio processing.

AI Research Engineer
Cohere is hiring a Senior Research Engineer, Model Evaluation to develop next-generation evaluation methods for AI models. You'll work with Python, TensorFlow, and PyTorch to enhance model capabilities. This role requires expertise in machine learning and data analysis.

AI Research Engineer
Doji is hiring a Research Engineer to develop cutting-edge AI avatars and virtual try-on models. You'll work with diffusion models and data pipelines in New York City. This role requires hands-on experience with generative AI technologies.

AI Research Engineer
Cohere is hiring a Member of Technical Staff, Agents Modeling to drive the development of agentic LLM systems. You'll work with machine learning techniques and data generation strategies to enhance model capabilities. This role requires experience in machine learning research and engineering.