
About Cohere
AI solutions built for enterprise trust and security
Key Highlights
- Headquartered in Grange Park, Toronto, ON
- $1.5 billion in funding from top investors
- Clients include Royal Bank of Canada, Fujitsu, and Oracle
- Focus on AI solutions for regulated industries
Cohere, headquartered in Grange Park, Toronto, ON, specializes in enterprise-grade AI solutions tailored for regulated industries such as banking and telecom. With $1.5 billion in funding, Cohere has secured contracts with major clients including Royal Bank of Canada, Fujitsu, and Oracle, providing ...
🎁 Benefits
Cohere offers comprehensive benefits including 100% coverage for health, dental, and vision insurance premiums, a $2,000 annual education benefit, six...
🌟 Culture
Cohere's culture emphasizes security and trust in AI adoption, focusing on enterprise needs rather than consumer trends. The company prioritizes a sup...
Overview
Cohere is hiring a Machine Learning Engineer specializing in pre-training data to develop data pipelines for advanced language models. You'll work with Python, TensorFlow, and PyTorch to enhance model performance. This role requires experience in machine learning and data engineering.
Job Description
Who you are
You have a strong background in machine learning and data engineering, with experience in developing data pipelines that support advanced AI models. Your expertise in Python allows you to efficiently manipulate and analyze large datasets, ensuring high data quality for model training. You understand the intricacies of natural language processing (NLP) and are familiar with frameworks like TensorFlow and PyTorch, which you have used to build and optimize machine learning models.
You are detail-oriented and analytical, able to run data ablations that evaluate data quality and to construct effective pre-training data mixtures. You bridge the gap between raw data and cutting-edge AI models, contributing to improvements in critical training metrics such as throughput and accelerator utilization. You thrive in collaborative environments, working alongside researchers and engineers to drive innovation in language understanding and generation capabilities.
Desirable
Experience with large-scale data processing and familiarity with cloud platforms such as AWS or GCP would be advantageous. A background in research or a strong understanding of the latest advancements in AI and machine learning will set you apart.
What you'll do
As a Machine Learning Engineer at Cohere, you will play a pivotal role in developing and optimizing the data pipeline that underpins our advanced language models. You will work closely with cross-functional teams to ensure that the data used for training is of the highest quality, directly impacting the performance of our models. Your responsibilities will include conducting data ablations to assess data quality and constructing pre-training data mixtures that enhance model performance.
You will collaborate with researchers to implement innovative solutions that improve training efficiency and effectiveness. Your work will involve analyzing training metrics and making data-driven decisions to refine our processes. You will also contribute to the development of tools and frameworks that facilitate the integration of new data sources into our training pipeline.
What we offer
Cohere provides a supportive and inclusive work environment where you can thrive. We offer a flexible remote work policy, allowing you to balance your professional and personal life effectively. Our benefits include a generous vacation policy, mental health support, and personal enrichment benefits that promote your well-being and professional growth. Join us in our mission to scale intelligence and shape the future of AI.