
About Doctolib
Simplifying healthcare access for millions
Key Highlights
- 17,000+ healthcare professionals using the platform
- 6 million patients served monthly
- Presence in 435 healthcare facilities across Europe
- Headquartered in Levallois-Perret, France
Doctolib is the leading European platform for online medical appointment scheduling, serving over 17,000 healthcare professionals and connecting with 6 million patients monthly. Headquartered in Levallois-Perret, Île-de-France, Doctolib is present in 435 healthcare facilities across France and Germany.
Benefits
Employees enjoy competitive salaries, stock options, generous PTO, and a flexible remote work policy, promoting a healthy work-life balance.
Culture
Doctolib fosters a culture centered around improving healthcare access, emphasizing technology-driven solutions and a commitment to user experience.
Overview
Doctolib is seeking a Senior Data Engineer focused on AI to build and optimize data foundations for AI models. You'll work with GCP and various data technologies to ensure high-quality data for healthcare applications.
Job Description
Who you are
You have 5+ years of experience as a Data Engineer, with a strong focus on building scalable data pipelines and ensuring data quality for AI applications. Your expertise in Google Cloud Platform (GCP) allows you to design and maintain data infrastructure that supports machine learning and AI initiatives. You are familiar with both structured and unstructured data, and you understand how to integrate diverse data sources into unified models ready for AI consumption.
Your background includes working with NoSQL and vector databases, enabling you to efficiently store and retrieve embeddings and documents. You have a solid understanding of data governance and privacy, ensuring that the data you work with is compliant and reliable. You thrive in collaborative environments, working closely with machine learning and platform teams to define data schemas and partitioning strategies that enhance performance and scalability.
Desirable
Experience with large language models (LLMs) and multimodal models is a plus, as is familiarity with data quality and lineage frameworks. You are comfortable optimizing data pipelines for performance and cost, leveraging GCP native services to achieve the best results.
What you'll do
In your role at Doctolib, you will be responsible for building and optimizing the data foundations within the AI Team. This includes designing, building, and maintaining scalable data pipelines on GCP tailored for AI and machine learning use cases. You will implement data ingestion and transformation frameworks that power retrieval systems and training datasets for LLMs and multimodal models.
You will ensure high standards of data quality for AI model inputs, collaborating with engineers and data scientists to facilitate efficient training, evaluation, and deployment of AI models. Your work will involve architecting and managing NoSQL and vector databases to store and retrieve data effectively, ensuring that the data is well-structured and compliant.
You will also integrate various data sources, including text, speech, images, and documents, into unified data models that are ready for AI consumption. Your role will require you to optimize the performance and cost of data pipelines using GCP services such as BigQuery, Dataflow, Pub/Sub, Cloud Storage, and Vertex AI. You will contribute to data quality and lineage frameworks, ensuring that AI models are trained on validated and reliable data.
What we offer
At Doctolib, you will join a dedicated team on a mission to transform healthcare through AI. We offer a collaborative work environment where your contributions will have a direct impact on the healthcare industry. You will have the opportunity to work with cutting-edge technologies and be part of a team that values innovation and excellence. We encourage you to apply even if your experience doesn't match every requirement, as we believe in the potential of diverse backgrounds and perspectives.
Similar Jobs You Might Like

Data Engineer
Doctolib is seeking a Senior Data Engineer AI to build robust data pipelines that power AI systems in healthcare. You'll work with technologies like Python, BigQuery, and SQL to support innovative healthcare solutions.

Analytics Engineer
Doctolib is seeking a Senior Analytics Engineer to build data products that provide insights and support decision-making. You'll work with Python, SQL, and BigQuery to develop data pipelines and dashboards. This role requires experience in analytics engineering and collaboration with cross-functional teams.

Data Engineer
Doctolib is hiring a Senior Data Engineer for their Data & AI Platform team to design and manage data infrastructure. You'll work with technologies like Airflow, AWS, and PostgreSQL to ensure seamless data flow across the organization. This position requires significant experience in data engineering.

Machine Learning Engineer
Qonto is hiring a Senior Machine Learning Engineer for their AI Product team to build and ship customer-facing AI solutions. You'll work with Generative AI and machine-learning techniques to impact over 500,000 business customers. This position requires experience in delivering client-facing products end-to-end.

Data Engineer
Contentsquare is hiring a Senior Data Engineer to lead large-scale projects and design the next generation of data architecture. You'll work with technologies like Airflow, Apache Spark, and AWS in the Paris area.