
About Databricks
Empowering data teams with unified analytics
Key Highlights
- Headquartered in San Francisco, CA
- Valuation of $43 billion with $3.5 billion raised
- Serves over 7,000 customers including Comcast and Shell
- Utilizes Apache Spark for big data processing
Databricks, headquartered in San Francisco, California, is a unified data analytics platform that simplifies data engineering and collaborative data science. Trusted by over 7,000 organizations, including Fortune 500 companies like Comcast and Shell, Databricks has raised $3.5 billion in funding, ac...
🎁 Benefits
Databricks offers competitive salaries, equity options, generous PTO policies, and a remote-friendly work environment. Employees also benefit from a l...
🌟 Culture
Databricks fosters a culture of innovation with a strong emphasis on data-driven decision-making. The company values collaboration across teams and en...

Platform Engineer • Senior
Databricks • Heredia - Hybrid
Overview
Databricks is hiring a Senior Platform Monitoring Engineer to lead platform incident investigations and enhance customer experience. You'll work with AWS, Docker, and Kubernetes in a hybrid role based in Heredia, Costa Rica.
Job Description
Who you are
You have a strong background in platform reliability and incident response, with at least 5 years of experience in a technical role focused on monitoring and observability. Your expertise in AWS, Docker containers, and Kubernetes orchestration allows you to design effective observability solutions that enhance platform stability.
You thrive in high-pressure situations, coordinating cross-functional teams to rapidly detect, mitigate, and resolve incidents. Your analytical mindset and problem-solving skills enable you to lead complex investigations, ensuring that customer experience remains a top priority.
You are proficient in scripting and programming languages, particularly Python, which you use to automate monitoring tasks and improve system performance. Your familiarity with monitoring tools such as Prometheus and Grafana helps you visualize system health and performance metrics effectively.
You possess excellent communication skills, allowing you to articulate technical concepts to both technical and non-technical stakeholders. You are a team player who enjoys collaborating with engineers, product managers, and customer support teams to drive systemic improvements.
Desirable
Experience with cloud-native architectures and familiarity with CI/CD practices would be a plus. Knowledge of incident management tools and frameworks can further enhance your contributions to the team.
What you'll do
As a Senior Platform Monitoring Engineer at Databricks, you will lead the investigation of platform incidents, coordinating efforts across various teams to ensure swift resolution. You will design and implement observability solutions that provide deep insights into system performance, enabling proactive incident management.
You will work closely with engineering teams to identify and address potential reliability issues before they impact customers. Your role will involve analyzing system metrics and logs to uncover trends and anomalies, driving improvements in platform stability and performance.
You will also contribute to the development of monitoring tools and dashboards, ensuring that critical metrics are easily accessible to the team. Your insights will help shape the future of Databricks' platform, enhancing the overall customer experience.
In addition to your technical responsibilities, you will mentor junior engineers, sharing your knowledge and best practices in incident response and monitoring. You will play a key role in fostering a culture of reliability and customer obsession within the team.
What we offer
Databricks provides a dynamic work environment where you can grow your skills and make a significant impact. We offer competitive compensation and benefits, including flexible working hours and opportunities for professional development. Join us in our mission to empower data teams and tackle the world's most challenging problems.