Amazon

About Amazon

The everything store and cloud computing leader

🏢 Tech · 👥 1001+ employees · 📅 Founded 1995 · 📍 South Lake Union, Seattle, WA · 3.7
B2C · B2B · Marketplace · Cloud Computing · eCommerce

Key Highlights

  • Headquartered in South Lake Union, Seattle, WA
  • Over 1.5 million employees worldwide
  • Leading cloud services through Amazon Web Services (AWS)
  • Acquired Whole Foods, Twitch, and Ring

Amazon, headquartered in South Lake Union, Seattle, WA, is the world's largest online retailer and a leader in cloud computing through Amazon Web Services (AWS). With over 1.5 million employees globally, Amazon operates in various sectors, including AI with its Alexa devices and a vast marketplace k...

🎁 Benefits

Amazon offers competitive salaries, stock options, generous PTO policies, and comprehensive health benefits. Employees also have access to a learning ...

🌟 Culture

Amazon's culture is driven by customer obsession and a focus on innovation. The company encourages employees to think big and move fast, fostering an ...

Senior Machine Learning Engineer

Amazon · Cupertino - On-Site

Posted less than a year ago · 🏛️ On-Site · Senior · Machine Learning Engineer · 📍 Cupertino · 💰 $151,300 / year

Overview

Amazon is hiring a Senior Machine Learning Engineer for the AWS Neuron Distributed Training team. You'll develop and optimize distributed training solutions for large-scale ML models using Python and various libraries. This role requires expertise in machine learning and cloud technologies.

Job Description

Who you are

You have 5+ years of experience in machine learning engineering, particularly with distributed training of large models. Your expertise includes working with frameworks like PyTorch and JAX, and you understand the intricacies of optimizing models for performance on custom silicon. You are proficient in Python and have experience with libraries such as DeepSpeed and NeMo, which are essential for building efficient distributed training solutions.

You thrive in collaborative environments, working alongside chip architects and compiler engineers to create innovative solutions. Your strong analytical skills allow you to tackle complex technical challenges, and you have a proven track record of delivering results that drive significant impact. You are passionate about advancing the field of machine learning and are eager to contribute to cutting-edge projects.

Desirable

Experience with AWS services and cloud-based solutions is a plus. Familiarity with large language models like GPT and Llama, as well as vision transformers, will help you excel in this role. You are also open to learning new technologies and methodologies that can enhance your work and the team's output.

What you'll do

In this role, you will lead efforts to integrate distributed training support into PyTorch and JAX using the Neuron compiler and runtime stacks. Your primary responsibility will be to optimize machine learning models for peak performance on AWS custom silicon. You will collaborate closely with cross-functional teams to develop and enable a wide variety of ML model families, including massive-scale models.

You will be responsible for performance tuning and enabling distributed training solutions that can handle the demands of large-scale machine learning tasks. Your work will directly contribute to the success of AWS Neuron, helping to deliver innovative cloud solutions that address complex challenges. You will also mentor junior engineers, sharing your knowledge and expertise to foster a culture of learning and growth within the team.

What we offer

Amazon provides a dynamic work environment where you can make a significant impact on the future of machine learning and cloud computing. You will have access to cutting-edge technologies and the opportunity to work on projects that push the boundaries of what is possible. We offer competitive compensation packages, including equity and comprehensive benefits, to ensure that you are well-supported in your role. Join us and be part of a team that is changing the world through technology.

Similar Jobs You Might Like

Based on your interests and this role

Machine Learning Engineer

Amazon · 📍 Cupertino - On-Site

Amazon is hiring a Senior Machine Learning Engineer to develop and optimize distributed training solutions for AWS Neuron. You'll work with technologies like Python, PyTorch, and AWS to enhance performance for large-scale ML models. This position requires experience in training large models and distributed systems.

🏛️ On-Site · Senior
8 months ago

Machine Learning Engineer

Amazon · 📍 Cupertino - On-Site

Amazon is hiring a Senior Machine Learning Engineer to develop and optimize distributed training solutions for large-scale ML models. You'll work with AWS Trainium and frameworks like Hugging Face and TensorFlow. This position requires expertise in machine learning and distributed systems.

🏛️ On-Site · Senior
3 weeks ago

Machine Learning Engineer

Amazon · 📍 Cupertino - On-Site

Amazon is hiring a Senior Machine Learning Engineer to develop and optimize software solutions for AWS Neuron. You'll work with AWS services and machine learning frameworks to build scalable applications. This position requires expertise in Python and machine learning technologies.

🏛️ On-Site · Senior
3 months ago

Machine Learning Engineer

Amazon · 📍 Cupertino - On-Site

Amazon is hiring a Machine Learning Engineer to develop and optimize large-scale ML model training solutions. You'll work with AWS Trainium and collaborate with cross-functional teams to deliver impactful machine learning products. This position requires experience in machine learning frameworks and AWS technologies.

🏛️ On-Site · Mid-Level
1 month ago

Machine Learning Engineer

Amazon · 📍 Cupertino - On-Site

Amazon is hiring a Machine Learning Engineer for the AWS Neuron team to develop and optimize distributed training solutions for large-scale machine learning models. You'll work with technologies like Python, AWS, and PyTorch. This position requires experience in training large models and performance tuning.

🏛️ On-Site · Mid-Level
4 months ago