Machine Learning Engineer Job at Evolve Group, San Francisco, CA

  • Evolve Group
  • San Francisco, CA

Job Description

Machine Learning Engineer

Tech start-up

San Francisco based

We’ve partnered with one of the most ambitious and technically rigorous AI research labs in the world. Based in San Francisco, this team is building foundation models entirely from scratch.

They are now hiring ML Infrastructure Engineers to design and scale the systems that power large-scale, distributed model training. If you’ve built infrastructure that runs across hundreds of GPUs, thrive under technical complexity, and want to work side-by-side with elite AI researchers — this is the role.

Key Responsibilities:

  • Build and scale distributed training systems for large-scale model training across LLMs, vision, and robotics.
  • Set up and run large-scale training across many GPUs using tools like Kubernetes, DeepSpeed, and FSDP.
  • Troubleshoot system issues (GPU errors, network problems) and build tools to monitor and recover from failures.
  • Optimize PyTorch pipelines, sharding, and sampling strategies.
  • Collaborate closely with researchers to support novel model training at scale.

Requirements:

  • 3–15 years in ML infrastructure, systems, or research engineering roles.
  • Proven experience scaling distributed training for large models.
  • Strong with PyTorch, CUDA, NCCL, and Kubernetes.
  • Familiar with setting up distributed training clusters.
  • Deep understanding of PyTorch dataloaders, data sharding, and sampling.
  • Strong communicator with a collaborative, mission-driven mindset.

This is a fully in-person role based in San Francisco. It's ideal for engineers excited to build at the edge of what's possible in AI.

Job Tags

Immediate start
