Site Reliability Engineer

Recruitment Consultant

Bliss Verna

Contact Details

bv@eu-recruit.com +44 (0) 3333 078 366

Posted

1 day ago

What we’re looking for

We need someone with 3+ years of experience in SRE, Production Engineering, or Infrastructure roles who has built and owned automation, observability, and tooling systems end-to-end in production. You should be comfortable working across a multi-cloud environment, have strong distributed systems instincts, and have a track record of improving platform reliability and reducing operational burden. Bonus points if you have exposure to GPU/AI-ML infrastructure or accelerated compute workloads.

What you’ll do

Build and own the observability stack – dashboards, alerts, and distributed tracing using tools like OpenTelemetry, Prometheus, and Grafana – to provide high-granularity visibility into Mithril’s multi-cloud GPU orchestration platform
Define and implement SLIs and SLOs across Mithril’s API layer and internal orchestration services, partnering with Product and Platform teams to ensure new features are designed for operability from the start
Develop automation in Python (or Go) to eliminate repetitive operational tasks, from provider API reconciliation to automated health checks and capacity rebalancing
Maintain and extend Terraform/Pulumi modules and Kubernetes configurations to manage a growing multi-cloud provider footprint
Participate in the on-call rotation, drive rigorous root cause analysis for production incidents, and implement durable fixes to prevent recurrence
Work directly with the founding engineering team to shape how infrastructure engineering operates as the company scales. This is a greenfield opportunity to build the playbook, not inherit a rigid system

Industry

AI & Machine Learning

Contract Type

Permanent

Location

United States

City

San Francisco

Work Model

On-Site


