OUR SECTORS
At European Tech Recruit, our sectors cover a wide range of industries within the field of technology.
Client services
Learn about the range of client services we offer at European Tech Recruit, and browse through our case studies.
About us
Learn about European Tech Recruit's mission, values, our team, and our commitment to DE&I.
Site Reliability Engineer
I’m partnering with a profitable, developer-focused infrastructure startup building an open-source platform that standardises access to 100+ large language model APIs. Their tooling is widely adopted by both fast-growing startups and large enterprises, powering mission-critical AI workloads at scale. Backed by leading investors and already generating strong revenue, they are now hiring a Site Reliability Engineer to take ownership of system performance and uptime for key enterprise customers.
This is a highly hands-on role with significant ownership, focused on debugging complex production issues, improving system reliability, and working directly with customers operating at scale.
The position is either onsite in the SF Bay Area or fully remote within the US.
Key Responsibilities:
- Work directly with enterprise customers to diagnose and resolve production issues in real time.
- Own system reliability and performance, including debugging memory leaks, connection pooling issues, and other critical system-level problems.
- Proactively identify and address reliability risks to prevent future incidents.
- Profile systems, run benchmarks, and optimize for latency and throughput.
- Collaborate closely with founders and engineering teams in a fast-paced environment to improve overall system robustness and customer experience.
Key Qualifications:
- 1–4 years of experience in production engineering, site reliability engineering, or similar roles focused on debugging and fixing system-level issues.
- Hands-on experience identifying and resolving memory leaks in production systems.
- Experience operating and debugging large-scale systems (e.g., handling 1k+ requests per second).
- Strong programming skills; experience with systems languages such as C or Rust is a plus.
- Familiarity with infrastructure and observability tools such as PostgreSQL, Redis, Kubernetes, Prometheus, and Grafana.
- Strong academic background in Computer Science, ideally with coursework in operating systems, compilers, or performance engineering (top-tier programs preferred).
- Demonstrated ownership mindset — experience fixing issues directly rather than only identifying or escalating them.
By applying to this role, you acknowledge that we may collect, store, and process your personal data on our systems.
For more information, please refer to our Privacy Notice.