Infrastructure & DevOps Engineer
Position Overview
We’re looking for an Infrastructure & DevOps Engineer to build and maintain the foundation of our compute infrastructure. You’ll work on hardware provisioning, networking, container orchestration, and deployment pipelines across cloud and on-premises environments. This role focuses on making our multi-GPU clusters reliable, our deployments reproducible, and our developers productive.
Main Responsibilities
- Provision, configure, and maintain heterogeneous compute clusters (CPU/GPU) across multiple physical locations
- Implement dynamic compute and storage provisioning based on workload demands
- Design storage solutions at both hardware and software level (NAS, distributed filesystems, storage tiering)
- Implement and manage container orchestration systems (Kubernetes, Docker) for development and production workloads
- Design and maintain infrastructure as code using tools like Terraform and Ansible
- Build and optimize job scheduling and resource allocation systems (Slurm, Kubernetes)
- Set up monitoring, alerting, and observability infrastructure (Prometheus, Grafana, IPMI)
- Profile and optimize system-level performance: GPU utilization, memory bandwidth, I/O throughput, network latency
- Manage networking, VPNs, and secure access across distributed systems
- Handle reliability concerns: hardware failure detection, job checkpointing, disaster recovery
Qualifications
- Strong Linux system administration knowledge
- Experience with containerization (Docker) and orchestration (Kubernetes)
- Knowledge of infrastructure as code (Terraform, Ansible)
- Experience with HPC clusters and job scheduling (Slurm)
- Familiarity with monitoring solutions (Prometheus, Grafana)
- Understanding of networking principles and implementation
- Experience with hardware infrastructure management (IPMI, BMC, server maintenance)
- Knowledge of storage systems design (NFS, Ceph, distributed filesystems)
Nice to Have
- Experience with cloud services (AWS or others)
- Familiarity with bare-metal provisioning (MAAS)
By applying to this role, you acknowledge that we may collect, store, and process your personal data on our systems.
For more information, please refer to our Privacy Notice.