Scaleway | Published about 1 month ago
Senior Linux System Administrator/SRE

Scaleway 🌐

SRE - Virtualization 💻

About Scaleway

  • Founded in 1999, Scaleway is the cloud subsidiary of Iliad Group, a leading European telecommunications company.
  • Mission: Foster a more responsible digital industry by helping developers and businesses create, deploy, and adapt applications to any infrastructure.
  • 25,000+ customers choose Scaleway for its multi-AZ redundancy, seamless user experience, carbon-neutral datacenters, and native multi-cloud management tools.
  • Products include fully managed solutions for bare metal, containerization, and serverless architectures, offering a responsible choice in cloud computing.
  • Join a dynamic team of nearly 600 diverse professionals in a stimulating international environment that combines technical excellence, creativity, and collaboration.

About the Job

  • Ensure reliable delivery of virtual machines and bare metal servers to users worldwide.
  • Collaborate with Engineering Manager Emerick Mounoury.
  • Bring a strong background in Python development and system administration, along with experience in DevOps and SRE practices.
  • Adapt to systems and monitoring tools that are constantly evolving.

Minimum Qualifications

  • Experience in system programming using Python, Bash, Go, or similar languages.
  • Demonstrated ability to troubleshoot production system failures.
  • Positive mindset and desire to work collaboratively.
  • Passion for automation and incremental tooling improvements.
  • Experience with Linux systems (Ubuntu server) and virtualization (QEMU/KVM).
  • Good understanding of computer networks (TCP/IP, DNS, load balancing, IPv6, firewall, BGP, network virtualization).
  • Good command of English.

Preferred Qualifications

  • Ability to meticulously identify and solve bugs in any codebase.
  • Experience with infrastructure-as-code and continuous deployment.
  • Experience with physical hardware automation.
  • Experience with monitoring & logging systems.
  • Experience managing relational databases.
  • Knowledge of at least one cloud platform and related use cases.
  • Experience as an OSS contributor and/or maintainer.
  • Knowledge of HPC (High Performance Computing).

Responsibilities

  • Create/optimize tools & documentation to identify, diagnose, and solve production incidents, automating as much as possible.
  • Troubleshoot high-impact issues by collaborating with multiple Engineering teams (Storage, Network, Hardware).
  • Take on-call responsibilities, mitigate production issues, and respond to customers in real time.
  • Ensure high-quality service for customers using observability and monitoring technologies.
  • Manage the life cycle of hypervisors in production and participate in the fleet-wide migration plan.
  • Empower teammates to swiftly integrate and deploy software components across the virtualization system.
  • Implement best practices for stability, resiliency, scalability, security, and performance across the virtualization system.

Our Technical Stack

  • Python/Bash
  • RabbitMQ + Celery
  • PostgreSQL + SQLAlchemy
  • HAProxy, Nginx, REST APIs / Flask
  • S3 API
  • Sentry, Prometheus, Grafana, Elasticsearch, Fluentd, Kibana
  • Ansible, AWX, Foreman
  • GitLab, Nexus
  • Ubuntu, Debian, CentOS
  • Jira, Confluence, Slack, GSuite

Location

  • Paris or Lille, France

Recruitment Process

  • Screening call (30 mins) with the recruiter
  • Manager Interview (45 mins)
  • Technical Interviews (1 hr 30 mins)
  • HR Interview (45 mins)
  • Offer sent within 48 hours

Don't meet all the criteria? Apply anyway! We're open to exceptional candidates.

🌐 Scaleway | Scaleway Blog | Scaleway on X

Skills

  • Back-end: Python, Flask, Go, SQLAlchemy
  • Tooling: Bash, GitLab, RabbitMQ, Sentry
  • Project management: Confluence, Jira
  • Management: Slack
  • Cloud: Cloud Computing, Prometheus, Serverless
  • Data: Elasticsearch, Grafana, PostgreSQL
  • Ops: Ansible, Nginx
  • Other: Celery, Linux System
  • Security: Firewall