Site Reliability Engineers - Multiple Openings


🚀 The Role:

You will join a team working on Observability, Escalations, Post-mortems, Correction of Errors, and other practices that contribute to the company's goal of cloud resiliency. You will be responsible for driving reliability processes, best practices, and cultural change, and for enforcing these practices.


🔧 Main Responsibilities:

  • Honor and practice the Reliability pillar of the Well-Architected Framework in all tasks and responsibilities
  • Conduct Chaos Engineering experiments and relevant exercises to improve resiliency and fault-tolerance
  • Research workloads for migrating to the cloud with minimal disruption and impact
  • Monitor cloud migration projects to ensure seamless transitions
  • Design, consult, re-platform, and re-factor the observability of current cloud infrastructure
  • Coordinate with other IT departments and teams regarding observability for both individual and organizational needs
  • Regularly assess cloud deployments for compliance with the company’s standards and best practices
  • Investigate and correct areas where observability is lagging
  • Stay up to date and provide training on new and current technologies, services, tools, methodologies, and practices
  • Occasionally participate in service capacity planning, software performance analysis, and system tuning
  • Mentor colleagues in technical skills and knowledge
  • Analyze and oversee the company’s resiliency, and remediate any gaps
  • Participate in 24/7 on-call support on a rotation schedule

Main Requirements:

  • BSc/MSc degree in Computer Science or related field
  • 5+ years of cloud services experience, with at least 3 years on AWS cloud
  • 3+ years of experience in SRE or a similar role
  • Experience with monitoring, APM, logging, and notification tools
  • Familiarity with incident, problem, and change management procedures and practices
  • Advanced knowledge of SRE practices and methods
  • Understanding and practical application of Service Levels (SLIs, SLOs, and SLAs)
  • Strong troubleshooting skills and the ability to mentor others
  • Extensive experience with Kubernetes and related technologies, services, and ecosystem
  • Advanced knowledge of CI/CD and Infrastructure as Code (IaC) concepts and tools, especially Terraform (HCL) and AWS CloudFormation
  • Experience with version control tools such as Git
  • Strong organizational and documentation skills
  • Exceptional time management and research abilities
  • Advanced Linux, networking, and scripting skills

🌟 The Following Will Be Considered an Advantage:

  • Experience with platforms like Kafka (MSK)
  • Experience with RDBMSs, particularly PostgreSQL and MySQL
  • Knowledge of scripting languages such as Python or Go

🎁 Benefit From:

  • Attractive remuneration package and perks
  • Intellectually stimulating work environment
  • Continuous personal development and international training opportunities

💡 The Hiring Experience: What Awaits You

  • Show Your Skills – Online Technical Challenge
  • Let’s Connect – Intro Chat with Talent Acquisition
  • Deep Dive – First Interview with Your Future Team
  • Final Connection – Final Interview

🔒 All applications will be treated with strict confidentiality!

Reference: xm-lever+XM-Careers-Site-Reliability-Engineers

Skills

  • Backend: Go, Python
  • Data: Kafka, MySQL
  • Ops: Kubernetes, Terraform
  • Project Management: Management
  • Tooling: Git
