Not specified
* Guarantee the stability, performance, and availability of services in both production and non-production environments, fostering a reliability-driven culture across delivery teams.
* Act as the gatekeeper of product evolution towards production, ensuring quality always matches customer expectations.
* Collaborate with Product, Tech, and Platform teams to maintain the right balance between innovation, velocity, and operational robustness.
* Define, monitor, and report Service Level Objectives (SLOs), Service Level Indicators (SLIs), and error budgets across environments to ensure measurable reliability per application domain.
* Ensure robust observability, monitoring, and alerting frameworks are implemented and continuously improved.
* Oversee operational readiness of each release, ensuring production stability through cross-functional coordination with Product and Tech teams.
* Veto product delivery when measured quality does not meet customer expectations.
* Manage incident response, root cause analysis, and post-mortem reviews to ensure accountability and continuous improvement per application domain.
* Collaborate with Core Platform and Observability & FinOps teams to strengthen system resilience, optimize cost efficiency, and maintain platform performance.
* Report reliability status, risks, and improvement actions to Agile Release Managers and domain leadership to ensure alignment across Agile Release Trains (ARTs).
* Actively participate in the Agile Release Train as the voice of reliability and operations, supporting delivery cadence and quality.
Do you have experience in SaaS?

Technical
* Strong expertise in Site Reliability Engineering (SRE) within SaaS or cloud-native environments.
* Deep understanding of system observability, automation, and monitoring frameworks.
* Experience defining and managing SLOs, SLIs, and error budgets in collaboration with engineering teams.
* Proficiency in DevSecOps, CI/CD pipelines, and continuous monitoring practices.

Functional
* Solid experience in incident management, post-mortem analysis, and operational readiness.
* Proven ability to coordinate reliability initiatives across Product, Tech, and Platform domains.
* Strong focus on performance metrics, root cause prevention, and operational governance.

Soft Skills
* Data-driven mindset and analytical rigor in reliability tracking.